Estimating the momentary level of engagement from multi-modal participant behaviour is an important prerequisite for assistive systems that support human interactions. At the same time, the relationship between engagement and behaviour depends heavily on situational and cultural characteristics. MultiMediate’25 therefore poses the cross-cultural, multi-domain engagement estimation challenge. The challenge covers a unique combination of training and test data from diverse cultural backgrounds, including Japanese, Chinese, German, Arabic, Indonesian, and French speakers, as well as different interaction situations (group vs. dyadic). In addition to evaluation metrics that quantify the overall quality of predictions, MultiMediate’25 will also quantify the amount of bias in predictions with respect to gender and cultural background. Beyond engagement estimation, we continue to invite submissions to popular tasks from previous iterations of the challenge: eye contact detection, bodily behaviour recognition, and backchannel detection.