SemEval-2025 Task 9: The Food Hazard Detection Challenge

AI-driven solutions meet real-world challenges in food hazard detection.

This report presents the setup, results, and analysis of SemEval-2025 Task 9, which challenged participants to classify food safety incidents based on real-world web texts. It describes the two subtasks, along with dataset structure, evaluation metrics, and successful modeling strategies. Key findings highlight the role of synthetic data, ensemble models, and transformer diversity in tackling long-tail classification under realistic constraints.

Key Highlights:
  • Two subtasks: predicting coarse hazard and product categories (ST1) and the specific hazard and product labels (ST2)

  • Dataset: 6,644 manually labeled food recall reports from official agencies (2012–2022)

  • Systems were scored with macro F1, with hazard predictions weighted more heavily than product predictions

  • Synthetic data from LLMs improved rare class performance

  • Top systems used ensemble methods and domain-augmented inputs

  • No single transformer architecture outperformed others across tasks
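
To make the evaluation highlight concrete, here is a minimal sketch of one way a hazard-prioritized macro F1 could be computed. It assumes the product score is counted only on reports whose hazard was predicted correctly, and that the two scores are then averaged; this summary does not state the exact formula, so treat that rule, the function names, and the sample labels as illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def hazard_weighted_score(hazards_true, products_true, hazards_pred, products_pred):
    """Average of hazard macro F1 and product macro F1 (assumed scoring rule).

    The product score only uses rows where the hazard prediction is correct,
    so a wrong hazard also costs the system its product credit on that row.
    """
    f1_hazards = macro_f1(hazards_true, hazards_pred)
    kept = [
        (pt, pp)
        for ht, hp, pt, pp in zip(hazards_true, hazards_pred, products_true, products_pred)
        if ht == hp
    ]
    f1_products = macro_f1([pt for pt, _ in kept], [pp for _, pp in kept])
    return (f1_hazards + f1_products) / 2


# Illustrative labels only; not taken from the task data.
hazards_true = ["biological", "allergens", "biological", "chemical"]
hazards_pred = ["biological", "allergens", "chemical", "chemical"]
products_true = ["meat", "dairy", "meat", "fruits"]
products_pred = ["meat", "meat", "meat", "fruits"]

score = hazard_weighted_score(hazards_true, products_true, hazards_pred, products_pred)
print(f"combined score: {score:.3f}")
```

Because macro F1 averages per-class scores without weighting by class frequency, a rare hazard class counts as much as a common one, which is why long-tail performance matters so much under this kind of metric.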
