Topical PISA Quiz Task — Generating Test Items
A task of the 2026 ELOQUENT lab on evaluating quality of generative language models
Contact: eloquent-clef2026-organizers@googlegroups.com
Task overview
This task focuses on the automatic generation of test items from a given document, targeting students aged 10 to 15. The objective is to generate assessment items in the form of question–answer (QA) pairs based on a provided text (the “stimulus”).
We provide participants with a ready-to-use question–answer generation prompt as a baseline, along with five automatically generated QA test items and one gold test item, all derived from the same source stimulus. The scope of participation is intentionally open and flexible. Participants may choose to:
- improve or extend the existing prompt,
- experiment with different off-the-shelf or homemade models,
- modify or redesign the prompt to generate new types of questions,
- or explore any other innovative approach to test item generation.
For this edition, English is the selected language.
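As a rough illustration of the pipeline described above (stimulus in, question–answer pairs out), the following minimal sketch shows one way to build a prompt and parse a model's reply. Everything in it is an assumption made for illustration: the `QAItem` schema, the prompt wording, and the `Q:`/`A:` output format are hypothetical and are not the baseline prompt or the data format provided by the organisers. The call to an actual language model is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class QAItem:
    """One test item: a question and its expected answer (hypothetical schema)."""
    question: str
    answer: str


def build_prompt(stimulus: str, n_items: int = 5) -> str:
    """Assemble an illustrative QA-generation prompt (not the official baseline)."""
    return (
        f"Read the text below and write {n_items} question-answer pairs "
        "suitable for students aged 10 to 15.\n"
        "Every answer must be supported by the text.\n"
        "Format each pair on two lines, 'Q: ...' then 'A: ...'.\n\n"
        f"Text:\n{stimulus}\n"
    )


def parse_items(model_output: str) -> list[QAItem]:
    """Parse 'Q:'/'A:' lines into QAItem objects, skipping incomplete pairs."""
    items = []
    question = None
    for line in model_output.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            # Remember the question until its answer line arrives.
            question = line[2:].strip()
        elif line.startswith("A:") and question:
            items.append(QAItem(question, line[2:].strip()))
            question = None
    return items
```

The prompt string would be sent to whichever model a participant chooses, and the reply passed to `parse_items`. Keeping generation and parsing separate makes it easy to swap prompts or models without touching the rest of the pipeline.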
Quick Start
How to participate, in more detail
Example item
Some example items can be found here
Submission instructions
Data
PISA, public items, link to repo here
Quality Criteria
The following quality criteria will be taken into account in various ways.
- Topical relevance and coherence
- Naturalness and fluency
- Level of difficulty
- Scoreability
- Anchoring in stimulus text
- Ambiguity (or rather, the lack thereof)
- Coverage of the stimulus by the set of test items
- Diversity of the set of test items, i.e. the distribution of different item types
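To get an informal view of the diversity criterion before submitting, a participant could tabulate how item types are distributed over a generated set. The sketch below is only a self-check under stated assumptions: the item type labels are hypothetical, the task description does not define a fixed set of types, and this is not how submissions are scored.

```python
from collections import Counter


def item_type_distribution(item_types: list[str]) -> dict[str, float]:
    """Return the share of each item type in a set of test items.

    `item_types` holds one label per item; the labels are illustrative only.
    """
    counts = Counter(item_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {item_type: count / total for item_type, count in counts.items()}
```

For example, `item_type_distribution(["factual", "factual", "inferential", "evaluative"])` gives half of the items as `factual` and a quarter each for the other two labels, which makes a heavily skewed set easy to spot.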
Scoring
Submissions will be scored by human editors with experience in putting together previous PISA editions.
Timeline
- Task launch: January 2026
- Presentation at the European Conference on Information Retrieval (ECIR) in Delft: end of March 2026
- Task submission deadline: May 2026
- Reporting deadline: June 2026
- Task workshop at the CLEF conference in Jena: September 2026
Organisers
- Université Grenoble Alpes: Diandra Fabre, Lorraine Goeuriot, Philippe Mulhem, Didier Schwab, Markarit Vartampetian
- OECD: Said Ettejjari, Mario Piacentini, Luis Francisco Vargas Madriz, Katherina Thomas
- AMD Silo AI: Jussi Karlgren
Contact address for questions or suggestions: eloquent-clef2026-organizers@googlegroups.com
Bibliography
Some relevant previous work – feel free to suggest items for this list e.g. by a pull request!
- The PISA website for background on the survey itself: https://www.oecd.org/en/about/programmes/pisa.html
- A survey on automatic question generation: Nikahat Mulla and Prachi Gharpure. 2023. Automatic question generation: a review of methodologies, datasets, evaluation metrics, and applications. Progress in Artificial Intelligence 12. https://doi.org/10.1007/s13748-023-00295-9
- A typology of questions (originally for automatic question answering purposes): Wendy Lehnert. 1977. A Conceptual Theory of Question Answering. Proceedings of IJCAI.
- A typology of educational goals: Benjamin S. Bloom, Max D. Engelhart, Edward J. Furst, Walker H. Hill, and David R. Krathwohl. 1956. Taxonomy of educational objectives: The classification of educational goals. New York: Longman.