2003 High Accuracy Retrieval from Documents (HARD)
The objective of the HARD project is to achieve high accuracy retrieval from documents by leveraging additional information about the searcher and/or the search context, by applying techniques such as passage retrieval, and by using very targeted interaction with the searcher. HARD is being run as an evaluation track within TREC, the Text REtrieval Conference sponsored by NIST. HARD comprises three basic tasks: Topic Creation, Clarification Form Completion and Relevance Assessment.
Topic Creation
This initial task involves not only the development of TREC-style topics but also the collection of additional metadata about the query issuer. These supplementary fields include:
- Purpose
- Genre
- Granularity
- Familiarity
- Related Text
Each of these fields provides more information about the results being sought. In addition, further demographic data is collected from the query issuer for internal purposes.
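As a rough illustration of how a topic and its metadata fields might be held together in code, here is a minimal sketch. The class name `HardTopic`, the `topic_id`, `title` and `description` attributes, and all field types are assumptions made for this example; the official HARD topic format is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HardTopic:
    """Hypothetical container for one HARD topic (not the official format)."""

    topic_id: str                       # assumed identifier
    title: str                          # assumed TREC-style title
    description: str                    # assumed TREC-style description
    # The five metadata fields named above; free-form strings are an assumption.
    purpose: Optional[str] = None
    genre: Optional[str] = None
    granularity: Optional[str] = None
    familiarity: Optional[str] = None
    related_text: List[str] = field(default_factory=list)

    def metadata(self) -> Dict[str, object]:
        """Return only the supplementary metadata fields as a dictionary."""
        return {
            "purpose": self.purpose,
            "genre": self.genre,
            "granularity": self.granularity,
            "familiarity": self.familiarity,
            "related_text": list(self.related_text),
        }
```

Keeping the metadata separate from the core topic text mirrors the track's setup, in which sites see the topic before they see its metadata.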
A link to the web-based topic creation interface is provided here:
Clarification Form Completion
Before they receive the metadata for each topic, participating sites submit HTML pages (clarification forms) designed to glean more targeted information from the query issuer. The assessor for each topic completes every form pertaining to that topic, and the results are returned to the individual sites.
Relevance Assessment
Relevance assessment for each topic will be performed not only at the document level but also at the passage level. The documents that receive passage-level assessment are determined by the "granularity" field of the metadata. Topics will be annotated first at the document level, after which on-topic documents will be annotated for phrase- to passage-level relevance.
See the relevance assessment guidelines for annotators.
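The two-stage flow described above can be sketched as a small selection step. This is only an illustration: the function name `select_for_passage_assessment` is hypothetical, and which granularity values trigger a passage-level pass is left as a caller-supplied parameter because the source does not list them.

```python
from typing import Dict, List, Set


def select_for_passage_assessment(
    granularity: str,
    doc_judgments: Dict[str, bool],
    passage_granularities: Set[str],
) -> List[str]:
    """Pick the documents that should go on to passage-level annotation.

    granularity: the topic's "granularity" metadata value.
    doc_judgments: document ID -> True if judged on-topic at the document level.
    passage_granularities: values assumed to call for a passage-level pass.
    """
    # Stage 1 has already produced document-level judgments.
    if granularity not in passage_granularities:
        return []  # this topic needs no finer-grained pass
    # Stage 2: only on-topic documents are annotated further.
    return sorted(doc_id for doc_id, on_topic in doc_judgments.items() if on_topic)
```

For example, with judgments `{"d1": True, "d2": False}` and a granularity that is in `passage_granularities`, only `d1` would be passed on for passage-level annotation.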
Additional links
- HARD Timeline
- HARD Project Overview at the University of Massachusetts
- NIST website