Contextualized Evaluations: Judging Language Model Responses to Underspecified Queries
Abstract
Language model users often issue queries that lack specification, where the context under which a query was issued—such as the user’s identity, the query’s intent, and the criteria for a response to be useful—is not explicit. For instance, a good r...
How do people cite this paper?
This paper's framing of underspecified queries and contextualized evaluation has been used to motivate research on accounting for social context in LLM assessments, to support arguments that human preferences are inherently context-dependent, to inform work on contextual faithfulness in reasoning models, and to underscore the need for alignment procedures that address the full spectrum of real-world, contextualized user interactions.