What does authentic assessment mean in the context of AI?

In recent discussions about assessment in higher education, responses to generative AI have converged on a familiar solution: authentic assessment.

The term is typically used to describe assessment tasks that resemble real-world or professional practice. In principle, such tasks are intended to reduce opportunities for misconduct and strengthen the relevance of learning.

This instinct is sound. Assessment that connects meaningfully to the contexts in which knowledge will be applied has genuine pedagogical value. But authenticity alone is not enough. Without accountability, authentic assessment risks becoming a reassuring label rather than a meaningful guarantee of learning.

In the context of AI, authentic assessment is often positioned as a safeguard. The underlying assumption is that sufficiently contextualised, complex, or experiential tasks will be resistant to automation, thereby preserving academic integrity. There is something to this. Well-designed authentic tasks can create conditions in which understanding matters more than output.

The difficulty is that authenticity, treated as a sufficient condition for integrity, rests on increasingly fragile foundations.

In the real world

Authentic assessment assumes that the “real world” serves as a stable reference point. Assessments are designed to simulate professional practice on the premise that such practice represents the authentic endpoint of education. Yet across professions, AI tools are already embedded in routine work. Lawyers draft contracts with AI assistance. Journalists use AI for ideation, drafting, and editing. Software developers rely on AI-generated code. In many domains, working without AI is no longer representative of professional practice. If authenticity is defined as mirroring real-world conditions, then authentic assessment must necessarily accommodate AI use, which means authenticity alone cannot do the work of verifying understanding.

This is where the concept risks becoming what some have dubbed a “thought-terminating cliché”, not because authentic assessment is without value, but because invoking authenticity as though it resolves the challenge of AI can obscure the deeper question: how do we know that a student understands what they have produced? An authentic task sets the right conditions. But we need to go further.

There is also a deeper structural issue. Historically, many assessments appeared “authentic” not because unaided cognitive work was inherently valued, but because effective shortcuts were unavailable or easily detected. Authenticity was often a by-product of constraint rather than a deliberate pedagogical principle. As those constraints erode, authenticity needs a companion principle, one that addresses what authenticity, by itself, cannot – the relationship between the student and the work.

From authenticity to accountability

Rather than abandoning authenticity, I propose that we complement it with accountability. Where authenticity shapes the nature of the task, grounding it in meaningful, real-world contexts, accountability addresses the nature of the evidence. An accountable assessment focuses on the extent to which students can substantiate, explain, and take responsibility for what they produce, regardless of whether AI tools were involved.

Under this framework, the central question is not whether work was produced unaided, but whether the student can demonstrate understanding that extends beyond the submitted artefact. Authenticity asks, does this task reflect how knowledge is actually used? Accountability asks, can this student show they understand what they’ve done and why?

Together, these two principles form a more complete basis for assessment in an AI-rich environment. Authentic tasks without accountability can produce polished outputs that mask shallow understanding. Accountable assessment without authenticity risks becoming a series of interrogations detached from meaningful practice. Both are needed.

Accountable assessment rests on three criteria. First, the assessment needs to be defensible: the student can justify decisions, explain reasoning, and take responsibility for the outcomes of their work. Second, it needs to be traceable: the student can demonstrate how their thinking developed over time, including revisions, abandoned approaches, and the role of tools such as AI. And third, it needs to be answerable: the student can respond to challenge, questioning, or critique from educators, peers, or practitioners, showing that understanding is not confined to a polished final product.

Together, these criteria shift attention from the artefact itself to the student’s relationship with it. They do not diminish the importance of authentic task design; instead, they give it sharper purpose. An authentic task becomes the context within which accountability is demonstrated.

What this looks like in practice

Adopting an assessment that is both authentic and accountable has significant implications for design. For example, an oral defence (or viva) of written work, where students submit a written assignment and subsequently respond to questions that probe their reasoning, choices, and understanding, combines an authentic written task with an accountable demonstration of learning.

Assessment could also require students to document their learning processes, submitting drafts, notes, revision histories, design decisions, and, where relevant, records of AI interaction. The learning trajectory becomes assessable, not only the final submission. The task remains authentic and the process becomes accountable.

An interactive approach to assessment and feedback would distribute assessment across a course, allowing educators to engage with students’ thinking over time. Familiarity with a student’s reasoning reduces reliance on single high-stakes judgments and builds a richer picture of understanding within authentic disciplinary contexts.

Students could also be required to apply concepts in real time through problem-solving, demonstrations, simulations, or teaching others. Understanding is evidenced through action and explanation rather than solely through submitted artefacts, making it authentic in form and accountable in substance.

Weathering the change

These approaches are not immune to challenge. They demand time, interaction, and sustained engagement, and they sit uneasily within systems optimised for scale, anonymity, and efficiency.

Assessment that is both authentic and accountable cannot be implemented through anonymous submission and blind marking alone. It requires relational engagement between students and assessors. This has clear resource implications and raises questions about workload, equity, and institutional capacity.

Yet these constraints also clarify the stakes. If institutions wish to make meaningful claims about student understanding in an AI-rich environment, they must confront the limitations of assessment systems designed for a different technological era.

The emergence of generative AI does not merely challenge existing assessment practices; it exposes longstanding assumptions about authenticity, authorship, and integrity. Authenticity remains a valuable principle: it grounds assessment in the contexts where knowledge matters. But it was never designed to carry the full weight of verifying understanding, and in an era of generative AI, that weight has become unsustainable.

Accountability makes explicit what authenticity alone cannot guarantee: that assessment is ultimately a human judgment about learning, not a mechanical verification of process compliance. By pairing authentic task design with the principles of defensibility, traceability, and answerability, institutions can build assessment frameworks that acknowledge the reality of AI use while preserving the core educational aim of assessing understanding and knowledge creation.
