Category: Assessment

  • Why is it so difficult to make reasonable adjustments when assessing disabled PGRs?


    Universities are required under the Equality Act 2010 to make reasonable adjustments for disabled students. While it’s often much clearer how to do this for undergraduate students and postgraduate taught students who have coursework and written exams – for example, by giving them extra time or a scribe – support for postgraduate research (PGR) students is far behind.

    Many universities and staff are less clear how to make adjustments for PGRs during supervision, when reading drafts of thesis chapters, and then for the traditional oral viva, which is problematic for many as it relies on instantaneous cognitive processing, fluency and other skills. The Abrahart vs University of Bristol case, in which a student died by suicide after being refused reasonable adjustments to a mode of assessment, highlighted just how critical this issue has become.

    Some universities and academics have expressed concerns that making adjustments for disabled PGR students will somehow “disadvantage” non-disabled students. This misunderstands the provisions of the Equality Act. Reasonable adjustments are a unique legal duty in relation to disability which go some way towards reducing the barriers that disabled people encounter on a daily basis.

    Cultural barriers

    Cultural beliefs – including that PGR study is “supposed to be difficult”, that overcoming the struggle is part of the achievement of obtaining a doctorate, and that adjustments devalue the doctorate – all contribute to unhelpful attitudes towards disabled PGRs and institutions meeting their legal obligations. The still widely held view that a doctorate is training the next generations of academics, limited oversight on progression, lack of consistent training for examiners and supervisors, and the closed-door nature of the viva indicate the cultural nature of many of the barriers.

    The recent work within universities on research culture, equality, diversity and inclusion, and widening participation has in many cases focused on everything other than disability. Where disability is considered, it’s often in relation to neurodivergence. Neurodivergent people may find themselves objects of fascination or considered difficult and a problem to be solved, rarely simply as human beings trying to navigate their way through a society which seems to have suddenly noticed they exist but is still reluctant to make the necessary changes.

    At the PhD viva, the examiners’ experience is often centred – rigid arrangements and the presumed importance of meeting examiners’ expectations take priority, leaving disabled PhD students without voice or agency, or making them feel demanding for simply pointing out that they have legal rights which universities must meet.

    Mode of assessment or competence standard?

    The Disabled Students Commitment Competence Standards Guide clarifies that the Equality Act’s duty to make reasonable adjustments to any provision, criterion or practice (PCP) which places disabled students at a substantial (i.e. more than minor or trivial) disadvantage applies to modes of assessment. It is an indictment of entrenched cultural attitudes in the sector that it took the death of a student, denied adjustments to which she was legally entitled, for this distinction to be clarified.

    Many in HE defend the current approach to PhD assessment as being a necessary way of assessing the types of skills a PGR would need as an academic. However, the QAA level 8 descriptors don’t specify a particular mode of assessment, or that the ability to communicate “ideas and conclusion clearly and effectively to specialist and non-specialist audiences” relates to academic contexts either solely or primarily, nor do they specify that assessment relates to whether or not examiners believe the candidate is “ready” for employment as a lecturer.

    The purpose of PhD assessment is to assess whether a candidate meets the assessment criteria to be awarded a doctoral degree. While the question as to whether these level 8 descriptors remain appropriate to assess a PhD may be valid, introducing additional unspoken criteria such as assumptions about academic career readiness is unacceptable for all students, but particularly so for disabled PGRs due to the constant demands on them and cognitive load required to navigate an already unclear system.

    Unhelpfully, the QAA characteristics statement for doctoral degrees asserts that “all doctoral candidates experience a similar format – that is, an assessment of the thesis followed by the closed oral examination.” This could conflict with the legal requirement to adjust assessment for disabled and neurodivergent students, and sits uneasily with the Quality Code on Assessment, which reflects the importance of inclusive assessment allowing every student to demonstrate their achievements, “with no group or individual disadvantaged”.

    Sharing this reasoning and information is fundamental to changing entrenched misunderstandings in the sector about what we’re actually assessing in the PhD viva and how to approach that assessment.

    What needs to be done?

    Making adjustments for individual PGR vivas is time consuming when many adjustments could be made as standard (a “universal design” approach), releasing time to focus on making a smaller number of less commonly required adjustments. Many adjustments are easy to make: holding the viva in a ground floor room, linking to already existing accessibility information, limits on the length of the viva with compulsory breaks, ensuring there are toilets nearby, training for examiners, and options about the viva format.

    While many PGRs are content with the traditional oral viva, others would prefer a written option (for many years the standard option in Australasia) or a hybrid option with written questions in advance of a shorter oral viva. Universities often raise AI assistance as being a reason that an oral viva is necessary. However, this is best addressed through policies, training and declarations of authorship, rather than relying solely on an oral viva.

    Feedback from delegates at a webinar on inclusive vivas which we delivered – hosted by UKCGE – underlined the need for clarity of expectations, standard approaches to adjustments, and training for everyone involved in the PGR journey to understand the requirements of the Equality Act 2010. Adjustments for “visible” disabilities are often easier to understand and make – it would be difficult to deny a deaf PGR a British Sign Language interpreter.

    Where disabilities are less visible, cultural attitudes seem more difficult to shift and the needed adjustments harder to make. Revisions to sector documents, such as the doctoral degrees characteristics statement, are also overdue.

    Put simply, it’s not reasonable to deny a student the award of a degree that their research warrants due to an inappropriate mode of assessment.

    The authors would like to thank Charlotte Round, Head of Service for Disability Support at the University of Nottingham, for her involvement.


  • Students must intentionally develop durable skills to thrive in an AI-dominated world



    As AI increasingly automates technical tasks across industries, students’ long-term career success will rely less on technical skills alone and more on durable skills or professional skills, often referred to as soft skills. These include empathy, resilience, collaboration, and ethical reasoning – skills that machines can’t replicate.

    This critical need is outlined in Future-Proofing Students: Professional Skills in the Age of AI, a new report from Acuity Insights. Drawing on a broad body of academic and market research, the report provides an analysis of how institutions can better prepare students with the professional skills most critical in an AI-driven world.

    Key findings from the report:

    • 75 percent of long-term job success is attributed to professional skills, not technical expertise.
    • Over 25 percent of executives say they won’t hire recent graduates due to lack of durable skills.
    • COVID-19 disrupted professional skill development, leaving many students underprepared for collaboration, communication, and professional norms.
    • Eight essential durable skills must be intentionally developed for students to thrive in an AI-driven workplace.

    “Technical skills may open the door, but it’s human skills like empathy and resilience that endure over time and lead to a fruitful and rewarding career,” says Matt Holland, CEO at Acuity Insights. “As AI reshapes the workforce, it has become critical for higher education to take the lead in preparing students with these skills that will define their long-term success.”

    The eight critical durable skills include:

    • Empathy
    • Teamwork
    • Communication
    • Motivation
    • Resilience
    • Ethical reasoning
    • Problem solving
    • Self-awareness

    These competencies don’t expire with technology – they grow stronger over time, helping graduates adapt, lead, and thrive in an AI-driven world.

    The report also outlines practical strategies for institutions, including assessing non-academic skills at admissions using Situational Judgment Tests (SJTs), and shares recommendations on embedding professional skills development throughout curricula and forming partnerships that bridge AI literacy with interpersonal and ethical reasoning.



  • ACT and Texas Instruments Collaborate to Enhance Student Success in Mathematics


    Iowa City, Iowa and Dallas, Texas (November 12, 2025) – ACT, a leader in college and career readiness assessment, and Texas Instruments Education Technology (TI), a division of the global semiconductor company, today announced a comprehensive partnership aimed at empowering students to achieve their best performance on the ACT mathematics test.

    This initiative brings together two education leaders to provide innovative resources and tools that maximize student potential. The partnership will start by providing:

    • A new dedicated online resource center featuring co-branded instructional videos demonstrating optimal use of TI calculators during the ACT mathematics test.
    • Additional study materials featuring TI calculators to help students build upon and apply their mathematical knowledge while maximizing their time on the ACT test.
    • Professional development programs for teachers focused on effective calculator-based testing strategies.

    “This partnership represents our commitment to providing students with the tools and resources they need to demonstrate their mathematical knowledge effectively,” said Andrew Taylor, Senior Vice President of Educational Solutions and International, ACT. “By working with Texas Instruments, we’re ensuring students have access to familiar, powerful technology tools during this important assessment.”

    “Texas Instruments is proud to partner with ACT to support student success,” said Laura Chambers, President at Texas Instruments Education Technology. “Our calculator technology, combined with targeted instructional resources, will help students showcase their true mathematical abilities during the ACT test.” 

    The new resources are available now to students and educators on the ACT website www.act.org under ACT Math Calculator Tips.

    About ACT

    ACT is transforming college and career readiness pathways so that everyone can discover and fulfill their potential. Grounded in more than 65 years of research, ACT’s learning resources, assessments, research, and work-ready credentials are trusted by students, job seekers, educators, schools, government agencies, and employers in the U.S. and around the world to help people achieve their education and career goals at every stage of life. Visit us at https://www.act.org/.  

    About Texas Instruments

    Texas Instruments Education Technology (TI) — the gold standard for excellence in math — provides exam-approved graphing calculators and interactive STEM technology. TI calculators and accessories drive student understanding and engagement without adding to online distractions. We are committed to empowering teachers, inspiring students and supporting real learning in classrooms everywhere. For more information, visit education.ti.com.

    Texas Instruments Incorporated (Nasdaq: TXN) is a global semiconductor company that designs, manufactures and sells analog and embedded processing chips for markets such as industrial, automotive, personal electronics, enterprise systems and communications equipment. At our core, we have a passion to create a better world by making electronics more affordable through semiconductors. This passion is alive today as each generation of innovation builds upon the last to make our technology more reliable, more affordable and lower power, making it possible for semiconductors to go into electronics everywhere. Learn more at TI.com.



  • Algorithms aren’t the problem. It’s the classification system they support


    The Office for Students (OfS) has published its annual analysis of sector-level degree classifications over time, and alongside it a report on Bachelors’ degree classification algorithms.

    The former is of the style (and with the faults) we’ve seen before. The latter is the controversial bit, both to the extent to which parts of it represent a “new” set of regulatory requirements, and a “new” set of rules over what universities can and can’t do when calculating degree results.

    Elsewhere on the site my colleague David Kernohan tackles the regulation issue – the upshots of the “guidance” on the algorithms, including what it will expect universities to do both to algorithms in use now, and if a provider ever decides to revise them.

    Here I’m looking in detail at its judgements over two practices. Universities are, to all intents and purposes, being banned from any system which discounts credits with the lowest marks – a practice which the regulator says makes it difficult to demonstrate that awards reflect achievement.

    It’s also ruling out “best of” algorithm approaches – any universities that determine degree class by running multiple algorithms and selecting the one that gives the highest result will also have to cease doing so. Any provider still using these approaches by 31 July 2026 has to report itself to OfS.

    Powers and process do matter, as do questions as to whether this is new regulation, or merely a practical interpretation of existing rules. But here I’m concerned with the principle. Has OfS got a point? Do systems such as those described above amount to misleading people who look at degree results over what a student has achieved?

    More, not less

    A few months ago now on Radio 4’s More or Less, I was asked how Covid had impacted university students’ attainment. On a show driven by data, I was wary about admitting that as a whole, I think it would be fair to say that UK HE isn’t really sure.

    When in-person everything was cancelled back in 2020, universities scrambled to implement “no detriment” policies that promised students wouldn’t be disadvantaged by the disruption.

    Those policies took various forms – some guaranteed that classifications couldn’t fall below students’ pre-pandemic trajectory, others allowed students to select their best marks, and some excluded affected modules entirely.

    By 2021, more than a third of graduates were receiving first-class honours, compared to around 16 per cent a decade earlier – with ministers and OfS on the march over the risk of “baking in” the grade inflation.

    I found that pressure troubling at the time. It seemed to me that for a variety of reasons, providers may have, as a result of the pandemic, been confronting a range of faults with degree algorithms – for the students, courses and providers that we have now, it was the old algorithms that were the problem.

    But the other interesting thing for me was what those “safety net” policies revealed about the astonishing diversity of practice across the sector when it comes to working out the degree classification.

    For all of the comparison work done – including, in England, official metrics on the Access and Participation Dashboard over disparities in “good honours” awarding – I was wary about admitting to Radio 4’s listeners that it’s not just differences in teaching, assessment and curriculum that can drive someone getting a First here and a 2:2 up the road.

    When in-person teaching returned in 2022 and 2023, the question became what “returning to normal” actually meant. Many – under regulatory pressure not to “bake in” grade inflation – removed explicit no-detriment policies, and the proportion of firsts and upper seconds did ease slightly.

    But in many providers, many of the flexibilities introduced during Covid – around best-mark selection, module exclusions and borderline consideration – had made explicit and legitimate what was already implicit in many institutional frameworks. And many were kept.

    Now, in England, OfS is to all intents and purposes banning a couple of the key approaches that were deployed during Covid. For a sector that prizes its autonomy above almost everything else, that’ll trigger alarm.

    But a wider look at how universities actually calculate degree classifications reveals something – the current system embodies fundamentally different philosophies about what a degree represents, philosophies that produce systematically different outcomes for identical student performance, and philosophies that should not be written off lightly.

    What we found

    Building on David Allen’s exercise seven years ago, a couple of weeks ago I examined the publicly available degree classification regulations for more than 150 UK universities, trawling through academic handbooks, quality assurance documents and regulatory frameworks.

    The shock for the Radio 4 listener on the Clapham Omnibus would be that there is no standardised national system with minor variations – instead, there is a patchwork of fundamentally different approaches to calculating the same qualification.

    Almost every university claims to use the same framework for UG quals – the Quality Assurance Agency benchmarks, the Framework for Higher Education Qualifications and standard grade boundaries of 70 for a first, 60 for a 2:1, 50 for a 2:2 and 40 for a third. But underneath what looks like consistency there’s extraordinary diversity in how marks are then combined into final classifications.

    The variations cluster around a major divide. Some universities – predominantly but not exclusively in the Russell Group – operate on the principle that a degree classification should reflect the totality of your assessed work at higher levels. Every module (at least at Level 5 and 6) counts, every mark matters, and your classification is the weighted average of everything you did.

    Other universities – predominantly post-1992 institutions but with significant exceptions – take a different view. They appear to argue that a degree classification should represent your actual capability, demonstrated through your best work.

    Students encounter setbacks, personal difficulties and topics that don’t suit their strengths. Assessment should be about demonstrating competence, not punishing every misstep along a three-year journey.

    Neither philosophy is obviously wrong. The first prioritises consistency and comprehensiveness. The second prioritises fairness and recognition that learning isn’t linear. But they produce systematically different outcomes, and the current system does allow both to operate under the guise of a unified national framework.

    Five features that create flexibility

    Five structural features appear repeatedly across university algorithms, each pushing outcomes in one direction.

    1. Best-credit selection

    This first one has become widespread, particularly outside the Russell Group. Rather than using all module marks, many universities allow students to drop their worst performances.

    One uses the best 105 credits out of 120 at each of Levels 5 and 6. Another discards the lowest 20 credits automatically. A third takes only the best 90 credits at each level. Several others use the best 100 credits at each stage.

    The rationale is obvious – why should one difficult module or one difficult semester define an entire degree?

    But the consequence is equally obvious. A student who scores 75-75-75-75-55-55 across six modules averages 68.3 per cent. At universities where everything counts, that’s a 2:1. At universities using best-credit selection that drops the two 55s, it averages 75 – a clear first.
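
    To see the arithmetic at work, here is a minimal Python sketch of a best-credit rule, using the hypothetical six-module profile above; the credit volumes shown (best 105 of 120, and a stricter rule that drops the two weakest modules) echo the examples in the text rather than any named university’s actual regulations.

    ```python
    # Minimal sketch: how a best-credit rule changes a classification average.
    # Marks and credit volumes are illustrative, not any provider's real rules.

    def weighted_average(modules):
        """Credit-weighted mean of (mark, credits) pairs."""
        total = sum(credits for _, credits in modules)
        return sum(mark * credits for mark, credits in modules) / total

    def best_credits(modules, credits_to_count):
        """Keep the highest-marked credits up to the required volume; drop the rest."""
        kept, counted = [], 0
        for mark, credits in sorted(modules, key=lambda m: m[0], reverse=True):
            take = min(credits, credits_to_count - counted)
            if take <= 0:
                break
            kept.append((mark, take))
            counted += take
        return kept

    # Six 20-credit modules: four at 75, two at 55 (the profile used in the text).
    modules = [(75, 20)] * 4 + [(55, 20)] * 2

    print(round(weighted_average(modules), 1))                     # 68.3 -> a 2:1 where everything counts
    print(round(weighted_average(best_credits(modules, 105)), 1))  # 70.2 -> a First under best-105-of-120
    print(round(weighted_average(best_credits(modules, 80)), 1))   # 75.0 -> a First if both 55s are dropped
    ```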

    Best-credit selection is the majority position among post-92s, but virtually absent at Russell Group universities. OfS is now pretty much banning this practice.

    The case against rests on B4.2(c) (academic regulations must be “designed to ensure” awards are credible) and B4.4(e) (credible means awards “reflect students’ knowledge and skills”). Discounting credits with the lowest marks “excludes part of a student’s assessed achievement” and so:

    …may result in a student receiving a class of degree that overlooks material evidence of their performance against the full learning outcomes for the course.

    2. Multiple calculation routes

    These take that principle further. Several universities calculate your degree multiple ways and award whichever result is better. One runs two complete calculations – using only your best 100 credits at Level 6, or taking your best 100 at both levels with 20:80 weighting. You get whichever is higher.

    Another offers three complete routes – unweighted mean, weighted mean and a profile-based method. Students receive the highest classification any method produces.
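
    As a rough sketch of the mechanics (with hypothetical level averages, and route definitions loosely following the two examples above), the “best of” step is simply a max over however many calculations the regulations allow:

    ```python
    # Minimal sketch of a "best of several routes" classification.
    # Level averages and the 20:80 weighting are illustrative assumptions.

    def classify(mark):
        if mark >= 70: return "First"
        if mark >= 60: return "2:1"
        if mark >= 50: return "2:2"
        return "Third" if mark >= 40 else "Fail"

    def best_of_routes(level5_mean, level6_mean):
        route_a = level6_mean                            # final level only
        route_b = 0.2 * level5_mean + 0.8 * level6_mean  # 20:80 across both levels
        return max(route_a, route_b)                     # the student gets whichever is higher

    l5, l6 = 58.0, 71.0
    print(classify(0.2 * l5 + 0.8 * l6))     # 68.4 -> 2:1 under a single 20:80 route
    print(classify(best_of_routes(l5, l6)))  # 71.0 -> First once the better route is selected
    ```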

    For those holding onto their “standards”, this sort of thing is mathematically guaranteed to inflate outcomes. You’re measuring the best possible interpretation of what students achieved, not what they achieved every time. As a result, comparison across institutions becomes meaningless. Again, this is now pretty much being banned.

    This time, the case against is that:

    …the classification awarded should not simply be the most favourable result, but the result that most accurately reflects the student’s level of achievement against the learning outcomes.

    3. Borderline uplift rules

    What happens on the cusps? Borderline uplift rules create all sorts of discretion around the theoretical boundaries.

    One university automatically uplifts students to the higher class if two-thirds of their final-stage credits fall within that band, even if their overall average sits below the threshold. Another operates a 0.5 percentage point automatic uplift zone. Several maintain 2.0 percentage point consideration zones where students can be promoted if profile criteria are met.

    If 10 per cent of students cluster around borderlines and half are uplifted, that’s a five per cent boost to top grades at each boundary – the cumulative effect is substantial.

    One small and specialist provider plays the counterfactual – when it gained degree-awarding powers, it explicitly removed all discretionary borderline uplift. The boundaries are fixed – and it argues this is more honest than trying to maintain discretion that inevitably becomes inconsistent.

    OfS could argue borderline uplift breaches B4.2(b)’s requirement that assessments be “reliable” – defined as requiring “consistency as between students.”

    When two students with 69.4% overall averages receive different classifications (one uplifted to First, one remaining 2:1) based on mark distribution patterns or examination board discretion, the system produces inconsistent outcomes for identical demonstrated performance.

    But OfS avoids this argument, likely because it would directly challenge decades of established discretion on borderlines – a core feature of the existing system. Eliminating all discretion would conflict with professional academic judgment practices that the sector considers fundamental, and OfS has chosen not to pick that fight.

    4. Exit acceleration

    Heavy final-year weighting amplifies improvement while minimising early difficulties. Where deployed, the near-universal pattern is now 25 to 30 per cent for Level 5 and 70 to 75 per cent for Level 6. Some institutions weight even more heavily, with year three counting for 60 per cent of the final mark.

    A student who averages 55 in year two and 72 in year three gets 67.75 overall with a 25:75 weighting – a 2:1. A student who averages 72 in year two and 55 in year three gets 59.25 – just short of a 2:1.

    The magnitude of change is identical – it’s just that the direction differs. The system structurally rewards late bloomers and penalises any early starters who plateau.
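
    The same asymmetry in a couple of lines, using the illustrative 25:75 split from the example above:

    ```python
    # Minimal sketch of exit-velocity weighting: identical pairs of year averages
    # produce different outcomes depending on which year is the stronger one.
    # The 25:75 Level 5 : Level 6 split is the illustrative weighting from the text.

    def final_mark(year2_mean, year3_mean, w2=0.25, w3=0.75):
        return w2 * year2_mean + w3 * year3_mean

    print(final_mark(55, 72))  # 67.75 -> a 2:1: a weak start and strong finish is rewarded
    print(final_mark(72, 55))  # 59.25 -> just short of a 2:1: a strong start and weak finish is not
    ```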

    OfS could argue that 75 per cent final-year weighting breaches B4.2(a)’s requirement for “appropriately comprehensive” assessment. B4 Guidance 335M warns that assessment “focusing only on material taught at the end of a long course… is unlikely to provide a valid assessment of that course,” and heavy (though not exclusive) final-year emphasis arguably extends this principle – if the course’s subject matter is taught across three years, does minimising assessment of two-thirds of that teaching constitute comprehensive evaluation?

    But OfS doesn’t make this argument either, likely because year weighting is explicit in published regulations, often driven by PSRB requirements, and represents settled institutional choices rather than recent innovations. Challenging it would mean questioning established pedagogical frameworks rather than targeting post-hoc changes that might mask grade inflation.

    5. First-year exclusion

    Finally, with a handful of institutional and PSRB exceptions, excluding the first year from classification is now pretty much universal, removing what used to be the bottom tail of performance distributions.

    While this is now so standard it seems natural, it represents a significant structural change from 20 to 30 years ago. You can score 40s across the board in first year and still graduate with a first if you score 70-plus in years two and three.

    Combine it with other features, and the interaction effects compound. At universities using best 105 credits at each of Levels 5 and 6 with 30:70 weighting, only 210 of 360 total credits – 58 per cent – actually contribute to your classification. And so on.

    OfS could argue first-year exclusion breaches comprehensiveness requirements – when combined with best-credit selection, only 210 of 360 total credits (58 per cent) might count toward classification. But the practice is now so close to universal, with only a handful of institutional and PSRB exceptions, that OfS treats it as neutral accepted practice rather than a compliance concern.

    Targeting something this deeply embedded across the sector would face overwhelming institutional autonomy defences and would effectively require the sector to reinstate a practice it collectively abandoned over the past two decades.

    OfS’s strategy is to focus regulatory pressure on recent adoptions of “inherently inflationary” practices rather than challenging longstanding sector-wide norms.

    Institution type

    Russell Group universities generally operate on the totality-of-work philosophy. Research-intensives typically employ single calculation methods, count all credits and maintain narrow borderline zones.

    But there are exceptions. One I’ve seen has automatic borderline uplift that’s more generous than many post-92s. Another’s 2.0 percentage point borderline zone adds substantial flexibility. If anything, the pattern isn’t uniformity of rigour – it’s uniformity of philosophy.

    One London university has a marks-counting scheme rather than a weighted average – what some would say is the most “rigorous” system in England. And two others – you can guess who – don’t fit this analysis at all, with subject-specific systems and no university-wide algorithms.

    Post-1992s systematically deploy multiple flexibility features. Best-credit selection appears at roughly 70 per cent of post-92s. Multiple calculation routes appear at around 40 per cent of post-92s versus virtually zero per cent at research-intensive institutions. Several post-92s have introduced new, more flexible classification algorithms in the past five years, while Russell Group frameworks have been substantially stable for a decade or more.

    This difference reflects real pressures. Post-92s face acute scrutiny on student outcomes from league tables, OfS monitoring and recruitment competition, and disproportionately serve students from disadvantaged backgrounds with lower prior attainment.

    From one perspective, flexibility is a cynical response to metrics pressure. From another, it’s recognition that their students face different challenges. Both perspectives contain truth.

    Meanwhile, Scottish universities present a different model entirely, using GPA-based calculations across SCQF Levels 9 and 10 within four-year degree structures.

    The Scottish system is more internally standardised than the English system, but the two are fundamentally incompatible. As OfS attempts to mandate English standardisation, Scottish universities will surely refuse, citing devolved education powers.

    London is a city with maximum algorithmic diversity within minimum geographic distance. Major London universities use radically different calculation systems despite competing for similar students. A student with identical marks might receive a 2:1 at one, a first at another and a first with a higher average at a third, purely because of algorithmic differences.

    What the algorithm can’t tell you

    The “five features” capture most of the systematic variation between institutional algorithms. But they’re not the whole story.

    First, they measure the mechanics of aggregation, not the standards of marking. A 65 per cent essay at one university may represent genuinely different work from a 65 per cent at another. External examining is meant to moderate this, but the system depends heavily on trust and professional judgment. Algorithmic variation compounds whatever underlying marking variation exists – but marking standards themselves remain largely opaque.

    Second, several important rules fall outside the five-feature framework but still create significant variation. Compensation and condonement rules – how universities handle failed modules – differ substantially. Some allow up to 30 credits of condoned failure while still classifying for honours. Others exclude students from honours classification with any substantial failure, regardless of their other marks.

    Compulsory module rules also cut across the best-credit philosophy. Many universities mandate that dissertations or major projects must count toward classification even if they’re not among a student’s best marks. Others allow them to be dropped. A student who performs poorly on their dissertation but excellently elsewhere will face radically different outcomes depending on these rules.

    In a world where huge numbers of students now have radically less module choice than they did just a few years ago as a result of cuts, they would have reason to feel doubly aggrieved if modules they never wanted to take in the first place will now count when they didn’t last week.

    Several universities use explicit credit-volume requirements at each classification threshold. A student might need not just a 60 per cent average for a 2:1, but also at least 180 credits at 60 per cent or above, including specific volumes from the final year. This builds dual criteria into the system – you need both the average and the profile. It’s philosophically distinct from borderline uplift, which operates after the primary calculation.
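
    A hypothetical sketch of that dual test (the 60 per cent average and 180-credit volume figures are the illustrative ones above; real regulations layer final-year volume rules on top):

    ```python
    # Minimal sketch of a dual-criteria threshold: the credit-weighted mean and the
    # volume of credits at or above the class boundary must both clear the bar.
    # Figures are illustrative, not any university's actual regulations.

    def meets_upper_second(modules, boundary=60, min_credits_at_boundary=180):
        total = sum(credits for _, credits in modules)
        mean = sum(mark * credits for mark, credits in modules) / total
        credits_at_or_above = sum(credits for mark, credits in modules if mark >= boundary)
        return mean >= boundary and credits_at_or_above >= min_credits_at_boundary

    # 240 counted credits: a 64.4 average, but only 100 credits marked at 60 or above.
    profile = [(72, 100), (59, 140)]
    print(meets_upper_second(profile))  # False: the mean clears 60 but the volume test does not
    ```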

    And finally, treatment of reassessed work varies. Nearly all universities cap resit marks at the pass threshold, but some exclude capped marks from “best credit” calculations while others include them. For students who fail and recover, this determines whether they can still achieve high classifications or are effectively capped at lower bands regardless of their other performance.

    The point isn’t so much that I (or OfS) have missed the “real” drivers of variation – the five features genuinely are the major structural mechanisms. But the system’s complexity runs deeper than any five-point list can capture. When we layer compensation rules onto best-credit selection, compulsory modules onto multiple calculation routes, and volume requirements onto borderline uplift, the number of possible institutional configurations runs into the thousands.

    The transparency problem

    Every day’s a school day at Wonkhe, but what has been striking for me is quite how difficult the information has been to access and compare. Some institutions publish comprehensive regulations as dense PDF documents. Others use modular web-based regulations across multiple pages. Some bury details in programme specifications. Several have no easily locatable public explanation at all.

    UUK’s position on this, I’d suggest, is something of a stretch:

    University policies are now much more transparent to students. Universities are explaining how they calculate the classification of awards, what the different degree classifications mean and how external examiners ensure consistency between institutions.

    Publication cycles vary unpredictably, cohort applicability is often ambiguous, and cross-referencing between regulations, programme specifications and external requirements adds layers upon layers of complexity. The result is that meaningful comparison is effectively impossible for anyone outside the quality assurance sector.

    This opacity matters because it masks that non-comparability problem. When an employer sees “2:1, BA in History” on a CV, they have no way of knowing whether this candidate’s university used all marks or selected the best 100 credits, whether multiple calculation routes were available or how heavily final-year work was weighted. The classification looks identical regardless. That makes it more, not less, likely that they’ll just go on prejudices and league tables – regardless of the TEF medal.

    We can estimate the impact conservatively. Year one exclusion removes perhaps 10 to 15 per cent of the performance distribution. Best-credit selection removes another five to 10 per cent. Heavy final-year weighting amplifies improvement trajectories. Multiple calculation routes guarantee some students shift up a boundary. Borderline rules uplift perhaps three to five per cent of the cohort at each threshold.

    Stack these together and you could shift perhaps 15 to 25 per cent of students up one classification band compared to a system that counted everything equally with single-method calculation and no borderline flexibility. Degree classifications are measuring as much about institutional algorithm choices as about student learning or teaching quality.

    Yes, but

    When universities defend these features, the justifications are individually compelling. Best-credit selection rewards students’ strongest work rather than penalising every difficult moment. Multiple routes remove arbitrary disadvantage. Borderline uplift reflects that the difference between 69.4 and 69.6 per cent is statistically meaningless. Final-year emphasis recognises that learning develops over time. First-year exclusion creates space for genuine learning without constant pressure.

    None of these arguments is obviously wrong. Each reflects defensible beliefs about what education is for. The problem is that they’re not universal beliefs, and the current system allows multiple philosophies to coexist under a facade of equivalence.

    Post-92s add an equity dimension – their flexibility helps students from disadvantaged backgrounds who face greater obstacles. If standardisation forces them to adopt strict algorithms, degree outcomes will decline at institutions serving the most disadvantaged students. But did students really learn less, or attain to a “lower” standard?

    The counterargument is that if the algorithm itself makes classifications structurally easier to achieve, you haven’t promoted equity – you’ve devalued the qualification. And without the sort of smart, skills and competencies based transcripts that most of our pass/fail cousins across Europe adopt, UK students end up choosing between a rock and a hard place – if only they were conscious of that choice.

    The other thing that strikes me is that the arguments I made in December 2020 for “baking in” grade inflation haven’t gone away just because the pandemic has. If anything, the case for flexibility has strengthened as the cost of living crisis, inadequate maintenance support and deteriorating student mental health create circumstances that affect performance through no fault of students’ own.

    Students are working longer hours in paid employment to afford rent and food, living in unsuitable accommodation, caring for family members, and managing mental health conditions at record levels. The universities that retained pandemic-era flexibilities – best-credit selection, generous borderline rules, multiple calculation routes – aren’t being cynical about grade inflation. They’re recognising that their students disproportionately face these obstacles, and that a “totality-of-work” philosophy systematically penalises students for circumstances beyond their control rather than assessing what they’re actually capable of achieving.

    The philosophical question remains – should a degree classification reflect every difficult moment across three years, or should it represent genuine capability demonstrated when circumstances allow? Universities serving disadvantaged students have answered that question one way – research-intensive universities serving advantaged students have answered it another.

    OfS’s intervention threatens to impose the latter philosophy sector-wide, eliminating the flexibility that helps students from disadvantaged backgrounds show their “best selves” rather than punishing them for structural inequalities that affect their week-to-week performance.

    Now what

    As such, a regulator seeking to intervene faces an interesting challenge with no obviously good options – albeit one of its own making. Another approach might have been to cap the most egregious practices – prohibit triple-route calculations, limit best-credit selection to 90 per cent of total credits, cap borderline zones at 1.5 percentage points.

    That would eliminate the worst outliers while preserving meaningful autonomy. The sector would likely comply minimally while claiming victory, but oodles of variation would remain.

    A stricter approach would be mandating identical algorithms – but would provoke rebellion. Devolved nations would refuse, citing devolved powers and triggering a constitutional confrontation. Research-intensive universities would mount legal challenges on academic freedom grounds, if they’re not preparing to do so already. Post-92s would deploy equity arguments, claiming standardisation harms universities serving disadvantaged students.

    A politically savvy but inadequate approach might have been mandatory transparency rather than prescription. Requiring universities to publish algorithms in standardised format with some underpinning philosophy would help. That might preserve autonomy while creating a bit of accountability. Maybe competitive pressure and reputational risk will drive voluntary convergence.

    But universities will resist even being forced to quantify and publicise the effects of their grading systems. They’ll argue it undermines confidence and damages the UK’s international reputation.

    Given the diversity of courses, providers, students and PSRBs, algorithms also feel like a weird thing to standardise. I can make a much better case for a defined set of subject awards, a shared governance framework (including subject benchmark statements, related PSRBs and degree algorithms) than I can for tightening standardisation in isolation.

    The fundamental problem is that the UK degree classification system was designed for a different age, a different sector and a different set of students. It was probably a fiction to imagine that sorting everyone into First, 2:1, 2:2 and Third was possible even 40 years ago – but today, it’s such obvious nonsense that without richer transcripts, it just becomes another way to drag down the reputation of the sector and its students.

    Unfit for purpose

    In 2007, the Burgess Review – commissioned by Universities UK itself – recommended replacing honours degree classifications with detailed achievement transcripts.

    Burgess identified the exact problems we have today – considerable variation in institutional algorithms, the unreliability of classification as an indicator of achievement, and the fundamental inadequacy of trying to capture three years of diverse learning in a single grade.

    The sector chose not to implement Burgess’s recommendations, concerned that moving away from classifications would disadvantage UK graduates in labour markets “where the classification system is well understood.”

    Eighteen years later, the classification system is neither well understood nor meaningful. A 2:1 at one institution isn’t comparable to a 2:1 at another, but the system’s facade of equivalence persists.

    The sector chose legibility and inertia over accuracy and ended up with neither – sticking with a system that protected institutional diversity while robbing students of the ability to show off theirs. As we see over and over again, a failure to fix the roof when the sun was shining means reform may now arrive externally imposed.

    Now the regulator is knocking on the conformity door, there’s an easy response. OfS can’t take an annual pop at grade inflation if most of the sector abandons the outdated and inadequate degree classification system. Nothing in the rules seems to mandate it, some UG quals don’t use it (think regulated professional bachelors), and who knows where the White Paper’s demand for meaningful exit awards at Levels 4 and 5 fits into all of this.

    Maybe we shouldn’t be surprised that a regulator that oversees a meaningless and opaque medal system with a complex algorithm that somehow boils an entire university down to “Bronze”, “Silver”, “Gold” or “Requires Improvement” is keen to keep hold of the equivalent for students.

    But killing off the dated relic would send a really powerful signal – that the sector is committed to developing the whole student, explaining their skills and attributes and what’s good about them – rather than pretending that the classification makes the holder of a 2:1 “better” than those with a Third, and “worse” than those with a First.


  • Students taking resits need specific support


    In an era where higher education emphasises retention, progression, and student success, there remains a striking omission in policy and practice: how best to support students who are struggling to meet their course requirements.

    We talk confidently about inclusion, engagement and student voice but for students required to resit exams, the reality is often isolation, confusion, and a lack of meaningful academic contact. This is not just a pastoral concern – it’s a strategic failure.

    The hidden cost of resits

    Every summer, thousands of students across the UK undertake resit assessments. Failing to pass the second time around can delay progression or, in some cases, threaten continuation. To provide a sense of scale, it has been estimated that somewhere between five and 25 per cent of students need to resit at least one assessment during their degree – this could amount to around 90,000 students or more.

    In many institutions, including my own at the University of Manchester, the resit period overlaps with a time when many academic staff are away or busy with other things. It is at a time (for us, in late August) when there are no structured teaching activities, and likely minimal tailored guidance. These students are often left navigating complex academic demands while juggling paid work, accommodation issues, and other commitments with little support beyond generic study tips. It’s a recipe for disengagement.

    Resits are rarely discussed in pedagogic terms, and almost never in policy conversations. This topic remains under-explored, under-theorised, and under-supported. Yet, resits are pivotal moments in students’ lives, with a clear link to continuation and completion. So why do we treat them as an afterthought?

    What students told us

    To better understand the support gaps, we ran a student-partnered inquiry at the University of Manchester, focusing on students’ experiences of resits. We set out to work with students to understand how they experience resits and what support might help them succeed the second time around.

    Using thematic analysis, we drew out three main themes from our discussions. Our findings weren’t surprising, but they were striking. Students reported a lack of academic contact over the summer period, feeling “out of touch and isolated”. Students struggled with concerns about how to improve their knowledge and felt unclear about what doing better looked like. And critically, they lacked confidence in their own ability to succeed.

    Importantly, students weren’t necessarily asking for more support, but they were asking for the right support. Generic toolkits and peer mentoring were rated as the least useful support strategies. Instead, what they valued was targeted feedback, clarity about expectations, and a sense of continued connection to their course and teaching team.

    What needs to change

    If institutions are serious about retention and inclusive education, they need to take resits seriously, and students undertaking resits need specific pedagogic support. This means embedding revision and review into regular teaching, providing personalised feedback that explicitly supports second attempts, and recognising the resit period as a time when academic confidence is likely to be low and meaningful academic contact can make or break motivation and self-efficacy.

    Our findings suggest that students facing resits are not a homogenous group. They are individuals each navigating their own set of academic, emotional, and logistical challenges. Critically, the strategies they value most are those that give them insight into their own performance and actionable ways to improve.

    More broadly, we need to challenge the idea that resits are just a student problem. Whether a resit is seen as a hurdle, a second chance, or a psychological burden has implications for how we structure and support our students. Resits are an organisational issue where institutional priorities, academic calendars, and staffing models collide to create patchy and inconsistent support.

    Resits should not be a footnote in our academic policies. They are a critical part of the learning journey for many students, and we need to examine both university-led and individual-led strategies of support. We also need to talk to students who don’t pass their resits. What support was missing? Were the barriers academic, personal, or structural? And crucially, what interventions might have made a difference?

    We need sector-wide conversations about what effective resit support looks like, how it is resourced, and who is responsible. Research on this is scarce, but growing (you can read more about our student-partnered inquiry in our recently published Advance HE case study).

    Taking resits seriously is not about lowering standards. It’s about recognising that failure, when properly supported, may even serve as a pedagogical “leg up” for learning. However, when left unsupported, it risks becoming the moment students fall through the cracks.


  • Dialogic assessments are the missing piece in contemporary assessment debates


    When I ask apprentices to reflect on their learning in professional discussions, I often hear a similar story:

    It wasn’t just about what I knew – it was how I connected it all. That’s when it clicked.

    That’s the value of dialogic assessment. It surfaces hidden knowledge, creates space for reflection, and validates professional judgement in ways that traditional essays often cannot.

    Dialogic assessment shifts the emphasis from static products – the essay, the exam – to dynamic, real-time engagement. These assessments include structured discussions, viva-style conversations, or portfolio presentations. What unites them is their reliance on interaction, reflection, and responsiveness in the moment.

    Unlike “oral exams” of old, these conversations require learners to explain reasoning, apply knowledge, and reflect on lived experience. They capture the complex but authentic process of thinking – not just the polished outcome.

    In Australia, “interactive orals” have been adopted at scale to promote integrity and authentic learning, with positive feedback from staff and students. Several UK universities have piloted viva-style alternatives to traditional coursework with similar results. What apprenticeships have long taken for granted is now being recognised more widely: dialogue is a powerful form of assessment.

    Lessons from apprenticeships

    In apprenticeships and work-based learning, dialogic assessment is not an add-on – it’s essential. Apprentices regularly take part in professional discussions (PDs) and portfolio presentations as part of both formative and end-point assessment.

    What makes them so powerful? They are inclusive, as they allow different strengths to emerge. Written tasks may favour those fluent in academic conventions, while discussions reveal applied judgement and reflective thinking. They are authentic, in that they mirror real workplace activities such as interviews, stakeholder reviews, and project pitches. And they can be transformative – apprentices often describe PDs as moments when fragmented knowledge comes together through dialogue.

    One apprentice told me:

    It wasn’t until I talked it through that I realised I knew more than I thought – I just couldn’t get it down on paper.

    For international students, dialogic assessment can also level the playing field by valuing applied reasoning over written fluency, reducing the barriers posed by rigid academic writing norms.

    My doctoral research has shown that PDs not only assess knowledge but also co-create it. They push learners to prepare more deeply, reflect more critically, and engage more authentically. Tutors report richer opportunities for feedback in the process itself, while employers highlight their relevance to workplace practice.

    And AI fits into this picture too. When ChatGPT and similar tools emerged in late 2022, many feared the end of traditional written assessment. Universities scrambled for answers – detection software, bans, or a return to the three-hour exam. The risk has been a slide towards high-surveillance, low-trust assessment cultures.

    But dialogic assessment offers another path. Its strength is precisely that it asks students to do what AI cannot:

    • authentic reflection, as learners connect insights to their own lived experience.
    • real-time reasoning – learners respond to questions, defend ideas, and adapt on the spot.
    • professional identity, where the kind of reflective judgement expected in real workplaces is practised.

    Assessment futures

    Scaling dialogic assessment isn’t without hurdles. Large cohorts and workload pressures can make universities hesitant. Online viva formats also raise equity issues for students without stable internet or quiet environments.

    But these challenges can be mitigated: clear rubrics, tutor training, and reliable digital platforms make it possible to mainstream dialogic formats without compromising rigour or inclusivity. Apprenticeships show it can be done at scale – thousands of students sit PDs every year.

    Crucially, dialogic assessment also aligns neatly with regulatory frameworks. The Office for Students requires that assessments be valid, reliable, and representative of authentic learning. The QAA Quality Code emphasises inclusivity and support for learning. Dialogic formats tick all these boxes.

    The AI panic has created a rare opportunity. Universities can either double down on outdated methods – or embrace formats that are more authentic, equitable, and future-oriented.

    This doesn’t mean abandoning essays or projects altogether. But it could mean ensuring every programme includes at least one dialogic assessment – whether a viva, professional discussion, or reflective dialogue.

    Apprenticeships have demonstrated that dialogic assessments are effective. They are rigorous, scalable, and trusted. Now is the time for the wider higher education sector to recognise their value – not as a niche alternative, but as a core element of assessment in the AI era.


  • How teachers and administrators can overcome resistance to NGSS



    Although the Next Generation Science Standards (NGSS) were released more than a decade ago, adoption of them varies widely in California. I have been to districts that have taken the standards and run with them, but others have been slow to get off the ground with NGSS – even 12 years after their release. In some cases, this is due to a lack of funding, a lack of staffing, or even administrators’ lack of understanding of the active, student-driven pedagogies championed by the NGSS.

    Another potential challenge to implementing NGSS with fidelity comes from teachers’ and administrators’ epistemological beliefs – simply put, their beliefs about how people learn. Teachers bring so much of themselves to the classroom, and that means teaching in a way they think is going to help their students learn. So, it’s understandable that teachers who have found success with traditional lecture-based methods may be reluctant to embrace an inquiry-based approach. It also makes sense that administrators who are former teachers will expect classrooms to look the same as when they were teaching, which may mean students sitting in rows, facing the front, writing down notes.

    Based on my experience as both a science educator and an administrator, here are some strategies for encouraging both teachers and administrators to embrace the NGSS.

    For teachers: Shift expectations and embrace ‘organized chaos’

    A helpful first step is to approach the NGSS not as a set of standards, but rather a set of performance expectations. Those expectations include all three dimensions of science learning: disciplinary core ideas (DCIs), science and engineering practices (SEPs), and cross-cutting concepts (CCCs). The DCIs reflect the things that students know, the SEPs reflect what students are doing, and the CCCs reflect how students think. This three-dimensional approach sets the stage for a more active, engaged learning environment where students construct their own understanding of science content knowledge.

    To meet expectations laid out in the NGSS, teachers can start by modifying existing “recipe labs” into a more inquiry-based model that emphasizes student construction of knowledge. Resources like the NGSS-aligned digital curriculum from Kognity can simplify classroom implementation by giving teachers options for personalized instruction. Additionally, the Wonder of Science can help teachers integrate real-life phenomena into their NGSS-aligned labs, providing students with real-life contexts in which to build an understanding of the scientific concepts they are studying. Lastly, Inquiry Hub offers open-source full-year curricula that can also aid teachers with refining their labs, classroom activities, and assessments.

For these updated labs to serve their purpose, teachers will need to reframe classroom management expectations to focus on student engagement and discussion. This may mean embracing what I call “organized chaos.” Over time, teachers will build a sense of efficacy through small successes, whether that’s spotting a student constructing their own knowledge or documenting an increased depth of knowledge in an entire class. The objective is to build on student understanding across the entire classroom, which teachers can do with much more confidence if they know that their administrators support them.

    For administrators: Rethink evaluations and offer support

    A recent survey found that 59 percent of administrators in California, where I work, understood how to support teachers with implementing the NGSS. Despite this, some administrators may need to recalibrate their expectations of what they’ll see when they observe classrooms. What they might see is organized chaos happening: students out of their seats, students talking, students engaged in all different sorts of activities. This is what NGSS-aligned learning looks like. 

To provide a clear focus on student-centered learning indicators, administrators can revise observation rubrics to align with the NGSS, or make their lives easier and use this one. As administrators track their teachers’ NGSS implementation, it also helps to monitor teachers’ confidence levels. There will always be early implementers who take something new and run with it, and these educators can be inspiring models for those who are less eager to change.

    The overall goal for administrators is to make classrooms safe spaces for experimentation and growth. The more administrators understand about the NGSS, the better they can support teachers in implementing it. They may not know all the details of the DCIs, SEPs, and CCCs, but they must accept that the NGSS require students to be more active, with the teacher acting as more of a facilitator and guide, rather than the keeper of all the knowledge.

Based on my experience in both teaching and administration roles, I can say that constructivist science classrooms may look and sound different – with more student talk, more questioning, and more chaos. By understanding these differences and supporting teachers through this transition, administrators can help ensure that all California students develop the deeper scientific thinking that the NGSS were designed to foster.

    Source link

  • The case for collaborative purchasing of digital assessment technology

    The case for collaborative purchasing of digital assessment technology

Higher education in the UK has a solid track record of leveraging scale when purchasing digital content and licences through Jisc. But when it comes to purchasing specific technology platforms, higher education institutions have tended to go their own way, using distinct specifications tailored to their particular needs.

There are some benefits to this individualistic approach; otherwise it would not have become the status quo. But as the Universities UK taskforce on transformation and efficiency proclaims a “new era of collaboration”, some of the long-standing assumptions about what can work in a sharing economy are being dusted off and held up to the light to see if they still hold. Efficiency – including finding ways to realise new forms of value with less overall resource input – is no longer a nice-to-have; it’s essential if the sector is to remain sustainable.

At Jisc, licensing manager Hannah Lawrence is thinking about how the sector’s digital services agency can build on existing approaches to collective procurement towards more systematic collaboration. In her case, that means exploring ideas for a collaborative procurement route for technology that supports assessment and feedback. Digital assessment is a compelling area for possible collaboration, partly because the operational challenges – such as exam security, scalability, and accessibility – are fairly consistent between institutions, but also because of the shared pedagogical challenge of designing robust assessments that take account of the opportunities and risks of generative AI technology.

The potential value in collaboration isn’t just in cost savings – it’s also about working together to test and pilot approaches, and to share insight and good practice. “Collaboration works best when it’s built on trust, not just transaction,” says Hannah. “We’re aiming to be transparent and open, respecting the diversity of the sector, and making collaboration sustainable by demonstrating real outcomes and upholding data handling standards and ethics.” Hannah predicts that it may take several years to develop an initial iteration of a joint procurement mechanism, in collaboration with a selection of vendors, and recognises that the approach could evolve over time to offer “best in class” products at a competitive price to institutions that participate in collective procurement.

    Reviewing the SIKTuation

    One way of learning how to build this new collaborative approach is to look to international examples. In Norway, SIKT is the higher education sector’s shared services agency. SIKT started with developing a national student information system, and has subsequently rolled out, among other initiatives, national scientific and diploma archives, and a national higher education application system – and a national tender for digital assessment.

In its first iteration, when the technology for digital assessment was still evolving, three different vendors were appointed, but in the most recent version SIKT appointed a single vendor – UNIwise – as the preferred supplier of digital assessment for all of Norwegian higher education. Universities in Norway are not required to follow the SIKT framework, of course, but there are significant advantages to doing so.

    “Through collaboration we create a powerful lobby,” says Christian Moen Fjære, service manager at SIKT. “By procuring for 30,000 staff and 300,000 students we can have a stronger voice and influence with vendors on the product development roadmap – much more so than any individual university. We can also be collectively more effective in sharing insight across the network, like sample exam questions, for example.” SIKT does not hold views about how students should be taught, but as pedagogy and technology become increasingly intertwined, SIKT’s discussions with vendors are typically informed by pedagogical developments. Christian explains, “You need to know what you want pedagogically to create the specification for the technical solution – you need to think what is best for teaching and assessment and then we can think how to change software to reflect that.”

    For vendors, it’s obviously great to be able to sell your product at scale in this way but there’s more to it than that – serving a critical mass of buyers gives vendors the confidence to invest in developing their product, knowing it will meet the needs of their customers. Products evolve in response to long-term sector need, rather than short-term sales goals.

    SIKT can also flex its muscles in negotiating favourable terms with vendors, and use its expertise and experience to avoid pitfalls in negotiating contracts. A particularly pertinent example is on data sharing, both securing assurances of ethical and anonymous sharing of assessment data, and clarity about ultimate ownership of the data. Participants in the network can benefit from a shared data pool, but all need to be confident both that the data will be handled appropriately and that ultimately it belongs to them, not the vendor. “We have baked into the latest requirements the ability to claw back data – we didn’t have this before, stupid, right?” says Christian. “But you learn as the needs arise.”

    Difference and competition

    In the UK context, the sector needs reassurance that diversity will be accommodated – there’s a wariness of anything that looks like it might be a one-size-fits-all model. While the political culture in Norway is undoubtedly more collectivist than in the UK, Norwegian higher education institutions have distinct missions, and they still compete for prestige and to recruit the best students and staff.

SIKT acknowledges these differences through a detailed consultation process in the creation of national tenders – a “pre-project” on the list of requirements for any technology platform, followed by formal consultation on the final list, overseen by a steering group with diverse sector representation. But at the end of the day, to realise the value of joining up there does need to be some preparedness to compromise, or, to put it another way, to find and build on areas of similarity rather than dwelling on what are often minor differences. Having a coordinating body like SIKT convene the project helps to navigate these issues. And, of course, some institutions simply decide to go another way, and pay more for a more tailored product. There is nothing stopping them from doing so.

    As far as SIKT is concerned, competition between institutions is best considered in the academic realm, in subjects and provision, as that is what benefits the student. For operations, collaboration is more likely to deliver the best results for both institutions and students. But SIKT remains agnostic about whether specific institutions have a different view. “We don’t at SIKT decide what counts as competitive or not,” says Christian. “Universities will decide for themselves whether they want to get involved in particular frameworks based on whether they see a competitive advantage or some other advantage from doing so.”

The medium-term horizon for the UK sector, based on current discussions, is a much more networked approach to the purchase and use of technology to support learning and teaching – though it’s worth noting that there is nothing stopping consortia of institutions getting together to negotiate a shared set of requirements with a particular vendor pending the development of national frameworks. There’s no reason to think the learning curve needs to be especially steep: while some of the technical elements could require a bit of thinking through, the sector has a longstanding commitment to sharing and collaboration on high quality teaching and learning, and to some extent what’s being talked about right now is mostly a matter of joining the dots between one domain and another.

This article is published in association with UNIwise. For further information about UNIwise and the opportunity to collaborate, contact Tim Peers, Head of Partnerships.

    Source link

  • NAEP scores for class of 2024 show major declines, with fewer students college ready

    NAEP scores for class of 2024 show major declines, with fewer students college ready

    This story was originally published by Chalkbeat. Sign up for their newsletters at ckbe.at/newsletters.

    Students from the class of 2024 had historically low scores on a major national test administered just months before they graduated.

Results from the National Assessment of Educational Progress, or NAEP, released September 9, show that scores for 12th graders declined in math and reading for all but the highest-performing students, and that the gap between high and low performers in math widened. More than half of these students reported being accepted into a four-year college, but the test results indicate that many of them are not academically prepared for college, officials said.

    “This means these students are taking their next steps in life with fewer skills and less knowledge in core academics than their predecessors a decade ago, and this is happening at a time when rapid advancements in technology and society demand more of future workers and citizens, not less,” said Lesley Muldoon, executive director of the National Assessment Governing Board. “We have seen progress before on NAEP, including greater percentages of students meeting the NAEP proficient level. We cannot lose sight of what is possible when we use valuable data like NAEP to drive change and improve learning in U.S. schools.”

    These results reflect similar trends seen in fourth and eighth grade NAEP results released in January, as well as eighth grade science results also released Tuesday.

    In a statement, Education Secretary Linda McMahon said the results show that federal involvement has not improved education, and that states should take more control.

    “If America is going to remain globally competitive, students must be able to read proficiently, think critically, and graduate equipped to solve complex problems,” she said. “We owe it to them to do better.”

    The students who took this test were in eighth grade in March of 2020 and experienced a highly disrupted freshman year of high school because of the pandemic. Those who went to college would now be entering their sophomore year.

    Roughly 19,300 students took the math test and 24,300 students took the reading test between January and March of 2024.

    The math test measures students’ knowledge in four areas: number properties and operations; measurement and geometry; data analysis, statistics, and probability; and algebra. The average score was the lowest it has been since 2005, and 45% of students scored below the NAEP Basic level, even as fewer students scored at NAEP Proficient or above.

    NAEP Proficient typically represents a higher bar than grade-level proficiency as measured on state- and district-level standardized tests. A student scoring in the proficient range might be able to pick the correct algebraic formula for a particular scenario or solve a two-dimensional geometric problem. A student scoring at the basic level likely would be able to determine probability from a simple table or find the population of an area when given the population density.

Only students in the 90th percentile — the highest-achieving students — didn’t see a decline, and the gap between high- and low-performing students in math was wider than on any previous assessment.

    This gap between high and low performers appeared before the pandemic, but has widened in most grade levels and subject areas since. The causes are not entirely clear but might reflect changes in how schools approach teaching as well as challenges outside the classroom.

    Testing officials estimate that 33% of students from the class of 2024 were ready for college-level math, down from 37% in 2019, even as more students said they intended to go to college.

    In reading, students similarly posted lower average scores than on any previous assessment, with only the highest performing students not seeing a decline.

    The reading test measures students’ comprehension of both literary and informational texts and requires students to interpret texts and demonstrate critical thinking skills, as well as understand the plain meaning of the words.

A student scoring at the basic level likely would understand the purpose of a persuasive essay, for example, or the reaction of a potential audience, while a student scoring at the proficient level would be able to describe why the author made certain rhetorical choices.

    Roughly 32% of students scored below NAEP Basic, 12 percentage points higher than students in 1992, while fewer students scored above NAEP Proficient. An estimated 35% of students were ready for college-level work, down from 37% in 2019.

    In a survey attached to the test, students in 2024 were more likely to report having missed three or more days of school in the previous month than their counterparts in 2019. Students who miss more school typically score lower on NAEP and other tests. Higher performing students were more likely to say they missed no days of school in the previous month.

    Students in 2024 were less likely to report taking pre-calculus, though the rates of students taking both calculus and algebra II were similar in 2019 and 2024. Students reported less confidence in their math abilities than their 2019 counterparts, though students in 2024 were actually less likely to say they didn’t enjoy math.

Students also reported lower confidence in their reading abilities. At the same time, higher percentages of students than in 2019 reported that their teachers asked them to do more sophisticated tasks, such as identifying evidence in a piece of persuasive writing, and fewer students reported a low interest in reading.

    Chalkbeat is a nonprofit news site covering educational change in public schools.

    Source link

  • We cannot address the AI challenge by acting as though assessment is a standalone activity

    We cannot address the AI challenge by acting as though assessment is a standalone activity

    How to design reliable, valid and fair assessment in an AI-infused world is one of those challenges that feels intractable.

The scale and extent of the task, it seems, outstrip the available resource to deal with it. In these circumstances it is always worth stepping back to reframe, perhaps reconceptualise, exactly what the problem is. Is our framing too narrow? Have we succeeded (yet) in perceiving its most salient aspects?

As an educational development professional seeking to support institutional policy and learning and teaching practices, I’ve been part of numerous discussions within and beyond my institution. At first, we framed the problem as a threat to the integrity of universities’ power to reliably and fairly award degrees and to certify levels of competence. How do we safeguard this authority and credibly certify learning when the evidence we collect that learning has taken place can be mimicked so easily, and the mimicry is so hard to detect?

    Seen this way the challenge is insurmountable.

    But this framing positions students as devoid of ethical intent, love of learning for its own sake, or capacity for disciplined “digital professionalism”. It also absolves us of the responsibility of providing an education which results in these outcomes. What if we frame the problem instead as a challenge of AI to higher education practices as a whole and not just to assessment? We know the use of AI in HE ranges widely, but we are only just beginning to comprehend the extent to which it redraws the basis of our educative relationship with students.

    Rooted in subject knowledge

    I’m finding that some very old ideas about what constitutes teaching expertise and how students learn are illuminating: the very questions that expert teachers have always asked themselves are in fact newly pertinent as we (re)design education in an AI world. This challenge of AI is not as novel as it first appeared.

    Fundamentally, we are responsible for curriculum design which builds students’ ethical, intellectual and creative development over the course of a whole programme in ways that are relevant to society and future employment. Academic subject content knowledge is at the core of this endeavour and it is this which is the most unnerving part of the challenge presented by AI. I have lost count of the number of times colleagues have said, “I am an expert in [insert relevant subject area], I did not train for this” – where “this” is AI.

    The most resource-intensive need that we have is for an expansion of subject content knowledge: every academic who teaches now needs a subject content knowledge which encompasses a consideration of the interplay between their field of expertise and AI, and specifically the use of AI in learning and professional practice in their field.

It is only on the basis of this enhanced subject content knowledge that we can then go on to ask: what preconceptions are my students bringing to this subject matter? What prior experience and views do they have about AI use? What precisely will be my educational purpose? How will students engage with this through a newly adjusted repertoire of curriculum and teaching strategies? The task of HE remains a matter of comprehending a new reality and then designing for the comprehension of others. Perhaps the difference now is that the journey of comprehension is even more collaborative and even less finite than it once would have seemed.

    Beyond futile gestures

All this is not to say that the specific challenge of ensuring that assessment is valid disappears. A universal need for all learners is to develop a capacity for qualitative judgement and to learn to seek, interpret and critically respond to feedback about their own work. AI may well assist in some of these processes, but developing students’ agency, competence and ethical use of it is arguably a prerequisite. In response to this conundrum, some colleagues suggest a return to the in-person examination – if only as a baseline for establishing, in a valid way, students’ levels of understanding.

    Let’s leave aside for a moment the argument about the extent to which in-person exams were ever a valid way of assessing much of what we claimed. Rather than focusing on how we can verify students’ learning, let’s emphasise more strongly the need for students themselves to be in touch with the extent and depth of their own understanding, independently of AI.

    What if we reimagined the in-person high stakes summative examination as a low-stakes diagnostic event in which students test and re-test their understanding, capacity to articulate new concepts or design novel solutions? What if such events became periodic collaborative learning reviews? And yes, also a baseline, which assists us all – including students, who after all also have a vested interest – in ensuring that our assessments are valid.

    Treating the challenge of AI as though assessment stands alone from the rest of higher education is too narrow a frame – one that consigns us to a kind of futile authoritarianism which renders assessment practices performative and irrelevant to our and our students’ reality.

There is much work to do in expanding subject content knowledge, in reimagining our curricula, and in reconfiguring assessment design at programme level so that it redraws our educative relationship with students. Assessment, more than ever, has to become a common endeavour rather than something we “provide” to students. A focus on how we conceptualise the trajectory of students’ intellectual, ethical and creative development is inescapable if we are serious about tackling this challenge in a meaningful way.

    Source link