At the recent e-Assessment Association International Conference, RM brought together a recent A-level candidate, an awarding body's Chief Academic Officer, and an education specialist.
Prompted by RM’s Chief Operating Officer, they discussed how candidates of various ages, roles and pathways experience digital assessment, and how their differing needs are met.
Who is digital assessment for?
Digital native. Two words that cover an enormous number of people. Many of them are at a stage in their lives when high-stakes assessments happen, whether in an academic context or as part of career development. As an industry, it’s tempting to think of them as a homogeneous group who naturally prefer, and are best served by, on-screen assessment. However, does shunting everyone who meets the age criteria into digital testing make as much sense as forcing everyone with size 7 feet to wear the same pair of shoes?
In the discussion, Teagan, the recent A-level candidate, revealed the nuances awarding bodies could miss if they opted for a purely digital approach. For her, enthusiasm for digital assessment varied by subject. For subjects requiring long-form answers, such as history, her preference was for digital assessment. However, some subjects, like maths, require a candidate to arrive at an answer through a process of working out. In those cases, Teagan’s preference was for paper-based responses.
Fluency isn't generational, it's contextual
Even a single candidate’s digital fluency is not static. It depends on context. The digital assessment industry may be designing around an average digital native candidate who doesn’t exist. And as Lt. Gilbert S Daniels demonstrated to the US Air Force, if you design for the average person, you design for no one.
Teachers’ attitudes to digital assessment also cluster according to subject. In Rita Bateson’s experience as a head teacher and with the International Baccalaureate (IB) programme, languages and literature teachers can be more accepting of aspects of digital assessment. Their maths colleagues are more likely to push back on attempts to introduce it.
The divide between the enthusiasts and the detractors, then, is not necessarily along generational lines. Rather, it is tied to the nature of the discipline being assessed and, crucially, what the assessment intends to measure.
Progress is cautious, rightly so
If assessment systems default to a generalised ‘candidates want digital now’ assumption, they risk being well suited to some subjects and candidates, but badly suited to others. The negative impacts would be most severe for those disciplines and entrants that do not fit the template.
Some of the general qualification (GQ) awarding bodies that seemed to be pushing digital assessment adoption forward now appear to have rowed back from full-scale adoption. Rita cited the example of the IB’s Diploma Programme. Had it met its original goal, candidates would have been taking on-screen assessments for several years by now. In fact, limited digital exams will take place for the first time this year. One of the world leaders in digital GQ testing, the New Zealand Qualifications Authority, still offers candidates a choice between digital and paper. Not every candidate chooses the digital route, but take-up is increasing over time: from 28% in 2022 to 73% by the November 2024 session, when over 70 subjects were successfully delivered digitally, including 14 native language exams.
What’s driving the slower-than-expected pace of change? Rita suggested that it could be an acknowledgment of unanticipated concerns about candidate experience and user interface. In the classroom, teachers have yet to look beyond a ‘paper on glass’ approach. Without fully understanding the impact and possibilities of digital assessment, or appreciating the level of thought the industry has applied to the topic, those preparing candidates can retreat to their comfort zone of preferring paper-based testing.
Gráinne Watson from RM plc; Teagan MacLeod, A-level student; Eleanor Andressen from Trinity College; and Rita Bateson from Eblana Learning
“Digital” can mean different things for PQ and GQ
On professional qualifications (PQ), the panel acknowledged that in some contexts the mode of assessment can and should reflect the mode of the task or craft being assessed. Eleanor described how in her main domain, performing arts, the assessment artefact can be a recording of a performance, sent for marking to an assessor who is not present for the performance itself. The process contains digital elements, but not in the same way that a GQ awarding organisation might define digital assessment.
Gráinne brought up the example of other PQ domains such as accountancy. Digital tools, such as certain software packages, can be deeply embedded in the work candidates do. To exclude those tools from assessment creates an artificial gap between the exam and real-life practice. In such cases, the desire to match the exam to the work can drive the adoption of digital assessment.
Learners preparing for high-stakes GQ exams have digital tools constantly available to them in daily study, the most prominent now being generative AI. In contrast to PQ environments, GQ candidates cannot use those tools for assessments, whether exams or coursework.
Professional qualification assessment is evolving to mirror how professionals actually work with their digital tools of the trade. In most cases GQ assessment is not, or at least not at the same pace. The assumptions based around the needs of the average GQ candidate could be reinforcing that gap rather than closing it. Rita gave a view often expressed by those in the schools sector and shared by others on the panel: that a constant process of redefining what’s important based on the demands of industry is counterproductive. On the other hand, the scales cannot tip too far into institutional resistance either. Gráinne confirmed that, in her experience, it’s a balance that every accreditor is trying to strike in order to do the right thing by candidates and those who teach them.
Not everyone places the same value on trust and transparency
Candidates need to trust that the system that assesses them will be fair and secure. For GQ candidates, coming to the end of several years’ instruction by figures of authority, that trust is hard-wired. Teagan trusts the exam boards to mark papers correctly; to her, their accumulated authority is not worth questioning.
Eleanor gave an example of how candidate engagement with the mechanisms of the assessment, facilitated by their teachers, produced positive results. Research from Switzerland showed that transparency about assessment criteria changed, and arguably improved, how students engaged. The candidates appreciated the openness about what was expected of them and their work improved accordingly.
However, transparency and trust-building measures that work for one cohort may not work for all. A more experienced, more invested PQ candidate may be more likely to seek out and use the information. By contrast, as suggested by Teagan’s comments, those measures may not even register with a teenage GQ candidate who has never had reason to question the system, and has no real point of comparison.
Abundant transparency about the assessment criteria has its potential drawbacks, of course. With an appreciation of an algorithm or other mechanism behind a digital assessment comes the temptation to game it. Both candidates and their teachers may, albeit subconsciously, change their behaviour to meet its demands.
The use of AI by awarding bodies for marking looms large over questions of transparency and trust in the assessment process. Rita raised the EU AI Act’s classification of AI use as high risk in systems “intended to evaluate learning outcomes”, with severe penalties in prospect for those that get it wrong. The stakes are also high for any individual candidate whose assessments are marked in that way. A single grade matters disproportionately to the person receiving it, however the system was designed.

Design to help the edge cases, not a mythical average
The panel’s experiences and views demonstrated that digital fluency, tool-access expectations, and trust in the system all vary. They do not fluctuate along one neat generational line, but across subject, role, pathway and individual experience, sometimes within the same person.
Gráinne closed by recounting a recent conversation she had had with representatives of a national-level awarding body. In it they discussed how micro-credentialling can give candidates the best opportunity to prove their abilities regardless of pathway. By breaking qualifications down into smaller chunks, could awarding bodies use a wider mix of digital and paper-based assessments, each designed to meet a more specific set of circumstances?
The acknowledged good intentions of everyone in the assessment community, applied to an imagined average candidate, can still miss the people at the edges of that average. Who is the "average candidate" assessment systems are currently designed for? And are we serving the majority who sit outside it as well as possible?