Can and should we use different assessments for different purposes?

07/01/2014
Professor Paul Newton, Professor of Educational Assessment, Institute of Education, University of London

Having agreed to post some thoughts in response to the question of whether we can and should use different assessments for the purposes of certificating students, school accountability and measuring system improvement, I turned to Andrew Hall’s opening blog for inspiration. Andrew is keen to encourage blue skies thinking about the future of educational assessment in England, and has invited us to start by considering “what a really great assessment system would look like” in a way that is “unbounded by the reality of how the system is today”. In an attempt to be constructively provocative, I decided to reflect upon the meaning of ‘blue skies’ thinking in this context.

Over the years, I’ve had plenty to say about the uses of educational assessments. I’ve warned that an assessment that is fit for one purpose may be substantially less fit for another and might be entirely unfit for others. I’ve explained that even a procedure intended specifically to measure system improvement could serve many different kinds of purpose, with each purpose implying quite different assessment design decisions. Presumably, then, blue skies thinking about the characteristics of a really great assessment system ought to conclude that it comprises multiple, discrete assessment procedures, each engineered to support a particular purpose. After all, a really great assessment system would be as fit as possible for each and every purpose; and maximum fitness across the range of different uses could only be guaranteed if the system incorporated a range of different assessment procedures.

Yet, if this is blue skies thinking about the future of educational assessment, then it is not for me. An inevitable risk of blue skies thinking is that we set our sights too high. A ‘really great’ system is probably too high an aspiration; a ‘good enough’ system is more realistic. When we aspire to a system that is good enough, we open our minds to trade-off, to the realistic appraisal of costs against benefits. Conversely, in the blue sky world, the temptation is to be overly simplistic and idealistic; for instance, to insist that an assessment system should do no harm. In the real world, we should be prepared to accept that any assessment system will inevitably do some harm; even though, on balance, its benefits ought significantly to outweigh its costs. Blue skies thinking tends, ironically, to be black and white. The real world is not like this. The real world is grey.

So I am an advocate of ‘grey skies’ thinking. Grey skies thinking welcomes messiness. It acknowledges that we struggle even to articulate our policy goals, let alone to agree upon them, or to agree how best to achieve them. Fundamental to grey skies thinking is not abstraction from the complexity of the real world, but immersion in it. It involves thinking through the potential consequences of alternative assessment approaches in as much detail as possible. It means attempting to anticipate potential ‘fault lines’ and to gauge their likely severity. It means attempting to identify a broad range of social and educational impacts from alternative assessment approaches and to gauge their likely prevalence. It means focusing public debate on the prioritisation of policy objectives: How important are the various decisions that need to be made on the basis of assessment results and, therefore, how much assessment inaccuracy are we prepared to tolerate? How serious are the various impacts associated with alternative assessment approaches and, therefore, how tolerant of them should we be? In other words, what are we prepared to compromise on, and what are we not prepared to compromise on? Grey skies thinking suggests that it may be more fruitful to start by considering the really calamitous rather than the really great.

So, returning to my brief, can and should we use different assessments for the purposes of certificating students, school accountability and measuring system improvement? As I mentioned earlier, one blue skies answer to this question is an emphatic ‘yes’ – which is to invoke the ‘maximum accuracy’ principle. But an equally legitimate blue skies answer is an emphatic ‘no’ – which is to invoke the ‘collect once, use more than once’ principle, as Ofsted recently put it. Both of these answers are overly simplistic. The grey skies answer is neither an emphatic ‘yes’ nor an emphatic ‘no’ because the real world is far more complicated and messy than that. To provide plausible answers to this question we need grey skies thinkers who are willing and able to grapple with the kind of comprehensive and typically uncomfortable cost-benefit analyses that are fundamental to good policy making.

What forms of assessment are most appropriate for different types of learning?

10/12/2013
Nansi Ellis, Assistant General Secretary (Policy), Association of Teachers and Lecturers

I was always quite good at exams. I know that to get good marks on this question I should identify some different types of learning, perhaps vocational and academic, practical and theoretical, skills-based, play-based, knowledge-based, and include some forms of assessment – observation, coursework, project work, written exam, viva – with some good explanations of why they work for each type of learning.

But there are dangers in trying to map particular forms of assessment to particular types of learning and assuming we’ve solved a problem. There are many forms of assessment we could be using that we don’t, and our blinkered approach is damaging pupils’ learning. By increasing teachers’ skills in designing and using assessment, and pupils’, employers’ and politicians’ understanding of the importance of assessment, we could expand the range of assessments without compromising their rigour.

There are many forms of assessment, but a lack of shared clarity over the purpose of assessment often means an assessment is used for too many purposes, which then distorts the assessment itself.

The prime purpose of assessment must be to support learning. Teachers assess their pupils all the time and are best placed to choose the form of assessment to suit the learning, if they have the skills to do so, and haven’t been browbeaten into using ‘optional tests’ and practice papers.

Formative assessment supports current learning, informing the learner, teacher, other teachers and parents. Summative assessment, and the resulting qualifications, supports learners to move on, informing employers, universities and colleges. Assessment helps teachers improve their teaching by understanding what pupils have learnt. And it helps governments to understand the impact of their policies on pupils’ learning. Each demands different measures, and different levels of reliability and validity.

Different methods can be used to assess what a learner knows, what they can do, whether they can apply their knowledge and skills in new situations. Employers often complain that employees have good exam grades but cannot write in work situations, or work as part of a team, or be creative. Our current system doesn’t prioritise the assessment of these things.

Increasingly, all learning is geared towards end-of-course exams – GCSEs and A-levels. This causes problems because we attempt to use the results to determine the future of students, teachers, schools and, potentially, the government. In the process we’ve forgotten to decide what our priorities are for the education system and the education of young people, and to choose the appropriate assessments.

Professor Mick Waters (formerly Director of Curriculum at the Qualifications and Curriculum Authority), in Thinking Allowed on Schooling, talks of holding ‘time trials’ instead of exams: “the student enters the room, is given a problem with three hours to solve it… Then like most people in business and industry, they would contact others, hold small meetings, get on the web… gradually provide solutions, test out their solutions with colleagues and eventually work towards the best answer possible”.

People learn in myriad ways and we corral people into separate pathways at our peril. By 2025, I hope we can balance a need for consistent data with the flexibility to allow students to learn in ways that work for them.

We need to move away from the assumption that the only way to assess with rigour is to test all pupils on the same day and in the same way. I challenge the assessment community to develop assessment methods that can give consistent results while enabling pupils to choose different ways of being assessed. They need to work with teachers to improve their assessment skills so they can help young people to use the appropriate assessments. And they need to provide the government with persuasive evidence that these forms of assessment can provide rigour without compromising student learning.