What can we learn from other uses of technology like flight simulators?

28/01/2014
Gareth Mills, Trustee, Futurelab, and Member, 21st Century Learning Alliance

Technology enhances human capability. It always has done. The telescope allowed us to see further and the microscope helped us to look closer. Coupled with our incredible human capacity to imagine, technological tools have helped to unlock the wonders of the universe and the secrets of our genetic make-up. The history of mankind is a story of ingenuity in the use of tools to solve problems and create new possibilities.

It is surprising, given the transformations seen in many other professions, that so little of genuine significance has been done to exploit technology in the field of educational assessment. What has happened is the automation of many of the easy-to-automate processes of traditional assessment. This includes the marking of multiple-choice questions and the crunching and analysis of big data. The application of technology has tended to serve the needs of administrative efficiency rather than trigger genuine transformation.

Without undermining what has been achieved to date, we might, by 2025, seek to harness technology to do more significant things.

So how might we use technology more imaginatively to see further and look closer? Let’s consider just three examples.

Even traditionalists tend to agree that sitting students in a hall to take pencil and paper tests is, at best, a proxy for something else we value much more. Whether students head for university or the world of work, employers and lecturers will value their capacity to manage themselves, show initiative, undertake research, think critically and creatively, work collaboratively and have good interpersonal skills. Employers also say that they look for qualities such as determination, optimism and emotional intelligence alongside competency in literacy and numeracy.

Modern conceptions of competency for future success in life include a wider set of attributes than can generally be found in the mark schemes of most GCSEs. Being fit for the future goes way beyond what can be captured adequately within three hours in an exam hall.

By 2025, one thing we should have explored is the use of scenarios and immersive environments in assessment. No doubt, some traditionalists will baulk at the suggestion; however, most of us feel reassured that the pilot flying our holiday jet has made good use of a flight simulator.  It is reassuring to know that the person at the controls has learned about the handling characteristics of the aircraft, practised how to deal with unusual weather conditions or mechanical failures and rehearsed landing at the world’s most difficult airports in a virtual environment. Immersive environments help to strengthen the authenticity of learning, they are dynamic enough to respond to the user and are able to test capability in many different contexts.

In medicine, the military and the health and safety industries, we are seeing a growth in the use of virtual environments to support learning. We can find examples in education too; however, nothing has yet made it into the mainstream or challenged the hegemony of traditional tests.

Is it too far-fetched to imagine that by 2025 educational assessment might be making use of rich on-screen scenarios to support learning and assessment? Shouldn’t we be using our ingenuity to make assessment more authentic, dynamic and contextually situated? As I write, however, policymakers seem to be marching in the opposite direction.

By 2025 we should also have made significant progress in the use of existing technology in assessment situations. How about, for example, the use of internet-enabled laptops in the exam hall? In Denmark they were piloting such initiatives years ago. With a set of challenging tasks and tracking software, the skills of searching, selection, synthesis, analysis, argument and presentation can all be evaluated alongside the application of knowledge. Such an approach would better reflect the way many will be expected to work in real life. We use tools, not to cheat, but as a way to increase our capacity for critical and creative thought.

By 2025 we will also have taken some technology-enabled assessments to scale. When and how did you take the theory section of your driving test? Since the early 2000s, candidates have taken an online test and a screen-based hazard perception test involving video clips and touch-sensitive surfaces. Of course, a hands-on practical driving test is also required before successful candidates are let loose on the roads. It seems like a well-balanced assessment to me: knowledge recall, perception testing and practical applied skills. Importantly, no one feels cheated even though candidates do not all sit the online test, or drive along the same roads, on the same day.

Perhaps in 2025 we might have more well-balanced, when-ready assessments rather than the set-piece, once-a-year, no-re-sits culture that drives assessment at the moment. If we can get technology-enabled assessment to scale in an important arena like driving, why not in others?

Despite media reports to the contrary, the UK has for many years been highly regarded for the quality of its public education and it is, consequently, a major exporter of educational services and assessments. I fear that, by allowing our system to ossify and by not keeping pace with innovation, we are in danger of missing a golden opportunity. As a country we need to be investing far more in R&D and developing new products and services to support high-quality learning and assessment. We should seek to become the ‘Silicon Valley’ of technology-enabled learning.

Technology itself, of course, is not a silver bullet. Like all tools, it is neutral. We can use a hammer to build or destroy. It is how we choose to use the tool that matters. We need to be at the leading edge in nurturing young people to develop the capacities they will need to flourish in life and work in the future. One way to do this will be through the use of technology coupled with, of course, that enduring human attribute… ingenuity.

How should assessment systems develop to meet the needs of the future?

13/01/2014
Andreas Schleicher, Deputy Director for Education and Skills and Special Advisor on Education Policy to the Secretary General, OECD

A generation ago, teachers could expect that what they taught would last their students a lifetime. Today, schools need to prepare students for jobs that have not yet been created, to use technologies that have not yet been invented, and to solve problems that we don’t yet know will arise. The dilemma for educators is that the kinds of things that are easy to teach and easy to test are also the kinds of things that are easy to digitise, automate and outsource. In short, the world economy no longer pays people for what they know – Google knows everything – but for what they can do with what they know.

Of course, state-of-the-art knowledge will always remain important. But schooling today needs to be about ways of thinking, involving creativity, critical thinking, problem-solving and decision-making; about ways of working, including communication and collaboration; about tools for working, including the capacity to recognise and exploit the potential of new technologies; and, last but not least, about the capacity to live in a multi-faceted world as active and responsible citizens.

In today’s schools, students typically learn individually, and at the end of the school year we test their individual achievements. But the more interdependent the world becomes, the more we need great collaborators and orchestrators, and people who can appreciate and build on different values, beliefs and cultures. The conventional approach in school is often to break problems down into manageable bits and pieces and then to test whether students can solve problems about these bits and pieces. But in modern economies, we create value by synthesising different fields of knowledge and making connections between ideas that previously seemed unrelated, which requires being familiar with and receptive to knowledge in other fields. Modern schools need to help young people to constantly adapt and grow, and to find and continually adjust their place in an increasingly complex world.

Typically, what is assessed is what gets taught. Thus, education systems will need to get their goals and standards right and transform their assessment systems to reflect what is important, rather than what can be easily measured. The future is not about more high-stakes testing with one-size-fits-all assessments. It is about developing multi-layered, coherent assessment systems that extend from classrooms to schools to regional, national and international levels; that support improvement of learning at all levels of the education system and actively involve teachers and other key stakeholders to help students learn better, teachers teach better, and schools work more effectively; that are derived from rigorous, focused and coherent educational standards with an eye on career and college readiness; that measure individual student growth; and that are largely performance-based, make students’ thinking visible and allow for divergent thinking, so that educators can shape better opportunities for student learning.

Too often, we still treat learning and assessment as two distinct parts of the instructional process, with the idea that time for assessment takes time away from learning. But responding to assessments can significantly enhance student learning if the assessment tasks are well crafted to incorporate principles of learning. And capitalising on innovative data-handling tools and technology connectivity can allow us to combine formative and summative assessment interpretations for a more complete picture of student learning and enhanced teaching.

Developing such assessments is not easy; the keys to success are coherence, comprehensiveness and continuity. Coherence means building on a well-structured conceptual base, an expected learning progression, as the foundation for both large-scale and classroom assessments, and on consistency and complementarity across administrative levels of the system and across grades. Comprehensiveness is about using a range of assessment methods to ensure adequate measurement of intended constructs, with measures of different grain size to serve different decision-making needs, and about providing productive feedback, at appropriate levels of detail, to fuel accountability and improvement decisions at multiple levels. And continuity is about delivering a continuous stream of evidence to students, teachers and educational administrations.

Sure, there are many methodological challenges involved in developing such new assessments. Can we sufficiently distinguish the role of context from that of the underlying cognitive construct? Do new types of items that are enabled by computers and networks change the constructs that are being measured? Can we drink from the firehose of increasing data streams that arise from new assessment modes? Can we utilise new technologies and new ways of thinking of assessments to gain more information from the classroom without overwhelming the classroom with more assessments? What is the right mix of crowd wisdom and traditional validity information? And most importantly, how can we create assessments that are activators of students’ own learning?

But if we invest just a small fraction of the resources that are currently devoted to mass testing with limited information gains, we will be able to address these challenges quickly.

Can and should we use different assessments for different purposes?

07/01/2014
Professor Paul Newton, Professor of Educational Assessment, Institute of Education, University of London

Having agreed to post some thoughts in response to the question of whether we can and should use different assessments for the purposes of certificating students, school accountability and measuring system improvement, I turned to Andrew Hall’s opening blog for inspiration. Andrew is keen to encourage blue skies thinking about the future of educational assessment in England, and has invited us to start by considering “what a really great assessment system would look like” in a way that is “unbounded by the reality of how the system is today”. In an attempt to be constructively provocative, I decided to reflect upon the meaning of ‘blue skies’ thinking in this context.

Over the years, I’ve had plenty to say about the uses of educational assessments. I’ve warned that an assessment that is fit for one purpose may be substantially less fit for another and might be entirely unfit for others. I’ve explained that even a procedure intended specifically to measure system improvement could serve many different kinds of purpose, with each purpose implying quite different assessment design decisions. Presumably, then, blue skies thinking about the characteristics of a really great assessment system ought to conclude that it comprises multiple, discrete assessment procedures, each engineered to support a particular purpose. After all, a really great assessment system would be as fit as possible for each and every purpose; and maximum fitness across the range of different uses could only be guaranteed if the system incorporated a range of different assessment procedures.

Yet, if this is blue skies thinking about the future of educational assessment, then it is not for me. An inevitable risk of blue skies thinking is that we set our sights too high. A ‘really great’ system is probably too high an aspiration; a ‘good enough’ system is more realistic. When we aspire to a system that is good enough, we open our minds to trade-off, to the realistic appraisal of costs against benefits. Conversely, in the blue sky world, the temptation is to be overly simplistic and idealistic; for instance, to insist that an assessment system should do no harm. In the real world, we should be prepared to accept that any assessment system will inevitably do some harm; even though, on balance, its benefits ought significantly to outweigh its costs. Blue skies thinking tends, ironically, to be black and white. The real world is not like this. The real world is grey.

So I am an advocate of ‘grey skies’ thinking. Grey skies thinking welcomes messiness. It acknowledges that we struggle even to articulate our policy goals, let alone to agree upon them, or to agree how best to achieve them. Fundamental to grey skies thinking is not abstraction from the complexity of the real world, but immersion in it. It involves thinking through the potential consequences of alternative assessment approaches in as much detail as possible. It means attempting to anticipate potential ‘fault lines’ and to gauge their likely severity. It means attempting to identify a broad range of social and educational impacts from alternative assessment approaches and to gauge their likely prevalence. It means focusing public debate on the prioritisation of policy objectives: How important are the various decisions that need to be made on the basis of assessment results and, therefore, how much assessment inaccuracy are we prepared to tolerate? How serious are the various impacts associated with alternative assessment approaches and, therefore, how tolerant of them should we be? In other words, what are we prepared to compromise on, and what are we not prepared to compromise on? Grey skies thinking suggests that it may be more fruitful to start by considering the really calamitous rather than the really great.

So, returning to my brief, can and should we use different assessments for the purposes of certificating students, school accountability and measuring system improvement? As I mentioned earlier, one blue skies answer to this question is an emphatic ‘yes’ – which is to invoke the ‘maximum accuracy’ principle. But an equally legitimate blue skies answer is an emphatic ‘no’ – which is to invoke the ‘collect once, use more than once’ principle, as Ofsted recently put it. Both of these answers are overly simplistic. The grey skies answer is neither an emphatic ‘yes’ nor an emphatic ‘no’ because the real world is far more complicated and messy than that. To provide plausible answers to this question we need grey skies thinkers who are willing and able to grapple with the kind of comprehensive and typically uncomfortable cost-benefit analyses that are fundamental to good policy making.