A couple of weeks ago I spent 3 hours with the infinitely patient Lucy Rimmington from Ofqual, trying to get under the skin of Progress 8, the new GCSEs and what it all means for teachers, children and parents. Thanks to her and to several teachers who helped me with questions and queries along the way, I’ve written this blog to try to explain to parents and teachers some of the central issues in our exam system. I should be clear that Lucy was simply explaining processes and language to me and that any opinions or conclusions drawn are mine alone.

My first question centred around what I referred to as “norm referencing” and what is more correctly termed “comparative outcomes.” Is it true, I asked, that the proportion of pupils passing GCSEs is set in advance, regardless of criteria or achievement? The answer is yes, well, sort of. Exam boards can appeal the boundaries for individual subjects if they feel that the cohort was unusually able, but this is very rare and difficult to prove. So, in reality, it’s a yes. Already, it has been decided that this year, around two-thirds of pupils will achieve a grade 4 or a C or above. In fact the proportions of 1s, 4s and 7s have been agreed in advance, regardless of what children do on the day. It means that although the government say that they have made exams harder and more rigorous, it hardly matters because the number of children passing will remain the same. So much for “raising standards.” It’s a clever PR ploy. Journalists and parents can look at the paper, say “ooh that’s hard,” look at the pass rate and assume that things are getting better. In reality, it simply means that children don’t have to score as highly to get the same grade. One might shrug if it weren’t for the pressure that the harder content puts on teachers and pupils and the extra worry it creates.
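
For readers who like to see the arithmetic, here is a deliberately simplified sketch of that last point. It is not Ofqual's actual awarding procedure, which (as I understand it) also draws on prior attainment and examiner judgement. The marks, the function names and the 15-mark drop are all invented for illustration. It only shows that if the share of pupils at or above a grade is fixed in advance, a harder paper lowers the boundary rather than the pass rate.

```python
def boundary_for_share(marks, share):
    """Return the mark of the last pupil who falls inside the fixed share.

    Hypothetical helper: ranks pupils from highest to lowest mark and
    reads off the mark at the cut-off position.
    """
    ranked = sorted(marks, reverse=True)
    n_pass = max(1, round(share * len(ranked)))
    return ranked[n_pass - 1]


def pass_rate(marks, boundary):
    """Fraction of pupils scoring at or above the boundary."""
    return sum(m >= boundary for m in marks) / len(marks)


# Invented marks for nine pupils on an "easier" paper.
easier_paper = [78, 72, 66, 61, 55, 50, 44, 39, 31]
# The same nine pupils on a "harder" paper: assume everyone drops 15 marks.
harder_paper = [m - 15 for m in easier_paper]

share = 2 / 3  # roughly two-thirds at grade 4 or above, agreed in advance

easy_boundary = boundary_for_share(easier_paper, share)
hard_boundary = boundary_for_share(harder_paper, share)

print("Boundary on easier paper:", easy_boundary)
print("Boundary on harder paper:", hard_boundary)
print("Pass rate on easier paper:", pass_rate(easier_paper, easy_boundary))
print("Pass rate on harder paper:", pass_rate(harder_paper, hard_boundary))
```

In this toy example the boundary moves down with the harder paper while the same six of nine pupils pass both times.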

It seems mad, if you assume that having criteria means that children who meet them are rewarded with a corresponding grade, to find out that we have a system in which no-one can really improve – at least they can, but it must always be at the expense of someone else. But on the other hand, fixing the results in this way protects children from a catastrophic drop in results when government ministers have fiddled with the exam system. It creates stability. The alternative is what we saw with KS2 SATs last year – a criteria-based system – where the percentage of children meeting the expected standard fell from 80% to 53%. A drop like that at GCSE would be disastrous. As Lucy said, it seems like the fairest option in a flawed system.

This might not matter, if it were not for the fact that it creates a complete disconnect between reality and the expectations of government and Ofsted. If results are fixed in advance, then how can schools improve? If they are to be judged on data, how can it be fair to be expected not to “coast” when in fact the system is set up to ensure coasting? The fact is that every school that improves their data is doing so at the expense of another’s results. We are pitched against each other – child against child, teacher against teacher, school against school – in a fight to protect our position and to try to improve it, knowing full well that our Outstanding comes at a cost for someone else. No wonder so many schools are starting to select by excluding pupils who may skew their data. No wonder they are looking for ways to secure advantage over others. It actually makes the idea of sharing best practice an act of folly. Why should we collaborate when doing so could hurt our students’ chances of success?

The system is predicated on two central beliefs from Ofqual. One is that people don’t get more intelligent. Therefore, it is reasonable to assume that what they achieve at 11 is a fair indicator of what they will achieve at 16. So the proportion of pupils who will achieve 4s and above at GCSE is set in line with SATs results. All the research into growth mindsets and the evidence from MENSA that in fact human IQ is improving, is largely ignored. They point to the fact that data suggests that children move in a trajectory from KS2 to GCSE that is largely reliable. But that doesn’t allow for the possibility that our data tracking systems for the past twelve years or so have assumed this trajectory. That setting GCSE targets in line with KS2 results might create self-fulfilling prophecies where children achieve the grade that the adults around them expected them to. It doesn’t allow for the possibility that this thinking places a ceiling on achievement and potential.

The other belief is that, for some reason, teachers are not to be trusted – that they will cheat. And so the exam system has to be cheat-proof. This lack of trust drives much decision making across the system. For example, reducing coursework; the number of resits; disallowing iGCSE from state school league table results; reducing the number of reviews (or remarks as teachers call them – a semantic tic that apparently irritates people at Ofqual) – all these things have been done to stop people “gaming” the system. Yet let’s take the last one – reviews. Last year, Ofqual announced that the right to have papers “reviewed” (or remarked) would be restricted. They claimed that this was to stop private schools from gaming the system by entering whole cohorts of students for review because they could afford it. By stopping that game, they made it harder for all the genuine applications to be successful. Yet IN THE SAME YEAR, their own research found that 50% of English Literature candidates and almost the same proportion of History candidates got the wrong mark. They publish evidence that exam marking is unreliable at the same time as they make it harder to have the paper reviewed. I don’t really need to say any more on that, do I?

Similarly, in an attempt to stop pesky teachers from “teaching to the tests” that will change the lives of the pupils they care about, they take care to ensure that the exams are increasingly unpredictable: that the questions cannot be guessed and that they will be written in such a way that children have to think laterally and apply their knowledge. On the one hand, that’s no bad thing – flexibility of mind is an important skill. On the other, when an exhausted child is sitting up to thirty exams in a three-week period, it’s a farce to expect them to have the clarity of thinking that the test is designed to extract. At least, with the grade boundaries being fixed, it hardly matters what’s on the paper I suppose.

Having said all of this, schools do themselves no favours in challenging this perception of gaming. Take the ECDL (European Computer Driving Licence). Under Progress 8, schools are credited for a broader number of subjects than before under the old 5 A*–C accountability measure (for parents, that means that schools used to be judged according to the number of pupils who achieved five grades from A* to C; now they are judged across 8 subjects and the measure is not who gets Cs or above, but how much progress they make from SATs). That would seem fairer if it were not for the fact that the GCSE grades are set in line with SATs results. That means children are NOT EXPECTED to make progress that exceeds predictions at all by Ofqual, but they are by Ofsted. Indeed “good” progress is that which exceeds expectations – anything else is simply expected. And if they do make better than expected progress, someone else must do worse. I know, I know. Anyway, back to the ECDL.

There are three baskets for Progress 8. Schools must enter all pupils for English and Maths – the first basket – and these count for more points, if you like, than other subjects. They must also provide pupils with access to subjects in the second basket, which are largely the EBacc subjects – Languages, Sciences, Humanities etc. The third basket is “other”: this could include Arts subjects, vocational subjects and a host of others, like the ECDL. Although Ofqual states that these third basket subjects should involve a minimum number of hours of study, the fact that most pupils are already reasonably computer literate means that the ECDL can be taught and assessed in an intensive week or across a term. This Datalab blog post explains the gaming of ECDL in more detail. Putting all pupils in for ECDL could raise a school’s Progress 8 score by as much as 0.2 – a significant gain. So many schools, in line with the advice given to them by Regional Schools Commissioners (yes, that’s right – senior civil servants are actively encouraging schools to game the system) and MAT trustees, are now entering pupils for an assessment that has little value to them, but great value to the school. It’s no surprise then that the government have now announced that the ECDL will not count towards Progress 8 from 2019. And even before that, there is pressure on Ofsted to look carefully at how schools are creating their Progress 8 scores and whether their choices are made in the best interests of pupils. It’s another game of cat and mouse, and what it does is undermine further the credibility of the profession as well as placing more ethically minded schools at risk. Strategies such as these, while understandable, explain why it is that organisations like Ofqual view us with suspicion.

It seems to me that we are stuck in a system of mistrust where each side is attempting to outmanoeuvre the other in order to protect either themselves or the systems they create. And children get lost in the crossfire. One of the biggest problems leading to unreliability of exam marking is the poor quality of examiners. Struggling to recruit enough examiners to serve an enormously overloaded exam system, boards are turning to unqualified and inexperienced examiners, some of whom have never taught. Even without entering into the matter of subjectivity, this quality issue alone serves to undermine the validity of the whole system. Why are Ofqual and government not leaning on exam boards to ensure a basic level of qualification and experience for their examiners? They’d have to pay more; their profit margins would reduce. Why are government not giving experienced teachers incentives to examine – reduced timetables for example – and subsidising schools to encourage better quality examiners to come forward? Why is the cost of reviewing not met by the exam board if the quality of their service is so poor?

It seems to me that as long as Government allows the population to labour under the misapprehension that we live in a system of meritocracy where all can improve with effort, schools are in an impossible bind. It’s time for some open-minded debate about how we best hold our schools to account, how we best assess our children and how we best communicate these aims to parents. I remain hopeful that there is a significant role to be played here by the Chartered College of Teachers to broker a relationship between the DfE, teachers, unions, Ofsted and Ofqual in a way that develops more realistic expectations, fairer systems of assessment and more open communication. What is clear is that everyone is doing what they think is the best thing. Ofqual are trying to create a stable and reliable system. Ofsted are trying to create a fairer system of inspection. Teachers are trying to get children the best possible outcomes they can. All are working towards similar goals – to create an education system that equips young people for a successful future. But by working in competition and suspicion, they are undermining their stated purpose and the system is creaking under the pressure. We need to start again.


  1. Nice post Debra. You’re quite right that not all schools can improve their results. The best we can hope for is to eliminate variation in performance between schools once you account for variations in intake between schools. Incidentally the current approach is better described as “cohort referencing” rather than “norm referencing”. Comparative judgement is not currently used to my knowledge in awarding; it’s an alternative to traditional methods of marking.

    1. Thanks Dave – that’s helpful. Comparative outcomes was the term I was asked to use by Lucy – I’ve amended having confused judgement and outcomes – thanks for spotting it. I’m now trying to get my head around the kettle of fish that is the National Referencing Test and how this will impact on the system 🙂

      1. “Comparable outcomes” is the Ofqual term and “comparative judgement” is the alternative to traditional methods of marking.

  2. Thanks, useful post that raises some truly awkward questions.
    I was wondering if, during your conversation with Ofqual on GCSEs, you made a distinction between newly reformed GCSEs and how the system will work longer term. My understanding was that the fixed proportions of 1s, 4s and 7s was something that has been put in place initially to “protect” the first cohort of candidates to sit the reformed GCSE. Or is your understanding that these proportions will remain fixed, or remain tied to the SATs result of that cohort?
    Whichever it is, it is a result of rushing through too many changes in one go. Changes should be phased in over 5-10 years to provide time for review and so that individual cohorts are not adversely affected. But education works on a different timescale to politicians, a fundamental problem of political interference in the system.

    1. Good question – thanks. Yes, the view is that they will remain fixed in line with SATs. There is currently a trial into a National Reference Test – a test which will be taken by a sample of 16 year olds to create a comparative picture of current achievement across the year group. This MAY, depending on how it goes, be used in conjunction with SATs results to create a more accurate picture. At the moment however, everything hinges on SATs. There hasn’t been criteria referencing since 2010 – it’s not just about the new GCSEs – when Michael Gove came into office, he was concerned about the possibility of grade inflation and so the system moved away from criteria referencing then. It’s just that the mechanism has become more connected to SATs outcomes now. At least that’s how I understand it!

  3. There are so many issues here, not least the fundamental inappropriateness of the assessment system for making judgements about EITHER the achievement / attainment of the individual or the EFFECTIVENESS of the system. At the root this is the ideology of “all cannot have prizes” which is why we seem so scared of criterion referencing. The work that the Education Data Lab has done has shown that this model of linear progression from KS2 to GCSE is deeply, deeply flawed (https://educationdatalab.org.uk/2015/03/seven-things-you-might-not-know-about-our-schools/). There is also the non-trivial matter of the assumption that the KS2 SATs are themselves “fit for purpose”. Like so much of our current education policy there is a driver of ideology over evidence and reason – doubly ironic given the MoE’s speech on the “need for evidence” recently.

  4. It’s been evident for some time that the idea of progress in education (which sustained me and many others throughout our teaching careers) has been replaced by an effort to control ‘standards’ in order to maintain a stratified society. In 2012, achieving a C in GCSE English was made more difficult by 11 percentage points, an outrageous governmental interference that led to the unsuccessful court action against Ofqual, AQA and Edexcel led by Lewisham Council. There is now no real interest in improving the quality of what children learn nor in promoting social mobility. All that matters is the “correct” data, and, if badly qualified examiners will produce this, why try to improve assessment? I don’t think that many people do “labour under the misapprehension that we live in a system of meritocracy”. The advent of Brexit and Trump attests to the despair felt by many about their place in the world, which is increasingly fixed by current political and educational regimes.

  5. Under so-called criterion referencing it was quite possible for children to improve at the expense of others. The criteria were subjective (despite all attempts at standardisation). With coursework and later on under the controlled assessment system, it was easy for schools to cheat and give more help to pupils than was strictly permitted. It could be said that those schools and teachers were therefore gaining good grades at the expense of others. In addition, schools started teaching to the criteria, destroying love for the subject and reducing lessons to tick boxes which had to be ticked off. In my subject this led to horrible sentences such as “I like it because it’s good” (justify opinion – tick!) or “Forgetfully, he ran down the corridor” (fronted adverbial – tick!). In a sense, an element of competition is no bad thing, as then we will always be trying to raise our standards, rather than fulfil abstract criteria. For example, in MFL it would be no good my telling the children that they can get away with not learning X or Y bit of grammar. If other schools were teaching it and their students were consequently getting better results, the pressure (and I admit it would be pressure) would be on me and my pupils to up the game. But yes, I take your point that there would be some pupils who, had they taken the test in a previous year, might have received a higher grade. This could be regarded as unfair. There are problems with both criterion referencing and norm referencing – neither system is perfect. But on balance, I have come round to preferring the healthy competition that norm referencing generates, rather than the gaming, cheating and the “just do enough to satisfy the criteria” tick box approach that criterion referencing encourages. My students love the Vocab Express competition where they see how their school is doing compared to others. They don’t say “Oh Manchester Grammar will always be better, we may as well give up.” They simply try harder.
Growth mindset, resilience….!!!

  6. Thank you so much for this article. I am a parent – not an educator – and I’m so pleased I read this tonight.

    A couple of weeks ago I wrote to my children’s school to formally remove my consent that they take part in all government standardised tests. One point you may have overlooked is that children’s potential isn’t set by testing when they are 11 via SATS – it is set when they enter primary school aged 4 via the baseline.

    I have three children – all different, all learning differently. My two sons – 8 and 6 – have suffered terribly through this current system for the simple fact that they started school with additional needs that they have since overcome. Tracked linearly from a low starting point (thanks to speech delays, temporary hearing loss, temporary motor control and poor eyesight) both children do not need to achieve very much to get the ‘value added’ tick in a box.

    I consider fixed ability thinking a form of child abuse and it seems I’m constantly fighting against it.

    Thank you.

    And I’ve just noticed you’ve written a growth mindset blog prior to this one so I’m off to read it now! 🙂

    1. Thank you so much for these comments – it’s so important that parents are aware of these issues – I think at the end of the day, parent power will be the lever for change. All these changes are made in the name of parents “it’s what parents want!” is the cry. If parents stand up and say no, it will be far more effective than all the head teachers and teachers in the country doing the same.

      1. I find it so difficult to reach parents. There is a lot of apathy and a reliance on government data because instincts are to want the best for your child. I’ve never once referred to an Ofsted report or SATS data in making a school/nursery decision and I never will, but the data is seductive for so many. They want their child to have the edge over their neighbours in the catchment next door.

        Since finding your blog, I have read your growth mindset article, a fab article you wrote from last year on setting and I’ve even bought Carol Dweck’s book as my family is now fully committed to being a growth mindset family!! My younger son still has some speech difficulties, but my 9 year old is working hard and doing well. He is an avid reader and has known all of his times tables by heart since Year 3, but his target has been set and tracked along a line of low achievement. This culminated in him sitting in the wrong maths set for 9 months, coming home and kicking furniture because he was so angry and frustrated that his class were still learning their 3 times table when he knew them all. It took 3 months of pressure to get him moved and “re-assessed” when he jumped from set 4 to set 2. Obviously, my wish would be for no sets at all as his moving up will likely have been to the detriment of others. It truly is horrific!

        I wholeheartedly agree with your blog article about your friend’s twins from last year. My youngest son sat the government’s standardised baseline test last year and was placed in group five because he didn’t know any phonics (he still can’t pronounce some letter sounds). Middle child was told by classmates that his placing, which was made following eight hours of school aged four, meant he was “stupid”. Middle child excels in everything but reading/writing.

        Eldest child in particular would have achieved far more if the prediction (based on his temporary additional needs) was removed and he was awarded the dignity of an unknowable, unlimited potential. One of my main reasons for removing all three children from SATS is that both of my sons have been set up to fail by a fixed system.

        Thank you so so so much for your wonderful blog. You have helped me order my thoughts.

