Spikey U's mission is to provide first-class qualifications and products (such as Digital Assessments) that are valued by training providers, employers and students, and supported by exceptional teaching and learning resources. The validity of these qualifications lies at the heart of what we do.
Validity is defined by Ofqual, the qualifications and exams regulator for England, in this way: “Validity is about whether exam results are an accurate measure of what students know and can do (in relation to the purpose of the qualification). For an exam to be valid it must cover the important aspects of the syllabus and the process for setting standards must be appropriate.”
The important words here are measure and purpose. The term exam may also be replaced by assessment, test or other familiar terms relating to measuring achievement against a standard at the end of a learning journey.
This policy aims to explain the concept of validity in relation to our suite of vocational qualifications, providing a greater understanding of why it is central to the development, delivery, and assessment of our qualifications. It also aims to help all our customers understand the role they play in ensuring that validity remains a constant focus.
What is validity?
When assessing, we need to be able to make judgements based on the evidence before us, and that evidence has to enable us to measure what we want to measure. The thing we want to measure is known as the construct. The important factor here is the quality of that measurement. Several principles are required to make the judgements valid (reliability, fairness and comparability), and there are two key threats to validity: construct under-representation and construct-irrelevant variance. All these terms are explained below.
Reliability describes how consistently a measurement of skill or knowledge yields similar results under varying conditions. If a measure has high reliability, it yields consistent results. So if an assessment is reliable, a learner who takes it when they are deemed ready should achieve the same or a similar result each time if they were to repeat it. An assessor marking an assessment should arrive at the same conclusion for the same assessment over different time periods, and similarly, different assessors marking the same assessment should come to the same conclusions as each other in terms of results. In other words, it is repeatable (“Reliability refers to the consistency of such measurements when the testing procedure is repeated on a population of individuals or groups.” AERA, APA, NCME, 1999, p.25).
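As an illustration only, consistency between two assessors marking the same set of assessments can be summarised numerically. The sketch below is not a prescribed procedure: the choice of statistic (simple percentage agreement and Cohen's kappa, which adjusts for agreement expected by chance), the function names and the sample marks are all assumptions made for the example.

```python
from collections import Counter


def percent_agreement(marks_a, marks_b):
    """Share of assessments on which the two assessors gave the same result."""
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("Both assessors must mark the same, non-empty set of assessments")
    matches = sum(a == b for a, b in zip(marks_a, marks_b))
    return matches / len(marks_a)


def cohens_kappa(marks_a, marks_b):
    """Agreement between two assessors, adjusted for agreement expected by chance."""
    n = len(marks_a)
    observed = percent_agreement(marks_a, marks_b)
    counts_a = Counter(marks_a)
    counts_b = Counter(marks_b)
    # Chance agreement: probability that both assessors give the same outcome
    # if each simply followed their own overall distribution of outcomes.
    outcomes = set(counts_a) | set(counts_b)
    expected = sum(counts_a[o] * counts_b[o] for o in outcomes) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors always gave the same single outcome
    return (observed - expected) / (1 - expected)


# Hypothetical results from two assessors marking the same eight assessments.
assessor_1 = ["pass", "pass", "refer", "pass", "refer", "pass", "pass", "refer"]
assessor_2 = ["pass", "pass", "refer", "refer", "refer", "pass", "pass", "pass"]

print(f"Agreement: {percent_agreement(assessor_1, assessor_2):.2f}")
print(f"Cohen's kappa: {cohens_kappa(assessor_1, assessor_2):.2f}")
```

In this made-up sample the assessors agree on six of eight assessments, but the kappa value is noticeably lower than the raw agreement because some of that agreement would be expected by chance alone. The same calculation could be applied to one assessor marking the same work at two different times.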
Fairness accounts for access to assessment: the language used is without bias (against any protected characteristic, minority or generally), and any changes to assessment for a group or for individuals do not create unfairness or disadvantage for other users taking a different assessment for the same qualification (see also: reasonable adjustments, flexible assessment and recognition of prior learning policies and procedures).
Comparability relates to reliability in that the qualifications and their assessments should mean the same thing to the same end users over time. So, the standard achieved and demonstrated for a certain qualification in one year (particularly when relating to apprenticeship standards) should be the same in another year. Further, other separate qualifications available from other awarding organisations that lead to the same or similar occupational roles or national licensed registration schemes (such as CIMSPA) should also be easily comparable by various stakeholders such as employers.
The word often used here is inference: we should be able to make inferences in relation to the achievement of certain qualifications. For example, when somebody passes their driving test in the UK, we can infer that they are competent behind the wheel and possess the knowledge and skills to allow them to be safe on the road for themselves and other users of the highways of the UK. However, we also know that the test isn’t the same as it was, say, 20 years ago, and that the test does not include motorway driving, driving in slippery or dangerous conditions, etc. In light of this, can it still be comparable over time? And with the exclusion of these elements of skilled and competent driving, the construct could be said to be under-represented, leading to…
Construct under-representation is where the coverage of an assessment specification is insufficient in terms of the knowledge and skills it assesses in order to satisfy the requirements of the role the qualification is aiming to meet (the outcome). All Active IQ’s qualifications have been developed with the knowledge, collaboration, feedback and support of key stakeholders, particularly employers, professional bodies or training providers. We also talk in terms of sufficiency in relation to this area of validity.
Construct-irrelevant variance is where something other than what should be assessed affects the result of an assessment. This could be an assessor error, test anxiety, illness or something included in an assessment that shouldn’t be. While very occasionally there may be circumstances that could lead to the appeal of an assessment result, Active IQ ensures that all recognised centres are fully aware of, and signed up to following, the prescribed assessment requirements and procedures, so that construct-irrelevant variance is kept to an absolute minimum. The development stage of our qualifications also takes this aspect into account, ensuring that only what is relevant to the construct is assessed.
Roles, responsibilities and activities relating to validity for Spikey U
The product development team creates qualifications and assessments (including End-point Assessments for apprenticeship standards) that meet the needs of the vocational and technical education sector, and builds in the above aspects of validity when writing assessment specifications. Any assessment will only test the skills, knowledge and competencies that have been determined as required for the role it aims to fill, and at the appropriate level of demand.
The external verification (quality assurance) team samples and verifies that assessment procedures and assessor judgements are valid via on-site visits, observations of practical skills assessments, sampling of written theory exams and reviewing internal quality assurance systems at centre/provider level, to check for a rigorous approach to all aspects of centre recognition requirements.
Internal reviews of the performance of all qualifications and EPAs occur at least quarterly; trends analyses are conducted in relation to specific exam performance; and centre agreements have been developed for compliance by all parties undertaking the management, delivery and assessment of Active IQ qualifications and assessments.
Staff/assessor approval requirements encompass the need to ensure that only those with the experience, competence and relevant qualifications to undertake the role of assessor are approved to do so.
Stakeholder engagement occurs to ensure Active IQ creates qualifications and assessments that continue to be highly valued in the active leisure sector and beyond, and takes the form of:
an external advisory panel
membership of the Chartered Institute for the Management of Sport and Physical Activity (CIMSPA)
a formal strategic partnership with the professional trade body for the sector (ukactive)
gaining feedback from end users, for example through the annual customer service survey.
The main centre contact has the important role of ensuring a recognised centre/provider complies with Active IQ’s requirements in relation to the management, delivery and assessment of our qualifications and assessments, and that their approach to these requirements is suitably and consistently robust.
The internal verifier has the role of objectively scrutinising their assessors in practice. That is, they produce and enact a sampling strategy that reviews and reports on assessor performance and judgement over time and across all qualifications, to check for validity as described above.
The assessor makes objective judgements on performance, ability, knowledge and skills as described in the relevant assessment specification and against the prescribed standard, without bias.
End users are those who have an interest in the qualification. They include:
employers who look for particular qualifications of value in order to offer appropriate employment
successful learners or apprentices who achieve the qualifications or complete the EPA and use them to gain or secure relevant employment
training providers, such as colleges, schools or private training providers, who need to make decisions about what to offer their students in order to provide qualifications of value that allow progression into the workplace or to higher education.