
Large-scale assessment refers to tests that are administered to large numbers of students and used at local, state, and national levels to measure the progress of schools against educational standards. To yield accurate and fair measurements, large-scale assessment systems need to include all available students, which produces a high volume of exam scripts to be marked. Marking at this scale is extensive work, carried out by tens of thousands of examiners appointed by the exam boards. The need for large-scale assessment, together with the high cost of manual marking and limited turnaround time, has led over some years to the development of automated assessment and marking. This chapter reviews the history and development of automated assessment systems. It presents findings from empirical research and highlights the theoretical considerations that emerge from such developments. In addition, it explores the practical aspects of developing such assessments, with examples primarily from the UK and USA, including the systems and tools available, the current capabilities and limitations of natural language processing (NLP) approaches, ethical concerns, and future potential.