Automated essay scoring ‘as reliable as human markers’

by James Reid | 01 Dec 2015

Research, released by the Australian Curriculum, Assessment and Reporting Authority (ACARA) on Monday, found four separate and independent automated essay scoring systems were able to mark NAPLAN persuasive writing tasks as reliably as human markers.

ACARA CEO, Robert Randall, welcomed the findings, saying that automated scoring would also mean the quicker delivery of results to teachers and parents.

“Automated essay scoring of the writing component of NAPLAN will result in parents and teachers receiving their children’s and students’ results within two weeks of taking NAPLAN,” Randall said in a statement.

“The precision and earlier provision of the results will help teachers tailor their teaching to student needs.”

He added that teachers and other markers would continue to be involved in the process: training the automated essay scoring system, marking a sample of essays to check and validate its output, and marking essays the system might have difficulty scoring.

“The research results show that automated essay scoring works for NAPLAN-type writing, but we will continue with our research to refine the system and to gather more evidence which we will use to assure parents and teachers of the viability of automated essay scoring and to make a final decision about proceeding,” he said.

“If need be, we could double mark samples of student essays, until everyone is comfortable with automated essay scoring.”

In 2012, four experienced, independent vendors were engaged to score a sample of NAPLAN persuasive essays using the current NAPLAN writing rubric.
 
The vendors represented a cross-section of different approaches and methods for automated assessment of writing and were provided with 1,014 essays – along with scores provided by human markers – to train and validate their automated essay scoring systems.
 
After training and validating the systems, the vendors used them to mark 339 tests.

On overall scores and on each writing criterion assessed, the four automated essay scoring systems achieved levels of agreement comparable with those of the human markers.
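ACARA's statement does not name the agreement statistic used. A common choice in automated essay scoring research is quadratic weighted kappa (QWK), which compares two sets of rubric scores while penalising large disagreements more than near-misses. The sketch below is purely illustrative of how such agreement might be measured; the score scale, data, and function name are assumptions, not details from the study.

```python
# Illustrative sketch: quadratic weighted kappa (QWK), a widely used
# agreement metric in essay-scoring research. ACARA's actual statistic
# is not specified in the article; scores here are hypothetical
# integers on an assumed 0-5 rubric scale.

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Agreement between two raters: 1.0 = perfect, ~0.0 = chance level."""
    n = max_score - min_score + 1
    # Observed confusion matrix of (human score, machine score) pairs
    observed = [[0] * n for _ in range(n)]
    for h, m in zip(human, machine):
        observed[h - min_score][m - min_score] += 1
    total = len(human)
    # Marginal score histograms for each rater
    hist_h = [sum(row) for row in observed]
    hist_m = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    # Quadratic disagreement weights: w = ((i - j) / (n - 1))^2
    numer = 0.0  # weighted observed disagreement
    denom = 0.0  # weighted disagreement expected by chance
    for i in range(n):
        for j in range(n):
            w = ((i - j) ** 2) / ((n - 1) ** 2)
            numer += w * observed[i][j]
            denom += w * hist_h[i] * hist_m[j] / total
    return 1.0 - numer / denom

# Hypothetical human vs. machine scores on a 0-5 scale
human = [3, 4, 2, 5, 3, 1, 4, 2]
machine = [3, 4, 3, 5, 2, 1, 4, 2]
print(round(quadratic_weighted_kappa(human, machine, 0, 5), 3))  # → 0.917
```

In the study's setting, a system whose QWK against human markers matched the QWK between two independent human markers would be "as reliable as" a human marker in the sense reported.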
 
“What is exciting about this research is that, although the four vendors had different automated essay scoring systems, they were all able to mark the essays as well as the human markers,” Randall said, adding this was not the end of the research into automated essay scoring.
 
“We intend to expand on this research in 2016 to include a larger sample of students and multiple prompts within and across writing genres [persuasive and narrative] before making a final decision about the approach to be used in 2017.”

However, the automated system has not been welcomed by all educators.

NSW Teachers Federation president, Maurie Mulheron, expressed concerns over ACARA’s plan to move NAPLAN online, telling The Educator the prospect of computers marking children’s extended writing exams reflected an “appalling psychology”.

“ACARA is saying they can train computers to mark a child’s writing. As someone who was a supervisor who marked HSC exams for 17 years, I have a problem with that on a number of levels,” Mulheron said.

“It took highly educated and trained teachers a long time to test extended writing. There were huge checks and balances and statistical analysis which were done hourly every night of the marking operation.”

Professor Ken Wiltshire, who co-led last year’s review into the national curriculum, also raised concerns about the automated process, telling The Australian that handwritten exams must be an option for students if NAPLAN is to remain equitable.

He said typed essays for children as young as seven would disadvantage students without computers at home.

“This will discriminate against a whole range of students,” Wiltshire said.

“There must be an optional writing test available for students – they can’t force everyone to do an online test. There needs to be further research done on the equity implications before this is taken further.”
 
