Holt, Rinehart, and Winston, a part of Harcourt, is a major textbook company in the US. Their website says, "Since 1866, we have been in the business of helping teachers teach and students learn." Our public school uses these textbooks and their internet programs.
Today Melissa, my ninth grader, wrote an essay in school. She typed it into the Holt Online Essay Scoring program, pushed a button, and in less than five seconds she had a score and an assessment page that told her she had "limited ability in word choice," "demonstrates minimal unity," "frequently loses focus," and "uses routine, predictable ideas" (among other things).
I didn't know a computer could grade an essay. I thought humans did that. I know the ACT and SAT have human essay scorers--I applied to be one. Either technology has taken a giant leap forward, or this machine was unnecessarily giving generic, negative comments, tearing down my 14-year-old's confidence in her ability to write.
I told Peter about this and he immediately wanted to look at this program to see what it did. Melissa knew the URL and off he went. He answered the prompt and got a score of three out of six.
Here's the prompt:
"Your principal is considering a new grading policy that replaces number or letter grades on report cards with pass or fail. What is your position concerning this issue? Write a letter to your principal stating your position and supporting it with convincing reasons. Be sure to explain your reasons in detail."
Here's what Peter wrote as a test, to see what would happen:
The grading system currently used in the vast majority of schools is counterproductive. Students learn that the important part of education is impressing a teacher, rather than truly becoming proficient in the subject matter. Replacing this with a Pass/Fail system would solve this problem.
Under a Pass/Fail system, students could still be penalized if they don't do the work. Teachers would require a good faith effort in order to pass the class. However, the focus would no longer be on acing quizzes and other trivial factors. Students would be able to focus on learning the material. Students who are interested in becoming educated will still be motivated without grades, and students who are slackers already surely can't get any worse.
His automatic assessment said he had "limited word choice" with "noticeable lapses in the logical flow of ideas," etc. and gave him a 3 out of 6. Melissa was told the goal of the program is to get a four by the time you are a junior in high school. Not sure what that means exactly. I'll have to ask her English teacher.
Now Peter wanted to see what he could do to this first essay to get a higher score, so he added "Dear Mr. Principal" and "Sincerely, Stud Studentman." He still got a 3, but now the computer said he had "competent ability in word choice." Wow! Way to go Peter!
Next he began to add nonsense words to see if the computer would react. Nope. No change--still a 3 with the same comments. The essay got increasingly bizarre but no change in score. So he decided to add a few paragraphs he copied off of Kem's Utterly Merciless Guide to Essay Writing. Now he had the traditional five paragraphs. He put in a couple transition words, pushed the button, and sure enough, he got the magic score of four! Here's the essay and the computer's assessment:
"On a 6-point scale, here's your score: 4
This response demonstrates competent success with the persuasive writing task. For the most part, the essay:
* focuses on a reasonable position, with minor distractions
* shows effective organization, with minor lapses
* offers mostly thoughtful ideas and reasons
* elaborates reasons and evidence with a mixture of the general and the specific
* exhibits general control of written language
(All essays that score a 4 get the above generic response. You can see it here.)
The Essay that Earned a Score of 4:
Dear Mr. Principal,
The grading system always used in the vast majority of cheeses is counterproductive. This is because students learn that the frankly important part of fly fishing is impressing a classroom, rather than truly becoming an eyesore in the subject matter. The three problems are too much cheese, acrobatic flying monkeys, and the electoral college system. Replacing this monstrosity of a teacher's pet with a Pass/Fail system would solve every problem in the world.
First, too much cheese leads to bad grades. Under a Pass/Fail system, students could still be drawn and quartered if they don't do the bull-baiting. Factory workers would require a good love effort in order to pass the cheese. However, the focus would no longer be on acing quizzes and other lunatic factors. Students would be able to focus on learning the website. Students who are interested in becoming housewives will still be motivated without grades, and students who are drug addicts already surely can't get any more smelly.
In addition, flying acrobatic monkeys will attack this student, whose name may as well be Siegfried. He is baffled at the assignment. Journey motifs? What does that mean? He takes a look at the two texts on which he wants to write, the General Prologue to Geoffrey Chaucer's Canterbury Tales and Book I of Edmund Spenser's Faerie Queene. Both texts involve physical journeys, but he can't really think of anything to say about that. They're just journeys, right? What's to say?
Finally, the electoral college system eats brains and takes names. Siegfried gets to work and busily lifts ideas (and sometimes even full paragraphs, since they fit so well) out of the online papers. He adds a generic introduction and a conclusion that repeats it almost word for word. Voilà: an essay...and he's hardly had to think about it at all.
In conclusion, if you haven't seen the comments on the last post, check them out. Eliminate the cheese, the monkeys, and colleges. Two Instigators of Filthy Plagiarism have infiltrated the site, and I and a reader named Kyle have declared war on them. Will you join in our crusade? Who will be strong and stand with me? Somewhere beyond the barricade is there a world you long to see? I think so.
Sincerely,
Stud Studentman
----
From the Intelligent Essay Assessor:
Analytic Feedback for Your Essay
Our system has analyzed your essay for five important writing traits:
* Content and Development
* Focus and Organization
* Effective Sentences
* Word Choice
* Grammar, Usage, and Mechanics
Study the statements that describe each trait to help you improve your writing.
Content and Development Your essay shows competent ability for this trait. For the most part, the essay:
* uses some meaningful and thoughtful ideas
* elaborates and supports some ideas with a mixture of general and specific details, reasons, explanations, and/or examples
Focus and Organization Your essay shows competent ability for this trait. For the most part, the essay:
* addresses the prompt, but may include minor digressions
* shows some awareness of audience
* displays effective organization and transitions, with minor lapses
* demonstrates general unity and completeness
Effective Sentences Your essay shows competent ability for this trait. For the most part, the essay:
* generally forms sentences correctly but with occasional errors
* demonstrates sentence quality with few, if any, awkward sentences
* displays some variety in sentence types, lengths, structures, and beginnings
* displays some fluency
Word Choice Your essay shows competent ability for this trait. For the most part, the essay:
* uses words that are generally appropriate to audience and purpose
* uses some words that are precise and accurate
* may use figurative language and imagery somewhat effectively
Grammar, Usage, and Mechanics Your essay shows competent ability for this trait. For the most part, the essay:
* demonstrates general command of language conventions
* exhibits general command of spelling, punctuation, and capitalization"
Needless to say, I'll be showing this to her English teacher.
Update: I did talk to her teacher. Here's the response.








17 comments:
That is so awesome! (The system bucking, not the stupid program.) I have dealt with Holt as well as some of the other websites for my baby brother (he is 16) and have noticed similar things. The worst is the teachers who follow this blindly, even when they realize the system is broken. It trains kids to try to fool teachers and whoever else into thinking they know something instead of really learning it.
That is just too funny. It makes me glad that we've decided to homeschool.
Peter's revised essay is hilarious! My grad student brother and I have been discussing test scores a lot lately. I wonder what he would think about this.
I'd like to get as much feedback on this topic as possible. If you wouldn't mind linking to this post to help me find others who know anything about computer-graded essays, that would be great!
I hope you will share the teacher's reaction to this. I would love to hear whether or not she defends this system of scoring after reading Peter's essay. I will link the post to my blog.
This says so much about the condition of our public school system.
Well, as an example of how "objective" scoring of essays is just plain crazy, that is pretty good. Language is a living thing. This is insane.
Ah, a Pearson Product. Uses latent semantic analysis.
LSA has two drawbacks:
* The resulting dimensions might be difficult to interpret. For instance, in
{(car), (truck), (flower)} --> {(1.3452 * car + 0.2828 * truck), (flower)}
the (1.3452 * car + 0.2828 * truck) component could be interpreted as "vehicle". However, it is very likely that cases close to
{(car), (bottle), (flower)} --> {(1.3452 * car + 0.2828 * bottle), (flower)}
will occur. This leads to results which can be justified on the mathematical level, but have no interpretable meaning in natural language.
* The probabilistic model of LSA does not match observed data: LSA assumes that words and documents form a joint Gaussian model (ergodic hypothesis), while a Poisson distribution has been observed. Thus, a newer alternative is probabilistic latent semantic analysis, based on a multinomial model, which is reported to give better results than standard LSA.
There are many cases of people writing the exact same paragraph five times and getting a near perfect score from one of these engines.
Generally, it takes two things into consideration: the math of your words and model comparison. Basically, it's given essays that are 'models' of what to look for - that is pretty easy to understand and for a machine to determine. However, the math of word relationships is faulty, as your son found out by using various correct parts of speech with nonsensical meanings. It can group the two 'most like' nouns of three and assign meaning to that, but it cannot determine when something has *no* meaning. Good luck! I suspect you will get a response that is apathetic. Let us know how it goes.
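The comparison against "model" essays described in this comment can be sketched in a few lines of Python. This is not Holt's or Pearson's code, and it is not full LSA: it skips the dimensionality-reduction step and only shows the word-count ("bag of words") comparison that LSA starts from. The two sample essays and the function names `bag_of_words` and `cosine` are made up for illustration. The point is simply that a score built on word counts cannot tell a scrambled or repeated essay from the original.

```python
import math
import random
from collections import Counter


def bag_of_words(text):
    """Count each lowercase word; word order is thrown away."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical "model" essay and student essay, invented for this sketch.
model = "students learn more when grades reward effort and real understanding"
essay = "grades should reward real effort so students learn with understanding"

# The same words in a scrambled order.
words = essay.split()
random.Random(0).shuffle(words)
scrambled = " ".join(words)

# The same essay pasted five times in a row.
repeated = " ".join([essay] * 5)

base = cosine(bag_of_words(model), bag_of_words(essay))
same_when_scrambled = math.isclose(
    base, cosine(bag_of_words(model), bag_of_words(scrambled))
)
same_when_repeated = math.isclose(
    base, cosine(bag_of_words(model), bag_of_words(repeated))
)

print(same_when_scrambled, same_when_repeated)
```

Both comparisons come out equal: shuffling leaves the word counts untouched, and repeating the essay multiplies every count by the same factor, which cosine similarity ignores. A real scoring engine presumably adds other signals (length, spelling, sentence structure), so this only illustrates the weakness the commenter describes, not the whole product.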
OK, that is funny. Scary, but funny.
Hi Jena,
I read this post a few days ago, and I'm still spluttering about it... I'm posting to this link, and thanking goodness that there are people like you who speak up about stuff like this. Hope your daughter's okay.
Karen
PS The '4' essay was hilarious!
I would suggest that you send a copy of your daughter's essay, a copy of your son's essay, and a copy of the Holt response to each to your local newspaper, television and news radio stations, and state representatives. This nonsense needs to be stopped.
tychabrahe,
yes, I wonder if I should take a stronger, more aggressive stance on this. Thanks for your ideas. How did you hear about me?
I'm a high school student finding this blog as I'm sitting here writing one of these Holt essays. My teacher knows how stupid these are, but since we take a state writing assessment using this kind of system, we have to practice for it.
I think I'm going to go write about space monkeys making teen driving safer. My teacher will get a kick out of it. :)
Amanda,
I'm sorry you have to write one of these, but I'm glad your teacher doesn't take them seriously.
Good luck with space monkeys!
Hi. I am a middle school student. I am very concerned with this because our teacher will soon be grading our essays using this system. I put it to the test and it gave me a prompt about a role model. I submitted a strong, well thought out essay. It was reasonably long and it gave me 2/4. I was ashamed until I read your blog. (I looked up Holt Online Essay Scoring and it had this post on the list of results.) I am worried that I will fail because my teacher will unknowingly give me a bad grade! (Could write about how to put out a grease fire with propane.)
Hi Avery,
Show your teacher this post. She could even copy and paste my son's essay into the program and see what happens. Hopefully she'll see that a computer can't really score an essay. If she still gives you a low grade, don't worry. Colleges never look at middle school grades. Just do your best--that's all you can do!
Most teachers who use this program know that to use it effectively, one must have students use it as a revision tool only. Many students really hate revising and that is part of the problem. In order for students to become better writers, they need to write more often. Online essay scoring allows that. However, a teacher still needs to score the essays. Students may write an essay and continue to revise it in a less painful way. It also allows the teacher to work one on one with struggling writers because after the Holt product gives the feedback, the teacher can take a look at the essay and give the individual student help immediately, not a week later as in a typical classroom setting. Remember, English teachers typically have 100-150 students per day and if you multiply that by two essays per month, the teacher has no life outside of work other than grading those essays. I doubt many of the posters on this site work more hours than the typical high school English teacher for the money they earn. While I think your husband's essay is funny, he was only trying to buck the system. It obviously didn't work well for him. Many students are able to get a six on these essays. I guess they are just better writers. ;-)