Here are some of the main points I made about testing during the discussion:
- Multiple-choice tests largely measure shallow skills, not higher-order thinking
- Attempts to measure deeper thinking via “efficient” measures are quixotic
- Isolate skills from real content – no connection to the actual curriculum taught
- Result in a test-prep curriculum – shallow skills with little relevance to meaningful texts
- Test-prep curricula remove engagement with meaningful texts from the students most in need of access to rich literature
- Used primarily to evaluate teachers, not to diagnose students
- Based on proficiency
- Needs of students with disabilities are not factored into test design – accommodations are an afterthought
- Causes suffering to students who are struggling
We should abandon altogether the multiple-choice tests, which are in vogue not because they are an effective tool for judging teachers or students but because they are an efficient means of producing data. Instead, we should move toward extensive written exams, in which students could grapple with literary passages and books they have read in class, along with assessments of students’ reports and projects from throughout the year.
This is good advice. By connecting tests to the actual curriculum taught, we can avoid the tunnel vision of test prep.
Another great article on testing was recently posted on Washington Monthly, by Ed Sector's Susan Headden, entitled A Test Worth Teaching To. Headden notes many of the same issues that Hollander points out, and she adds that tests are designed to be efficient and cheap, and thus don't measure the higher-order thinking that open-ended questions would promote.
Headden is hopeful that the new tests designed by the Common Core testing consortia will be tests worth teaching to, because they will be more akin to the open-ended, higher-order thinking challenges posed by AP and IB exams. She also notes that they will be computerized and adaptive, with performance learning tasks that can better diagnose students' capacity for deeper analysis.
I am also hopeful about the new tests and believe that the adaptive nature of the questions will provide much more timely and useful information. However, I remain skeptical that a test which assesses skills in isolation from the actual curriculum taught can really be a great improvement.
There might be one other non-robotic way to bring down the cost of scoring: assign the task to local teachers instead of test-company employees. According to the Stanford Center for Opportunity Policy in Education, the very act of scoring a high-quality assessment provides teachers with rich opportunities for learning about their students’ abilities and about how to adjust instruction. So teachers could score assessments as part of their professional development—in which case their services would come “free.”
This is also good advice. Having local teachers score these deeper tests would give them a valuable opportunity to dig into the questions and understand where their students are struggling.
So if we put Headden’s and Hollander’s advice together, we could perhaps have a test worth teaching to: tests based on real literature that students have read during the year, scored by local teachers instead of test-company employees or computers. Then if we also consider the needs of students with disabilities from the outset of test design, rather than as an afterthought, we could truly have some great tests.
But as I said on the live chat, we also need to stop our obsession with using tests as evaluative instruments. We could move testing to a randomized or staggered basis (every 2 or 3 years) and put the remaining money toward the much more important work of directly observing school learning environments and assessing school curricula.
Efficient? No. But well worth the undertaking, given the issues outlined in our live chat.