Arizona's High School Graduation Exam Part 2: Math Section Critique
posted by Rod on Wednesday March 28, @04:18PM
Math Analysis of the Mathematics Portion of the Arizona Instrument to Measure Standards (AIMS) High School Form A Released Items

James A. Middleton
Arizona State University

Intent

Arizona's Instrument to Measure Standards, or AIMS, has been a hotly contested policy for some years now, with no discernible resolution to the various dilemmas that both advocates and opponents have argued. This piece attempts to reveal the underlying problems and promise of a statewide educational assessment system, and to project a means by which progress towards a meaningful assessment of high quality standards, and its resulting use for purposes of accountability, can be achieved. I will focus my examples on the mathematics portion of the test, because my area of expertise is in the teaching and learning of mathematics, but the underlying philosophical arguments, I think, can be applied to writing and literacy as well... (Enter the forum for the rest of the article).

We will be contacting CTB/McGrawHill (the company that owns the test bank from which the AIMS math items were taken) as well as the Arizona Dept. of Education for responses. If you have any questions you would like to ask either of these institutions concerning the items, please post them in the forum, and we will pass them along.


Analysis

Using the released items from the Spring 1999 administration of the AIMS mathematics portion, High School Form A (Released January 26, 2001, nearly two years later), I have analyzed each item for mathematical accuracy, potential for multiple interpretations (which tends to cause confusion in children in a high-stakes situation, unrelated to their degree of understanding of the content), and realism in terms of any pragmatic context within which an item may have been embedded. Of the 38 Core items, fully 17 (45%) had some problem that could have caused a consistent measurement error, meaning that the score the student received for that item may not reflect their actual level of understanding of, or skill in, the content. Of those 17, ten have problems significant enough to warrant their removal from the assessment. This analysis indicates that over 1/4 of the AIMS mathematics assessment, if the released items are a representative sample, provides incorrect data to the state department of education, school districts, parents and children anxious to graduate. If the AIMS test were subjected to the same level of rigor as I apply to my students, it would receive a C- grade--enough to warrant academic probation at any collegiate institution in the country, suspension from academic activities at any high school, and dismissal from any corporation that commissioned an employee to oversee its quality control.
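The proportions quoted above are easy to check. The following is a minimal sketch in Python that uses only the counts stated in this analysis (38 Core items, 17 flagged, 10 seriously flawed); the variable and function names are my own and purely illustrative.

```python
# Counts taken from the analysis above.
core_items = 38      # Core items in the released Form A
flagged_items = 17   # items with some potential problem
serious_items = 10   # items flawed enough to warrant removal


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


flagged_pct = share(flagged_items, core_items)
serious_pct = share(serious_items, core_items)

print(f"Flagged items: {flagged_pct:.1f}% of Core items")
print(f"Seriously flawed items: {serious_pct:.1f}% of Core items")
print("More than one quarter seriously flawed:", serious_pct > 25.0)
```

The check only confirms the arithmetic; whether the released items are a representative sample of the full test is a separate question.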

To the lay eye, it may appear that I am being picky, criticizing the minutest detail of the exam. The lay eye is perceptive. I am being picky. Any first semester student of psychometrics (the statistical study of test design, administration, and analysis) could tell you that if a test is to provide reliable and valid data, its items must be designed well, reflect the standards of the content, and clearly allow students who understand the content to demonstrate that understanding. All standardized assessments are subjected to rigorous developmental cycles to perfect the tool and make it useful for the purposes of the assessment. At this time, the AIMS test has not undergone enough work to conform to these standards of quality. This analysis points out the flaws in the released items, critiquing the instrument, and by extension the attenuated time frame and conceptual framework of its development. It does not make the case for completely dismantling the process, nor does it remonstrate with any individual or agency who might be held accountable for these mistakes. Instead, it suggests that the people of Arizona have resources at their disposal in a multitude of institutions that should work together in designing a useful and cost-effective program of assessment for Arizona's children.

The trouble begins on page 1, the AIMS Reference Sheet, on which are placed potentially useful formulas and theorems for the students to use in taking the test. Unfortunately, the students cannot trust the Reference Sheet, as the formula for the Volume of a Sphere is incorrect. Instead of 4/3 pi r^2, the stated formula, the actual formula should be 4/3 pi r^3. Moreover, even if the student caught the mistake, they may not remember the value of pi, since the Key on page one suggests that students use 3.14 or 22/7 as the value for p, the Greek symbol for rho, not pi.
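For readers who want to confirm which formula is right, here is a hedged sketch in Python. It compares the standard formula and the formula as printed on the Reference Sheet against an independent estimate obtained by slicing the sphere into thin disks. The radius of 3 units and the function names are assumptions chosen only for illustration.

```python
import math


def sphere_volume(r: float) -> float:
    """Standard formula: V = (4/3) * pi * r^3."""
    return 4.0 / 3.0 * math.pi * r**3


def reference_sheet_formula(r: float) -> float:
    """The formula as printed on the Reference Sheet: (4/3) * pi * r^2."""
    return 4.0 / 3.0 * math.pi * r**2


def volume_by_slicing(r: float, slices: int = 100_000) -> float:
    """Estimate the volume by summing thin circular disks (midpoint rule)."""
    dz = 2.0 * r / slices
    total = 0.0
    for i in range(slices):
        z = -r + (i + 0.5) * dz                 # height of this disk's center
        total += math.pi * (r * r - z * z) * dz  # disk area times thickness
    return total


r = 3.0  # hypothetical radius, arbitrary units
print("Standard formula:       ", sphere_volume(r))
print("Reference Sheet formula:", reference_sheet_formula(r))
print("Disk-slicing estimate:  ", volume_by_slicing(r))
```

The slicing estimate agrees with the r^3 formula and not with the printed one. The two formulas coincide only when r = 1, which is one reason such an error can slip past a casual check.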

It gets worse from there...

The items on the exam with potential problems are listed here: 1, 2, 5, 12, 16, 17, 18, 20, 21, 23, 24, 25, 27, 29, 33, 34, and 36. Those in plainface type have some problems, but could be salvaged and used if they undergo some revision. Items in boldface are those that I determine to be seriously flawed. Below, I cite a few egregious examples. The figures and text for the items are redrawn and retyped. The intent is to faithfully reproduce the items as well as the computer of the author will allow in a short time frame for quick turnaround of this paper. Where there are differences, these are not mathematically relevant, nor do they relate to the context within which the problems are situated. For the actual text of the released items, download the PDF version from the State Department website: http://www.ade.state.az.us/AIMSReleaseSummary1-26.pdf.

    Examples

      Problem 16:

      Alex is building a ramp for a bike competition. He has two rectangular boards. One board is 6 meters long and the other is 5 meters long. If the ramp has to form a right triangle, what should its height be?
      A 3 meters
      B 4 meters
      C 3.3 meters
      D 7.8 meters

In this item, none of the answers is correct. The student is expected to use the Pythagorean Theorem (Hypotenuse^2 = Side1^2 + Side2^2). So, (6 m)^2 = (5 m)^2 + (EF)^2. To maintain a right triangle, the only correct answer is (11)^(1/2) meters, one that is cumbersome in real life, and so requires rounding off to an acceptable level of accuracy. Depending on the convention for rounding, a reasonable height could be 3 meters (if the convention is rounding to the nearest meter), 3.3 meters (if the convention is rounding to the nearest decimeter), 3.32 meters (if the convention is rounding to the nearest centimeter), and so on.

The answer marked as correct, 3.3 meters, is actually about 1.7 centimeters off (about 5/8 inch). Any carpenter worth his or her salt would not make an error of 5/8 inch given a tape measure that is precise to 1/32 inch.

Moreover, as a male, I cringe at the thought of a bike competition that requires riders to jump off 3.3 meter heights (between 10 and 11 feet, ouch!). Or if the rider is to ride down the ramp, a slope of 66% (33.5 degrees) is steep enough to scare the bejeebers out of me.
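The numbers in this critique of Problem 16 can be reproduced with a short Python sketch. It assumes, as the analysis above does, that the 6 meter board is the hypotenuse and the 5 meter board lies flat along the ground; the variable names are hypothetical and the unit conversions use the standard 2.54 cm per inch and 0.3048 m per foot.

```python
import math

hypotenuse_m = 6.0  # longer board, assumed to be the riding surface
base_m = 5.0        # shorter board, assumed to lie flat on the ground

# Pythagorean Theorem: height^2 = hypotenuse^2 - base^2 = 11
height_m = math.sqrt(hypotenuse_m**2 - base_m**2)

# The same height under three rounding conventions.
nearest_meter = round(height_m)
nearest_decimeter = round(height_m, 1)
nearest_centimeter = round(height_m, 2)

# How far the keyed answer (3.3 m) is from the exact height.
keyed_answer_m = 3.3
error_cm = (height_m - keyed_answer_m) * 100.0
error_in = error_cm / 2.54

# Realism checks: height in feet, slope as a percentage and as an angle.
height_ft = height_m / 0.3048
slope_pct = 100.0 * height_m / base_m
angle_deg = math.degrees(math.atan2(height_m, base_m))

print(f"Exact height: {height_m:.4f} m ({height_ft:.1f} ft)")
print(f"Rounded: {nearest_meter} m, {nearest_decimeter} m, {nearest_centimeter} m")
print(f"Keyed answer is off by {error_cm:.2f} cm ({error_in:.2f} in)")
print(f"Slope: {slope_pct:.0f}% ({angle_deg:.1f} degrees)")
```

If the 5 meter and 6 meter boards were instead treated as the two legs, the third side would be the square root of 61, or about 7.8 meters, which is presumably where distractor D comes from.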

Lastly, a 6 m board? Come on! When was the last time you found a board of 20 feet at Home Depot? In short, the context within which the problem is embedded shows a lack of the everyday sense for numbers that is required in the elementary standards for Arizona children.

      Problem 18:

      Which of the following is a secant of circle P?
      A (line) AB
      B (line) CE
      C (line segment) GP
      D (line segment) FD

(I use parenthetical terms in reconstructing this problem because my word processor has difficulties with the mathematical symbols--j.m.)

In this problem, there are three correct answers. Only answer C is incorrect. Line AB is a secant of circle P because it is tangent, and all tangents are defined in elementary calculus courses as degenerate secants using the epsilon-delta definition of a derivative at a point. Line segment FD is a secant, as it is the diameter of the circle, and therefore intersects the circle in two points. Line CE (the "right answer") is a secant to P since it intersects the circumference at two points.
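The idea that a tangent is a degenerate secant can be illustrated numerically. The sketch below is my own illustration, not part of the released item: it takes a point on a unit circle, draws secants through that point and a second point that slides toward it, and compares their slopes with the slope of the tangent. The circle, the angle, and the function names are all assumptions made for the example.

```python
import math


def point_on_circle(theta: float):
    """Return the point on the unit circle at angle theta (radians)."""
    return math.cos(theta), math.sin(theta)


def secant_slope(theta: float, delta: float) -> float:
    """Slope of the secant through the points at theta and theta + delta."""
    x1, y1 = point_on_circle(theta)
    x2, y2 = point_on_circle(theta + delta)
    return (y2 - y1) / (x2 - x1)


def tangent_slope(theta: float) -> float:
    """Slope of the tangent at theta: perpendicular to the radius."""
    return -math.cos(theta) / math.sin(theta)


theta = math.pi / 3  # hypothetical point of tangency
for delta in (0.5, 0.1, 0.01, 0.001):
    print(f"delta = {delta:<6} secant slope = {secant_slope(theta, delta):.5f}")
print(f"tangent slope        = {tangent_slope(theta):.5f}")
```

As the second point approaches the first, the secant slopes settle toward the tangent slope, which is the sense in which a student with some calculus might reasonably count AB as a secant.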

Which answer should the student choose? The definition of a secant is "a straight line that intersects a curve in two points." There may be some argument over whether FD is a secant, as it is a line segment, and therefore only lies on the secant that contains FD. It may surprise Americans to realize that the term for straight objects in much of the world is the equivalent of "line", and a special designation of "infinite" or "unending" is placed before the word to denote what Euclid termed, "breadthless length." While this kind of argument over terms may be useful to establish norms for communication among people who speak different languages, it is unclear whether all high school graduates need to be so well versed in specific definitions that could be looked up in any mathematics dictionary. An advanced student would NOT want to choose the obvious answer, as the cases of AB and FD are much more interesting mathematically than CE.

Unfortunately, the AIMS test is scored so that only one answer can be counted as correct. What about a student who was unsure, seeing three examples of secants, but only being able to choose one? "Do I remember the definition correctly?" "What if it is something else?" These kinds of questions in a high stakes exam throw the marginal student into unnecessary confusion, often leading to frustration and unnecessary errors merely as a result of taking the test.

      Problem 23:

      Aaron used the Pythagorean Theorem to find the height of a tree. He calculated that the tree was square-root(625) feet tall. Which of the following should be used to write the height of the tree?

      A +- 25 feet
      B 25 feet
      C - 25 feet
      D 25^2 feet

This problem illustrates lack of attention to the context within which the intended content is situated. Though it is not technically impossible to use the Pythagorean Theorem to calculate the height of a tree, it is absurdly impractical. To calculate the height using Pythagoras, one must first have the distance from the tree, and the length of the hypotenuse of the right triangle (the length of a wire if it were stretched from the tip top of the tree to the point where the observer, Aaron, is standing, see below).

Why go to the trouble of climbing the tree, stringing a wire and pacing off the distance, when a simple use of the tangent ratio can calculate the height with just the distance to the tree and the angle of elevation of the top? The tangent of the angle of elevation (tan alpha) is equal to the height of the tree divided by the distance (h/d). So, the height is equal to the tangent of the angle multiplied by the distance (h = d tan alpha). This is a common middle school geometry activity.
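A minimal sketch of the tangent-ratio method in Python follows. The distance and angle in the example are invented for illustration, and the calculation ignores the observer's eye height, as the description above does.

```python
import math


def tree_height(distance: float, elevation_deg: float) -> float:
    """Height from h = d * tan(alpha), with alpha given in degrees."""
    return distance * math.tan(math.radians(elevation_deg))


# Hypothetical measurements: 25 feet from the tree, top sighted at 45 degrees.
distance_ft = 25.0
elevation_deg = 45.0
print(f"Estimated height: {tree_height(distance_ft, elevation_deg):.1f} feet")
```

At 45 degrees the height equals the distance, which gives students an easy sanity check before they try other angles.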

Another reasonable method would be to hold up your thumb in front of your face and walk to or away from the tree until the tree appears to be the height of the tip of your thumb to the first knuckle (~ 1 inch). Then pace off the distance to the tree. The height of the tree is found using similar triangles where the ratio of the distance from your eye to your thumb : size of your thumb (here the ~ 1 inch becomes useful) is equivalent to the distance from your original position to the tree : height of the tree.
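The thumb method is a direct application of similar triangles, and can be sketched in a few lines of Python. The arm length, thumb size, and paced distance below are hypothetical numbers chosen only to make the proportion easy to follow.

```python
def height_by_thumb(eye_to_thumb: float, thumb_size: float,
                    distance_to_tree: float) -> float:
    """Solve eye_to_thumb : thumb_size = distance_to_tree : height for height.

    All lengths must be in the same unit.
    """
    return distance_to_tree * thumb_size / eye_to_thumb


# Hypothetical measurements, all in inches:
# thumb held 24 inches from the eye, thumb tip to knuckle about 1 inch,
# and 600 inches (50 feet) paced off to the tree.
height_in = height_by_thumb(eye_to_thumb=24.0, thumb_size=1.0,
                            distance_to_tree=600.0)
print(f"Estimated height: {height_in / 12:.1f} feet")
```

The estimate is only as good as the pacing and the thumb measurement, but it requires no climbing and no wire.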

Any Scout could tell you this.

What the test designers are looking for is for students to find the positive square root of 625. Why not just ask, "What is the positive square root of 625?" Alternatively, if knowledge of the Pythagorean Theorem is desired, one could ask, "The dimensions of a rectangular parking lot are 25m by 15m. What is the length of the diagonal?"

This lack of attention to the details of context is indicative of the generally shoddy engineering of the AIMS items.

      Problem 29.

      The graph depicts a real-world situation. Which of the following situations could it depict?
      A A person dove into the water
      B A person jumped from a tree to the grass below
      C A plane landed safely
      D A plane crashed into the runway

This problem is just awful. First, the problem brazenly states that the graph depicts a real-world situation. The authors of the test then go on to provide a graph that doesn't reasonably depict any of the situations presented as answers.

As any student of physics knows, the relationship between height and time for a body in freefall is curvilinear (parabolic, actually). This means that the first two answers (A and B) are both impossible (assuming a continuous time scale), as a jumping person does not reach terminal velocity in the short heights people can safely jump from.
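To see why a graph of freefall cannot be a straight line, consider the sketch below. It tabulates height against time using h(t) = h0 - (1/2) g t^2, ignoring air resistance. The starting height of 3 meters and the time steps are assumptions made for illustration only.

```python
G = 9.81  # acceleration due to gravity in m/s^2 (approximate)


def height_at(t: float, h0: float) -> float:
    """Height after t seconds of freefall from rest at height h0."""
    return h0 - 0.5 * G * t**2


h0 = 3.0                      # hypothetical jump height in meters
times = [0.0, 0.2, 0.4, 0.6]  # equally spaced times in seconds
heights = [height_at(t, h0) for t in times]

# Distance fallen during each successive 0.2 second interval.
drops = [heights[i] - heights[i + 1] for i in range(len(heights) - 1)]

for t, h in zip(times, heights):
    print(f"t = {t:.1f} s  height = {h:.3f} m")
print("Drop per interval:", [round(d, 3) for d in drops])
```

On a straight-line graph the drop in each equal time interval would be the same. Here each drop is larger than the one before it, which is what makes the curve parabolic.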

We don't know from the graph what the scale is for either height or time. Does the graph depict the first millisecond, second, ten seconds, minute? Is the height astronomical? Infinitesimal? Reasonable? Are the scales equal interval or are they logarithmic? Are they idealized or do they depict actual data? Without these bits of information (which are necessary for the interpretation of any graph that depicts a real-world situation and not a purely mathematical one), we really cannot tell whether the last two answers (C and D) are plausible.

Is the Zero point for height the altitude of the runway? If so, D could be the most reasonable response: Because a plane has an engine, it could, conceivably, put the engine in reverse to eliminate the acceleration of gravity or alternatively speed up to a velocity greater than (or equal to) terminal velocity. Either way, a plane could hit the earth after traveling at a constant velocity for a period of time. The plane could have then plowed into the runway, where it hit a layer underneath the ground so elastic it took no appreciable time for the plane to bounce back up to ground level at approximately the same rate as it entered it.

C could also be the correct answer if the plane (in this case a small plane of the kind still thrown, I am told by first hand sources, in classes where the AIMS test is administered) dips below the Zero point, and then bounces up off an object to be caught in a net.

The "correct" answer, A, has the following plausible shape for the scenario proposed (again, forgive my computer's lack of attention to perfect drawings), and can therefore be eliminated as a reasonable answer to the item:

At any rate, the lack of attention to the realism of the context, and to the ways in which problems may be interpreted, shows again that the design of the test itself (the mathematical content as well as the context within which the problems are situated) is fundamentally flawed, causing the flawed items to be inaccurate indicators of student learning and achievement.

So, should AIMS be scrapped?

I would like to state up front that I am not against a statewide assessment of mathematics achievement. In fact, I advocate the development and administration of high quality assessments to assist schools in providing the best quality curriculum and instruction to all of our children--this is what they deserve. The key here is that the assessment must be designed to provide detailed information to students, teachers, administration, and the state (in that order), as to how they are achieving high quality standards, and especially how they can improve. The current furor over AIMS is due, to a large extent, to the fact that the results of the test (disregarding any case that might be made about the dubious validity of the test itself) do nothing for anyone. So we found out that 48% of the high school students who took AIMS could sketch a cone (problem 24 in Form A). What does that say about instruction? What if a student did not sketch a cone successfully? Does that mean he or she didn't know what a cone was? Does it mean that the teacher did not provide enough "cone" experiences? Should we now focus our instruction on coneness? Hold on, 37% of the students tested did not even fill out the response. We don't know if they could answer the question or not, just that they chose not to. Hold on again, students who drew a "net" of a cone (a 2-d map of the figure that could be folded up to form a cone if cut out) only got 1 point instead of 2 possible points. If my understanding of development holds true (and it does), the ability to reallot shape is much more sophisticated, both cognitively and mathematically, than being able to draw a figure from memory. Why isn't this taken into account?

The point here is that even if we know students do not perform up to our minimum standards on the AIMS assessment, the information the test provides to those who hold a stake in public education is virtually useless except to berate the system for failing again. What would happen if we designed an assessment system that supports high quality learning for all students, and constitutes an important piece of a feedback loop for the continual improvement of the system? What would such an assessment look like?

First, such an assessment must be embedded in both the quality mathematics that can be applied across pragmatic situations and those pragmatic situations that prove to all concerned that the mathematics being taught is, in fact, related to future life success. The superficial contextualization of problems as embodied by the current AIMS test does neither.

Second, the assessment must have a "high ceiling," so that as subsequent cohorts of students take the exam, the ability to show improvement is built in. Currently, the level of content of the AIMS test is pretty good. I anticipate that all children who pass through our high schools should be able to reason with the level of algebra, geometry, statistics, and discrete mathematics that the designers of the AIMS test have chosen--in time, as school districts adjust to a more rigorous mathematics curriculum, coupled with teaching methods with sound empirical data to back them up. I think the items measuring this level of content could be made to illustrate more useful and important situations that all informed citizens should be aware of, but the actual level of content as is, is NOT too difficult.

Despite this, however, the question of what an appropriate "cut score" is for such an assessment is problematic. Suppose 60 percent correct is determined to be an appropriate minimum standard. That means that an 18 year old student, who has gone through 13 years of public instruction in good faith, can be denied graduation (and all of those benefits of graduation such as a decent job, decent housing and self-respect) only by reason of not passing a sit-down math test. This in spite of 13 years of passing grades. Whose failure is this? The student's? Sure, he/she didn't meet minimum standards. I'll buy that. However, the system must also be held responsible, because the child and his/her parents have kept their part of the bargain that is entered into when a child first sets foot into the public schools. Furthermore, who determines the level to which the test measures potential for contribution to a vital economy? Isn't this what compulsory education is about, ostensibly? Does a 60 percent correct student really contribute while a 59 percenter does not? What recourse do the child and parents have, should the student fail?

Third, a tight feedback loop must be built from the results of the AIMS, back to the teaching and learning of the student. This is, after all, the only real justification for such an exam, that teaching and learning improve, continually, consistently. To do this, the design of the test itself must reflect the development of reasoning within the areas of algebra, geometry, statistics, and discrete mathematics, such that when a student's responses are scored, a defensible indicator of his or her level of understanding of fundamental concepts is produced. The teacher and student could then use this information to bolster areas where understanding is lacking. Without such a loop, only broad policy-level changes can be made, that may or may not benefit any individual student (see the past 10 years or so of standards-setting, etc. to reveal a national lesson in futility). The point here is, if you want to affect education where the rubber hits the road, you need information that helps you redesign either the tire or the road, or both. Without the information at the right level of detail, the public will be kept wondering, "What can we do to help education?" With the information, the public can answer the more telling question, "How effective have been our reforms?"

Last, the articulation of our education system must begin in the preschool years, where the quantitative and spatial bases for subsequent mathematics are laid down, and continue through university work and corporate training to ensure that what happens early on really does affect the impact that a worker has on the economy of Arizona and the nation. Currently, in our state, the public school system, the teachers' unions, the universities, and the state government are working at cross purposes. Some individuals have been able to cross boundaries, but institutionally, we do not support each other under the banner of "Whatever it takes for our children." No single entity is to blame for the difficulties we face. However, all are to blame if we fail our students. If an assessment system is to be developed that provides solid information about the health of public education, in whatever content is deemed necessary to sustain the New Economy, the answer lies in using the distributed expertise residing in the state to design a coherent and consistent educational system of which the assessment is a small but integral part.

Should AIMS be scrapped? If AIMS is defined as the attempt to design a quality assessment system by which the stakeholders in education, the most important of which are children, their parents, and teachers, can come to understand where the system flourishes, where it is flawed, and what steps can be taken to improve it, then I say, "No." Such a system ensures that accountability is based on evidence, and that changes in the system are designed to fill an identified need. If, however, AIMS is considered the current instrument, of which the released items must be considered a representative sample (otherwise, why were they released?), then I say, "Yes, of course," because to continue to utilize a fundamentally flawed instrument would continue to waste the time of students, teachers, and administrators, the effort and attention of the State Department of Education, Board of Education, and various constituent groups, and the money of the Taxpayers of the State of Arizona. Enough has been wasted already.

Editor's Note: While searching for some additional information, we discovered a second version of the released AIMS items, which contains some additional items as well as a few that are slightly different from those presented in the version this article discusses. In particular, in the second form, the graph for item 29 is curvilinear, which raises the question of which version of this item ultimately appeared on the exam taken by students.


The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Re:Why are there physics problems on a math test? (Score:1)
by jimbo on Thursday March 29, @02:44PM
This is true, Rod. I love these kinds of problems, but they tap into prior knowledge that is not specifically mathematics related. One has to know about accelerated motion to interpret the item I presented.

A second item on the test has students extrapolate from data. Unfortunately, the data is about utilities bills (which are periodic, relating to seasonal changes in temperature), but the item expects students to fit a line... Anyone who knows about periodic functions would get the item incorrect BECAUSE they know the math.

Jim

Why are there physics problems on a math test? (Score:1)
by Rod on Thursday March 29, @10:36AM
I understand the rationale for having students interpret graphs (I believe that's one of the standards being put forward). However, having physics problems, bad physics problems, on a math test seems ridiculous to me.

Since most high school students do not take physics, it seems to me that many students would be at a disadvantage with respect to this kind of item (ambiguous interpretations aside). How many high school students (who have not taken physics) know that objects in free-fall follow parabolic trajectories?

On the other side of the coin, students who have taken physics won't be able to get this item correct because the graph is simply wrong. It's clear that somebody hasn't properly done their homework with respect to a number of these test items.
