1. What’s an algorithm?
An algorithm is, in general, a set of well-defined (and usually efficient) steps followed to solve a predefined problem. In the case of a CAT algorithm, the problem is to estimate a student’s ability reliably and in a reasonable amount of time. Some CAT algorithms approach this by selecting one question at a time, with each subsequent question chosen based on all of the student’s prior responses. Other algorithms look only at the most recently answered question. Still others evaluate responses to specific groups of questions.
CAT algorithms also vary in the explicit criteria they use to select the next question (or set of questions) to administer. Some try to minimize total measurement error. Others try to maximize the precision and accuracy of measurement for each question administered. Still others try to select questions that will most refine the current ability estimate. As a consequence, CAT algorithms can differ greatly from one another, depending on the specific implementation and the intent of the algorithm’s developers.
2. Why does the GMAT use an algorithm when the linear LSAT seems to be a pretty decent gauge of proficiency?
One of the common goals in using a CAT algorithm is to reduce the number of questions a student needs to answer in order to establish, to a specified level of reliability, an estimate of the student’s ability. CATs are often more efficient than linear tests, and so fewer questions are needed to reach a desired level of reliability. The LSAT needs over 100 items to reach that level, while the GMAT needs fewer than 80 to reach a comparable level.
3. Is the entire GMAT adaptive?
Almost all large-scale standardized tests contain some number of “experimental” or “pretest” questions that are administered to the student but do not count toward the student’s final score. This is simply a way for the test makers to gather data on the questions, in order to determine how difficult they are and how well they distinguish between students at different ability levels. They also use the data collected to identify bad questions, so that they can eliminate or fix them before they count.
Some tests, like the LSAT, include all of the pretest questions in a single section. Others, like the GMAT, intermingle the pretest questions with the operational ones. Which section is the pretest section, and which questions are the pretest questions, is usually a well-guarded secret. It is generally a bad strategy to spend time trying to guess whether a given question is operational or not. The price of guessing incorrectly is just too high.
4. How does the GMAT select which questions I get?
CATs like the GMAT have a blueprint — a set of specifications (difficulty, question type, content area, etc.) that define which questions you see. At the same time, each question has certain statistical characteristics that the algorithm uses, based on your response, to estimate your quantitative or verbal ability. The algorithm looks at your performance on the questions you have already answered and the characteristics of each question remaining in the pool and then selects for you the question that simultaneously best satisfies the blueprint and provides the most statistical information it can, to generate the best estimate of your ability.
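Here is a minimal sketch of that kind of selection. It assumes a two-parameter logistic (2PL) item response model and treats the blueprint as a simple per-content-area quota; GMAC does not publish its actual model, parameters, or blueprint, so every name and number below is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    area: str   # content area named in the blueprint (hypothetical labels)
    a: float    # discrimination (assumed 2PL parameter)
    b: float    # difficulty (assumed 2PL parameter)


def p_correct(theta, item):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-item.a * (theta - item.b)))


def information(theta, item):
    """Fisher information a 2PL item provides at ability theta."""
    p = p_correct(theta, item)
    return item.a ** 2 * p * (1.0 - p)


def select_next(theta, pool, remaining_quota):
    """Pick the most informative item whose content area the blueprint still needs."""
    eligible = [it for it in pool if remaining_quota.get(it.area, 0) > 0]
    if not eligible:
        return None
    return max(eligible, key=lambda it: information(theta, it))


pool = [
    Item("q1", "algebra", 1.0, -1.0),
    Item("q2", "algebra", 1.0, 0.5),
    Item("q3", "geometry", 1.5, 0.4),
    Item("q4", "geometry", 0.8, 2.0),
]

# Blueprint still needs one algebra question and no more geometry questions.
choice = select_next(0.5, pool, {"algebra": 1, "geometry": 0})
print(choice.item_id)
```

In this toy pool, the geometry item q3 would be the most informative at an ability of 0.5, but the blueprint has no geometry slots left, so the sketch falls back to the best algebra item. That is the sense in which the selection has to satisfy both constraints at once.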
How is the GMAT actually scored? Here are some more questions that students frequently have about its algorithm.
1. My score doesn’t seem to match my performance: I only got a few questions wrong, but my score isn’t as high as I thought it would be / I got a bunch of questions wrong, yet my score seems higher than it should be.
Most exams are linear assessments, like the SAT or your 10th grade history final. These are scored by counting the number of questions you answer correctly, and sometimes by penalizing for each question you answer incorrectly. The result, a raw score, is then converted to a scaled score, like the 600-2400 range for the SAT.
A computer-adaptive test (CAT) works very differently. It cares less about how many questions you get right or wrong than about which questions you get right and wrong. The CAT algorithm estimates your ability based on a variety of criteria, including the difficulty of each question. After each question, it evaluates your response and updates this estimate. When the test is over, the algorithm converts your quantitative and verbal ability estimates into the quantitative and verbal scaled scores, and then separately combines your quantitative and verbal ability estimates to calculate the overall score.
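To make the "which questions, not how many" idea concrete, here is a rough sketch of an ability estimate that is updated from the responses so far. It assumes a 2PL item response model and a standard normal prior, and it computes the posterior mean over a grid of candidate abilities. This is one common textbook approach, not necessarily what the GMAT uses, and the item parameters are made up.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def estimate_ability(responses):
    """Posterior-mean ability given a list of (a, b, correct) responses.

    Uses a grid from -4 to 4 and an unnormalized N(0, 1) prior.
    """
    grid = [i / 10.0 for i in range(-40, 41)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)
        for a, b, correct in responses:
            p = p_correct(theta, a, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


# Two test-takers each answer two of three questions correctly,
# but one saw hard questions and the other saw easy ones.
hard_path = [(1.0, 1.0, True), (1.0, 1.5, True), (1.0, 2.0, False)]
easy_path = [(1.0, -2.0, True), (1.0, -1.5, True), (1.0, -1.0, False)]

theta_hard = estimate_ability(hard_path)
theta_easy = estimate_ability(easy_path)
print(theta_hard > theta_easy)
```

Both test-takers have the same number correct, yet the one who answered harder questions ends up with the higher ability estimate.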
2. Do the first X number of questions matter more?
Many variables come into play when the CAT selects your next question. One of them is the CAT’s current estimate of your ability. It uses this estimate to select questions that will be most useful in refining that estimate (if you’re a high-performing student, giving you low-difficulty questions isn’t usually as useful in discerning your true ability as giving you harder questions, and vice versa). What is important to remember is that you should not try to guess how you are doing by whether the question in front of you seems easy or difficult; every question deserves your full attention. With that understood, unless you have completely bombed the test, it is usually the case that missing a couple of very hard questions late in the test will have a smaller effect on your final score than missing a couple of very easy questions earlier, not because of their position within the test but because of their levels of difficulty.
3. How severe is the penalty for not finishing a section?
The penalty is significant. You can expect your scaled score to decrease by roughly 1 point for every question that you don’t answer. For example, if you correctly answer every question you encounter but fail to answer the last five, you generally won’t score higher than a 46.
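The arithmetic behind that example can be sketched as follows, assuming a flat penalty of about one scaled point per unanswered question and the reported section maximum of 51. Both are rules of thumb rather than a published formula, and the function name is hypothetical.

```python
def max_section_score(unanswered, top_score=51, penalty_per_question=1):
    """Rough ceiling on a section score when questions are left unanswered.

    Assumes every answered question was correct and a flat penalty of
    about one scaled point per unanswered question (a rule of thumb).
    """
    return max(0, top_score - penalty_per_question * unanswered)


print(max_section_score(5))  # 46
```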
4. I took the GMAT and got a 710, 44q/44v/6 AWA. A friend of mine happened to take the test 6 days later and got the exact same quant/verbal scaled scores, but he got a 720. How could this happen?
Both the individual section scores and the overall score are calculated using an estimate of your Math and Verbal abilities derived from your performance on the CAT. Your overall score is not calculated from your section scores. Because your underlying ability estimate might be slightly different from your friend’s, your overall scores might be different.
For example, there is a range of ability estimates that translates into a Verbal score of 40, and there is a range of ability estimates that translates into a Math score of 42. Depending on which specific estimates are calculated for you, your overall score could range from 660 to 680. Please note that the Standard Error of Measurement (SEM) on the overall score for the GMAT is 29 points, so the scores in that range all fall within the standard error.
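Here is a toy sketch of how identical section scores can sit beside different overall scores. The two transformations and the ability values are invented purely for illustration (the real ones are not public); the only point is that rounding happens separately for section scores and for the overall score.

```python
def section_score(theta):
    """Hypothetical transformation from an ability estimate to a section score."""
    return round(30 + 10 * theta)


def overall_score(theta_q, theta_v):
    """Hypothetical transformation from both ability estimates to an
    overall score, reported in steps of 10."""
    return 10 * round(50 + 7.5 * (theta_q + theta_v))


you = (1.36, 1.36)     # (quant, verbal) ability estimates, made-up values
friend = (1.44, 1.44)  # slightly higher estimates, same section scores

for name, (tq, tv) in (("you", you), ("friend", friend)):
    print(name, section_score(tq), section_score(tv), overall_score(tq, tv))
```

Both test-takers land on the same section scores because their ability estimates fall in the same rounding band, but the overall score is computed from the unrounded estimates, so the small differences add up.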
How can my overall percentile be higher than both my quantitative and verbal percentiles?
Your overall score is calculated separately from your section scores, so you can score in the 99th percentile on the GMAT even if you didn’t score in the 99th percentile on either of the sections. For example, you could get a 48 on Quantitative (86th percentile), a 45 on Verbal (98th percentile), and a 760 overall (99th percentile).
Are the quantitative and verbal sections weighted equally in the total score?
Technically, yes — the estimates of your quantitative and verbal abilities that the CAT produces contribute the same amount to your overall score. However, the verbal section has a greater effect on your percentile rank because it is generally more difficult. If, for example, you scored a 40 on both the Quantitative and Verbal sections, your percentile rank for Quantitative would be 61st, but for Verbal it would be 91st. Your overall score (650) would be in the 84th percentile.
Why are scores above 51 rare? Why does the scale go up to 60? Can anyone get a 52?
For psychometric reasons, GMAC has truncated the scale at 51 (they do not report section scores higher than 51).
Why is it so difficult to create a good CAT?
A CAT needs to do many things well in order to reliably and accurately estimate your ability. It requires a robust algorithm to estimate your ability, a complex but speedy mechanism to identify the best question for you to see next, a rich pool of questions from which to select the questions, and a powerful scoring algorithm that translates the ability estimate into something meaningful.
Each test question has many characteristics that need to be simultaneously considered in the selection. The statistical characteristics of the questions all need to be determined beforehand through a process known as pretesting. Many, many questions are needed in order to be able to provide accurate assessment for all ability levels. And all of those questions need to be carefully constructed, reviewed, and statistically aligned so that they contribute meaningfully to your ability estimate.
How tests are scored
We’ve received grades all our lives. In fact, we’re so used to them that we often don’t think very much about what they mean, or how they are calculated. So today we’re going to look at some of the different ways in which tests are scored, and at what those scores mean.
In preschool, we receive grades in the form of category scores: gold stars, silver stars, or bronze stars. Sometimes we might get two gold stars, or even three gold stars. These kinds of grades divide the relevant universe of people into some small number of categories, usually low-medium-high.
Later on we start to receive simple tally scores: 8/10 or 23/25. Soon these are represented as percentages: 80% correct, or 92%. One of the funny things about grades is that by the time we’re in high school and college, grades have reverted back to category scores (A, B, C, D, F) through a transformation of the percentages.
Every teacher and school adopts slightly different transformations. In some places, a grade of A is reserved for 96% and above. In other places the cutoff is 92%. In still others, it might be 90%. So what an “A” means can vary widely from place to place.
Everyone knows that some test questions are more difficult than others. Occasionally, teachers will take this into account by awarding more points for the hard questions than for the easy ones.
The basic sequence for most kinds of scoring is this:
- Count the number of questions, or the number of points associated with each question, that you answered correctly.
- Subtract, if applicable, any penalty for incorrect answers. This result is your “raw score.”
- Apply some transformation to your raw score (e.g., divide by total possible points, or use some more complicated function) to arrive at your “scaled score.”
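As a sketch, the sequence above might look like this. The quarter-point penalty and the linear 200-800 transformation are assumptions chosen for illustration; real tests use their own penalties and conversion tables.

```python
def raw_score(points_correct, num_incorrect, penalty=0.25):
    """Sum the points for correct answers, then subtract the penalty
    for each incorrect answer (omitted questions cost nothing)."""
    return sum(points_correct) - penalty * num_incorrect


def scaled_score(raw, max_raw, low=200, high=800):
    """Hypothetical linear transformation of a raw score onto a scale."""
    fraction = max(0.0, raw) / max_raw
    return round(low + fraction * (high - low))


# 50 one-point questions: 40 correct, 8 incorrect, 2 omitted.
raw = raw_score([1] * 40, num_incorrect=8)
print(raw, scaled_score(raw, max_raw=50))
```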
For those of you taking the GMAT, the basic sequence is very different. Because the GMAT is an adaptive test, it looks at your performance on each question as you respond to it, and estimates your math or verbal ability along the way. Then it uses that ability estimate to calculate your score. For the GMAT, the basic sequence is:
- Deliver a test question. Using your answer and a number of other factors, including the difficulty of the question, estimate your ability.
- Based on the current estimate of your ability, select a question that will maximize the amount of information that can be used to refine the ability estimate.
- Repeat the two steps above until the test is complete.
- Apply a transformation to the resulting estimate of your ability to determine your section score.
- When you have completed all sections of the test, apply a transformation using all of the resulting ability estimates to determine your overall score.
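The adaptive sequence above can be sketched as a short loop. This is a toy under stated assumptions: a Rasch (one-parameter) response model, a posterior-mean ability estimate with a standard normal prior, a simulated test-taker who answers correctly up to a fixed difficulty, and an invented section-score transformation truncated at 51. None of these details come from GMAC.

```python
import math

GRID = [i / 10.0 for i in range(-40, 41)]  # candidate ability values


def p_correct(theta, b):
    """Rasch probability of a correct answer (an assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate(history):
    """Posterior-mean ability from (difficulty, correct) pairs, N(0, 1) prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta ** 2)
        for b, correct in history:
            p = p_correct(theta, b)
            w *= p if correct else 1.0 - p
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def run_cat(pool, answer, length):
    """Deliver a question, re-estimate ability, select the next question, repeat."""
    remaining = list(pool)
    history = []
    theta = 0.0
    for _ in range(length):
        # Under the Rasch model, the most informative item is the one
        # whose difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        history.append((b, answer(b)))
        theta = estimate(history)
    return theta, history


def section_score(theta):
    """Hypothetical transformation to a section score, truncated to 0-51."""
    return max(0, min(51, round(30 + 10 * theta)))


def strong(b):
    """Simulated test-taker who answers correctly up to difficulty 1.5."""
    return b <= 1.5


def weak(b):
    """Simulated test-taker who answers correctly up to difficulty -1.0."""
    return b <= -1.0


pool = [i / 4.0 for i in range(-12, 13)]  # difficulties from -3 to 3
theta_strong, history_strong = run_cat(pool, strong, 10)
theta_weak, history_weak = run_cat(pool, weak, 10)
print(section_score(theta_strong), section_score(theta_weak))
```

Both simulated test-takers start with the same medium-difficulty question, then drift toward questions near their own level, which is why two people sitting the same test rarely see the same questions.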
What the GMAT does explicitly is what all tests try to do implicitly, namely, try to ascertain what you know and are able to do, in some context or another. It’s a more responsive way of testing, and we use the same adaptive technology in our GMAT practice tests.
In a later post, we’ll talk about validity, which has to do with what your score really means within a context, and why anyone would care.
Until then, do your homework!
It’s Wordy, It’s Awkward, It’s… Correct!
Written by Joanna Bersin, Knewton’s resident GMAT Sentence Correction expert.
Like a salesman trying to trick you into purchasing an expensive item by appealing to your emotions, the makers of the GMAT try to trick test-takers both into “buying” grammatically incorrect answer choices by making them concise and into eliminating grammatically correct answer choices by making them appear awkward and unwieldy.
How do we typically avoid splurging on unnecessary purchases? We train ourselves to shop wisely, basing our decisions on a range of criteria and not solely on what “seems” to be the most attractive option in the store. We focus on specific features, using logic to compare items. How can you choose the correct answer on test day? Don’t just listen to your ear; first make sure that each sentence you eliminate violates a concrete rule of English grammar. When choosing among the remaining, seemingly error-free constructions, use the differences between the options to identify errors; all other things being equal, always pick the less wordy, less awkward, and more active answer choice.
But buyer, beware: The test-makers, like salesmen, want your ear to tell you what to do. Before going into “negotiations” with these tricksters, it’s best to learn some of their most common tricks. First, make sure to hold on to wordy and awkward but otherwise error-free constructions. The test-makers especially like to make choice A (the original sentence in the prompt) sound particularly awkward, even when it is the only error-free option. This encourages test-takers to eliminate it immediately, and then to waste time picking between the remaining options. “This is the ‘sentence correction’ section,” our minds tell us, “so this sentence, especially a wordy and awkward one, must need some correcting.” But not necessarily!
Next, do not waste time struggling with pronoun-antecedent errors in complex sentences. Because it is easy to spot a pronoun within a sentence, there is not much that the test-makers can do to create errors with an underlined pronoun. Therefore, do not let pronoun use distract you; check for a logical antecedent, make sure that the pronoun agrees with this antecedent in number, and move on. On the GMAT, a pronoun is even allowed to have two physically possible antecedents within a sentence as long as only one of these antecedents is logical.
On questions dealing with parallelism, items that are linked must be the same part of speech. Options that follow this rule are sufficiently parallel. Once you are choosing between sufficiently parallel options, look for other errors. On tough questions especially, the GMAT-makers will often make the most parallel-looking option incorrect for some other reason, luring you into choosing it over a sufficiently parallel option without other errors.
“For the play, the creation of a humorous script and the care of the cast being chosen are important.”
“For the play, the creation of a humorous script and the care with which the cast is chosen are important.”
… are both parallel. The first sentence uses “of” after “care” and looks even more parallel than the second sentence. However, the less parallel-looking option is grammatically correct and logical, whereas the more parallel-looking option is awkward and unidiomatic. Don’t be fooled: appearances aren’t everything.
Finally, when down to those final two options, plug each back into the original sentence and check for sentence logic. An underlined portion itself may read error-free, but, when read in the context of the entire sentence, may be illogical. Which option clearly places all modifiers, especially adjectival ones, as closely as possible to the words they modify? Which choice connects clauses logically?
The salesmen use the same tricks over and over again. Learn the gimmicks and buy only what you came for.
GMAT test day, minute by minute
In reality, test day is not that different from any other day of preparation—test-takers must be attentive, focused, and fully prepared to bring their A-game. But for many test-takers, the term “test day” brings a variety of symptoms: cold sweats, night terrors, shakes, and so on. Knowing the nitty-gritty of what to expect when you get to the testing center can help relieve some of that unnecessary anxiety. Here’s Knewton’s minute-to-minute breakdown of a typical testing experience.
1. Arrive early, but don’t plan on studying at the testing center. 30 minutes before liftoff.
Show up to the test center 30 minutes before the official time, as the GMAC suggests. Although this may mean waking up even earlier than expected, avoiding any feeling of being rushed is priceless. However, many testing centers don’t allow studying in the waiting room, so don’t plan on getting there early and reviewing notes. Use the time before the test to relax and focus on the task at hand.
2. Locker Room. 10 minutes before liftoff.
After presenting your identification and test reservation, you may be given a key to a locker, into which you must put everything on your person other than your identification itself. This includes pens, paper, books, cell phones, house keys, lucky rabbit’s feet… everything. All you are allowed to bring in is your identification and the locker key itself. Think of this as a cleansing ritual, or a locker room warm-up. Although some centers may be laxer than others, under no circumstances should you expect to carry anything into the testing room.
3. Entering the Testing Room. 2 minutes before liftoff.
The testing room will be a room filled with computers. It will be shut off from the rest of the testing center and under constant video monitoring. On entering this room, you may feel like the subject of some strange scientific experiment, but fear not. No shocks will be administered, and you will be far too wrapped up in your computer screen to notice the cameras or the half-lidded gaze of the proctors. Also note that not only will you start the test on a different schedule than other test-takers, but the others in the room may well be taking different tests altogether. Whispering or passing notes is neither an option nor a temptation; this is not high school.
4. Tools of the Trade. Seconds before liftoff.
You will be provided with several tools with which to conquer the GMAT. The scratch pad looks and feels like a laminated legal pad; it is lined, yellow, and shiny, and you will be provided with a thin black dry-erase pen with which to write on it. These both work well, and you are allowed at any time to raise your hand to get the proctor’s attention if you need replacement pads or pens. You may also be provided with noise-canceling headphones (like those used by jackhammer-using construction workers). These work like a charm, even though the noise you’ll be canceling is the clickity-clacking keyboards of a dozen other test-takers.
5. Liftoff. The argument essay (30 min).
After signing in (perhaps with the proctor’s input), you’re off! You begin with the argument essay and are given a 30:00 ticking digital clock in the corner of the screen by which to measure your progress. Depending on your comfort with this time period, you may want to outline your essay on the pad before writing, especially noting which examples you expect to use and in what order.
6. Getting Personal. 30-60 minutes in. Issue Essay.
Same deal; you know the drill.
7. Eight is Enough. 60-68 minutes in. Break 1 (8 minutes).
You have the option to take an 8-minute break at this point. Keep in mind that the break starts the second you click “yes,” meaning that once you raise your hand to get the proctor, sign out by using your ID, and leave the room, you have less time than you might think to get back. This is enough time for a bathroom break or a breather, but no more. Up to this point, you have been at the test center for an hour and a half, and not yet seen one verbal or math question. So the first third of test day is all warming up and doing the essays; try to time your caffeine intake accordingly.
8. Test Day Begins. 68-143 minutes in. Math (75 minutes).
Test day begins in earnest. The quant section will come first, and you’ll have 75 minutes to complete it. Since the math section is considered far more difficult to finish in this time period than is the verbal for most test-takers, plan accordingly (and use timed practice to understand your own timing). The math section will have you using that scratch pad in earnest, and you may want to use it to virtually “eliminate” choices on the verbal section by writing out A, B, C, D and E and crossing out choices as you go. The number of each question (and how many are left) is provided at all times, as is the time.
9. Eight is Enough, Part 2. 143-151 minutes in. Break 2 (8 minutes).
Just like Break 1, except it’s likely that you will need this break even more. Take it to get a breather and prepare for the next section. Shift from math to verbal mentally, with the different timing considerations in your mind.
10. The Home Stretch! 151-226 minutes in. Verbal (75 minutes).
Stay alert! You’ve been at the test center for almost 4 hours at this point, but your concentration and focus are as necessary as ever. Watch those questions count down as you go…
11. Getting Down to Business. 226-234 minutes in. Score Reporting Info.
As your reward for finishing the test, you get to decide which schools get your (still unreported) score. Let visions of leafy campuses, whiteboards, and elbow-patched professors fill your mind as you enter the schools you’d like to receive your score reports.
12. Do or Die: Canceling Your Score. 234-236 minutes in.
Last step: you have two minutes (with a ticking clock) to decide whether to cancel your score or report it. What’s your final answer? If you decide to report the score, you will immediately be informed of your scores and percentiles on the math and verbal reports. Either way, after four hours, almost half of which did not involve any math or verbal questions, test day has become history. It wasn’t so bad, was it?