Note: these posts are copied over from the ‘mathbucket’ section of my old tumblr blog and I haven’t put much effort into this, so there is likely to be context or formatting missing.
The Cognitive Reflection Test came up in the SSC Superforecasters review. I’ve seen it a couple of times before, and it always interests me:
1. A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?

2. If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets?

3. In a lake, there is a patch of lily pads. Every day, the patch doubles in size. If it takes 48 days for the patch to cover the entire lake, how long would it take for the patch to cover half of the lake?
I always have the same reaction, and I don’t know if it’s common or I’m just the lone idiot with this problem. The ‘obvious wrong answers’ for 2. and 3. are completely unappealing to me (I had to look up 3. to check what the obvious answer was supposed to be). Obviously the machine-widget ratio hasn’t changed, and obviously exponential growth works like exponential growth.
When I see 1., however, I always think ‘oh it’s that bastard bat and ball question again, I know the correct answer but cannot see it’. And I have to stare at it for a minute or so to work it out, slowed down dramatically by the fact that Obvious Wrong Answer is jumping up and down trying to distract me.
I did a maths degree. I have a physics PhD. This is not a hard question. Why does this happen?
I know I have a very intuition-heavy style of learning and doing maths. For the second two I have very strong cached intuitions that they map to, whereas I’m really lacking that for the first one for some reason. I mean, I can visualise a line 110 units long, and move another 100-unit line along it until there’s equal space at each end, but it’s not some natural thought for me.
Now, apparently:
The CRT was designed to assess a specific cognitive ability. It assesses individuals’ ability to suppress an intuitive and spontaneous (“system 1”) wrong answer in favor of a reflective and deliberative (“system 2”) right answer.
Yeah so that definitely isn’t getting tested for me. My System 2 hates maths and has no intention of putting in any effort on this test, but luckily System 1 has internalised the ‘intuitive and spontaneous’ answer for two of the questions for me. I will fail the first question unless my equally strong ‘the answer can’t be that obvious’ intuition fires, but that one makes me seriously worried about my answer to 3. as well.
My inability to internalise the bat and ball thing might be a quirk of my brain, but I’m sceptical of this test in general. It’s extremely vulnerable to having the right cached ideas.
Random SSC lurker that saw your post.
Here are my thoughts on the bat and ball question:
The bat and ball question primes the test taker to try to split the total into two parts by listing two things, a bat and a ball. Then it gives you a thing that looks like the size of one part, triggering heuristics to use subtraction for the remaining part, without hinting that the result of subtraction needs to be divided in half (one half for the ball, one half for the bat).
Consider this alternative problem:
A centered piece of text and its margins are 110 columns wide. The text is 100 columns wide. How wide is one margin?
Same numbers, same mathematical formula to reach the solution. But less misleading because you know there are two margins, and thus know to divide by two after subtracting.
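A rough sketch of why the two versions share a formula: both come down to subtracting the known 100 from the total of 110 and halving what is left. The function name `smaller_part` and the choice to work in cents are illustrative assumptions, not part of the original test.

```python
def smaller_part(total, known):
    """Return (total - known) / 2.

    Bat and ball: bat + ball = total and bat - ball = known,
    so ball = (total - known) / 2.
    Margins: text + 2 * margin = total and text = known,
    so margin = (total - known) / 2.
    """
    return (total - known) / 2


# Bat and ball, in cents (an assumption made to avoid decimal prices).
ball = smaller_part(110, 100)
bat = ball + 100
print(ball, bat, ball + bat)  # 5.0 105.0 110.0

# Centered text: 110 columns in total, 100 of them text.
margin = smaller_part(110, 100)
print(margin)  # 5.0
```

The arithmetic is identical in both cases; only the margins wording makes the "divide by two" step visible.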
On the topic of thinking by heuristics, I write code and design algorithms based on heuristics, with careful deliberation saved only for the details that really need it. It works well because I’ve been doing it for a long time. Use of heuristics probably varies with familiarity with an area. Slow deliberation when an area looks unfamiliar, until heuristics that perform well enough are developed. Then the heuristics are used to the extent that they work, with deliberate thought filling in what heuristics can’t do.
Heuristics are useful, but it’s important not to start them with a bad path. The CRT questions attempt to set a bad starting path, but that only works on people who do not already recognize the situation and heuristically pick their own starting path.
So I agree that the CRT is vulnerable to having the right background.
Thanks for the comment – I really like your version with the paper margins! As well as making it clear that there are two margins, in your version the 100-unit object is something tangible. In the bat-and-ball version, the 100-unit object is a more abstract quantity, the difference in price between the bat and the ball. As you say, it’s much more tempting to slap those 100 units onto one of the existing tangible objects, the bat, leaving 10 units for the ball.
And of course the nice round numbers help make the wrong version even more appealing! There’s a surprising amount to unpack in such a simple problem. I still think that with this problem it’s harder to guard against the obvious wrong answer than the other two – still not quite clear on what heuristic I can pick up that will allow me to just intuit the correct result without thinking.
I agree with your comment on heuristics too. I’m sort of a lazy thinker who’s normally too keen to jump to heuristics, with the dangers of jumping to a wrong path that that implies. Absorbing new heuristics is the fun part though!
I have no degrees in anything, so this might not make much sense or use terminology in strange ways, or just be completely alien.
Maybe abstraction level is the key concept here. A thing that is derived from relative values of more tangible things is more abstract. The bat is tangible (AL 1), the ball is tangible (AL 1). Their sum is AL 2. If that sum was used as part of something, that something would be AL 3.
If an AL N value is composed of two AL N-1 values added together, subtracting one naturally yields the other. Similar operation reversal rules apply to the other mathematical operations.
The bat-and-ball problem gives you two AL 2 values: bat plus ball, and bat minus ball. Neither AL 1 value is directly revealed, so subtraction yields an AL 3 value that happens to be double the AL 1 value that solves the problem. The solution path is not steps that always reduce the abstraction level.
The margins version gives you one AL 1 value (text) and one AL 2 value (text plus margins (maybe this should be considered AL 3, since margins is AL 2)). Subtracting the text results in the AL 2 value of the margins added together. Neither margin is known, but they are known to be equal, so dividing by two gives the AL 1 value of one margin. Each step on the solution path reduces the abstraction level of the things being handled.
Does this analysis make sense for the other problems? Can their parts be described by abstraction levels, and how the solution paths change those abstraction levels?
5 machines (AL 1), 5 minutes (AL 1), 5 widgets (AL 2 (machines and time combined)). Widgets are tangible, so the natural reaction is to assign them AL 1, and when two of three AL 1 values in a problem are multiplied by 20, it’s natural to do the same to the remaining AL 1 value.
Time is not actually affected by either the number of machines or the number of widgets because it is an AL 1 value. A heuristic that recognizes that leads to the correct solution path.
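One way to check this, as a minimal sketch: assume every machine works independently at the same constant rate, which the puzzle implies but never states. The helper `minutes_needed` and the constant `RATE` are hypothetical names introduced for illustration.

```python
from fractions import Fraction

# Rate implied by the puzzle: 5 widgets from 5 machines in 5 minutes,
# i.e. 1/5 of a widget per machine per minute (assumes a constant rate).
RATE = Fraction(5, 5 * 5)


def minutes_needed(machines, widgets):
    """Minutes for `machines` machines to make `widgets` widgets."""
    return Fraction(widgets) / (machines * RATE)


print(minutes_needed(100, 100))  # 5
print(minutes_needed(5, 100))    # 100
```

Under this assumption the time depends only on the number of widgets per machine, so multiplying both counts by 20 leaves it at 5 minutes, while changing only one of them does change it.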
Problem 3, the lily pad problem, does not involve abstraction level confusion. 48 days counts as AL 1, “whole lake” is AL N (the exact value of N is irrelevant), “half the lake” is AL N-1, “time to cover half” is AL 1. This problem relies on operation confusion, using the wrong operation relating the AL N and AL N-1 value to reach the AL 1 solution value.
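The operation in question can be sketched as follows, assuming the patch doubles exactly once per day and covers the whole lake on day 48. The names `FULL_DAY` and `coverage` are illustrative, and `day` is assumed to be a whole number.

```python
from fractions import Fraction

FULL_DAY = 48  # day on which the patch covers the whole lake


def coverage(day):
    """Fraction of the lake covered on `day`, assuming exact daily doubling."""
    if day > FULL_DAY:
        raise ValueError("the lake is already fully covered")
    # Each day before FULL_DAY halves the coverage.
    return Fraction(1, 2 ** (FULL_DAY - day))


print(coverage(47))  # 1/2
print(coverage(24))  # 1/16777216
```

Halving the coverage means stepping back one day, not halving the number of days: on day 24 the patch would cover only a tiny sliver of the lake.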
After writing that out, abstraction levels seem to work for describing problem 1, but for 2 and 3 abstraction levels seem kind of contrived and unhelpful. So maybe it’s not a good way to describe the parts of a problem and find a solution.
Kyzentun (doesn’t look like I can reply directly to a comment that nested): interesting analysis. I’ll have to think about it more some time, but I think you’re right that abstraction is a key part of the problem for question 1 but doesn’t particularly help with the other two.
I’m increasingly convinced that these questions aren’t really a natural set of anything much, and I’m not sure how they were picked in the first place. Maybe I’ll have to, like, read the actual paper sometime. I probably have no more time to think about this thing for the next couple of weeks though!
Slightly off topic, your abstraction levels idea reminded me of Bret Victor’s Up and Down the Ladder of Abstraction essay, which is well worth reading if you haven’t already!