Self-similar procrastination

Sierpinski triangle
Your to do list should look like this. Or something. I’m still working out the finer details.

[image source]

I had a gigantic insight on my walk home tonight, which is clearly going to make me millions, but before I start on my self-help book empire I needed to write this rushed crappy blog post explaining it. And before that, I needed to do the dishes. Because my grand theory requires self-similar competence at all scales.

OK, so really this thing is not very profound or original at all. It’s pretty much the same as John Perry’s structured procrastination, but with a slightly different emphasis. (And probably this emphasis appears elsewhere too.)

If you somehow haven’t come across the structured procrastination essay before, it’s wonderful and you should read it. The key part:

Procrastinators often follow exactly the wrong tack. They try to minimize their commitments, assuming that if they have only a few things to do, they will quit procrastinating and get them done. But this approach ignores the basic nature of the procrastinator and destroys his most important source of motivation. The few tasks on his list will be, by definition, the most important. And the only way to avoid doing them will be to do nothing. This is the way to become a couch potato, not an effective human being.

I’m highly susceptible to this particular bad idea and end up doing nothing too often. Partly this is because I tend to have overambitious crackpot plans, so there’s always a good supply of ‘most important’ tasks to put at the top of the list. And partly it’s because I don’t seem to get bored as easily as most people and am unusually good at sitting around doing nothing very much, so it’s easy to slump into the couch potato ground state.

I do think sitting around doing nothing very much is highly underrated by a lot of people, but that would be a different post. In my case, I definitely need nudging towards actually getting shit done, instead of thinking idly about things I could do.

John Perry advocates avoiding this low-energy stuck state by filling up your to do list with ‘a hierarchy of the tasks you have to do, in order of importance from the most urgent to the least important’. The main mechanism he advances for why this works is that fear of the big intimidating tasks at the top drives you down the list, pushing you into actually completing many of the lower-ranked items.

I think this effect is somewhat important, but my thesis is that the most important mechanism is actually going the other way, from the bottom of the list up. Deadline fear is definitely useful, but I think that the energy and confidence created by completing the small tasks is the most important bit.

My intuition is that energy tends to be created at the microscale and bubbles up from there. I definitely use this principle to try and build momentum at work when I have a seriously boring task to do. I don’t want to do the actual task, but maybe I can be bothered to open up the password manager and get out the password I need for the server, and then maybe I can be bothered to open up a terminal and log on. Then maybe I’ll type in some trivial command to get myself used to the fact that I’m going to be typing in some commands. I don’t actually care what files are in that particular directory, but listing them has enough of the flavour of ‘doing work’ that it’s often enough to push me over the threshold into doing real work.

I think this is all pretty uncontroversial at the microscale. Any grumpy old fart who writes a weekly column for the Telegraph on how The Kids These Days Have No Discipline could tell you that sitting up straight and making your bed in the morning (or whatever) will propagate through to getting more done in general.

Where it gets interesting is at the mid-scale – projects you’re spending weeks to months on, but that aren’t all that important, at least in comparison with Big Intimidating Project at the top of the list. These are the ones I find myself wanting to cross off the list, because they’re ‘wasting time’.

But I’m coming to realise that they play an extremely important role in the task ecosystem. In some sense these are the largest-scale projects that you know you can actually pull off. Really big intimidating projects tend to have some sort of ‘research’ type element, where you don’t know what would even constitute a solution when starting out. Mid-sized projects, on the other hand, take a considerable amount of effort but are much more well defined. You more-or-less know how you’re going to tackle them, and what a successful outcome will look like. Successful mid-sized projects give you the confidence and energy to keep going, gradually allowing you to push further and further up the scale.

(I think it’s also important that at least some of these are self-contained projects in their own right, rather than subtasks of Big Intimidating Project. Lopping chunks off of Big Intimidating Project and tackling them separately is an excellent strategy, but my intuition is that this can’t be the only thing. Probably this is something to do with needing a supply of new ideas to keep bubbling up at all scales, but I haven’t thought about it very carefully.)

Here’s an example of an idea bubbling up. Last August I declared a Shitty Projects Month, as I could tell I needed some sort of break from more focussed work. It’s the kind of idea you can’t really fail at, and I did indeed do some shitty work on a couple of shitty projects, but I didn’t feel too pleased with how it went at the time. I suppose I was hoping that the results would be, well, less shitty.

Somehow, though, I found myself coming back to one of the projects a couple of months ago. I’d had an idea for a toy project I could try and do using the d3.js visualisation library – nothing useful, but it would look pretty if I got it right. I spent most of my time fighting my poor understanding of the library, and indeed of JavaScript in general, and didn’t get very far.

Eventually it came back into my head, though, and this time I had the bright idea of prototyping in Inkscape. Once I could see something visually I was a lot more excited about the project and made rapid progress. I haven’t finished yet because there’s other stuff I have to do this month, but it looks likely that it’s going to be a major component in the visual design of the proper website I’m finally going to make, which will be my next mid-range non-physics project. If I don’t run into any more weird distractions.

And of course, the best example of a mid-range project bubbling up from triviality is this blog itself. I got the shittiest possible blog, a basic tumblr with the default design, and started writing with no particular plan. It turned out that what I wanted to write about was mathematical intuition (and chalk!? no idea about that one) so I went with that. And then got it off tumblr and turned it into something approximating a proper blog.

I haven’t run out of ideas yet, so hopefully this one can keep bubbling up. That’s my excuse for writing essentially the same post over and over again. Self-similar blogging at all scales!

Everything And Its Discontents

I was trying to dig up some blog I read ages ago that I thought used a variation on ‘X and its Discontents’ in its tagline. I haven’t found it, but I have dug up a pretty extensive list of things people on the internet are discontented about (who knew?).

Main conclusions are that everyone hates -isms and Europe:

The New Turkey
The China Boom
The Crowded Public Sphere
Data Nationalism
Late Neoliberalism
Late Style
Art Direction
Nordic Art
Ceramics as a Medium
Method Acting
Scholarly Publishing
Political Rule
Multidimensional Poverty
Hearts and Minds
Social Media
The Enlightenment
European Integration
The European Court of Human Rights
The Dollar
Film Adaptation
The United Nations
The Social Imaginary
Modern Water
Grand Coalition Politics
Class Size
Urban Computing
Urban Governance
Urban Tourism
The 21st Century Urban Housing Crisis
The Bacteriological City
The Internet
Open Science
The Modern University
The Nobel Prize
The Singularity
Spooky Interaction
Communal Living
Mission Creep
Patent Alienability
The Standard Model of Talent Development
The Jobless World
Conspiracy Theory
Curatorial Education
Climate Based Daylight Modelling
Continental Realism
Preservative Realism
Realist Criminology
Situational Crime Prevention
The Achievement Society
Technological Progress
The Technological Body
Strategic Philanthropy
Transitional Justice
Cosmopolitan Justice
Judicial Engagement
International Law
Family Law
The Lawn-Chemical Economy
The World’s Course

Research debt, double distilled

I said a few weeks ago that I was going to talk more about this article by Chris Olah and Shan Carter on the idea of research debt, but every time I went back to it I felt that the original article made the point so clearly and elegantly that there was very little I wanted to add. (Other than ‘I want this thing in physics too please!’)

So I started to pull out my favourite bits, intending to do a very lazy quotes-and-comments sort of post, and realised that with a couple of additions they made a coherent summary of the original post on their own. In the process I discovered that there were a few things I wanted to say, after all.

If you just read straight down the blockquotes you get a double-distilled microversion of the original essay. Or you can also read the bits I’ve stuck in between.

For centuries, countless minds have climbed the mountain range of mathematics and laid new boulders at the top. Over time, different peaks formed, built on top of particularly beautiful results. Now the peaks of mathematics are so numerous and steep that no person can climb them all.

This has always saddened me. In Men of Mathematics, E. T. Bell labels Henri Poincaré as ‘The Last Universalist’, the last person to be able to range freely across all fields of mathematics as they existed in his time. Now Bell did have a tendency to over-dramatise things, but I think this is basically right.

Probably this is unavoidable; probably the expansion wave has accelerated too fast, and those days will not return. I’m temperamentally susceptible to millenarian dreams of the return of the once and future universalists, but I accept that this is unlikely.

Still, there is a lot of compression that is within reach:

The climb is seen as an intellectual pilgrimage, the labor a rite of passage. But the climb could be massively easier. It’s entirely possible to build paths and staircases into these mountains. The climb isn’t something to be proud of.

The climb isn’t progress: the climb is a mountain of debt.

The analogy is with technical debt in programming, which is all the awkward stuff thrown to the side in an effort to get software into production quickly. Eventually you have to go back and deal with the awkward stuff, which has an unfortunate tendency to compound over time.

The insidious thing about research debt is that it’s normal. Everyone takes it for granted, and doesn’t realize that things could be different. For example, it’s normal to give very mediocre explanations of research, and people perceive that to be the ceiling of explanation quality. On the rare occasions that truly excellent explanations come along, people see them as one-off miracles rather than a sign that we could systematically be doing better.

People who are truly excellent at explaining research are probably rare. But ‘better explanations than we have currently’ seems like a very, very easy target to hit, once people are persuaded to put resources into hitting it.

I plan to finally start taking my own advice soon, and start putting whatever notes and bits of intuition I’ve gathered online. I’m not too convinced that they’ll be especially great, but the current floor is pretty low.

Research distillation is the opposite of research debt. It can be incredibly satisfying, combining deep scientific understanding, empathy, and design to do justice to our research and lay bare beautiful insights.

Distillation is also hard. It’s tempting to think of explaining an idea as just putting a layer of polish on it, but good explanations often involve transforming the idea. This kind of refinement of an idea can take just as much effort and deep understanding as the initial discovery.

Distillation is fundamentally a different sort of activity to the types of research that are currently well supported by academia. Distillers aren’t mountain climbers; they engage with their subject by criss-crossing the same ground over and over again, following internally-generated trails of fascination that can be hard to interpret from the outside. They want to understand!

An aspiring research distiller lacks many things that are easy to take for granted: a career path, places to learn, examples and role models. Underlying this is a deeper issue: their work isn’t seen as a real research contribution. We need to fix this.

Distillers generally have little interest in who can get to the top of the mountain fastest, and anyway it certainly won’t be them. In an environment that rewards no other activity, they tend to disappear quickly. They require different infrastructure.

None of this infrastructure currently exists, but it easily could do. Research distillation doesn’t intrinsically need to cost huge amounts of money. It’s not like we need to spend billions on a gigantic high-energy collider to smash our current explanations together. This is an area where transitioning from moaning about academia to actually doing something about it looks to be pretty straightforward.

It’s one of the nice sorts of problems where small efforts at the margins are already useful. It certainly helps if you have Google’s resources behind you, but you can also just polish up any half-decent notes you have lying around on a topic that’s currently poorly explained and put them online, and you’ve made a tiny contribution towards fixing the problem.

If you are excited to distill ideas, seek clarity, and build beautiful explanations, we are letting you down. You have something precious to contribute and we aren’t supporting you the way we should.

I’ve been saying ‘they’ throughout this post, but, I mean, it’s obvious why I care about this thing. This is my old tumblr ‘about me’ page:


It’s amusing self-deprecation, but unfortunately I also meant it a lot of the time. (I still believe the programmer bit, but I’m starting to have some optimism about improvement there too.) My standard line after finishing my thesis was ‘I love physics but I’m bad at research’.

I had a poorly understood but strongly felt sense of what I wanted instead, academia was clearly not going to provide it, and I just wanted to get out. ‘Research distillation’, however, is a reasonably close fit. (Maybe not an exact one. I feel the ‘criss-crossing existing territory’ approach goes deeper than just refining existing ideas, and is a valid route to original research in itself. But it’s an ecosystem I think I would have been able to cope with, and succeed in.)

So I’ll admit my enthusiasm for the idea of research distillation is mostly pure self-interest. But I’m pretty sure that a thriving ecosystem of distillers would also help academia. After all, you only criss-cross the territory for love of the subject. The external rewards are currently too poor for any other motivation to make sense.

Reading into the landscape

Written quickly and probably not very clear – it’s a workbook post not a polished-final-thoughts post. Vaguely inspired by this exchange between Julia Galef and Michael Nielsen.

One of my favourite things is the point in learning a new topic where it starts to get internalised, and you begin to be able to see more. You can read into a situation where previously you had no idea what was going on.

Sometimes the ‘seeing’ is metaphorical, but sometimes it’s literal. I go walking quite a lot, and this year I’m seeing more than before, thanks to an improved ability to read into the landscape.

I got this from Light and Colour in the Outdoors, a classic 30s book on atmospheric phenomena by the physicist Marcel Minnaert. It’s really good, and I’m now regretting being cheap and getting the Dover version instead of the fancy coffee-table book (note to self: never buy a black-and-white edition of a book with the word ‘colour’ in the title).

I’ve only read a few sections, but already I notice more. Last weekend I got the coach to London, and on the way out I saw a sun dog I’d probably have missed before. And then on the way back it was raining with the sun shining onto the coach windscreen in front, and I thought to myself, ‘I should probably look behind me’. I turned, and right on cue:

[photo: 2017-05-20 20.23.49]

This is entry-level reading into the landscape, but still quite satisfying. Those with exceptional abilities seem to have superpowers. George Monbiot in Feral talks about his friend Ritchie Tassell:

… he has an engagement with the natural world so intense that at times it seems almost supernatural. Walking through a wood he will suddenly stop and whisper ‘sparrowhawk’. You look for the bird in vain. He tells you to wait. A couple of minutes later a sparrowhawk flies across the path. He had not seen the bird, nor had he heard it; but he had heard what the other birds were saying: they have different alarm calls for different kinds of threat.

This is the kind of learning that fascinates me! You can do it with maths as well as with sparrowhawks…

This has been on my mind recently as I read/reread Venkatesh Rao’s posts on ambiguity and uncertainty. I really need to do a lot more thinking on this, so this post might look stupid to me rather rapidly, but it’s already helping clarify my thoughts. Rao explains his use of the two terms here:

I like to use the term ambiguity for unclear ontology and uncertainty for unclear epistemology…

The ambiguity versus uncertainty distinction helps you define a simpler, though more restricted, test for whether something is a matter of ontology or epistemology. When you are missing information, that’s uncertainty, and an epistemological matter. When you are lacking an interpretation, that’s ambiguity, and an ontological matter.

Ambiguity is the one that maps to the reading-into-the-landscape sort of learning I’m most fascinated by, and reducing it is an act of fog-clearing:

20/ In decision-making we often use the metaphors of chess (perfect information) and poker (imperfect information) to compare decision-makers.

21/ The fog of intention breaks that metaphor because the game board/rules are inside people’s heads. Even if you see exactly what they see, you won’t see the game they see.

22/ Another way of thinking about this is that they’re making meaning out of what they see differently from you. The world is more legible to them; they can read/write more into it.

I think this is my main way of thinking about learning, and probably accounts for a fair amount of my confusion when interacting with the rationalist community. I’m obsessed with ambiguity-clearing, while the rationalists are strongly uncertainty-oriented.

For example, here’s Julia Galef on evaluating ‘crazy ideas’:

In my experience, rationalists are far more likely to look at that crazy idea and say: “Well, my inside view says that’s dumb. But my outside view says that brilliant ideas often look dumb at first, so the fact that it seems dumb isn’t great evidence about whether it will pan out. And when I think about the EV here [expected value] it seems clearly worth the cost of someone trying it, even if the probability of success is low.”

I’ve never thought like that in my life! I’d be hopeless at the rationalist strategy of finding a difficult, ambitious problem to work on and planning out high-risk steps for how to get there, but luckily there are other ways of navigating. I mostly follow my internal sense of what confusions I have that I might be able to attack, and try to clear a bit of ambiguity-fog at a time.

That sounds annoyingly vague and abstract. I plan to do a concrete maths-example post some time soon. In the meantime, have a picture of a sun dog:


Hackers and painters but not physicists?

There are two common routes that people go down after a physics PhD. The first is, of course, to stay on in academia and get a postdoc, and then hopefully another postdoc or two, and then hopefully a permanent job.

Many people fail one of these steps, and many others don’t fancy the whole process in the first place. So the second common option is to leave and look for a job that uses the same skills in some way. This could be in, for example, data science, algorithmically intensive areas of programming, quantitative finance, or industrial research and development. These are challenging jobs that require a lot of thinking, normally with long work hours attached. There isn’t much time and energy left to spare for learning physics, so mostly people don’t do that any more.

I wasn’t particularly good at my PhD, so playing the first game would have been a struggle. And there was enough that annoyed me about academia that I was pretty OK with leaving.

But I also didn’t like the look of the second game. I don’t want to do a challenging job that ‘uses my technical skills’ in some other field. I don’t care about my technical skills. They’re not even very good! I’m completely lacking in the kind of sharp, focussed intellect that excels at rapid problem solving in unfamiliar contexts. I will not pass your whiteboard interview.

I just want to think about physics. I have specific questions that are stuck in my head, and I want to work on those. I may not succeed in doing this very well or finding anything useful, but it’s not going to stop me thinking about them. Getting to use some decontextualised ‘problem solving skills’ elsewhere is not much of a consolation prize, because everything I care about is in the context itself. (This mindset actually makes most academic postdocs look quite unappealing too.)

So in my case it makes more sense to be relatively unambitious career-wise and try to free up time for learning physics. The main useful features of my current job are that it’s reasonably not horrible, has sane hours, and leaves me with mental energy to spare. Eventually I want to do better, and figure out how to cut down the hours I work.

This seems to be an unusual choice in physics. I can only think of two vaguely relevant archetypes: cranks of the classic ‘retired electrical engineer who’s just realised relativity is WRONG’ variety, and the occasional seriously impressive person who ends up in the news for making important contributions to number theory while working in Subway. I feel embarrassed admitting to people what I’m doing, because I worry that the subtext they’ll pick up is either ‘I’ve deluded myself into thinking I’m an unrecognised genius’, or ‘I’ve gone full crackpot and no longer care what anyone thinks, also did you know that Einstein was wrong?’ Neither of those exactly sounds good, so I tend to talk about my plans like it’s just a big joke.

The annoying thing is that what I’m trying to do is not at all rare in other fields. Everyone understands the concept of people in the arts having day jobs, for example. The musician who works in a coffee shop to pay the bills is a standard cultural stereotype.

I like several things about this stereotype. For a start, it’s nice to simply have it available, so you can explain what you are doing to others quickly without looking too odd. (Possibly someone will give you a bit of ‘get a real job!’ grief for it, but that’s also part of the script! You already know how to play along in that exchange, so you escape a lot of awkwardness.)

There’s also no requirement for you to have any particular level of ability, so you escape the genius/crackpot dichotomy. You can be a brilliant musician, but it’s also OK to be sort of mediocre, or even downright awful, but love it anyway.

And best of all, it’s expected that your interests will be specific and contextual, that you’ll be driven by a particular obsession. If someone advises you to ‘apply your technical skills’ to writing advertising jingles, it’s a pragmatic suggestion for how to pay rent, not an indication that it could be a comparably fulfilling way to spend your time.

Maybe it’s not so surprising that the idea of people working on physics in their spare time is not well represented in the wider culture. Most people aren’t especially interested in physics, and might not even realise that you can actually care this much. But I’m not just talking about cultural representation. I’m confused about why hardly anyone seems to be doing it at all.

One very obvious point is that the financial situation is so much better than in the arts. All those jobs in data science and industrial research pay very well. This is not something I want to complain about! But it means there are strong forces pulling people into these careers, and away from having time for physics. This probably makes it harder to notice the idea in the first place.

I don’t think this is the whole story, though. In this respect it’s interesting to compare us against a different population of STEM nerds. Programmers are also well paid. But, like musicians, they still manage to have a robust culture of day jobs and side projects. It might not be such a sitcom stereotype, but within the field it’s not seen as so unusual for someone to spend the day at their dull enterprise Java job, and then go home and contribute to an open source project they really care about.

In fact, once you’ve thought of it, the financial situation makes the ‘day job’ idea easier, not harder. The most challenging and well-paid jobs may not leave you much extra time for thinking, but there are other less impressive-sounding options that still pay pretty well. Writing line-of-business applications at Large Dull Company may not be the most exciting way to spend your day, but that means you’re likely to have brainpower to spare. And if you want to reduce your hours further or have more flexibility with your time, there are reasonable pathways in consulting and contracting.

As with artists and musicians, you don’t have to be brilliant to fit this pattern. You just have to be fascinated enough with a specific idea to want to work on it even if nobody is paying you. Writing a basic CRUD app in your spare time is fine if there’s something in there that really interests you.

Paul Graham famously made the same comparison in his Hackers and Painters essay:

The other problem with startups is that there is not much overlap between the kind of software that makes money and the kind that’s interesting to write…

All makers face this problem. Prices are determined by supply and demand, and there is just not as much demand for things that are fun to work on as there is for things that solve the mundane problems of individual customers. Acting in off-Broadway plays just doesn’t pay as well as wearing a gorilla suit in someone’s booth at a trade show. Writing novels doesn’t pay as well as writing ad copy for garbage disposals. And hacking programming languages doesn’t pay as well as figuring out how to connect some company’s legacy database to their Web server.

I think the answer to this problem, in the case of software, is a concept known to nearly all makers: the day job. This phrase began with musicians, who perform at night. More generally, it means that you have one kind of work you do for money, and another for love.

So why is this missing from physics? One good reason is that working independently doesn’t make much sense for a lot of people if they want to still produce original research. If you enjoy being an experimentalist in a big collaboration, you’re out of luck. You can’t build a small version of the LHC in your shed and hope to keep contributing to high energy physics. Some areas of theory would also not work well, such as big numerical simulations, or fast-moving subfields developing highly technical methods that would be hard for an outsider to keep up with.

I think that still leaves a reasonable number of options, though. It’s the same for programmers, surely: you won’t have access to Google-sized datasets in your one-person side project, either. And I’m not necessarily even talking about original research anyway. In fact, I’m explicitly in favour of a much wider conception of what constitutes a useful contribution to physics. I talked about some of the things I value briefly in this ‘niches’ comment on David Chapman’s Meaningness blog:

What I’d like to see is more niches, in the ecological sense: more acceptable ways to be ‘successful’ in academia so that everyone isn’t stuck trying to shove each other down the same boring hill in their quest for the summit. Off the top of my head I would like all these things to be valued as highly as ‘high-impact’ research: teaching, reproduction of experiments, new ways of visualising or conceptualising existing results, communication of existing concepts to people outside of your speciality, programming new tools to make research easier, digging around in the historical archives of the field for interesting lost insights…

Many of these are completely feasible to work on as an individual outside of academia. So I don’t think this is the whole problem.

It’s true that a lot of people are in physics mostly for the problem solving element. In that case, applying your skills elsewhere for more money might look extremely attractive, as you aren’t really losing anything by solving the same kinds of problems in a different field. This could actually be enough to explain the situation completely – maybe there just are very few weirdos like me out there who care about specific questions in physics rather than broadly applicable techniques, but who also don’t try to stick it out in academia. But in that case I don’t really understand why the situation in programming is so different.

I do have to wonder, though, whether one reason that people don’t do this in physics is simply that, well, people don’t do this in physics. Nobody sees anyone else doing it (apart from a few crackpots and geniuses who are easily discounted) so they don’t think to try it themselves. This would be the most optimistic explanation, because it looks so easily changeable! Maybe if more of us continued doing physics outside of academia it would become a thing. Maybe there are already a fair number of people trying this, but they’re currently keeping rather quiet about it.

Mostly I’m just confused, though. Suggestions gratefully received!

Two types of mathematician, yet again

I’ve recently been browsing through Season 1 of Venkatesh Rao’s Breaking Smart newsletter. I didn’t sign up for this originally because I assumed it was some kind of business thing I wouldn’t care about, but I should have realised it wouldn’t stray far from the central Ribbonfarm obsessions. In particular, there’s an emphasis on my favourite one: figuring out how to make progress in domains where the questions you are asking are still fuzzy and ambiguous.

‘Is there a there there? You’ll know when you find it’ is explicitly about this, and even better, it links to an interesting article that ties in to one of my central obsessions, the perennial ‘two types of mathematician’ question. It’s just a short Wired article without a lot of detail, but the authors have also written a pop science book it’s based on, The Eureka Factor. From the blurb it looks very poppy, but also extremely close to my interests, so I plan to read it. If I had any sense I’d do this before I started writing about it, but this braindump somehow just appeared anyway.

The book is not focussed on maths – it’s a general interest book about problem solving and creativity in any domain. But it looks like it has a very similar way of splitting problem solvers into two groups, ‘insightfuls’ and ‘analysts’. ‘Analysts’ follow a linear, methodical approach to work through a problem step by step. Importantly, they also have cognitive access to those steps – if they’re asked what they did to solve the problem, they can reconstruct the argument.

‘Insightfuls’ have no such access to the way they solved the problem. Instead, a solution just ‘pops into their heads’.

Of course, nobody is really a pure ‘insightful’ or ‘analyst’. And most significant problems demand a mixed strategy. But it does seem like many people have a tendency towards one or the other.

A nice toy problem for thinking about how this works in maths is the one Seymour Papert discusses in a fascinating epilogue to his Mindstorms book. I’ve written about this before but I’m likely to want to return to it a lot, so it’s probably worth writing out in a more coherent form than the tumblr post.

Papert considers two proofs of the irrationality of the square root of two, “which differ along a dimension one might call ‘gestalt versus atomistic’ or ‘aha-single-flash-insight versus step-by-step reasoning’.” Both start with the usual proof by contradiction: let \sqrt{2} = \frac{p}{q}, a fraction expressed in its lowest terms, and rearrange it to get

p^2 = 2 q^2 .

The standard proof I learnt as a first year maths student does the job. You notice that p must be even, so you write it as p=2r, sub it back in and notice that q is going to have to be even too. But you started with the fraction expressed in its lowest terms, so p and q shouldn’t share a factor of 2 and you have a contradiction. Done.
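
Written out line by line, the steps look something like this. It's only a sketch: the justifications on the right are a paraphrase, and the first one leans on the fact that the square of an odd number is odd.

```latex
\begin{align*}
p^2 &= 2q^2 && \text{so } p^2 \text{ is even, and hence } p \text{ is even} \\
p &= 2r && \text{for some integer } r \\
(2r)^2 &= 2q^2 && \text{substituting back in} \\
q^2 &= 2r^2 && \text{so } q \text{ is even as well}
\end{align*}
```

Both p and q end up even, which is exactly what 'lowest terms' ruled out.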

This is a classic ‘analytical’ step-by-step proof, and it’s short and neat enough that it’s actually reasonably satisfying. But I much prefer Papert’s ‘aha-single-flash-insight’ proof.

Think of p as a product of its prime factors, e.g. 6=2*3. Then p^2 will have an even number of each prime factor, e.g. 36=2*2*3*3.

But then our equation p^2 = 2 q^2 is saying that an even set of prime factors equals another even set multiplied by a 2 on its own, which makes no sense at all: the left side contains an even number of 2s and the right side an odd number.
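
If you want to poke at this numerically, here is a minimal Python sketch. It checks examples rather than proving anything, and the function names `prime_factors` and `twos_on_each_side` are made up for illustration. It counts the factors of 2 on each side of the equation for whatever p and q you feed it.

```python
from collections import Counter


def prime_factors(n: int) -> Counter:
    """Return the prime factorisation of a positive integer as {prime: exponent}."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:
        # Whatever is left over is itself a prime factor.
        factors[n] += 1
    return factors


def twos_on_each_side(p: int, q: int) -> tuple[int, int]:
    """Count the factors of 2 in p^2 and in 2*q^2."""
    left = prime_factors(p * p)[2]       # always an even count
    right = prime_factors(2 * q * q)[2]  # always an odd count
    return left, right


print(prime_factors(36))         # Counter({2: 2, 3: 2})
print(twos_on_each_side(6, 4))   # (2, 5)
```

Whatever positive p and q you try, the first count comes out even and the second odd, so the two sides can never be the same number.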

This proof still has some step-by-step analytical setup. You follow the same proof by contradiction method to start off with, and the idea of viewing p and q as products of prime factors still has to be preloaded into your head in a more-or-less logical way. But once you’ve done that, the core step is insight-based. You don’t need to think about why the original equation is wrong any more. You can just see it’s wrong by looking at it. In fact, I’m now surprised that it didn’t look wrong before!

For me, all of the fascination of maths is in this kind of insight step. And also most of the frustration… you can’t see into the black box properly, so what exactly is going on?

My real, selfish reason for being obsessed with this question is that my ability to do any form of explicit step-by-step reasoning in my head is rubbish. I would guess it’s probably bad compared to the average person; it’s definitely bad compared to most people who do maths.

This is a major problem in a few very narrow situations, such as trying to play a strategy game. I’m honestly not sure if I could remember how to draw at noughts and crosses, so trying to play anything with a higher level of sophistication is embarrassing.

Strategy games are pretty easy to avoid most of the time. (Though not as easy to avoid as I’d like, because most STEM people seem to love this crap 😦 ). But you’d think that this would be a serious issue in learning maths as well. It does slow me down a lot, sometimes, when trying to pick up a new idea. But it doesn’t seem to stop me making progress in the long run; somehow I’m managing to route round it. So what I’m trying to understand when I think about this question is how I’m doing this.

It’s hard to figure it out, but I think I use several skills. One is simply that I can follow the same chains of reasoning as everyone else, given enough time and a piece of paper. It’s not some sort of generalised ‘inability to think logically’, or then I suppose I really would be in the shit. Subjectively at least, it feels more like the bit of my brain that I have access to is extremely noisy and unfocussed, and has to be goaded through the steps in a very slow, explicit way.

Another skill I enjoy is building fluency, getting subtasks like bits of algebraic manipulation ‘under my fingers’ so I don’t have to think about them at all. This is the same as practising a musical instrument and I’m familiar with how to do it.

But the fun one is definitely insight. Whatever’s going on in Papert’s ‘aha-single-flash-insight’ is the whole reason why I do maths and physics, and I wish I understood it better. I also wish there were more resources for learning how to work with it, as I’m pretty sure it’s my main trick for working round my poor explicit reasoning skills.

My workflow for trying to understand a new concept is something like:

  1. search John Baez’s website in the hope that he’s written about it;
  2. google something like ‘[X] intuitively’ and pick out any fragments of insight I can find from blog posts, StackExchange answers and lecture notes;
  3. (back when I had easy access to an academic library) pull a load of vaguely relevant books off the shelf and skim them;
  4. resign myself to actually having to think for myself, and work through the simplest example I can find.

The aim is always to find something like Papert’s ‘set of prime factors’ insight, some key idea that makes the point of the concept pop out. For example, suppose I want to know about the Maurer-Cartan form in differential geometry, which has this fairly unilluminating definition on Wikipedia:


Then I’m done at step 1, because John Baez has this to say:

Let’s start with the Maurer-Cartan form. This is a gadget that shows up in the study of Lie groups. It works like this. Suppose you have a Lie group G with Lie algebra Lie(G). Suppose you have a tangent vector at any point of the group G. Then you can translate it to the identity element of G and get a tangent vector at the identity of G. But, this is nothing but an element of Lie(G)!

So, we have a god-given linear map from tangent vectors on G to the Lie algebra Lie(G). This is called a “Lie(G)-valued 1-form” on G, since an ordinary 1-form eats tangent vectors and spits out numbers, while this spits out elements of Lie(G). This particular god-given Lie(G)-valued 1-form on G is called the “Maurer-Cartan form”, and denoted ω.

This requires a lot more knowledge going in than the square root of two example, because I need to know what a Lie group and a Lie algebra and a 1-form are to get any use out of it. But if I’ve already struggled through getting the necessary insights for those things, I now have exactly the further insight I need: if you can translate your tangent vector back to the identity it’ll magically turn into a Lie algebra element, so then you’ve got yourself a map between the two sorts of things. And if I don’t know what a Lie group and a Lie algebra and a 1-form are, it’s pointless me trying to learn about the Maurer-Cartan form anyway.

Unfortunately, nobody has locked John Baez in a room and made him write about every topic in mathematics, so normally I have to go further down my algorithm, and that’s where things get difficult. There’s surprisingly poor support for an insight-based route through maths. If you want insights you have to dig for them, one piece at a time.

Presumably this is at least partially a hangover of the twentieth century’s obsession with formalism. Insights don’t look like proper logical maths with all the steps written out. You just sort of look at them, and the work’s mostly being done by a black box in your head. So this is definitely not a workflow I was taught by anyone during my maths degree; it’s one I improvised over time so that I could get through it anyway, when presented with definitions as opaque as the one from the Wikipedia article.

I’m confident that we can do better. And also that we will, as there seems to be an increasing interest in developing better conceptual explanations. I think Google’s Distill project and their idea of ‘research debt’ are especially promising. But that article’s interesting enough that it should really be a separate post sometime.

“pretentious theme statement”

I haven’t posted anything in a couple of weeks, not because I haven’t been writing but because I keep writing overambitious longer posts that get to a point where they seem something like 80% done and then die horribly. I’m hopeful that I can reanimate some of the dead posts but in the meantime it would be nice to keep a bit of momentum.

So I was looking at my folder of half-written draft crap (which starts with ‘academia_rant.txt’ and ‘asdfsdffsd.txt’ and doesn’t get any better) and found this thing I wrote for the tumblr blog and had half forgotten about, under the title ‘pretentious theme statement’. Maybe I decided it was too pretentious. But reading it back, I like it, and I think it’s accurate for at least part of what I want to do on this newer blog too:


If this blog has any sort of theme, beyond ‘let’s write the same boring post about mathematical intuition a thousand times’, it’s something like this:

Say you have some idea which can be written down in language in a more or less coherent and logical way. That’s the bit I’m mostly not interested in here. (Though these are really good! I definitely approve of coherent and logical thoughts. Sometimes I even manage to have one.)

Instead I find myself poking again and again at the cluster of stuff that’s packed around it that’s rather more difficult to get a hold on in language – the emotional tone the thought has, the mental images or bits of analogy that support it. Sort of like the ‘dressed’ thought rather than the ‘bare’ thought.

‘The role of intuition in maths’ is how I mostly approach it because it’s close to my own odd obsessions, it has a tiny fascinating literature that I’ve mostly read, and the divide seems particularly obvious there. It’s really common to have the experience of following a mathematical proof with several indisputably-correct steps and getting to the end completely convinced of the result, but still having that feeling of urghh BUT WHY is this true?? And it’s really common to then find a reframing that makes it obvious.

But a bunch of my other posts seem to be about this too – there’s the assorted crap under the ‘tastes in the head’ tag, and some throwaway stuff like my new sort-of-interest in geology.

I’m definitely not talking about this because I understand it. Finding ways to talk about all this extra stuff is hard, there’s no one source of literature on it, and it’s possible that it varies so widely from person to person it’s essentially not even worth trying. Certainly people vary widely in their preferred mathematical learning styles. But the topic has some kind of, well, hard-to-describe quality that makes me keep returning to it.

(It’s also well-suited to tumblr because all I really know how to do is produce this sort of confused fragment. I’m definitely not going to be producing a 5000 word chunk of confidently-stated insight porn off the back of any of this any time soon.)