Research debt, double distilled

I said a few weeks ago that I was going to talk more about this article by Chris Olah and Shan Carter on the idea of research debt, but every time I went back to it I felt that the original article made the point so clearly and elegantly that there was very little I wanted to add. (Other than ‘I want this thing in physics too please!’)

So I started to pull out my favourite bits, intending to do a very lazy quotes-and-comments sort of post, and realised that with a couple of additions they made a coherent summary of the original post on their own. In the process I discovered that there were a few things I wanted to say, after all.

If you just read straight down the blockquotes you get a double-distilled microversion of the original essay. Or you can also read the bits I’ve stuck in between.


For centuries, countless minds have climbed the mountain range of mathematics and laid new boulders at the top. Over time, different peaks formed, built on top of particularly beautiful results. Now the peaks of mathematics are so numerous and steep that no person can climb them all.

This has always saddened me. In Men of Mathematics, E. T. Bell labels Henri Poincaré ‘The Last Universalist’, the last person to be able to range freely across all fields of mathematics as they existed in his time. Now Bell did have a tendency to over-dramatise things, but I think this is basically right.

Probably this is unavoidable; probably the expansion wave has accelerated too fast, and those days will not return. I’m temperamentally susceptible to millenarian dreams of the return of the once and future universalists, but I accept that this is unlikely.

Still, there is a lot of compression that is within reach:

The climb is seen as an intellectual pilgrimage, the labor a rite of passage. But the climb could be massively easier. It’s entirely possible to build paths and staircases into these mountains. The climb isn’t something to be proud of.

The climb isn’t progress: the climb is a mountain of debt.

The analogy is with technical debt in programming, which is all the awkward stuff thrown to the side in an effort to get software into production quickly. Eventually you have to go back and deal with the awkward stuff, which has an unfortunate tendency to compound over time.

The insidious thing about research debt is that it’s normal. Everyone takes it for granted, and doesn’t realize that things could be different. For example, it’s normal to give very mediocre explanations of research, and people perceive that to be the ceiling of explanation quality. On the rare occasions that truly excellent explanations come along, people see them as one-off miracles rather than a sign that we could systematically be doing better.

People who are truly excellent at explaining research are probably rare. But ‘better explanations than we have currently’ seems like a very, very easy target to hit, once people are persuaded to put resources into hitting it.

I plan to finally start taking my own advice soon, and start putting whatever notes and bits of intuition I’ve gathered online. I’m not too convinced that they’ll be especially great, but the current floor is pretty low.

Research distillation is the opposite of research debt. It can be incredibly satisfying, combining deep scientific understanding, empathy, and design to do justice to our research and lay bare beautiful insights.

Distillation is also hard. It’s tempting to think of explaining an idea as just putting a layer of polish on it, but good explanations often involve transforming the idea. This kind of refinement of an idea can take just as much effort and deep understanding as the initial discovery.

Distillation is fundamentally a different sort of activity to the types of research that are currently well supported by academia. Distillers aren’t mountain climbers; they engage with their subject by criss-crossing the same ground over and over again, following internally-generated trails of fascination that can be hard to interpret from the outside. They want to understand!

An aspiring research distiller lacks many things that are easy to take for granted: a career path, places to learn, examples and role models. Underlying this is a deeper issue: their work isn’t seen as a real research contribution. We need to fix this.

Distillers generally have little interest in who can get to the top of the mountain fastest, and anyway it certainly won’t be them. In an environment that rewards no other activity, they tend to disappear quickly. They require different infrastructure.

None of this infrastructure currently exists, but it easily could do. Research distillation doesn’t intrinsically need to cost huge amounts of money. It’s not like we need to spend billions on a gigantic high-energy collider to smash our current explanations together. This is an area where transitioning from moaning about academia to actually doing something about it looks to be pretty straightforward.

It’s one of the nice sorts of problems where small efforts at the margins are already useful. It certainly helps if you have Google’s resources behind you, but you can also just polish up any half-decent notes you have lying around on a topic that’s currently poorly explained and put them online, and you’ve made a tiny contribution towards fixing the problem.

If you are excited to distill ideas, seek clarity, and build beautiful explanations, we are letting you down. You have something precious to contribute and we aren’t supporting you the way we should.


I’ve been saying ‘they’ throughout this post, but, I mean, it’s obvious why I care about this thing. This is my old tumblr ‘about me’ page:

[Image: my old tumblr ‘about me’ page]

It’s amusing self-deprecation, but unfortunately I also meant it a lot of the time. (I still believe the programmer bit, but I’m starting to have some optimism about improvement there too.) My standard line after finishing my thesis was ‘I love physics but I’m bad at research’.

I had a poorly understood but strongly felt sense of what I wanted instead, academia was clearly not going to provide it, and I just wanted to get out. ‘Research distillation’, however, is a reasonably close fit. (Maybe not an exact one. I feel the ‘criss-crossing existing territory’ approach goes deeper than just refining existing ideas, and is a valid route to original research in itself. But it’s an ecosystem I think I would have been able to cope with, and succeed in.)

So I’ll admit my enthusiasm for the idea of research distillation is mostly pure self-interest. But I’m pretty sure that a thriving ecosystem of distillers would also help academia. After all, you only criss-cross the territory for love of the subject. The external rewards are currently too poor for any other motivation to make sense.

Reading into the landscape

Written quickly and probably not very clear – it’s a workbook post not a polished-final-thoughts post. Vaguely inspired by this exchange between Julia Galef and Michael Nielsen.

One of my favourite things is the point in learning a new topic where it starts to get internalised, and you begin to be able to see more. You can read into a situation where previously you had no idea what was going on.

Sometimes the ‘seeing’ is metaphorical, but sometimes it’s literal. I go walking quite a lot, and this year I’m seeing more than before, thanks to an improved ability to read into the landscape.

I got this from Light and Colour in the Outdoors, a classic 30s book on atmospheric phenomena by the physicist Marcel Minnaert. It’s really good, and I’m now regretting being cheap and getting the Dover version instead of the fancy coffee-table book (note to self: never buy a black-and-white edition of a book with the word ‘colour’ in the title).

I’ve only read a few sections, but already I notice more. Last weekend I got the coach to London, and on the way out I saw a sun dog I’d probably have missed before. And then on the way back it was raining with the sun shining onto the coach windscreen in front, and I thought to myself, ‘I should probably look behind me’. I turned, and right on cue:

[Photo: 2017-05-20 20.23.49]

This is entry-level reading into the landscape, but still quite satisfying. Those with exceptional abilities seem to have superpowers. George Monbiot in Feral talks about his friend Ritchie Tassell:

… he has an engagement with the natural world so intense that at times it seems almost supernatural. Walking through a wood he will suddenly stop and whisper ‘sparrowhawk’. You look for the bird in vain. He tells you to wait. A couple of minutes later a sparrowhawk flies across the path. He had not seen the bird, nor had he heard it; but he had heard what the other birds were saying: they have different alarm calls for different kinds of threat.

This is the kind of learning that fascinates me! You can do it with maths as well as with sparrowhawks…

This has been on my mind recently as I read/reread Venkatesh Rao’s posts on ambiguity and uncertainty. I really need to do a lot more thinking on this, so this post might look stupid to me rather rapidly, but it’s already helping clarify my thoughts. Rao explains his use of the two terms here:

I like to use the term ambiguity for unclear ontology and uncertainty for unclear epistemology…

The ambiguity versus uncertainty distinction helps you define a simpler, though more restricted, test for whether something is a matter of ontology or epistemology. When you are missing information, that’s uncertainty, and an epistemological matter. When you are lacking an interpretation, that’s ambiguity, and an ontological matter.

Ambiguity is the one that maps to the reading-into-the-landscape sort of learning I’m most fascinated by, and reducing it is an act of fog-clearing:

20/ In decision-making we often use the metaphors of chess (perfect information) and poker (imperfect information) to compare decision-makers.

21/ The fog of intention breaks that metaphor because the game board /rules are inside people’s heads. Even if you see exactly what they see, you won’t see the game they see.

22/ Another way of thinking about this is that they’re making meaning out of what they see differently from you. The world is more legible to them; they can read/write more into it.

I think this is my main way of thinking about learning, and probably accounts for a fair amount of my confusion when interacting with the rationalist community. I’m obsessed with ambiguity-clearing, while the rationalists are strongly uncertainty-oriented.

For example, here’s Julia Galef on evaluating ‘crazy ideas’:

In my experience, rationalists are far more likely to look at that crazy idea and say: “Well, my inside view says that’s dumb. But my outside view says that brilliant ideas often look dumb at first, so the fact that it seems dumb isn’t great evidence about whether it will pan out. And when I think about the EV here [expected value] it seems clearly worth the cost of someone trying it, even if the probability of success is low.”

I’ve never thought like that in my life! I’d be hopeless at the rationalist strategy of finding a difficult, ambitious problem to work on and planning out high-risk steps for how to get there, but luckily there are other ways of navigating. I mostly follow my internal sense of what confusions I have that I might be able to attack, and try to clear a bit of ambiguity-fog at a time.

That sounds annoyingly vague and abstract. I plan to do a concrete maths-example post some time soon. In the meantime, have a picture of a sun dog:

[Image: sun dog]

Hackers and painters but not physicists?

There are two common routes that people go down after a physics PhD. The first is, of course, to stay on in academia and get a postdoc, and then hopefully another postdoc or two, and then hopefully a permanent job.

Many people fail one of these steps, and many others don’t fancy the whole process in the first place. So the second common option is to leave and look for a job that uses the same skills in some way. This could be in, for example, data science, algorithmically intensive areas of programming, quantitative finance, or industrial research and development. These are challenging jobs that require a lot of thinking, normally with long work hours attached. There isn’t much time and energy left to spare for learning physics, so mostly people don’t do that any more.

I wasn’t particularly good at my PhD, so playing the first game would have been a struggle. And there was enough that annoyed me about academia that I was pretty OK with leaving.

But I also didn’t like the look of the second game. I don’t want to do a challenging job that ‘uses my technical skills’ in some other field. I don’t care about my technical skills. They’re not even very good! I’m completely lacking in the kind of sharp, focussed intellect that excels at rapid problem solving in unfamiliar contexts. I will not pass your whiteboard interview.

I just want to think about physics. I have specific questions that are stuck in my head, and I want to work on those. I may not succeed in doing this very well or finding anything useful, but it’s not going to stop me thinking about them. Getting to use some decontextualised ‘problem solving skills’ elsewhere is not much of a consolation prize, because everything I care about is in the context itself. (This mindset actually makes most academic postdocs look quite unappealing too.)

So in my case it makes more sense to be relatively unambitious career-wise and try to free up time for learning physics. The main useful features of my current job are that it’s reasonably not horrible, has sane hours, and leaves me with mental energy to spare. Eventually I want to do better, and figure out how to cut down the hours I work.


This seems to be an unusual choice in physics. I can only think of two vaguely relevant archetypes: cranks of the classic ‘retired electrical engineer who’s just realised relativity is WRONG’ variety, and the occasional seriously impressive person who ends up in the news for making important contributions to number theory while working in Subway. I feel embarrassed admitting to people what I’m doing, because I worry that the subtext they’ll pick up is either ‘I’ve deluded myself into thinking I’m an unrecognised genius’, or ‘I’ve gone full crackpot and no longer care what anyone thinks, also did you know that Einstein was wrong?’ Neither of those exactly sounds good, so I tend to talk about my plans like it’s just a big joke.

The annoying thing is that what I’m trying to do is not at all rare in other fields. Everyone understands the concept of people in the arts having day jobs, for example. The musician who works in a coffee shop to pay the bills is a standard cultural stereotype.

I like several things about this stereotype. For a start, it’s nice to simply have it available, so you can explain what you are doing to others quickly without looking too odd. (Possibly someone will give you a bit of ‘get a real job!’ grief for it, but that’s also part of the script! You already know how to play along in that exchange, so you escape a lot of awkwardness.)

There’s also no requirement for you to have any particular level of ability, so you escape the genius/crackpot dichotomy. You can be a brilliant musician, but it’s also OK to be sort of mediocre, or even downright awful, but love it anyway.

And best of all, it’s expected that your interests will be specific and contextual, that you’ll be driven by a particular obsession. If someone advises you to ‘apply your technical skills’ to writing advertising jingles, it’s a pragmatic suggestion for how to pay rent, not an indication that it could be a comparably fulfilling way to spend your time.


Maybe it’s not so surprising that the idea of people working on physics in their spare time is not well represented in the wider culture. Most people aren’t especially interested in physics, and might not even realise that you can actually care this much. But I’m not just talking about cultural representation. I’m confused about why hardly anyone seems to be doing it at all.

One very obvious point is that the financial situation is so much better than in the arts. All those jobs in data science and industrial research pay very well. This is not something I want to complain about! But it means there are strong forces pulling people into these careers, and away from having time for physics. This probably makes it harder to notice the idea in the first place.

I don’t think this is the whole story, though. In this respect it’s interesting to compare us against a different population of STEM nerds. Programmers are also well paid. But, like musicians, they still manage to have a robust culture of day jobs and side projects. It might not be such a sitcom stereotype, but within the field it’s not seen as so unusual for someone to spend the day at their dull enterprise Java job, and then go home and contribute to an open source project they really care about.

In fact, once you’ve thought of it, the financial situation makes the ‘day job’ idea easier, not harder. The most challenging and well-paid jobs may not leave you much extra time for thinking, but there are other less impressive-sounding options that still pay pretty well. Writing line-of-business applications at Large Dull Company may not be the most exciting way to spend your day, but that means you’re likely to have brainpower to spare. And if you want to reduce your hours further or have more flexibility with your time, there are reasonable pathways in consulting and contracting.

As with artists and musicians, you don’t have to be brilliant to fit this pattern. You just have to be fascinated enough with a specific idea to want to work on it even if nobody is paying you. Writing a basic CRUD app in your spare time is fine if there’s something in there that really interests you.

Paul Graham famously made the same comparison in his Hackers and Painters essay:

The other problem with startups is that there is not much overlap between the kind of software that makes money and the kind that’s interesting to write…

All makers face this problem. Prices are determined by supply and demand, and there is just not as much demand for things that are fun to work on as there is for things that solve the mundane problems of individual customers. Acting in off-Broadway plays just doesn’t pay as well as wearing a gorilla suit in someone’s booth at a trade show. Writing novels doesn’t pay as well as writing ad copy for garbage disposals. And hacking programming languages doesn’t pay as well as figuring out how to connect some company’s legacy database to their Web server.

I think the answer to this problem, in the case of software, is a concept known to nearly all makers: the day job. This phrase began with musicians, who perform at night. More generally, it means that you have one kind of work you do for money, and another for love.


So why is this missing from physics? One good reason is that working independently doesn’t make much sense for a lot of people if they want to still produce original research. If you enjoy being an experimentalist in a big collaboration, you’re out of luck. You can’t build a small version of the LHC in your shed and hope to keep contributing to high energy physics. Some areas of theory would also not work well, such as big numerical simulations, or fast-moving subfields developing highly technical methods that would be hard for an outsider to keep up with.

I think that still leaves a reasonable number of options, though. It’s the same for programmers, surely: you won’t have access to Google-sized datasets in your one-person side project, either. And I’m not necessarily even talking about original research anyway. In fact, I’m explicitly in favour of a much wider conception of what constitutes a useful contribution to physics. I talked about some of the things I value briefly in this ‘niches’ comment on David Chapman’s Meaningness blog:

What I’d like to see is more niches, in the ecological sense: more acceptable ways to be ‘successful’ in academia so that everyone isn’t stuck trying to shove each other down the same boring hill in their quest for the summit. Off the top of my head I would like all these things to be valued as highly as ‘high-impact’ research: teaching, reproduction of experiments, new ways of visualising or conceptualising existing results, communication of existing concepts to people outside of your speciality, programming new tools to make research easier, digging around in the historical archives of the field for interesting lost insights…

Many of these are completely feasible to work on as an individual outside of academia. So I don’t think this is the whole problem.

It’s true that a lot of people are in physics mostly for the problem solving element. In that case, applying your skills elsewhere for more money might look extremely attractive, as you aren’t really losing anything by solving the same kinds of problems in a different field. This could actually be enough to explain the situation completely – maybe there just are very few weirdos like me out there who care about specific questions in physics rather than broadly applicable techniques, but who also don’t try to stick it out in academia. But in that case I don’t really understand why the situation in programming is so different.

I do have to wonder, though, whether one reason that people don’t do this in physics is simply that, well, people don’t do this in physics. Nobody sees anyone else doing it (apart from a few crackpots and geniuses who are easily discounted) so they don’t think to try it themselves. This would be the most optimistic explanation, because it looks so easily changeable! Maybe if more of us continued doing physics outside of academia it would become a thing. Maybe there are already a fair number of people trying this, but they’re currently keeping rather quiet about it.

Mostly I’m just confused, though. Suggestions gratefully received!

Two types of mathematician, yet again

I’ve recently been browsing through Season 1 of Venkatesh Rao’s Breaking Smart newsletter. I didn’t sign up for this originally because I assumed it was some kind of business thing I wouldn’t care about, but I should have realised it wouldn’t stray far from the central Ribbonfarm obsessions. In particular, there’s an emphasis on my favourite one: figuring out how to make progress in domains where the questions you are asking are still fuzzy and ambiguous.

‘Is there a there there? You’ll know when you find it’ is explicitly about this, and even better, it links to an interesting article that ties in to one of my central obsessions, the perennial ‘two types of mathematician’ question. It’s just a short Wired article without a lot of detail, but the authors have also written a pop science book it’s based on, The Eureka Factor. From the blurb it looks very poppy, but also extremely close to my interests, so I plan to read it. If I had any sense I’d do this before I started writing about it, but this braindump somehow just appeared anyway.

The book is not focussed on maths – it’s a general interest book about problem solving and creativity in any domain. But it looks like it has a very similar way of splitting problem solvers into two groups, ‘insightfuls’ and ‘analysts’. ‘Analysts’ follow a linear, methodical approach to work through a problem step by step. Importantly, they also have cognitive access to those steps – if they’re asked what they did to solve the problem, they can reconstruct the argument.

‘Insightfuls’ have no such access to the way they solved the problem. Instead, a solution just ‘pops into their heads’.

Of course, nobody is really a pure ‘insightful’ or ‘analyst’. And most significant problems demand a mixed strategy. But it does seem like many people have a tendency towards one or the other.


A nice toy problem for thinking about how this works in maths is the one Seymour Papert discusses in a fascinating epilogue to his Mindstorms book. I’ve written about this before but I’m likely to want to return to it a lot, so it’s probably worth writing out in a more coherent form than the tumblr post.

Papert considers two proofs of the irrationality of the square root of two, “which differ along a dimension one might call ‘gestalt versus atomistic’ or ‘aha-single-flash-insight versus step-by-step reasoning’.” Both start with the usual proof by contradiction: let \sqrt{2} = \frac{p}{q}, a fraction expressed in its lowest terms, and rearrange it to get

p^2 = 2 q^2 .

The standard proof I learnt as a first year maths student does the job. You notice that p must be even, so you write it as p=2r, sub it back in and notice that q is going to have to be even too. But you started with the fraction expressed in its lowest terms, so the common factor of 2 shouldn’t be there and you have a contradiction. Done.
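Written out line by line, the substitution goes something like this. It’s only a sketch of the usual textbook argument, using the same p, q and r as above, and the align* environment assumes the amsmath package is loaded:

```latex
\begin{align*}
p^2 &= 2q^2 && \text{so $p^2$ is even, hence $p$ is even} \\
p &= 2r && \text{for some integer $r$} \\
(2r)^2 &= 2q^2 \\
4r^2 &= 2q^2 \\
q^2 &= 2r^2 && \text{so $q^2$ is even, hence $q$ is even}
\end{align*}
```

Both p and q end up divisible by 2, which is exactly what ‘lowest terms’ was supposed to rule out.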

This is a classic ‘analytical’ step-by-step proof, and it’s short and neat enough that it’s actually reasonably satisfying. But I much prefer Papert’s ‘aha-single-flash-insight’ proof.

Think of p as a product of its prime factors, e.g. 6=2*3. Then p^2 will have an even number of each prime factor, e.g. 36=2*2*3*3.

But then our equation p^2 = 2 q^2 is saying that an even set of prime factors equals another even set multiplied by a 2 on its own, which makes no sense at all.

This proof still has some step-by-step analytical setup. You follow the same proof by contradiction method to start off with, and the idea of viewing p and q as prime factors still has to be preloaded into your head in a more-or-less logical way. But once you’ve done that, the core step is insight-based. You don’t need to think about why the original equation is wrong any more. You can just see it’s wrong by looking at it. In fact, I’m now surprised that it didn’t look wrong before!
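If you’d like to poke at the parity idea numerically, here is a minimal Python sketch. The function names exponent_of_two and parities are my own inventions for illustration, and a spot check over a small range is obviously not a proof. It just counts how many 2s appear in p^2 and in 2 q^2:

```python
def exponent_of_two(n: int) -> int:
    """Return how many times 2 divides the positive integer n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def parities(p: int, q: int) -> tuple[int, int]:
    """Parity of the number of 2s in p^2 and in 2*q^2 (0 = even, 1 = odd)."""
    return exponent_of_two(p * p) % 2, exponent_of_two(2 * q * q) % 2


# Spot check: the left side always has an even number of 2s and the
# right side an odd number, so the two sides can never be equal.
for p in range(1, 50):
    for q in range(1, 50):
        assert parities(p, q) == (0, 1)
```

The only fact doing any work is that squaring doubles the count of every prime factor, so p^2 always has an even number of 2s, and the lone extra 2 on the other side tips its count to odd.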

For me, all of the fascination of maths is in this kind of insight step. And also most of the frustration… you can’t see into the black box properly, so what exactly is going on?


My real, selfish reason for being obsessed with this question is that my ability to do any form of explicit step-by-step reasoning in my head is rubbish. I would guess it’s probably bad compared to the average person; it’s definitely bad compared to most people who do maths.

This is a major problem in a few very narrow situations, such as trying to play a strategy game. I’m honestly not sure if I could remember how to draw at noughts and crosses, so trying to play anything with a higher level of sophistication is embarrassing.

Strategy games are pretty easy to avoid most of the time. (Though not as easy to avoid as I’d like, because most STEM people seem to love this crap 😦 ). But you’d think that this would be a serious issue in learning maths as well. It does slow me down a lot, sometimes, when trying to pick up a new idea. But it doesn’t seem to stop me making progress in the long run; somehow I’m managing to route round it. So what I’m trying to understand when I think about this question is how I’m doing this.

It’s hard to figure it out, but I think I use several skills. One is simply that I can follow the same chains of reasoning as everyone else, given enough time and a piece of paper. It’s not some sort of generalised ‘inability to think logically’, or then I suppose I really would be in the shit. Subjectively at least, it feels more like the bit of my brain that I have access to is extremely noisy and unfocussed, and has to be goaded through the steps in a very slow, explicit way.

Another skill I enjoy is building fluency, getting subtasks like bits of algebraic manipulation ‘under my fingers’ so I don’t have to think about them at all. This is the same as practising a musical instrument and I’m familiar with how to do it.

But the fun one is definitely insight. Whatever’s going on in Papert’s ‘aha-single-flash-insight’ is the whole reason why I do maths and physics, and I wish I understood it better. I also wish there were more resources for learning how to work with it, as I’m pretty sure it’s my main trick for working round my poor explicit reasoning skills.


My workflow for trying to understand a new concept is something like:

  1. search John Baez’s website in the hope that he’s written about it;
  2. google something like ‘[X] intuitively’ and pick out any fragments of insight I can find from blog posts, StackExchange answers and lecture notes;
  3. (back when I had easy access to an academic library) pull a load of vaguely relevant books off the shelf and skim them;
  4. resign myself to actually having to think for myself, and work through the simplest example I can find.

The aim is always to find something like Papert’s ‘set of prime factors’ insight, some key idea that makes the point of the concept pop out. For example, suppose I want to know about the Maurer-Cartan form in differential geometry, which has this fairly unilluminating definition on Wikipedia:

[Image: the definition of the Maurer-Cartan form from Wikipedia]

Then I’m done at step 1, because John Baez has this to say:

Let’s start with the Maurer-Cartan form. This is a gadget that shows up in the study of Lie groups. It works like this. Suppose you have a Lie group G with Lie algebra Lie(G). Suppose you have a tangent vector at any point of the group G. Then you can translate it to the identity element of G and get a tangent vector at the identity of G. But, this is nothing but an element of Lie(G)!

So, we have a god-given linear map from tangent vectors on G to the Lie algebra Lie(G). This is called a “Lie(G)-valued 1-form” on G, since an ordinary 1-form eats tangent vectors and spits out numbers, while this spits out elements of Lie(G). This particular god-given Lie(G)-valued 1-form on G is called the “Maurer-Cartan form”, and denoted ω.

This requires a lot more knowledge going in than the square root of two example, because I need to know what a Lie group and a Lie algebra and a 1-form are to get any use out of it. But if I’ve already struggled through getting the necessary insights for those things, I now have exactly the further insight I need: if you can translate your tangent vector back to the identity it’ll magically turn into a Lie algebra element, so then you’ve got yourself a map between the two sorts of things. And if I don’t know what a Lie group and a Lie algebra and a 1-form are, it’s pointless me trying to learn about the Maurer-Cartan form anyway.
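For what it’s worth, that insight can be squeezed into one line of notation. This is a sketch in the usual textbook conventions rather than anything taken from Baez’s text: I’m assuming ‘translate’ means left translation, and the symbol L_h for it is my own addition.

```latex
% L_h : G \to G is left translation, L_h(x) = hx  (notation assumed here)
\[
  \omega_g(v) = (dL_{g^{-1}})_g \, v \in T_e G = \mathrm{Lie}(G),
  \qquad v \in T_g G .
\]
% For a matrix group this reduces to the familiar expression
\[
  \omega = g^{-1} \, dg .
\]
```

The first line is just ‘take a tangent vector at g and drag it back to the identity’, written in symbols.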

Unfortunately, nobody has locked John Baez in a room and made him write about every topic in mathematics, so normally I have to go further down my algorithm, and that’s where things get difficult. There’s surprisingly poor support for an insight-based route through maths. If you want insights you have to dig for them, one piece at a time.

Presumably this is at least partially a hangover of the twentieth century’s obsession with formalism. Insights don’t look like proper logical maths with all the steps written out. You just sort of look at them, and the work’s mostly being done by a black box in your head. So this is definitely not a workflow I was taught by anyone during my maths degree; it’s one I improvised over time so that I could get through it anyway, when presented with definitions as opaque as the one from the Wikipedia article.

I’m confident that we can do better. And also that we will, as there seems to be an increasing interest in developing better conceptual explanations. I think Google’s Distill project and their idea of ‘research debt’ are especially promising. But that article’s interesting enough that it should really be a separate post sometime.

“pretentious theme statement”

I haven’t posted anything in a couple of weeks, not because I haven’t been writing but because I keep writing overambitious longer posts that get to a point where they seem something like 80% done and then die horribly. I’m hopeful that I can reanimate some of the dead posts but in the meantime it would be nice to keep a bit of momentum.

So I was looking at my folder of half-written draft crap (which starts with ‘academia_rant.txt’ and ‘asdfsdffsd.txt’ and doesn’t get any better) and found this thing I wrote for the tumblr blog and had half forgotten about, under the title ‘pretentious theme statement’. Maybe I decided it was too pretentious. But reading it back, I like it, and I think it’s accurate for at least part of what I want to do on this newer blog too:

 

If this blog has any sort of theme, beyond ‘let’s write the same boring post about mathematical intuition a thousand times’, it’s something like this:

Say you have some idea which can be written down in language in a more or less coherent and logical way. That’s the bit I’m mostly not interested in here. (Though these are really good! I definitely approve of coherent and logical thoughts. Sometimes I even manage to have one.)

Instead I find myself poking again and again at the cluster of stuff that’s packed around it that’s rather more difficult to get a hold on in language – the emotional tone the thought has, the mental images or bits of analogy that support it. Sort of like the ‘dressed’ thought rather than the ‘bare’ thought.

‘The role of intuition in maths’ is how I mostly approach it because it’s close to my own odd obsessions, it has a tiny fascinating literature that I’ve mostly read, and the divide seems particularly obvious there. It’s really common to have the experience of following a mathematical proof with several indisputably-correct steps and getting to the end completely convinced of the result, but still having that feeling of urghh BUT WHY is this true?? And it’s really common to then find a reframing that makes it obvious.

But a bunch of my other posts seem to be about this too – there’s the assorted crap under the ‘tastes in the head’ tag, and some throwaway stuff like my new sort-of-interest in geology.

I’m definitely not talking about this because I understand it. Finding ways to talk about all this extra stuff is hard, there’s no one source of literature on it, and it’s possible that it varies so widely from person to person it’s essentially not even worth trying. Certainly people vary widely in their preferred mathematical learning styles. But the topic has some kind of, well, hard-to-describe quality that makes me keep returning to it.

(It’s also well-suited to tumblr because all I really know how to do is produce this sort of confused fragment. I’m definitely not going to be producing a 5000 word chunk of confidently-stated insight porn off the back of any of this any time soon.)

One theory to the tune of another

My second favourite type of question in physics, after ‘what’s the simplest non-trivial example of this thing?’, is probably ‘how can I write these two things in the same formalism, so that the differences stand out more clearly?’

This may look like an odd choice, given that all I ever do here is grumble about how crap I am at picking up new formal techniques. But actually that’s part of why I like it!

Writing two theories in the same language is like putting two similar transparencies on top of each other, and holding them up to the light. Suddenly the genuine conceptual differences pop out visibly, freed from the distraction of all the tedious extraneous machinery that surrounds them.

Or at least that’s always the hope – it’s actually pretty hard work to do this.

There are two maps between classical and quantum physics that I’m interested in learning, and should probably have included in my crackpot grand plan. (I guess they can be shoved into the quantum foundations grab bag.)

One is the phase space reformulation of quantum mechanics. This is sort of a standard technique, but I still managed to avoid hearing about it until quite recently. Some subfields apparently use it a lot, but you’re unlikely to see it in any standard quantum course. It also has a weird lack of decent introductory texts. I met someone at the workshop I went to who uses it in their research and asked what I should read, and he just looked pained and said ‘My thesis, maybe? When I write it?’ So learning it may not be especially fun.

It looks really interesting though! You can dump all the operators and use something that looks very like a normal probability distribution, so the parallels with classical statistical mechanics are much more explicit. There are obviously differences – this distribution can be negative, for a start. (It’s known as a quasidistribution.) Ideally, I’d like to be able to hold them both up to the light and see exactly where all the differences are.
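
The negativity is easy to see in a toy case. Below is a minimal sketch, assuming a harmonic oscillator in units where hbar = m = omega = 1, using the well-known closed form for the Wigner function of the n-th energy eigenstate. The function names are made up for illustration and aren't from any library.

```python
import math


def laguerre(n, z):
    """Laguerre polynomial L_n(z), built up with the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 1.0 - z
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1 - z) * curr - k * prev) / (k + 1)
    return curr


def wigner_fock(n, x, p):
    """Wigner function of the n-th oscillator eigenstate (hbar = m = omega = 1)."""
    r2 = x * x + p * p
    return ((-1) ** n / math.pi) * math.exp(-r2) * laguerre(n, 2.0 * r2)


def total_probability(n, half_width=6.0, steps=240):
    """Midpoint-rule integral of the Wigner function over a square in phase space."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        for j in range(steps):
            p = -half_width + (j + 0.5) * h
            total += wigner_fock(n, x, p)
    return total * h * h


if __name__ == "__main__":
    # Ground state: positive everywhere, like an ordinary Gaussian distribution.
    print("W_0 at origin:", wigner_fock(0, 0.0, 0.0))
    # First excited state: negative at the origin, yet still integrates to one.
    print("W_1 at origin:", wigner_fock(1, 0.0, 0.0))
    print("integral of W_1:", total_probability(1))
```

The ground state looks like a perfectly ordinary Gaussian, while the first excited state dips below zero at the origin and still integrates to one. That's the sense in which it's a quasidistribution rather than a probability distribution.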

It’s less well known that you can also do classical mechanics on Hilbert space! It’s called Koopman–von Neumann theory. If you ever thought ‘what classical mechanics is really missing is a load of complex wavefunctions on phase space’, then this is the formalism for you.

In this case, I ought to be luckier with the notes, because Frank Wilczek wrote some a couple of years ago.

I’m not so clear on exactly what this thing is and what I’d get out of learning it, but the novelty value of a Born rule in classical mechanics is high enough that I can’t resist giving it a go. And I’d have a new pair of formalisms to hold up to the light.
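
For orientation, here is the usual textbook statement of the setup, as far as I understand it. This is a sketch in conventional notation (the symbols are assumed for illustration, not taken from the notes mentioned above), with the wavefunction living on phase space.

```latex
% Classical Liouville equation for a phase-space density \rho(q, p, t):
\frac{\partial \rho}{\partial t} = -\{\rho, H\}
  = -\frac{\partial H}{\partial p}\frac{\partial \rho}{\partial q}
    + \frac{\partial H}{\partial q}\frac{\partial \rho}{\partial p}

% Koopman-von Neumann: a complex wavefunction \psi(q, p, t) evolves under
% a Schrodinger-like equation generated by the Liouville operator \hat{L}:
i\,\frac{\partial \psi}{\partial t} = \hat{L}\,\psi, \qquad
\hat{L} = -i\,\frac{\partial H}{\partial p}\frac{\partial}{\partial q}
          + i\,\frac{\partial H}{\partial q}\frac{\partial}{\partial p}

% Born rule: the density is the squared modulus of the wavefunction,
\rho(q, p, t) = |\psi(q, p, t)|^2
```

Because the Liouville operator is first order in the derivatives, the product rule means that |psi|^2 obeys the same Liouville equation as psi itself, so the Born rule just reproduces ordinary classical statistical mechanics.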

crisis in english everything

[Image: 2017-04-16-09-22-38.jpg]

I grew up fascinated with this kind of thing, sort of improbably for a teenager in 2003 or so.

I’m going to write more about this; today I’m just taking bad photos while I’m at my parents’ house and have the materials to hand. But the point is that ‘systems of meaning all in flames’ isn’t just an abstract piece of history for me. I might not have understood the context very well, but the emotional tone got through anyway.

(The image is from the beginning of ‘Crisis in English Poetry’ by the excellently named Vivian De Sola Pinto, published 1951, which I picked out from a second-hand bookshop for 50p because I liked the doom-laden title. This stuff is easy to find once you’ve developed a taste for it.)

Some advice nobody asked for

Related to the previous post, here is some free PhD advice for all three people who occasionally read the blog, and it’s probably not relevant to any of them. I really did not excel in my PhD and then left academia, so this advice may not be worth having, but I felt like writing it down anyway.

I saw this really insightful answer on academia.stackexchange, in response to someone asking how they could attract more good applicants to a PhD programme in an ‘awesome’ but (implicitly) not super-top-level-wow-prestigious university. The core part:

So currently, you are getting two types of student: A) Those for whom you are accidentally special, e.g. they live in your city and don’t want to move, and B) those who dreamed to get into Harvard. A will contain the usual mix of brilliant and average students, while from B, Harvard picked all the chocolate chips from the cookie.

The solution is that you become top player by getting into a niche which has been overlooked. It may be completely new, or it may have 1-2 players which are in it accidentally, so you can beat them easily. Suddenly, you’ll start getting applications from C), the students who dream of being in that niche. Not only did you open yourself to a new set of students, but those who know early on what they know, and find out which university offers it, tend to be the best. This is a set of self-selected people who are motivated and effective.

I joined one of these sorts of research groups, and they are fantastic. They are obviously not as good for your future career as getting into Fancy Subfield at Imperial or Stanford. On the other hand, you won’t be expected to chew each other’s limbs off to get to the top of whichever bullshit status ladder is currently dominating the field. Also, nobody has gone there just to show off how amazingly brilliant they are, because if that was really important to them they’d have picked a trendier field/university/city. So you get a relaxed, collaborative atmosphere where people just really care about the subject and want to help each other learn.

Obviously the usefulness of this advice depends on the general intellectual health of your subject. If you reckon the existing status ladder in the field does line up nicely with actual useful progress, then it might be worth risking your limbs at the top groups. If instead you look at the ladder and think um, not so much, then you may as well go and have fun somewhere else.

my old tribe

There was one more thing I meant to port over from the old tumblr and forgot: a list of what I loved about my old research group. It was never under the ‘mathbucket’ tag, but goes a long way to explaining what I care about in maths and physics, what I missed horribly when I left and what I’m working towards finding again.

  • Everyone is interested in a wide variety of things – other areas of maths and physics, other academic subjects, various sports and arts and hobbies. Nobody expects you to just be narrowly focussed on learning about your specialism.

  • Getting better at these things is valued. A little bit of bragging is alright as long as you don’t get too obnoxious about it.

  • Helping other people get better at these things is valued. Being able to explain your work in plain language is valued. Writing a clear paper, giving an entertaining talk or writing for a nonspecialist audience are all considered worthwhile, as well as technical competence in your own research area.

  • (This one’s important) An almost total absence of that competitive one-upping thing where everyone spends their time proving how much smarter they are than everyone else, or looking down on other subdisciplines as less important/fundamental/difficult/rigorous than their own. This is all over the place in physics, I hate it, and I was very very lucky to avoid most of it.

  • Playfulness, silliness, gurning, stupid repetitive injokes, awful songs played over and over again, pointless fun distracting projects with absolutely no relevance to anyone’s research.

  • A kind of glorying in being stubbornly independent-minded and prepared to defend your own stupid opinion. ‘That is bollocks and I will tell you why.’ But always grinning as you say it, and sometimes you discover that it isn’t bollocks and admit you’ve changed your mind.

(Disclaimer I added a bit later: I’m not saying it was a perfect fit for me. It was more undisciplined and structureless and anarchic than I really knew how to deal with, and I was very lazy and unfocussed a lot of the time. This stored up problems for me in the long run, and finishing on time was a miserable ordeal. But there was a lot of good there.)

I’m a bricoleur scientist

I’ve just read a fascinating paper, ‘Epistemological Pluralism and the Revaluation of the Concrete’ by Sherry Turkle and Seymour Papert. I’m lucky that I only found the paper recently: I love Papert but I’m not sure I’d have been able to stomach it even two years ago. The very first paragraph manages to combine a couple of ideas I’m seriously allergic to:

The concerns that fuel the discussion of women and computers are best served by talking about more than women and more than computers. Women’s access to science and engineering has historically been blocked by prejudice and discrimination. Here we address sources of exclusion determined not by rules that keep women out, but by ways of thinking that make them reluctant to join in. Our central thesis is that equal access to even the most basic elements of computation requires an epistemological pluralism, accepting the validity of multiple ways of knowing and thinking.

So, first of all, this is a paper about Women In STEM, considered capitalised as an Important Social Issue. Being lumped in with my gender automatically puts me on edge, as I tend to assume that I’m not going to fit in very well.

Then we have the phrase ‘ways of knowing’, which I’ve sort of unfairly come to associate with the worst of pomo nonsense. Like that anthropology course my flatmate did, where literally any bullshit explanation of anything ever advanced by some isolated tribe had to be taken seriously as an ‘equally valid’ way of understanding the world.

Put these two together and this article threatens to be about, er, ‘women’s ways of knowing in STEM’, a phrase which is literally making me cringe as I type it out. A couple of years ago I would have stopped here, unable to cope with the kind of associations this gave me with the awful gender-essentialist woo stuff that some women inexplicably find inspiring and not horrific. Like, stuff of the form ‘women have special kinds of intuition, which are probably to do with being really in touch with the earth or something, and also lots of feelings are going to be involved’.

Anyway I’ve calmed down about this a bit recently, to the point where I could possibly even extract something worthwhile from a full-fat gender-essentialist-woo piece of writing. And of course this paper is not like that.

Even so, this paper pretty much is about ‘women’s ways of knowing in STEM’ (in broad statistical strokes, rather than an essentialist claim that This Is How All Women Feel). And, um, it actually fits me rather well? Some of it is off, but it also includes the best description of my particular learning style that I have ever come across anywhere.


The basic setup here is one of those ‘two types of mathematician’ divisions I love. Except here there are two types of programmer. There’s this standard (straw? I don’t think so, but it’s hard for me to tell) idea of a programmer:

For some people, what is exciting about computers is working within a rule-driven system that can be mastered in a top-down, divide-and-conquer way. Their structured “planner’s” approach, the approach being taught in the Harvard programming course, is validated by industry and the academy. It decrees that the “right way” to solve a programming problem is to dissect it into separate parts and design a set of modular solutions that will fit the parts into an intended whole. Some programmers work this way because their teachers or employers insist that they do. But for others, it is a preferred approach; to them, it seems natural to make a plan, divide the task, use modules and subprocedures.

Then there’s ‘a very different style’:

They are not drawn to structured programming; their work at the computer is marked by a desire to play with the elements of the program, to move them around almost as though they were material elements — the words in a sentence, the notes on a keyboard, the elements of a collage.

Turkle and Papert call this ‘bricolage’, a term they got from Levi-Strauss. I know nothing about Levi-Strauss so can’t really say what he meant by it. The Wikipedia article on bricolage describes it as ‘the construction or creation of a work from a diverse range of things that happen to be available, or a work created by such a process’, which seems close enough to the usage in the paper.

Bricoleur scientists, apparently, work in the following way:

The bricoleur scientist does not move abstractly and hierarchically from axiom to theorem to corollary. Bricoleurs construct theories by arranging and rearranging, by negotiating and renegotiating with a set of well-known materials.

To which, all I can say is:

!!!

This is the thing! This is a perfect description of the thing!


My favourite sort of problem is something that could probably be labelled ‘synthesis’, but at ground level looks like this: you have a bunch of concepts you don’t understand very well, but for some reason you’re convinced they can be combined. Sometimes this is a pointless exercise in making patterns out of noise, like staring at the Easyjet seat pattern for too long. Other times you have valid intellectual reasons for why they would fit together.

This is a bit vague, so here are some examples. There are some ideas in maths and physics that have this particular quality for me. They aren’t ones where I’m making much useful progress, and at least one is probably outright bad. They’re just examples of the kind of thing where once it’s in my head, it’s really in my head.

  • There’s a variant form of general relativity called teleparallel gravity. GR takes place in curved spacetime, and one way of thinking of this mathematically is that as you move from place to place, your frame of reference rotates in a manner described by an object called the connection. The GR connection has nonzero curvature, but there’s also some other geometrical property it could have called torsion, that’s set to zero in GR.

It turns out that you can also make a perfectly good connection with zero curvature (it’s ‘teleparallel’ – parallel lines stay parallel). Instead, it has nonzero torsion. And if you choose some coefficients right in some Lagrangian, you can reproduce GR in some sense. Buh? The formulation is pretty opaque, so what’s really going on?

  • Pedalling back a bit because we quite clearly need to, what are these curvature and torsion thingies? You can calculate quite well with limited understanding of what’s going on geometrically. GR people love to do this in a very opaque way with lots of shuffling little superscripts and subscripts around (it’s fast once you’ve learned it). In an intro course this is normally connected back to geometry at a specific ritual point, which involves shoving a vector round a loop and observing that it rotates a bit. This is not especially satisfying. It’s obviously possible to get a far better understanding, and people in the field manage this, but at least for me that’s involved extracting it painfully one piece at a time from many different sources.

  • A subquestion of this that wasted hours and hours of my time over several years (this is the ‘probably outright bad’ one): there’s curvature and torsion of a connection, but there’s also the simpler idea of curvature and torsion of a curve in 3D space. I’d convinced myself that there was some sort of analogy between them that had to do with taking a curve off a manifold and developing it in flat Euclidean space. In fact I even got it into my head that I’d read this in one of Cartan’s own books! But a lot about the idea didn’t fit so well.

I eventually couldn’t stand it any more and risked asking about it on Mathoverflow, where I feel massively underqualified. Robert Bryant answered me, which was pretty amazing. There is probably nobody better placed in the world to answer questions about Cartan – he’s apparently read the whole lot. He very politely explained that he thinks it’s a red herring, and that Cartan had a different picture in mind when he introduced the torsion of a connection. And I can’t find anything about my brilliant idea in the Cartan book I read.

So it looks like there’s probably nothing there, but I can’t quite say it’s fully out of my head yet. It’s the Easyjet seat pattern of maths questions.

  • A current one: what’s going on in QFT that makes it different to classical perturbation theory? Suddenly the diagrams have loops; why? OK, so some propagator’s nonzero at some point. What does that mean? Why can’t I get that out of a classical theory?
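
The ‘shove a vector round a loop’ ritual from the second bullet can at least be tried out numerically. Here is a rough sketch, assuming the unit sphere with its usual metric and a loop that runs once round a circle of latitude. Parallel transport is approximated by repeatedly projecting the vector onto the tangent plane at the next point and rescaling it, and the function names are invented for illustration. The standard result is that the vector comes back rotated by the solid angle the loop encloses, 2π(1 − cos θ) for colatitude θ.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def enclosed_solid_angle(colatitude):
    """Area of the polar cap bounded by the circle of latitude (unit sphere)."""
    return 2.0 * math.pi * (1.0 - math.cos(colatitude))


def holonomy_angle(colatitude, steps=20000):
    """Carry a tangent vector once round a circle of latitude on the unit
    sphere and return the angle (in radians, between -pi and pi) by which
    it has turned when it gets back to the start."""
    sin_c, cos_c = math.sin(colatitude), math.cos(colatitude)

    def point(k):
        phi = 2.0 * math.pi * k / steps
        return (sin_c * math.cos(phi), sin_c * math.sin(phi), cos_c)

    start = point(0)
    # Initial unit tangent vector at the start point, pointing due south.
    v0 = (cos_c, 0.0, -sin_c)
    v = v0
    for k in range(1, steps + 1):
        x = point(k)
        # Approximate parallel transport: drop the component along the
        # outward normal at the new point, then restore unit length.
        radial = dot(v, x)
        v = tuple(vi - radial * xi for vi, xi in zip(v, x))
        norm = math.sqrt(dot(v, v))
        v = tuple(vi / norm for vi in v)
    # Signed angle from v0 to the returned vector, measured anticlockwise
    # about the outward normal at the start point.
    e2 = cross(start, v0)
    return math.atan2(dot(v, e2), dot(v, v0))


if __name__ == "__main__":
    theta = math.acos(0.75)
    print("measured rotation:", holonomy_angle(theta))
    print("enclosed solid angle:", enclosed_solid_angle(theta))
```

On the equator, which is a geodesic, the vector comes back unchanged. On smaller circles it comes back rotated, which is the curvature showing up. None of this says anything about torsion, which is part of why the ritual is less satisfying than it looks.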

There are two main parts to all these questions. One, how do the things fit together? And in order to answer this: two, what are these things really? Where ‘really’ is poorly defined, but just being able to reproduce a formal calculation definitely won’t cut it.

And the process for working on them? It’s exactly as in the quote: you do it ‘by arranging and rearranging, by negotiating and renegotiating with a set of well-known materials’. ‘Well-known’, because you’ve spent hours thinking about specific concrete instantiations, in the process of trying to understand what they ‘really’ are. Particular connections, particular propagators. ‘Negotiating and renegotiating’, because they’re your friends by now and you want them to get on. Maybe one side of your explanation meshes poorly with another side. Maybe there’s a reframing that can combine them.

If this is bricolage, then sign me up.


Doing maths and physics in this style requires a certain stubbornness in the face of never getting taught that way. I lost confidence eventually, but I seem to have it back now. I’m convinced that it absolutely can work. It’s not some kind of second-prize way to flail around the curriculum, inferior to a more structured approach. It has its own distinctive methods and produces its own distinctive questions, which I think are often good questions.

It could work even better if it was supported better.

I’m a bricoleur scientist.