Triangles by Milestoned, CC BY 2.0

A long time ago, I had an incredibly formative learning experience – formative in part because it came from someone who didn’t want me to learn.

I was eight or so years old and at my friend L’s house. L was not just my friend, she was my only friend. Like many nerdy outcasts I grabbed onto my intelligence like a life preserver, convinced that it was the only thing keeping me afloat. I was, in a word, insufferable.

But L had an older brother who, unfortunately, had to suffer me. One day, in a fit of frustration, he grabbed a piece of paper and drew a triangle on it. He put a different number in two of the three corners, and an x in the other, and thrust it at me. “If you’re so smart,” he said, or something like it, “tell me what x is.”

I stared at it blankly for several minutes. I had no idea what x was or how to figure it out. Finally, feeling humiliated, I confessed that I had no clue.

L’s brother was triumphant. “The answer is obvious,” he said, solving for it quickly. “All triangles are one hundred and eighty degrees. We know these two angles, so you just have to subtract.”

“Oh,” I said, grasping the concept immediately. I took the pen and paper back. “So if you have two angles of fifty degrees, the last angle would be, um… eighty. Cool!”

L’s brother realized he had only fed my ego, and went to go play videogames, deflated. I continued to play with the triangles, and while I did I internalized a very important life lesson.

The vast majority of time, when you don’t understand something, it’s because you’re missing a crucial piece of information, not because you’re unintelligent. The role of a teacher or mentor is to figure out what piece of information you’re missing, and help you find it.

Sometimes that just means telling you the answer, because finding the answer will take too long and isn’t worth your time. This is incredibly reasonable. No one can recreate the bulk of human knowledge from first principles, as fun as that sounds to try.

Sometimes that means walking you through a proof, leading you to the right spot so you can grab the piece yourself. If L’s brother had wanted to do that, he could have drawn a set of triangles with all the angles filled in, and prodded me with questions like, “Each triangle has three numbers. What else do they have three of?”

And sometimes that means following behind you, watching you make your own way. Often, the best thing a teacher can do is show you how to teach yourself. “What do you know about triangles?” L’s brother might have asked. “What do you know about angles? How do you think you could use that information here? Let’s brainstorm all the things these numbers could represent. Which of these options do you want to explore first?”

A good teacher asks you how you want to learn. “Do you want to try and figure this out yourself?” they ask. “Why don’t you try thinking aloud?” A good teacher encourages you when you’re struggling. “This was just as hard for me when I first encountered it! I did it, and I know you can do it. I believe in you.”

Good teaching is hard. Like a lot of feminine-coded work, it’s undervalued, both easily and sorely missed. I’ve never been to a coding bootcamp, but I’m willing to bet they’re full of people with more programming experience than teaching experience. Most workplaces are too.

I too often see people who are struggling with a task or concept get told to figure it out for themselves. I see them trying and failing to build the world from first principles, I see them fumbling around without knowing what they’re looking for. And I think of L’s brother and his triangles.

It’s easy to solve the problem when you know the context. It’s knowing the context that’s hard. If you’re not helping your students find the context, you’re not teaching them. If you’re trying to learn, and there’s no one to help you find the context, here are some tips:

  • Write out the problem as best you understand it. Underline or highlight any words or concepts you’re unsure about. This works well for me because I learn by writing – you might prefer drawing a diagram or speaking out loud. Whatever medium you use, it should help you figure out where you’re getting lost. Hint: it’s the part where you end up writing, “And then a thing happens!” or drawing “??? :(”
  • Ask for help. You may not have a mentor who can walk you through the process, but you’ll always have friends and internet strangers who can answer a question or two. “I’m trying to learn more about nested code,” you might say into an IRC channel, or on Twitter, or shout out your front door. “But I’m not sure what to call it. Are there terms for code that uses itself?” Someone replies: “Oh, you mean recursion?” And suddenly you have a whole new term to search for.
  • Use the internet! When I’m learning, I iterate through dozens of web searches, finding tutorials and explanations that give me new concepts and terms to search for. I often alternate between web searches, asking friends, and writing out my current understanding of the context as I build towards my solution.

My most important tip? Be kind to yourself. Solving problems when you lack context is an incredibly hard thing to do. It was just as hard for me when I started doing it. But I did it, and I know you can do it too.

I believe in you.

A wild Saturday night

Not feeling well, so I’m spending Saturday night tucked in bed reading Gödel, Escher, Bach, which I picked up from the library last week. Let me just say that if it takes as long to read the full twenty chapters as it has the first three, there’s no way I’m finishing this before I have to return it. I generally like Hofstadter’s style but it’s quite brief in places, and I find myself having to try out paraphrases and interpretations to understand what he means. I am currently stuck on his formal system for primality at the end of chapter 3, and I’m sure I’m just misunderstanding some part of the notation. I googled around in the hope of locating another explanation, but haven’t found one.

I will do my best to paraphrase the problem, in case one of my readers can help me out, but also because I believe there should be a variety of explanations of a given topic available online.

Hofstadter begins by stating that x, y and z are strings of hyphens*, and identifying an “axiom schema” or “thing that you just have to take on faith is true”. That schema is xyDNDx. We can interpret this as string x plus string y does not divide string x, although it’s important to point out that the formal system doesn’t inherently care about what you can or cannot divide – that’s the meaning we’re assigning to the letters DND. Because we trust Hofstadter that this system represents division, primality, and all those good things (as Hofstadter would say, the system is isomorphic to primality) we can use discrepancies between the two to locate flaws in our own thinking. But we can’t actually reason from them – we can’t say that xyDNDx is true because a number greater than x (namely x+y) cannot divide x. That’s why it’s an axiom, a thing we have to take on faith within the formal system.

(* Probably worth saying that anytime you see a “-” below it is a hyphen, not a minus sign.)

(Added later: In the process of typing this out, I figured out where I was getting stuck. Hofstadter uses a confusing notation, where the values of x, y, and z change from axiom to rule and from rule to rule. Why doesn’t he just create new variables each time – x, y, z, a, b, c, etc.? Or, if we’re re-using variables, why do we need z? We never need more than two strings in an axiom or rule. Maybe there’s a reason for these choices, but I find them deeply confusing. I imagine he’ll follow this pattern throughout the book, so I’ll try to let it go (let it go! (sorry.)) Thank you, readers, for being my Rubber Duck.)

Anyway. There are a few rules that allow us to generate theorems from this axiom schema. Rule 1 states that xDNDy -> xDNDxy. If you try this out with real-number examples, you’ll see that it works isomorphically. For instance:

x, y    axiom xyDNDx    apply rule 1    apply again
3, 4    7DND3           7DND10          7DND17
2, 8    10DND2          10DND12         10DND22
1, 1    2DND1           2DND3           2DND5
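Stepping outside the formal system for a moment, the arithmetic fact behind rule 1 is easy to spot-check in a few lines of Python. This is my own sanity check, not anything from the book, and the function name dnd is just my shorthand:

```python
# A sanity check of rule 1, reading "aDNDb" as "a does not divide b":
# if x does not divide y, then x does not divide x + y either,
# because (x + y) mod x equals y mod x.
def dnd(a, b):
    """True when a does not divide b."""
    return b % a != 0

for x in range(2, 50):
    for y in range(1, 50):
        if dnd(x, y):             # premise: xDNDy
            assert dnd(x, x + y)  # conclusion: xDNDxy
print("rule 1 checks out for every pair tried")
```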

So now we have a formal way of representing the concept “does not divide”. But how can we represent “has no divisors”, which is what primeness/primality actually is? Rule 2 gives us the rule --DNDz -> zDF--. This is easy to interpret into plain English. If two does not divide a string, then that string has no divisors (other than one) up to and including two. Hofstadter calls this being ‘divisor-free’, represented as DF. Rule 3 extends this with (zDFx and x-DNDz) -> zDFx-. Wolfram Alpha paraphrases well: “if z is divisor free up to x and if x+1 does not divide z, then z is divisor free up to x+1.” For example, if we’ve already established that 7 is divisor free up to 5, and 6 doesn’t divide 7, then 7 is divisor free up to 6. Note that this rule, taken alone, allows us to say “35 is divisor free up to 3, and 4 doesn’t divide 35, so 35 is divisor free up to 4”. Which is totally true. Iterating just one more time, though, will have us checking if 5DND35 is a theorem. It’s not – you can’t generate it using rule 1, since 5 divides 35 – so 35 never makes it into our pile of primes.

We make said pile of primes with the following final rule: z-DFz -> Pz-.  Again, Wolfram Alpha has a good interpretation: “if z+1 is divisor free up to z  then z+1 is prime”.  For example, if 13 is divisor-free up to 12, then 13 is prime.

To generate primes using this ruleset, you’d take the following steps:

1) You start with a number that you want to test for primality. If you’re generating primes, you can start from 1. Let’s test 5. We start with rule 2, replacing z with 5. Is 2DND5 a theorem?

2) To answer this question, we attempt to generate it from the axiom schema.  We can do so by assigning x to 1 and y to 1.  Then our initial axiom is (1+1)DND1 or 2DND1.  Applying rule 1, we get the theorem 2DND(1+2) or 2DND3.  Applying again, we get 2DND(1+2+2) or 2DND5.  So it is a theorem. Since 2DND5, 5DF2 – that is, 5 is divisor-free up to 2.

3) We next see if we can apply rule number 3.  We’ve shown that zDFx – that 5 is divisor-free up to 2.  Is (2+1)DND5 a theorem?  Again, we attempt to generate from the axiom schema. We can do this by assigning x to 2 and y to 1.  This gives us an axiom 3DND2.  We can add to it by applying rule 1: 3DND(2+3) or 3DND5. Excellent! So we can apply rule 3, and say that 5 is divisor-free up to 3.

4) After each attempt at rule number 3, we check rule number 4 to see if the conditions are satisfied.  Rule 4 says that z-DFz -> Pz-. Remember, we’re not substituting our value of z in for z here, because variables don’t need to be consistent between rules, only within them.  Instead, we can ask: have we shown 5DF4? The answer is no, we’ve only gotten up to 5DF3.  So: once more back to rule 3.

5) We’ve shown that 5DF3. Is 4DND5 a theorem? Yes, because we can generate it from our axiom schema by assigning x to 1 and y to 3. This gives us an axiom of 4DND1. Applying rule 1, we get 4DND(1+4) or 4DND5. Yay! So 5DF3 and 4DND5, which means 5DF4 – 5 is divisor-free up to 4.

6) Back to rule number 4.  Have we shown that 5DF4? I believe we have. Applying rule number 4, we determine that P5 – that the number 5 is prime.
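Since I have Python handy, here’s a minimal sketch of the whole procedure above. It’s my own paraphrase, not Hofstadter’s: dnd(a, b) stands in for the theorem aDNDb, using ordinary arithmetic as a shortcut instead of generating strings from the axiom schema.

```python
# My translation of the four-rule procedure above into Python.
def dnd(a, b):
    """aDNDb: a does not divide b."""
    return b % a != 0

def is_prime(n):
    """Mirror rules 2 through 4: establish nDF2, extend it with
    rule 3, and declare n prime once we reach nDF(n-1)."""
    if n < 2:
        return False
    if n == 2:
        return True               # two needs special-casing here
    if not dnd(2, n):             # rule 2 needs 2DNDn to conclude nDF2
        return False
    x = 2                         # we now have nDFx with x = 2
    while x < n - 1:              # rule 4 fires once we have nDF(n-1)
        if not dnd(x + 1, n):     # rule 3 needs (x+1)DNDn to extend
            return False
        x += 1                    # nDFx becomes nDF(x+1)
    return True                   # rule 4: nDF(n-1) gives us Pn

print([n for n in range(2, 30) if is_prime(n)])
# -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

For 5, this runs exactly the steps above: 2DND5 gives 5DF2, then 3DND5 and 4DND5 extend it to 5DF4, and rule 4 declares P5.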

 Okay, I’m going to bed. :)


On several occasions recently, I’ve wanted to perform a statistical operation that does not exist in Python (or at least, not in the NumPy, Pandas, or Statsmodels libraries) but does in R.  So I’ve been trying to get more comfortable in R.

I’m certainly more familiar with R now, but I’m not sure I can ever be comfortable with a language that does something as novice-unfriendly as silent recycling. Basically, if you ask R to do something with vectors of different lengths, R will recycle the shorter of the two vectors until the lengths match. If the longer vector’s length is a multiple of the shorter vector’s, it will do so silently, without telling you.
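To make the complaint concrete, here’s a little sketch (in Python, since that’s what I’m comparing R to) that mimics R’s recycling rule. The function name and the simplifying assumption that the first vector is the longer one are mine:

```python
# A sketch of what R's recycling does: repeat the shorter vector until
# it covers the longer one, and warn only when the longer length is
# not an exact multiple of the shorter.
import itertools
import warnings

def r_style_divide(a, b):
    """Divide vector a by vector b, R-style. Assumes a is the longer one."""
    if len(a) % len(b) != 0:
        warnings.warn("longer object length is not a multiple of shorter object length")
    return [x / y for x, y in zip(a, itertools.cycle(b))]

# The dangerous case: lengths 4 and 2 recycle evenly, so no warning.
print(r_style_divide([10, 20, 30, 40], [2, 5]))  # -> [5.0, 4.0, 15.0, 8.0]
```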

I vented in this gist:

The lack of a warning for when recycling is done evenly seems to rely on the assumption that if that happens, it must be intentional. That seems unwarranted to me.

I wrote a quick (and possibly buggy) piece of code to see how often a number is divisible by another number:

[[See the gist for code.]]

Which gave me a result of 0.02632785643547798, which I interpret as, “For a random number between 1 and 1000, there is roughly a 2.6% chance that a randomly chosen smaller number will go into it evenly.”
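The gist has the real code, but a computation along these lines – my best guess at its shape, which may well differ from the gist – produces a number in this ballpark:

```python
# My guess at the shape of the elided computation (the real code may
# differ): for each n up to a limit, take the fraction of smaller
# numbers that divide n evenly, then average those fractions.
def divisibility_chance(limit):
    fractions = []
    for n in range(2, limit + 1):
        divisors = sum(1 for k in range(1, n) if n % k == 0)
        fractions.append(divisors / (n - 1))
    return sum(fractions) / len(fractions)

print(divisibility_chance(1000))
```

Note that small values of n dominate the average, since a number like 2 is divided evenly by every smaller number.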

I think this is a conservative estimate of how often someone might accidentally recycle silently, because I think people are more likely to be working with multiples when working with real data. If I want to divide my 100 treatment+control measure A by my 100 treatment+control measure B but accidentally divide by my 50 treatment-only measure B, I’m not going to notice that I’ve done anything wrong.

And of course, a vector of one – which seems like a very common accidental length – goes into everything evenly.

As an author Ben linked me to puts it, this is “fucking ghastly”.

I agree.  And I’m a pacifist when it comes to the language wars – I even added a line to the OpenHatch IRC Code of Conduct asking people to “keep any criticism constructive and specific”, mostly in response to people bashing PHP.  But this has me in a fighting mood.  I’m now convinced that some significant number of scientific studies using R have accidentally recycled when they didn’t mean to, with questionable results.

I’m thinking of proposing a “Recycling Day” where volunteers get together, maybe with bottles of wine, to go through as much published R code as they can and report the recycling errors.  Maybe if we found enough, we could get the defaults changed so R always warns when recycling, or so that you have to actively choose to recycle.  In the meantime, there’s this R package which prevents recycling.

A Brief History of Humankind, briefer

A few months back, a friend recommended watching the videos from a Coursera course, A Brief History of Humankind.  I recently finished, and I second the recommendation.  The lectures are engaging and encompassing without being overly shallow, and I like Yuval Harari’s approach of describing theories while continuously pointing out their contradictions and uncertain nature.

To make the course go even faster, I have two suggestions.  First, watch the videos on at least 1.5x speed. Dr. Harari speaks painfully slowly, and I was often up to 2.5x speed.  Second, I have compiled a list of the segments I got the most out of.  Watching only these on 2x speed, you should get through the course in just four or five hours.  As I said, I do recommend the whole thing, but here were the highlights for me:

  • Lesson 2: The Cognitive Revolution, Segment 1: An easy to understand overview of what the cognitive revolution was, when it happened, and what the consequences were.  If this piques your interest, watch the four subsequent sections of this lesson.
  • Lesson 4: The Human Flood, Segment 2: The impact of the cognitive revolution on earth ecosystems.
  • Lesson 5: History’s Biggest Fraud, Segment 1: The agricultural revolution and its downsides.  The three subsequent sections in this lesson are also quite good.
  • Lesson 6: Building Pyramids, Segment 3: The rise of large-scale societies, and the invention of writing and math to help us handle them. I’ll recommend any segment that covers Sumer.
  • Lesson 7: There is No Justice in History, Segment 2  & Segment 4: Talks about how hierarchies formed, focusing on race/caste (segment 2) and patriarchy (segment 4). The value of these segments probably depends on the extent to which you’ve previously thought and learned about race and gender.
  • Lesson 8: The Direction of History, Segment 3: The history of trust. Possibly my favorite segment of the whole course – I found myself thinking about the characterization of money as a form of quantified trust long after the segment was over.
  • Lesson 10: The Law of Religion, Segment 2: An overview of theistic religions, with a focus on how polytheism and dualism influenced monotheism.
  • Lesson 11: The Discovery of Ignorance, Segment 1: How science and imperialism grew with each other.  Can be summed up with this line: “The real aim of modern science is not truth, it is power.”
  • Lesson 13: The Capitalist Creed, Segment 2 & Segment 4: Segment 2 is as good an introduction to capitalism as I’ve found anywhere.  Segment 4 is short and very much worth watching.  It talks about unregulated capitalism, monopolies, and indifference, using the Atlantic slave trade as an example.
  • Lesson 14: The Industrial Revolution, Segment 3: This one is actually an anti-recommendation.  An introduction to consumerism, it has a shallow and frankly offensive take on obesity.  I found it a useful reminder that all histories are narratives and all narrators are fallible.  (Another reminder: a not too tactful reference to transgender experiences in Lecture 17 Segment 1.)
  • Lesson 15: A Permanent Revolution, Segment 1 & Segment 4: Segment 1 covers our changing approaches to time.  If you like this segment, I recommend reading A Geography of Time by Robert Levine.  Segment 4 talks about the surprising peacefulness of our time.

I will leave you with the last few sentences from the course:

So I hope that you leave this course with more questions than you had when you entered it, and that you leave this course with a desire, with a wish to study and to learn more about our history. In addition, I hope that you leave this course feeling a bit more uneasy than when you started it. Uneasy about the many questions to which we humans have no clear answer yet. Uneasy about the many problematic events that happened in the past, and uneasy about the direction history may be taking us in the future.

Money is hard, too, but you knew that

I received a number of comments regarding my recent posts (1,2) about how abstraction/quantification of trust leads to negative consequences such as competition, social judgment, and “gaming the system”. This does not surprise me, because my model for quantifying trust is money. And it is hard to find a technology more profoundly impactful and more deeply flawed than money.

This video segment from a Coursera class I’ve been taking does a good job of explaining the invention of money as a way of abstracting trust. From the transcript:

Why are people willing to work, for entire month, doing, many times, things they don’t really like, just in order to get the end of the month, a few colorful pieces of paper? People are willing to do such things when they trust the figments of the collective imagination. These, these cowry shells or these colorful pieces of paper. Trust is the real raw material from which all types of money in history have been minted.

When a wealthy farmer, say in ancient China, sold all his possessions in exchange for a sack of cowrie shells. And then travelled with them to another province, he trusted this ancient Chinese farmer, he trusted that when he reached his destination, other people, complete stranger, who he never met before would be willing to sell him rice, to sell him a house, to sell him fields in exchange for his cowry shells.

Money is accordingly a system of mutual trust. And not just any system of mutual trust. Money is the most universal and most efficient system of mutual trust ever devised by human beings.

Money is an abstraction that allows us to trust that we will get our physical needs met without having to do the work ourselves of building our own houses, growing our own food, making our own medicines. With money, we were able to expand our resource networks to vast numbers of strangers.  What I’m imagining is something similar, but in the realm of information.  With so many facts, claims, data points, anecdotes, and opinions constantly surrounding us, we end up making sense of things through something very primitive: gossip from our friends, and faith in people who look like us and talk like us.  I don’t think we’re wrong to do so.  But it’s not a very efficient system.

What if we could create a system, an abstraction, that allows us to trust that the knowledge and statements that are proposed to us are true, with a specific and transparent level of confidence?

Of course, we already have this in some respects. Most notably, we have a large and quite profitable academic and industrial system whose sole purpose is to create, collect and verify knowledge. But this system functions largely based on implicit rather than explicit trust. You trust the judgment of editors and peer reviewers of journals, of grant-makers and tenure-granters, and you trust these judgments because they belong to institutions with good reputations, or because their own work is frequently cited, or because you’ve read their work before and haven’t found any flaws. Some of these reasons are flimsier than others, but all are understandable, because verification is hard, trust is also hard, and there’s no other game going.

Which brings us around to the criticisms.  Let’s say we could make a much more efficient system, the equivalent of money but for knowledge.  Should we?

After all, money is the cause of so many problems in our society.  It causes corruption in our political systems, resentment and conflict in our personal lives, and suffering and death for many who do not have a lot of it.  Why on earth would we want to make anything more like money?

I don’t have a pat answer to that, but I do have two responses that I think are worth exploring.

First: for all the awful downsides to money, one can argue that we owe the last several thousand years of social and technical advancements to money.  Could we live the life we do now, with our cell phones and low infant mortality and space missions and chemotherapy treatments and takeout dinners and musicals, without money? This is a pretty epic counterfactual, so I’m not expecting immediate agreement, but I tend to believe that money has done more good than harm.

And second: money is very simple. It was invented five millennia ago, and that shows. Sure, Wall Street sharks and shysters like to create complex financial instruments, but they certainly aren’t doing so to benefit society. But we have computers now – most of us carry them around in our pockets, and sleep with them by our bedsides – and they keep way better track of value than cuneiform tablets. For some ideas about how we could improve money, I recommend reading Charles Eisenstein’s Sacred Economics, especially the chapters Currencies of the Commons and Negative-Interest Economics. Bitcoin, of course, is an attempt at improving money through technology as well. Unfortunately, money’s got a lot of baggage. A trust abstraction system for information, if devised, could be structured to mitigate harms from the beginning.

That sounds great in theory, but what might it look like in practice?  I’ll sketch out some ideas in future blog posts.