A wild Saturday night

Not feeling well, so I’m spending Saturday night tucked in bed reading Gödel, Escher, Bach, which I picked up from the library last week. Let me just say that if it takes as long to read the full twenty chapters as it has the first three, there’s no way I’m finishing this before I have to return it. I generally like Hofstadter’s style but it’s quite brief in places, and I find myself having to try out paraphrases and interpretations to understand what he means. I am currently stuck on his formal system for primality at the end of chapter 3, and I’m sure I’m just misunderstanding some part of the notation. I googled around in the hope of locating another explanation, but didn’t find much beyond the Wolfram interpretation I quote below.

I will do my best to paraphrase the problem, in case one of my readers can help me out, but also because I believe there should be a variety of explanations of a given topic available online.

Hofstadter begins by stating that x, y and z are strings of hyphens*, and identifying an “axiom schema” or “thing that you just have to take on faith is true”. That schema is xyDNDx. We can interpret this as string x plus string y does not divide string x, although it’s important to point out that the formal system doesn’t inherently care about what you can or cannot divide – that’s the meaning we’re assigning to the letters DND. Because we trust Hofstadter that this system represents division, primality, and all those good things (as Hofstadter would say, the system is isomorphic to primality), we can use discrepancies between the two to locate flaws in our own thinking. But we can’t actually reason from the arithmetic meanings – we can’t say that xyDNDx is true because a number greater than x (namely xy) cannot divide x.  That’s why it’s an axiom, a thing we have to take on faith within the formal system.

(* Probably worth saying that anytime you see a “-” below it is a hyphen, not a minus sign.)

(Added later: In the process of typing this out, I figured out where I was getting stuck.  Hofstadter uses a confusing notation, where the values of x, y, and z change from axiom to rule and from rule to rule. Why doesn’t he just create new variables each time – x, y, z, a, b, c, etc.? Or, if we’re re-using variables, why do we need z?  We never need more than two strings in an axiom or rule. Maybe there’s a reason for these choices, but I find them deeply confusing. I imagine he’ll follow this pattern throughout the book, so I’ll try to let it go (let it go! (sorry.))  Thank you, readers, for being my Rubber Duck.)

Anyway.  There are a few rules that allow us to generate theorems from this axiom schema.  Rule 1 states that xDNDy -> xDNDxy.  As with the axiom schema, we take this rule on faith within the formal system; regardless, if you try it out with real-number examples, you’ll see that it works isomorphically.  For instance:

x, y    axiom xyDNDx    after rule 1    after rule 1 again
3, 4    7DND3           7DND10          7DND17
2, 8    10DND2          10DND12         10DND22
1, 1    2DND1           2DND3           2DND5
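
If it helps, here’s how I’d sketch the axiom schema and rule 1 in Python, with strings of hyphens standing in for numbers just as in the book. The representation and the function names (axiom, rule1) are mine, not Hofstadter’s – this is just a toy to check my reading of the rules:

```python
def axiom(x, y):
    """Axiom schema: for any hyphen-strings x and y, 'xyDNDx' is an axiom."""
    return x + y + "DND" + x

def rule1(theorem):
    """Rule 1: from xDNDy, produce xDNDxy."""
    x, y = theorem.split("DND")
    return x + "DND" + x + y

t = axiom("---", "----")   # x = 3, y = 4: produces the axiom 7DND3
print(t)
t = rule1(t)
print(t)                   # 7DND10
print(rule1(t))            # 7DND17 -- matching the first row of the table above
```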

So now we have a formal way of representing the concept “does not divide”. But how can we represent “has no divisors other than one and itself”, which is what primeness/primality actually is?  Rule 2 gives us --DNDz -> zDF--.  This is easy to interpret into plain English: if two does not divide a string, then that string has no divisors (other than one) less than or equal to two. Hofstadter calls this being ‘divisor-free’, represented as DF.  Rule 3 extends this with (zDFx and x-DNDz) -> zDFx-. Wolfram Alpha paraphrases well: “if z is divisor free up to x and if x+1 does not divide z, then z is divisor free up to x+1.” For example, if we’ve already established that 7 is divisor free up to 5, and 6 doesn’t divide 7, then 7 is divisor free up to 6.  Note that this rule, taken alone, allows us to say “35 is divisor free up to 3, and 4 doesn’t divide 35, so 35 is divisor free up to 4”. Which is totally true. Iterating just one more time, though, will have us checking whether 5DND35 is a theorem.  It’s not – you can’t generate it using rule 1 – so the chain stops there, and 35 never makes it into our pile of primes.

We make said pile of primes with the following final rule: z-DFz -> Pz-.  Again, Wolfram Alpha has a good interpretation: “if z+1 is divisor free up to z, then z+1 is prime”.  For example, if 13 is divisor-free up to 12, then 13 is prime.
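
Continuing that sketch, rules 2 through 4 can be written as string rewrites too. Again, this is my own paraphrase in Python, not anything from the book, and the little demo at the bottom assumes the DND strings it feeds in have already been derived from the axiom schema and rule 1:

```python
def rule2(theorem):
    """Rule 2: from --DNDz, produce zDF--."""
    two, z = theorem.split("DND")
    assert two == "--"
    return z + "DF--"

def rule3(df_theorem, dnd_theorem):
    """Rule 3: from zDFx and x-DNDz together, produce zDFx-."""
    z, x = df_theorem.split("DF")
    x_plus_one, z_again = dnd_theorem.split("DND")
    assert x_plus_one == x + "-" and z_again == z
    return z + "DF" + x + "-"

def rule4(df_theorem):
    """Rule 4: from z-DFz, produce Pz-."""
    z_plus_one, z = df_theorem.split("DF")
    assert z_plus_one == z + "-"
    return "P" + z_plus_one

seven = "-" * 7
t = rule2("--DND" + seven)         # 2DND7  ->  7DF2
t = rule3(t, "---DND" + seven)     # with 3DND7  ->  7DF3
t = rule3(t, "----DND" + seven)    # with 4DND7  ->  7DF4
t = rule3(t, "-----DND" + seven)   # with 5DND7  ->  7DF5
t = rule3(t, "------DND" + seven)  # with 6DND7  ->  7DF6
print(rule4(t))                    # 'P' plus seven hyphens: 7 is prime
```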

To generate primes using this ruleset, you’d take the following steps (a rough Python sketch of the whole procedure follows the list):

1) You start with a number that you want to test for primality. If you’re generating primes, you can start from 1.  Let’s test 5.  We start with rule 2, replacing z with 5.  Is 2DND5 a theorem?

2) To answer this question, we attempt to generate it from the axiom schema.  We can do so by assigning x to 1 and y to 1.  Then our initial axiom is (1+1)DND1 or 2DND1.  Applying rule 1, we get the theorem 2DND(1+2) or 2DND3.  Applying again, we get 2DND(1+2+2) or 2DND5.  So it is a theorem. Since 2DND5, 5DF2 – that is, 5 is divisor-free up to 2.

3) We next see if we can apply rule number 3.  We’ve shown that zDFx – that 5 is divisor-free up to 2.  Is (2+1)DND5 a theorem?  Again, we attempt to generate from the axiom schema. We can do this by assigning x to 2 and y to 1.  This gives us an axiom 3DND2.  We can add to it by applying rule 1: 3DND(2+3) or 3DND5. Excellent! So we can apply rule 3, and say that 5 is divisor-free up to 3.

4) After each attempt at rule number 3, we check rule number 4 to see if the conditions are satisfied.  Rule 4 says that z-DFz -> Pz-. Remember, we’re not substituting our value of z in for z here, because variables don’t need to be consistent between rules, only within them.  Instead, we can ask: have we shown 5DF4? The answer is no, we’ve only gotten up to 5DF3.  So: once more back to rule 3.

5)  We’ve shown that 5DF3. Is 4DND5 a theorem? Yes, because we can generate it from our axiom schema by assigning x to 1 and y to 3.  This gives us an axiom of 4DND1.  Applying rule 1, we get 4DND(1+4) or 4DND5. Yay! So 5DF3 and 4DND5, which means 5DF4 – 5 is divisor-free up to 4.

6) Back to rule number 4.  Have we shown that 5DF4? I believe we have. Applying rule number 4, we determine that P5 – that the number 5 is prime.
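
To double-check that I had the procedure right, here’s a rough Python sketch of steps 1 through 6. It’s my own translation, not anything from the book: I’ve swapped hyphen-strings for plain integers to keep it readable, and derive_dnd and is_prime_by_rules are names I made up. It picks its starting axiom the same way the steps above do – x is the remainder left over when the candidate is divided by the would-be divisor:

```python
def derive_dnd(a, b):
    """Try to derive 'a DND b' (a does not divide b) from the axiom schema.

    Start from the axiom (x+y)DNDx with x = b mod a -- only possible when the
    remainder is at least 1, i.e. when a genuinely doesn't divide b -- then
    apply rule 1 (a DND k  ->  a DND k+a) until we reach b.  Returns the
    derivation as a list of strings, or None if there isn't one.
    """
    remainder = b % a
    if remainder == 0:
        return None                    # a divides b, so no derivation exists
    steps = [f"{a}DND{remainder}"]     # the axiom, with x = remainder, y = a - remainder
    k = remainder
    while k < b:
        k += a                         # rule 1
        steps.append(f"{a}DND{k}")
    return steps

def is_prime_by_rules(n):
    """Steps 1-6: establish n DF 2 with rule 2, extend it with rule 3 for
    each divisor candidate up to n-1, then let rule 4 declare Pn."""
    if n < 3:
        return n == 2                  # base case for this sketch; the walkthrough starts at 5
    for d in range(2, n):
        if derive_dnd(d, n) is None:   # dDNDn is not a theorem, so n DF d fails
            return False
    return True                        # n DF (n-1) holds, so rule 4 gives Pn

print(derive_dnd(2, 5))                # ['2DND1', '2DND3', '2DND5'], as in step 2
print([n for n in range(2, 30) if is_prime_by_rules(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```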

 Okay, I’m going to bed. :)

R

On several occasions recently, I’ve wanted to perform a statistical operation that does not exist in Python (or at least, not in the NumPy, Pandas, or Statsmodels libraries) but does in R.  So I’ve been trying to get more comfortable in R.

I’ve certainly become more familiar with R, but I’m not sure I can ever be comfortable with a language that does something as novice-unfriendly as silent recycling.  Basically, if you ask R to do something with vectors of different lengths, R will recycle the shorter of the two vectors until the lengths match. If the longer vector’s length is a multiple of the shorter vector’s length, it will do so silently, without telling you.
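
Since not everyone reading this will have R open, here’s a toy Python sketch of the rule as I understand it. (r_style_add is a made-up name, and real R is obviously not implemented this way – this just mimics the behavior.)

```python
import warnings
from itertools import cycle, islice

def r_style_add(a, b):
    """Add two 'vectors' the way R would, as far as I understand it.

    R repeats ('recycles') the shorter vector until it is as long as the
    longer one.  It only warns when the longer length is NOT an exact
    multiple of the shorter length; otherwise it recycles silently.
    """
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    if len(longer) % len(shorter) != 0:
        warnings.warn("longer object length is not a multiple of "
                      "shorter object length")
    recycled = list(islice(cycle(shorter), len(longer)))
    return [x + y for x, y in zip(longer, recycled)]

print(r_style_add([1, 2, 3, 4], [10, 20]))     # [11, 22, 13, 24] -- silent
print(r_style_add([1, 2, 3, 4, 5], [10, 20]))  # still recycles, but warns
```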

I vented in this gist:

The lack of a warning for when recycling is done evenly seems to rely on the assumption that if that happens, it must be intentional. That seems unwarranted to me.

I wrote a quick (and possibly buggy) piece of code to see how often a number is divisible by another number:

[[See the gist for code.]]

That gave me a result of 0.02632785643547798, which I interpret as, “For a random number between 1 and 1000, there is roughly a 2.6% chance that a smaller number will go into it evenly.”

I think this is a conservative estimate of how often someone might accidentally recycle silently, because I think people are more likely to be working with multiples when working with real data. If I want to divide my 100 treatment+control measure A by my 100 treatment+control measure B but accidentally divide by my 50 treatment-only measure B, I’m not going to notice that I’ve done anything wrong.

And of course, a vector of length one – which seems like a very common accidental length – goes into everything evenly.
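
For anyone curious what that calculation roughly looks like, here’s a fresh sketch of it in Python. This is not the code from the gist, so the exact figure will come out a bit different depending on how you set things up:

```python
def divisible_fraction(n):
    """What fraction of the numbers smaller than n divide n evenly?"""
    smaller_divisors = sum(1 for d in range(1, n) if n % d == 0)
    return smaller_divisors / (n - 1)

# Average that fraction over every number from 2 to 1000.
fractions = [divisible_fraction(n) for n in range(2, 1001)]
print(sum(fractions) / len(fractions))   # lands in the two-to-three percent range
```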

As an author Ben linked me to puts it, this is “fucking ghastly”.

I agree.  And I’m a pacifist when it comes to the language wars – I even added a line to the OpenHatch IRC Code of Conduct asking people to “keep any criticism constructive and specific”, mostly in response to people bashing PHP.  But this has me in a fighting mood.  I’m now convinced that some significant number of scientific studies using R have accidentally recycled when they didn’t mean to, with questionable results.

I’m thinking of proposing a “Recycling Day” where volunteers get together, maybe with bottles of wine, to go through as much published R code as they can and report the recycling errors.  Maybe if we found enough, we could get the defaults changed so R always warns when recycling, or so that you have to actively choose to recycle.  In the meantime, there’s this R package which prevents recycling.

A Brief History of Humankind, briefer

A few months back, a friend recommended watching the videos from a Coursera course, A Brief History of Humankind.  I recently finished, and I second the recommendation.  The lectures are engaging and encompassing without being overly shallow, and I like Yuval Harari’s approach of describing theories while continuously pointing out their contradictions and uncertain nature.

To make the course go even faster, I have two suggestions.  First, watch the videos on at least 1.5x speed. Dr. Harari speaks painfully slowly, and I was often up to 2.5x speed.  Second, I have compiled a list of the segments I got the most out of.  Watching only these on 2x speed, you should get through the course in just four or five hours.  As I said, I do recommend the whole thing, but here were the highlights for me:

  • Lesson 2: The Cognitive Revolution, Segment 1: An easy-to-understand overview of what the cognitive revolution was, when it happened, and what the consequences were.  If this piques your interest, watch the four subsequent segments of this lesson.
  • Lesson 4: The Human Flood, Segment 2: The impact of the cognitive revolution on earth ecosystems.
  • Lesson 5: History’s Biggest Fraud, Segment 1: The agricultural revolution and its downsides.  The three subsequent segments in this lesson are also quite good.
  • Lesson 6: Building Pyramids, Segment 3: The rise of mathematical thinking, and the invention of writing and numbers to support it. I’ll recommend any segment that covers Sumer.
  • Lesson 7: There is No Justice in History, Segment 2  & Segment 4: Talks about how hierarchies formed, focusing on race/caste (segment 2) and patriarchy (segment 4). The value of these segments probably depends on the extent to which you’ve previously thought and learned about race and gender.
  • Lesson 8: The Direction of History, Segment 3: The history of trust. Possibly my favorite segment of the whole course – I found myself thinking about the characterization of money as a form of quantified trust long after the segment was over.
  • Lesson 10: The Law of Religion, Segment 2: An overview of theistic religions, with a focus on how polytheism and dualism influenced monotheism.
  • Lesson 11: The Discovery of Ignorance, Segment 1: How science and imperialism grew with each other.  Can be summed up with this line: “The real aim of modern science is not truth, it is power.”
  • Lesson 13: The Capitalist Creed, Segment 2 & Segment 4: Segment 2 is as good an introduction to capitalism as I’ve found anywhere.  Segment 4 is short and very much worth watching.  It talks about unregulated capitalism, monopolies, and indifference, using the Atlantic slave trade as an example.
  • Lesson 14: The Industrial Revolution, Segment 3: This one is actually an anti-recommendation.  An introduction to consumerism, it has a shallow and frankly offensive take on obesity.  I found it a useful reminder that all histories are narratives and all narrators are fallible.  (Another reminder: a not-too-tactful reference to transgender experiences in Lesson 17, Segment 1.)
  • Lesson 15: A Permanent Revolution, Segment 1 & Segment 4: Segment 1 covers our changing approaches to time.  If you like this segment, I recommend reading A Geography of Time by Robert Levine.  Segment 4 talks about the surprising peacefulness of our time.

I will leave you with the last few sentences from the course:

So I hope that you leave this course with more questions than you had when you entered it, and that you leave this course with a desire, with a wish to study and to learn more about our history. In addition, I hope that you leave this course feeling a bit more uneasy than when you started it. Uneasy about the many questions to which we humans have no clear answer yet. Uneasy about the many problematic events that happened in the past, and uneasy about the direction history may be taking us in the future.

Money is hard, too, but you knew that

I received a number of comments regarding my recent posts (1,2) about how abstraction/quantification of trust leads to negative consequences such as competition, social judgment, and “gaming the system”. This does not surprise me, because my model for quantifying trust is money. And it is hard to find a technology more profoundly impactful and more deeply flawed than money.

This video segment from a Coursera class I’ve been taking does a good job of explaining the invention of money as a way of abstracting trust. From the transcript:

Why are people willing to work, for entire month, doing, many times, things they don’t really like, just in order to get the end of the month, a few colorful pieces of paper? People are willing to do such things when they trust the figments of the collective imagination. These, these cowry shells or these colorful pieces of paper. Trust is the real raw material from which all types of money in history have been minted.

When a wealthy farmer, say in ancient China, sold all his possessions in exchange for a sack of cowrie shells. And then travelled with them to another province, he trusted this ancient Chinese farmer, he trusted that when he reached his destination, other people, complete stranger, who he never met before would be willing to sell him rice, to sell him a house, to sell him fields in exchange for his cowry shells.

Money is accordingly a system of mutual trust. And not just any system of mutual trust. Money is the most universal and most efficient system of mutual trust ever devised by human beings.

Money is an abstraction that allows us to trust that we will get our physical needs met without having to do the work ourselves of building our own houses, growing our own food, making our own medicines. With money, we were able to expand our resource networks to vast numbers of strangers.  What I’m imagining is something similar, but in the realm of information.  With so many facts, claims, data points, anecdotes, and opinions constantly surrounding us, we end up making sense of things through something very primitive: gossip from our friends, and faith in people who look like us and talk like us.  I don’t think we’re wrong to do so.  But it’s not a very efficient system.

What if we could create a system, an abstraction, that allows us to trust that the knowledge and statements that are proposed to us are true, with a specific and transparent level of confidence?

Of course, we already have this in some respects. Most notably, we have a large and quite profitable academic and industrial system whose sole purpose is to create, collect and verify knowledge. But this system functions largely based on implicit rather than explicit trust. You trust the judgment of editors and peer reviewers of journals, of grant-makers and tenure-granters, and you trust these judgments because they belong to institutions with good reputations, or because their own work is frequently cited, or because you’ve read their work before and haven’t found any flaws. Some of these reasons are flimsier than others, but all are understandable, because verification is hard, trust is also hard, and there’s no other game going.

Which brings us around to the criticisms.  Let’s say we could make a much more efficient system, the equivalent of money but for knowledge.  Should we?

After all, money is the cause of so many problems in our society.  It causes corruption in our political systems, resentment and conflict in our personal lives, and suffering and death for many who do not have a lot of it.  Why on earth would we want to make anything more like money?

I don’t have a pat answer to that, but I do have two responses that I think are worth exploring.

First: for all the awful downsides to money, one can argue that we owe the last several thousand years of social and technical advancements to money.  Could we live the life we do now, with our cell phones and low infant mortality and space missions and chemotherapy treatments and takeout dinners and musicals, without money? This is a pretty epic counterfactual, so I’m not expecting immediate agreement, but I tend to believe that money has done more good than harm.

And second: money is very simple.  It was invented five millennia ago, and that shows.  Sure, Wall Street sharks and shysters like to create complex financial instruments, but they certainly aren’t doing so to benefit society.  But we have computers now – most of us carry them around in our pockets, and sleep with them by our bedsides – and they keep way better track of value than cuneiform tablets.  For some ideas about how we could improve money, I recommend reading Charles Eisenstein’s Sacred Economics, especially the chapters Currencies of the Commons and Negative-Interest Economics.  Bitcoin, of course, is an attempt at improving money through technology as well.  Unfortunately, money’s got a lot of baggage.  A trust abstraction system for information, if devised, could be structured to mitigate harms from the beginning.

That sounds great in theory, but what might it look like in practice?  I’ll sketch out some ideas in future blog posts.

 

Certainty

I have a habit of qualifying my statements with estimated likelihoods and error bars. “I think about ten people are coming, plus or minus two.” “I’m, like, eighty percent sure that it runs on Windows.” I worry that it comes off as an affectation, but I also worry that I’m not conveying my level of certainty effectively. I know how certain I am, and it pains me when that information gets lost to the ambiguities and inefficiencies of the English language. (I’m told that qualification by certainty is built into Lojban, which I believe with 99% certainty.)

When I send my female friends cover letters and grant proposals, they strike out words like “I think” and “I believe” and “probably” and “try”. I let them do it – we all know that there’s a confidence gap that disadvantages women – but it chafes. Aside from dangly earrings, uncertainty is the aspect of femininity I am most comfortable with.

Perhaps too comfortable.

There is one line in particular from this thoroughly memorable poem that I cannot get out of my head:

I asked five questions in genetics class today and all of them started with the word “sorry”.

My least favorite thing about blogging is the authoritative tone that nearly all bloggers adopt. To avoid it, I center myself and my experiences, about which I am actually the authority. Then my writing seems self-involved. (A lot of women essayists and authors are ridiculed as self-involved. But on what other topics are they allowed to speak with authority?)

I would rather just qualify everything I say. I’ve thought about using hover text for this purpose, or color-coding my sentences to reflect just how confident I am in them:

[[The same paragraph again, with each sentence color-coded in the original post to show how confident I am in it.]]

But you can’t color-code the spoken word – you can’t color-code most of your life – and you can’t shrink down when somebody else is waiting to take your space.  David Dunning writes that “I don’t know” should be “an enviable success, a crucial signpost that shows us we are traveling in the right direction toward the truth”, but for now it is disregarded, devalued, and feminized.

What can we do about it?  I don’t know.  I don’t know I don’t know I don’t know I don’t know.

Which leaves me here, with my self-involved writing and my error bars, trying to be taken seriously, but not with certainty.