Building a Memex by Andrew Louis

The Memex becomes weather-ready

On July 7, 1995, a summer thunderstorm at midnight kept young Andrew Louis awake, causing him to wake up late the next morning. This week, I spent a bit of time looking into historical weather data and was pleased to discover that my childhood journal seems to match objective reality. Here’s another example from a few months later, a rainy day that resulted in us having indoor recess at school:

This week, I worked on an importer that adds weather data to my personal timeline, using the fantastic DarkSky API. Seeing the weather alongside other records from a day helps add context or bring back additional memories.
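As a rough illustration of the idea (not the actual importer), here is a minimal Python sketch of placing a day's weather next to the other records from that day. The `TimelineEntry` and `DailyWeather` types and the `attach_weather` function are hypothetical names made up for this example, and the sketch assumes the weather records have already been fetched and reduced to one record per day; it does not call any weather API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class TimelineEntry:
    """Hypothetical timeline record: a journal line, a message, a photo caption, etc."""
    timestamp: datetime
    source: str
    text: str


@dataclass
class DailyWeather:
    """Hypothetical one-per-day weather summary, already fetched from some provider."""
    day: date
    summary: str
    high_c: float
    low_c: float


def attach_weather(entries, weather_records):
    """Group entries by calendar day and pair each day with its weather.

    Returns a dict mapping each day that has entries to a tuple of
    (weather_or_None, list_of_entries), ordered by day.
    """
    weather_by_day = {w.day: w for w in weather_records}
    entries_by_day = defaultdict(list)
    for entry in entries:
        entries_by_day[entry.timestamp.date()].append(entry)
    return {
        day: (weather_by_day.get(day), day_entries)
        for day, day_entries in sorted(entries_by_day.items())
    }
```

Keying everything on the calendar date keeps the join simple; days without a weather record simply come back with `None` rather than being dropped.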

Having weather data accessible also lets me do silly Quantified Self things like analyze the relationship between a day’s temperature and time spent on my bike:
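One possible way to put a number on that relationship is a Pearson correlation between each day's temperature and the minutes spent biking that day. The sketch below is an assumption about how such an analysis could be done, not a description of how the chart was produced; the function name `pearson` and the choice of daily temperature and biking minutes as inputs are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series.

    xs could be daily temperatures and ys the minutes biked on the same days
    (both hypothetical inputs for this sketch).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)
```

A value near 1 would suggest more biking on warmer days and a value near -1 the opposite. It only measures a linear relationship, so a scatter plot is still worth a look.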

A farewell to Facebook’s message history API

As machine learning gets more and more powerful, monopolized data becomes a moat protecting your platform’s competitive advantage. That’s my hunch, at least, and I’m sure it’s a big part of the reason Facebook is deprecating the endpoint for retrieving messaging history next week.

Even though the majority of my Facebook messages aren’t worth remembering, it’s a great feeling to know that I can rely on my Memex to find needles in haystacks the times I do need to search for something (and even better, not to have to remember which haystack to search in).

A few people have written libraries for scraping Facebook Messenger’s AJAX endpoints and I’m pretty confident I can get an importer working this way. I played around with this golang one and I think I can integrate it into a proper importer. These sorts of workarounds make me nervous though — they’re fragile, might get my personal account in trouble if Facebook notices, and will be hard to support in an official way as I start bringing the Memex to more people. #webwelost

The microphotography revolution

In 1839, John Benjamin Dancer figured out how to make microphotographs, tiny representations of pages of text, stored on film. Around 1900, microfilm (left) began being used in libraries, saving valuable shelf space; microcards (centre) used the same principles but added classification and bibliographic data; microfiche (right) was a French format that superseded the others in the 50s.

Rebecca Lemov spends a chapter in Database of Dreams (also mentioned in last week’s email — I’ve really enjoyed reading it) talking about the revolution that was microphotography.

In the 1920s, microphotography exploded, coinciding with an explosion in the amount of information produced by corporations, researchers, and bureaucracies. Microphotography also allowed cheques to become commonly used, let railroad companies stay on top of their piles of records, and was crucial for the management of mail-order businesses (Sears Roebuck was processing almost 100k orders a day). During WWII, 1.5 billion letters were sent to soldiers overseas (“you’ll write, he’ll fight”), but instead of shipping the physical letters, each one was microphotographed, sent across the ocean as tiny film, reprinted at 3/5 the size, and given to soldiers. When the British Museum was damaged by aerial bombing, fear gripped archivists everywhere, spurring an even faster push to microphotograph material.

As successful as microphotography was at compressing information and storing it cheaply, it didn’t support complex operations, causing one archivist to conclude that the technologies were just “one giant headache.” An attempt to tame the chaos came in the form of numerous new indexes and classification schemes for areas like diseases and crimes; the idea of a bibliography also emerged in this era.

It’s in this historical context that we should understand the goals of the Memex. The author, Vannevar Bush, had already made a few attempts to solve this data management problem. The Rapid Selector was a device that took microfilm and added punchcard patterns into the margins to allow for mechanical sorting/searching. But even with fast searching, Bush felt hierarchical categories and schemas were arbitrary and made relevant information hard to get to, regardless of how fast you could pull up records. The Memex was to be a microfilm-based personal library — this idea would be familiar to his readers. Bush’s innovative idea was letting users create links between arbitrary items (later named hyperlinks) and use them to navigate non-linearly when searching for something later, mirroring how our own memories work by association.

The Memex, along with other fantasies of perfect information systems, identified real challenges of managing information but existed long before we had the technologies to properly build them. I don’t know if we’re in a different boat today: the amount of data we produce eclipses the technologies and conceptual techniques required to make sense of it, at both the corporate and the personal level. Like Bush’s innovative attempts to tame his own library, I think there’s lots of room for new techniques to stay on top of our digital histories.

Calendar journals

In last week’s newsletter, I talked about how many people use calendars as a journal. This week, I chatted with one such person, my friend Kamilah. She has over a dozen calendars in Google Calendar that she fills in retroactively for things like movie and TV watching, meals, chores, and social engagements. Here’s an obfuscated screenshot of what her system looks like:

I’m pretty excited about the possibilities of using calendars as a bridge for people who like journalling so they can capture and do more with their data. Calendars work well both as a way to display digests of historical personal data and as an input method for new journal entries. If you know anyone else who obsessively uses calendars this way, I’d love to have a chat!

Quantified Self Art

I love coming across artists who work with their own personal data. Here are a few:

I have importers for, and track, a lot of the same types of information as these artists. I’d love it if this Memex project could give artists an easy way to work with this type of data without having to build importers/tools from scratch.

If you come across artists doing work in this area, let me know!