Building a Memex by Andrew Louis

Bad remembrance machines

Are there any downsides to having so much data from the past available? A friend asked me this recently and I mentioned the unpleasantness of awkward high school chat messages popping up in search results.

When Vannevar Bush conceived of the Memex, computers were room-sized and exorbitantly expensive. The idea of a personal computing device to help individuals was visionary but he still imagined computers as being a tool for professionals — the examples in the Memex essay are about how lawyers, chemists, or physicians could benefit from such a device.

Computers are now cheap and ubiquitous and our digital histories span many years with personal and professional data completely intertwined.

Ellen Ullman, a programmer and author, just published a new book called Life in Code, a collection of essays written over the last twenty years. It’s great! One of the essays (Memory and Megabytes, written in 2002) describes a dilemma she had when setting up a new laptop. Should she copy over all her old files?

When the migration wizard prompted her to start copying everything over, she got cold feet. Would it be dangerous to have such immediate access to so many old memories?

The moment you recall it, the memory has become woven into other experiences. The newness, the clarity, is gone forever.

Photo by Elena Mudd from Mythos Magazine profile

She continues:

The files were unforgiving, frozen, perfect mechanisms for recall, and I wanted to be what I was meant to be: a bad remembrance machine. I could almost feel the memories run away from me as I approached them, scattering at the touch of thought. […] And I realized that the fact that memory changes is what allows us to tolerate something called memory in the first place. If we could not continuously reinterpret the past—could not turn experience over and over, and so interweave it with hope and unknowing—memory would be a tyranny. It would be unbearable, a torture, a bad recurring dream. Like data spinning away on a disk: forever the same, the same, the same.

Not having durable long-term memories is in many ways a feature, not a bug. She wonders whether she’d ever choose to write another book if she were constantly reminded of how many drafts of her previous books had middle-of-the-night timestamps. Or if she had easy access to the anguished documents in the Letters for the Drawer folder, would she have chosen to get married again?

Tossing out all our old personal data seems like an overreaction. As Joan Didion warned us, it’s good to be on “nodding terms” with the people we used to be, otherwise they’ll show up unannounced at 4 a.m. at the mind’s door. In a previous newsletter, I mentioned a friend who had sealed her old journals in lacquer and put them on a shelf.

What’s the digital equivalent of making these records hard to stumble upon?

On a computer, there is no basement or attic. At any moment, while you are whiling away time, maybe avoiding another task, or just daring yourself to think of the past, you might go “click,” and then it all pops out at you: fresh, unyellowed, cruelly unchanged.

Ullman decided not to migrate her old documents. To see them, she would have to go through the trouble of booting up the old devices:

So I cleared a space for them on the bookshelf and stood them up, vertically, like the notebooks they are. And they took their place among the other diaries and journals, treacherously holding the past, their power units crouched behind them like small, sleeping rats.

A bit of nostalgia is healthy from time to time.

Nostalgia used to be thought of as a neurological illness. It was first used to describe the sickness that seemed to plague Swiss mercenaries who missed their mountains while out fighting elsewhere in Europe. Perhaps the disease was caused by brain damage from the constant clanging of cowbells in Switzerland, hypothesized physicians.

Swiss mercenaries nostalgic for the Alps

But now, researchers are coming around:

Nostalgia has had this - historically had this stigma, as we talked about it. It started out very much as being considered a disease. And people - even today, a lot of people will say, well, I’m not nostalgic because I think about the future. You know, I’m not the type of person that likes to fixate or get stuck in the past. And I think what they’re missing when they say that is there is a big element of nostalgia that isn’t about us retreating to the past. It’s about us pulling the past forward to the present and using it to mobilize us, to energize us to take on new challenges and opportunities.

That’s from an episode of Hidden Brain that I listened to recently. It’s a summary of the latest thinking on nostalgia by researcher Clay Routledge, who runs a lab that specializes in this topic.

Nostalgia seems to actually orient people towards the future. And so part of what seems to be going on is you experience some kind of distress, which kind of makes you shrink a little bit from pursuing goals and from the future. Like, it does kick you into more of this defensive mode. And then you bring to mind these nostalgic experiences that they don’t only make you feel good; we now have evidence that they actually make you feel optimistic and hopeful about the future.

I’m really happy that my version of the Memex has a long memory. For example, while reading Ellen Ullman’s book, I was able to search for other places I had encountered her name in the past, finding things like a quote I saved from a book I read a decade ago or a tweet I fav’d that mentions her. It provides me with amazing context.

I’m also really happy that I can easily read the chat logs that showcase my high school political beliefs, if I’m in the mood to read them.

But some thought needs to go into keeping the nostalgia traps out of search results by default. Perhaps simply applying a heavy dampening function for old results would do the trick. Perhaps some more sophisticated rules need to be worked out.
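As a rough illustration of what a "heavy dampening function" could look like, here is a minimal sketch that multiplies each result's relevance by an exponential decay based on its age. The half-life value, the field names (relevance, created_at), and the function names are all assumptions made for this example, not part of the actual Memex code.

```python
from datetime import datetime, timedelta

# Hypothetical tuning knob: after this many days a result's score is halved.
HALF_LIFE_DAYS = 365.0


def dampened_score(relevance, created_at, now, half_life_days=HALF_LIFE_DAYS):
    """Scale a relevance score down exponentially with the age of the result."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return relevance * 0.5 ** (age_days / half_life_days)


def rank(results, now, dampen=True):
    """Sort results best-first; pass dampen=False to search the past on purpose."""
    def key(result):
        if dampen:
            return dampened_score(result["relevance"], result["created_at"], now)
        return result["relevance"]
    return sorted(results, key=key, reverse=True)


now = datetime(2017, 9, 1)
results = [
    {"title": "high school chat log", "relevance": 0.9,
     "created_at": now - timedelta(days=365 * 12)},
    {"title": "recent saved quote", "relevance": 0.6,
     "created_at": now - timedelta(days=30)},
]
ranked = rank(results, now)
```

With dampening on, the recent quote outranks the old chat log even though its raw relevance is lower. Turning dampening off restores the raw ordering, for the times when reading old chat logs is exactly the point.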

W3C’s ActivityPub

But before I start worrying about nuanced and complicated features like this, I need to finish more basic things like stabilizing the schema.

Schemas are hard to change retroactively, especially when I don’t intend to host a centralized database for all users. I’ve been working on ironing out as many decisions as possible.

In the last few years, the W3C’s Social Web Working Group has been doing a lot of work on building a spec for activity feeds. The Activity Streams spec describes how activity streams should be organized and the ActivityPub protocol describes how stream data should be shared. A lot of it is being used in Mastodon (an open-source and decentralized version of Twitter) and the feed-as-a-service startup Stream.

A lot of thought has gone into how to best describe social activities. Here’s a good overview. My own Memex schema is pretty close. For schema issues I’m unhappy with, I’ve been trying to use the W3C spec as a guide.

The most basic components of an Activity object:

This maps pretty well to the triplestore architecture that stores triples of subject-predicate-object.
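To make the mapping concrete, here is a small sketch. It assumes the basic components are the actor, the activity type, and the object, which are the core properties in the Activity Streams vocabulary. The example URLs and the activity_to_triple helper are invented for illustration and are not taken from the Memex codebase.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# A minimal Activity in Activity Streams 2.0 style.
# The example.com identifiers are placeholders.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Like",
    "actor": "https://example.com/people/andrew",
    "object": "https://example.com/notes/1",
}


def activity_to_triple(activity):
    """Map actor -> subject, type -> predicate, object -> object."""
    return Triple(activity["actor"], activity["type"], activity["object"])


triple = activity_to_triple(activity)
```

The actor becomes the subject, the activity type becomes the predicate, and the object stays the object, so a basic Activity fits in a single triple.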

There are more optional properties of an Activity.

Even with this spec, there are a lot of situations with no clear right answer.

For example, I recently refactored all my bike/car/walking activities to share the same travelled activity type, with the instrument property set to the mode of transportation. But sometimes a bike ride is also part of a workout. How should the travelled and the exercised activities be linked together?
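Here is a sketch of that refactoring, along with one possible way to link two activities. The Activity class, the related field, and the link helper are hypothetical. The symmetric "related" link is only one option among several, and the question of how best to link them stays open.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Activity:
    activity_id: str
    actor: str
    verb: str
    instrument: Optional[str] = None  # e.g. the mode of transportation
    related: List[str] = field(default_factory=list)  # hypothetical link field


def travelled(activity_id, actor, mode):
    """Bike, car, and walking trips share one verb; only the instrument differs."""
    return Activity(activity_id, actor, "travelled", instrument=mode)


def link(a, b):
    """One possible answer: record a symmetric 'related' link by id."""
    a.related.append(b.activity_id)
    b.related.append(a.activity_id)


ride = travelled("t1", "me", "bike")
workout = Activity("e1", "me", "exercised")
link(ride, workout)
```

A symmetric link avoids deciding which activity "owns" the other, at the cost of not saying anything about how the two are related.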

Or another complicated example: let’s say I’m having a phone meeting with a few people about a work project. There are three activities that are involved: the worked activity to track time on the project, the met activity to track the actual act of talking with others, and another conversed activity generated by my phone’s call logs. Should there be a meeting entity that holds everything together? Should the participants be attached to this entity or perhaps to the conversed activity? If the latter, should I allow for activities to have multiple objects or do I need to create a new group entity to describe the collection of people I talked to?
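The last two options can be compared side by side as triples. This is only a sketch to show the trade-off: the has_member predicate, the group identifier, and both helper functions are made up for the example and do not come from the W3C spec or the Memex schema.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def multi_object_triples(actor, verb, participants):
    """Option A: the activity has several objects, one triple per participant."""
    return [Triple(actor, verb, person) for person in participants]


def group_entity_triples(actor, verb, participants, group_id):
    """Option B: the activity has one object, a group entity listing its members."""
    triples = [Triple(actor, verb, group_id)]
    triples += [Triple(group_id, "has_member", person) for person in participants]
    return triples


people = ["alice", "bob"]
option_a = multi_object_triples("me", "conversed", people)
option_b = group_entity_triples("me", "conversed", people, "group:call-1")
```

Option A keeps the store flat but lets a single activity fan out into many triples. Option B costs an extra entity but gives the worked, met, and conversed activities something shared to point at.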

My approach right now is to remind myself that there will never be One True Ontology for this stuff and a working 80% version is better than a perfect but unimplemented version.