
Mirroring Human Memory for AI Personalities

Dallas Avatar


A pixel-art depiction of a brain in a jar that is connected to a computer.

Technology around AI changes really quickly, and I’m always looking into my crystal ball, trying to guess at what the future looks like for the tools that I use (and build) that rely on the current big players in the AI space (Meta, OpenAI, Anthropic, etc). I also constantly wonder at the changes that I’m watching, in real time, in tools that I wouldn’t have guessed needed AI.

Sometimes, like in the case of document processing and the analysis of less-structured data, AI has been an amazing addition to my life. Other features, like the slow-loading GPT-powered responses that major search engines tried to implement, felt like they got in the way (thankfully, the UX designers have since clawed those implementations back to something much more palatable).

Human mimicry on the market?

Beyond data processing, data transformation, and summarization, I have been watching the early attempts at building consistent “personalities” on top of Large Language Models (LLMs). Some of the more interesting applications center on a kind of human mimicry: not just achieving human-like writing, but also emulating emotion in generated speech. In the open-source world, these are models and apps like Coqui, Tortoise, and Bark, and a quick listen to some of the generated samples is enough to demonstrate that emotional speech emulation is not science fiction and will only improve over time.

I consistently see new startups and new apps popping up that make use of the generative capabilities of AI tools and promise to be an “AI Friend”, “AI Girlfriend”, “AI Personal Assistant”, and even an “AI Counselor”. I haven’t searched, but I wouldn’t be surprised to find “AI Pastors” and “AI Life Coaches” already on the app stores.

It still feels like I’m talking to a robot…

Without getting into an endless discussion on whether selling an “AI Pastor” or an “AI Girlfriend” can ever be ethical, I want to focus on the experience of using some of these apps for the user. As a less controversial example, let’s take a look at a journaling app or two that make use of AI personalities.

I tried out one app recently, after it popped up on my feed. Reflectr is a journal app where you make a post that feels like an old-style Facebook status update or a longer-form tweet. Based on your settings (and whether you have a paid or free account), different “personalities” will comment on your post. If you have a premium account, you can even reply to those comments and have a separate conversation with each personality about what you’ve written.

I really wanted to like the app. I know, from studies and life experience, that journaling can be incredibly useful for understanding your own thoughts and for processing life experiences. When I tried the app, I had hoped that the comments and personalities would feel like “wise old friends” or “wise-cracking friends” who would show interest in my post and really push me to (like the app name suggests) reflect on my own personal experiences.

My experience, however, was very sterile.

The personalities like “philosopher” and “optimist” and “comedian” felt like cookie-cutter caricatures of those roles. Their comments were short and lacked depth, and the questions were generic. Within a conversation thread, things would improve as I wrote back and forth, but there didn’t seem to be any connection between conversations with the same personality on different posts.

Ultimately, it didn’t help me accomplish my goals of journaling more or of examining my own thoughts more deeply. I uninstalled the app.

Remember better

I am still fascinated by the concept of what a good AI-infused journal might look like, and the more I experiment and build AI-powered apps and agents, the more I am convinced that the ability to remember past conversations is a critical feature. But more than that, the way we design these systems to remember and recall is going to be a major differentiator.

Memory as a graph

There are a lot of theories that try to explain the way that humans store and retrieve memories. One of my personal favorite visualizations is to think of small concepts as “nodes” in a 3-dimensional graph, where each node can have dozens, if not thousands, of connections (“edges”) to other nodes. Each edge might have a different “strength” which represents how connected two nodes are.

In your own brain, the node for the smell of vanilla might have strong connections to baking cookies or maybe an air freshener you used in your first car. There might also be weaker edges to other nodes, like the weekly coupon book for the grocery store that has a sale on vanilla that you just threw in the recycling bin earlier this morning. And of course, there are secondary and tertiary edges which might be stronger than the weak primary edges—like an association with your spouse (who makes the best cookies) or to your children (who always try to eat the batter or sneak a cookie off the counter when you aren’t looking).
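This node-and-edge picture translates naturally into code. Here is a minimal sketch of such a memory graph in Python — the concepts and strengths below are invented purely to mirror the vanilla example above, not taken from any real system:

```python
from collections import defaultdict

class MemoryGraph:
    """Toy memory graph: concepts are nodes, weighted edges are associations."""

    def __init__(self):
        # node -> {neighbor: strength}, where strength is in [0, 1]
        self.edges = defaultdict(dict)

    def associate(self, a, b, strength):
        # Associations are symmetric: smelling vanilla recalls cookies,
        # and baking cookies recalls vanilla.
        self.edges[a][b] = strength
        self.edges[b][a] = strength

    def recall(self, cue, top_k=3):
        # Return the neighbors most strongly associated with the cue.
        neighbors = self.edges[cue]
        return sorted(neighbors, key=neighbors.get, reverse=True)[:top_k]

g = MemoryGraph()
g.associate("vanilla", "baking cookies", 0.9)   # strong primary edge
g.associate("vanilla", "air freshener", 0.7)
g.associate("vanilla", "coupon book", 0.1)      # weak primary edge
g.associate("baking cookies", "spouse", 0.8)    # strong secondary edge

print(g.recall("vanilla"))  # strongest associations first
```

Notice that the “spouse” node is only reachable through “baking cookies” — a secondary edge that is stronger than the weak primary edge to the coupon book, just as described above.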

[Im]perfect memories

One of the trademark quirks of our memories (for most of us, with a few exceptions) is that we do not remember things exactly as they were, and that, as we experience more things and remember more things, our past memories are affected. There is a famous word-list study from the ’90s that shows how confidently incorrect we can be at remembering even simple details, and how association can essentially prompt us to remember things that didn’t occur. That study and its tests have been used and cited in many articles and studies since then, and I think it’s safe to say that the method and measured effect are still relevant.

What about the machines?

When I store a document or text or image in a database, I store the exact data. When I retrieve that data later, it hasn’t changed. In older systems, I started by storing the data with a unique identifier (like an ISBN for a library book). In order to look that item up, I needed to know its identifier. Obviously, some systems needed to support the lookup of items by other attributes (like genre, author, length, etc.), and over time the structure we build up in our linear, perfect databases starts to look a little graph-like.

Eventually, you might even develop (and get awarded a patent for) an entire system that excels at finding, categorizing, and presenting what feels like a magic inference of dynamically changing attributes in a way that both humans and machines can understand.

But the retrieval of that information remains relatively perfect (both in repeatability and in the representation of the stored information). That’s great for enterprise software, but my working theory is that this kind of method is too cold for emulating a convincing personality.

Memory graphs for machines

We already have some excellent tools, readily available to any developer willing to learn and build, that allow us to turn text (words, sentences, paragraphs) into a series of multi-dimensional coordinates that can be used to represent how similar or dissimilar words/phrases are as a matter of distance (this is a huge simplification, but I think it keeps the mental model clean while preserving the general nature of vector databases).

These vector databases (and the models used to create the coordinate-like “embeddings” from the documents) are already being used to allow businesses and people to tokenize, store, and semantically search all kinds of textual data. Right now, I could store a journal entry of “eating delicious cookies with my kids”, and then I could semantically search for “smell of vanilla”, and I would probably find the cookie-eating journal entry above entries like “walking to work” or “playing Legos with the kids”.
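To make the “similarity as distance” idea concrete, here is a deliberately crude sketch. The `embed` function below is only a stand-in — a bag-of-words counter that matches on surface words, whereas a real embedding model would also connect “smell of vanilla” to a cookie-baking entry with no shared words. The cosine-similarity ranking, though, works the same way in both cases:

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: a bag-of-words vector.
    # (A real model would map semantically related text close together
    # even when no words are shared.)
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse vectors: 1.0 = same direction.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

entries = [
    "eating delicious cookies with my kids",
    "walking to work",
    "playing Legos with the kids",
]
vectors = [(e, embed(e)) for e in entries]

def search(query, k=1):
    # Rank stored entries by similarity to the query vector.
    q = embed(query)
    return sorted(vectors, key=lambda ev: cosine(q, ev[1]), reverse=True)[:k]

print(search("delicious cookies"))  # the cookie entry ranks first
```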

The searching capabilities are definitely starting to feel more human, and when we let a system take in a user’s prompt, use a semantic search to find (what we think is) relevant information, and feed all of it into our conversation model, we start to get “memory-aware results”. (This is the basic premise of RAG applications.) But the responses can have funny results (like remembering the coupon book more intensely than the cookies), and the overall responses themselves are still not quite warm enough, even with significant modifications to the prompts.
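That retrieve-then-prompt loop is the whole of the RAG premise, and it fits in a few lines. In this sketch, `retrieve` is a placeholder for a real vector-database search, the persona wording in `build_prompt` is my own invention, and the actual call to a conversation model is omitted:

```python
def retrieve(query, entries, k=2):
    # Placeholder retrieval: in a real app this would be a semantic
    # search against a vector database, not word overlap.
    scored = sorted(
        entries,
        key=lambda e: len(set(query.split()) & set(e.split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(user_message, memories):
    # Stuff the retrieved "memories" into the conversation model's context.
    context = "\n".join(f"- {m}" for m in memories)
    return (
        "You are a warm, attentive journaling companion.\n"
        f"Relevant past entries:\n{context}\n\n"
        f"User: {user_message}\nAssistant:"
    )

journal = ["eating delicious cookies with my kids", "walking to work"]
prompt = build_prompt(
    "I smelled vanilla today",
    retrieve("vanilla cookies today", journal),
)
# `prompt` would now be sent to the conversation model of your choice.
```

Everything interesting about warmth and personality lives in how `retrieve` scores memories and what the prompt does with them — which is exactly where the coupon-book-over-cookies failures creep in.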

Great. We can remember things, but what next?

I don’t normally think too hard about things I need to remember. I’m one of the lucky ones who can read something yesterday and recall a good chunk of it today (with a funny exception for names and exact dates). I think, however, that I need to have some sort of structure for the memory of my human-like AI personality. I need to make the memory less perfect.
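One simple way to make a memory system “less perfect” is to let association strengths fade unless they are refreshed. The exponential form and the 30-day half-life below are assumptions of mine for illustration, not a claim about how human forgetting actually works:

```python
def decayed_strength(strength, days_since_recall, half_life=30.0):
    # Exponential forgetting: an association loses half its strength
    # every `half_life` days unless it is recalled again (which, in a
    # fuller system, would reset the clock and perhaps boost the edge).
    return strength * 0.5 ** (days_since_recall / half_life)

print(decayed_strength(0.9, 0))   # fresh memory, full strength
print(decayed_strength(0.9, 60))  # two half-lives later, much weaker
```

Plugging something like this into the edge weights of a memory graph means yesterday’s coupon book naturally fades while the oft-recalled cookie memories stay vivid.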

Think about the last really long conversation you had with a friend. What did you talk about? How did you flow from topic to topic? How did you decide what to bring up, when to ask questions, when to share your own tangentially related experiences? How did you know if you could relate (or not relate) to the other experience?

If we can make a rough theory and guess at how we achieved that conversational flow, we can take a crack at creating a system around that theory.

We’ll have to explore that more in a future post, but you can look forward to some rough examples of my memory and conversation models, how I decide what to store (remember) and what to ignore (forget). Eventually, I’ll even describe a few ways of allowing the AI model to make really flexible decisions and how to distill all of the data into something filtered and even a little imperfect (in the best of ways).
