On November 30, 2022, OpenAI launched ChatGPT, arguably the most advanced AI chatbot ever made available to the public.1 It hit 1 million users just four days later.2
The days following its launch were marked by a wave of screenshots across social media showcasing ChatGPT’s capabilities.
It is, undeniably, an incredible technology with any number of use cases. Yet so is a lot of technology in the AI space. What made ChatGPT strike a chord in ways that similar technologies haven’t was likely some combination of the following three factors:
- Unlike much of the latest work in AI, ChatGPT is freely3 available to the public.
- Unlike other similar technologies, ChatGPT is available to the public in a way that’s simultaneously easy to access and, once accessed, user-friendly. Accessing it doesn’t require users to download any software, navigate a GitHub repository, or reach out to the authors of a study for access to their code, and using it doesn’t require any expert knowledge or specialized skillset.
- ChatGPT benefits from at least one aspect of what economists have termed ‘the network effect,’ which describes the phenomenon whereby a service becomes better or more valuable as the number of people using it increases. In the case of ChatGPT, this took shape in the form of users learning how best to use the technology from others sharing what they had done with it, which in turn encouraged still more users to try it (and so on).
Things move quickly in this space, though, and there are already those who have begun looking beyond ChatGPT’s current form in anticipation of what future iterations of the technology might be able to offer. There are rumours, for example, that GPT-4 — the successor to GPT-3, of which ChatGPT is a variant — will feature 100 trillion parameters, but others have challenged this claim.4
Regardless of their exact specs, the most common improvements — whether to ChatGPT or to similar technologies in the future — are likely to be increases in what might be termed ‘reference material.’ Future iterations of AI chatbots will, for example, be able to discuss a wider range of material, to cover that material at greater depth, and to feature more recent material than what is presently available.
I predict that while these kinds of improvements will undoubtedly be useful, they won’t be the breakthrough innovation that leads to mainstream adoption of ChatGPT or to major leaps in user benefit. Rather, the breakthrough innovation for AI chatbots will be an increase in what I’m calling ‘contextual capability.’ Conceptually, contextual capability represents a complex set of processes which could be expressed in many different ways. For the sake of simplicity, however, I’ll break this concept down to just three primary ‘levels.’
The first level of contextual capability is invisible to the user, as it represents things like the amount of training data an AI chatbot was trained on and the number of parameters it can draw on in preparing its responses. ChatGPT performs exceptionally well on this level. For now, perhaps its primary shortcoming here is the fact that, at the time of this writing, its training data only goes up to June of 2021. As a result, it can’t refer to, describe, or otherwise engage with events or developments that took place beyond that point in time.
Large models are not particularly impressive, novel, or distinct simply by virtue of their size, however.
Rather, it’s with the second level of contextual capability that things really start to get interesting. The second level of contextual capability represents the ability of an AI chatbot to take in a certain level of context from a prompt and produce a response that takes into account all of that context. Or, in the words of ChatGPT:
If you’ve been following along with ChatGPT since its launch, you might have noticed that many of the most compelling examples of ChatGPT’s responses have been the result of the most context-heavy prompts. We can demonstrate this by comparing ChatGPT’s responses to the following prompts.
Here’s another example.
There are a few issues with ChatGPT’s responses to the above prompts. But, for the moment, let’s focus on how its responses become increasingly compelling as more and more context is added to the prompt.
This leads us to the third level of contextual capability, which might be termed ‘the world-building level.’ This level represents the extent to which ChatGPT is able to take in a user’s contextual variables (as presented to it through prompts), produce a response that takes into account those variables, and then store both the user-provided prompts and its own responses to those prompts for the remainder of the session.
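Mechanically, this third level can be sketched as little more than an append-only message list that is replayed with every new prompt. The sketch below is a hypothetical illustration, not OpenAI’s actual implementation; the `generate_reply` function is a stand-in for whatever language model backs the chatbot.

```python
# Hypothetical sketch of third-level ("world-building") context handling.
# `generate_reply` stands in for the underlying language model.

def generate_reply(history):
    # Placeholder: a real system would call a language model here,
    # passing the full message history so the reply can draw on it.
    return f"(reply informed by {len(history)} prior messages)"

class ChatSession:
    def __init__(self):
        self.history = []  # stores both user prompts and model responses

    def send(self, prompt):
        self.history.append({"role": "user", "content": prompt})
        reply = generate_reply(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

session = ChatSession()
session.send("Write a song about Phineas, an anxious goose.")
session.send("Now write a sequel where he meets Greta in the winter.")
# Every earlier prompt and reply remains available to later turns.
```

The point of the sketch is simply that ‘storing the session’ means replaying everything said so far on every turn, which is why the size of that replayable history matters so much below.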
As an example of what this looks like in practice, let’s return to our song about Phineas, the anxious goose. ChatGPT has already provided us with a wonderful response to our initial prompt, but we’d like to remain in the world of Phineas and his goose-related troubles for just a little while longer. Perhaps we’re a songwriter who has been tasked with writing an album about Phineas, and as a result we need more material than just the one song. Let’s see what we can do.
Truly, it’s remarkable what ChatGPT has been able to produce in response to these world-building follow-up prompts. And yet, as we can see, ChatGPT hasn’t quite fulfilled the requirements of the third level of contextual capability. Remember, for example, that Phineas and Greta met in the winter, not in the spring. It doesn’t make sense, moreover, that they would be flying south in the spring, as springtime is when they would be returning north.
When we point out these errors to ChatGPT, it’s able to produce a more accurate revision:
In fulfilling the third level of contextual capability, however, the question is not whether ChatGPT can correct its responses with our feedback. The question is whether ChatGPT can take in our initial contextual variables, produce a response that takes into account those variables, and then store both our variable-containing prompts and its responses to those prompts for the remainder of the session as we add to or otherwise change our initial variables. Already, as we have seen, ChatGPT is able to do this to some extent. But I predict that future iterations of this technology will be significantly more adept at doing so.
In the FAQ for ChatGPT, OpenAI states that ChatGPT “is able to remember what the user has said earlier in the conversation … up to approximately 3000 words (or 4000 tokens)” in the past. “Any information beyond that,” they explain, “is not stored,” and “ChatGPT is not able to access past conversations to inform its responses.” This functionality of third-level contextual capability, in other words, has already been built into the model, which indicates that its developers considered it a sufficiently valuable feature. In a model of this scale and complexity, that certainly wasn’t a given, as it would have required considerable time and expertise to build out. The above response, moreover, suggests that those working on further improving ChatGPT recognize that its capabilities in this regard currently fall short of what its users would ideally like it to be able to do.
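A window of roughly 4,000 tokens implies that older turns must be dropped once a conversation outgrows the budget. Here is a minimal sketch of that kind of truncation, assuming a simple whitespace word count as a stand-in for real tokenization (mirroring the FAQ’s rough equivalence of 3,000 words to 4,000 tokens); this is an illustration of the general technique, not OpenAI’s actual mechanism.

```python
# Hypothetical sketch of a rolling context window.
# Word count stands in for a real tokenizer; the 3,000-word budget
# mirrors the FAQ's "approximately 3000 words (or 4000 tokens)".

WORD_BUDGET = 3000

def trim_history(history, budget=WORD_BUDGET):
    """Keep the most recent messages that fit within the word budget."""
    kept, used = [], 0
    for message in reversed(history):  # walk from newest to oldest
        words = len(message["content"].split())
        if used + words > budget:
            break  # everything older than this point is dropped
        kept.append(message)
        used += words
    return list(reversed(kept))  # restore chronological order
```

Anything trimmed this way is simply gone: the model cannot ‘remember’ it, which is exactly the limitation the FAQ describes.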
The reason why future improvements in contextual capability will be important in establishing ChatGPT as a breakthrough innovation is suggested both by the above question and by OpenAI’s response. Let’s return to them both now, this time in their full extent. See if you can spot it:
It comes down to one key word: remember.
The act of remembering is a practice that’s unique to living organisms. Humans, for example, will remember to pick up the dry cleaning on their way home from work. (Or, at least, they’ll try to.) A dog will remember the sound of their name and perk up when it’s called. Even cells — the smallest unit of an organism — are capable of remembering. But computers? Computers don’t remember things. Computers store information. With the right prompts, computers are able to retrieve and then present that information to their users.
Why, then, would OpenAI use the term ‘remember’ to describe what ChatGPT is capable of doing? I suspect that, in part, it’s a gesture towards a future in which ChatGPT has undergone significant improvements in contextual capability.
Think of it this way: you meet a friend for coffee, and you have so much to talk about that you end up staying out with them for hours. At the end of that conversation, your friend will still remember things that you said to them at the beginning of the conversation. They might not remember every single detail, but even weeks or months later they will still remember enough about your conversation to be able to both refer back to and build upon what you shared — and you’ll be able to do the same for them. That conversation, in other words, will have entered into the world-building of your friendship.
Now, think of the possibilities involved in having even a fraction of that level of contextual capability with an AI chatbot. It won’t need to remember weeks’ or months’ worth of contextual variables in the way that your friend does. Imagine, for example, if ChatGPT could simply remember the contextual variables from an interaction that spanned the entirety of a single day. My sense is that this functionality is coming, and that it will be here sooner than we might expect.
In the near future, you’ll boot up ChatGPT at the beginning of your workday, during which time you’ll initiate your world-building for the day’s tasks with a series of context-heavy prompts that will include any domain-specific or organization-specific variables it will need to know in order to assist you in those tasks. (e.g., “With this iteration of the project, we’re optimizing for cost efficiency.”) From there, you’ll keep interacting with ChatGPT over the course of the day as you work to complete your tasks, and all throughout your exchanges it will remember each update or addition to the initial set of contextual variables you provided.
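That day-seeding step can be sketched as assembling the day’s contextual variables into a single opening message that every later exchange carries along. The variable names, the project details, and the message format below are all hypothetical, chosen only to illustrate the workflow described above.

```python
# Hypothetical sketch: seeding a day's session with contextual variables.
# All names and values here are illustrative, not a real API or project.

project_context = {
    "project": "checkout-service rewrite",
    "optimizing_for": "cost efficiency",
    "team_conventions": "Python, type hints, 88-char lines",
}

def build_system_prompt(context):
    """Flatten the day's contextual variables into one opening message."""
    lines = ["Context for today's session:"]
    for key, value in context.items():
        lines.append(f"- {key.replace('_', ' ')}: {value}")
    return "\n".join(lines)

messages = [{"role": "system", "content": build_system_prompt(project_context)}]
# Later prompts and replies are appended to `messages`, so every
# exchange over the day carries the same set of contextual variables.
```

Updating a variable mid-day would then be as simple as appending a new message (“We’re now also optimizing for latency”), with the chatbot expected to honor the most recent value.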
It will be like working not just with an expert in your field, but with an expert in your specific subfield of your field who also happens to be working the same job — at the same company, with the same team, and on the same project — as you.
1 The New York Times has called it “the best artificial intelligence chatbot ever released to the general public.”
2 It’s worth noting that, in order to gain access to ChatGPT, users have to supply not only an email address but also an active phone number. This, in turn, makes the fact of ChatGPT hitting the 1-million-user mark so quickly all the more remarkable.
3 This is likely to change in future iterations of ChatGPT, as OpenAI has suggested.
4 See also this post on language model scaling.
Featured image: Suzuki Kiitsu’s “Morning Glories” (c. 1800)
Jana M. Perkins is a PhD student in Information Science who works at the intersections of fields like natural language processing, machine learning, and cultural analytics. Her research has been federally funded by the Social Sciences and Humanities Research Council of Canada (SSHRC) since 2019. Together with Miranda Hickman, she is co-authoring a book that will be published by Routledge.
You can find her on Twitter (@jcontd) or at jcontd.com.