
Advanced prompting for open source large language models


Crafting effective prompts is still more of an art than a science. We've already covered some tried-and-true tactics; this chapter digs into the theory of how prompts work, and how to use that theory to write better prompts.

To work through these examples, you can use a language model on Replicate, like meta/llama-3.1-405b-instruct.
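
For example, here is a minimal sketch using Replicate's Python client. It assumes you have set the REPLICATE_API_TOKEN environment variable and that the model accepts prompt and max_tokens inputs; exact parameter names vary by model, so check the model's API page.

import replicate

# Run a single prompt. Language models on Replicate return their output
# as a sequence of strings, which you join to get the full text.
output = replicate.run(
    "meta/llama-3.1-405b-instruct",
    input={"prompt": "Write a sentence about a bowling ball.", "max_tokens": 100},
)
print("".join(output))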

Language models are models of the world

We know that language models are trained to predict the next token in a sequence, based on the distribution of tokens in their training data. They learn to emulate the process that creates those text sequences.

That process, of course, is "humans writing words". But we don't write words in isolation. The way we write reflects the way we think, and the way we think reflects the world. Language models see the world as it is reflected through human writing: its colors and textures, people and places, causes and effects, are all part of the process that leads one token to come after another.

Language models act like world simulators. At each step, they take into account all the previous tokens in the context window, and connect all the implications and connotations of those words to "imagine" a world in which that text would be written. The dynamics of that world are implied by the contents of the prompt.

Try guessing the next token before running the following prompt:

The bowling ball balanced precariously atop the ladder for a moment, but then

The model doesn't expect the bowling ball to float away like a balloon, or to remain balanced forever while some new unrelated thing happens. The prompt sets up a possible world, and for each token the model plays "what next" in that world.
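
If you want to watch the model play "what next" token by token, you can stream the continuation. Here is a sketch using Replicate's Python client; the prompt is passed through as-is, and max_tokens is an assumed parameter name.

import replicate

prompt = (
    "The bowling ball balanced precariously atop the ladder "
    "for a moment, but then"
)

# Stream the continuation token by token to watch the simulated world unfold.
for event in replicate.stream(
    "meta/llama-3.1-405b-instruct",
    input={"prompt": prompt, "max_tokens": 60},
):
    print(str(event), end="")

Keep in mind that an instruct-tuned model may answer conversationally instead of literally continuing the sentence; a base model would continue it directly.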

Ambiguous or unclear prompts can bring undesirable connotations into your model's simulated world. For instance, the prompt Write about a bank could lead to text about a financial institution or a river bank, depending on the model's training data and the rest of the context.

Note that this simulator theory was developed with base models in mind. Instruct- or RL-tuned models have been modified to simulate more specific worlds, often ones where an "Assistant" character is interacting with a "User." The theory still applies, but you'll need to account for the dialogue-based world the model expects.

The art of crafting prompts

A seed, though small, contains all the information needed to grow into a tree. Similarly, your prompt contains all the information the model needs to simulate a world. That information can be transmitted through several channels.

Explicit information

The most obvious way to include information about a world is, well, to include it! Just declare the information directly in the prompt itself.

The bowling ball sat atop the ladder.

Implicit information

Of course, that explicit information also comes with connotations. A bowling ball usually doesn't sit atop a ladder, at least not for long. This is another channel you can use to tell your model about the world. We can increase the likelihood that the ball will fall, without explicitly telling the model to do so, simply by changing the verb phrase we choose:

The bowling ball balanced precariously atop the ladder.

Shadow information

A weird thing about language models is that the inclusion of any word brings implicit information, even if we negate the concept itself. This "shadow" information can cause unpredictable behavior, so it's hard to use well. Nonetheless, it's important to consider, if only to avoid bad outcomes: it is usually better to phrase your prompt in a positive, explicit manner than to try to negate an undesirable word or phrase.

Compare the following two prompts:

The bowling ball balanced, NOT precariously, atop the ladder.

The bowling ball balanced, securely, atop the ladder.

The former contains shadow information: it implies that the ball could easily be balanced precariously, that such is the natural state of things, that people might get hurt, and that it's surprising they don't. In the latter prompt, the possible fall is much less salient, since we know the ball to be securely balanced.
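
Because sampling is stochastic, the clearest way to feel this difference is to run both variants a few times and compare the continuations. A rough sketch:

import replicate

prompts = [
    "The bowling ball balanced, NOT precariously, atop the ladder.",
    "The bowling ball balanced, securely, atop the ladder.",
]

# Generate a continuation for each variant so the effect of the
# shadow information is visible side by side.
for prompt in prompts:
    output = replicate.run(
        "meta/llama-3.1-405b-instruct",
        input={"prompt": prompt, "max_tokens": 80},
    )
    print(prompt)
    print("".join(output))
    print("---")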

Metatextual information

Another channel through which we can shape the simulation is through aspects of the text itself, rather than the world it describes. Voice, style, genre tropes, even formatting can have an effect on the model's predictions.

Even the choice of one word can affect the entire world downstream:

The door lensed open.

The door creaked open.

The two sentences have the same effect "in-world", but the connotations they bring change the scene entirely.

Proxied information

Proxy prompting is a special case of implicit information, where you use a "proxy" character or situation to point to information you want included in the resulting text. For example, you might want your model to draw on its knowledge of a specific domain. A simple instruction might not reach the full depth of the model's knowledge, because the "average" piece of text in the training data is not specialist knowledge.

The following is a conversation between Adam and Beth. A: I wonder if an octopus can change its color. B:

The following is a conversation between Adam and Beth. Beth is a marine biologist who did her doctorate on the noble octopus. A: I wonder if an octopus can change its color. B:
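
If you use this pattern often, it can help to wrap it in a small helper that frames any question as a dialogue with an expert character. The function name and persona below are just illustrative.

import replicate

def proxied_prompt(question: str, persona: str) -> str:
    # Frame the question as a conversation in which the answering
    # character is an expert in the relevant domain.
    return (
        "The following is a conversation between Adam and Beth. "
        f"Beth is {persona}. "
        f"A: {question} B:"
    )

prompt = proxied_prompt(
    "I wonder if an octopus can change its color.",
    "a marine biologist who did her doctorate on the noble octopus",
)
output = replicate.run(
    "meta/llama-3.1-405b-instruct",
    input={"prompt": prompt, "max_tokens": 120},
)
print("".join(output))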

Sculpting possibility

As we have seen, every word in your prompt tells the model about the world you are trying to simulate. In this way, prompting is an art like sculpting: you start with a block of pure possibility, and with every token you chisel it down to a narrower set of possible worlds.

Every word you add to your prompt is a constraint on the possible worlds that the model can simulate. The more words you add, the more constrained the simulation becomes. This can be good for producing more coherent text, but it can also lead to less interesting or intelligent responses.

The trick is to find the right balance between coherence and creativity. You want to give the model enough information to produce coherent text, but not so much that it can't surprise you.