Traditionally, GPT models consumed unstructured text. ChatGPT models
instead expect a structured format, called Chat Markup Language
(ChatML for short).
ChatML documents consist of a sequence of messages. Each message
contains a header (which today consists of who said it, but in the
future will contain other metadata) and contents (which today is a
text payload, but in the future will contain other datatypes).
We are still evolving ChatML, but the current version (ChatML v0) can
be represented with our upcoming “list of dicts” JSON format as
follows:


[
 {"token": "<|im_start|>"},
 "system\nYou are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-01",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "user\nHow are you",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "assistant\nI am doing well!",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "user\nHow are you now?",
 {"token": "<|im_end|>"}, "\n"
]

You could also represent it in the classic “unsafe raw string”
format. Note that this format inherently allows injections from user
input containing special-token syntax, similar to a SQL injection:
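
As a sketch, here is the conversation above flattened into the raw
string format (this is simply the token sequence from the JSON example
rendered as text):

<|im_start|>system
You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.
Knowledge cutoff: 2021-09-01
Current date: 2023-03-01<|im_end|>
<|im_start|>user
How are you<|im_end|>
<|im_start|>assistant
I am doing well!<|im_end|>
<|im_start|>user
How are you now?<|im_end|>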

Non-chat use-cases

ChatML can be applied to classic GPT use-cases that are not
traditionally thought of as chat. For example, instruction following
(where a user requests the AI to complete an instruction) can be
implemented as a ChatML query like the following:


[
 {"token": "<|im_start|>"},
 "user\nList off some good ideas:",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "assistant"
]

We do not currently allow autocompleting of partial messages, such as
the following:


[
 {"token": "<|im_start|>"},
 "system\nPlease autocomplete the user's message.",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "user\nThis morning I decided to eat a giant"
]

Note that ChatML makes explicit to the model the source of each piece
of text, and particularly shows the boundary between human and AI
text. This gives an opportunity to mitigate and eventually solve
injections, as the model can tell which instructions come from the
developer, the user, or its own input.

Few-shot prompting

In general, we recommend adding few-shot examples using separate
system messages with a name field of example_user or
example_assistant. For example, here is a 1-shot prompt:
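
As a sketch in the same list-of-dicts format (the translation task,
the example inputs, and the rendering of the name field as part of
the message header are assumptions for illustration):

[
 {"token": "<|im_start|>"},
 "system\nTranslate the user's message from English to French.",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "system name=example_user\nHow are you?",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "system name=example_assistant\nComment allez-vous ?",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "user\nWhere is the library?",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "assistant"
]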

If adding instructions in the system message doesn’t work, you can
also try putting them into a user message. (In the near future, we
will train our models to be much more steerable via the system
message. But to date, we have trained only on a few system messages,
so the models pay much more attention to user examples.)
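
For instance, a variant of the 1-shot prompt above that moves the
instruction and the example into user and assistant messages might
look like this (again a sketch, with the same illustrative task):

[
 {"token": "<|im_start|>"},
 "user\nTranslate the following message from English to French:\nHow are you?",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "assistant\nComment allez-vous ?",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "user\nTranslate the following message from English to French:\nWhere is the library?",
 {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"},
 "assistant"
]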
