Structured outputs in the context of Large Language Models (LLMs) refer to the generation of data in a predefined format or schema, rather than free-form text. This can include outputs like JSON objects, tables, lists, or any other structured data format that can be easily parsed and utilized by other systems or applications.
This is a crucial element in integrating classic programming with LLMs, because raw LLM output is free-form, non-deterministic, and differs from one model to the next. Reducing this variance is essential for downstream processing and makes an LLM call behave more like a web service with a fixed response contract.
Although many models can emit JSON, structured output goes a step further: the result must comply with a predefined schema, here defined as a set of Pydantic models.
Some models support structured output explicitly. Even without that support, you can get part of the way with prompt engineering, like this:
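To make the schema idea concrete, here is a minimal sketch, assuming Pydantic v2, of what compliance means in practice: a raw JSON string either parses into a typed object or is rejected with a precise error. The `Address` model and the sample strings are illustrative assumptions, not part of any library.

```python
from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str


# A well-formed payload parses into a typed object.
good = '{"street": "123 Main St", "city": "Springfield", "state": "IL", "zip_code": "62701"}'
address = Address.model_validate_json(good)

# Valid JSON that violates the schema (zip_code is missing) is rejected.
bad = '{"street": "123 Main St", "city": "Springfield", "state": "IL"}'
try:
    Address.model_validate_json(bad)
    missing_fields = []
except ValidationError as exc:
    # Each error entry records the location of the offending field.
    missing_fields = [err["loc"][0] for err in exc.errors()]

print(address.city)
print(missing_fields)
```

The point of the sketch is that "valid JSON" and "valid against the schema" are two separate checks, and only the second one is useful to downstream code.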
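from openai import OpenAI

# Reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Define the prompt with instructions for structured output.
prompt = """
Extract the address from the following input and provide it in JSON format:
Input: "Send the package to 123 Main St, Springfield, IL 62701."
Output:
{
  "street": "123 Main St",
  "city": "Springfield",
  "state": "IL",
  "zip_code": "62701"
}
"""

# The chat completions endpoint replaces the retired Completion API;
# the model name is an assumption, chosen to match the later examples.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=100,
)
print(response.choices[0].message.content.strip())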
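With prompt engineering alone, nothing guarantees that the reply is clean JSON: models often wrap it in extra prose. Below is a minimal sketch of the defensive parsing this approach tends to need, using only the standard library. The helper `extract_json_object` and the sample reply are hypothetical.

```python
import json


def extract_json_object(text: str, required_keys: set[str]) -> dict:
    """Parse the first JSON object found in `text` and check its keys."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value and ignores trailing text.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not a parseable object here; try the next opening brace.
            start = text.find("{", start + 1)
            continue
        missing = required_keys - set(obj)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
        return obj
    raise ValueError("no JSON object found in model reply")


# Hypothetical model reply with chatter around the JSON payload.
reply = (
    "Sure! Here is the address:\n"
    '{"street": "123 Main St", "city": "Springfield", "state": "IL", "zip_code": "62701"}\n'
    "Let me know if you need anything else."
)
address = extract_json_object(reply, {"street", "city", "state", "zip_code"})
print(address["zip_code"])
```

This kind of glue code checks key presence only. Type checks, nested objects, and retries all have to be written by hand, which is the gap a schema-driven library closes.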
A generic solution comes from the instructor package, which acts as an adapter for a wide range of LLMs. To enable it, simply wrap the LLM client like so:
import json
import logging

import instructor
from openai import OpenAI
from pydantic import BaseModel

logging.basicConfig(level=logging.CRITICAL)


class Address(BaseModel):
    street: str
    zip: str
    city: str
    country: str


# Local OpenAI-compatible endpoint. The OpenAI client requires an API key
# to be set, so a placeholder is passed (assumption: the local server ignores it).
llm = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")
client = instructor.from_openai(llm)


def gen(text: str) -> Address:
    # response_model tells instructor which schema the reply must satisfy.
    return client.chat.completions.create(
        model="qwen2.5:14b",
        response_model=Address,
        messages=[
            {
                "role": "user",
                "content": f"Extract the address from the following input: {text}",
            }
        ],
    )


entity = gen("OpenAI has its headquarters at San Francisco, 3180 18th St, United States.")
if entity is not None:
    # model_dump() is the Pydantic v2 replacement for the deprecated dict().
    print(json.dumps(entity.model_dump(), indent=4))
else:
    print("Could not extract anything.")
Of course, this has an impact on performance (extra validation and, possibly, extra model calls), but that is the price to pay for integrability.
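To see where extra model calls can come from, consider a validate-and-retry loop: whenever a reply does not fit the expected shape, the model is asked again, and every retry is another full round trip. The sketch below uses a stubbed model and only the standard library. The names `fake_llm`, `validate` and `call_with_retries` are hypothetical, and this is not a description of how instructor implements its own retries.

```python
import json
from typing import Callable


def validate(raw: str) -> dict:
    """Accept only a JSON object that contains a 'city' field."""
    data = json.loads(raw)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict) or "city" not in data:
        raise ValueError("reply must be a JSON object with a 'city' field")
    return data


def call_with_retries(
    llm: Callable[[str], str], prompt: str, max_retries: int = 2
) -> tuple[dict, int]:
    """Return the validated reply and the number of model calls it took."""
    last_error = None
    for attempt in range(1, max_retries + 2):
        raw = llm(prompt)
        try:
            return validate(raw), attempt
        except ValueError as exc:
            last_error = exc
            # Feed the error back so the next attempt can correct itself.
            prompt = f"{prompt}\nYour previous reply was invalid ({exc}). Reply with JSON only."
    raise RuntimeError(f"no valid reply after {max_retries + 1} attempts") from last_error


# Stub model: the first reply is prose, the second is valid JSON.
replies = iter(["The city is Springfield.", '{"city": "Springfield"}'])


def fake_llm(prompt: str) -> str:
    return next(replies)


result, attempts = call_with_retries(fake_llm, "Extract the city as JSON.")
print(result, attempts)
```

In this run the valid reply arrives on the second call, so latency and token cost roughly double for that request.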
The instructor package does more than this and supports a variety of use cases, for example the rephrase-and-respond (RaR) pattern:
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Local OpenAI-compatible endpoint. The OpenAI client requires an API key
# to be set, so a placeholder is passed (assumption: the local server ignores it).
llm = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")
client = instructor.from_openai(llm)


class Response(BaseModel):
    rephrased_question: str
    answer: str


def rephrase_and_respond(query: str) -> Response:
    return client.chat.completions.create(
        model="llama3.2",
        messages=[
            {
                "role": "user",
                "content": f"""{query}\nRephrase and expand the question, and respond.""",
            }
        ],
        response_model=Response,
    )


query = "Take the last letters of the words in 'Edgar Bob' and concatenate them."
response = rephrase_and_respond(query)
print(response.rephrased_question)
print(response.answer)
What is the concatenated last letter of each word in the name Edgar Bob?
rg
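A schema guarantees the shape of a reply, not its correctness. For a task like this one, the expected value can be computed deterministically and compared with the model's `answer` field. A minimal sketch follows; the helper name `last_letters` is an assumption.

```python
def last_letters(phrase: str) -> str:
    """Concatenate the last letter of every whitespace-separated word."""
    return "".join(word[-1] for word in phrase.split())


expected = last_letters("Edgar Bob")  # "r" from Edgar, "b" from Bob
model_answer = "rg"  # the answer printed in the run above

print(expected)
print(expected == model_answer)
```

Here the check prints `rb` and `False`: the reply is perfectly structured, yet the content is wrong, so semantic checks remain necessary wherever the answer can be verified.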
The structures themselves can be valuable inside agent logic. For example, the following splits an initial question into several sub-questions, answers them in parallel, and finally consolidates the results into one answer (following this paper by Meta):
import asyncio
from typing import Optional

import instructor
import nest_asyncio
from openai import AsyncOpenAI
from pydantic import BaseModel, Field

# Allows asyncio.run() inside an already running event loop (e.g. a notebook).
nest_asyncio.apply()

client = instructor.from_openai(AsyncOpenAI())


class ReasoningAndResponse(BaseModel):
    intermediate_reasoning: str = Field(description="""Intermediate reasoning steps""")
    correct_answer: str


class MaybeResponse(BaseModel):
    result: Optional[ReasoningAndResponse]
    error: Optional[bool]
    error_message: Optional[str] = Field(
        description="""Informative explanation of why the reasoning chain was unable to generate a result"""
    )


class QueryDecomposition(BaseModel):
    queries: list[str] = Field(
        description="""A list of queries that need to be answered in order to derive the final answer"""
    )


async def generate_queries(query: str):
    # Step 1: split the question into sub-queries.
    return await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": """You are a helpful assistant that decomposes a query into multiple sub-queries.""",
            },
            {"role": "user", "content": query},
        ],
        response_model=QueryDecomposition,
    )


async def generate_reasoning_chain(query: str) -> MaybeResponse:
    # Step 2: answer a single sub-query with explicit reasoning.
    return await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": """
                Given a question and a context, answer the question step-by-step.
                Indicate the intermediate reasoning steps.
                """,
            },
            {"role": "user", "content": query},
        ],
        response_model=MaybeResponse,
    )


async def batch_reasoning_chains(
    queries: list[str],
) -> list[MaybeResponse]:
    # Run all sub-queries concurrently.
    coros = [generate_reasoning_chain(query) for query in queries]
    results = await asyncio.gather(*coros)
    return results


async def generate_response(query: str, context: list[MaybeResponse]):
    # Step 3: consolidate the successful sub-answers into one final answer.
    formatted_context = "\n".join(
        [
            f"""
            {item.result.intermediate_reasoning}
            {item.result.correct_answer}
            """
            for item in context
            if not item.error and item.result
        ]
    )
    return await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": """
                Given a question and a context, answer the question step-by-step.
                If you are unsure, answer Unknown.
                """,
            },
            {
                "role": "user",
                "content": f"""
                <question>
                {query}
                </question>
                <context>
                {formatted_context}
                </context>
                """,
            },
        ],
        response_model=ReasoningAndResponse,
    )


query = """Would Arnold Schwarzenegger have been able to deadlift an adult Black rhinoceros at his peak strength?"""

decomposed_queries = asyncio.run(generate_queries(query))
for generated_query in decomposed_queries.queries:
    print(generated_query)

chains = asyncio.run(batch_reasoning_chains(decomposed_queries.queries))
for chain in chains:
    print(chain.model_dump_json(indent=2))

response = asyncio.run(generate_response(query, chains))
print(response.model_dump_json(indent=2))
What is the maximum deadlift weight achieved by Arnold Schwarzenegger at his peak?
What is the average weight of an adult Black rhinoceros?
Can Arnold Schwarzenegger's deadlift capability surpass the average weight of an adult Black rhinoceros?
{
"result": {
"intermediate_reasoning": "Arnold Schwarzenegger, primarily known for his bodybuilding achievements, did not have a recorded deadlift competition weight that is well-publicized. However, it is widely reported that during his peak, he was able to deadlift approximately 700 pounds (around 317.5 kg). This figure is noted based on his overall strength and conditioning as part of his training regime while competing in bodybuilding. He was more focused on bodybuilding lifts such as squats and bench presses, but his deadlift was also quite significant.",
"correct_answer": "Approximately 700 pounds (317.5 kg)"
},
"error": null,
"error_message": null
}
{
"result": {
"intermediate_reasoning": "The average weight of an adult Black rhinoceros can vary depending on several factors such as their age, sex, and subspecies. Adult Black rhinos typically weigh between 800 to 1,400 pounds (363 to 635 kg). To find an average we can calculate the midpoint of this range.",
"correct_answer": "The average weight of an adult Black rhinoceros is approximately 1,000 pounds (454 kg)."
},
"error": null,
"error_message": null
}
{
"result": {
"intermediate_reasoning": "To determine if Arnold Schwarzenegger's deadlift capability can surpass the average weight of an adult black rhinoceros, we first need to know the figures involved. Arnold Schwarzenegger, during his peak bodybuilding years, had a deadlift maximum around 710 pounds (approximately 322 kg). On the other hand, the average adult black rhinoceros weighs between 1,800 to 2,200 pounds (approximately 816 to 998 kg). Since Schwarzenegger's deadlift maximum (710 lbs) is significantly lower than the minimum weight of an adult black rhinoceros (1,800 lbs), we conclude that his deadlift capability cannot surpass this weight.",
"correct_answer": "No, Arnold Schwarzenegger's deadlift capability cannot surpass the average weight of an adult Black rhinoceros."
},
"error": null,
"error_message": null
}
{
"intermediate_reasoning": "Arnold Schwarzenegger's peak deadlift was around 710 pounds (approximately 322 kg). In contrast, the weight of an adult Black rhinoceros ranges from about 1,800 to 2,200 pounds (approximately 816 to 998 kg). Since Schwarzenegger's deadlift capability (710 lbs) is significantly lower than the minimum weight for an adult Black rhinoceros (1,800 lbs), he would not have been able to deadlift one.",
"correct_answer": "No, Arnold Schwarzenegger would not have been able to deadlift an adult Black rhinoceros at his peak strength."
}
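The decompose, answer in parallel, consolidate flow can be exercised without any API access by stubbing the model calls. The sketch below keeps only the control flow. The functions `decompose`, `answer` and `consolidate` are hypothetical stand-ins for the three LLM-backed functions above and return canned data.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SubAnswer:
    query: str
    answer: str


async def decompose(query: str) -> list[str]:
    # Stand-in for the LLM call that splits the question.
    return [f"{query} (part {i})" for i in (1, 2, 3)]


async def answer(sub_query: str) -> SubAnswer:
    await asyncio.sleep(0.01)  # simulate network latency
    return SubAnswer(sub_query, f"answer to: {sub_query}")


async def consolidate(query: str, parts: list[SubAnswer]) -> str:
    # Stand-in for the final LLM call that merges the sub-answers.
    context = "\n".join(part.answer for part in parts)
    return f"{query}\n{context}"


async def pipeline(query: str) -> str:
    sub_queries = await decompose(query)
    # gather runs the sub-queries concurrently and preserves input order.
    parts = await asyncio.gather(*(answer(q) for q in sub_queries))
    return await consolidate(query, list(parts))


final = asyncio.run(pipeline("Q"))
print(final)
```

Wrapping the whole flow in one coroutine means a single `asyncio.run` call and a single event loop, which is one possible design choice when the code runs as a script and not in a notebook.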