cosette

Cosette is Claudette’s sister, a helper for OpenAI GPT

Install

pip install cosette

Getting started

OpenAI’s Python SDK will automatically be installed with Cosette, if you don’t already have it.

from cosette import *

Cosette only exports the symbols that are needed to use the library, so you can use import * to import them. Alternatively, just use:

import cosette

…and then add the prefix cosette. to any usages of the module.

Cosette provides models, which is a list of models currently available from the SDK.

' '.join(models)

For these examples, we’ll use GPT-5-mini.

model = first(m for m in models if 'mini' in m)

Chat

The main interface to Cosette is the Chat class, which provides a stateful interface to the models. You can pass message keywords to either Chat or when you call the model.

chatkw = dict(
    text={ "verbosity": "low" },
    reasoning={ "effort": "minimal" }
)

chat = Chat(model, sp="You are a helpful and concise assistant.", **chatkw)
chat("I'm Jeremy")

Nice to meet you, Jeremy. How can I help you today?

id: resp_0d3d0fd9abbe53b800691bb8f36a8081a287420e10bd559959
created_at: 1763424499.0
error: None
incomplete_details: None
instructions: You are a helpful and concise assistant.
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseReasoningItem(id=‘rs_0d3d0fd9abbe53b800691bb8f3f96481a2b973102cdd2f71ae’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_0d3d0fd9abbe53b800691bb8f42c2081a2844a8c1c40c9e319’, content=[ResponseOutputText(annotations=[], text=‘Nice to meet you, Jeremy. How can I help you today?’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=20, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=20, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=40)
user: None
billing: {‘payer’: ‘developer’}
store: True

r = chat("What's my name?")
r

Your name is Jeremy.

id: resp_0d3d0fd9abbe53b800691bb8f4f94481a2810c8643c0b3d40a
created_at: 1763424500.0
error: None
incomplete_details: None
instructions: You are a helpful and concise assistant.
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseReasoningItem(id=‘rs_0d3d0fd9abbe53b800691bb8f60dfc81a29e49561b99db8e9f’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_0d3d0fd9abbe53b800691bb8f63d2881a29391dc42f032a3ed’, content=[ResponseOutputText(annotations=[], text=‘Your name is Jeremy.’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=48, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=11, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=59)
user: None
billing: {‘payer’: ‘developer’}
store: True

As you see above, displaying the results of a call in a notebook shows just the message contents, with the other details hidden behind a collapsible section. Alternatively you can print the details:

print(r)

Response(id='resp_0d3d0fd9abbe53b800691bb8f4f94481a2810c8643c0b3d40a', created_at=1763424500.0, error=None, incomplete_details=None, instructions='You are a helpful and concise assistant.', metadata={}, model='gpt-5-mini-2025-08-07', object='response', output=[ResponseReasoningItem(id='rs_0d3d0fd9abbe53b800691bb8f60dfc81a29e49561b99db8e9f', summary=[], type='reasoning', content=None, encrypted_content=None, status=None), ResponseOutputMessage(id='msg_0d3d0fd9abbe53b800691bb8f63d2881a29391dc42f032a3ed', content=[ResponseOutputText(annotations=[], text='Your name is Jeremy.', type='output_text', logprobs=[])], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=4096, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, prompt_cache_retention=None, reasoning=Reasoning(effort='minimal', generate_summary=None, summary=None), safety_identifier=None, service_tier='default', status='completed', text=ResponseTextConfig(format=ResponseFormatText(type='text'), verbosity='low'), top_logprobs=0, truncation='disabled', usage=In: 48; Out: 11; Total: 59, user=None, billing={'payer': 'developer'}, store=True)

You can use stream=True to stream the results as soon as they arrive (although you will only see the gradual generation if you execute the notebook yourself, of course!)

for o in chat("What's your name?", stream=True): print(o, end='')

I'm ChatGPT.

Model Capabilities

Different OpenAI models have different capabilities. Some models such as o1-mini do not have support for streaming, system prompts, or temperature. Query these capbilities using these functions:

# o1 does not support streaming or setting the temperature
can_stream('o1'), can_set_sp('o1'), can_set_temp('o1')

(True, True, False)

# gpt-4o has these capabilities
can_stream('gpt-4o'), can_set_sp('gpt-4o'), can_set_temp('gpt-4o')

(True, True, True)

Tool use

Tool use lets the model use external tools.

We use docments to make defining Python functions as ergonomic as possible. Each parameter (and the return value) should have a type, and a docments comment with the description of what it is. As an example we’ll write a simple function that adds numbers together, and will tell us when it’s being called:

def sums(
    a:int,  # First thing to sum
    b:int=1 # Second thing to sum
) -> int: # The sum of the inputs
    "Adds a + b."
    print(f"Finding the sum of {a} and {b}")
    return a + b

Sometimes the model will say something like “according to the sums tool the answer is” – generally we’d rather it just tells the user the answer, so we can use a system prompt to help with this:

sp = "Never mention what tools you use."

We’ll get the model to add up some long numbers:

a,b = 604542,6458932
pr = f"What is {a}+{b}?"
pr

'What is 604542+6458932?'

To use tools, pass a list of them to Chat:

chat = Chat(model, sp=sp, tools=[sums], **chatkw)

Now when we call that with our prompt, the model doesn’t return the answer, but instead returns a tool_use message, which means we have to call the named tool with the provided parameters:

r = chat(pr)
r.output

Finding the sum of 604542 and 6458932

[ResponseReasoningItem(id='rs_0e80f673f989086700691bb8f93f808196a2d8be5fabc7ab32', summary=[], type='reasoning', content=None, encrypted_content=None, status=None),
 ResponseFunctionToolCall(arguments='{"a":604542,"b":6458932}', call_id='call_G8jd6G7UGQtVFcJHiCYfLN8b', name='sums', type='function_call', id='fc_0e80f673f989086700691bb8f9f3dc81969e7bfce86b2e8cdc', status='completed')]

Cosette handles all that for us – we just have to pass along the message, and it all happens automatically:

chat()

7,063,474

id: resp_0e80f673f989086700691bb8fb04ec8196854dd7610c320811
created_at: 1763424507.0
error: None
incomplete_details: None
instructions: Never mention what tools you use.
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseOutputMessage(id=‘msg_0e80f673f989086700691bb8fbaf4081969d32466017fe56b9’, content=[ResponseOutputText(annotations=[], text=‘7,063,474’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: [FunctionTool(name=‘sums’, parameters={‘type’: ‘object’, ‘properties’: {‘a’: {‘type’: ‘integer’, ‘description’: ‘First thing to sum’}, ‘b’: {‘type’: ‘integer’, ‘description’: ‘Second thing to sum’, ‘default’: 1}}, ‘required’: [‘a’, ‘b’], ‘additionalProperties’: False}, strict=True, type=‘function’, description=‘Adds a + b.:- type: integer’)]
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=142, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=9, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=151)
user: None
billing: {‘payer’: ‘developer’}
store: True

You can see how many tokens have been used at any time by checking the use property.

chat.use

In: 231; Out: 36; Total: 267

Tool loop

def show(x):
    if getattr(x, 'output_text', None): display(x)

We can do everything needed to use tools in a single step, by using Chat.toolloop. This can even call multiple tools as needed solve a problem. For example, let’s define a tool to handle multiplication:

def mults(
    a:int,  # First thing to multiply
    b:int=1 # Second thing to multiply
) -> int: # The product of the inputs
    "Multiplies a * b."
    print(f"Finding the product of {a} and {b}")
    return a * b

Now with a single call we can calculate (a+b)*2:

chat = Chat(model, tools=[sums,mults], **chatkw)
pr = f'Calculate ({a}+{b})*2 and display the result as US$'
pr

'Calculate (604542+6458932)*2 and display the result as US$'

r = chat.toolloop(pr)

for o in r: show(o)

Finding the sum of 604542 and 6458932
Finding the product of 7063474 and 2

US$14,126,948

id: resp_0b2a84d51ad623c400691bb900c6d08195b323fdb870a37bec
created_at: 1763424512.0
error: None
incomplete_details: None
instructions: None
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseOutputMessage(id=‘msg_0b2a84d51ad623c400691bb9014e088195b6bf52d37fafebf7’, content=[ResponseOutputText(annotations=[], text=‘US$14,126,948’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: [FunctionTool(name=‘sums’, parameters={‘type’: ‘object’, ‘properties’: {‘a’: {‘type’: ‘integer’, ‘description’: ‘First thing to sum’}, ‘b’: {‘type’: ‘integer’, ‘description’: ‘Second thing to sum’, ‘default’: 1}}, ‘required’: [‘a’, ‘b’], ‘additionalProperties’: False}, strict=True, type=‘function’, description=‘Adds a + b.:- type: integer’), FunctionTool(name=‘mults’, parameters={‘type’: ‘object’, ‘properties’: {‘a’: {‘type’: ‘integer’, ‘description’: ‘First thing to multiply’}, ‘b’: {‘type’: ‘integer’, ‘description’: ‘Second thing to multiply’, ‘default’: 1}}, ‘required’: [‘a’, ‘b’], ‘additionalProperties’: False}, strict=True, type=‘function’, description=‘Multiplies a * b.:- type: integer’)]
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=225, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=11, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=236)
user: None
billing: {‘payer’: ‘developer’}
store: True

Images

As everyone knows, when testing image APIs you have to use a cute puppy.

fn = Path('samples/puppy.jpg')
Image(filename=fn, width=200)

We create a Chat object as before:

chat = Chat(model, **chatkw)

Claudia expects images as a list of bytes, so we read in the file:

img = fn.read_bytes()

Prompts to Claudia can be lists, containing text, images, or both, eg:

chat([img, "In brief, what color flowers are in this image?"])

They are purple.

id: resp_07fbd059d2633f7000691bb902fd98819194e5dbe9d2bb9909
created_at: 1763424515.0
error: None
incomplete_details: None
instructions: None
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseReasoningItem(id=‘rs_07fbd059d2633f7000691bb90390cc819181e8ec1955d1f7ed’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_07fbd059d2633f7000691bb903d1b081918d272dbb0556c7d0’, content=[ResponseOutputText(annotations=[], text=‘They are purple.’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=102, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=10, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=112)
user: None
billing: {‘payer’: ‘developer’}
store: True

The image is included as input tokens.

chat.use

In: 102; Out: 10; Total: 112

Alternatively, Cosette supports creating a multi-stage chat with separate image and text prompts. For instance, you can pass just the image as the initial prompt (in which case the model will make some general comments about what it sees), and then follow up with questions in additional prompts:

chat = Chat(model, **chatkw)
chat(img)

This is a small puppy lying on grass next to purple flowers. It has white fur with brown patches, long floppy ears, and dark eyes—appears to be a young spaniel-type dog (e.g., Cavalier King Charles Spaniel).

id: resp_04ca2ec79ffca3d500691bb904744c81958f7812565de27daa
created_at: 1763424516.0
error: None
incomplete_details: None
instructions: None
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseReasoningItem(id=‘rs_04ca2ec79ffca3d500691bb904d1fc8195a696751f6ac10748’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_04ca2ec79ffca3d500691bb90503008195ac8efa68f28eccd1’, content=[ResponseOutputText(annotations=[], text=‘This is a small puppy lying on grass next to purple flowers. It has white fur with brown patches, long floppy ears, and dark eyes—appears to be a young spaniel-type dog (e.g., Cavalier King Charles Spaniel).’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=92, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=56, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=148)
user: None
billing: {‘payer’: ‘developer’}
store: True

chat('What direction is the puppy facing?')

The puppy is facing toward the camera (front-facing).

id: resp_04ca2ec79ffca3d500691bb906d068819599b648197ed3ff12
created_at: 1763424518.0
error: None
incomplete_details: None
instructions: None
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseReasoningItem(id=‘rs_04ca2ec79ffca3d500691bb9078c048195861906ac7c42938a’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_04ca2ec79ffca3d500691bb907b5188195a62b8261e5561acb’, content=[ResponseOutputText(annotations=[], text=‘The puppy is facing toward the camera (front-facing).’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=159, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=17, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=176)
user: None
billing: {‘payer’: ‘developer’}
store: True

chat('What color is it?')

The puppy is mostly white with brown patches (especially on the ears and around the eyes).

id: resp_04ca2ec79ffca3d500691bb90888b8819596a6c054a1750183
created_at: 1763424520.0
error: None
incomplete_details: None
instructions: None
metadata: {}
model: gpt-5-mini-2025-08-07
object: response
output: [ResponseReasoningItem(id=‘rs_04ca2ec79ffca3d500691bb9093cc48195baa2c98e9038941c’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_04ca2ec79ffca3d500691bb90965b48195ba6d501e5be77817’, content=[ResponseOutputText(annotations=[], text=‘The puppy is mostly white with brown patches (especially on the ears and around the eyes).’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
top_logprobs: 0
truncation: disabled
usage: ResponseUsage(input_tokens=185, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=24, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=209)
user: None
billing: {‘payer’: ‘developer’}
store: True

Note that the image is passed in again for every input in the dialog, so that number of input tokens increases quickly with this kind of chat.

chat.use

In: 436; Out: 97; Total: 533

Other providers

Here’s an example of using the library with Groq:

groq_c = Client(
    model="openai/gpt-oss-20b",
    api_key_env="GROQ_KEY",
    base_url="https://api.groq.com/openai/v1"
)

groq_c("Hello! What's 2+2?")

Sure! 2 + 2 equals 4.

id: resp_01kaa4wr25fb9sg6j2t0h74115
created_at: 1763424755.0
error: None
incomplete_details: None
instructions:
metadata: {}
model: openai/gpt-oss-20b
object: response
output: [ResponseReasoningItem(id=‘resp_01kaa4wr25fba9v3q7sqdmnmhr’, summary=[], type=‘reasoning’, content=[Content(text=‘We need to respond as ChatGPT with a friendly answer: 2+2=4. Ensure no policy conflicts.’, type=‘reasoning_text’)], encrypted_content=None, status=‘completed’), ResponseOutputMessage(id=‘msg_01kaa4wr25fbas7kjb537b2yw1’, content=[ResponseOutputText(annotations=[], text=‘Sure! 202f+02f2 equals 4.’, type=‘output_text’, logprobs=None)], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘medium’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=None)
top_logprobs: None
truncation: disabled
usage: ResponseUsage(input_tokens=79, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=47, output_tokens_details=OutputTokensDetails(reasoning_tokens=25), total_tokens=126)
user: None
store: False

gchat = Chat(cli=groq_c)
gchat("Hello! I'm Jeremy")

Hello Jeremy! How can I help you today?

id: resp_01kaa4wsaze4ms5ezjrezw1baa
created_at: 1763424757.0
error: None
incomplete_details: None
instructions:
metadata: {}
model: openai/gpt-oss-20b
object: response
output: [ResponseReasoningItem(id=‘resp_01kaa4wsaze4n9hze8pq6vvv75’, summary=[], type=‘reasoning’, content=[Content(text=‘User: “Hello! I'm Jeremy” We can greet. Possibly ask how can help.’, type=‘reasoning_text’)], encrypted_content=None, status=‘completed’), ResponseOutputMessage(id=‘msg_01kaa4wsaze4nsb4xcv927zww0’, content=[ResponseOutputText(annotations=[], text=‘Hello Jeremy! How can I help you today?’, type=‘output_text’, logprobs=None)], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘medium’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=None)
top_logprobs: None
truncation: disabled
usage: ResponseUsage(input_tokens=75, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=38, output_tokens_details=OutputTokensDetails(reasoning_tokens=19), total_tokens=113)
user: None
store: False

gchat("What's my name?")

You’re Jeremy!

id: resp_01kaa4wtmcfbrb3fn5vzpd2fvy
created_at: 1763424758.0
error: None
incomplete_details: None
instructions:
metadata: {}
model: openai/gpt-oss-20b
object: response
output: [ResponseReasoningItem(id=‘resp_01kaa4wtmcfbrrd02tzsq8z47y’, summary=[], type=‘reasoning’, content=[Content(text=‘User asks “What's my name?” The name is Jeremy from the greeting. So answer: Jeremy.’, type=‘reasoning_text’)], encrypted_content=None, status=‘completed’), ResponseOutputMessage(id=‘msg_01kaa4wtmcfbs8pa0xnvdkdkpq’, content=[ResponseOutputText(annotations=[], text=‘You’re Jeremy!’, type=‘output_text’, logprobs=None)], role=‘assistant’, status=‘completed’, type=‘message’)]
parallel_tool_calls: True
temperature: 1.0
tool_choice: auto
tools: []
top_p: 1.0
background: False
conversation: None
max_output_tokens: 4096
max_tool_calls: None
previous_response_id: None
prompt: None
prompt_cache_key: None
prompt_cache_retention: None
reasoning: Reasoning(effort=‘medium’, generate_summary=None, summary=None)
safety_identifier: None
service_tier: default
status: completed
text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=None)
top_logprobs: None
truncation: disabled
usage: ResponseUsage(input_tokens=99, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=34, output_tokens_details=OutputTokensDetails(reasoning_tokens=21), total_tokens=133)
user: None
store: False