cosette

Cosette is Claudette’s sister, a helper library for OpenAI’s GPT models.

Install

pip install cosette

Getting started

OpenAI’s Python SDK will automatically be installed with Cosette, if you don’t already have it.

from cosette import *

Cosette only exports the symbols that are needed to use the library, so you can use import * to import them. Alternatively, just use:

import cosette

…and then add the prefix cosette. to any usages of the module.

Cosette provides models, which is a list of models currently available from the SDK.

' '.join(models)
'gpt-5 gpt-5-mini gpt-5-nano o1-preview o1-mini gpt-4o gpt-4o-mini gpt-4-turbo gpt-4 gpt-4-32k gpt-3.5-turbo gpt-3.5-turbo-instruct o1 o3-mini chatgpt-4o-latest o1-pro o3 o4-mini gpt-4.1 gpt-4.1-mini gpt-4.1-nano'

For these examples, we’ll use GPT-5-mini.

model = models[1]
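
Indexing into models depends on the order of the list, which may change between releases. As a minimal sketch, the model can be picked by name instead. The list below is a hard-coded copy of the output above so that the snippet runs without Cosette installed, and pick_model is a hypothetical helper, not part of the library:

```python
# Copy of the list printed above; in real use this would be cosette's `models`.
models = ('gpt-5 gpt-5-mini gpt-5-nano o1-preview o1-mini gpt-4o gpt-4o-mini '
          'gpt-4-turbo gpt-4 gpt-4-32k gpt-3.5-turbo gpt-3.5-turbo-instruct o1 '
          'o3-mini chatgpt-4o-latest o1-pro o3 o4-mini gpt-4.1 gpt-4.1-mini '
          'gpt-4.1-nano').split()

def pick_model(name, available):
    "Return `name` if it is listed in `available`, otherwise raise a clear error."
    if name not in available:
        raise ValueError(f"{name!r} is not in the list of available models")
    return name

model = pick_model('gpt-5-mini', models)
print(model)
```

With the list shown above, this selects the same model as models[1].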

Chat

The main interface to Cosette is the Chat class, which provides a stateful interface to the models. You can pass message keywords either when you create the Chat or when you call it.

chatkw = dict(
    text={ "verbosity": "low" },
    reasoning={ "effort": "minimal" }
)
chat = Chat(model, sp="You are a helpful and concise assistant.", **chatkw)
chat("I'm Jeremy")

Nice to meet you, Jeremy. How can I help you today?

  • id: resp_6897d6bf742081a29d6f5ed0d4fcc6b905935ac4b74d3abc
  • created_at: 1754781375.0
  • error: None
  • incomplete_details: None
  • instructions: You are a helpful and concise assistant.
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseReasoningItem(id=‘rs_6897d6bfbc3481a294c74c025cf3965605935ac4b74d3abc’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_6897d6bfd4e481a2a3b96e6859d73d0005935ac4b74d3abc’, content=[ResponseOutputText(annotations=[], text=‘Nice to meet you, Jeremy. How can I help you today?’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: []
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=20, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=20, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=40)
  • user: None
  • store: True
r = chat("What's my name?")
r

Your name is Jeremy.

  • id: resp_6897d6c249c481a2ab32c89f1f9e5ffe05935ac4b74d3abc
  • created_at: 1754781378.0
  • error: None
  • incomplete_details: None
  • instructions: You are a helpful and concise assistant.
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseReasoningItem(id=‘rs_6897d6c31b3081a29771f246090620aa05935ac4b74d3abc’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_6897d6c327a081a2bae1eec5cd30917305935ac4b74d3abc’, content=[ResponseOutputText(annotations=[], text=‘Your name is Jeremy.’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: []
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=48, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=11, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=59)
  • user: None
  • store: True

As you see above, displaying the results of a call in a notebook shows just the message contents, with the other details hidden behind a collapsible section. Alternatively you can print the details:

print(r)
Response(id='resp_6897d6c249c481a2ab32c89f1f9e5ffe05935ac4b74d3abc', created_at=1754781378.0, error=None, incomplete_details=None, instructions='You are a helpful and concise assistant.', metadata={}, model='gpt-5-mini-2025-08-07', object='response', output=[ResponseReasoningItem(id='rs_6897d6c31b3081a29771f246090620aa05935ac4b74d3abc', summary=[], type='reasoning', content=None, encrypted_content=None, status=None), ResponseOutputMessage(id='msg_6897d6c327a081a2bae1eec5cd30917305935ac4b74d3abc', content=[ResponseOutputText(annotations=[], text='Your name is Jeremy.', type='output_text', logprobs=[])], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, max_output_tokens=4096, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort='minimal', generate_summary=None, summary=None), safety_identifier=None, service_tier='default', status='completed', text=ResponseTextConfig(format=ResponseFormatText(type='text'), verbosity='low'), top_logprobs=0, truncation='disabled', usage=In: 48; Out: 11; Total: 59, user=None, store=True)

You can use stream=True to stream the results as soon as they arrive (although you will only see the gradual generation if you execute the notebook yourself, of course!)

for o in chat("What's your name?", stream=True): print(o, end='')
I'm ChatGPT.
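
If the complete text is also needed once streaming has finished, the chunks can be collected while they are printed. The sketch below uses a hypothetical stand-in generator, fake_stream, that yields plain strings, which is how the loop above treats each streamed item. No API call is made:

```python
def fake_stream():
    "Hypothetical stand-in for `chat(..., stream=True)`; yields text chunks."
    yield from ["I'm ", "Chat", "GPT."]

chunks = []
for o in fake_stream():
    print(o, end='')   # show each chunk as soon as it arrives
    chunks.append(o)   # keep it so the full reply can be rebuilt afterwards

full_text = ''.join(chunks)
```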

Model Capabilities

Different OpenAI models have different capabilities. Some models, such as o1-mini, do not support streaming, system prompts, or temperature. You can query these capabilities with the following functions:

# o1 supports streaming and system prompts, but not setting the temperature
can_stream('o1'), can_set_sp('o1'), can_set_temp('o1')
(True, True, False)
# gpt-4o has these capabilities
can_stream('gpt-4o'), can_set_sp('gpt-4o'), can_set_temp('gpt-4o')
(True, True, True)

Tool use

Tool use lets the model use external tools.

We use docments to make defining Python functions as ergonomic as possible. Each parameter (and the return value) should have a type, and a docments comment with the description of what it is. As an example we’ll write a simple function that adds numbers together, and will tell us when it’s being called:

def sums(
    a:int,  # First thing to sum
    b:int=1 # Second thing to sum
) -> int: # The sum of the inputs
    "Adds a + b."
    print(f"Finding the sum of {a} and {b}")
    return a + b
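
To give a feel for what a function-tool description contains, here is a simplified sketch that builds a schema from a function signature using only the standard library. It is not how Cosette does it: the parameter descriptions are passed in explicitly instead of being read from docments comments, simple_schema and JSON_TYPES are hypothetical names, and the output shape only approximates the tools field shown in the responses in this document:

```python
import inspect

# Mapping from Python annotations to JSON Schema type names (small subset).
JSON_TYPES = {int: 'integer', float: 'number', str: 'string', bool: 'boolean'}

def sums(a: int, b: int = 1) -> int:
    "Adds a + b."
    return a + b

def simple_schema(func, descriptions):
    "Build a function-tool style schema from `func`'s signature."
    props = {}
    for name, p in inspect.signature(func).parameters.items():
        prop = {'type': JSON_TYPES[p.annotation], 'description': descriptions[name]}
        # Record the default value when the parameter has one.
        if p.default is not inspect.Parameter.empty: prop['default'] = p.default
        props[name] = prop
    return {'name': func.__name__, 'description': inspect.getdoc(func),
            'parameters': {'type': 'object', 'properties': props,
                           'required': list(props), 'additionalProperties': False}}

schema = simple_schema(sums, {'a': 'First thing to sum', 'b': 'Second thing to sum'})
print(schema['parameters']['properties'])
```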

Sometimes the model will say something like “according to the sums tool the answer is” – generally we’d rather it just tells the user the answer, so we can use a system prompt to help with this:

sp = "Never mention what tools you use."

We’ll get the model to add up some long numbers:

a,b = 604542,6458932
pr = f"What is {a}+{b}?"
pr
'What is 604542+6458932?'

To use tools, pass a list of them to Chat:

chat = Chat(model, sp=sp, tools=[sums], **chatkw)

Now when we call that with our prompt, the model doesn’t return the answer, but instead returns a function_call item, which means we have to call the named tool with the provided parameters:

r = chat(pr)
r.output
Finding the sum of 604542 and 6458932
[ResponseReasoningItem(id='rs_6897d70b74b4819f97bd7ccb549bce65056524066556f4c9', summary=[], type='reasoning', content=None, encrypted_content=None, status=None),
 ResponseFunctionToolCall(arguments='{"a":604542,"b":6458932}', call_id='call_RFHB5vQbpgkpuPGN2cdSMHlJ', name='sums', type='function_call', id='fc_6897d70bab7c819fb48d16260fb4118c056524066556f4c9', status='completed')]
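
Done by hand, calling the named tool with the provided parameters could look roughly like the sketch below. It assumes only what is visible in the output above: a tool name and a JSON string of arguments. run_tool_call is a hypothetical helper, not a Cosette function:

```python
import json

def sums(a: int, b: int = 1) -> int:
    "Adds a + b."
    return a + b

def run_tool_call(name, arguments, namespace):
    "Look up `name` in `namespace` and call it with the JSON-decoded `arguments`."
    func = namespace[name]
    return func(**json.loads(arguments))

# Name and arguments copied from the ResponseFunctionToolCall shown above.
result = run_tool_call('sums', '{"a":604542,"b":6458932}', {'sums': sums})
print(result)
```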

Cosette handles all that for us – we just have to pass along the message, and it all happens automatically:

chat()

7,063,474

  • id: resp_6897d70dd204819fbb21bfd7608ae6dd056524066556f4c9
  • created_at: 1754781453.0
  • error: None
  • incomplete_details: None
  • instructions: Never mention what tools you use.
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseOutputMessage(id=‘msg_6897d70e51b0819f8993eba3ec6ca440056524066556f4c9’, content=[ResponseOutputText(annotations=[], text=‘7,063,474’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: [FunctionTool(name=‘sums’, parameters={‘type’: ‘object’, ‘properties’: {‘a’: {‘type’: ‘integer’, ‘description’: ‘First thing to sum’}, ‘b’: {‘type’: ‘integer’, ‘description’: ‘Second thing to sum’, ‘default’: 1}}, ‘required’: [‘a’, ‘b’], ‘additionalProperties’: False}, strict=True, type=‘function’, description=‘Adds a + b.:- type: integer’)]
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=142, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=9, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=151)
  • user: None
  • store: True

You can see how many tokens have been used at any time by checking the use property.

chat.use
In: 231; Out: 36; Total: 267

Tool loop

def show(x):
    if getattr(x, 'output_text', None): display(x)

We can do everything needed to use tools in a single step, by using Chat.toolloop. This can even call multiple tools as needed to solve a problem. For example, let’s define a tool to handle multiplication:

def mults(
    a:int,  # First thing to multiply
    b:int=1 # Second thing to multiply
) -> int: # The product of the inputs
    "Multiplies a * b."
    print(f"Finding the product of {a} and {b}")
    return a * b

Now with a single call we can calculate (a+b)*2:

chat = Chat(model, tools=[sums,mults], **chatkw)
pr = f'Calculate ({a}+{b})*2 and display the result as US$'
pr
'Calculate (604542+6458932)*2 and display the result as US$'
r = chat.toolloop(pr)
for o in r: show(o)
Finding the sum of 604542 and 6458932
Finding the product of 7063474 and 2

US$14,126,948

  • id: resp_6897e3bb932081969ee99109c1b66db20b5a46ab1de6956b
  • created_at: 1754784699.0
  • error: None
  • incomplete_details: None
  • instructions: None
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseOutputMessage(id=‘msg_6897e3bbe2948196a0e696d834f92b5a0b5a46ab1de6956b’, content=[ResponseOutputText(annotations=[], text=‘US$14,126,948’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: [FunctionTool(name=‘sums’, parameters={‘type’: ‘object’, ‘properties’: {‘a’: {‘type’: ‘integer’, ‘description’: ‘First thing to sum’}, ‘b’: {‘type’: ‘integer’, ‘description’: ‘Second thing to sum’, ‘default’: 1}}, ‘required’: [‘a’, ‘b’], ‘additionalProperties’: False}, strict=True, type=‘function’, description=‘Adds a + b.:- type: integer’), FunctionTool(name=‘mults’, parameters={‘type’: ‘object’, ‘properties’: {‘a’: {‘type’: ‘integer’, ‘description’: ‘First thing to multiply’}, ‘b’: {‘type’: ‘integer’, ‘description’: ‘Second thing to multiply’, ‘default’: 1}}, ‘required’: [‘a’, ‘b’], ‘additionalProperties’: False}, strict=True, type=‘function’, description=‘Multiplies a * b.:- type: integer’)]
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=225, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=11, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=236)
  • user: None
  • store: True
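
Conceptually, a tool loop keeps calling the model, running whichever tool it requests and feeding the result back, until the model returns an ordinary message. The sketch below illustrates that idea with a scripted stand-in for the model. It is not Cosette's implementation: scripted_model, tool_loop, and the dictionary shapes are hypothetical simplifications, and no API call is made:

```python
import json

def sums(a: int, b: int = 1) -> int: return a + b
def mults(a: int, b: int = 1) -> int: return a * b

def scripted_model(history):
    "Hypothetical stand-in for the model: asks for sums, then mults, then answers."
    results = [m['output'] for m in history if m['type'] == 'function_call_output']
    if len(results) == 0:
        return {'type': 'function_call', 'name': 'sums',
                'arguments': '{"a":604542,"b":6458932}'}
    if len(results) == 1:
        return {'type': 'function_call', 'name': 'mults',
                'arguments': json.dumps({'a': results[0], 'b': 2})}
    return {'type': 'message', 'text': f'US${results[1]:,}'}

def tool_loop(prompt, model, tools, max_steps=10):
    "Call `model` repeatedly, running requested tools, until it returns a message."
    ns = {f.__name__: f for f in tools}
    history = [{'type': 'message', 'text': prompt}]
    for _ in range(max_steps):
        r = model(history)
        history.append(r)
        if r['type'] != 'function_call': return r['text']
        # Run the requested tool and record its result for the next model call.
        out = ns[r['name']](**json.loads(r['arguments']))
        history.append({'type': 'function_call_output', 'output': out})
    raise RuntimeError('max_steps reached without a final message')

answer = tool_loop('Calculate (604542+6458932)*2 and display the result as US$',
                   scripted_model, [sums, mults])
print(answer)
```

The max_steps limit is a design choice in this sketch to guard against a model that never stops requesting tools.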

Images

As everyone knows, when testing image APIs you have to use a cute puppy.

from pathlib import Path
from IPython.display import Image

fn = Path('samples/puppy.jpg')
Image(filename=fn, width=200)

We create a Chat object as before:

chat = Chat(model, **chatkw)

Cosette expects images as bytes, so we read in the file:

img = fn.read_bytes()
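
For background, OpenAI's Responses API accepts inline images as input_image content parts whose image_url is a base64 data URL. The sketch below shows that encoding step on its own. Whether Cosette builds exactly this structure internally is an assumption, and image_part is a hypothetical helper:

```python
import base64

def image_part(data: bytes, mime: str = 'image/jpeg') -> dict:
    "Wrap raw image bytes as an `input_image` content part with a base64 data URL."
    b64 = base64.b64encode(data).decode('ascii')
    return {'type': 'input_image', 'image_url': f'data:{mime};base64,{b64}'}

# The first three bytes of a JPEG file, used here as dummy data.
part = image_part(b'\xff\xd8\xff')
print(part['image_url'])
```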

Prompts to Cosette can be lists containing text, images, or both, e.g.:

chat([img, "In brief, what color flowers are in this image?"])

Purple.

  • id: resp_6897d7493f148197a7555683dead90e504a418ea2e45fb88
  • created_at: 1754781513.0
  • error: None
  • incomplete_details: None
  • instructions: None
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseReasoningItem(id=‘rs_6897d74a033081979b865a7ceeb4084e04a418ea2e45fb88’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_6897d74a1d70819790b7a077a0dfcd0804a418ea2e45fb88’, content=[ResponseOutputText(annotations=[], text=‘Purple.’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: []
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=102, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=8, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=110)
  • user: None
  • store: True

The image is included as input tokens.

chat.use
In: 102; Out: 8; Total: 110

Alternatively, Cosette supports creating a multi-stage chat with separate image and text prompts. For instance, you can pass just the image as the initial prompt (in which case the model will make some general comments about what it sees), and then follow up with questions in additional prompts:

chat = Chat(model, **chatkw)
chat(img)

What would you like to know or do with this photo of the puppy? (e.g., identify breed, give care tips, help with caption, edit suggestions)

  • id: resp_6897d74cc47881a3957fdb4946c7e02603dfba9cf4322ea6
  • created_at: 1754781516.0
  • error: None
  • incomplete_details: None
  • instructions: None
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseReasoningItem(id=‘rs_6897d74d2bfc81a39f3ffde4ccf99c9503dfba9cf4322ea6’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_6897d74d44e481a3971238c119e8f71d03dfba9cf4322ea6’, content=[ResponseOutputText(annotations=[], text=‘What would you like to know or do with this photo of the puppy? (e.g., identify breed, give care tips, help with caption, edit suggestions)’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: []
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=92, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=39, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=131)
  • user: None
  • store: True
chat('What direction is the puppy facing?')

The puppy is facing toward the camera (slightly to the viewer’s right).

  • id: resp_6897d74eab1c81a390df5c03ad22415403dfba9cf4322ea6
  • created_at: 1754781518.0
  • error: None
  • incomplete_details: None
  • instructions: None
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseReasoningItem(id=‘rs_6897d74f1be481a3ad6c67714d67106403dfba9cf4322ea6’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_6897d74f336081a3820708998493d90f03dfba9cf4322ea6’, content=[ResponseOutputText(annotations=[], text=‘The puppy is facing toward the camera (slightly to the viewer’s right).’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: []
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=142, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=22, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=164)
  • user: None
  • store: True
chat('What color is it?')

The puppy is mostly white with brown patches (brown ears and brown markings on the face).

  • id: resp_6897d74fede481a389bac0f29103f0a903dfba9cf4322ea6
  • created_at: 1754781520.0
  • error: None
  • incomplete_details: None
  • instructions: None
  • metadata: {}
  • model: gpt-5-mini-2025-08-07
  • object: response
  • output: [ResponseReasoningItem(id=‘rs_6897d750c6e881a3bc2c39941787c97303dfba9cf4322ea6’, summary=[], type=‘reasoning’, content=None, encrypted_content=None, status=None), ResponseOutputMessage(id=‘msg_6897d750e8ec81a38c30a0f7c6dce6d203dfba9cf4322ea6’, content=[ResponseOutputText(annotations=[], text=‘The puppy is mostly white with brown patches (brown ears and brown markings on the face).’, type=‘output_text’, logprobs=[])], role=‘assistant’, status=‘completed’, type=‘message’)]
  • parallel_tool_calls: True
  • temperature: 1.0
  • tool_choice: auto
  • tools: []
  • top_p: 1.0
  • background: False
  • max_output_tokens: 4096
  • max_tool_calls: None
  • previous_response_id: None
  • prompt: None
  • prompt_cache_key: None
  • reasoning: Reasoning(effort=‘minimal’, generate_summary=None, summary=None)
  • safety_identifier: None
  • service_tier: default
  • status: completed
  • text: ResponseTextConfig(format=ResponseFormatText(type=‘text’), verbosity=‘low’)
  • top_logprobs: 0
  • truncation: disabled
  • usage: ResponseUsage(input_tokens=173, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=24, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=197)
  • user: None
  • store: True

Note that the image is passed in again for every input in the dialog, so the number of input tokens increases quickly with this kind of chat.

chat.use
In: 407; Out: 85; Total: 492
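
As a quick check, the totals reported by chat.use are the sums of the per-call usage figures shown in the three responses above. The snippet copies those numbers by hand and does not call the API:

```python
# (input_tokens, output_tokens) copied from the three responses above.
calls = [(92, 39), (142, 22), (173, 24)]

total_in = sum(i for i, _ in calls)
total_out = sum(o for _, o in calls)
print(f'In: {total_in}; Out: {total_out}; Total: {total_in + total_out}')
```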