!brew list | grep postgres
postgresql@18
fastasyncpg is a simple wrapper for asyncpg. We’ll explain how it works and build up the module in a “literate” nbdev style.
On macOS, the recommended way to install PostgreSQL is via Homebrew. Other options include Postgres.app (a menubar app) and the EDB installer, but Homebrew integrates best with command-line workflows and makes updates simple.
To install PostgreSQL 18 (the current latest stable release):
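With Homebrew that's a one-liner (the formula name matches the `brew info` output shown below):

```
!brew install postgresql@18
```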
To check if you already have PostgreSQL installed via Homebrew, run brew list | grep postgres. You can also check which version is in your PATH with psql --version.
Let’s verify the installation is working:
After installation, run brew info postgresql@18 to see setup instructions. PostgreSQL 18 is “keg-only”, meaning it’s not automatically symlinked into your PATH.
You’ll see something like:
This formula has created a default database cluster with:
initdb --locale=en_US.UTF-8 -E UTF-8 /opt/homebrew/var/postgresql@18
When uninstalling, some dead symlinks are left behind so you may want to run:
brew cleanup --prune-prefix
postgresql@18 is keg-only, which means it was not symlinked into /opt/homebrew,
because this is an alternate version of another formula.
If you need to have postgresql@18 first in your PATH, run:
echo 'export PATH="/opt/homebrew/opt/postgresql@18/bin:$PATH"' >> /Users/jhoward/.bash_profile
To start postgresql@18 now and restart at login:
brew services start postgresql@18
The brew info output (above) tells you exactly what to do:
echo 'export PATH="/opt/homebrew/opt/postgresql@18/bin:$PATH"' >> ~/.bash_profile && source ~/.bash_profile
brew services start postgresql@18
This registers PostgreSQL to start automatically at login.
To run non-interactive queries from a shell, use -c to pass a command directly:
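For example, assuming the keg's `bin` directory is now on your PATH (per the step above), something like:

```
!psql postgres -c "select version();"
```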
version
-----------------------------------------------------------------------------------------------------------------------------
PostgreSQL 18.1 (Homebrew) on aarch64-apple-darwin25.2.0, compiled by Apple clang version 17.0.0 (clang-1700.6.3.2), 64-bit
(1 row)
Running brew services start registers PostgreSQL to start automatically at login/reboot. You can verify this with brew services list, which shows all Homebrew-managed services and their status.
Name Status User File
cloudflared none
postgresql@18 started jhoward ~/Library/LaunchAgents/homebrew.mxcl.postgresql@18.plist
unbound none
To control auto-start behavior:

- brew services stop postgresql@18 — stop and disable auto-start
- brew services start postgresql@18 — start and enable auto-start
- brew services run postgresql@18 — run once without enabling auto-start
On Ubuntu, the standard way to get the latest PostgreSQL is through the official PostgreSQL APT repository (PGDG), since Ubuntu’s default repos often have older versions.
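A rough sketch of the usual steps, using the automated repository script that ships with `postgresql-common` (check the PGDG wiki for the current instructions before running this):

```
!sudo apt install -y postgresql-common
!sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh -y
!sudo apt install -y postgresql-18
```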
The most popular Python libraries for PostgreSQL are psycopg2/psycopg3 (synchronous) and asyncpg (async). For async work, asyncpg is about 5x faster than psycopg3 and is the recommended choice.
We’ll use asyncpg for this wrapper — it’s the fastest Python PostgreSQL library for async code.
asyncpg uses await for all database operations. The connect function returns a connection object, and fetchval is a convenience method that returns a single value from the first row:
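For example (a minimal sketch, assuming a local server and the default `postgres` database):

```python
import asyncpg

conn = await asyncpg.connect(database='postgres')
await conn.fetchval('select version()')
```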
'PostgreSQL 18.1 (Homebrew) on aarch64-apple-darwin25.2.0, compiled by Apple clang version 17.0.0 (clang-1700.6.3.2), 64-bit'
Let’s create a simple test table to explore basic operations:
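Something like this (the table name and columns here are assumptions, chosen to match the rows shown later):

```python
await conn.execute('''
CREATE TABLE IF NOT EXISTS test_tbl (
    id   SERIAL PRIMARY KEY,
    name TEXT,
    age  INT
)''')
```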
'CREATE TABLE'
Great! Now let’s insert some data:
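For instance (again using the assumed `test_tbl`):

```python
await conn.execute("INSERT INTO test_tbl (name, age) VALUES ($1, $2)", 'Alice', 30)
```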
PostgreSQL uses $1, $2, etc. for parameterized queries, not ? like SQLite. This syntax allows you to reference the same parameter multiple times and makes the order explicit.
fetch returns a list of Record objects. Each record supports dict-like access by column name or index:
asyncpg.Record objects use dict-like access (r['name'] or r[0]), not attribute access. You can use dict2obj if you want the latter.
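For example:

```python
rows = await conn.fetch('SELECT * FROM test_tbl')
rows[0]['name'], rows[0][0]
```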
Unlike psycopg2, asyncpg doesn’t use traditional cursors. Instead, use async for record in conn.cursor(...) to iterate over results. However, cursors in asyncpg require an explicit transaction:
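A minimal sketch:

```python
async with conn.transaction():
    async for r in conn.cursor('SELECT * FROM test_tbl'):
        print(r)
```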
<Record id=1 name='Alice' age=30>
By default, asyncpg operates in auto-commit mode — changes are applied immediately when not in an explicit transaction block. Regular queries (execute, fetch, etc.) don’t need transactions, but cursors do. This is a direct reflection of how PostgreSQL itself handles “portals” (the underlying mechanism for cursors).
In PostgreSQL, a portal is generally only valid for the duration of a transaction. If a transaction isn’t explicitly started, PostgreSQL runs each command in its own “one-shot” transaction. For a cursor to stay open so you can fetch multiple batches of rows, the transaction it belongs to must remain open.
Other libraries (like psycopg2) often hide this by starting a transaction for you automatically when you create a cursor, whereas asyncpg chooses to be more “explicit” about the underlying database state.
For testing with a real dataset, we’ll use the Chinook sample database, which has tables for artists, albums, tracks, etc. The PostgreSQL version is available on GitHub:
Now we need to create a database and run that script. First, let’s create a database called chinook:
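One way, using the `createdb` tool that ships with PostgreSQL:

```
!createdb chinook
```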
Then we can load the SQL file into it:
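Something like the following, where the filename is a placeholder for wherever you saved the Chinook SQL script:

```
!psql chinook -f Chinook_PostgreSql.sql
```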
Always close connections when done — this releases the database connection back to PostgreSQL:
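In asyncpg that's just:

```python
await conn.close()
```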
Now let’s connect to the Chinook database to work with more realistic data:
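For example:

```python
conn = await asyncpg.connect(database='chinook')
```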
Results is a simple list subclass that renders as an HTML table in notebooks. It displays all rows with column headers, making query results easy to read:
Built-in mutable sequence.
If no argument is given, the constructor creates a new empty list. The argument must be an iterable if specified.
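Here's a rough sketch of the idea (not the exact fastasyncpg implementation):

```python
class Results(list):
    "Query results that render as an HTML table in notebooks"
    def _repr_html_(self):
        if not self: return ''
        hdr  = ''.join(f'<th>{c}</th>' for c in self[0].keys())
        rows = ''.join('<tr>' + ''.join(f'<td>{v}</td>' for v in r.values()) + '</tr>' for r in self)
        return f'<table><thead><tr>{hdr}</tr></thead><tbody>{rows}</tbody></table>'
```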
sql is a quick helper to run SQL and return results as a list of records:
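A sketch of the kind of helper this is (assuming the open `conn` from above):

```python
async def sql(s, *args): return Results(await conn.fetch(s, *args))
```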
| column_name | data_type | is_nullable |
|---|---|---|
| artist_id | integer | NO |
| name | character varying | YES |
In PostgreSQL, every table belongs to a schema — think of it as a folder or namespace. The default schema is public. When you create a table without specifying a schema, it lands there. You can query schema information via information_schema views:
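For example, the column metadata for `artist`, via `information_schema.columns`:

```python
await sql("""select column_name, data_type, is_nullable
             from information_schema.columns
             where table_schema = 'public' and table_name = 'artist'""")
```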
| column_name | data_type | is_nullable |
|---|---|---|
| artist_id | integer | NO |
| name | character varying | YES |
To list all tables in the public schema, you can query the information_schema.tables view:
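For example:

```python
await conn.fetch("select table_name from information_schema.tables where table_schema = 'public'")
```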
[<Record table_name='artist'>,
<Record table_name='album'>,
<Record table_name='employee'>,
<Record table_name='customer'>,
<Record table_name='invoice'>,
<Record table_name='invoice_line'>,
<Record table_name='track'>,
<Record table_name='playlist'>,
<Record table_name='playlist_track'>,
<Record table_name='genre'>,
<Record table_name='media_type'>,
<Record table_name='cats'>,
<Record table_name='cat'>,
<Record table_name='dog'>]
To customize how records behave, we need access to the underlying Record class:
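asyncpg exposes it at the top level:

```python
from asyncpg import Record
```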
FRecord extends asyncpg’s Record with two conveniences: attribute access (r.name instead of r['name']) and HTML rendering for notebooks:
Initialize self. See help(type(self)) for accurate signature.
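A rough sketch of the two pieces (the real FRecord may differ in the details):

```python
class FRecord(Record):
    def __getattr__(self, k):
        try: return self[k]
        except KeyError: raise AttributeError(k) from None
    def _repr_html_(self):
        hdr  = ''.join(f'<th>{c}</th>' for c in self.keys())
        vals = ''.join(f'<td>{v}</td>' for v in self.values())
        return f'<table><tr>{hdr}</tr><tr>{vals}</tr></table>'
```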
We can use record_class to auto-wrap with FRecord:
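For example:

```python
conn = await asyncpg.connect(database='chinook', record_class=FRecord)
```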
table_names and view_names query PostgreSQL’s system catalogs to list tables and views in a schema. We use pg_class and pg_namespace rather than information_schema for better performance:
List of view names in schema
List of table names in schema
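A simplified sketch of the kind of catalog query involved:

```python
async def table_names(conn, schema='public'):
    "List of table names in schema"
    q = """select c.relname from pg_class c
           join pg_namespace n on n.oid = c.relnamespace
           where c.relkind = 'r' and n.nspname = $1"""
    return [r[0] for r in await conn.fetch(q, schema)]
```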
artist album employee customer invoice invoice_line track playlist playlist_track genre media_type cats cat dog
columns_info returns a dict mapping column names to their PostgreSQL data types. It queries pg_attribute directly for efficiency:
Dict mapping column names to data types for table
We’ll need to know each table’s primary key. PostgreSQL stores this in pg_index. The ::regclass cast is idiomatic PostgreSQL — it converts a table name string to its internal object ID, automatically handling schema resolution. The indisprimary flag identifies the primary key index.
Get primary key column(s) for table
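A sketch of that lookup (the helper name here is illustrative):

```python
async def pk_cols(conn, tbl):
    "Get primary key column(s) for table"
    q = """select a.attname
           from pg_index i
           join pg_attribute a on a.attrelid = i.indrelid and a.attnum = any(i.indkey)
           where i.indrelid = $1::regclass and i.indisprimary"""
    return [r[0] for r in await conn.fetch(q, tbl)]
```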
Database wraps an asyncpg connection (or pool) and provides table/view metadata caching. It delegates unknown attributes to the underlying connection via __getattr__, so you can call db.fetch(...) directly. The t property returns a _TablesGetter for convenient table access.
Initialize self. See help(type(self)) for accurate signature.
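The delegation idea in miniature (not the full class):

```python
class Database:
    def __init__(self, conn): self.conn = conn
    def __getattr__(self, k): return getattr(self.conn, k)  # so db.fetch(...) just works
```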
Table represents a database table with metadata like columns and primary keys. The xtra method lets you set persistent row filters (useful for multi-tenancy). Tables stringify as quoted identifiers for safe SQL interpolation.
Initialize self. See help(type(self)) for accurate signature.
_Getter is a base class for “magic accessor” objects that provide multiple ways to access items — by attribute (dt.artist), by index (dt['artist']), or by iteration (for t in dt). It implements __dir__ so tab-completion works in notebooks, __repr__ for nice display, __contains__ for in checks, and both __getattr__ and __getitem__ for flexible access.
_TablesGetter specializes this for tables, reading from the database’s _tnames list. The db.t property returns one of these, giving you a clean API: db.t.artist instead of db.table('artist').
connect is our main entry point — it creates an asyncpg connection with FRecord as the default record class, sets up JSON codecs, and returns a Database wrapper with metadata already loaded:
'postgresql://jhoward@127.0.0.1:5432/chinook'
_Col represents a single column, with __str__ returning fully-qualified SQL ("table"."column"). _ColsGetter follows the same pattern as _TablesGetter — it’s a magic accessor that lets you write artist.c.name to get a column reference. The __call__ method returns all columns as _Col objects, useful for building queries programmatically.
Columns stringify in a format suitable for including in SQL statements.
Tables and views do the same.
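For example:

```python
artist = db.t.artist
str(artist), str(artist.c.name)
# -> ('"artist"', '"artist"."name"')
```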
We will support ? in addition to the standard PostgreSQL $1-style placeholders, by using sqlparse to parse the query.
sqlparse returns a token list per statement, eg:
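For example:

```python
import sqlparse

for t in sqlparse.parse('SELECT * FROM artist WHERE artist_id = ?')[0].flatten():
    print(t.ttype, repr(t.value))
```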
Token.Keyword.DML 'SELECT'
Token.Text.Whitespace ' '
Token.Wildcard '*'
Token.Text.Whitespace ' '
Token.Keyword 'FROM'
Token.Text.Whitespace ' '
Token.Name 'artist'
Token.Text.Whitespace ' '
Token.Keyword 'WHERE'
Token.Text.Whitespace ' '
Token.Name 'artist_id'
Token.Text.Whitespace ' '
Token.Operator.Comparison '='
Token.Text.Whitespace ' '
Token.Name.Placeholder '?'
Convert ? and :name placeholders to PostgreSQL $n style
SELECT * FROM a WHERE id = $1;
SELECT * FROM b WHERE name = $2
('SELECT * FROM artist WHERE artist_id = $1 AND name = $1', [])
We get back values for the kwargs in the order they appear:
('SELECT * FROM artist WHERE artist_id = $1 AND name = $2', [5, 'AC/DC'])
A repeated kw placeholder name results in a single arg result:
('SELECT * FROM a WHERE id = $1 OR creator = $1', [42])
Placeholder types can be mixed:
('SELECT * FROM a WHERE id = $1 AND name = $3 and foo=$2', ['test'])
db.q is a convenience method that runs a SQL query and wraps results in a Results list for nice HTML rendering:
| artist_id | name |
|---|---|
| 1 | AC/DC |
PostgreSQL has many data types that map to Python equivalents. We’ll import the Python types we need:
get_typ extracts the base PostgreSQL type (stripping size specifiers like (120)) and maps it to the corresponding Python type:
Get Python type for PostgreSQL type string
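A sketch, assuming the `pg_to_py` mapping described next:

```python
def get_typ(t):
    "Get Python type for PostgreSQL type string"
    return pg_to_py.get(t.split('(')[0].strip(), str)
```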
The pg_to_py dict maps PostgreSQL type names to Python types. This covers the most common types — numeric, string, temporal, JSON, network, and geometric:
asyncpg doesn’t automatically decode JSON columns — we need to register custom codecs. The setup_json function configures both json and jsonb types to use Python’s json module:
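A sketch based on asyncpg's `set_type_codec` API:

```python
import json

async def setup_json(conn):
    for typ in ('json', 'jsonb'):
        await conn.set_type_codec(typ, encoder=json.dumps, decoder=json.loads, schema='pg_catalog')
```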
We’ll re-define connect to use json now:
We’ll use Python’s dataclasses module to auto-generate typed classes from table schemas:
With the type mapping in place, we can auto-generate Python dataclasses from table schemas. The _get_flds helper extracts field definitions, and dataclass() creates a dataclass matching the table structure. We use flexiclass from fastcore to make the dataclass more flexible (allowing partial instantiation).
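For example (the same `.dataclass()` method is used with `album` further below):

```python
Artist = db.t.artist.dataclass()
Artist(artist_id=1, name='AC/DC')
```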
Artist(artist_id=1, name='AC/DC')
You can get the definition of the dataclass using fastcore’s dataclass_src:
all_dcs generates dataclasses for every table (and optionally views) in the database. This is useful for type-checking and IDE autocompletion:
dataclasses for all objects in db
The xtra method (defined earlier) lets you set persistent filters on a table. The _add_xtra helper injects these constraints into WHERE clauses. This is useful for multi-tenant apps or any situation where you want automatic row filtering — e.g., album.xtra(artist_id=1) ensures all subsequent queries only see albums by artist 1.
__getitem__ provides dict-style access by primary key. It raises NotFoundError if the row doesn’t exist (or doesn’t match xtra constraints). If a dataclass has been generated for the table, results are automatically converted to that type.
Get row by primary key, raising NotFoundError if missing
Common base class for all non-exit exceptions.
album = dt.album
Album = album.dataclass()
print("Album 1:", await album[1])
print("Artist ID of album 1:", (await album[1]).artist_id)
album.xtra(artist_id=1)
print("\nWith xtra(artist_id=1):")
print("Album 1:", await album[1]) # Should work - album 1 is by artist 1
try: await album[2] # Album 2 is by a different artist
except NotFoundError as e: print('Error correctly raised:', type(e))
Album 1: Album(album_id=1, title='For Those About To Rock We Salute You', artist_id=1)
Artist ID of album 1: 1
With xtra(artist_id=1):
Album 1: Album(album_id=1, title='For Those About To Rock We Salute You', artist_id=1)
Error correctly raised: <class '__main__.NotFoundError'>
get is the “safe” version of __getitem__ — it returns None instead of raising an exception when a row isn’t found. This mirrors the pattern in fastlite and Python’s dict.get().
Get row by primary key, or None if missing
rows_where is the core query method. It builds SQL from its parameters, applies xtra constraints, and optionally converts results to the table’s dataclass. Unlike psycopg2/sqlite which use ? placeholders, PostgreSQL uses $1, $2 positional parameters.
Iterate over rows matching where clause
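For example, using the `artist` and `album` table objects from above:

```python
await artist.rows_where(limit=3)              # first three artists
await album.rows_where('artist_id = ?', [1])  # albums by artist 1
```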
[Artist(artist_id=1, name='AC/DC'),
Artist(artist_id=2, name='Accept'),
Artist(artist_id=3, name='Aerosmith')]
[Album(album_id=1, title='For Those About To Rock We Salute You', artist_id=1),
Album(album_id=4, title='Let There Be Rock', artist_id=1)]
count is an async property that returns the number of rows in a table. It respects xtra constraints, so if you’ve set filters, only matching rows are counted:
get_field extracts a value from either a dict-like object (using [k]) or a dataclass/object (using getattr). This lets us handle both Record and dataclass results uniformly:
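A sketch:

```python
def get_field(o, k):
    "Value of k from a Record/dict (via o[k]) or a dataclass/object (via getattr)"
    try: return o[k]
    except TypeError: return getattr(o, k)
```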
pks_and_rows_where wraps rows_where but returns (pk, row) tuples — useful when you need to know which primary key each row has without inspecting the row itself.
Like rows_where but returns (pk, row) tuples
[(1, Artist(artist_id=1, name='AC/DC')),
(2, Artist(artist_id=2, name='Accept')),
(3, Artist(artist_id=3, name='Aerosmith'))]
__call__ makes tables callable, providing a convenient shorthand for queries. await artist(limit=3) is equivalent to await artist.rows_where(limit=3). The with_pk parameter switches to returning (pk, row) tuples.
Query table rows
[Artist(artist_id=1, name='AC/DC'),
Artist(artist_id=2, name='Accept'),
Artist(artist_id=3, name='Aerosmith')]
[(1, Artist(artist_id=1, name='AC/DC')),
(2, Artist(artist_id=2, name='Accept')),
(3, Artist(artist_id=3, name='Aerosmith'))]
[Album(album_id=1, title='For Those About To Rock We Salute You', artist_id=1),
Album(album_id=4, title='Let There Be Rock', artist_id=1)]
selectone returns exactly one row matching the query, raising NotFoundError if none found or ValueError if multiple found. It passes limit=2 internally so it can detect the “not unique” case without fetching the entire table.
def selectone(
where:str | None=None, # SQL where fragment to use, for example `id > ?`
where_args:Union=None, # Parameters to use with `where`; iterable for `id>?`, or dict for `id>:id`
select:str='*', # Comma-separated list of columns to select
as_cls:bool=True, # Convert returned dict to stored dataclass?
debug:bool=False
)->list:
Shortcut for __call__ that returns exactly one item
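For example (assuming `debug=True` prints the generated SQL, which is what's shown below):

```python
await artist.selectone('Name=?', ['AC/DC'], debug=True)
```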
SELECT * FROM "artist" WHERE Name=$1 LIMIT 2
Artist(artist_id=1, name='AC/DC')
db.item is for scalar queries — it returns a single field from a single row. Useful for things like SELECT count(*) or SELECT max(price).
Execute sql and return a single field from a single row
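For example:

```python
await db.item('select count(*) from artist')
```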
create_mod generates a Python module file containing dataclass definitions for all tables in the database. This lets you import type-checked dataclasses directly rather than regenerating them each time. The generated file includes proper imports and uses UNSET defaults for flexible instantiation.
Create module for dataclasses for db
link_dcs reconnects a database’s tables to dataclasses from a previously generated module. This is useful when you’ve imported dataclasses from a file created by create_mod and want the ORM to use them.
Set the internal dataclass type links for tables using mod (created via create_mod)
Track(track_id=1, name='For Those About To Rock (We Salute You)', album_id=1, media_type_id=1, genre_id=1, composer='Angus Young, Malcolm Young, Brian Johnson', milliseconds=343719, bytes=11170334, unit_price=Decimal('0.99'))
set_classes is a convenience method that links all table dataclasses from a namespace (typically globals()). It expects dataclass names to be title-cased versions of table names (e.g., Artist for table artist).
Set all table dataclasses using types in namespace glb
artist, album, employee, customer, invoice, invoice_line, track, playlist, playlist_track, genre, media_type, cats, cat, dog
get_tables injects table objects into a namespace with pluralized names — so db.t.album becomes available as albums. Combined with set_classes, this gives you a clean API: await albums(limit=3) returns a list of Album dataclass instances.
Add objects for all table objects to namespace glb
[Album(album_id=1, title='For Those About To Rock We Salute You', artist_id=1)]
[Employee(employee_id=1, last_name='Adams', first_name='Andrew', title='General Manager', reports_to=None, birth_date=datetime.datetime(1962, 2, 18, 0, 0), hire_date=datetime.datetime(2002, 8, 14, 0, 0), address='11120 Jasper Ave NW', city='Edmonton', state='AB', country='Canada', postal_code='T5K 2N1', phone='+1 (780) 428-9482', fax='+1 (780) 428-3457', email='andrew@chinookcorp.com')]
To support both dataclasses and dicts as input, and to handle Enum values properly, we need these imports:
_process_row converts a dataclass (or dict) to a plain dict, filtering out UNSET values and extracting .value from Enum fields. This lets you pass partially-filled dataclasses to insert/update.
insert adds a row to the table. It accepts either a dataclass/dict as record, keyword arguments, or both (kwargs override record fields). PostgreSQL’s RETURNING * clause lets us get the inserted row back in one query — including any auto-generated values like SERIAL primary keys. The xtra constraints are automatically merged in.
Insert a row and return it
For DDL statements like CREATE TABLE, use execute rather than fetch/q. DDL statements don’t return rows — they return a status string like 'CREATE TABLE'. PostgreSQL uses SERIAL for auto-incrementing integers (instead of SQLite’s INTEGER PRIMARY KEY) and REAL instead of FLOAT.
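For example, the `cat` table used below might be created like this (a sketch; column types inferred from the rows shown later):

```python
await db.execute('''
drop table if exists cat cascade;
create table cat (
    id     SERIAL PRIMARY KEY,
    name   TEXT,
    weight REAL,
    uid    INT
)''')
```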
'CREATE TABLE'
_retr_tbl is a helper that refreshes the database metadata and returns the table object for a given name. This ensures you’re working with up-to-date schema information after creating or modifying tables.
table2glb is a convenience method that refreshes metadata, creates the dataclass, and injects both the table object (pluralized name) and the dataclass into a namespace. This is handy after creating a new table.
Get table by name, refreshing metadata and creating dataclass, adding to glb
With xtra set, insert automatically includes those constraints. Here we set uid=1, so the inserted cat gets that value even though we didn’t pass it explicitly.
Cat(id=2, name='purr', weight=4.0, uid=1)
Calling xtra() with no arguments clears the filter by setting xtra_id = {}. Now queries return all rows again.
_pk_where builds a WHERE clause for primary key matching, using PostgreSQL’s $1, $2 placeholders with an offset to account for preceding parameters in the query.
update modifies an existing row by primary key. It builds an UPDATE ... SET ... WHERE pk = $n RETURNING * statement. Like insert, it respects xtra constraints — if you try to update a row that doesn’t match the xtra filter, you’ll get NotFoundError.
Update a row and return it
Correctly blocked: cat[[1, 2]]
delete removes a row by primary key, returning the deleted row (using RETURNING *). Like the other methods, it respects xtra constraints — attempting to delete a row that doesn’t match the filter raises NotFoundError.
Delete row by primary key, returning the deleted row
Let’s verify delete works — first check what cats we have:
Delete returns the deleted row, so you can see exactly what was removed:
The xtra filter also applies to deletes. If you try to delete a row that doesn’t match the constraint, you get NotFoundError:
Correctly blocked: cat[2]
delete_where is the bulk version of delete — it removes all rows matching a WHERE clause (or all rows if none given), returning the deleted rows as a list. Like delete, it respects xtra constraints and uses RETURNING * to give back what was removed. This is useful for cleanup operations like removing all rows above a threshold.
Cat(id=4, name='Skitter', weight=2.0, uid=None)
[Cat(id=2, name='purr', weight=4.0, uid=1),
Cat(id=3, name='Cat McCat Face', weight=9.0, uid=None)]
To create tables from Python classes, we need a reverse mapping from Python types to PostgreSQL types. We generate it from pg_to_py and override some entries for cleaner defaults (e.g., TEXT instead of character varying).
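Roughly:

```python
py_to_pg = {v: k for k, v in pg_to_py.items()}
py_to_pg.update({int: 'INTEGER', str: 'TEXT', float: 'REAL', bool: 'BOOLEAN'})
```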
col_def builds a column definition for CREATE TABLE. If the column is the primary key and has type int, it becomes SERIAL PRIMARY KEY (PostgreSQL’s auto-incrementing integer). Otherwise it maps the Python type to PostgreSQL and adds NOT NULL if specified.
Build column definition SQL for CREATE TABLE
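A sketch (the real signature may differ):

```python
def col_def(name, typ, pk=False, notnull=False):
    "Build column definition SQL for CREATE TABLE"
    if pk and typ is int: return f'"{name}" SERIAL PRIMARY KEY'
    return f'"{name}" {py_to_pg.get(typ, "TEXT")}' + (' NOT NULL' if notnull else '')
```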
db.create creates a table from a Python class (or dataclass). It extracts field names and types, builds column definitions, and executes the CREATE TABLE statement. The replace=True option drops any existing table first (with CASCADE to handle dependencies).
Create table from cls
drop removes a table from the database. The cascade=True option also drops any dependent objects (like foreign key references from other tables).
Drop this table
Now let’s test create with a simple Dog class:
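A sketch of the call (column types chosen to match the output below):

```python
class Dog: id:int; name:str; age:int

dog = await db.create(Dog, replace=True)
```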
{'id': 'integer', 'name': 'text', 'age': 'integer'}
The auto-generated SERIAL primary key handles auto-increment automatically:
Foreign keys are specified as a dict mapping column names to (table, column) tuples:
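Something like the following, where `foreign_keys` is a guess at the parameter name (an assumption, not confirmed by the source):

```python
class Toy: id:int; name:str; dog_id:int

toy = await db.create(Toy, foreign_keys={'dog_id': ('dog', 'id')}, replace=True)
```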
{'id': 'integer', 'name': 'text', 'dog_id': 'integer'}
The foreign key constraint is enforced by PostgreSQL — inserting a toy with an invalid dog_id would raise an error:
upsert performs an “insert or update” operation using PostgreSQL’s ON CONFLICT ... DO UPDATE clause. If a row with the same primary key exists, it updates it; otherwise it inserts a new row. Like insert, it uses _prep_row for row processing and _exec_returning for result handling, and respects xtra constraints.
Insert or update a row and return it
Let’s test upsert — first check current state:
Updating an existing row — change Rex’s age from 5 to 6:
Inserting a new row — upsert without an existing id creates a new record:
With xtra set, upsert merges those constraints into the row:
For production use, you’ll want a connection pool instead of a single connection. asyncpg.Pool has the same query methods (fetch, execute, etc.) as a connection, so our Database wrapper works with both. The key difference is that JSON codecs must be registered via the init callback (which runs on each new connection in the pool).
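A sketch of pool setup (assuming Database can be constructed around a pool directly):

```python
pool = await asyncpg.create_pool(database='chinook', record_class=FRecord, init=setup_json)
db = Database(pool)
```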
Let’s test that the pool works the same as a single connection:
[<FRecord artist_id=1 name='AC/DC'>,
<FRecord artist_id=2 name='Accept'>,
<FRecord artist_id=3 name='Aerosmith'>]