SQLAlchemy 2.0 Released

SQLAlchemy 2.0.0, the first production release of SQLAlchemy 2.0, is
now available.

With this release, the default version of SQLAlchemy that will install
when one runs pip install sqlalchemy will be version 2.0. Note that
version 2.0 has significant API changes compared to the 1.4 series, so
applications that run on the 1.x series of SQLAlchemy which have not gone
through the migration process
should seek to ensure requirements
are set to maintain the desired major SQLAlchemy release series in place.

The history of the SQLAlchemy 2.0 series starts over four years ago on August
8, 2018, with some short ideas for how SQLAlchemy’s notion of Core and ORM
querying might be unified. The first plan for a real “SQLAlchemy 2.0” concept
formed in November of that year, focused on the two areas of drastically
simplifying Core execution and transactional APIs as well as seeking to unify
querying across Core and ORM.

The changes to foundational concepts were dramatic enough that SQLAlchemy 2.0
was separated into two major phases. The first phase was the SQLAlchemy 1.4
series, which provided an entirely new unified Core / ORM SQL querying system,
all while building on top of a new universal statement caching architecture.
This phase provided the complete implementation for SQLAlchemy 2.0’s SQL
construction approach (minus pep-484 typing support), while maintaining the
legacy Query API completely. Along with this release, a comprehensive
migration path
inspired by lessons learned from the Python 2->3 migration process was
provided, which describes how to port applications so that they can continue
to run in SQLAlchemy 1.4 while being entirely forwards compatible with
SQLAlchemy 2.0.

The second phase is the SQLAlchemy 2.0 series, which removes the majority
of deprecated elements, relegating the remaining ones (primarily Query) to
long term “legacy” status, fully moves to Python 3 only, and at the same time
adds many new features that build on top of the new architecture, taking full
advantage of Python 3 features (including dataclasses, enums, inline
annotations) as well as the new unified query architecture.

The key advantage to this approach is that the most significant and by far the
riskiest architectural changes, that of the rewrite of Core/ORM querying on top
of a new caching layer, have already been in production use with SQLAlchemy 1.4
for nearly two years. So while SQLAlchemy 2.0 will certainly have a lot of new
issues once it is used by the full developer audience, they should not be of
the “new cracks in the foundational approach” variety, as the architectural
foundations are already in widespread use. We expect that the vast majority of
issues will be related to the new typing system as well as issues of existing
applications adjusting to use the new APIs.

SQLAlchemy 2.0 is a big enough release that it has two
migration guides; the
Major Migration Guide
documents how to reach API compatibility for an
application to be able to run in SQLAlchemy 1.4 or 2.0 equally; the
What’s New in SQLAlchemy 2.0?
guide then provides for all the new features and APIs that are available once
an application is running on SQLAlchemy 2.0.

With that introduction in mind, here’s a top level bullet list of what’s
totally new in SQLAlchemy 2.0:

All new plugin-free pep-484 compatible ORM syntaxes – The ORM
Declarative style of mapper configuration now has an all new look, borrowing
heavily from systems such as Python dataclasses and SQLModel to
provide for Annotation-driven ORM declarations,
using runtime interpretation of pep-484 annotations to produce mapped classes
that are fully typed and compatible with any type checker or IDE.
pep-484 support for both new-style and legacy ORM queries – using Annotated
Declarative models, the SQL constructs one creates, such as a select()
object or a legacy Query construct, are themselves typed as to the
kind of columns returned within a row, and this typing
carries forth all the way to the Python values extracted from the result
object returned after executing the query.
While Python typing still has limitations for doing this kind of thing,
the new typing support draws from some techniques used by SQLModel so that
it works without additional annotations for the vast majority
of queries generated from ORM model classes, with varying
degrees of support for result-set-typing of Core-oriented or hybrid
Core/ORM queries as well. Some examples are at
SQL Expression Typing – Examples.
Declarative fully integrated with Python Dataclasses – SQLAlchemy 2.0 introduces
support for an Annotated Declarative mapped class to be generated as a
Python dataclass directly; this will provide an ORM
mapped class declared like any other that features auto-generated dataclasses
methods such as __init__(), __repr__(), __eq__(), and all the
rest. This new approach is vastly improved over the interim “hybrid”
approaches introduced in SQLAlchemy 1.4.
An all-new, fully ORM-integrated approach to bulk INSERT that is typically
an order of magnitude faster on most backends – Most of SQLAlchemy’s
supported databases and drivers have now added or improved their support for
INSERT RETURNING syntax, and SQLAlchemy 2.0 has taken advantage of this
by supporting and improving RETURNING support for all of its backends, with
the sole exception of MySQL (MySQL only; MariaDB supports RETURNING).
Part of this improvement is the
ability to INSERT thousands of rows in one batched statement that also
returns a complete result set of server generated values needed by the ORM,
which was never possible before with the sole exception of an interim
implementation that relied upon the psycopg2 driver. In 2.0,
the work has been done to optimize INSERT for all backends so that
thousands of ORM objects with or without primary keys can be INSERTed with
one database round trip,
and this capability is fully integrated into the ORM for all INSERT
operations, both “regular” ORM unit of work operations as well as for “bulk”
approaches, which are also
newly revised
in version 2.0.
An all-new bulk-optimized schema reflection architecture – operations that
reflect full schemas as once such as MetaData.reflect(engine) and newly
added schema-wide operations with the inspect(engine) construct now build on top of a
foundation that assumes operations that work across hundreds or thousands of tables at once,
rather than table-at-a-time. Dialects can “opt in” to the new architecture
by providing for new reflection queries that work across many tables at once,
allowing for very large schemas to be fully reflected much more efficently.
The new architecture is enabled for the
PostgreSQL and Oracle backends,
which were the two backends that had the biggest performance issues for
large schemas, where it grants a 250% improvement for PostgreSQL and
a 900% improvement for Oracle. The SQL Server dialect is the next
target for the new architecture. Any third party dialect can
opt in to the new “many tables at once” system or remain on the previous
“table at a time” approach that remains fully compatible without any changes.
Native extensions ported to Cython – SQLAlchemy’s C extensions have been
replaced with an updated approach using Cython. The Cython version of the
extensions in most cases benches as fast as, and sometimes faster than,
the previous C extensions, and allows SQLAlchemy to provide new native
extensions in more areas of the library much more easily without risk
of memory or stability issues.

SQLAlchemy 2.0 includes many more features and with new architectures and
a lot of older baggage being shed, hopes to have plenty of room for the future.

Links to the detailed changelog for 2.0.0 is at Changelog –
note that the 2.0 series changelog spans across four beta and three release
candidate releases.

SQLAlchemy 2.0.0 is available on the Download Page.