SQLAlchemy Documentation
Version: 0.1.6 Last Updated: 04/12/06 12:42:40
How to Read this Manual

SQLAlchemy features a lot of tools and patterns to help in every area of writing applications that talk to relational databases. To achieve this, it has a lot of areas of functionality which work together to provide a cohesive package. Ultimately, just a little bit of familiarity with each concept is all that's needed to get off the ground.

That said, here's two quick links that summarize the two most prominent features of SQLAlchemy:

  • Data Mapping - a synopsis of how to map objects to database tables (Object Relational Mapping)
  • Constructing SQL Queries - SQLAlchemy's own domain-oriented approach to constructing and executing SQL statements.

Trail Map

For a comprehensive tour through all of SQLAlchemy's components, below is a "Trail Map" of the knowledge dependencies between these components indicating the order in which concepts may be learned. Concepts marked in bold indicate features that are useful on their own.

Start
  |
  |--- Connection Pooling
  |              |
  |              +------ Connection Pool Configuration
  |
  +--- Establishing a Database Engine
                   |
                   |--------- Database Engine Options
                   |
                   +---- Describing Tables with MetaData
                                   |
                                   |---- Creating and Dropping Database Tables
                                   |
                                   |---- Data Mapping ---------------------+
                                   |                                       |
                                   +---- Constructing SQL Queries          |
                                   |               |                       |
                                   |         The Unit of Work              |
                                   |               |                       |
                                   |               +----------- Advanced Data Mapping
                                   |
                                   +----- The Types System
Connection Pooling

Note: This section describes the connection pool module of SQLAlchemy, which is the smallest component of the library that can be used on its own. If you are interested in using SQLAlchemy for query construction or Object Relational Mapping, this module is automatically managed behind the scenes; you can skip ahead to Database Engines in that case.

At the base of any database helper library is a system for efficiently acquiring connections to the database. Since establishing a database connection is typically a somewhat expensive operation, an application needs a way to get at database connections repeatedly without incurring the full overhead each time. Particularly for server-side web applications, a connection pool is the standard way to maintain a group of database connections which are used over and over again across many requests. Connection pools are typically configured to maintain a certain "size", which represents how many connections can be in use simultaneously before the pool has to resort to creating newly-established connections.

SQLAlchemy includes a pooling module that can be used completely independently of the rest of the toolset. This section describes how it can be used on its own, as well as the available options. If SQLAlchemy is being used more fully, the connection pooling described below occurs automatically. The options are still available, though, so this core feature is a good place to start.

Establishing a Transparent Connection Pool
Any DBAPI module can be "proxied" through the connection pool using the following technique (note that the usage of 'psycopg2' is just an example; substitute whatever DBAPI module you'd like):
import sqlalchemy.pool as pool
import psycopg2 as psycopg
psycopg = pool.manage(psycopg)

# then connect normally
connection = psycopg.connect(database='test', user='scott', password='tiger')

This produces a sqlalchemy.pool.DBProxy object which supports the same connect() function as the original DBAPI module. Upon connection, a thread-local connection proxy object is returned, which delegates its calls to a real DBAPI connection object. This connection object is stored persistently within a connection pool (an instance of sqlalchemy.pool.Pool) that corresponds to the exact connection arguments sent to the connect() function. The connection proxy also returns a proxied cursor object upon calling connection.cursor(). When all cursors as well as the connection proxy are de-referenced, the connection is automatically made available again by the owning pool object.

Basically, the connect() function is used in its usual way, and the pool module transparently returns thread-local pooled connections. Each distinct set of connect arguments corresponds to a newly created connection pool; in this way, an application can maintain connections to multiple schemas and/or databases, and each unique set of connect arguments will be managed by a different pool object, as sketched below.
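
For example (a hedged sketch reusing the proxied psycopg module from above; the connect arguments are illustrative), two distinct argument sets produce two independent pools:

import sqlalchemy.pool as pool
import psycopg2 as psycopg
psycopg = pool.manage(psycopg)

# drawn from one pool
conn_a = psycopg.connect(database='test', user='scott', password='tiger')

# different connect arguments - a second pool is created behind the scenes
conn_b = psycopg.connect(database='reports', user='scott', password='tiger')

# same arguments as conn_a - served by conn_a's pool
conn_c = psycopg.connect(database='test', user='scott', password='tiger')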

Connection Pool Configuration

When proxying a DBAPI module through the pool module, options exist for how the connections should be pooled:

  • echo=False : if set to True, connections being checked out of and returned to the pool will be logged to the standard output, as well as pool sizing information.
  • use_threadlocal=True : if set to True, repeated calls to connect() within the same application thread will be guaranteed to return the same connection object, if one has already been retrieved from the pool and has not been returned yet. This allows code to retrieve a connection from the pool, and then while still holding on to that connection, to call other functions which also ask the pool for a connection with the same arguments; those functions will act upon the same connection that the calling method is using (a short sketch follows this list). Note that once the connection is returned to the pool, it then may be used by another thread. To guarantee a single unique connection per thread that never changes, use the option poolclass=SingletonThreadPool, in which case the use_threadlocal parameter is automatically set to False.
  • poolclass=QueuePool : the Pool class used by the pool module to provide pooling. QueuePool uses the Python Queue.Queue class to maintain a list of available connections. A developer can supply their own Pool class to provide a different pooling algorithm. Also included is the SingletonThreadPool, which provides a single distinct connection per thread and is required with SQLite.
  • pool_size=5 : used by QueuePool - the size of the pool to be maintained. This is the largest number of connections that will be kept persistently in the pool. Note that the pool begins with no connections; once this number of connections is requested, that number of connections will remain.
  • max_overflow=10 : used by QueuePool - the maximum overflow size of the pool. When the number of checked-out connections reaches the size set in pool_size, additional connections will be returned up to this limit. When those additional connections are returned to the pool, they are disconnected and discarded. It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow, and the total number of "sleeping" connections the pool will allow is pool_size. max_overflow can be set to -1 to indicate no overflow limit; no limit will be placed on the total number of concurrent connections.
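
A hedged sketch of the use_threadlocal behavior described above (the DBAPI and its connect arguments are illustrative):

import sqlalchemy.pool as pool
import psycopg2 as psycopg

# use_threadlocal=True is the default for a proxied module; shown explicitly
psycopg = pool.manage(psycopg, use_threadlocal=True)

def helper():
    # same thread, same connect arguments: this returns the very same
    # proxied connection that main() is still holding
    return psycopg.connect(database='test', user='scott', password='tiger')

def main():
    conn = psycopg.connect(database='test', user='scott', password='tiger')
    conn2 = helper()   # acts upon the same underlying DBAPI connection
    # when both proxies are de-referenced, the connection goes back to
    # the pool and may then be used by another thread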
Custom Pool Construction

One level below using a DBProxy to make transparent pools is creating the pool yourself. The pool module comes with two implementations of connection pools: QueuePool and SingletonThreadPool. While QueuePool uses Queue.Queue to provide connections, SingletonThreadPool provides a single per-thread connection which SQLite requires.

Constructing your own pool involves passing a callable that creates a connection. Through this method, custom connection schemes can be made, such as a connection that automatically executes initialization commands when first created. The options from the previous section can be used as they apply to QueuePool or SingletonThreadPool.

Plain QueuePool
import sqlalchemy.pool as pool
import psycopg2

def getconn():
    c = psycopg2.connect(user='ed', host='127.0.0.1', database='test')
    # execute an initialization function on the connection before returning it
    c.cursor().execute("setup_encodings()")
    return c
    
p = pool.QueuePool(getconn, max_overflow=10, pool_size=5, use_threadlocal=True)
SingletonThreadPool
import sqlalchemy.pool as pool
import sqlite

def getconn():
    return sqlite.connect(filename='myfile.db')

# SQLite connections require the SingletonThreadPool    
p = pool.SingletonThreadPool(getconn)
Database Engines

A database engine is a subclass of sqlalchemy.engine.SQLEngine, and is the starting point where SQLAlchemy provides a layer of abstraction on top of the various DBAPI2 database modules. It serves as an abstract factory for database-specific implementation objects, as well as a layer of abstraction over the most essential tasks of a database connection: connecting, executing queries, returning result sets, and managing transactions.

The average developer doesn't need to know anything about the interface or workings of a SQLEngine in order to use it. Simply creating one, and then specifying it when constructing tables and other SQL objects is all that's needed.

A SQLEngine is also a layer of abstraction on top of the connection pooling described in the previous section. While a DBAPI connection pool can be used explicitly alongside a SQLEngine, it's not really necessary. Once you have a SQLEngine, you can retrieve pooled connections directly from its underlying connection pool via its own connection() method. However, if you're exclusively using SQLAlchemy's SQL construction objects and/or object-relational mappers, all the details of connecting are handled by those libraries automatically.

Establishing a Database Engine

Engines exist for SQLite, Postgres, MySQL, MS-SQL, and Oracle, using the Pysqlite, Psycopg (1 or 2), MySQLdb, adodbapi or pymssql, and cx_Oracle modules (there is also experimental support for Firebird). Each engine imports its corresponding module, which is required to be installed. For Postgres and Oracle, an alternate module may be specified at construction time as well.

The string-based argument names for connecting are translated to the appropriate names when the connection is made; argument names include "host" or "hostname" for the database host, "database", "db", or "dbname" for the database name (also "dsn" for Oracle), "user" or "username" for the user, and "password", "pw", or "passwd" for the password. SQLite expects "filename" or "file" for the filename, or if None it defaults to ":memory:".

The connection arguments can be specified as a string + dictionary pair, or a single URL-encoded string, as follows:

from sqlalchemy import *

# sqlite in memory    
sqlite_engine = create_engine('sqlite', {'filename':':memory:'}, **opts)

# via URL
sqlite_engine = create_engine('sqlite://', **opts)

# sqlite using a file
sqlite_engine = create_engine('sqlite', {'filename':'querytest.db'}, **opts)

# via URL
sqlite_engine = create_engine('sqlite://filename=querytest.db', **opts)

# postgres
postgres_engine = create_engine('postgres', 
                        {'database':'test', 
                        'host':'127.0.0.1', 
                        'user':'scott', 
                        'password':'tiger'}, **opts)

# via URL
postgres_engine = create_engine('postgres://database=test&host=127.0.0.1&user=scott&password=tiger')

# mysql
mysql_engine = create_engine('mysql',
                        {
                            'db':'mydb',
                            'user':'scott',
                            'passwd':'tiger',
                            'host':'127.0.0.1'
                        },
                        **opts)

# oracle
oracle_engine = create_engine('oracle', 
                        {'dsn':'mydsn', 
                        'user':'scott', 
                        'password':'tiger'}, **opts)

Note that the general form of connecting to an engine is:

# separate arguments
engine = create_engine(
            <enginename>, 
            {<named DBAPI arguments>}, 
            <sqlalchemy options>
        )

# url
engine = create_engine('<enginename>://<named DBAPI arguments>', <sqlalchemy options>)
Database Engine Methods

A few useful methods off the SQLEngine are described here:

engine = create_engine('postgres://hostname=localhost&user=scott&password=tiger&database=test')

# get a pooled DBAPI connection
conn = engine.connection()

# create/drop tables based on table metadata objects
# (see the next section, Table Metadata, for info on table metadata)
engine.create(mytable)
engine.drop(mytable)

# get the DBAPI module being used
dbapi = engine.dbapi()

# get the default schema name
name = engine.get_default_schema_name()

# execute some SQL directly, returns a ResultProxy (see the SQL Construction section for details)
result = engine.execute("select * from table where col1=:col1", {'col1':'foo'})

# log a message to the engine's log stream
engine.log('this is a message')
Database Engine Options

The remaining arguments to create_engine are keyword arguments that are passed to the specific subclass of sqlalchemy.engine.SQLEngine being used, as well as the underlying sqlalchemy.pool.Pool instance. All of the options described in the previous section Connection Pool Configuration can be specified, as well as engine-specific options (a combined example follows the list):

  • pool=None : an instance of sqlalchemy.pool.Pool to be used as the underlying source for connections, overriding the engine's connect arguments (pooling is described in the previous section). If None, a default Pool (QueuePool or SingletonThreadPool as appropriate) will be created using the engine's connect arguments.

Example:

from sqlalchemy import *
import sqlalchemy.pool as pool
import MySQLdb

def getconn():
    return MySQLdb.connect(user='ed', db='mydb')

engine = create_engine('mysql', pool=pool.QueuePool(getconn, pool_size=20, max_overflow=40))
  • echo=False : if True, the SQLEngine will log all statements as well as a repr() of their parameter lists to the engine's logger, which defaults to sys.stdout. A SQLEngine instance's "echo" data member can be modified at any time to turn logging on and off. If set to the string 'debug', result rows will be printed to the standard output as well.
  • logger=None : a file-like object where logging output can be sent, if echo is set to True. This defaults to sys.stdout.
  • module=None : used by Oracle and Postgres, this is a reference to a DBAPI2 module to be used instead of the engine's default module. For Postgres, the default is psycopg2, or psycopg1 if 2 cannot be found. For Oracle, it's cx_Oracle.
  • default_ordering=False : if True, table objects and associated joins and aliases will generate information used for ordering by primary keys (or OIDs, if the database supports OIDs). This information is used by the Mapper system when it constructs select queries, to supply a default ordering to mapped objects.
  • use_ansi=True : used only by Oracle; when False, the Oracle driver attempts to support a particular "quirk" of some Oracle databases, namely that the LEFT OUTER JOIN SQL syntax is not supported, and the "Oracle join" syntax of <column1>(+)=<column2> must be used in order to achieve a LEFT OUTER JOIN. It's advised that the Oracle database be configured to have full ANSI support instead of using this feature.
  • use_oids=False : used only by Postgres, will enable the column name "oid" as the object ID column. Postgres as of 8.1 has object IDs disabled by default.
  • convert_unicode=False : if set to True, all String/character-based types will convert Unicode values to raw byte values going into the database, and all raw byte values to Python Unicode coming out in result sets. This is an engine-wide setting to provide unicode across the board. For unicode conversion on a column-by-column level, use the Unicode column type instead.
  • encoding='utf-8' : the encoding to use for Unicode translations - passed to all encode/decode methods.
  • echo_uow=False : when True, logs unit of work commit plans to the standard output.
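
As a combined example (a hedged sketch; the connect URL is illustrative), engine options and pool options can be passed together to create_engine:

engine = create_engine(
    'postgres://database=test&host=127.0.0.1&user=scott&password=tiger',
    echo=True,              # log all SQL statements to sys.stdout
    convert_unicode=True,   # engine-wide unicode conversion
    pool_size=10,           # passed through to the underlying QueuePool
    max_overflow=5
)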
Using the Proxy Engine

The ProxyEngine is useful for applications that need to swap engines at runtime, or to create their tables and mappers before they know what engine they will use. One use case is an application meant to be pluggable into a mix of other applications, such as a WSGI application. Well-behaved WSGI applications should be relocatable, and since that means that two versions of the same application may be running in the same process (or in the same thread at different times), WSGI applications ought not to depend on module-level or global configuration. Using the ProxyEngine allows a WSGI application to define tables and mappers in a module, but keep the specific database connection URI as an application instance or thread-local value.

The ProxyEngine is used in the same way as any other engine, with one additional method:

# define the tables and mappers
from sqlalchemy import *
from sqlalchemy.ext.proxy import ProxyEngine

engine = ProxyEngine()

users = Table('users', engine, ... )

class Users(object):
    pass

assign_mapper(Users, users)

def app(environ, start_response):
    # later, connect the proxy engine to a real engine via the connect() method
    engine.connect(environ['db_uri'])
    # now you have a real db connection and can select, insert, etc.
Using the Global Proxy

There is an instance of ProxyEngine available within the schema package as default_engine. You can construct Table objects and not specify the engine parameter, and they will connect to this engine by default. To connect the default_engine, use the global_connect function.

# define the tables and mappers
from sqlalchemy import *

# specify a table with no explicit engine
users = Table('users', 
        Column('user_id', Integer, primary_key=True),
        Column('user_name', String)
    )

# connect the global proxy engine
global_connect('sqlite://filename=foo.db')

# create the table in the selected database
users.create()
Transactions

A SQLEngine also provides an interface to the transactional capabilities of the underlying DBAPI connection object, as well as the connection object itself. Note that when using the object-relational-mapping package, described in a later section, basic transactional operation is handled for you automatically by its "Unit of Work" system; the methods described here will usually apply just to literal SQL update/delete/insert operations or those performed via the SQL construction library.

Typically, a connection is opened with autocommit=False. So to perform SQL operations and just commit as you go, you can simply pull out a connection from the connection pool, keep it in the local scope, and call commit() on it as needed. As long as the connection remains referenced, all other SQL operations within the same thread will use this same connection, including those used by the SQL construction system as well as the object-relational mapper, both described in later sections:

conn = engine.connection()

# execute SQL via the engine
engine.execute("insert into mytable values ('foo', 'bar')")
conn.commit()

# execute SQL via the SQL construction library            
mytable.insert().execute(col1='bat', col2='lala')
conn.commit()

There is a more automated way to do transactions, and that is to use the engine's begin()/commit() functionality. When the begin() method is called off the engine, a connection is checked out from the pool and stored in a thread-local context. That way, all subsequent SQL operations within the same thread will use that same connection. Subsequent commit() or rollback() operations are performed against that same connection. In effect, it's a more automated way to perform the "commit as you go" example above.

engine.begin()
engine.execute("insert into mytable values ('foo', 'bar')")
mytable.insert().execute(col1='foo', col2='bar')
engine.commit()

A traditional "rollback on exception" pattern looks like this:

engine.begin()
try:
    engine.execute("insert into mytable values ('foo', 'bar')")
    mytable.insert().execute(col1='foo', col2='bar')
except:
    engine.rollback()
    raise
engine.commit()

A shortcut which is equivalent to the above is provided by the transaction() method:

def do_stuff():
    engine.execute("insert into mytable values ('foo', 'bar')")
    mytable.insert().execute(col1='foo', col2='bar')

engine.transaction(do_stuff)

An added bonus to the engine's transaction methods is "reentrant" functionality; once you call begin(), subsequent calls to begin() will increment a counter that must be decremented by a corresponding commit() before an actual commit can happen. This way, any number of methods that want to ensure a transaction can call begin/commit, and be nested arbitrarily:

# method_a starts a transaction and calls method_b
def method_a():
    engine.begin()
    try:
        method_b()
    except:
        engine.rollback()
        raise
    engine.commit()

# method_b starts a transaction, or joins the one already in progress,
# and does some SQL
def method_b():
    engine.begin()
    try:
        engine.execute("insert into mytable values ('bat', 'lala')")
        mytable.insert().execute(col1='bat', col2='lala')
    except:
        engine.rollback()
        raise
    engine.commit()

# call method_a                
method_a()

Above, method_a is called first, which calls engine.begin(). Then it calls method_b. When method_b calls engine.begin(), it just increments a counter that is decremented when it calls commit(). If either method_a or method_b calls rollback(), the whole transaction is rolled back. The transaction is not committed until method_a calls the commit() method.

The object-relational-mapper capability of SQLAlchemy includes its own commit() method that gathers SQL statements into a batch and runs them within one transaction. That transaction is also invoked within the scope of the "reentrant" methodology above, so multiple objectstore.commit() operations can also be bundled into a larger database transaction via this methodology, as sketched below.
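
A hedged sketch (assuming the objectstore interface described in the Data Mapping section) of enclosing several unit of work commits in one database transaction:

engine.begin()
try:
    # each objectstore.commit() flushes accumulated object changes as a
    # batch of SQL; with an engine transaction already in progress, they
    # join the enclosing transaction rather than committing individually
    objectstore.commit()
    # ... further changes to mapped objects ...
    objectstore.commit()
except:
    engine.rollback()
    raise
engine.commit()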


Describing Tables with MetaData

The core of SQLAlchemy's query and object mapping operations is table metadata: Python objects that describe tables. Metadata objects can be created explicitly, by naming the table and all of its properties using the Table, Column, ForeignKey, and Sequence objects imported from sqlalchemy.schema along with a database engine constructed as described in the previous section, or they can be automatically pulled from an existing database schema. First, the explicit version:

from sqlalchemy import *
engine = create_engine('sqlite', {'filename':':memory:'}, **opts)

users = Table('users', engine, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('email_address', String(60), key='email'),
    Column('password', String(20), nullable = False)
)

user_prefs = Table('user_prefs', engine, 
    Column('pref_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
    Column('pref_name', String(40), nullable=False),
    Column('pref_value', String(100))
)

The specific datatypes, such as Integer, String, etc. are defined in The Types System and are automatically pulled in when you import * from sqlalchemy. Note that for Column objects, an alternate name can be specified via the "key" parameter; if this parameter is given, then all programmatic references to this Column object will be based on its key, instead of its actual column name, as illustrated below.
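
For instance, given the 'users' table above, where the 'email_address' column was given key='email' (a short sketch; the generated bind parameter name is illustrative):

# programmatic access uses the key...
users.c.email

# ...while the generated SQL still refers to the actual column name,
# e.g. SELECT ... WHERE users.email_address = :email_address
users.select(users.c.email == 'ed@mail.com')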

Once constructed, the Table object provides a clean interface to the table's properties as well as that of its columns:

employees = Table('employees', engine, 
    Column('employee_id', Integer, primary_key=True),
    Column('employee_name', String(60), nullable=False, key='name'),
    Column('employee_dept', Integer, ForeignKey("departments.department_id"))
)

# access the column "employee_id":
employees.columns.employee_id

# or just
employees.c.employee_id

# via string
employees.c['employee_id']

# iterate through all columns
for c in employees.c:
    print c.name

# get the table's primary key columns
for primary_key in employees.primary_key:
    print primary_key.name

# get the table's foreign key objects:
for fkey in employees.foreign_keys:
    print fkey

# access the table's SQLEngine object:
employees.engine

# access a column's name, type, nullable, primary key, foreign key
employees.c.employee_id.name
employees.c.employee_id.type
employees.c.employee_id.nullable
employees.c.employee_id.primary_key
employees.c.employee_dept.foreign_key

# get the "key" of a column, which defaults to its name, but can 
# be any user-defined string:
employees.c.name.key

# access a column's table:
>>> employees.c.employee_id.table is employees
True

# get the table related by a foreign key
dept_table = employees.c.employee_dept.foreign_key.column.table

Metadata objects can also be reflected from tables that already exist in the database. Reflection means that, given only a table name, the names, datatypes, and attributes of all columns, including foreign keys, are loaded automatically. This feature is supported by all database engines:

>>> messages = Table('messages', engine, autoload = True)
>>> [c.name for c in messages.columns]
['message_id', 'message_name', 'date']

Note that if a reflected table has a foreign key referencing another table, then the metadata for the related table will be loaded as well, even if it has not been defined by the application:

>>> shopping_cart_items = Table('shopping_cart_items', engine, autoload = True)
>>> print shopping_cart_items.c.cart_id.table.name
shopping_carts

To get direct access to 'shopping_carts', simply instantiate it via the Table constructor. You'll get the same instance of the shopping cart Table as the one that is attached to shopping_cart_items:

>>> shopping_carts = Table('shopping_carts', engine)
>>> shopping_carts is shopping_cart_items.c.cart_id.table
True

This works because when the Table constructor is called for a particular name and database engine, if the table has already been created then the instance returned will be the same as the original. This is a singleton constructor:

>>> news_articles = Table('news', engine, 
... Column('article_id', Integer, primary_key = True),
... Column('url', String(250), nullable = False)
... )
>>> othertable = Table('news', engine)
>>> othertable is news_articles
True
Creating and Dropping Database Tables

Creating and dropping are easy; just use the create() and drop() methods:

employees = Table('employees', engine, 
    Column('employee_id', Integer, primary_key=True),
    Column('employee_name', String(60), nullable=False, key='name'),
    Column('employee_dept', Integer, ForeignKey("departments.department_id"))
)
employees.create()
employees.drop()
Column Defaults and OnUpdates

SQLAlchemy includes flexible constructs with which to create default values for columns upon the insertion of rows, as well as upon update. These defaults can take several forms: a constant, a Python callable to be pre-executed before the SQL is executed, a SQL expression or function to be pre-executed before the SQL is executed, a pre-executed Sequence (for databases that support sequences), or a "passive" default, which is a default function triggered by the database itself upon insert, the value of which can then be post-fetched by the engine, provided the row has a primary key with which to locate it.

Pre-Executed Insert Defaults

A basic default is most easily specified by the "default" keyword argument to Column. This defines a value, function, or SQL expression that will be pre-executed to produce the new value, before the row is inserted:

# a function to create primary key ids
i = 0
def mydefault():
    global i
    i += 1
    return i

t = Table("mytable", db, 
    # function-based default
    Column('id', Integer, primary_key=True, default=mydefault),

    # a scalar default
    Column('key', String(10), default="default")
)

The "default" keyword can also take SQL expressions, including select statements or direct function calls:

t = Table("mytable", db, 
    Column('id', Integer, primary_key=True),

    # define 'create_date' to default to now()
    Column('create_date', DateTime, default=func.now()),

    # define 'key' to pull its default from the 'keyvalues' table
    Column('key', String(20), default=keyvalues.select(keyvalues.c.type=='type1', limit=1))
    )

The "default" keyword argument is shorthand for using a ColumnDefault object in a column definition. This syntax is optional, but is required for other types of defaults, futher described below:

Column('mycolumn', String(30), ColumnDefault(func.get_data()))
Pre-Executed OnUpdate Defaults

Similar to an on-insert default is an on-update default, which is most easily specified by the "onupdate" keyword to Column, which also can be a constant, plain Python function or SQL expression:

t = Table("mytable", db, 
    Column('id', Integer, primary_key=True),

    # define 'last_updated' to be populated with current_timestamp (the ANSI-SQL version of now())
    Column('last_updated', DateTime, onupdate=func.current_timestamp()),
)

To use an explicit ColumnDefault object to specify an on-update, use the "for_update" keyword argument:

Column('mycolumn', String(30), ColumnDefault(func.get_data(), for_update=True))
Inline Default Execution: PassiveDefault

A PassiveDefault indicates a column default or on-update value that is executed automatically by the database. This construct is used to specify a SQL function that will be specified as "DEFAULT" when creating tables, and also to indicate the presence of new data that is available to be "post-fetched" after an insert or update execution.

t = Table('test', e, 
    Column('mycolumn', DateTime, PassiveDefault("sysdate"))
)

A create call for the above table will produce:

CREATE TABLE test (
    mycolumn datetime default sysdate
)

PassiveDefaults also send a message to the SQLEngine that data is available after update or insert. The object-relational mapper system uses this information to post-fetch rows after insert or update, so that instances can be refreshed with the new data. Below is a simplified version:

# table with passive defaults
mytable = Table('mytable', engine, 
    Column('my_id', Integer, primary_key=True),

    # an on-insert database-side default
    Column('data1', Integer, PassiveDefault("d1_func")),

    # an on-update database-side default
    Column('data2', Integer, PassiveDefault("d2_func", for_update=True))
)
# insert a row
mytable.insert().execute(my_id=1)

# ask the engine: were there defaults fired off on that row ?
if mytable.engine.lastrow_has_defaults():
    # postfetch the row based on primary key.
    # this only works for a table with primary key columns defined
    primary_key = mytable.engine.last_inserted_ids()
    row = mytable.select(mytable.c.my_id == primary_key[0]).execute().fetchone()

When Tables are reflected from the database using autoload=True, any DEFAULT values set on the columns will be reflected in the Table object as PassiveDefault instances.

The Catch: Postgres Primary Key Defaults always Pre-Execute

Current Postgres support does not rely upon OIDs to determine the identity of a row. This is because the usage of OIDs has been deprecated with Postgres, and they are disabled by default for table creates as of PG version 8. Psycopg2's "cursor.lastrowid" function only returns OIDs. Therefore, when inserting a new row which has passive defaults set on the primary key columns, the default function is still pre-executed, since SQLAlchemy would otherwise have no way of retrieving the row just inserted.


Defining Sequences

A table with a sequence looks like:

table = Table("cartitems", db, 
    Column("cart_id", Integer, Sequence('cart_id_seq'), primary_key=True),
    Column("description", String(40)),
    Column("createdate", DateTime())
)

The Sequence is used with Postgres or Oracle to indicate the name of a Sequence that will be used to create default values for a column. When a table with a Sequence on a column is created by SQLAlchemy, the Sequence object is also created. Similarly, the Sequence is dropped when the table is dropped. Sequences are typically used with primary key columns. When using Postgres, if an integer primary key column defines no explicit Sequence or other default method, SQLAlchemy will create the column with the SERIAL keyword, and will pre-execute a sequence named "<tablename>_<columnname>_seq" in order to retrieve new primary key values. Oracle, which has no "auto-increment" keyword, requires that a Sequence be created for a table if automatic primary key generation is desired. Note that for all databases, primary key values can always be explicitly stated within the bind parameters for any insert statement as well, removing the need for any kind of default generation function.

A Sequence object can be defined on a Table that is then used for a non-sequence-supporting database. In that case, the Sequence object is simply ignored. Note that a Sequence object is entirely optional for all databases except Oracle, as other databases offer options for auto-creating primary key values, such as AUTOINCREMENT, SERIAL, etc. SQLAlchemy will use these default methods for creating primary key values if no Sequence is present on the table metadata.

A sequence can also be specified with optional=True which indicates the Sequence should only be used on a database that requires an explicit sequence, and not those that supply some other method of providing integer values. At the moment, it essentially means "use this sequence only with Oracle and not Postgres".
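
A hedged sketch of an optional Sequence, based on the "cartitems" table above:

table = Table("cartitems", db, 
    # the sequence fires on Oracle; Postgres will use SERIAL instead
    Column("cart_id", Integer, Sequence('cart_id_seq', optional=True), primary_key=True),
    Column("description", String(40)),
    Column("createdate", DateTime())
)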

Defining Indexes

Indexes can be defined on table columns: named or anonymous, unique or non-unique, on single or multiple columns. Indexes are included along with table create and drop statements. They are not used for any kind of run-time constraint checking; SQLAlchemy leaves that job to the expert on constraint checking, the database itself.

mytable = Table('mytable', engine, 
    # define a unique index 
    Column('col1', Integer, unique=True),

    # define a unique index with a specific name
    Column('col2', Integer, unique='mytab_idx_1'),

    # define a non-unique index
    Column('col3', Integer, index=True),

    # define a non-unique index with a specific name
    Column('col4', Integer, index='mytab_idx_2'),

    # pass the same name to multiple columns to add them to the same index
    Column('col5', Integer, index='mytab_idx_2'),

    Column('col6', Integer),
    Column('col7', Integer)
)

# create the table.  all the indexes will be created along with it.
mytable.create()

# indexes can also be specified standalone
i = Index('mytab_idx_3', mytable.c.col6, mytable.c.col7, unique=False)

# which can then be created separately (will also get created with table creates)
i.create()
Adapting Tables to Alternate Engines

A Table object created against a specific engine can be re-created against a new engine using the toengine method:

# create two engines
sqlite_engine = create_engine('sqlite', {'filename':'querytest.db'})
postgres_engine = create_engine('postgres', 
                    {'database':'test', 
                    'host':'127.0.0.1', 'user':'scott', 'password':'tiger'})

# load 'users' from the sqlite engine
users = Table('users', sqlite_engine, autoload=True)

# create the same Table object for the other engine
pg_users = users.toengine(postgres_engine)

Also available is the "database neutral" ansisql engine:

import sqlalchemy.ansisql as ansisql
generic_engine = ansisql.engine()

users = Table('users', generic_engine, 
    Column('user_id', Integer),
    Column('user_name', String(50))
)

Flexible "multi-engined" tables can also be achieved via the proxy engine, described in the section Using the Proxy Engine.

Non-engine primitives: TableClause/ColumnClause

TableClause and ColumnClause are "primitive" versions of the Table and Column objects which don't use engines at all; applications that just want to generate SQL strings, but not directly communicate with a database, can use TableClause and ColumnClause objects (accessed via 'table' and 'column'), which are non-singleton and serve as the "lexical" base classes of Table and Column:

tab1 = table('table1',
    column('id'),
    column('name'))

tab2 = table('table2',
    column('id'),
    column('email'))

tab1.select(tab1.c.name == 'foo')

TableClause and ColumnClause are strictly lexical. This means they are fully supported within the full range of SQL statement generation, but they don't support schema concepts like creates, drops, primary keys, defaults, nullable status, indexes, or foreign keys.
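
For example (a sketch; the exact bind parameter name generated may differ):

s = tab1.select(tab1.c.name == 'foo')

# no engine is involved, but a generic SQL string can still be produced
print str(s)
# SELECT table1.id, table1.name FROM table1 WHERE table1.name = :table1_name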

Constructing SQL Queries

Note: This section describes how to use SQLAlchemy to construct SQL queries and receive result sets. It does not cover the object relational mapping capabilities of SQLAlchemy; those are covered later on in Data Mapping. However, both areas of functionality work similarly in how selection criteria are constructed, so if you are interested just in ORM, you should probably skim through basic select() construction before moving on.

Once you have used the sqlalchemy.schema module to construct your tables and/or reflect them from the database, performing SQL queries using those table metadata objects is done via the sqlalchemy.sql package. This package defines a large set of classes, each of which represents a particular kind of lexical construct within a SQL query; all are descendants of the common base class sqlalchemy.sql.ClauseElement. A full query is represented via a structure of ClauseElements. A set of reasonably intuitive creation functions is provided by the sqlalchemy.sql package to create these structures; these functions are described in the rest of this section.

To execute a query, you create its structure, then call the resulting structure's execute() method, which returns a cursor-like object (more on that later). The same clause structure can be used repeatedly. A ClauseElement is compiled into a string representation by an underlying SQLEngine object, which is located by searching through the clause's child items for a Table object, which provides a reference to its SQLEngine.

The examples below all include a dump of the generated SQL corresponding to the query object, as well as a dump of the statement's bind parameters. In all cases, bind parameters are named parameters using the colon format (i.e. ':name'). A named parameter scheme, either ':name' or '%(name)s', is used with all databases, including those that use positional schemes. For those, the named-parameter statement and its bind values are converted to the proper list-based format right before execution. Therefore a SQLAlchemy application that uses ClauseElements can standardize on named parameters for all databases.

For this section, we will assume the following tables:

from sqlalchemy import *
db = create_engine('sqlite://filename=mydb', echo=True)

# a table to store users
users = Table('users', db,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(40)),
    Column('password', String(80))
)

# a table that stores mailing addresses associated with a specific user
addresses = Table('addresses', db,
    Column('address_id', Integer, primary_key = True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

# a table that stores keywords
keywords = Table('keywords', db,
    Column('keyword_id', Integer, primary_key = True),
    Column('name', VARCHAR(50))
)

# a table that associates keywords with users
userkeywords = Table('userkeywords', db,
    Column('user_id', INT, ForeignKey("users.user_id")),
    Column('keyword_id', INT, ForeignKey("keywords.keyword_id"))
)
Simple Select

A select is done by constructing a Select object with the proper arguments, adding any extra arguments if desired, then calling its execute() method.

from sqlalchemy import *

# use the select() function defined in the sql package
s = select([users])

# or, call the select() method off of a Table object
s = users.select()

# then, call execute on the Select object:
c = s.execute()

# the SQL text of any clause object can also be viewed via the str() call:
>>> str(s)
SELECT users.user_id, users.user_name, users.password FROM users

The object returned by the execute call is a sqlalchemy.engine.ResultProxy object, which acts much like a DBAPI cursor object in the context of a result set, except that the rows returned can address their columns by ordinal position, column name, or even column object:

# select rows, get resulting ResultProxy object
c = users.select().execute()
# get one row
row = c.fetchone()

# get the 'user_id' column via integer index:
user_id = row[0]

# or column name
user_name = row['user_name']

# or column object
password = row[users.c.password]

# or column accessor
password = row.password

# ResultProxy object also supports fetchall()
rows = c.fetchall()

# or get the underlying DBAPI cursor object
cursor = c.cursor
Using Column Labels

A common need when writing statements that reference multiple tables is to create labels for columns, thereby separating columns from different tables with the same name. The Select construct supports automatic generation of column labels via the use_labels=True parameter:

c = select([users, addresses], 
    users.c.user_id==addresses.c.user_id, 
    use_labels=True).execute()

The table name part of the label is affected if you use a construct such as a table alias:

person = users.alias('person')
c = select([person, addresses], 
    person.c.user_id==addresses.c.user_id, 
    use_labels=True).execute()

You can also specify custom labels on a per-column basis using the label() function:

c = select([users.c.user_id.label('id'), users.c.user_name.label('name')]).execute()
Table/Column Specification

Calling select off a table automatically generates a column clause which includes all the table's columns, in the order they are specified in the source Table object.

But in addition to selecting all the columns off a single table, any set of columns can be specified, as well as full tables, and any combination of the two:

# individual columns
c = select([users.c.user_id, users.c.user_name]).execute()

# full tables
c = select([users, addresses]).execute()

# combinations
c = select([users, addresses.c.zip]).execute()
WHERE Clause

The WHERE condition is the named keyword argument whereclause, or the second positional argument to the select() constructor and the first positional argument to the select() method of Table.

WHERE conditions are constructed using column objects, literal values, and functions defined in the sqlalchemy.sql module. Column objects override the standard Python operators to provide clause compositional objects, which compile down to SQL operations:

c = users.select(users.c.user_id == 7).execute()

Notice that the literal value "7" was broken out of the query and placed into a bind parameter. Databases such as Oracle must parse incoming SQL and create a "plan" when new queries are received, which is an expensive process. By using bind parameters, the same query with various literal values can have its plan compiled only once, and used repeatedly with less overhead.
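
To see this, print the statement itself (a sketch; the generated parameter name shown is illustrative):

s = users.select(users.c.user_id == 7)
print str(s)
# SELECT users.user_id, users.user_name, users.password FROM users
# WHERE users.user_id = :users_user_id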

More where clauses:

# another comparison operator
c = select([users], users.c.user_id>7).execute()
# OR keyword
c = users.select(or_(users.c.user_name=='jack', users.c.user_name=='ed')).execute()
# AND keyword
c = users.select(and_(users.c.user_name=='jack', users.c.password=='dog')).execute()
# NOT keyword
c = users.select(not_(
        or_(users.c.user_name=='jack', users.c.password=='dog')
    )).execute()
# IN clause
c = users.select(users.c.user_name.in_('jack', 'ed', 'fred')).execute()
# join users and addresses together
c = select([users, addresses], users.c.user_id==addresses.c.user_id).execute()
# join users and addresses together, but don't specify "addresses" in the 
# selection criterion.  The WHERE criterion adds it to the FROM list 
# automatically.
c = select([users], and_(
                users.c.user_id==addresses.c.user_id,
                users.c.user_name=='fred'
            )).execute()

Select statements can also generate a WHERE clause based on the parameters you give execute(). If a given parameter matches the name of a column or its "label" (the combined tablename + "_" + column name), and does not already correspond to a bind parameter in the select object, it will be added as an equality comparison against that column. This is a shortcut to creating a full WHERE clause:

# specify a match for the "user_name" column
c = users.select().execute(user_name='ed')
# specify a full where clause for the "user_name" column, as well as a
# comparison for the "user_id" column
c = users.select(users.c.user_name=='ed').execute(user_id=10)
Operators

Supported column operators so far are all the numerical comparison operators, i.e. '==', '>', '>=', etc., as well as like(), startswith(), endswith(), between(), and in_(). Boolean operators include not_(), and_() and or_(), which can also be used inline via '~', '&', and '|'. Math operators are '+', '-', '*', '/'. Any custom operator can be specified via the op() function shown below.

# "like" operator
users.select(users.c.user_name.like('%ter'))

# equality operator
users.select(users.c.user_name == 'jane')

# in operator
users.select(users.c.user_id.in_(1,2,3))

# and_, endswith, equality operators
users.select(and_(addresses.c.street.endswith('green street'), addresses.c.zip=='11234'))

# & operator subsituting for 'and_'
users.select(addresses.c.street.endswith('green street') & (addresses.c.zip=='11234'))

# + concatenation operator
select([users.c.user_name + '_name'])

# NOT operator
users.select(~(addresses.c.street == 'Green Street'))

# any custom operator
select([users.c.user_name.op('||')('_category')])
Specifying the Engine

For queries that don't contain any tables, the SQLEngine can be specified to any constructed statement via the engine keyword parameter:

# select a literal
select(["hi"], engine=myengine)

# select a function
select([func.now()], engine=db)
Functions

Functions can be specified using the func keyword:

select([func.count(users.c.user_id)]).execute()
users.select(func.substr(users.c.user_name, 1, 1) == 'J').execute()

Functions also are callable as standalone values:

# call the "now()" function
time = func.now(engine=myengine).scalar()

# call myfunc(1,2,3)
myvalue = func.myfunc(1, 2, 3, engine=db).execute()

# or call them off the engine
db.func.now().scalar()
Literals

You can drop in a literal value anywhere there isn't a column to attach to, via the literal keyword:

select([literal('foo') + literal('bar'), users.c.user_name]).execute()

# literals have all the same comparison functions as columns
select([literal('foo') == literal('bar')], engine=myengine).scalar()

Literals also take an optional type parameter to give literals a type. This can sometimes be significant, for example when using the "+" operator with SQLite, the String type is detected and the operator is converted to "||":

select([literal('foo', type=String) + 'bar'], engine=e).execute()
Order By

The ORDER BY clause of a select statement can be specified as individual columns to order by within an array specified via the order_by parameter, and optional usage of the asc() and desc() functions:

# straight order by
c = users.select(order_by=[users.c.user_name]).execute()
# descending/ascending order by on multiple columns
c = users.select(
    users.c.user_name>'J', 
    order_by=[desc(users.c.user_id), asc(users.c.user_name)]).execute()
DISTINCT, LIMIT and OFFSET

These are specified as keyword arguments:

c = select([users.c.user_name], distinct=True).execute()
c = users.select(limit=10, offset=20).execute()

The Oracle driver does not support LIMIT and OFFSET directly, but instead wraps the generated query into a subquery and uses the "rownum" variable to control the rows selected (this is somewhat experimental).

Inner and Outer Joins

As some of the examples indicated above, a regular inner join can be implicitly stated, just like in a SQL expression, by just specifying the tables to be joined as well as their join conditions:

addresses.select(addresses.c.user_id==users.c.user_id).execute()

There is also an explicit join constructor, which can be embedded into a select query via the from_obj parameter of the select statement:

addresses.select(from_obj=[
    addresses.join(users, addresses.c.user_id==users.c.user_id)
]).execute()

The join constructor can also be used by itself:

join(users, addresses, users.c.user_id==addresses.c.user_id).select().execute()

The join criterion in a join() call is optional. If not specified, the condition will be derived from the foreign key relationships of the two tables. If no criterion can be constructed, an exception will be raised.

join(users, addresses).select().execute()

Notice that this is the first example where the FROM criterion of the select statement is explicitly specified. In most cases, the FROM criterion is automatically determined from the columns requested as well as the WHERE clause. The from_obj keyword argument indicates a list of explicit FROM clauses to be used in the statement.

A join can be created on its own using the join or outerjoin functions, or can be created off of an existing Table or other selectable unit via the join or outerjoin methods:

outerjoin(users, addresses, users.c.user_id==addresses.c.user_id).select().execute()

users.select(keywords.c.name=='running', from_obj=[
        users.join(
            userkeywords, userkeywords.c.user_id==users.c.user_id).join(
                keywords, keywords.c.keyword_id==userkeywords.c.keyword_id)
        ]).execute()
Table Aliases

Aliases are used primarily when you want to use the same table more than once as a FROM expression in a statement:

address_b = addresses.alias('addressb')
# select users who have an address on Green street as well as Orange street
users.select(and_(
    users.c.user_id==addresses.c.user_id,
    addresses.c.street.like('%Green%'),
    users.c.user_id==address_b.c.user_id,
    address_b.c.street.like('%Orange%')
)).execute()
Subqueries

SQLAlchemy allows the creation of select statements from not just Table objects, but from a whole class of objects that implement the Selectable interface. This includes Tables, Aliases, Joins and Selects. Therefore, if you have a Select, you can select from the Select:

">>>" s = users.select()
">>>" str(s)
SELECT users.user_id, users.user_name, users.password FROM users

">>>" s = s.select()
">>>" str(s)
SELECT user_id, user_name, password
FROM (SELECT users.user_id, users.user_name, users.password FROM users)

Any Select, Join, or Alias object supports the same column accessors as a Table:

">>>" s = users.select()
">>>" [c.key for c in s.columns]
['user_id', 'user_name', 'password']

When you use use_labels=True in a Select object, the label version of the column names become the keys of the accessible columns. In effect you can create your own "view objects":

s = select([users, addresses], users.c.user_id==addresses.c.user_id, use_labels=True)
select([
    s.c.users_user_name, s.c.addresses_street, s.c.addresses_zip
], s.c.addresses_city=='San Francisco').execute()

To specify a SELECT statement as one of the selectable units in a FROM clause, it usually should be given an alias.

s = users.select().alias('u')
select([addresses, s]).execute()

Select objects can be used in a WHERE condition, in operators such as IN:

# select user ids for all users whose name starts with a "p"
s = select([users.c.user_id], users.c.user_name.like('p%'))

# now select all addresses for those users
addresses.select(addresses.c.user_id.in_(s)).execute()

The sql package supports embedding select statements into other select statements as the criterion in a WHERE condition, or as one of the "selectable" objects in the FROM list of the query. It does not at the moment directly support embedding a SELECT statement as one of the column criterion for a statement, although this can be achieved via direct text insertion, described later.

Scalar Column Subqueries

Subqueries can be used in the column clause of a select statement by specifying the scalar=True flag:

select([table2.c.col1, table2.c.col2, select([table1.c.col1], table1.c.col2==7, scalar=True)])
Correlated Subqueries

When a select object is embedded inside of another select object, and both objects reference the same table, SQLAlchemy makes the assumption that the table should be correlated from the child query to the parent query. To disable this behavior, specify the flag correlate=False to the Select statement.

# make an alias of a regular select.   
s = select([addresses.c.street], addresses.c.user_id==users.c.user_id).alias('s')
>>> str(s)
SELECT addresses.street FROM addresses, users 
WHERE addresses.user_id = users.user_id

# now embed that select into another one.  the "users" table is removed from
# the embedded query's FROM list and is instead correlated to the parent query
s2 = select([users, s.c.street])
>>> str(s2)
SELECT users.user_id, users.user_name, users.password, s.street
FROM users, (SELECT addresses.street FROM addresses
WHERE addresses.user_id = users.user_id) s
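
To keep the embedded query's own FROM list intact, disable the correlation (a hedged sketch of the correlate=False flag described above):

# correlate=False: the embedded select keeps "users" in its own FROM list
s3 = select([addresses.c.street], addresses.c.user_id==users.c.user_id,
            correlate=False)
# when embedded in another select, "users" now remains in the subquery's
# FROM list instead of being correlated to the parent query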
EXISTS Clauses

An EXISTS clause can function as a higher-scaling version of an IN clause, and is usually used in a correlated fashion:

# find all users who have an address on Green street:
users.select(
    exists(
        [addresses.c.address_id], 
        and_(
            addresses.c.user_id==users.c.user_id, 
            addresses.c.street.like('%Green%')
        )
    )
)
Unions

Unions come in two flavors, UNION and UNION ALL, which are available via module level functions or methods off a Selectable:

union(
    addresses.select(addresses.c.street=='123 Green Street'),
    addresses.select(addresses.c.street=='44 Park Ave.'),
    addresses.select(addresses.c.street=='3 Mill Road'),
    order_by=[addresses.c.street]
).execute()
users.select(
    users.c.user_id==7
  ).union_all(
      users.select(
          users.c.user_id==9
      ), 
      order_by=[users.c.user_id]   # order_by is an argument to union_all()
  ).execute()
Custom Bind Parameters

Throughout all these examples, SQLAlchemy is busy creating bind parameters wherever literal expressions occur. You can also specify your own bind parameters with your own names, and use the same statement repeatedly. As mentioned at the top of this section, named bind parameters are always used regardless of the type of DBAPI being used; for DBAPIs that expect positional arguments, bind parameters are converted to lists right before execution, and pyformat strings in statements, i.e. '%(name)s', are converted to the appropriate positional style.

s = users.select(users.c.user_name==bindparam('username'))
s.execute(username='fred')
s.execute(username='jane')
s.execute(username='mary')

executemany() is also available, but that applies more to INSERT/UPDATE/DELETE, described later.

Bind parameters are generated in a style specific to the engine being used. The examples in this document all show "named" parameters like those used in sqlite and oracle. Depending on the parameter style specified by the DBAPI module, the correct bind parameter scheme will be used.
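
As an illustration, the same construct renders different parameter formats depending on the DBAPI's declared paramstyle; the SQL shown in the comments below is a sketch of the general idea rather than verbatim output:

s = users.select(users.c.user_name==bindparam('username'))
# named style (e.g. sqlite, oracle):
#     SELECT ... FROM users WHERE users.user_name = :username
# pyformat style (e.g. postgres):
#     SELECT ... FROM users WHERE users.user_name = %(username)s
# positional styles receive their parameters as a list at execution time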

Precompiling a Query

By throwing the compile() method onto the end of any query object, the query can be "compiled" by the SQLEngine into a sqlalchemy.sql.Compiled object just once, and the resulting compiled object reused, which eliminates repeated internal compilation of the SQL string:

s = users.select(users.c.user_name==bindparam('username')).compile()
s.execute(username='fred')
s.execute(username='jane')
s.execute(username='mary')
Literal Text Blocks

The sql package tries to allow free textual placement in as many ways as possible. In the examples below, note that the from_obj parameter is used only when no other information exists within the select object with which to determine table metadata. Also note that in a query where there isn't even table metadata used, the SQLEngine to be used for the query has to be explicitly specified:

# strings as column clauses
sqlselect(["user_id", "user_name"], from_obj=[users]).execute()
# strings for full column lists
select(
        ["user_id, user_name, password, addresses.*"], 
        from_obj=[users.alias('u'), addresses]).execute()
# functions, etc.
select([users.c.user_id, "process_string(user_name)"]).execute()
# where clauses
users.select(and_(users.c.user_id==7, "process_string(user_name)=27")).execute()
# subqueries
users.select(
    "exists (select 1 from addresses where addresses.user_id=users.user_id)").execute()
# custom FROM objects
select(
        ["*"], 
        from_obj=["(select user_id, user_name from users)"], 
        engine=db).execute()
# a full query
text("select user_name from users", engine=db).execute()
select user_name from users
{}

# or call text() off of the engine
engine.text("select user_name from users").execute()

# execute off the engine directly - you must use the engine's native bind parameter
# style (i.e. named, pyformat, positional, etc.)
db.execute(
        "select user_name from users where user_id=:user_id", 
        {'user_id':7})
select user_name from users where user_id=:user_id
{'user_id':7}
Using Bind Parameters in Text Blocks

Use the format ':paramname' to define bind parameters inside of a text block. They will be converted to the appropriate format upon compilation:

t = engine.text("select foo from mytable where lala=:hoho")
r = t.execute(hoho=7)

Bind parameters can also be explicit, which allows typing information to be added. Just specify them as a list of bindparam objects whose names match those inside the textual statement:

t = engine.text("select foo from mytable where lala=:hoho", 
                bindparams=[bindparam('hoho', type=types.String)])
r = t.execute(hoho="im hoho")

Result-row type processing can be added via the typemap argument, which is a dictionary of return columns mapped to types:

# specify DateTime type for the 'foo' column in the result set
# sqlite, for example, uses result-row post-processing to construct dates
t = engine.text("select foo from mytable where lala=:hoho", 
        bindparams=[bindparam('hoho', type=types.String)],
        typemap={'foo':types.DateTime}
        )
r = t.execute(hoho="im hoho")

# 'foo' is a datetime
year = r.fetchone()['foo'].year
Building Select Objects

One of the primary motivations for a programmatic SQL library is to allow the piecemeal construction of a SQL statement based on program variables. All the above examples typically show Select objects being created all at once. The Select object also includes "builder" methods to allow building up an object. The below example is a "user search" function, where users can be selected based on primary key, user name, street address, keywords, or any combination:

def find_users(id=None, name=None, street=None, keyword_list=None):
    statement = users.select()
    if id is not None:
        statement.append_whereclause(users.c.user_id==id)
    if name is not None:
        statement.append_whereclause(users.c.user_name==name)
    if street is not None:
        # append_whereclause joins "WHERE" conditions together with AND
        statement.append_whereclause(users.c.user_id==addresses.c.user_id)
        statement.append_whereclause(addresses.c.street==street)
    if keyword_list is not None:
        # the parameter is named keyword_list so that it does not shadow
        # the module-level "keywords" table used in the join below
        statement.append_from(
                users.join(userkeywords, users.c.user_id==userkeywords.c.user_id).join(
                        keywords, userkeywords.c.keyword_id==keywords.c.keyword_id))
        statement.append_whereclause(keywords.c.name.in_(keyword_list))
        # to avoid multiple repeats, set query to be DISTINCT:
        statement.distinct=True
    return statement.execute()

find_users(id=7)
find_users(street='123 Green Street')
find_users(name='Jack', keyword_list=['jack','foo'])
Inserts

An INSERT involves just one table. The Insert object is used via the insert() function, and the specified columns determine what columns show up in the generated SQL. If primary key columns are left out of the criterion, the SQL generator will try to populate them as specified by the particular database engine and sequences, i.e. relying upon an auto-incremented column or explicitly calling a sequence beforehand. Insert statements, as well as updates and deletes, can also execute multiple parameter sets in one pass by specifying a list of dictionaries as parameters.

The values to be populated for an INSERT or an UPDATE can be specified to the insert()/update() functions as the values named argument, or the query will be compiled based on the values of the parameters sent to the execute() method.

# basic insert
users.insert().execute(user_id=1, user_name='jack', password='asdfdaf')
# insert just user_name, NULL for others
# will auto-populate primary key columns if they are configured
# to do so
users.insert().execute(user_name='ed')
# INSERT with a list:
users.insert(values=(3, 'jane', 'sdfadfas')).execute()
# INSERT with user-defined bind parameters
i = users.insert(
    values={'user_name':bindparam('name'), 'password':bindparam('pw')}
    )
i.execute(name='mary', pw='adas5fs')
# INSERT many - if no explicit 'values' parameter is sent,
# the first parameter list in the list determines
# the generated SQL of the insert (i.e. what columns are present)
# executemany() is used at the DBAPI level
users.insert().execute(
    {'user_id':7, 'user_name':'jack', 'password':'asdfasdf'},
    {'user_id':8, 'user_name':'ed', 'password':'asdffcadf'},
    {'user_id':9, 'user_name':'fred', 'password':'asttf'},
)
Updates

Updates work a lot like INSERTs, except there is an additional WHERE clause that can be specified.

# change 'jack' to 'ed'
users.update(users.c.user_name=='jack').execute(user_name='ed')
# use bind parameters
u = users.update(users.c.user_name==bindparam('name'), 
                values={'user_name':bindparam('newname')})
u.execute(name='jack', newname='ed')
# update a column to another column
users.update(values={users.c.password:users.c.user_name}).execute()
# multi-update
users.update(users.c.user_id==bindparam('id')).execute(
        {'id':7, 'user_name':'jack', 'password':'fh5jks'},
        {'id':8, 'user_name':'ed', 'password':'fsr234ks'},
        {'id':9, 'user_name':'mary', 'password':'7h5jse'},
    )
Correlated Updates

A correlated update lets you update a table using selection from another table, or the same table:

s = select([addresses.c.city], addresses.c.user_id==users.c.user_id)
users.update(
    and_(users.c.user_id>10, users.c.user_id<20), 
    values={users.c.user_name:s}
).execute()
Deletes

A delete is formulated like an update, except there's no values clause:

users.delete(users.c.user_id==7).execute()
users.delete(users.c.user_name.like(bindparam('name'))).execute(
        {'name':'%Jack%'},
        {'name':'%Ed%'},
        {'name':'%Jane%'},
    )
# an EXISTS clause can also serve as the delete criterion
users.delete(
        exists([addresses.c.address_id], addresses.c.user_id==users.c.user_id)
    ).execute()

Basic Data Mapping

Data mapping describes the process of defining Mapper objects, which associate table metadata with user-defined classes.

The Mapper's role is to perform SQL operations upon the database, associating individual table rows with instances of those classes, and individual database columns with properties upon those instances, to transparently associate in-memory objects with a persistent database representation.

When a Mapper is created to associate a Table object with a class, all of the columns defined in the Table object are associated with the class via property accessors, which add overriding functionality to the normal process of setting and getting object attributes. These property accessors keep track of changes to object attributes; these changes will be stored to the database when the application "commits" the current transactional context (known as a Unit of Work). The __init__() method of the object is also decorated to communicate changes when new instances of the object are created.

The Mapper also provides the interface by which instances of the object are loaded from the database. The primary method for this is its select() method, which has similar arguments to a sqlalchemy.sql.Select object. But this select method executes automatically and returns results, instead of awaiting an execute() call; instead of returning a cursor-like object, it returns a list of objects.

The three elements to be defined, i.e. the Table metadata, the user-defined class, and the Mapper, are typically defined as module-level variables, and may be defined in any fashion suitable to the application, with the only requirement being that the class and table metadata are described before the mapper. For the sake of example, we will be defining these elements close together, but this should not be construed as a requirement; since SQLAlchemy is not a framework, those decisions are left to the developer or an external framework.

Synopsis

This is the simplest form of a full "round trip" of creating table meta data, creating a class, mapping the class to the table, getting some results, and saving changes. For each concept, the following sections will dig in deeper to the available capabilities.

from sqlalchemy import *

# engine
engine = create_engine('sqlite://filename=mydb.db')

# table metadata
users = Table('users', engine, 
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# class definition 
class User(object):
    pass

# create a mapper
usermapper = mapper(User, users)

# select
user = usermapper.select_by(user_name='fred')[0]
# modify
user.user_name = 'fred jones'

# commit - saves everything that changed
objectstore.commit()

Attaching Mappers to their Class

For convenience's sake, the Mapper can be attached as an attribute on the class itself as well:

User.mapper = mapper(User, users)

userlist = User.mapper.select_by(user_id=12)

There is also a full-blown "monkeypatch" function that creates a primary mapper, attaches the mapper class property described above, and also attaches the methods get, get_by, select, select_by, selectone, selectfirst, commit, expire, refresh, expunge and delete:

# "assign" a mapper to the User class/users table
assign_mapper(User, users)

# methods are attached to the class for selecting
userlist = User.select_by(user_id=12)

myuser = User.get(1)

# mark an object as deleted for the next commit
myuser.delete()

# commit the changes on a specific object
myotheruser.commit()

Other methods of associating mappers and finder methods with their corresponding classes, such as via common base classes or mixins, can be devised as well. SQLAlchemy does not aim to dictate application architecture and will always allow the broadest variety of architectural patterns, but may include more helper objects and suggested architectures in the future.
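
For instance, a common base class might forward finder methods to the class-level mapper; the sketch below is just one hypothetical arrangement, not a SQLAlchemy API:

class MapperBase(object):
    """hypothetical base class forwarding finder methods to cls.mapper"""
    def select_by(cls, *args, **kwargs):
        return cls.mapper.select_by(*args, **kwargs)
    select_by = classmethod(select_by)
    def get(cls, *ident):
        return cls.mapper.get(*ident)
    get = classmethod(get)

class User(MapperBase):
    pass
User.mapper = mapper(User, users)

# the finder methods are now available via the class itself
userlist = User.select_by(user_name='fred')
myuser = User.get(1)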

Overriding Properties

A common request is the ability to create custom class properties that override the behavior of setting/getting an attribute. Currently, the easiest way to do this in SQLAlchemy is how it would be done in any Python program; define your attribute with a different name, such as "_attribute", and use a property to get/set its value. The mapper just needs to be told of the special name:

class MyClass(object):
    def _set_email(self, email):
        self._email = email
    def _get_email(self):
        return self._email
    email = property(_get_email, _set_email)

m = mapper(MyClass, mytable, properties = {
        # map the '_email' attribute to the "email" column
        # on the table
        '_email': mytable.c.email
})

In a later release, SQLAlchemy will also allow _get_email() and _set_email() to be attached directly to the "email" property created by the mapper, and will also allow this association to occur via decorators.

Selecting from a Mapper

There are a variety of ways to select from a mapper. These range from minimalist to explicit. Below is a synopsis of these methods:

# select_by, using property names or column names as keys
# the keys are grouped together by an AND operator
result = mapper.select_by(name='john', street='123 green street')

# select_by can also combine SQL criterion with key/value properties
result = mapper.select_by(users.c.user_name=='john', 
        addresses.c.zip_code=='12345', street='123 green street')

# get_by, which takes the same arguments as select_by
# returns a single scalar result or None if no results
user = mapper.get_by(id=12)

# "dynamic" versions of select_by and get_by - everything past the 
# "select_by_" or "get_by_" is used as the key, and the function argument
# as the value
result = mapper.select_by_name('fred')
u = mapper.get_by_name('fred')

# get an object directly from its primary key.  this will bypass the SQL
# call if the object has already been loaded
u = mapper.get(15)

# get an object that has a composite primary key of three columns.
# the order of the arguments matches that of the table meta data.
myobj = mapper.get(27, 3, 'receipts')

# using a WHERE criterion
result = mapper.select(or_(users.c.user_name == 'john', users.c.user_name=='fred'))

# using a WHERE criterion to get a scalar
u = mapper.selectfirst(users.c.user_name=='john')

# selectone() is a stricter version of selectfirst() which
# will raise an exception if there is not exactly one row
u = mapper.selectone(users.c.user_name=='john')

# using a full select object
result = mapper.select(users.select(users.c.user_name=='john'))

# using straight text  
result = mapper.select_text("select * from users where user_name='fred'")

# or using a "text" object
result = mapper.select(text("select * from users where user_name='fred'", engine=engine))

Some of the examples above illustrate the usage of the mapper's Table object to provide the columns for a WHERE clause. These columns are also accessible off of the mapped class directly. When a mapper is assigned to a class, it also attaches a special property accessor c to the class itself, which can be used just like the table metadata to access the columns of the table:

User.mapper = mapper(User, users)

userlist = User.mapper.select(User.c.user_id==12)
Saving Objects

When objects corresponding to mapped classes are created or manipulated, all changes are logged by a package called sqlalchemy.mapping.objectstore. The changes are then written to the database when an application calls objectstore.commit(). This pattern is known as a Unit of Work, and has many advantages over saving individual objects or attributes on those objects with individual method invocations. Domain models can be built with far greater complexity with no concern over the order of saves and deletes, excessive database round-trips and write operations, or deadlocking issues. The commit() operation uses a transaction as well, and will also perform "concurrency checking" to ensure the proper number of rows were in fact affected (not supported with the current MySQL drivers). Transactional resources are used effectively in all cases; the unit of work handles all the details.

The Unit of Work is a powerful tool, and has some important concepts that must be understood in order to use it effectively. While this section illustrates rudimentary Unit of Work usage, it is strongly encouraged to consult the Unit of Work section for a full description on all its operations, including session control, deletion, and developmental guidelines.

When a mapper is created, the target class has its mapped properties decorated by specialized property accessors that track changes, and its __init__() method is also decorated to mark new objects as "new".

User.mapper = mapper(User, users)

# create a new User
myuser = User()
myuser.user_name = 'jane'
myuser.password = 'hello123'

# create another new User      
myuser2 = User()
myuser2.user_name = 'ed'
myuser2.password = 'lalalala'

# load a third User from the database            
myuser3 = User.mapper.select(User.c.user_name=='fred')[0]
myuser3.user_name = 'fredjones'

# save all changes            
objectstore.commit()

In the examples above, we defined a User class with basically no properties or methods. There's no particular reason it has to be this way; the class can explicitly set up whatever properties it wants, whether or not they will be managed by the mapper. It can also specify a constructor, with the restriction that the constructor is able to function with no arguments being passed to it (this restriction can be lifted with some extra parameters to the mapper; more on that later):

class User(object):
    def __init__(self, user_name = None, password = None):
        self.user_id = None
        self.user_name = user_name
        self.password = password
    def get_name(self):
        return self.user_name
    def __repr__(self):
        return "User id %s name %s password %s" % (repr(self.user_id), 
            repr(self.user_name), repr(self.password))
User.mapper = mapper(User, users)

u = User('john', 'foo')
objectstore.commit()
>>> u
User id 1 name 'john' password 'foo'

Recent versions of SQLAlchemy will only put the columns of modified object attributes into the UPDATE statements generated upon commit. This is to conserve database traffic and also to successfully interact with a "deferred" attribute, which is a mapped object attribute against the mapper's primary table that isn't loaded until referenced by the application.
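
For example, the echoed SQL below sketches the idea (the exact statement and bind parameter names depend on the engine):

user = User.mapper.get_by(user_name='fred')
user.user_name = 'fred jones'    # the only attribute modified
objectstore.commit()
UPDATE users SET user_name=:user_name WHERE users.user_id = :users_user_id
[{'user_name': 'fred jones', 'users_user_id': 1}]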

Defining and Using Relationships

So that covers how to map the columns in a table to an object, how to load objects, create new ones, and save changes. The next step is how to define an object's relationships to other database-persisted objects. This is done via the relation function provided by the mapper module. So with our User class, let's also define the User as having one or more mailing addresses. First, the table metadata:

from sqlalchemy import *
engine = create_engine('sqlite://filename=mydb')

# define user table
users = Table('users', engine, 
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# define user address table
addresses = Table('addresses', engine,
    Column('address_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

Of importance here is the addresses table's definition of a foreign key relationship to the users table, relating the user_id column into a parent-child relationship. When a Mapper wants to indicate a relation of one object to another, this ForeignKey object is the default method by which the relationship is determined (although if you didn't define ForeignKeys, or you want to specify explicit relationship columns, that is available as well).

So then lets define two classes, the familiar User class, as well as an Address class:

class User(object):
    def __init__(self, user_name = None, password = None):
        self.user_name = user_name
        self.password = password

class Address(object):
    def __init__(self, street=None, city=None, state=None, zip=None):
        self.street = street
        self.city = city
        self.state = state
        self.zip = zip

And then a Mapper that will define a relationship of the User and the Address classes to each other as well as their table metadata. We will add an additional mapper keyword argument properties which is a dictionary relating the name of an object property to a database relationship, in this case a relation object against a newly defined mapper for the Address class:

User.mapper = mapper(User, users, properties = {
                    'addresses' : relation(mapper(Address, addresses))
                }
              )

Let's do some operations with these classes and see what happens:

u = User('jane', 'hihilala')
u.addresses.append(Address('123 anywhere street', 'big city', 'UT', '76543'))
u.addresses.append(Address('1 Park Place', 'some other city', 'OK', '83923'))

objectstore.commit()
INSERT INTO users (user_name, password) VALUES (:user_name, :password)
{'password': 'hihilala', 'user_name': 'jane'}
INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'big city', 'state': 'UT', 'street': '123 anywhere street', 'user_id':1, 'zip': '76543'}
INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'some other city', 'state': 'OK', 'street': '1 Park Place', 'user_id':1, 'zip': '83923'}

A lot just happened there! The Mapper object figured out how to relate rows in the addresses table to the users table, and also upon commit had to determine the proper order in which to insert rows. After the insert, all the User and Address objects have all their new primary and foreign keys populated.

Also notice that when we created a Mapper on the User class which defined an 'addresses' relation, the newly created User instance magically had an "addresses" attribute which behaved like a list. This list is in reality a property accessor function, which returns an instance of sqlalchemy.util.HistoryArraySet, which fulfills the full set of Python list accessors, but maintains a unique set of objects (based on their in-memory identity), and also tracks additions and deletions to the list:

del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

objectstore.commit()
UPDATE addresses SET user_id=:user_id
WHERE addresses.address_id = :addresses_address_id
[{'user_id': None, 'addresses_address_id': 2}]
INSERT INTO addresses (user_id, street, city, state, zip)
VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'}
Useful Feature: Private Relations

So our one address that was removed from the list was updated to have a user_id of None, and a new address object was inserted to correspond to the new Address added to the User. But now there's a mailing address with no user_id floating around in the database, of no use to anyone. How can we avoid this? This is achieved by using the private=True parameter of relation:

User.mapper = mapper(User, users, properties = {
                    'addresses' : relation(mapper(Address, addresses), private=True)
                }
              )
del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

objectstore.commit()
INSERT INTO addresses (user_id, street, city, state, zip)
VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'}
DELETE FROM addresses WHERE addresses.address_id = :address_id
[{'address_id': 2}]

In this case, with the private flag set, the element that was removed from the addresses list was also removed from the database. By specifying the private flag on a relation, it is indicated to the Mapper that these related objects exist only as children of the parent object and otherwise should be deleted.

Useful Feature: Backreferences

By creating relations with the backref keyword, a bi-directional relationship can be created which will keep both ends of the relationship updated automatically, even without any database queries being executed. Below, the User mapper is created with an "addresses" property, and the corresponding Address mapper receives a "backreference" to the User object via the property name "user":

Address.mapper = mapper(Address, addresses)
User.mapper = mapper(User, users, properties = {
                'addresses' : relation(Address.mapper, backref='user')
            }
          )

u = User('fred', 'hi')
a1 = Address('123 anywhere street', 'big city', 'UT', '76543')
a2 = Address('1 Park Place', 'some other city', 'OK', '83923')

# append a1 to u
u.addresses.append(a1)

# attach u to a2
a2.user = u

# the bi-directional relation is maintained
>>> u.addresses == [a1, a2]
True
>>> a1.user is u and a2.user is u
True

The backreference feature also works with many-to-many relationships, which are described later. When creating a backreference, a corresponding property is placed on the child mapper. The default arguments to this property can be overridden using the backref() function:

Address.mapper = mapper(Address, addresses)

User.mapper = mapper(User, users, properties = {
                'addresses' : relation(Address.mapper, 
                    backref=backref('user', lazy=False, private=True))
            }
          )
Creating Relationships Automatically with cascade_mappers

The mapper package has a helper function cascade_mappers() which can simplify the task of linking several mappers together. Given a list of classes and/or mappers, it identifies the foreign key relationships between the given mappers or corresponding class mappers, and creates relation() objects representing those relationships, including a backreference. It also attempts to find the "secondary" table in a many-to-many relationship. The names of the relations are a lowercase version of the related class name. In the case of one-to-many or many-to-many, the name is "pluralized", which currently is based on the English language (i.e. an 's' or 'es' added to it):

# create two mappers.  the 'users' and 'addresses' tables have a foreign key
# relationship
mapper1 = mapper(User, users)
mapper2 = mapper(Address, addresses)

# cascade the two mappers together (can also specify User, Address as the arguments)
cascade_mappers(mapper1, mapper2)

# two new object instances
u = User('user1')
a = Address('test')

# "addresses" and "user" property are automatically added
u.addresses.append(a)
print a.user
Selecting from Relationships: Lazy Load

We've seen how the relation specifier affects the saving of an object and its child items; how does it affect selecting them? By default, the relation keyword indicates that the related property should have a Lazy Loader attached when instances of the parent object are loaded from the database; this is just a callable function that, when accessed, will invoke a second SQL query to load the child objects of the parent.

# define a mapper
User.mapper = mapper(User, users, properties = {
              'addresses' : relation(mapper(Address, addresses), private=True)
            })

# select users where username is 'jane', get the first element of the list
# this will incur a load operation for the parent table
user = User.mapper.select_by(user_name='jane')[0]
# iterate through the User object's addresses.  this will incur an
# immediate load of those child items
for a in user.addresses:
    print repr(a)
Useful Feature: Creating Joins via select_by

In mappers that have relationships, the select_by method and its cousins include special functionality that can be used to create joins. Just specify a key in the argument list which is not present in the primary mapper's list of properties or columns, but is present in the property list of one of its relationships:

l = User.mapper.select_by(street='123 Green Street')

The above example is shorthand for:

l = User.mapper.select(and_(
         Address.c.user_id==User.c.user_id, 
         Address.c.street=='123 Green Street')
   )
How to Refresh the List?

Once the child list of Address objects is loaded, it is done loading for the lifetime of the object instance. Changes to the list will not be interfered with by subsequent loads, and upon commit those changes will be saved. Similarly, if a new User object is created and child Address objects added, a subsequent select operation which happens to touch upon that User instance, will also not affect the child list, since it is already loaded.

The issue of when the mapper actually fetches brand new objects from the database, versus when it assumes the in-memory version is fine the way it is, is a matter of transactional scope, described in more detail in the Unit of Work section. For now, note that the total storage of all newly created and selected objects, within the scope of the current thread, can be reset by releasing or otherwise disregarding all current object instances, and calling:

objectstore.clear()

This operation will clear out all currently mapped object instances, and subsequent select statements will load fresh copies from the database.

To operate upon a single object, just use the remove function:

# (this function coming soon)
objectstore.remove(myobject)
Selecting from Relationships: Eager Load

With just a single parameter "lazy=False" specified to the relation object, the parent and child SQL queries can be joined together.

Address.mapper = mapper(Address, addresses)
User.mapper = mapper(User, users, properties = {
                'addresses' : relation(Address.mapper, lazy=False)
            }
          )

user = User.mapper.get_by(user_name='jane')
for a in user.addresses:  
    print repr(a)

Above, a pretty ambitious query is generated just by specifying that the User should be loaded with its child Addresses in one query. When the mapper processes the results, it uses an Identity Map to keep track of objects that were already loaded, based on their primary key identity. Through this method, the redundant rows produced by the join are organized into the distinct object instances they represent.

The generation of this query is also immune to the effects of additional joins being specified in the original query. To use our select_by example above, joining against the "addresses" table to locate users with a certain street results in this behavior:

users = User.mapper.select_by(street='123 Green Street')

The join implied by passing the "street" parameter is converted into an "aliasized" clause by the eager loader, so that it does not conflict with the join used to eager load the child address objects.

Switching Lazy/Eager, No Load

The options method of mapper provides an easy way to get alternate forms of a mapper from an original one. The most common use of this feature is to change the "eager/lazy" loading behavior of a particular mapper, via the functions eagerload(), lazyload() and noload():

# user mapper with lazy addresses
User.mapper = mapper(User, users, properties = {
             'addresses' : relation(mapper(Address, addresses))
         }
)

# make an eager loader                    
eagermapper = User.mapper.options(eagerload('addresses'))
u = eagermapper.select()

# make another mapper that wont load the addresses at all
plainmapper = User.mapper.options(noload('addresses'))

# multiple options can be specified
mymapper = oldmapper.options(lazyload('tracker'), noload('streets'), eagerload('members'))

# to specify a relation on a relation, separate the property names by a "."
mymapper = oldmapper.options(eagerload('orders.items'))
One to One/Many to One

The above examples focused on the "one-to-many" relationship. Other forms of relationship are just as easy, as the relation function can usually figure out what you want:

# a table to store a user's preferences for a site
prefs = Table('user_prefs', engine,
    Column('pref_id', Integer, primary_key = True),
    Column('stylename', String(20)),
    Column('save_password', Boolean, nullable = False),
    Column('timezone', CHAR(3), nullable = False)
)

# user table gets 'preference_id' column added
users = Table('users', engine, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('password', String(20), nullable = False),
    Column('preference_id', Integer, ForeignKey("prefs.pref_id"))
)

# class definition for preferences
class UserPrefs(object):
    pass
UserPrefs.mapper = mapper(UserPrefs, prefs)

# address mapper
Address.mapper = mapper(Address, addresses)

# make a new mapper referencing everything.
m = mapper(User, users, properties = dict(
    addresses = relation(Address.mapper, lazy=True, private=True),
    preferences = relation(UserPrefs.mapper, lazy=False, private=True),
))

# select
user = m.get_by(user_name='fred')
save_password = user.preferences.save_password

# modify
user.preferences.stylename = 'bluesteel'
user.addresses.append(Address('freddy@hi.org'))
# commit
objectstore.commit()
Many to Many

The relation function handles a basic many-to-many relationship when you specify the association table:

articles = Table('articles', engine,
    Column('article_id', Integer, primary_key = True),
    Column('headline', String(150), key='headline'),
    Column('body', TEXT, key='body'),
)

keywords = Table('keywords', engine,
    Column('keyword_id', Integer, primary_key = True),
    Column('keyword_name', String(50))
)

itemkeywords = Table('article_keywords', engine,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id"))
)

# class definitions
class Keyword(object):
    def __init__(self, name = None):
        self.keyword_name = name

class Article(object):
    pass

# define a mapper that does many-to-many on the 'itemkeywords' association 
# table
Article.mapper = mapper(Article, articles, properties = dict(
    keywords = relation(mapper(Keyword, keywords), itemkeywords, lazy=False)
    )
)

article = Article()
article.headline = 'a headline'
article.body = 'this is the body'
article.keywords.append(Keyword('politics'))
article.keywords.append(Keyword('entertainment'))
objectstore.commit()
# select articles based on a keyword.  select_by will handle the extra joins.
articles = Article.mapper.select_by(keyword_name='politics')
# modify
a = articles[0]
del a.keywords[:]
a.keywords.append(Keyword('topstories'))
a.keywords.append(Keyword('government'))

# commit.  individual INSERT/DELETE operations will take place only for the list
# elements that changed.
objectstore.commit()
Association Object

Many to Many can also be done with an association object, which adds additional information about how two items are related. This association object is set up in basically the same way as any other mapped object. However, since an association table typically has no primary key columns, you have to tell the mapper what columns will compose its "primary key", which are the two (or more) columns involved in the association. Also, the relation function needs an additional hint as to the fact that this mapped object is an association object, via the "association" argument, which points to the class or mapper representing the other side of the association.

# add "attached_by" column which will reference the user who attached this keyword
itemkeywords = Table('article_keywords', engine,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id")),
    Column('attached_by', Integer, ForeignKey("users.user_id"))
)

# define an association class
class KeywordAssociation(object):
    pass

# mapper for KeywordAssociation
# specify "primary key" columns manually
KeywordAssociation.mapper = mapper(KeywordAssociation, itemkeywords,
    primary_key = [itemkeywords.c.article_id, itemkeywords.c.keyword_id],
    properties={
        'keyword' : relation(Keyword, lazy = False), # uses primary Keyword mapper
        'user' : relation(User, lazy = True) # uses primary User mapper
    }
)

# mappers for Users, Keywords
User.mapper = mapper(User, users)
Keyword.mapper = mapper(Keyword, keywords)

# define the mapper. 
m = mapper(Article, articles, properties={
    'keywords':relation(KeywordAssociation.mapper, lazy=False, association=Keyword)
    }
)

# bonus step - well, we do want to load the users in one shot, 
# so modify the mapper via an option.
# this returns a new mapper with the option switched on.
m2 = m.options(eagerload('keywords.user'))

# select by keyword again
alist = m2.select_by(keyword_name='jacks_stories')
# user is available
for a in alist:
    for k in a.keywords:
        if k.keyword.keyword_name == 'jacks_stories':
            print k.user.user_name

Overview

The concept behind Unit of Work is to track modifications to a collection of objects, and then be able to commit those changes to the database in a single operation. There are many advantages to this, including that your application doesn't need to worry about individual save operations on objects, nor about the required order for those operations, nor about excessive repeated calls to save operations that would be more efficiently aggregated into one step. It also simplifies database transactions, providing a neat package that fits into the traditional database begin/commit phase.

SQLAlchemy's unit of work includes these functions:

  • The ability to monitor scalar and list attributes on object instances, as well as object creates. This is handled via the attributes package.
  • The ability to maintain and process a list of modified objects, and based on the relationships set up by the mappers for those objects as well as the foreign key relationships of the underlying tables, figure out the proper order of operations so that referential integrity is maintained, and also so that on-the-fly values such as newly created primary keys can be propagated to dependent objects that need them before they are saved. The central algorithm for this is the topological sort.
  • The ability to define custom functionality that occurs within the unit-of-work commit phase, such as "before insert", "after insert", etc. This is accomplished via MapperExtension.
  • An Identity Map, which is a dictionary storing the one and only instance of an object for a particular table/primary key combination. This allows many parts of an application to get a handle to a particular object without any chance of modifications going to two different places.
  • Thread-local operation. The Identity Map, as well as its enclosing Unit of Work, are normally instantiated and accessed in a manner that is local to the current thread, within an object called a Session. Another concurrently executing thread will therefore have its own Session, so unless an application explicitly shares objects between threads, the operation of the object relational mapping is automatically threadsafe. Session objects can also be constructed manually to allow any user-defined scoping.
The Session Interface

The current unit of work is accessed via a Session object. The Session is available in a thread-local context from the objectstore module as follows:

# get the current thread's session
session = objectstore.get_session()

The Session object acts as a proxy to an underlying UnitOfWork object. Common methods include commit(), begin(), clear(), and delete(). Most of these methods are also available at the module level in the objectstore module, where they operate upon the Session returned by the get_session() function:

# this...
objectstore.get_session().commit()

# is the same as this:
objectstore.commit()

A description of the most important methods and concepts follows.

Identity Map

The first concept to understand about the Unit of Work is that it is keeping track of all mapped objects which have been loaded from the database, as well as all mapped objects which have been saved to the database in the current session. This means that every time you issue a select call to a mapper which returns results, all of those objects are now installed within the current Session, mapped to their identity.

In particular, it ensures that only one instance of a particular object, corresponding to a particular database identity, exists within the Session at one time. By "database identity" we mean a table or relational concept in the database combined with a particular primary key in that table. The session accomplishes this task using a dictionary known as an Identity Map. When select or get calls on mappers issue queries to the database, they will in nearly all cases go out to the database on each call to fetch results. However, when the mapper instantiates objects corresponding to the result set rows it receives, it will check the current identity map first before instantiating a new object, and return the same instance already present in the identity map if it already exists.

Example:

mymapper = mapper(MyClass, mytable)

obj1 = mymapper.selectfirst(mytable.c.id==15)
obj2 = mymapper.selectfirst(mytable.c.id==15)

>>> obj1 is obj2
True

The Identity Map is an instance of weakref.WeakValueDictionary, so that when an in-memory object falls out of scope, it will be removed automatically. However, this may not be instant if there are circular references upon the object. The current SA attributes implementation places some circular refs upon objects, although this may change in the future. There are other ways to remove object instances from the current session, as well as to clear the current session entirely, which are described later in this section.

To view the Session's identity map, it is accessible via the identity_map accessor:

>>> objectstore.get_session().identity_map.values()
[<__main__.User object at 0x712630>, <__main__.Address object at 0x712a70>]

The identity of each object instance is available via the _instance_key property attached to each object instance, and is a tuple consisting of the object's class and an additional tuple of primary key values, in the order that they appear within the table definition:

>>> obj._instance_key 
(<class 'test.tables.User'>, (7,))

At the moment that an object is assigned this key, it is also added to the current thread's unit-of-work's identity map.

The get() method on a mapper, which retrieves an object based on primary key identity, also checks in the current identity map first to save a database round-trip if possible. In the case of an object lazy-loading a single child object, the get() method is used as well, so scalar-based lazy loads may in some cases not query the database; this is particularly important for backreference relationships as it can save a lot of queries.
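
For example, reusing the mymapper example from above (a sketch; whether the first call issues SQL depends on what is already in the session):

obj1 = mymapper.get(15)    # may query the database
obj2 = mymapper.get(15)    # no SQL; returned from the identity map

>>> obj1 is obj2
True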

Methods on mappers and the objectstore module which are relevant to identity include the following:

# assume 'm' is a mapper
m = mapper(User, users)

# get the identity key corresponding to a primary key
key = m.identity_key(7)

# for composite key, list out the values in the order they
# appear in the table
key = m.identity_key(12, 'rev2')

# get the identity key given a primary key 
# value as a tuple and a class
key = objectstore.get_id_key((12, 'rev2'), User)

# get the identity key for an object, whether or not it actually
# has one attached to it (m is the mapper for obj's class)
key = m.instance_key(obj)

# is this key in the current identity map?
session.has_key(key)

# is this object in the current identity map?
session.has_instance(obj)

# get this object from the current identity map based on 
# singular/composite primary key, or if not go 
# and load from the database
obj = m.get(12, 'rev2')
What's Changed?

The next concept is that in addition to the Session storing a record of all objects loaded or saved, it also stores records of all newly created objects, records of all objects whose attributes have been modified, records of all objects that have been marked as deleted, and records of all modified list-based attributes where additions or deletions have occurred. These lists are used when a commit() call is issued to save all changes. After the commit occurs, these lists are all cleared out.

These records are all tracked by a collection of Set objects (a SQLAlchemy-specific implementation called HashSet) that are also viewable off the Session:

# new objects that were just constructed
session.new

# objects that exist in the database, that were modified
session.dirty

# objects that have been marked as deleted via session.delete(obj)
session.deleted

# list-based attributes where additions or deletions have occurred
session.modified_lists

Here's an interactive example, assuming the User and Address mapper setup first outlined in Defining and Using Relationships:

>>> # get the current thread's session
>>> session = objectstore.get_session()

>>> # create a new object, with a list-based attribute 
>>> # containing two more new objects
>>> u = User(user_name='Fred')
>>> u.addresses.append(Address(city='New York'))
>>> u.addresses.append(Address(city='Boston'))

>>> # objects are in the "new" list
>>> session.new
[<__main__.User object at 0x713630>, 
<__main__.Address object at 0x713a70>, 
<__main__.Address object at 0x713b30>]

>>> # view the "modified lists" member, 
>>> # reveals our two Address objects as well, inside of a list
>>> session.modified_lists
[[<__main__.Address object at 0x713a70>, <__main__.Address object at 0x713b30>]]

>>> # lets view what the class/ID is for the list object
>>> ["%s %s" % (l.__class__, id(l)) for l in session.modified_lists]
['sqlalchemy.mapping.unitofwork.UOWListElement 7391872']

>>> # now commit
>>> session.commit()

>>> # the "new" list is now empty
>>> session.new
[]

>>> # the "modified lists" list is now empty
>>> session.modified_lists
[]

>>> # now lets modify an object
>>> u.user_name='Ed'

>>> # it gets placed in the "dirty" list
>>> session.dirty
[<__main__.User object at 0x713630>]

>>> # delete one of the addresses 
>>> session.delete(u.addresses[0])

>>> # and also delete it off the User object, note that
>>> # this is *not automatic* when using session.delete()
>>> del u.addresses[0]
>>> session.deleted
[<__main__.Address object at 0x713a70>]    

>>> # commit
>>> session.commit()

>>> # all lists are cleared out
>>> session.new, session.dirty, session.modified_lists, session.deleted
([], [], [], [])

>>> # identity map has the User and the one remaining Address
>>> session.identity_map.values()
[<__main__.Address object at 0x713b30>, <__main__.User object at 0x713630>]

Unlike the identity map, the new, dirty, modified_lists, and deleted lists are not weak referencing. This means if you abandon all references to new or modified objects within a session, they are still present and will be saved on the next commit operation, unless they are removed from the Session explicitly (more on that later). The new list may change in a future release to be weak-referencing; however, for the deleted list, one can see that it's quite natural for an object marked as deleted to have no references in the application, yet a DELETE operation is still required.
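
A sketch of that behavior, continuing the interactive example above:

>>> # construct a new object and abandon the only application reference
>>> u2 = User(user_name='wendy')
>>> u2 = None

>>> # the session still holds a strong reference in its "new" list...
>>> session.new
[<__main__.User object at 0x713c30>]

>>> # ...so the INSERT still occurs on the next commit
>>> session.commit()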

Commit

This is the main gateway to what the Unit of Work does best, which is to save everything! It should be clear by now that a commit looks like:

objectstore.get_session().commit()

It also can be called with a list of objects; in this form, the commit operation will be limited only to the objects specified in the list, as well as any child objects within private relationships for a delete operation:

# saves only user1 and address2.  all other modified
# objects remain present in the session.
objectstore.get_session().commit(user1, address2)

This second form of commit should be used more carefully as it will not necessarily locate other dependent objects within the session, whose database representation may have foreign constraint relationships with the objects being operated upon.

What Commit is, and Isn't

The purpose of the Commit operation, as defined by the objectstore package, is to instruct the Unit of Work to analyze its lists of modified objects, assemble them into a dependency graph, fire off the appropriate INSERT, UPDATE, and DELETE statements via the mappers related to those objects, and to synchronize column-based object attributes that correspond directly to updated/inserted database columns.

It's important to note that the objectstore.get_session().commit() operation is not the same as the commit() operation on SQLEngine. A SQLEngine, described in the Database Engines section, has its own begin and commit statements which deal directly with transactions opened on DBAPI connections. While the session.commit() makes use of these calls in order to issue its own SQL within a database transaction, it is only dealing with "committing" its own in-memory changes and only has an indirect relationship with database connection objects.

The session.commit() operation also does not affect any relation-based object attributes, that is, attributes that reference other objects or lists of other objects, in any way. A brief list of what will not happen includes:

  • It will not append or delete any object instances to/from any list-based object attributes. Any objects that have been created or marked as deleted will be updated as such in the database, but if a newly deleted object instance is still attached to a parent object's list, the object itself will remain in that list.
  • It will not set or remove any scalar references to other objects, even if the corresponding database identifier columns have been committed.

This means, if you set address.user_id to 5, that integer attribute will be saved, but it will not place an Address object in the addresses attribute of the corresponding User object. In some cases there may be a lazy-loader still attached to an object attribute which, when first accessed, performs a fresh load from the database and creates the appearance of this behavior, but this behavior should not be relied upon as it is specific to lazy loading and also may disappear in a future release. Similarly, if the Address object is marked as deleted and a commit is issued, the correct DELETE statements will be issued, but if the object instance itself is still attached to the User, it will remain.
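
To illustrate the first point about scalar identifier attributes, a brief sketch using the User/Address mappers from earlier:

# the integer column is saved...
address.user_id = 5
objectstore.get_session().commit()

# ...but no Address object is appended to the in-memory 'addresses'
# list of the User whose id is 5; that list is untouched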

So the primary guideline for dealing with commit() is: the developer is responsible for maintaining in-memory objects and their relationships to each other; the unit of work is responsible for maintaining the database representation of the in-memory objects. The typical pattern is that the manipulation of objects is the way that changes get communicated to the unit of work, so that when the commit occurs, the objects are already in their correct in-memory representation and problems don't arise. The manipulation of identifier attributes like integer key values, as well as deletes in particular, are a frequent source of confusion.

A terrific feature of SQLAlchemy which is also a supreme source of confusion is the backreference feature, described in Useful Feature: Backreferences. This feature allows two types of objects to maintain attributes that reference each other, typically one object maintaining a list of elements of the other side, which contains a scalar reference to the list-holding object. When you append an element to the list, the element gets a "backreference" back to the object which has the list. When you attach the list-holding element to the child element, the child element gets attached to the list. This feature has nothing whatsoever to do with the Unit of Work.* It is strictly a small convenience feature intended to support the developer's manual manipulation of in-memory objects, and the backreference operation happens at the moment objects are attached or removed to/from each other, independent of any kind of database operation. It does not change the golden rule, that the developer is responsible for maintaining in-memory object relationships.

* There is an internal relationship between two relations that have a backreference, which states that a change operation is only logged once to the unit of work instead of as two separate changes, since the two changes are "equivalent"; so a backreference does affect the information that is sent to the Unit of Work. But the Unit of Work itself has no knowledge of this arrangement and has no ability to affect it.

Delete

The delete call places an object or objects into the Unit of Work's list of objects to be marked as deleted:

# mark three objects to be deleted
objectstore.get_session().delete(obj1, obj2, obj3)

# commit
objectstore.get_session().commit()

When objects which contain references to other objects are deleted, the mappers for those related objects will issue UPDATE statements for those objects that should no longer contain references to the deleted object, setting foreign key identifiers to NULL. Similarly, when a mapper contains relations with the private=True option, DELETE statements will be issued for objects within that relationship in addition to that of the primary deleted object; this is called a cascading delete.
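
As a sketch of the two behaviors, using the User/Address relationship from earlier (the echoed SQL is illustrative, not verbatim engine output):

# with a plain 'addresses' relation, deleting a user disassociates its
# addresses by nulling out their foreign keys:
session.delete(user)
session.commit()
UPDATE addresses SET user_id=:user_id WHERE addresses.address_id = :addresses_address_id
[{'user_id': None, 'addresses_address_id': 1}]
DELETE FROM users WHERE users.user_id = :user_id
[{'user_id': 1}]

# with private=True on the 'addresses' relation, the same delete cascades:
DELETE FROM addresses WHERE addresses.address_id = :address_id
[{'address_id': 1}]
DELETE FROM users WHERE users.user_id = :user_id
[{'user_id': 1}]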

As stated before, the purpose of delete is strictly to issue DELETE statements to the database. It does not affect the in-memory structure of objects, other than changing the identifying attributes on objects, such as setting foreign key identifiers on updated rows to None. It has no effect on the status of references between object instances, nor any effect on the Python garbage-collection status of objects.

Clear

To clear out the current thread's UnitOfWork, which has the effect of discarding the Identity Map and the lists of all objects that have been modified, just issue a clear:

# via module
objectstore.clear()

# or via Session
objectstore.get_session().clear()

This is the easiest way to "start fresh", as in a web application that wants to have a newly loaded graph of objects on each request. Any object instances created before the clear operation should either be discarded or at least not used with any Mapper or Unit Of Work operations (with the exception of import_instance()), as they no longer have any relationship to the current Unit of Work, and their behavior with regards to the current session is undefined.
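
A hypothetical sketch of that per-request pattern; the request object and handler function below are not part of SQLAlchemy:

def handle_request(request):
    # begin the request with a fresh identity map and empty change lists
    objectstore.clear()

    # objects loaded from here on belong to this request's fresh session
    user = User.mapper.get_by(user_name=request.username)
    user.user_name = request.new_user_name
    objectstore.commit()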

Refresh / Expire

To assist with the Unit of Work's "sticky" behavior, individual objects can have all of their attributes immediately re-loaded from the database, or marked as "expired" which will cause a re-load to occur upon the next access of any of the object's mapped attributes. This includes all relationships, so lazy-loaders will be re-initialized, eager relationships will be repopulated. Any changes marked on the object are discarded:

# immediately re-load attributes on obj1, obj2
session.refresh(obj1, obj2)

# expire objects obj1, obj2, obj3 - attributes will be reloaded
# on the next access:
session.expire(obj1, obj2, obj3)
Expunge

Expunge simply removes all record of an object from the current Session. This includes the identity map, and all history-tracking lists:

session.expunge(obj1)

Use expunge when you'd like to remove an object altogether from memory, such as before calling del on it, which will prevent any "ghost" operations occurring when the session is committed.

Import Instance

The _instance_key attribute placed on object instances is designed to work with objects that are serialized into strings and brought back again. As it contains no references to internal structures or database connections, applications that use caches or session storage which require serialization (i.e. pickling) can store SQLAlchemy-loaded objects. However, as mentioned earlier, an object with a particular database identity is only allowed to exist uniquely within the current unit-of-work scope. So, upon deserializing such an object, it has to "check in" with the current Session. This is achieved via the import_instance() method:

# deserialize an object
myobj = pickle.loads(mystring)

# "import" it.  if the objectstore already had this object in the 
# identity map, then you get back the one from the current session.
myobj = session.import_instance(myobj)

Note that the import_instance() function will either mark the deserialized object as the official copy in the current identity map, which includes updating its _instance_key to correspond to the current application's class instance, or it will discard it and return the corresponding object that was already present. That's why it's important to receive the return value of the method and use that result as the official object instance.

back to section top
Begin

The "scope" of the unit of work commit can be controlled further by issuing a begin(). A begin operation constructs a new UnitOfWork object and sets it as the currently used UOW. It maintains a reference to the original UnitOfWork as its "parent", and shares the same identity map of objects that have been loaded from the database within the scope of the parent UnitOfWork. However, the "new", "dirty", and "deleted" lists are empty. This has the effect that only changes that take place after the begin() operation get logged to the current UnitOfWork, and therefore those are the only changes that get commit()ted. When the commit is complete, the "begun" UnitOfWork removes itself and places the parent UnitOfWork as the current one again. The begin() method returns a transactional object, upon which you can call commit() or rollback(). Only this transactional object controls the transaction - commit() upon the Session will do nothing until commit() or rollback() is called upon the transactional object.

# modify an object
myobj1.foo = "something new"

# begin 
trans = session.begin()

# modify another object
myobj2.lala = "something new"

# only 'myobj2' is saved
trans.commit()

begin/commit supports the same "nesting" behavior as the SQLEngine (note that this is not the original "nested" behavior), meaning that many begin() calls can be made, but only the outermost transactional object will actually perform a commit(). Similarly, calls to the commit() method on the Session, which might occur within function calls inside the transaction, will not do anything; this allows an external function caller to control the scope of transactions used within the functions.

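A sketch of this nesting behavior:

# outermost transactional object
outer = session.begin()

# an inner begin, as might occur inside a function call
inner = session.begin()
myobj1.foo = 'bar'

# neither of these performs the actual commit, since the
# outermost transactional object is still in progress
inner.commit()
session.commit()

# the outermost commit() commits the unit of work
outer.commit()
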
back to section top
Advanced UnitOfWork Management
Nesting UnitOfWork in a Database Transaction

The UOW commit operation places its INSERT/UPDATE/DELETE operations within the scope of a database transaction controlled by a SQLEngine:

engine.begin()
try:
    # run objectstore update operations
except:
    engine.rollback()
    raise
engine.commit()

If you recall from the Transactions section, the engine's begin()/commit() methods support reentrant behavior. This means you can nest begin and commits and only have the outermost begin/commit pair actually take effect (rollbacks however, abort the whole operation at any stage). From this it follows that the UnitOfWork commit operation can be nested within a transaction as well:

engine.begin()
try:
    # perform custom SQL operations
    objectstore.commit()
    # perform custom SQL operations
except:
    engine.rollback()
    raise
engine.commit()
back to section top
Per-Object Sessions

Sessions can be created on an ad-hoc basis and used for individual groups of objects and operations. This has the effect of bypassing the normal thread-local Session and explicitly using a particular Session:

# make a new Session with a global UnitOfWork
s = objectstore.Session()

# make objects bound to this Session
x = MyObj(_sa_session=s)

# perform mapper operations bound to this Session
# (this function coming soon)
r = MyObj.mapper.using(s).select_by(id=12)

# get the session that corresponds to an instance
s = objectstore.get_session(x)

# commit 
s.commit()

# perform a block of operations with this session set within the current scope
objectstore.push_session(s)
try:
    r = mapper.select_by(id=12)
    x = MyObj()
    objectstore.commit()
finally:
    objectstore.pop_session()
Nested Transaction Sessions

Sessions also now support a "nested transaction" feature whereby a second Session can use a different database connection. This can be used inside of a larger database transaction to issue commits to the database that will be committed independently of the larger transaction's status:

engine.begin()
try:
    a = MyObj()
    b = MyObj()

    sess = Session(nest_on=engine)
    objectstore.push_session(sess)
    try:
        c = MyObj()
        objectstore.commit()    # will commit "c" to the database,
                                # even if the external transaction rolls back
    finally:
        objectstore.pop_session()

    objectstore.commit()  # commit "a" and "b" to the database
    engine.commit()
except:
    engine.rollback()
    raise
back to section top
Custom Session Objects/Custom Scopes

For users who want to make their own Session subclass, or replace the algorithm used to return scoped Session objects (i.e. the objectstore.get_session() method):

Create a Session
# make a new Session
s = objectstore.Session()

# set it as the current thread-local session
objectstore.session_registry.set(s)

# set the objectstore's session registry to a different algorithm

def create_session():
    """creates new sessions"""
    return objectstore.Session()
def mykey():
    """creates contextual keys to store scoped sessions"""
    return "mykey"

objectstore.session_registry = sqlalchemy.util.ScopedRegistry(createfunc=create_session, scopefunc=mykey)
back to section top
Analyzing Object Commits

The objectstore module can log an extensive display of its "commit plans", which is a graph of its internal representation of objects before they are committed to the database. To turn this logging on:

# make an engine with echo_uow
engine = create_engine('myengine...', echo_uow=True)

Commits will then dump displays like the following to standard output:

Task dump:
 UOWTask(6034768) 'User/users/6015696'
  |
  |- Save elements
  |- Save: UOWTaskElement(6034800): User(6016624) (save)
  |
  |- Save dependencies
  |- UOWDependencyProcessor(6035024) 'addresses' attribute on saved User's (UOWTask(6034768) 'User/users/6015696')
  |       |-UOWTaskElement(6034800): User(6016624) (save)
  |
  |- Delete dependencies
  |- UOWDependencyProcessor(6035056) 'addresses' attribute on User's to be deleted (UOWTask(6034768) 'User/users/6015696')
  |       |-(no objects)
  |
  |- Child tasks
  |- UOWTask(6034832) 'Address/email_addresses/6015344'
  |   |
  |   |- Save elements
  |   |- Save: UOWTaskElement(6034864): Address(6034384) (save)
  |   |- Save: UOWTaskElement(6034896): Address(6034256) (save)
  |   |----
  | 
  |----

The above graph can be read straight downwards to determine the order of operations. It indicates "save User 6016624, process each element in the 'addresses' list on User 6016624, save Address 6034384, Address 6034256".

back to section top

This section details all the options available to Mappers, as well as advanced patterns.

To start, here are the tables we will work with again:

from sqlalchemy import *
db = create_engine('sqlite://filename=mydb', echo=True)

# a table to store users
users = Table('users', db,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(40)),
    Column('password', String(80))
)

# a table that stores mailing addresses associated with a specific user
addresses = Table('addresses', db,
    Column('address_id', Integer, primary_key = True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

# a table that stores keywords
keywords = Table('keywords', db,
    Column('keyword_id', Integer, primary_key = True),
    Column('name', VARCHAR(50))
)

# a table that associates keywords with users
userkeywords = Table('userkeywords', db,
    Column('user_id', INT, ForeignKey("users")),
    Column('keyword_id', INT, ForeignKey("keywords"))
)
More On Relations
Custom Join Conditions

When creating relations on a mapper, most examples so far have illustrated the mapper and relationship joining up based on the foreign keys of the tables they represent. In fact, this "automatic" inspection can be completely circumvented using the primaryjoin and secondaryjoin arguments to relation, as in this example which gives the User object a relationship to all of its Addresses which are in Boston:

class User(object):
    pass
class Address(object):
    pass
Address.mapper = mapper(Address, addresses)
User.mapper = mapper(User, users, properties={
    'boston_addresses' : relation(Address.mapper, primaryjoin=
                and_(users.c.user_id==addresses.c.user_id, 
                addresses.c.city=='Boston'))
})

Many to many relationships can be customized by one or both of primaryjoin and secondaryjoin, shown below with just the default many-to-many relationship explicitly set:

class User(object):
    pass
class Keyword(object):
    pass
Keyword.mapper = mapper(Keyword, keywords)
User.mapper = mapper(User, users, properties={
    'keywords':relation(Keyword.mapper, 
        primaryjoin=users.c.user_id==userkeywords.c.user_id,
        secondaryjoin=userkeywords.c.keyword_id==keywords.c.keyword_id
        )
})
back to section top
Lazy/Eager Joins Multiple Times to One Table

The previous example leads into the idea of joining against the same table multiple times. Below is a User object that has lists of its Boston and New York addresses, both lazily loaded when they are first accessed:

User.mapper = mapper(User, users, properties={
    'boston_addresses' : relation(Address.mapper, primaryjoin=
                and_(users.c.user_id==addresses.c.user_id, 
                addresses.c.city=='Boston')),
    'newyork_addresses' : relation(Address.mapper, primaryjoin=
                and_(users.c.user_id==addresses.c.user_id, 
                addresses.c.city=='New York')),
})

A complication arises with the above pattern if you want the relations to be eager loaded. Since there will be two separate joins to the addresses table during an eager load, an alias needs to be used to distinguish them. You could create an alias of the addresses table yourself, but then you would in effect be creating a brand new mapper for each property, unrelated to the main Address mapper, which can create problems with commit operations. So an additional argument use_alias can be used with an eager relationship to specify the alias to be used just within the eager query:

User.mapper = mapper(User, users, properties={
    'boston_addresses' : relation(Address.mapper, primaryjoin=
                and_(users.c.user_id==addresses.c.user_id, 
                addresses.c.city=='Boston'), lazy=False, use_alias=True),
    'newyork_addresses' : relation(Address.mapper, primaryjoin=
                and_(users.c.user_id==addresses.c.user_id, 
                addresses.c.city=='New York'), lazy=False, use_alias=True),
})

u = User.mapper.select()
back to section top
Relation Options
Keyword options to the relation function include:
  • lazy=(True|False|None) - specifies how the related items should be loaded. A value of True indicates they should be loaded lazily when the property is first accessed. A value of False indicates they should be loaded by joining against the parent object query, so parent and child are loaded in one round trip. A value of None indicates the related items are not loaded by the mapper in any case; the application will manually insert items into the list in some other way. A relationship with lazy=None is still important; items added to or removed from the list will cause the appropriate updates and deletes upon commit().
  • primaryjoin - a ClauseElement that will be used as the primary join of this child object against the parent object, or in a many-to-many relationship the join of the primary object to the association table. By default, this value is computed based on the foreign key relationships of the parent and child tables (or association table).
  • secondaryjoin - a ClauseElement that will be used as the join of an association table to the child object. By default, this value is computed based on the foreign key relationships of the association and child tables.
  • foreignkey - specifies which column in this relationship is "foreign", i.e. which column refers to the parent object. This value is automatically determined in all cases, based on the primary and secondary join conditions, except in the case of a self-referential mapper, where it is needed to indicate the child object's reference back to its parent.
  • uselist - a boolean that indicates if this property should be loaded as a list or a scalar. In most cases, this value is determined based on the type and direction of the relationship - one to many forms a list, one to one forms a scalar, many to many is a list. If a scalar is desired where normally a list would be present, set uselist to False.
  • private - indicates if these child objects are "private" to the parent; removed items will also be deleted, and if the parent item is deleted, all child objects are deleted as well. See the description of cascading deletes in the Unit of Work documentation above.
  • backreference - indicates the name of a property to be placed on the related mapper's class that will handle this relationship in the other direction, including synchronizing the object attributes on both sides of the relation. See the backreferences example earlier in this manual.
  • order_by - indicates the ordering that should be applied when loading these items. See the section Controlling Ordering for details.
  • association - When specifying a many to many relationship with an association object, this keyword should reference the mapper of the target object of the association. See the association object example earlier in this manual.
  • post_update - this indicates that the relationship should be handled by a second UPDATE statement after an INSERT, or before a DELETE. Using this flag essentially means the relationship will not incur any "dependency" between parent and child item, as the particular foreign key relationship between them is handled by a second statement. Use this flag when a particular mapping arrangement will incur two rows that are dependent on each other, such as a table that has a one-to-many relationship to a set of child rows, and also has a column that references a single child row within that list (i.e. both tables contain a foreign key to each other). If a commit() operation returns an error that a "cyclical dependency" was detected, this is a cue that you might want to use post_update; a sketch of this scenario follows this list.
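
Below is a rough sketch of the post_update scenario; the pages/revisions tables and classes here are hypothetical, not part of this chapter's examples:

# each page refers to its "main" revision, while each revision
# refers back to its page - the two tables contain foreign keys
# to each other
pages = Table('pages', db,
    Column('page_id', Integer, primary_key=True),
    Column('main_revision_id', Integer, ForeignKey('revisions.revision_id'))
)
revisions = Table('revisions', db,
    Column('revision_id', Integer, primary_key=True),
    Column('page_id', Integer, ForeignKey('pages.page_id'))
)

class Page(object):
    pass
class Revision(object):
    pass

Revision.mapper = mapper(Revision, revisions)

# without post_update on the 'main_revision' relation, a commit()
# involving both rows would detect a cyclical dependency
Page.mapper = mapper(Page, pages, properties={
    'revisions' : relation(Revision.mapper,
        primaryjoin=pages.c.page_id==revisions.c.page_id),
    'main_revision' : relation(Revision.mapper,
        primaryjoin=pages.c.main_revision_id==revisions.c.revision_id,
        uselist=False, post_update=True)
})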
back to section top
Controlling Ordering

By default, mappers will not supply any ORDER BY clause when selecting rows. This can be modified in several ways.

A "default ordering" can be supplied by all mappers, by enabling the "default_ordering" flag to the engine, which indicates that table primary keys or object IDs should be used as the default ordering:

db = create_engine('postgres://username=scott&password=tiger', default_ordering=True)

The "order_by" parameter can be sent to a mapper, overriding the per-engine ordering if any. A value of None means that the mapper should not use any ordering, even if the engine's default_ordering property is True. A non-None value, which can be a column, an asc or desc clause, or an array of either one, indicates the ORDER BY clause that should be added to all select queries:

# disable all ordering
mapper = mapper(User, users, order_by=None)

# order by a column
mapper = mapper(User, users, order_by=users.c.user_id)

# order by multiple items
mapper = mapper(User, users, order_by=[users.c.user_id, desc(users.c.user_name)])

"order_by" can also be specified to an individual select method, overriding all other per-engine/per-mapper orderings:

# order by a column
l = mapper.select(users.c.user_name=='fred', order_by=users.c.user_id)

# order by multiple criteria
l = mapper.select(users.c.user_name=='fred', order_by=[users.c.user_id, desc(users.c.user_name)])

For relations, the "order_by" property can also be specified to all forms of relation:

# order address objects by address id
mapper = mapper(User, users, properties = {
    'addresses' : relation(mapper(Address, addresses), order_by=addresses.c.address_id)
})

# eager load with ordering - the ORDER BY clauses of parent/child will be organized properly
mapper = mapper(User, users, properties = {
    'addresses' : relation(mapper(Address, addresses), order_by=desc(addresses.c.street), lazy=False)
}, order_by=users.c.user_id)
back to section top
Limiting Rows

You can limit rows in a regular SQL query by specifying limit and offset. A Mapper can handle the same concepts:

class User(object):
    pass

m = mapper(User, users)
r = m.select(limit=20, offset=10)

However, things get tricky when dealing with eager relationships, since a straight LIMIT of rows does not represent the count of items when joining against other tables to load related items as well. So here is what SQLAlchemy will do when you use limit or offset with an eager relationship:

class User(object):
    pass
class Address(object):
    pass
m = mapper(User, users, properties={
    'addresses' : relation(mapper(Address, addresses), lazy=False)
})
r = m.select(User.c.user_name.like('F%'), limit=20, offset=10)
SELECT users.user_id AS users_user_id, users.user_name AS users_user_name,
users.password AS users_password, addresses.address_id AS addresses_address_id,
addresses.user_id AS addresses_user_id, addresses.street AS addresses_street,
addresses.city AS addresses_city, addresses.state AS addresses_state,
addresses.zip AS addresses_zip
FROM
(SELECT users.user_id FROM users WHERE users.user_name LIKE %(users_user_name)s
ORDER BY users.oid LIMIT 20 OFFSET 10) AS rowcount,
users LEFT OUTER JOIN addresses ON users.user_id = addresses.user_id
WHERE rowcount.user_id = users.user_id ORDER BY users.oid, addresses.oid
{'users_user_name': 'F%'}

The main WHERE clause as well as the limiting clauses are coerced into a subquery; this subquery represents the desired result of objects. A containing query, which handles the eager relationships, is joined against the subquery to produce the result.

back to section top
Overriding Column Names

When mappers are constructed, by default the column names in the Table metadata are used as the names of attributes on the mapped class. This can be customized within the properties by stating the key/column combinations explicitly:

user_mapper = mapper(User, users, properties={
    'id' : users.c.user_id,
    'name' : users.c.user_name,
})

When column names overlap in a mapper against multiple tables, columns may be referenced together with a list:

# join users and addresses
usersaddresses = sql.join(users, addresses, users.c.user_id == addresses.c.user_id)
m = mapper(User, usersaddresses,   
    properties = {
        'id' : [users.c.user_id, addresses.c.user_id],
    }
    )
back to section top
Deferred Column Loading

This feature allows particular columns of a table to not be loaded by default, instead being loaded later on when first referenced. It is essentially "column-level lazy loading". This feature is useful when one wants to avoid loading a large text or binary field into memory when it's not needed. Individual columns can be lazy loaded by themselves or placed into groups that lazy-load together.

book_excerpts = Table('books', db, 
    Column('book_id', Integer, primary_key=True),
    Column('title', String(200), nullable=False),
    Column('summary', String(2000)),
    Column('excerpt', String),
    Column('photo', Binary)
)

class Book(object):
    pass

# define a mapper that will load each of 'excerpt' and 'photo' in 
# separate, individual-row SELECT statements when each attribute
# is first referenced on the individual object instance
book_mapper = mapper(Book, book_excerpts, properties = {
    'excerpt' : deferred(book_excerpts.c.excerpt),
    'photo' : deferred(book_excerpts.c.photo)
})

Deferred columns can be placed into groups so that they load together:

book_excerpts = Table('books', db, 
    Column('book_id', Integer, primary_key=True),
    Column('title', String(200), nullable=False),
    Column('summary', String(2000)),
    Column('excerpt', String),
    Column('photo1', Binary),
    Column('photo2', Binary),
    Column('photo3', Binary)
)

class Book(object):
    pass

# define a mapper with a 'photos' deferred group.  when one photo is referenced,
# all three photos will be loaded in one SELECT statement.  The 'excerpt' will 
# be loaded separately when it is first referenced.
book_mapper = mapper(Book, book_excerpts, properties = {
    'excerpt' : deferred(book_excerpts.c.excerpt),
    'photo1' : deferred(book_excerpts.c.photo1, group='photos'),
    'photo2' : deferred(book_excerpts.c.photo2, group='photos'),
    'photo3' : deferred(book_excerpts.c.photo3, group='photos')
})
back to section top
More on Mapper Options

The options method of mapper, first introduced earlier in this manual, supports the copying of a mapper into a new one, with any number of its relations replaced by new ones. The method takes a variable number of MapperOption objects which know how to change specific things about the mapper. The five available options are eagerload, lazyload, noload, deferred and extension.

An example of a mapper with a lazy load relationship, upgraded to an eager load relationship:

class User(object):
    pass
class Address(object):
    pass

# a 'lazy' relationship
User.mapper = mapper(User, users, properties = {
    'addresses':relation(mapper(Address, addresses), lazy=True)
})
    
# copy the mapper and convert 'addresses' to be eager
eagermapper = User.mapper.options(eagerload('addresses'))

The load options can also take keyword arguments that apply to the new relationship. Below, the "double" address lazy relationship from the previous section is upgraded to eager, adding the "selectalias" keywords as well:

m = User.mapper.options(
        eagerload('boston_addresses', selectalias='boston_ad'), 
        eagerload('newyork_addresses', selectalias='newyork_ad')
    )

The defer and undefer options can control the deferred loading of attributes:

# set the 'excerpt' deferred attribute to load normally
m = book_mapper.options(undefer('excerpt'))

# set the referenced mapper 'photos' to defer its loading of the column 'imagedata'
m = book_mapper.options(defer('photos.imagedata'))

Options can also take a limited set of keyword arguments which will be applied to a new mapper. For example, to create a mapper that refreshes all objects loaded each time:

m2 = mapper.options(always_refresh=True)

Or, a mapper with different ordering:

m2 = mapper.options(order_by=[newcol])
back to section top
Mapping a Class with Table Inheritance

Table Inheritance indicates the pattern where two tables, in a parent-child relationship, are mapped to an inheritance chain of classes. If a table "employees" contains additional information about managers in the table "managers", a corresponding object inheritance pattern would have an Employee class and a Manager class. Loading a Manager object means you are joining managers to employees. For SQLAlchemy, this pattern is just a special case of a mapper that maps against a joined relationship, and is provided via the inherits keyword.

class User(object):
    """a user object."""
    pass
User.mapper = mapper(User, users)

class AddressUser(User):
    """a user object that also has the users mailing address."""
    pass

# define a mapper for AddressUser that inherits the User.mapper, and joins on the user_id column
AddressUser.mapper = mapper(
        AddressUser,
        addresses, inherits=User.mapper
        )

items = AddressUser.mapper.select()

Above, the join condition is determined via the foreign keys between the users and the addresses table. To specify the join condition explicitly, use inherit_condition:

AddressUser.mapper = mapper(
        AddressUser,
        addresses, inherits=User.mapper, 
        inherit_condition=users.c.user_id==addresses.c.user_id
    )
back to section top
Mapping a Class against Multiple Tables

The more general case of the pattern described in "table inheritance" is a mapper that maps against more than one table. The join keyword from the SQL package creates a neat selectable unit comprised of multiple tables, complete with its own composite primary key, which can be passed in to a mapper as the table.

# a class
class AddressUser(object):
    pass

# define a Join
j = join(users, addresses)

# map to it - the identity of an AddressUser object will be 
# based on (user_id, address_id) since those are the primary keys involved
m = mapper(AddressUser, j)

A second example:

# many-to-many join on an association table
j = join(users, userkeywords, 
        users.c.user_id==userkeywords.c.user_id).join(keywords, 
           userkeywords.c.keyword_id==keywords.c.keyword_id)
 
# a class 
class KeywordUser(object):
    pass

# map to it - the identity of a KeywordUser object will be
# (user_id, keyword_id) since those are the primary keys involved
m = mapper(KeywordUser, j)
back to section top
Mapping a Class against Arbitrary Selects

Similar to mapping against a join, a plain select() object can be used with a mapper as well. Below, an example select which contains two aggregate functions and a group_by is mapped to a class:

s = select([customers, 
            func.count(orders.c.customer_id).label('order_count'), 
            func.max(orders.c.price).label('highest_order')],
            customers.c.customer_id==orders.c.customer_id,
            group_by=[c for c in customers.c]
            )
class Customer(object):
    pass

mapper = mapper(Customer, s)

Above, the "customers" table is joined against the "orders" table to produce a full row for each customer row, the total count of related rows in the "orders" table, and the highest price in the "orders" table, grouped against the full set of columns in the "customers" table. That query is then mapped against the Customer class. New instances of Customer will contain attributes for each column in the "customers" table as well as an "order_count" and "highest_order" attribute. Updates to the Customer object will only be reflected in the "customers" table and not the "orders" table. This is because the primary keys of the "orders" table are not represented in this mapper and therefore the table is not affected by save or delete operations.

back to section top
Multiple Mappers for One Class

By now it should be apparent that the mapper defined for a class is in no way the only mapper that exists for that class. Other mappers can be created at any time, either explicitly or via the options method, to provide different loading behavior.

However, it's not quite that simple. The mapper serves a dual purpose: one is to generate select statements and load objects from executing those statements; the other is to keep track of the defined dependencies of that object when save and delete operations occur, and to extend the attributes of the object so that they store information about their history and communicate with the unit of work system. For this reason, it is a good idea to be aware of the behavior of multiple mappers. When creating dependency relationships between objects, one should ensure that only the primary mappers are used in those relationships, else deep object traversal operations will fail to load in the expected properties, and update operations will not take all the dependencies into account.

Generally, the rule is simple: the first mapper that is defined for a particular class is the one that gets to define that class's relationships to other mapped classes, and to decorate its attributes and constructor with special behavior. Any subsequent mappers created for that class will be able to load new instances, but object manipulation operations will still function via the original mapper. The special keyword is_primary will override this behavior, and make any mapper the new "primary" mapper.

class User(object):
    pass

# mapper one - mark it as "primary", meaning this mapper will handle
# saving and class-level properties
m1 = mapper(User, users, is_primary=True)

# mapper two - this one will also eager-load address objects in
m2 = mapper(User, users, properties={
        'addresses' : relation(mapper(Address, addresses), lazy=False)
    })

# get a user.  this user will not have an 'addresses' property
u1 = m1.select(User.c.user_id==10)

# get another user.  this user will have an 'addresses' property.
u2 = m2.select(User.c.user_id==27)

# make some modifications, including adding an Address object.
u1.user_name = 'jack'
u2.user_name = 'jane'
u2.addresses.append(Address('123 green street'))

# upon commit, the User objects will be saved. 
# the Address object will not, since the primary mapper for User
# does not have an 'addresses' relationship defined
objectstore.commit()
back to section top
Circular Mapping

Oftentimes it is necessary for two mappers to be related to each other. With a datamodel that consists of Users that store Addresses, you might have an Address object and want to access the "user" attribute on it, or have a User object and want to get the list of Address objects. The easiest way to do this is via the backreference keyword described in the Relation Options section above. However, even when backreferences are used, it is sometimes necessary to explicitly specify the relations on both mappers pointing to each other.

To achieve this involves creating the first mapper by itself, then creating the second mapper referencing the first, then adding references to the first mapper to reference the second:

class User(object):
    pass
class Address(object):
    pass
User.mapper = mapper(User, users)
Address.mapper = mapper(Address, addresses, properties={
    'user':relation(User.mapper)
})
User.mapper.add_property('addresses', relation(Address.mapper))

Note that with a circular relationship as above, you cannot declare both relationships as "eager" relationships, since that produces a circular query situation which will generate a recursion exception. So what if you want to load an Address and its User eagerly? Just make a second mapper using options:

eagermapper = Address.mapper.options(eagerload('user'))
s = eagermapper.select(Address.c.address_id==12)
back to section top
Self Referential Mappers

A self-referential mapper is a mapper that is designed to operate with an adjacency list table. This is a table that contains one or more foreign keys back to itself, and is usually used to create hierarchical tree structures. SQLAlchemy's default model of saving items based on table dependencies is not sufficient in this case, as an adjacency list table introduces dependencies between individual rows. Fortunately, SQLAlchemy will automatically detect a self-referential mapper and do the extra work needed to make it function.

# define a self-referential table
trees = Table('treenodes', engine,
    Column('node_id', Integer, primary_key=True),
    Column('parent_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('node_name', String(50), nullable=False),
    )

# treenode class
class TreeNode(object):
    pass

# mapper defines "children" property, pointing back to TreeNode class,
# with the mapper unspecified.  it will point back to the primary 
# mapper on the TreeNode class.
TreeNode.mapper = mapper(TreeNode, trees, properties={
        'children' : relation(
                        TreeNode, 
                        private=True
                     ),
        }
    )
    
# or, specify the circular relationship after establishing the original mapper:
mymapper = mapper(TreeNode, trees)

mymapper.add_property('children', relation(
                        mymapper, 
                        private=True
                     ))

This kind of mapper goes through a lot of extra effort when saving and deleting items, to determine the correct dependency graph of nodes within the tree.

A self-referential mapper where there is more than one relationship on the table requires that all join conditions be explicitly spelled out. Below is a self-referring table that contains a "parent_node_id" column to reference parent/child relationships, and a "root_node_id" column which points child nodes back to the ultimate root node:

# define a self-referential table with several relations
trees = Table('treenodes', engine,
    Column('node_id', Integer, primary_key=True),
    Column('parent_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('root_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('node_name', String(50), nullable=False),
    )

# treenode class
class TreeNode(object):
    pass

# define the "children" property as well as the "root" property
TreeNode.mapper = mapper(TreeNode, trees, properties={
        'children' : relation(
                        TreeNode, 
                        primaryjoin=trees.c.parent_node_id==trees.c.node_id,
                        private=True
                     ),
        'root' : relation(
                TreeNode,
                primaryjoin=trees.c.root_node_id==trees.c.node_id,
                foreignkey=trees.c.node_id,
                uselist=False
            )
        }
    )

The "root" property on a TreeNode is a many-to-one relationship. By default, a self-referential mapper declares relationships as one-to-many, so the extra parameter foreignkey, pointing to the "many" side of a relationship, is needed to indicate a "many-to-one" self-referring relationship.

Both TreeNode examples above are available in functional form in the examples/adjacencytree directory of the distribution.

back to section top
Result-Set Mapping

Take any result set and feed it into a mapper to produce objects. Multiple mappers can be combined to retrieve unrelated objects from the same row in one step. The instances method on mapper takes a ResultProxy object, which is the result type generated from SQLEngine, and delivers object instances.

Single Object
class User(object):
    pass

User.mapper = mapper(User, users)

# select users
c = users.select().execute()

# get objects
userlist = User.mapper.instances(c)
Multiple Objects
# define a second class/mapper
class Address(object):
    pass
    
Address.mapper = mapper(Address, addresses)

# select users and addresses in one query
s = select([users, addresses], users.c.user_id==addresses.c.user_id)

# execute it, and process the results with the User mapper, chained to the Address mapper
r = User.mapper.instances(s.execute(), Address.mapper)

# result rows are an array of objects, one for each mapper used
for entry in r:
    user = entry[0]
    address = entry[1]
back to section top
Mapper Arguments

Other arguments not covered above include:

  • version_id_col=None - an integer-holding Column object that will be assigned an incrementing counter, which is added to the WHERE clause used by UPDATE and DELETE statements. The matching row count returned by the database is compared to the expected row count, and an exception is raised if they don't match. This is a basic "optimistic concurrency" check. Without the version id column, SQLAlchemy still compares the updated rowcount.
  • always_refresh=False - this option will cause the mapper to refresh all the attributes of all objects loaded by select/get statements, regardless of whether they already exist in the current session. This includes all lazy- and eager-loaded relationship attributes, and will also overwrite any changes made to attributes on the objects.
  • entity_name=None - this is an optional "entity name" that will be appended to the key used to associate classes to this mapper. What this basically means is that several primary mappers can be made against the same class by using different entity names; object instances will have the entity name tagged to them, so that all operations will occur on them relative to that mapper. When instantiating new objects, use _sa_entity='name' to tag them to the appropriate mapper. A sketch of version_id_col and entity_name follows this list.
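
Below is a brief sketch of version_id_col and entity_name; the users_versioned table and its row_version column are hypothetical:

users_versioned = Table('users_versioned', db,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(40)),
    Column('row_version', Integer)
)

# optimistic concurrency: UPDATE and DELETE statements include the
# current row_version value in their WHERE clause, and an exception
# is raised if the matched row count is not as expected
m = mapper(User, users_versioned, version_id_col=users_versioned.c.row_version)

# a second primary mapper for the same class, keyed to an entity name
m2 = mapper(User, users, entity_name='plain')

# this instance will operate relative to m2
u = User(_sa_entity='plain')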
back to section top
Extending Mapper

A mapper can have its functionality augmented or replaced at many points in its execution via the use of the MapperExtension class. This class is just a series of "hooks" where various functionality takes place. An application can make its own MapperExtension objects, overriding only the methods it needs.

class MapperExtension(object):
    def create_instance(self, mapper, row, imap, class_):
        """called when a new object instance is about to be created from a row.  
        the method can choose to create the instance itself, or it can return 
        None to indicate normal object creation should take place.
        
        mapper - the mapper doing the operation
        row - the result row from the database
        imap - a dictionary that is storing the running set of objects collected from the
        current result set
        class_ - the class we are mapping.
        """
    def append_result(self, mapper, row, imap, result, instance, isnew, populate_existing=False):
        """called when an object instance is being appended to a result list.
        
        If it returns True, it is assumed that this method handled the appending itself.

        mapper - the mapper doing the operation
        row - the result row from the database
        imap - a dictionary that is storing the running set of objects collected from the
        current result set
        result - an instance of util.HistoryArraySet(), which may be an attribute on an
        object if this is a related object load (lazy or eager).  use result.append_nohistory(value)
        to append objects to this list.
        instance - the object instance to be appended to the result
        isnew - indicates if this is the first time we have seen this object instance in the current result
        set.  if you are selecting from a join, such as an eager load, you might see the same object instance
        many times in the same result set.
        populate_existing - usually False, indicates if object instances that were already in the main 
        identity map, i.e. were loaded by a previous select(), get their attributes overwritten
        """
    def before_insert(self, mapper, instance):
        """called before an object instance is INSERTed into its table.
        
        this is a good place to set up primary key values and such that aren't handled otherwise."""
    def after_insert(self, mapper, instance):
        """called after an object instance has been INSERTed"""
    def before_delete(self, mapper, instance):
        """called before an object instance is DELETEed"""

To use MapperExtension, make your own subclass of it and just send it off to a mapper:

mapper = mapper(User, users, extension=MyExtension())

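A minimal sketch of what such a subclass might look like; the timestamping behavior here is purely illustrative, and assumes a 'created' column on the mapped table:

import time

class MyExtension(MapperExtension):
    def before_insert(self, mapper, instance):
        # set the (assumed) 'created' attribute just before the INSERT
        instance.created = time.time()
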
An existing mapper can create a copy of itself using an extension via the extension option:

extended_mapper = mapper.options(extension(MyExtension()))
back to section top
How Mapper Modifies Mapped Classes

This section is a quick summary of what's going on when you send a class to the mapper() function. This material is not required in order to use SQLAlchemy; it is a little more dense and should be approached patiently!

The primary changes to a class that is mapped involve attaching property objects to it which represent table columns. These property objects essentially track changes. In addition, the __init__ method of the class is decorated so that object creation is tracked.

Here is a quick rundown of all the changes in code form:

# step 1 - override __init__ to 'register_new' with the Unit of Work
oldinit = myclass.__init__
def init(self, *args, **kwargs):
    nohist = kwargs.pop('_mapper_nohistory', False)
    oldinit(self, *args, **kwargs)
    if not nohist:
        # register_new with Unit Of Work
        objectstore.uow().register_new(self)
myclass.__init__ = init

# step 2 - set a string identifier that will 
# locate the class's primary mapper
myclass._mapper = mapper.hashkey

# step 3 - add column accessor
myclass.c = mapper.columns

# step 4 - attribute decorating.  
# this happens mostly within the package sqlalchemy.attributes

# this dictionary will store a series of callables 
# that generate "history" containers for
# individual object attributes
myclass._class_managed_attributes = {}

# create individual properties for each column - 
# these objects know how to talk 
# to the attribute package to create appropriate behavior.
# the next example examines the attributes package more closely.
myclass.column1 = SmartProperty().property('column1', uselist=False)
myclass.column2 = SmartProperty().property('column2', uselist=True)

The attribute package is used when save operations occur to get a handle on modified values. In the example below, a full round-trip attribute tracking operation is illustrated:

import sqlalchemy.attributes as attributes

# create an attribute manager.  
# the sqlalchemy.mapping package keeps one of these around as 
# 'objectstore.global_attributes'
manager = attributes.AttributeManager()

# regular old new-style class
class MyClass(object):
    pass

# register a scalar and a list attribute
manager.register_attribute(MyClass, 'column1', uselist=False)
manager.register_attribute(MyClass, 'column2', uselist=True)
    
# create/modify an object
obj = MyClass()
obj.column1 = 'this is a new value'
obj.column2.append('value 1')
obj.column2.append('value 2')

# get history objects
col1_history = manager.get_history(obj, 'column1')
col2_history = manager.get_history(obj, 'column2')

# what's new?
>>> col1_history.added_items()
['this is a new value']

>>> col2_history.added_items()
['value 1', 'value 2']

# commit changes
manager.commit(obj)

# the new values become the "unchanged" values
>>> col1_history.added_items()
[]

>>> col1_history.unchanged_items()
['this is a new value']

>>> col2_history.added_items()
[]

>>> col2_history.unchanged_items()
['value 1', 'value 2']

The above AttributeManager also includes a method value_changed which is triggered whenever change events occur on the managed object attributes. The Unit of Work (objectstore) package overrides this method in order to receive change events; it's essentially this:

import sqlalchemy.attributes as attributes
class UOWAttributeManager(attributes.AttributeManager):
    def value_changed(self, obj, key, value):
        if hasattr(obj, '_instance_key'):
            uow().register_dirty(obj)
        else:
            uow().register_new(obj)
            
global_attributes = UOWAttributeManager()

Objects that contain the attribute "_instance_key" are already registered with the Identity Map, and are assumed to have come from the database. They therefore get marked as "dirty" when changes happen. Objects without an "_instance_key" are not from the database, and get marked as "new" when changes happen, although usually this will already have occurred via the object's __init__ method.

back to section top


The package sqlalchemy.types defines the datatype identifiers which may be used when defining metadata. This package includes a set of generic types, a set of SQL-specific subclasses of those types, and a small extension system used by specific database connectors to adapt these generic types into database-specific type objects.

Built-in Types

SQLAlchemy comes with a set of standard generic datatypes, which are defined as classes.

The standard set of generic types are:

package sqlalchemy.types
class String(TypeEngine):
    def __init__(self, length=None)

class Integer(TypeEngine)

class SmallInteger(Integer)

class Numeric(TypeEngine): 
    def __init__(self, precision=10, length=2)

class Float(Numeric):
    def __init__(self, precision=10)

class DateTime(TypeEngine)

class Date(TypeEngine)

class Time(TypeEngine)

class Binary(TypeEngine): 
    def __init__(self, length=None)

class Boolean(TypeEngine)

# converts unicode strings to raw bytes
# as bind params, raw bytes to unicode as 
# rowset values, using the unicode encoding 
# setting on the engine (defaults to 'utf-8')
class Unicode(String)

# uses the pickle protocol to serialize data
# in/out of Binary columns
class PickleType(Binary)

More specific subclasses of these types are available, which various database engines may choose to implement specifically, allowing finer grained control over types:

class FLOAT(Numeric)
class TEXT(String)
class DECIMAL(Numeric)
class INT(Integer)
INTEGER = INT
class TIMESTAMP(DateTime)
class DATETIME(DateTime)
class CLOB(String)
class VARCHAR(String)
class CHAR(String)
class BLOB(Binary)
class BOOLEAN(Boolean)

When using a specific database engine, these types are adapted even further via a set of database-specific subclasses defined by the database engine. There may eventually be more type objects that are defined for specific databases. An example of this would be Postgres' Array type.

Type objects are specified to Table metadata using either the class itself, or an instance of the class. Creating an instance of the class allows you to specify parameters for the type, such as string length, numerical precision, etc.:

mytable = Table('mytable', engine, 
    # define type using a class
    Column('my_id', Integer, primary_key=True), 

    # define type using an object instance
    Column('value', Numeric(7,4))
)
back to section top
Creating your Own Types

User-defined types can be created to support either database-specific types or customized pre-processing of query parameters, as well as post-processing of result set data. You can make your own classes to perform these operations, specified by subclassing the desired type class:

Basic Example
import sqlalchemy.types as types

class MyType(types.String):
    """basic type that decorates String, prefixes values with "PREFIX:" on 
    the way in and strips it off on the way out."""
    def convert_bind_param(self, value, engine):
        return "PREFIX:" + value
    def convert_result_value(self, value, engine):
        return value[7:]

A common desire is for a "pickle" type, which overrides a Binary object to provide pickling behavior:

Pickle Type
import cPickle as pickle
from sqlalchemy.types import Binary

class PickleType(Binary):
    def __init__(self, protocol=pickle.HIGHEST_PROTOCOL):
        """allows the pickle protocol to be specified"""
        self.protocol = protocol
    def convert_result_value(self, value, engine):
        if value is None:
          return None
        buf = Binary.convert_result_value(self, value, engine)
        return pickle.loads(str(buf))
    def convert_bind_param(self, value, engine):
        if value is None:
          return None
        return Binary.convert_bind_param(self, pickle.dumps(value, self.protocol), engine)
    def get_constructor_args(self):
        return {}

Which can be used like:

mytable = Table('mytable', engine, 
        Column('id', Integer, primary_key=True),
        Column('data', PickleType()))

my_object = MyObject()
mytable.insert().execute(data=my_object)

Another example illustrates a fully defined datatype, which overrides the base type class TypeEngine directly:

import sqlalchemy.types as types

class MyType(types.TypeEngine):
    def __init__(self, precision = 8):
        self.precision = precision
    def get_col_spec(self):
        return "MYTYPE(%s)" % self.precision
    def convert_bind_param(self, value, engine):
        return value
    def convert_result_value(self, value, engine):
        return value
    def adapt_args(self):
        """allows for the adaptation of this TypeEngine object into a new kind of type depending on its arguments."""
        return self
back to section top

Module sqlalchemy.schema

The schema module provides the building blocks for database metadata. This means all the entities within a SQL database that we might want to look at, modify, or create and delete are described by these objects, in a database-agnostic way.

A structure of SchemaItems also provides a "visitor" interface which is the primary method by which other modules operate upon the schema. The SQL package extends this structure with its own clause-specific objects as well as the visitor interface, so that the schema package "plugs in" to the SQL package.

Class Column(ColumnClause)

represents a column in a database table. This is a subclass of sql.ColumnClause and represents an actual existing column in a database table, in a similar fashion as TableClause/Table.

def __init__(self, name, type, *args, **kwargs)

constructs a new Column object. Arguments are:

name : the name of this column. This should be the identical name as it appears, or will appear, in the database.

type : the type of this column. This can be any subclass of types.TypeEngine, including the database-agnostic types defined in the types module, database-specific types defined within specific database modules, or user-defined types.

*args : ForeignKey and Sequence objects should be added as list values.

**kwargs : keyword arguments include:

key=None : an optional "alias name" for this column. The column will then be identified everywhere in an application, including the column list on its Table, by this key, and not the given name. Generated SQL, however, will still reference the column by its actual name.

primary_key=False : True if this column is a primary key column. Multiple columns can have this flag set to specify composite primary keys.

nullable=True : True if this column should allow nulls. Defaults to True unless this column is a primary key column.

default=None : a scalar, python callable, or ClauseElement representing the "default value" for this column, which will be invoked upon insert if this column is not present in the insert list or is given a value of None.

hidden=False : indicates this column should not be listed in the table's list of columns. Used for the "oid" column, which generally isn't in column lists.

index=None : True or index name. Indicates that this column is indexed. Pass True to autogenerate the index name, or a string to specify the index name. Multiple columns that specify the same index name will all be included in the index, in the order of their creation.

unique=None : True or index name. Indicates that this column is indexed in a unique index. Pass True to autogenerate the index name, or a string to specify the index name. Multiple columns that specify the same index name will all be included in the index, in the order of their creation.

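A short sketch using several of these keyword arguments together; the table and columns here are illustrative only:

import datetime

mytable = Table('mytable', engine, 
    Column('id', Integer, primary_key=True),

    # 'desc' becomes the key used throughout the application;
    # generated SQL still references "description"
    Column('description', String(100), key='desc', nullable=False),

    # a python callable invoked at INSERT time when no value is present
    Column('created', DateTime, default=datetime.datetime.now),

    # two columns specifying the same index name join the same index
    Column('street', String(100), index='address_idx'),
    Column('city', String(80), index='address_idx')
)
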
def accept_schema_visitor(self, visitor)

traverses the given visitor to this Column's default and foreign key object, then calls visit_column on the visitor.

def append_item(self, item)

columns = property()

def copy(self)

creates a copy of this Column, uninitialized

engine = property()

original = property()

parent = property()

back to section top
Class ColumnDefault(DefaultGenerator)

A plain default value on a column. This could correspond to a constant, a callable function, or a SQL clause.

def __init__(self, arg, **kwargs)

def accept_schema_visitor(self, visitor)

calls the visit_column_default method on the given visitor.

back to section top
Class ForeignKey(SchemaItem)

defines a ForeignKey constraint between two columns. ForeignKey is specified as an argument to a Column object.

def __init__(self, column)

Constructs a new ForeignKey object. "column" can be a schema.Column object representing the relationship, or just its string name given as "tablename.columnname". A schema-qualified name can be given as "schemaname.tablename.columnname".

def accept_schema_visitor(self, visitor)

calls the visit_foreign_key method on the given visitor.

column = property()

def copy(self)

produces a copy of this ForeignKey object.

def references(self, table)

returns True if the given table is referenced by this ForeignKey.

back to section top
Class Index(SchemaItem)

Represents an index of columns from a database table

def __init__(self, name, *columns, **kw)

Constructs an index object. Arguments are:

name : the name of the index

*columns : columns to include in the index. All columns must belong to the same table, and no column may appear more than once.

**kw : keyword arguments include:

unique=True : create a unique index

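For example, a sketch of constructing and creating an index directly, assuming a 'mytable' with 'street' and 'city' columns:

# a composite index across two columns
i = Index('address_idx', mytable.c.street, mytable.c.city)

# issue the CREATE INDEX statement
i.create()
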
def accept_schema_visitor(self, visitor)

def append_column(self, column)

def create(self)

def drop(self)

def execute(self)

back to section top
Class PassiveDefault(DefaultGenerator)

a default that takes effect on the database side

def __init__(self, arg, **kwargs)

def accept_schema_visitor(self, visitor)

back to section top
Class SchemaEngine(AbstractEngine)

a factory object used to create implementations for schema objects. This object is the ultimate base class for the engine.SQLEngine class.

def __init__(self)

def reflecttable(self, table)

given a table, will query the database and populate its Column and ForeignKey objects.

def schemadropper(self, **params)

def schemagenerator(self, **params)

back to section top
Class SchemaItem(object)

base class for items that define a database schema.

back to section top
Class SchemaVisitor(ClauseVisitor)

defines the visiting for SchemaItem objects

def visit_column(self, column)

visit a Column.

def visit_column_default(self, default)

visit a ColumnDefault.

def visit_column_onupdate(self, onupdate)

visit a ColumnDefault with the "for_update" flag set.

def visit_foreign_key(self, join)

visit a ForeignKey.

def visit_index(self, index)

visit an Index.

def visit_passive_default(self, default)

visit a passive default

def visit_schema(self, schema)

visit a generic SchemaItem

def visit_sequence(self, sequence)

visit a Sequence.

def visit_table(self, table)

visit a Table.

back to section top
Class Sequence(DefaultGenerator)

represents a sequence, which applies to Oracle and Postgres databases.

def __init__(self, name, start=None, increment=None, optional=False, **kwargs)

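Sequence objects are typically placed in the argument list of a Column; a brief sketch:

info = Table('info', engine,
    # on databases with sequence support (Oracle, Postgres), values
    # for 'info_id' are drawn from the 'info_id_seq' sequence
    Column('info_id', Integer, Sequence('info_id_seq'), primary_key=True),
    Column('data', String(100))
)
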
def accept_schema_visitor(self, visitor)

calls the visit_sequence method on the given visitor.

def create(self)

def drop(self)

back to section top
Class Table(TableClause)

represents a relational database table. This subclasses sql.TableClause to provide a table that is "wired" to an engine. Whereas TableClause represents a table as it's used in a SQL expression, Table represents a table as it's created in the database. Be sure to look at sqlalchemy.sql.TableImpl for additional methods defined on a Table.

def __init__(self, name, engine, **kwargs)

Table objects can be constructed directly. The init method is actually called via the TableSingleton metaclass. Arguments are:

name : the name of this table, exactly as it appears, or will appear, in the database. This property, along with the "schema", indicates the "singleton identity" of this table. Further tables constructed with the same name/schema combination will return the same Table instance.

engine : a SchemaEngine instance to provide services to this table. Usually a subclass of sql.SQLEngine.

*args : should contain a listing of the Column objects for this table.

**kwargs : options include:

schema=None : the "schema name" for this table, which is required if the table resides in a schema other than the default selected schema for the engine's database connection.

autoload=False : the Columns for this table should be reflected from the database. Usually there will be no Column objects in the constructor if this property is set.

redefine=False : if this Table has already been defined in the application, clear out its columns and redefine with new arguments.

mustexist=False : indicates that this Table must already have been defined elsewhere in the application, else an exception is raised.

useexisting=False : indicates that if this Table was already defined elsewhere in the application, disregard the rest of the constructor arguments. If this flag and the "redefine" flag are not set, constructing the same table twice will result in an exception.

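A short sketch of two of these options:

# reflect the columns of an existing table from the database
users = Table('users', engine, autoload=True)

# the same name/engine combination returns the same Table instance;
# mustexist asserts that the table was already defined
users = Table('users', engine, mustexist=True)
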
def accept_schema_visitor(self, visitor)

traverses the given visitor across the Column objects inside this Table, then calls the visit_table method on the visitor.

def append_column(self, column)

def append_index(self, index)

def append_index_column(self, column, index=None, unique=None)

Add an index or a column to an existing index of the same name.

def append_item(self, item)

appends a Column item or other schema item to this Table.

def create(self, **params)

def deregister(self)

removes this table from its engine's table registry. This does not issue a SQL DROP statement.

def drop(self, **params)

def reload_values(self, *args)

clears out the columns and other properties of this Table, and reloads them from the given argument list. This is used with the "redefine" keyword argument sent to the metaclass constructor.

def toengine(self, engine, schema=None)

returns a singleton instance of this Table with a different engine

back to section top
Module sqlalchemy.engine

Defines the SQLEngine class, which serves as the primary "database" object used throughout the sql construction and object-relational mapper packages. A SQLEngine is a facade around a single connection pool corresponding to a particular set of connection parameters, and provides thread-local transactional methods and statement execution methods for Connection objects. It also provides a facade around a Cursor object to allow richer column selection for result rows as well as type conversion operations, known as a ResultProxy.

A SQLEngine is provided to an application as a subclass that is specific to a particular type of DBAPI, and is the central switching point for abstracting different kinds of database behavior into a consistent set of behaviors. It provides a variety of factory methods to produce everything specific to a certain kind of database, including a Compiler and schema creation/dropping objects.

The term "database-specific" will be used to describe any object or function that has behavior corresponding to a particular vendor, such as mysql-specific, sqlite-specific, etc.

Module Functions
def create_engine(name, opts=None, **kwargs)

creates a new SQLEngine instance. There are two forms of calling this method. In the first, the "name" argument is the type of engine to load, i.e. 'sqlite', 'postgres', 'oracle', 'mysql', and "opts" is a dictionary of options to be sent to the underlying DBAPI module to create a connection, usually including a hostname, username, password, etc. In the second, the "name" argument is a URL in the form <enginename>://opt1=val1&opt2=val2, where <enginename> is the name as above and the contents of the options dictionary are spelled out as a URL-encoded string; the "opts" argument is not used. In both cases, **kwargs represents options to be sent to the SQLEngine itself. A possibly partial listing of those options is as follows:

pool=None : an instance of sqlalchemy.pool.DBProxy or sqlalchemy.pool.Pool to be used as the underlying source for connections (DBProxy/Pool is described in the previous section). If None, a default DBProxy will be created using the engine's own database module with the given arguments.

echo=False : if True, the SQLEngine will log all statements as well as a repr() of their parameter lists to the engine's logger, which defaults to sys.stdout. A SQLEngine instance's "echo" data member can be modified at any time to turn logging on and off. If set to the string 'debug', result rows will be printed to the standard output as well.

logger=None : a file-like object where logging output can be sent, if echo is set to True. This defaults to sys.stdout.

module=None : used by Oracle and Postgres, this is a reference to a DBAPI2 module to be used instead of the engine's default module. For Postgres, the default is psycopg2, or psycopg1 if 2 cannot be found. For Oracle, it's cx_Oracle; for MySQL, MySQLdb.

use_ansi=True : used only by Oracle; when False, the Oracle driver attempts to support a particular "quirk" of some Oracle databases, that the LEFT OUTER JOIN SQL syntax is not supported, and the "Oracle join" syntax of using <column1>(+)=<column2> must be used in order to achieve a LEFT OUTER JOIN. It's advised that the Oracle database be configured to have full ANSI support instead of using this feature.
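
As a sketch, the two calling forms described above might look like the following (the sqlite option name is illustrative):

import sqlalchemy

# form one: engine name plus a dictionary of DBAPI connect options
engine = sqlalchemy.create_engine('sqlite', {'filename':'querytest.db'}, echo=True)

# form two: the same options spelled out as a URL-encoded string
engine = sqlalchemy.create_engine('sqlite://filename=querytest.db', echo=True)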

def engine_descriptors()

provides a listing of all the database implementations supported. this data is provided as a list of dictionaries, where each dictionary contains the following key/value pairs:

name : the name of the engine, suitable for use in the create_engine function

description : a plain description of the engine.

arguments : a dictionary describing the name and description of each parameter used to connect to this engine's underlying DBAPI.

This function is meant for usage in automated configuration tools that wish to query the user for database and connection information.

back to section top
Class SQLSession(object)

represents a handle to the SQLEngine's connection pool. the default SQLSession maintains a distinct connection during transactions, otherwise returns connections newly retrieved from the pool each time. the Pool is usually configured to have use_threadlocal=True, so if a particular connection is already checked out, you'll get that same connection in the same thread. There can also be a "unique" SQLSession pushed onto the engine, which returns a connection via the unique_connection() method on Pool; this allows nested transactions to take place, or other operations upon more than one connection at a time.

def __init__(self, engine, parent=None)

def begin(self)

begins a transaction on this SQLSession's connection. repeated calls to begin() will increment a counter that must be decreased by corresponding commit() statements before an actual commit occurs. this is to provide "nested" behavior of transactions so that different functions in a particular call stack can call begin()/commit() independently of each other without knowledge of an existing transaction.

def commit(self)

commits the transaction started by begin(). If begin() was called multiple times, a counter will be decreased for each call to commit(), with the actual commit operation occurring when the counter reaches zero. this is to provide "nested" behavior of transactions so that different functions in a particular call stack can call begin()/commit() independently of each other without knowledge of an existing transaction.

connection = property()

the connection represented by this SQLSession. The connection is late-connecting, meaning the call to the connection pool only occurs when this property is first accessed (and the pool will typically only establish a real connection the first time as well).

def is_begun(self)

def pop(self)

def rollback(self)

rolls back the transaction on this SQLSession's connection. this can be called regardless of the "begin" counter value, i.e. can be called from anywhere inside a callstack. the "begin" counter is cleared.
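
A sketch of the counter behavior described above, assuming "engine" is an existing SQLEngine (its session property returns the current thread's SQLSession):

session = engine.session
session.begin()      # counter is 1; transaction begins
session.begin()      # counter is 2; no new transaction
session.commit()     # counter drops to 1; nothing committed yet
session.commit()     # counter reaches 0; the actual commit occurs
session.rollback()   # legal at any point in a call stack; clears the counter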

back to section top
Class SQLEngine(SchemaEngine)

The central "database" object used by an application. Subclasses of this object are used by the schema and SQL construction packages to provide database-specific behaviors, as well as an execution and thread-local transaction context. SQLEngines are constructed via the create_engine() function inside this package.

def __init__(self, pool=None, echo=False, logger=None, default_ordering=False, echo_pool=False, echo_uow=False, convert_unicode=False, encoding='utf-8', **params)

constructs a new SQLEngine. SQLEngines should be constructed via the create_engine() function which will construct the appropriate subclass of SQLEngine.

def begin(self)

begins a transaction on the current thread SQLSession.

def commit(self)

def compile(self, statement, parameters, **kwargs)

given a sql.ClauseElement statement plus optional bind parameters, creates a new instance of this engine's SQLCompiler, compiles the ClauseElement, and returns the newly compiled object.

def compiler(self, statement, parameters)

returns a sql.ClauseVisitor which will produce a string representation of the given ClauseElement and parameter dictionary. This object is usually a subclass of ansisql.ANSICompiler. compiler is called within the context of the compile() method.

def connect_args(self)

subclasses override this method to provide a two-item tuple containing the *args and **kwargs used to establish a connection.

def connection(self)

returns a managed DBAPI connection from this SQLEngine's connection pool.

def create(self, entity, **params)

creates a table or index within this engine's database connection, given a schema.Table or schema.Index object.

def dbapi(self)

subclasses override this method to provide the DBAPI module used to establish connections.

def defaultrunner(self, proxy=None)

Returns a schema.SchemaVisitor instance that can execute the default values on a column. The base class for this visitor is the DefaultRunner class inside this module. This visitor will typically only receive schema.DefaultGenerator schema objects. The given proxy is a callable that takes a string statement and a dictionary of bind parameters to be executed. For engines that require positional arguments, the dictionary should be an instance of OrderedDict which returns its bind parameters in the proper order. defaultrunner is called within the context of the execute_compiled() method.

def dispose(self)

disposes of the underlying pool manager for this SQLEngine.

def do_begin(self, connection)

implementations might want to put logic here for turning autocommit on/off, etc.

def do_commit(self, connection)

implementations might want to put logic here for turning autocommit on/off, etc.

def do_rollback(self, connection)

implementations might want to put logic here for turning autocommit on/off, etc.

def drop(self, entity, **params)

drops a table or index within this engine's database connection, given a schema.Table or schema.Index object.

def execute(self, statement, parameters=None, connection=None, cursor=None, echo=None, typemap=None, commit=False, return_raw=False, **kwargs)

executes the given string-based SQL statement with the given parameters.

The parameters can be a dictionary or a list, or a list of dictionaries or lists, depending on the paramstyle of the DBAPI. If the current thread has specified a transaction begin() for this engine, the statement will be executed in the context of the current transactional connection. Otherwise, a commit() will be performed immediately after execution, since the local pooled connection is returned to the pool after execution without a transaction set up.

In all error cases, a rollback() is immediately performed on the connection before propagating the exception outwards.

Other options include:

connection - a DBAPI connection to use for the execute. If None, a connection is pulled from this engine's connection pool.

echo - enables echo for this execution, which causes all SQL and parameters to be dumped to the engine's logging output before execution.

typemap - a map of column names mapped to sqlalchemy.types.TypeEngine objects. These will be passed to the created ResultProxy to perform post-processing on result-set values.

commit - if True, will automatically commit the statement after completion.
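
A sketch of string-based execution; since the statement is passed through to the DBAPI, its bind parameter format must match the DBAPI's paramstyle (table and column names here are illustrative):

# a 'qmark' paramstyle DBAPI (e.g. sqlite) takes a list of parameters
result = engine.execute("SELECT * FROM users WHERE user_id = ?", [7])

# a 'pyformat' paramstyle DBAPI (e.g. psycopg) takes a dictionary
result = engine.execute("SELECT * FROM users WHERE user_id = %(id)s", {'id': 7})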

def execute_compiled(self, compiled, parameters, connection=None, cursor=None, echo=None, **kwargs)

executes the given compiled statement object with the given parameters.

The parameters can be a dictionary of key/value pairs, or a list of dictionaries for an executemany() style of execution. Engines that use positional parameters will convert the parameters to a list before execution.

If the current thread has specified a transaction begin() for this engine, the statement will be executed in the context of the current transactional connection. Otherwise, a commit() will be performed immediately after execution, since the local pooled connection is returned to the pool after execution without a transaction set up.

In all error cases, a rollback() is immediately performed on the connection before propagating the exception outwards.

Other options include:

connection - a DBAPI connection to use for the execute. If None, a connection is pulled from this engine's connection pool.

echo - enables echo for this execution, which causes all SQL and parameters to be dumped to the engine's logging output before execution.

typemap - a map of column names mapped to sqlalchemy.types.TypeEngine objects. These will be passed to the created ResultProxy to perform post-processing on result-set values.

commit - if True, will automatically commit the statement after completion.

func = property()

def get_default_schema_name(self)

returns the currently selected schema in the current connection.

def hash_key(self)

ischema = property()

returns an ISchema object for this engine, which allows access to information_schema tables (if supported)

def last_inserted_ids(self)

returns a thread-local list of the primary key values for the last insert statement executed. This does not apply to straight textual clauses; only to sql.Insert objects compiled against a schema.Table object, which are executed via statement.execute(). The order of items in the list is the same as that of the Table's 'primary_key' attribute. In some cases, this method may invoke a query back to the database to retrieve the data, based on the "lastrowid" value in the cursor.
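
For example (a sketch; "users" is an illustrative Table with an integer primary key, bound to this engine):

users.insert().execute(user_name='fred')
new_id = engine.last_inserted_ids()[0]   # primary key value of the row just inserted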

def last_inserted_params(self)

returns a dictionary of the full parameter dictionary for the last compiled INSERT statement, including any ColumnDefaults or Sequences that were pre-executed. this value is thread-local.

def last_updated_params(self)

returns a dictionary of the full parameter dictionary for the last compiled UPDATE statement, including any ColumnDefaults that were pre-executed. this value is thread-local.

def lastrow_has_defaults(self)

returns True if the last row INSERTED via a compiled insert statement contained PassiveDefaults, indicating that the database inserted data beyond that which we gave it. this value is thread-local.

def log(self, msg)

logs a message using this SQLEngine's logger stream.

def multi_transaction(self, tables, func)

provides a transaction boundary across tables which may be in multiple databases. If you have three tables, and a function that operates upon them, providing the tables as a list and the function will result in a begin()/commit() pair invoked for each distinct engine represented within those tables, and the function executed within the context of that transaction. any exceptions will result in a rollback(). clearly, this approach only goes so far; if database A commits and then database B commits and fails, A is already committed. Any failure conditions have to be raised before anyone commits for this to be useful.

name = property()

def oid_column_name(self)

returns the oid column name for this engine, or None if the engine can't or won't support OID/ROWID.

paramstyle = property()

def pop_session(self, s=None)

restores the current thread's SQLSession to that before the last push_session. Returns the restored SQLSession object. Raises an exception if there is no SQLSession pushed onto the stack.

def post_exec(self, proxy, compiled, parameters, **kwargs)

called by execute_compiled after the compiled statement is executed.

def pre_exec(self, proxy, compiled, parameters, **kwargs)

called by execute_compiled before the compiled statement is executed.

def proxy(self, statement=None, parameters=None)

returns a callable which will execute the given statement string and parameter object. the parameter object is expected to be the result of a call to compiled.get_params(). This callable is a generic version of a connection/cursor-specific callable that is produced within the execute_compiled method, and is used for objects that require this style of proxy when outside of an execute_compiled method, primarily the DefaultRunner.

def push_session(self)

pushes a new SQLSession onto this engine, temporarily replacing the previous one for the current thread. The previous session can be restored by calling pop_session(). this allows the usage of a new connection and possibly transaction within a particular block, superseding the existing one, including any transactions that are in progress. Returns the new SQLSession object.

def reflecttable(self, table)

given a Table object, reflects its columns and properties from the database.

def rollback(self)

rolls back the transaction on the current thread's SQLSession.

def schemadropper(self, **params)

returns a schema.SchemaVisitor instance that can drop schemas, when it is invoked to traverse a set of schema objects. schemadropper is called via the drop() method.

def schemagenerator(self, **params)

returns a schema.SchemaVisitor instance that can generate schemas, when it is invoked to traverse a set of schema objects. schemagenerator is called via the create() method.

session = property()

returns the current thread's SQLSession

def supports_sane_rowcount(self)

provided to indicate when MySQL is being used, since MySQL does not have standard behavior for the "rowcount" function on a statement handle.

def text(self, text, *args, **kwargs)

returns a sql.text() object for performing literal queries.

def transaction(self, func, *args, **kwargs)

executes the given function within a transaction boundary. this is a shortcut for explicitly calling begin() and commit() and optionally rollback() when exceptions are raised. The given *args and **kwargs will be passed to the function as well, which could be handy in constructing decorators.
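
A sketch of the shortcut, assuming "users" is a Table bound to this engine:

def populate(name):
    users.insert().execute(user_name=name)
    # an exception raised here results in a rollback()

engine.transaction(populate, 'fred')   # wrapped in begin()/commit()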

def type_descriptor(self, typeobj)

provides a database-specific TypeEngine object, given the generic object which comes from the types module. Subclasses will usually use the adapt_type() method in the types module to make this job easy.

def unique_connection(self)

returns a DBAPI connection from this SQLEngine's connection pool that is distinct from the current thread's connection.

back to section top
Class ResultProxy

wraps a DBAPI cursor object to provide access to row columns based on integer position, case-insensitive column name, or by schema.Column object, e.g.:

row = result.fetchone()
col1 = row[0]                  # access via integer position
col2 = row['col2']             # access via name
col3 = row[mytable.c.mycol]    # access via Column object

ResultProxy also contains a map of TypeEngine objects and will invoke the appropriate convert_result_value() method before returning columns.

def __init__(self, cursor, engine, typemap=None)

ResultProxy objects are constructed via the execute() method on SQLEngine.

def fetchall(self)

fetches all rows, just like DBAPI cursor.fetchall().

def fetchone(self)

fetches one row, just like DBAPI cursor.fetchone().

def last_inserted_ids(self)

def last_inserted_params(self)

def last_updated_params(self)

def lastrow_has_defaults(self)

def supports_sane_rowcount(self)

back to section top
Class RowProxy

proxies a single cursor row for a parent ResultProxy.

def __init__(self, parent, row)

RowProxy objects are constructed by ResultProxy objects.

def items(self)

def keys(self)

def values(self)

back to section top
Module sqlalchemy.sql

defines the base components of SQL expression trees.

Module Functions
def alias(*args, **params)

def and_(*clauses)

joins a list of clauses together by the AND operator. the & operator can be used as well.

def asc(column)

returns an ascending ORDER BY clause element, e.g.: order_by = [asc(table1.mycol)]

def between_(ctest, cleft, cright)

returns a BETWEEN predicate clause (clausetest BETWEEN clauseleft AND clauseright)

def bindparam(key, value=None, type=None)

creates a bind parameter clause with the given key. An optional default value can be specified by the value parameter, and the optional type parameter is a sqlalchemy.types.TypeEngine object which indicates bind-parameter and result-set translation for this bind parameter.

def cast(clause, totype, **kwargs)

returns a CAST function, i.e. CAST(clause AS totype). Use with a sqlalchemy.types.TypeEngine object, e.g. cast(table.c.unit_price * table.c.qty, Numeric(10,4)) or cast(table.c.timestamp, DATE)

def column(text, table=None, type=None)

returns a textual column clause, relative to a table. this is also the primitive version of a schema.Column which is a subclass.

def delete(table, whereclause=None, **kwargs)

returns a DELETE clause element. This can also be called from a table directly via the table's delete() method. 'table' is the table to be deleted from. 'whereclause' is a ClauseElement describing the WHERE condition of the DELETE statement.

def desc(column)

returns a descending ORDER BY clause element, e.g.: order_by = [desc(table1.mycol)]

def exists(*args, **params)

def insert(table, values=None, **kwargs)

returns an INSERT clause element. This can also be called from a table directly via the table's insert() method. 'table' is the table to be inserted into. 'values' is a dictionary which specifies the column specifications of the INSERT, and is optional. If left as None, the column specifications are determined from the bind parameters used during the compile phase of the INSERT statement. If the bind parameters also are None during the compile phase, then the column specifications will be generated from the full list of table columns.

If both 'values' and compile-time bind parameters are present, the compile-time bind parameters override the information specified within 'values' on a per-key basis.

The keys within 'values' can be either Column objects or their string identifiers. Each key may reference one of: a literal data value (i.e. string, number, etc.), a Column object, or a SELECT statement. If a SELECT statement is specified which references this INSERT statement's table, the statement will be correlated against the INSERT statement.
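
A sketch of the two equivalent construction styles (table and column names illustrative):

# column specifications given up front via 'values'
insert(users, values={'user_name': 'fred'}).execute()

# or determined from the bind parameters at execution time
users.insert().execute(user_name='fred')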

def join(left, right, onclause=None, **kwargs)

returns a JOIN clause element (regular inner join), given the left and right hand expressions, as well as the ON condition's expression. To chain joins together, use the resulting Join object's "join()" or "outerjoin()" methods.

def literal(value, type=None)

returns a literal clause, bound to a bind parameter. literal clauses are created automatically when used as the right-hand side of a boolean or math operation against a column object. use this function when a literal is needed on the left-hand side (and optionally on the right as well). the optional type parameter is a sqlalchemy.types.TypeEngine object which indicates bind-parameter and result-set translation for this literal.

def not_(clause)

returns a negation of the given clause, i.e. NOT(clause). the ~ operator can be used as well.

def null()

returns a Null object, which compiles to NULL in a sql statement.

def or_(*clauses)

joins a list of clauses together by the OR operator. the | operator can be used as well.

def outerjoin(left, right, onclause=None, **kwargs)

returns an OUTER JOIN clause element, given the left and right hand expressions, as well as the ON condition's expression. To chain joins together, use the resulting Join object's "join()" or "outerjoin()" methods.

def select(columns=None, whereclause=None, from_obj=[], **kwargs)

returns a SELECT clause element. this can also be called via the table's select() method.

'columns' is a list of columns and/or selectable items to select columns from

'whereclause' is a text or ClauseElement expression which will form the WHERE clause

'from_obj' is a list of additional "FROM" objects, such as Join objects, which will extend or override the default "from" objects created from the column list and the whereclause.

**kwargs - additional parameters for the Select object.
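
For example (a sketch; the "users" table is illustrative):

s = select([users.c.user_name], users.c.user_id == 7)   # WHERE users.user_id = 7
result = s.execute()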

def subquery(alias, *args, **kwargs)

def table(name, *columns)

returns a table clause. this is a primitive version of the schema.Table object, which is a subclass of this object.

def text(text, engine=None, *args, **kwargs)

creates literal text to be inserted into a query. When constructing a query from a select(), update(), insert() or delete(), using plain strings for argument values will usually result in text objects being created automatically. Use this function when creating textual clauses outside of other ClauseElement objects, or optionally wherever plain text is to be used. Arguments include:

text - the text of the SQL statement to be created. use :<param> to specify bind parameters; they will be compiled to their engine-specific format.

engine - an optional engine to be used for this text query. Alternatively, call the text() method off the engine directly.

bindparams - a list of bindparam() instances which can be used to define the types and/or initial values for the bind parameters within the textual statement; the keynames of the bindparams must match those within the text of the statement. The types will be used for pre-processing on bind values.

typemap - a dictionary mapping the names of columns represented in the SELECT clause of the textual statement to type objects, which will be used to perform post-processing on columns within the result set (for textual statements that produce result sets).
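
A sketch of standalone usage (assuming bind parameter values can be supplied at execution time, as with other ClauseElements; names illustrative):

t = text("SELECT user_id FROM users WHERE user_name = :name", engine=engine)
result = t.execute(name='fred')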

def union(*selects, **params)

def union_all(*selects, **params)

def update(table, whereclause=None, values=None, **kwargs)

returns an UPDATE clause element. This can also be called from a table directly via the table's update() method. 'table' is the table to be updated. 'whereclause' is a ClauseElement describing the WHERE condition of the UPDATE statement. 'values' is a dictionary which specifies the SET conditions of the UPDATE, and is optional. If left as None, the SET conditions are determined from the bind parameters used during the compile phase of the UPDATE statement. If the bind parameters also are None during the compile phase, then the SET conditions will be generated from the full list of table columns.

If both 'values' and compile-time bind parameters are present, the compile-time bind parameters override the information specified within 'values' on a per-key basis.

The keys within 'values' can be either Column objects or their string identifiers. Each key may reference one of: a literal data value (i.e. string, number, etc.), a Column object, or a SELECT statement. If a SELECT statement is specified which references this UPDATE statement's table, the statement will be correlated against the UPDATE statement.
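
A sketch of both styles (names illustrative):

# SET conditions given via 'values'; keys may be Column objects
update(users, users.c.user_id == 7, values={users.c.user_name: 'ed'}).execute()

# or determined from bind parameters at execution time
users.update(users.c.user_id == 7).execute(user_name='ed')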

back to section top
Class ClauseParameters(OrderedDict)

represents a dictionary/iterator of bind parameter key names/values. Includes parameters compiled with a Compiled object as well as additional arguments passed to the Compiled object's get_params() method. Parameter values will be converted as per the TypeEngine objects present in the bind parameter objects. The non-converted value can be retrieved via the get_original method. For Compiled objects that compile positional parameters, the values() iteration of the object will return the parameter values in the correct order.

def __init__(self, engine=None)

def get_original(self, key)

def get_original_dict(self)

def get_raw_dict(self)

def set_parameter(self, key, value, bindparam)

def values(self)

back to section top
Class Compiled(ClauseVisitor)

represents a compiled SQL expression. the __str__ method of the Compiled object should produce the actual text of the statement. Compiled objects are specific to the database library that created them, and also may or may not be specific to the columns referenced within a particular set of bind parameters. In no case should the Compiled object be dependent on the actual values of those bind parameters, even though it may reference those values as defaults.

def __init__(self, statement, parameters, engine=None)

constructs a new Compiled object.

statement - ClauseElement to be compiled.

parameters - optional dictionary indicating a set of bind parameters specified with this Compiled object. These parameters are the "default" values corresponding to the ClauseElement's BindParamClauses when the Compiled is executed. In the case of an INSERT or UPDATE statement, these parameters will also result in the creation of new BindParamClause objects for each key, and will also affect the generated column list in an INSERT statement and the SET clauses of an UPDATE statement. The keys of the parameter dictionary can either be the string names of columns or ColumnClause objects.

engine - optional SQLEngine to compile this statement against.

def compile(self)

def execute(self, *multiparams, **params)

executes this compiled object using the AbstractEngine it is bound to.

def get_params(self, **params)

returns the bind params for this compiled object. Will start with the default parameters specified when this Compiled object was first constructed, and will override those values with those sent via **params, which are key/value pairs. Each key should match one of the BindParamClause objects compiled into this object; either the "key" or "shortname" property of the BindParamClause.

def scalar(self, *multiparams, **params)

executes this compiled object via the execute() method, then returns the first column of the first row. Useful for executing functions, sequences, rowcounts, etc.

back to section top
Class ClauseElement(object)

base class for elements of a programmatically constructed SQL expression.

def accept_visitor(self, visitor)

accepts a ClauseVisitor and calls the appropriate visit_xxx method.

def compare(self, other)

compares this ClauseElement to the given ClauseElement. Subclasses should override the default behavior, which is a straight identity comparison.

def compile(self, engine=None, parameters=None, typemap=None, compiler=None)

compiles this SQL expression using its underlying SQLEngine to produce a Compiled object. If no engine can be found, an ANSICompiler is used with no engine. 'parameters' is a dictionary representing the default bind parameters to be used with the statement.

def copy_container(self)

should return a copy of this ClauseElement, iff this ClauseElement contains other ClauseElements. Otherwise, it should be left alone to return self. This is used to create copies of expression trees that still reference the same "leaf nodes". The new structure can then be restructured without affecting the original.

engine = property()

attempts to locate a SQLEngine within this ClauseElement structure, or returns None if none found.

def execute(self, *multiparams, **params)

def is_selectable(self)

returns True if this ClauseElement is Selectable, i.e. it contains a list of Column objects and can be used as the target of a select statement.

def scalar(self, *multiparams, **params)

def using(self, abstractengine)

back to section top
Class TableClause(FromClause)

def __init__(self, name, *columns)

def accept_visitor(self, visitor)

def alias(self, name=None)

def append_column(self, c)

c = property()

columns = property()

def count(self, whereclause=None, **params)

def delete(self, whereclause=None)

foreign_keys = property()

indexes = property()

def insert(self, values=None)

def join(self, right, *args, **kwargs)

original_columns = property()

def outerjoin(self, right, *args, **kwargs)

primary_key = property()

def select(self, whereclause=None, **params)

def update(self, whereclause=None, values=None)

back to section top
Class ColumnClause(ColumnElement)

represents a textual column clause in a SQL statement. May or may not be bound to an underlying Selectable.

def __init__(self, text, selectable=None, type=None)

def accept_visitor(self, visitor)

def to_selectable(self, selectable)

given a Selectable, returns this column's equivalent in that Selectable, if any. for example, this could translate the column "name" from a Table object to an Alias of a Select off of that Table object.

back to section top
Module sqlalchemy.pool

provides a connection pool implementation, which optionally manages connections on a thread local basis. Also provides a DBAPI2 transparency layer so that pools can be managed automatically, based on module type and connect arguments, simply by calling regular DBAPI connect() methods.

Module Functions
def clear_managers()

removes all current DBAPI2 managers. all pools and connections are disposed.

def manage(module, **params)

given a DBAPI2 module and pool management parameters, returns a proxy for the module that will automatically pool connections. Options are delivered to an underlying DBProxy object.

Arguments:

module : a DBAPI2 database module.

Options:

echo=False : if set to True, connections being pulled and retrieved from/to the pool will be logged to the standard output, as well as pool sizing information.

use_threadlocal=True : if set to True, repeated calls to connect() within the same application thread will be guaranteed to return the same connection object, if one has already been retrieved from the pool and has not been returned yet. This allows code to retrieve a connection from the pool, and then while still holding on to that connection, to call other functions which also ask the pool for a connection of the same arguments; those functions will act upon the same connection that the calling method is using.

poolclass=QueuePool : the default class used by the pool module to provide pooling. QueuePool uses the Python Queue.Queue class to maintain a list of available connections.

pool_size=5 : used by QueuePool - the size of the pool to be maintained. This is the largest number of connections that will be kept persistently in the pool. Note that the pool begins with no connections; once this number of connections is requested, that number of connections will remain.

max_overflow=10 : the maximum overflow size of the pool. When the number of checked-out connections reaches the size set in pool_size, additional connections will be returned up to this limit. When those additional connections are returned to the pool, they are disconnected and discarded. It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow, and the total number of "sleeping" connections the pool will allow is pool_size. max_overflow can be set to -1 to indicate no overflow limit; no limit will be placed on the total number of concurrent connections.
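
A usage sketch (the DBAPI module and connect arguments are illustrative; connect arguments pass straight through to the underlying module):

import sqlalchemy.pool as pool
import psycopg2

psycopg = pool.manage(psycopg2, pool_size=5, max_overflow=10, use_threadlocal=True)

# acts like psycopg2.connect(), but pooled per connect argument set
connection = psycopg.connect(database='test', user='scott', password='tiger')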

back to section top
Class DBProxy(object)

proxies a DBAPI2 connect() call to a pooled connection keyed to the specific connect parameters.

def __init__(self, module, poolclass=QueuePool, **params)

module is a DBAPI2 module. poolclass is a Pool class, defaulting to QueuePool. Other parameters are sent to the Pool object's constructor.

def close(self)

def connect(self, *args, **params)

connects to a database using this DBProxy's module and the given connect arguments. if the arguments match an existing pool, the connection will be returned from the pool's current thread-local connection instance, or if there is no thread-local connection instance it will be checked out from the set of pooled connections. If the pool has no available connections and allows new connections to be created, a new database connection will be made.

def dispose(self, *args, **params)

disposes the connection pool referenced by the given connect arguments.

def get_pool(self, *args, **params)

back to section top
Class Pool(object)

def __init__(self, echo=False, use_threadlocal=True, logger=None)

def connect(self)

def do_get(self)

def do_return_conn(self, conn)

def do_return_invalid(self)

def get(self)

def log(self, msg)

def return_conn(self, conn)

def return_invalid(self)

def status(self)

def unique_connection(self)

back to section top
Class QueuePool(Pool)

uses Queue.Queue to maintain a fixed-size list of connections.

def __init__(self, creator, pool_size=5, max_overflow=10, **params)

def checkedin(self)

def checkedout(self)

def do_get(self)

def do_return_conn(self, conn)

def do_return_invalid(self)

def overflow(self)

def size(self)

def status(self)

back to section top
Class SingletonThreadPool(Pool)

Maintains one connection per thread, never moving a connection to another thread. this is used for SQLite and other databases with a similar restriction.

def __init__(self, creator, **params)

def do_get(self)

def do_return_conn(self, conn)

def do_return_invalid(self)

def status(self)

back to section top
Module sqlalchemy.mapping

the mapper package provides object-relational functionality, building upon the schema and sql packages and tying operations to class properties and constructors.

Module Functions
def assign_mapper(class_, *args, **params)

def backref(name, **kwargs)

def cascade_mappers(*classes_or_mappers)

given a list of classes and/or mappers, identifies the foreign key relationships between the given mappers or corresponding class mappers, and creates relation() objects representing those relationships, including a backreference. Attempts to find the "secondary" table in a many-to-many relationship as well. The names of the relations will be a lowercase version of the related class. In the case of one-to-many or many-to-many, the name will be "pluralized", which currently is based on the English language (i.e. an 's' or 'es' added to it).

def class_mapper(class_, entity_name=None)

given a ClassKey, returns the primary Mapper associated with the key.

def clear_mappers()

removes all mappers that have been created thus far. when new mappers are created, they will be assigned to their classes as their primary mapper.

def defer(name, **kwargs)

returns a MapperOption that will convert the column property of the given name into a deferred load. Used with mapper.options()

def deferred(*columns, **kwargs)

returns a DeferredColumnProperty, which indicates that the object attribute should only be loaded from its corresponding table column when first accessed.

def eagerload(name, **kwargs)

returns a MapperOption that will convert the property of the given name into an eager load. Used with mapper.options()

def extension(ext)

returns a MapperOption that will add the given MapperExtension to the mapper returned by mapper.options().

def lazyload(name, **kwargs)

returns a MapperOption that will convert the property of the given name into a lazy load. Used with mapper.options()

def mapper(class_, table=None, *args, **params)

returns a new or already cached Mapper object.

def noload(name, **kwargs)

returns a MapperOption that will convert the property of the given name into a non-load. Used with mapper.options()

def object_mapper(object)

given an object, returns the primary Mapper associated with the object or the object's class.

def relation(*args, **kwargs)

provides a relationship of a primary Mapper to a secondary Mapper, which corresponds to a parent-child or associative table relationship.
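
A sketch of typical usage, assuming User/Address classes and users/addresses Tables already exist (the property name is illustrative):

mapper(User, users, properties = {
    'addresses' : relation(Address, addresses, lazy=True)
})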

def undefer(name, **kwargs)

returns a MapperOption that will convert the column property of the given name into a non-deferred (regular column) load. Used with mapper.options.

back to section top
Class Mapper(object)

Persists object instances to and from schema.Table objects via the sql package. Instances of this class should be constructed through this package's mapper() or relation() function.

def __init__(self, class_, table, primarytable=None, properties=None, primary_key=None, is_primary=False, inherits=None, inherit_condition=None, extension=None, order_by=False, allow_column_override=False, entity_name=None, always_refresh=False, version_id_col=None, **kwargs)

def add_property(self, key, prop)

adds an additional property to this mapper. this is the same as if it were specified within the 'properties' argument to the constructor. if the named property already exists, this will replace it. Useful for circular relationships, or overriding the parameters of auto-generated properties such as backreferences.

def compile(self, whereclause=None, **options)

works like select, except returns the SQL statement object without compiling or executing it

def copy(self, **kwargs)

def count(self, whereclause=None, params=None, **kwargs)

calls count() on this mapper's default Query object.

def count_by(self, *args, **params)

calls count_by() on this mapper's default Query object.

def delete_obj(self, objects, uow)

called by a UnitOfWork object to delete objects, which involves a DELETE statement for each table used by this mapper, for each object in the list.

def get(self, *ident, **kwargs)

calls get() on this mapper's default Query object.

def get_by(self, *args, **params)

calls get_by() on this mapper's default Query object.

def has_eager(self)

returns True if one of the properties attached to this Mapper is eager loading

def identity(self, instance)

returns the identity (list of primary key values) for the given instance. The list of values can be fed directly into the get() method as mapper.get(*key).

def identity_key(self, *primary_key)

returns the instance key for the given identity value. this is a global tracking object used by the objectstore, and is usually available off a mapped object as instance._instance_key.

def instance_key(self, instance)

returns the instance key for the given instance. this is a global tracking object used by the objectstore, and is usually available off a mapped object as instance._instance_key.

def instances(self, cursor, *mappers, **kwargs)

given a cursor (ResultProxy) from an SQLEngine, returns a list of object instances corresponding to the rows in the cursor.

def is_assigned(self, instance)

returns True if this mapper is the primary mapper for the given instance. this is dependent not only on class assignment but the optional "entity_name" parameter as well.

def options(self, *options, **kwargs)

uses this mapper as a prototype for a new mapper with different behavior. *options is a list of options directives, which include eagerload(), lazyload(), and noload()
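
For example (a sketch; "usermapper" and the 'addresses' property are illustrative):

eagerusers = usermapper.options(eagerload('addresses'))
result = eagerusers.select()   # 'addresses' now loads via eager loading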

def populate_instance(self, session, instance, row, identitykey, imap, isnew, frommapper=None)

query = property()

returns an instance of sqlalchemy.mapping.query.Query, which implements all the query-constructing methods such as get(), select(), select_by(), etc. The default Query object uses the global thread-local Session from the objectstore package. To get a Query object for a specific Session, call the using(session) method.

def register_deleted(self, obj, uow)

def register_dependencies(self, uowcommit, *args, **kwargs)

called by an instance of objectstore.UOWTransaction to register which mappers are dependent on which, as well as DependencyProcessor objects which will process lists of objects in between saves and deletes.

def save_obj(self, objects, uow, postupdate=False)

called by a UnitOfWork object to save objects, which involves either an INSERT or an UPDATE statement for each table used by this mapper, for each element of the list.

def select(self, arg=None, **kwargs)

calls select() on this mapper's default Query object.

def select_by(self, *args, **params)

calls select_by() on this mapper's default Query object.

def select_statement(self, statement, **params)

calls select_statement() on this mapper's default Query object.

def select_text(self, text, **params)

def select_whereclause(self, whereclause=None, params=None, **kwargs)

calls select_whereclause() on this mapper's default Query object.

def selectfirst(self, *args, **params)

calls selectfirst() on this mapper's default Query object.

def selectfirst_by(self, *args, **params)

calls selectfirst_by() on this mapper's default Query object.

def selectone(self, *args, **params)

calls selectone() on this mapper's default Query object.

def selectone_by(self, *args, **params)

calls selectone_by() on this mapper's default Query object.

def set_property(self, key, prop)

def translate_row(self, tomapper, row)

attempts to take a row and translate its values to a row that can be understood by another mapper. breaks the column references down to their bare keynames to accomplish this. So far this works for the various polymorphic examples.

def using(self, session)

returns a new Query object with the given Session.

back to section top
Class MapperExtension(object)

def __init__(self)

def after_insert(self, mapper, instance)

called after an object instance has been INSERTed

def after_update(self, mapper, instance)

called after an object instance is UPDATED

def append_result(self, mapper, row, imap, result, instance, isnew, populate_existing=False)

called when an object instance is being appended to a result list. If this method returns True, it is assumed that the mapper should do the appending, else if this method returns False, it is assumed that the append was handled by this method.

mapper - the mapper doing the operation

row - the result row from the database

imap - a dictionary that is storing the running set of objects collected from the current result set

result - an instance of util.HistoryArraySet(), which may be an attribute on an object if this is a related object load (lazy or eager). use result.append_nohistory(value) to append objects to this list.

instance - the object instance to be appended to the result

isnew - indicates if this is the first time we have seen this object instance in the current result set. if you are selecting from a join, such as an eager load, you might see the same object instance many times in the same result set.

populate_existing - usually False, indicates if object instances that were already in the main identity map, i.e. were loaded by a previous select(), get their attributes overwritten

def before_delete(self, mapper, instance)

called before an object instance is DELETEd

def before_insert(self, mapper, instance)

called before an object instance is INSERTed into its table. this is a good place to set up primary key values and such that arent handled otherwise.

def before_update(self, mapper, instance)

called before an object instance is UPDATED

def chain(self, ext)

def create_instance(self, mapper, row, imap, class_)

called when a new object instance is about to be created from a row. the method can choose to create the instance itself, or it can return None to indicate normal object creation should take place.

mapper - the mapper doing the operation

row - the result row from the database

imap - a dictionary that is storing the running set of objects collected from the current result set

class_ - the class we are mapping.

def populate_instance(self, mapper, session, instance, row, identitykey, imap, isnew)

called right before the mapper, after creating an instance from a row, passes the row to its MapperProperty objects which are responsible for populating the object's attributes. If this method returns True, it is assumed that the mapper should do the population; if it returns False, the population was handled by this method. Essentially, this method is used to have a different mapper populate the object:

def populate_instance(self, mapper, session, instance, row, identitykey, imap, isnew):
    othermapper.populate_instance(session, instance, row, identitykey, imap, isnew, frommapper=mapper)
    return True

def select(self, query, *args, **kwargs)

overrides the select method of the Query object

def select_by(self, query, *args, **kwargs)

overrides the select_by method of the Query object

back to section top
Module sqlalchemy.mapping.query

Class Query(object)

encapsulates the object-fetching operations provided by Mappers.

def __init__(self, mapper, **kwargs)

def count(self, whereclause=None, params=None, **kwargs)

def count_by(self, *args, **params)

returns the count of instances based on the given clauses and key/value criterion. The criterion is constructed in the same way as the select_by() method.

def get(self, *ident, **kwargs)

returns an instance of the object based on the given identifier, or None if not found. The *ident argument is a list of primary key columns in the order of the table def's primary key columns.

def get_by(self, *args, **params)

returns a single object instance based on the given key/value criterion. this is either the first value in the result list, or None if the list is empty.

the keys are mapped to property or column names mapped by this mapper's Table, and the values are coerced into a WHERE clause separated by AND operators. If the local property/column names don't contain the key, a search will be performed against this mapper's immediate list of relations as well, forming the appropriate join conditions if a matching property is located.

e.g. u = usermapper.get_by(user_name = 'fred')

def instances(self, *args, **kwargs)

props = property()

def select(self, arg=None, **kwargs)

selects instances of the object from the database.

arg can be any ClauseElement, which will form the criterion with which to load the objects.

For more advanced usage, arg can also be a Select statement object, which will be executed and its resulting rowset used to build new object instances. in this case, the developer must ensure that an adequate set of columns exists in the rowset with which to build new object instances.

def select_by(self, *args, **params)

returns an array of object instances based on the given clauses and key/value criterion.

*args is a list of zero or more ClauseElements which will be connected by AND operators. **params is a set of zero or more key/value parameters which are converted into ClauseElements. the keys are mapped to property or column names mapped by this mapper's Table, and the values are coerced into a WHERE clause separated by AND operators. If the local property/column names don't contain the key, a search will be performed against this mapper's immediate list of relations as well, forming the appropriate join conditions if a matching property is located.

e.g. result = usermapper.select_by(user_name = 'fred')

def select_statement(self, statement, **params)

def select_text(self, text, **params)

def select_whereclause(self, whereclause=None, params=None, **kwargs)

def selectfirst(self, *args, **params)

works like select(), but only returns the first result by itself, or None if no objects returned.

def selectfirst_by(self, *args, **params)

works like select_by(), but only returns the first result by itself, or None if no objects returned. Synonymous with get_by()

def selectone(self, *args, **params)

works like selectfirst(), but throws an error if not exactly one result was returned.

def selectone_by(self, *args, **params)

works like selectfirst_by(), but throws an error if not exactly one result was returned.

session = property()

table = property()

back to section top
Module sqlalchemy.mapping.objectstore

provides the Session object and a function-oriented convenience interface. This is the "front-end" to the Unit of Work system in unitofwork.py. Issues of "scope" are dealt with here, primarily through the important function "get_session()", which is where mappers and units of work go to get a handle on the current thread-local context.

Module Functions
def begin()

deprecated. use s = Session(new_imap=False).

def class_mapper(class_)

def clear()

removes all current UnitOfWorks and IdentityMaps for this thread and establishes a new one. It is probably a good idea to discard all current mapped object instances, as they are no longer in the Identity Map.

def commit(*obj)

deprecated; use flush(*obj)

def delete(*obj)

registers the given objects to be deleted upon the next commit

def expire(*obj)

invalidates the data in the given objects and sets them to refresh themselves the next time they are requested.

def expunge(*obj)

def flush(*obj)

flushes the current UnitOfWork transaction. if a transaction was begun via begin(), flushes only those objects that were created, modified, or deleted since that begin statement. otherwise flushes all objects that have been changed.

if individual objects are submitted, then only those objects are committed, and the begin/commit cycle is not affected.
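
A sketch, assuming "User" is a mapped class (new instances register themselves with the current thread's UnitOfWork):

from sqlalchemy.mapping import objectstore

u = User()
u.user_name = 'fred'
objectstore.flush()     # saves everything new, modified or deleted
objectstore.flush(u)    # alternatively, saves only the given object(s)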

def get_id_key(ident, class_, entity_name=None)

def get_row_key(row, class_, primary_key, entity_name=None)

def get_session(obj=None)

def has_instance(instance)

returns True if the current thread-local IdentityMap contains the given instance

def has_key(key)

returns True if the current thread-local IdentityMap contains the given instance key

def import_instance(instance)

def instance_key(instance)

returns the IdentityMap key for the given instance

def is_dirty(obj)

returns True if the given object is in the current UnitOfWork's new or dirty list, or if it's a modified list attribute on an object.

def mapper(*args, **params)

def object_mapper(obj)

def pop_session()

def push_session(sess)

def refresh(*obj)

reloads the state of the given objects from the database, and cancels any in-memory changes.

def using_session(sess, func)

back to section top
Class LegacySession(Session)

def __init__(self, nest_on=None, hash_key=None, **kwargs)

def begin(self)

begins a new UnitOfWork transaction and returns a transaction-holding object. commit() or rollback() should be called on the returned object. commit() on the Session will do nothing while a transaction is pending, and further calls to begin() will return no-op transactional objects.

def commit(self, *objects)

commits the current UnitOfWork transaction. called with no arguments, this is only used for "implicit" transactions when there was no begin(). if individual objects are submitted, then only those objects are committed, and the begin/commit cycle is not affected.

def was_popped(self)

def was_pushed(self)

back to section top
Class SessionTrans(object)

returned by Session.begin(), denotes a transactionalized UnitOfWork instance. call commit() on this to commit the transaction.

def __init__(self, parent, uow, isactive)

def begin(self)

calls begin() on the underlying Session object, returning a new no-op SessionTrans object.

def commit(self)

commits the transaction noted by this SessionTrans object.

isactive = property()

True if this SessionTrans is the 'active' transaction marker, else it's a no-op.

parent = property()

returns the parent Session of this SessionTrans object.

def rollback(self)

rolls back the current UnitOfWork transaction, in the case that begin() has been called. The changes logged since the begin() call are discarded.

uow = property()

returns the parent UnitOfWork corresponding to this transaction.

back to section top
Module sqlalchemy.exceptions

Class ArgumentError

raised for all those conditions where invalid arguments are sent to constructed objects. This error generally corresponds to construction time state errors.

back to section top
Class AssertionError

raised when internal state is detected to be in an invalid condition

back to section top
Class CommitError

raised when an invalid condition is detected upon a commit()

back to section top
Class DBAPIError

something weird happened with a particular DBAPI version

back to section top
Class InvalidRequestError

sqlalchemy was asked to do something it can't do, such as return nonexistent data, etc. This error generally corresponds to runtime state errors.

back to section top
Class SQLAlchemyError

generic error class

back to section top
Class SQLError

raised when the execution of a SQL statement fails. includes accessors for the underlying exception, as well as the SQL and bind parameters

def __init__(self, statement, params, orig)
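
A sketch of catching the error, assuming the constructor arguments are kept as attributes of the same names:

import sqlalchemy.exceptions as exceptions

try:
    engine.execute("SELECT * FROM nonexistent_table", {})
except exceptions.SQLError, e:
    print e.statement, e.params   # the failing SQL and its bind parameters
    print e.orig                  # the underlying DBAPI exception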

back to section top
Module sqlalchemy.ext.proxy

Module Functions
def create_engine(name, opts=None, **kwargs)

creates a new SQLEngine instance. This is the same create_engine function documented under Module sqlalchemy.engine above.

back to section top
Class AutoConnectEngine(BaseProxyEngine)

An SQLEngine proxy that automatically connects when necessary.

def __init__(self, dburi, opts=None, **kwargs)

def get_engine(self)

back to section top
Class BaseProxyEngine(SchemaEngine)

Basis for all proxy engines

def compiler(self, *args, **kwargs)

engine = property()

def execute_compiled(self, *args, **kwargs)

def get_engine(self)

def hash_key(self)

def oid_column_name(self)

def reflecttable(self, table)

def schemadropper(self, *args, **kwargs)

def schemagenerator(self, *args, **kwargs)

def set_engine(self, engine)

back to section top
Class ProxyEngine(BaseProxyEngine)

SQLEngine proxy. Supports lazy and late initialization by delegating to a real engine (set with connect()), and using proxy classes for TypeEngine.

def __init__(self, **kwargs)

def connect(self, uri, opts=None, **kwargs)

Establish connection to a real engine.
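
A sketch of the deferred-connection pattern (the URL is illustrative):

from sqlalchemy.ext.proxy import ProxyEngine

engine = ProxyEngine()
# tables and mappers may be defined against the proxy here, before
# any real database has been chosen...

engine.connect('sqlite://filename=querytest.db')   # now delegates to a real engine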

def get_engine(self)

def set_engine(self, engine)

back to section top
Class TypeEngine(object)

def __init__(self, *args, **kwargs)

def adapt_args(self)

def convert_bind_param(self, value, engine)

def convert_result_value(self, value, engine)

def engine_impl(self, engine)

def get_col_spec(self)

def get_constructor_args(self)

impl = property()

impl_dict = property()

def set_impl(self, impltype)

back to section top