-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
WindowFunctionsByDefault
note: - See also http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery for a more ad-hoc version of this technique.
note: - this recipe requires at least SQLAlchemy version 0.6.2.
SQL 2003 introduces the concept of a "window function", that is a function which allows criteria against the current result row in the result returned by a SELECT statement. Per http://troels.arvin.dk/db/rdbms/#select-limit-offset, the window function ROW_NUMBER() is used as the SQL standard method for paginating result sets.
Oracle and SQL Server have supported window functions for many years. Postgresql 8.4 supports them, although testing reveals that PG's OFFSET construct, while having inherent performance issues, is still dramatically faster than the ROW_NUMBER() approach.
SQLAlchemy's current behavior (0.6.1) does not use window functions for LIMIT/OFFSET, instead using the database specific keywords provided, and on Oracle we currently use a scheme recommended by a prominent Oracle engineer based on the special ROWNUM variable (http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html).
The recipe below allows one to replace all database-dependent LIMIT/OFFSET behavior with a single method that renders a ROW_NUMBER(). It's based on our Oracle LIMIT/OFFSET implementation previous to version 0.5, and modifies the compilation of the Select
construct such that LIMIT / OFFSET are detected and converted into a subquery with ROW_NUMBER() OVER (ORDER BY col ASC|DESC)
. The ORDER BY is required as it's not very useful to paginate results without sorting them.
#!python
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import Select
from sqlalchemy import exc, sql
@compiles(Select)
def compile_select(element, compiler, **kw):
if not getattr(element, '_window_visit', None):
if element._limit is not None or element._offset is not None:
limit, offset = element._limit, element._offset
# copy the element so that we can modify it
element = element._generate()
element._window_visit = True
element._limit = element._offset = None
orderby = compiler.process(element._order_by_clause)
if not orderby:
raise exc.CompileError("ORDER BY is required with LIMIT/OFFSET")
element = element.column(
sql.literal_column("ROW_NUMBER() OVER (ORDER BY %s)" % orderby
).label("rownum")).order_by(None)
limitselect = sql.select([c for c in element.alias().c if c.key != 'rownum'])
limitselect._window_visit = True
limitselect._is_wrapper = True
if offset is not None:
limitselect.append_whereclause(sql.column("rownum") > offset)
if limit is not None:
limitselect.append_whereclause(
sql.column("rownum") <=
(limit + offset)
)
else:
limitselect.append_whereclause(sql.column("rownum") <= limit)
element = limitselect
kw['iswrapper'] = getattr(element, '_is_wrapper', False)
return compiler.visit_select(element, **kw)
if __name__ == '__main__':
from sqlalchemy.sql import table, column, select
t1 = table('t1', column('c1'), column('c2'), column('c3'))
s = select([t1]).order_by(t1.c.c2).limit(3).offset(5)
print s
Variants of the above are doable as well, such as only running the window function when OFFSET is present, or when ORDER BY is present, otherwise falling back to default behavior. The example below illustrates how to render the window function only if ORDER BY is present:
#!python
@compiles(Select)
def compile_select(element, compiler, **kw):
if not getattr(element, '_window_visit', None):
if element._limit is not None or element._offset is not None:
limit, offset = element._limit, element._offset
orderby = compiler.process(element._order_by_clause)
if orderby:
element = element._generate()
element._window_visit = True
element._limit = element._offset = None
element = element.column(
sql.literal_column("ROW_NUMBER() OVER (ORDER BY %s)" % orderby
).label("rownum")).order_by(None)
limitselect = sql.select([c for c in element.alias().c if c.key != 'rownum'])
limitselect._window_visit = True
limitselect._is_wrapper = True
if offset is not None:
limitselect.append_whereclause(sql.column("rownum") > offset)
if limit is not None:
limitselect.append_whereclause(
sql.column("rownum") <=
(limit + offset)
)
else:
limitselect.append_whereclause(sql.column("rownum") <= limit)
element = limitselect
kw['iswrapper'] = getattr(element, '_is_wrapper', False)
return compiler.visit_select(element, **kw)