0% found this document useful (0 votes)

179 views84 pages

Introduction To Hacking Postgresql: With Lots of Code Review!

The document introduces hacking on PostgreSQL. It discusses why PostgreSQL is a good database to hack on, the development environment for PostgreSQL including tools like GCC and text editors, the architecture of PostgreSQL which has five main components - the parser, rewriter, optimizer, executor, and utility processor, and an overview of submitting patches.

Uploaded by

Med Bahba

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

179 views84 pages

Introduction To Hacking Postgresql: With Lots of Code Review!

Uploaded by

Med Bahba

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 84

Why hack on Postgres?

Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Introduction to Hacking PostgreSQL

With lots of code review!

Gavin Sherry
gavin@alcove.com.au
Alcove Systems Engineering

January 14, 2007

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 1 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Outline

1 Why hack on Postgres?

2 Development Environment

3 PostgreSQL Architecture

4 Conventions, and all that

5 Submitting a patch

6 Some real code

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 2 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Databases are cool

Everyone uses them

There are only a small number – begging for diversification
Many open problems to be solved
Example: Storage capacity doubles each year, bandwidth remains
constant and latency goes up!
You can solve lots of the cool problems
Database guys get the babes (or guys (or what ever))

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 3 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Everyone wants database hackers

The number of hard core database kernel developers in the world

is low
Demand is high
Employers are cool: Google, MSR, CERN, NASA, JPL, Los
Alamos, heaps of startups. . .

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 4 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

So, why PostgreSQL?

Most database technologies come from “System/R” or

“Berkeley/INGRES/POSTGRES”
Many high end databases are based on Postgres
Netezza
ParACCEL
Greenplum DB
Streaming technologies like Telegraph CQ and StreamBase
...
Other database technologies came out of PostgreSQL
Informix
Illustra
It’s BSD, you can do what you want
Hacking on PostgreSQL is easier than hacking on any other large,
mature source base I’ve looked at
Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 5 / 63
Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Outline

1 Why hack on Postgres?

2 Development Environment

3 PostgreSQL Architecture

4 Conventions, and all that

5 Submitting a patch

6 Some real code

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 6 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Development Environment

The basics
Most of the Postgres developers
use Unix
You’ll need to know C
C is easy
Code reading is hard, but worth
getting good at

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 7 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Development Tools

The Basics
gcc, Bison, Flex, CVS, autotools,
gdb
Configure flags:
--enable-depend
--enable-debug
--enable-cassert
CFLAGS=-O0

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 8 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Development tools cont.

tags or cscope are essential

“What is the definition of this function/type?”
“What are all the call-sites of this function?”
src/tools/make [ce]tags
ccache and distcc are useful, especially on slower machines
valgrind can be useful for debugging memory errors

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 9 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Text Editor

Coding
Use a proper text editor
Follow the style of the surrounding
code (Basically, Allman BSD)
If you don’t follow the style, it will
just slow down the review
process

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 10 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Outline

1 Why hack on Postgres?

2 Development Environment

3 PostgreSQL Architecture

4 Conventions, and all that

5 Submitting a patch

6 Some real code

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 11 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

PostgreSQL Architecture

Five main components:

1 The parser - parse the query string
2 The rewriter - apply rewrite rules
3 The optimizer - determine an efficient query plan
4 The executor - execute a query plan
5 The utility processor - process DDL like CREATE TABLE

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 12 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

PostgreSQL Architecture

Five main components:

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 12 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

PostgreSQL Architecture

Five main components:

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 12 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

PostgreSQL Architecture

Five main components:

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 12 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

PostgreSQL Architecture

Five main components:

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 12 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Architecture Diagram

Postgres backend
PostgresMain()

PARSE:
Parse query string
pg_parse_query()

ANALYZE:
Semantic analysis of query, transform to Query node
parse_analyze()

REWRITE: UTILITY PROCESSOR:

Apply rewrite rules Execute DDL
pg_rewrite_queries() PortalRun() -> ProcessUtility()

PLAN:
Produce a query plan
pg_plan_queries()

EXECUTOR:
Execute DML
PortalRun() -> ExecutePlan()

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 13 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

The Parser

Lex and parse the query string submitted by the user

parser/gram.y has the guts; entry point is parser/parser.c
Produces a “raw parsetree”: a linked list of parse nodes
Parse nodes are defined in include/nodes/parsenodes.h
There is usually a simple mapping between grammar productions
and parse node structure

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 14 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Semantic Analysis

In the parser itself, only syntactic analysis is done

Next comes semantic analysis
parser/analyze.c and related code under parser/

Semantic analysis SELECT a, b, c FROM

Resolve column references t1, t2, t3 WHERE...
Verify that target schemas, tables and
columns exist
Check that the types used in expressions are
consistent
In general, check for errors that are
impossible or difficult to detect in the parser
itself

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 15 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Rewriter, Planner

The analysis phase produces a Query, which is the query’s parse

tree
The rewriter applies rewrite rules: view definitions and ordinary
rules. Input is a Query, output is zero or more Querys
The planner takes a Query and produces a Plan, which encodes
how the query ought to be executed
Only needed for “optimizable” statements (INSERT, DELETE,
SELECT, UPDATE)

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 16 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Executor, Utility Processor

DDL statements are “executed” via the utility processor, which

basically just calls the appropriate function for each different kind
of DDL statement
processUtility() in tcop/utility.c; the implementation of
the DDL statements is in commands/
Optimizeable statements are processed via the Executor: given a
Plan, it executes the plan and produces any resulting tuples
executor/; entry point is in execMain.c

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 17 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Outline

1 Why hack on Postgres?

2 Development Environment

3 PostgreSQL Architecture

4 Conventions, and all that

5 Submitting a patch

6 Some real code

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 18 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Common Idioms: Nodes

Postgres uses a very simple object system with support for single
inheritance. The root of the class hierarchy is Node:
typedef struct typedef struct typedef struct
{ { {
NodeTag type; NodeTag type; Parent parent;
} Node; int a_field; int b_field;
} Parent; } Child;

This relies on a C trick: you can treat a Child * like a Parent *

since their initial fields are the same
The first field of any Node is a NodeTag, which can be used to
determine a Node’s specific type at runtime

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 19 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Nodes, Cont.

Important things to keep in mind

Create a new Node: makeNode()
Run-time type testing via the IsA() macro
Test if two nodes are equal: equal()
Deep copy a node: copyObject()
Serialise a node to text: nodeToString()
Deserialise a node from text: stringToNode()

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 20 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Nodes, Cont.

Important things to keep in mind

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 20 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Nodes, Cont.

Important things to keep in mind

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 20 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Nodes, Cont.

Important things to keep in mind

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 20 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Nodes, Cont.

Important things to keep in mind

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 20 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Nodes, Cont.

Important things to keep in mind

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 20 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Nodes: Hints

When you modify a node or add a new node, remember to update

nodes/equalfuncs.c
nodes/copyfuncs.c
You may have to update nodes/outfuncs.c if your Node is to
be serialised/deserialised
Grepping for references to the node’s type can be helpful to make
sure you don’t forget to update anything

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 21 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Memory Management

Postgres uses hierarchical,

region-based memory management,
and it absolutely rocks
backend/util/mmgr
Memory is allocated via palloc()
All allocations occur inside a
memory context
Default memory context:
CurrentMemoryContext

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 22 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Memory Management, cont.

Allocations can be freed individually via pfree()

When a memory context is reset, all allocations in the context are
released
Resetting contexts is both faster and less error-prone than
releasing individual allocations
Contexts are arranged in a tree; deleting/resetting a context
deletes/resets its child contexts

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 23 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Memory Management Conventions

You should sometimes pfree() your allocations

If the context is short lived, who cares
Calculate just how much memory you might consume in a worst
case
Be aware of the memory allocation assumptions made by
functions you call
Memory leaks, per se, are rare in the backend

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 24 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Error Handling

Most errors reported by ereport() or elog()

ereport() is for user-visible errors, and allows more fields to be
specified (SQLSTATE, detail, hint, etc.)
Implemented via longjmp(3)
elog(ERROR) walks back up the stack
The top-level error handler aborts the current transaction and
resets the transaction’s memory context
Releases all resources held by the transaction, including files, locks,
memory, and buffer pins

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 25 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Error Handling, Cont.

Custom error handlers can be

defined via PG TRY() Example
Assert(BufferIsValid(buf);
Think about error handling! PG_TRY();
{
Never ignore the return values of ...
}
system calls PG_CATCH();
{
Should your function return an error ...
if (...)
code, or ereport() on failure? PG_RE_THROW();
else
Probably ereport() – think ...
about why. . . and why not }
PG_END_TRY();

Use assertions (Assert) liberally to

detect programming errors, but
never errors the user might
encounter
Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 26 / 63
Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Outline

1 Why hack on Postgres?

2 Development Environment

3 PostgreSQL Architecture

4 Conventions, and all that

5 Submitting a patch

6 Some real code

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 27 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Your First Patch

Step 1: Research and preparation

Is your new feature actually useful?
Does it just scratch your itch, or
Is it of general value?
Does it belong in the backend?
Does the SQL standard define
similar or equivalent functionality?
What about Oracle, DB2, . . . ?
Has someone suggested this idea in
the past?
Search the archives and TODO list
Most ideas are bad

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 28 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Sending A Proposal

Step 2: Send a proposal for your feature to pgsql-hackers

Patches that appear without prior discussion risk wasting your time
Discuss your proposed syntax and behavior
Consider corner cases, and how the feature will relate to other
parts of PostgreSQL (consistency is good)
Will any system catalog changes be needed?
Backward-compatibility?
Try to reach a consensus with -hackers on how the feature
ought to behave

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 29 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Implementation

Step 3: Implement the patch

A general strategy is to look at how similar parts of the system
function
Don’t copy and paste (IMHO)
Instead, read through similar sections of code to try to understand
how they work, and the APIs they are using
Implement (just) what you need, refactoring the existed APIs if
required
Ask for implementation advice as needed (-hackers or IRC)
Consider posting work-in-progress versions of the patch

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 30 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Thoroughness

Step 4: Update tools

For example, if you’ve modified DDL syntax, update psql’s tab
completion
Add pg dump support if necessary

Step 5: Testing
Make sure the existing regression tests don’t fail
No compiler warnings
Add new regression tests for the new feature

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 31 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Thoroughness

Step 4: Update tools

For example, if you’ve modified DDL syntax, update psql’s tab
completion
Add pg dump support if necessary

Step 5: Testing
Make sure the existing regression tests don’t fail
No compiler warnings
Add new regression tests for the new feature

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 31 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Submitting The Patch

Step 6: Update documentation

make check in doc/src/sgml does a syntax check that is
faster than building the whole SGML docs
Check documentation changes visually in a browser

Step 7: Submit the patch

Use context diff format: diff -c
Review every hunk of the patch
Is this hunk necessary? Are there errors?
Does it needlessly change whitespace or existing code?

Work with a code reviewer to make any necessary changes

If your patch falls through the cracks, be persistent

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 32 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Submitting The Patch

Step 6: Update documentation

make check in doc/src/sgml does a syntax check that is
faster than building the whole SGML docs
Check documentation changes visually in a browser

Step 7: Submit the patch

Use context diff format: diff -c
Review every hunk of the patch
Is this hunk necessary? Are there errors?
Does it needlessly change whitespace or existing code?

Work with a code reviewer to make any necessary changes

If your patch falls through the cracks, be persistent

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 32 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Outline

1 Why hack on Postgres?

2 Development Environment

3 PostgreSQL Architecture

4 Conventions, and all that

5 Submitting a patch

6 Some real code

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 33 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Time for some actual code

We’ll look at a patch I

“prepared earlier”
It’s against PostgreSQL CVS
HEAD
You can see it here:
http://alcove.com.au/when_clause.patch

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 34 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

WHEN Clause

We’ll be walking you through the implementation of the WHEN

clause for CREATE TRIGGER
You can see a patch at http://neilconway.org/talks/hacking/when_clause.patch
Defined by SQL:2003, implemented by Oracle and others
Optional clause; when the WHEN expression evaluates to false
(or NULL), the associated trigger is not fired
In the WHEN clause, OLD and NEW tuples can be referenced:
In UPDATE and DELETE triggers, OLD is the tuple being replaced
In UPDATE and INSERT triggers, NEW is the tuple being added

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 35 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

WHEN Clause Considerations

Syntax is easy: defined by SQL spec

Syntax

CREATE TRIGGER name { BEFORE | AFTER }

{ event [ OR . . . ] }
ON t a b l e [ FOR [ EACH ] { ROW | STATEMENT } ]
[ WHEN ( expr ) ]
EXECUTE PROCEDURE funcname ( arguments )

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 36 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

WHEN Clause Considerations, cont.

Behavioral questions:
Should we allow WHEN clause for statement-level triggers? (SQL
spec doesn’t specify)
What subset of SQL should we allow? Aggregate functions,
subqueries, . . . ?
No backward-compat concerns
Obviously needs to be in the backend
Useful for at least SQL-spec compliance

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 37 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Implementation Outline
1 Add support for the new syntax to the parser
2 Update the CREATE TRIGGER parsenode
3 Add support for WHEN clause to analysis phase
4 Add new field to pg trigger system catalog, containing the
WHEN clause
5 Modify implementation of CREATE TRIGGER to add the WHEN
clause to the new pg trigger row
6 Add support for the WHEN clause when firing triggers in the
executor (most of the difficulty is here)
7 Update pg dump/psql to support the WHEN clause

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 38 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 38 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 38 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 38 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 38 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 38 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 38 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Implementation outline cont.

8 Lose much sleep

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 39 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Parser Changes

Trivial, as it turns out — see page 2

Grammar code
1 CreateTrigStmt :
2 CREATE TRIGGER name T r i g g e r A c t i o n T i m e T r i g g e r E v e n t s ON
3 q u a l i f i e d n a m e TriggerForSpec TriggerWhen EXECUTE PROCEDURE
4 func name ’ ( ’ TriggerFuncArgs ’ ) ’
5 {
6 C r e a t e T r i g S t m t ∗ n = makeNode ( C r e a t e T r i g S t m t ) ;
7 /∗ ... ∗/
8 n−>when = $10 ;
9 $$ = ( Node ∗ ) n ;
10 }
11
12 TriggerWhen :
13 WHEN ’ ( ’ a e x p r ’) ’ { $$ = $3 ; }
14 | / ∗EMPTY∗ / { $$ = NULL ; }

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 40 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Parsenode Changes

The definition of the CreateTrigStmt parse node is closely

derived from the syntax of CREATE TRIGGER
Add a new field to the struct to stash the WHEN clause
Be sure to update equalfuncs.c and copyfuncs.c
See pages 3 and 4 of handout
Next: update the analysis phase. How do we parse WHEN clauses
like OLD.a <> NEW.a?

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 41 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Expressions In Postgres

The WHEN clause is a boolean expression

An expression is a tree of Expr nodes
There are Expr subclasses for the different kinds of expressions:
function calls, operator invocations, constants, variables, etc.
ExecEvalExpr() evaluates an expression by recursing through
this tree. For example:
A function is evaluated by first evaluating its arguments, then calling
the function itself
A constant value is trivial to evaluate
See pages 4 through 6 of handout

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 42 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Variable Expressions

In an expression like t.a > 10, t.a is a range variable, colloquially

known as a table column
Represented by the Var expression type
How are range variables implemented?
Var.varno identifies the variable’s table (t above)
Var.varattno is the attribute number of the variable’s column
varno is an index into the expression’s range table
The range table is the set of relations that can be referenced in
expressions — each Query has an associated range table

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 43 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Analysis Phase

The analysis phase is where we lookup identifiers; therefore,

during the analysis phase, we need to add range table entries for
the NEW and OLD relations
Other analysis phase work is straightforward:
Exclusive-lock the target relation
Disallow subqueries and aggregates in the WHEN clause
See pages 4 through 6 of the handout

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 44 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

System Catalogs

The format of the system catalogs is defined by header files, in the

src/include/catalog directory
These files are normal C headers, with some special macros
These macros are pre-processed for bootstrapping (initdb)
Nice effect: access to system catalog fields is the same as
accessing a C struct
A compiled copy of the backend depends upon the exact definition
of the system catalogs
If you modify the system catalog format, bump the catalog version
to force initdb
See pages 6 and 7

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 45 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

System Catalog Changes

Triggers are stored in the pg trigger catalog

To add support for WHEN, we add a new field to
FormData pg trigger in pg trigger.h
Add tgqual field, which stores a serialized version of the WHEN
expression tree
Review: nodeToString() serializes a Node
We can use stringToNode() to reconstruct the expression tree
when needed

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 46 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

CREATE TRIGGER Changes

CREATE TRIGGER needs to store the textual representation of the

WHEN clause in the new row it inserts into pg trigger
Also reject WHEN clause for statement-level triggers here
Also create a dependency between the elements of the WHEN
expression and the trigger
If the WHEN clause references column a of the table, DROP COLUMN
a should be disallowed (without cascade)
See page 8

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 47 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

TriggerDesc Updates

The Relation struct contains metadata about an opened

relation: the relation’s pg class row, a description of the format
of its tuples, associated indexes, associated triggers, etc.
Stored in the relcache
See pages 7 and 8

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 48 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

TriggerDesc Updates, cont.

Trigger information is stored in a subsidiary struct, TriggerDesc,

which itself contains a Trigger struct for each trigger on the
relation
Add a field to Trigger to store the WHEN clause
Fill it in when TriggerDesc constructed
Remember to update support functions!
FreeTriggerDesc(), CopyTriggerDesc(),
equalTriggerDescs()

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 49 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Executor Changes

The guts of the required changes are in the executor

We need to evaluate the WHEN clause before we fire a row-level
trigger
To do that, we need to:
Preprocess the WHEN clause to get it ready to be evaluated
Teach the executor to be able to evaluate expressions referencing
the NEW and OLD relations
See pages 9 through 12

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 50 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

OLD and NEW in Executor

Review: ExecEvalExpr() evaluates expression trees

To do so, it uses an ExprContext
All info needed to evaluate an expression
To evaluate an expression, you find an appropriate ExprContext,
setup the necessary information, then use ExecEvalExpr()
The executor keeps a “per-tuple ExprContext” that we can use: it is
reset for each tuple that is output
See pages 10

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 51 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Evaluating Variable Expressions

ExecEvalVar() is called by ExecEvalExpr() to evaluate Var

expressions:
ExecEvalVar change
switch (variable->varno)
{
case INNER: /* get the tuple from the inner node */
slot = econtext->ecxt_innertuple;
break;

case OUTER: /* get the tuple from the outer node */

slot = econtext->ecxt_outertuple;
break;

default: /* get the tuple from the relation being scanned */

slot = econtext->ecxt_scantuple;
break;
}

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 52 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Evaluating Variables

Note that the varno is ignored, except for the special INNER and
OUTER varnos
The code assumes that the caller will insert the current tuple into
the ExprContext’s “scan tuple” slot before calling ExecEvalExpr
This won’t work for us: the WHEN expression could reference two
different tuples (OLD and NEW)
How can we solve this?

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 53 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Solution

Add two more special varnos, TRIG OLD VARNO and

TRIG NEW VARNO
In the analysis phase, rewrite the varnos in the expression so that
references to the special relations are assigned the right varno
Machinery for this exists: ChangeVarNodes walks an expression
tree, changing varno x → y in every node of the tree
Add two new slots to ExprContext to hold the OLD and NEW
tuples, and setup these slots before calling ExecEvalExpr
In ExecEvalVar, add two more special-cases for the two special
varnos, fetching from the appropriate slots of the ExprContext

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 54 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Checking The Qualification

Before firing triggers, check the WHEN clause

For BEFORE triggers, this is easy. Add code to invoke
ExecQual() to:
ExecBRDeleteTriggers()
ExecBRInsertTriggers()
ExecBRUpdateTriggers()
Use the current executor instance to get per-tuple ExprContext;
try to avoid overhead by preparing WHEN expression the first time
the trigger is fired for this command
See pages 10 and 11

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 55 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

AFTER Trigger Support

Unfortunately, supporting AFTER triggers is not so easy

AfterTriggerSaveEvent() enqueues a trigger to be invoked
later, such as at the end of the current query
We can’t check the WHEN condition here
Instead, we need to check the WHEN condition when the saved
events are fired — but we won’t necessarily have an executor
instance to use!
Should just be a Small Matter of Programming

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 56 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Subqueries in WHEN clause

Subqueries in the WHEN clause would be convenient

Unfortunately, they’re hard to implement
We’d have to run the full-fledged query planner on the expression
Postgres has the infrastructure to do this, it’s just a matter of using
it
All the other code we’ve written is prepared to handle subqueries

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 57 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

psql Support

psql’s \d command includes the definitions of the triggers on a

table. How do we get it to include the WHEN clause?
psql gets trigger definitions by calling the backend function
pg get triggerdef(), so we need to update it
There is already machinery for pretty-printing expressions as SQL
text, so we can reuse all that
One hurdle: tgqual may contain the special TRIG OLD VARNO
and TRIG NEW VARNO varnos, which the expression printing code
doesn’t understand
Quick hack: use ChangeVarNodes() to switch back to original
varnos
See pages 12 and 13

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 58 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

pg dump Support

We need to update pg dump to dump WHEN clause

pg dump reconstructs the CREATE TRIGGER command for a
trigger by examining the trigger’s pg trigger row
For WHEN, this isn’t so easy:
tgqual references TG OLD VARNO and TG NEW VARNO, so there is
no easy way to reconstruct tgqual in a client app
Change pg dump to use pg get triggerdef() to send a
fully-formed CREATE TRIGGER to the client
See pages 12 and 13

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 59 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Regression Tests

Invoked by make check

Run out of src/test/regress
Put tests in sql/triggers.sql
Reflect changes in expected/triggers.out
See pages 13 and 14

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 60 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Documentation

Documentation is in DocBook SGML

Located in docs/src/sgml
SQL command reference in ref/create trigger.sgml
Be sure to add an example
See page 14 and 15

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 61 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

TODO Items

As implemented, the patch has some deficiencies:

No support for AFTER triggers
No support for subqueries in the WHEN clause
Leaks the when field in FreeTriggerDesc()
setup trigger quals() does some redundant work

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 62 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Where to go from here

Sign up to -hackers, -patches and -committers

Review incoming patches, particularly those from active hackers,
learn from them
Review the todo list: http://developer.postgresql.org
Choose some small project (denoted by % on the official TODO
list)
Read the code as much as possible

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 63 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Where to go from here

Sign up to -hackers, -patches and -committers

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 63 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Where to go from here

Sign up to -hackers, -patches and -committers

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 63 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Where to go from here

Sign up to -hackers, -patches and -committers

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 63 / 63

Why hack on Postgres? Development Environment PostgreSQL Architecture Conventions, and all that Submitting a pat

Where to go from here

Sign up to -hackers, -patches and -committers

Gavin Sherry (Alcove) Introduction to Hacking PostgreSQL January 14, 2007 63 / 63

Samarth
No ratings yet
Samarth
21 pages
PostgreSQL For Ruby Developers - Jesus Castello
No ratings yet
PostgreSQL For Ruby Developers - Jesus Castello
119 pages
T-SQL Northwind Database
100% (1)
T-SQL Northwind Database
52 pages
CS232L Lab#01
No ratings yet
CS232L Lab#01
50 pages
PostgreSQL 1
No ratings yet
PostgreSQL 1
32 pages
Barry Stinson - PostgreSQL Essential Reference-Sams (2001)
100% (1)
Barry Stinson - PostgreSQL Essential Reference-Sams (2001)
475 pages
Postgres The First Experience
No ratings yet
Postgres The First Experience
173 pages
Advanced SQL Injections - @CyberFreeCourses
No ratings yet
Advanced SQL Injections - @CyberFreeCourses
57 pages
Modul PostgreSQL Terbaru - WebHozz
No ratings yet
Modul PostgreSQL Terbaru - WebHozz
175 pages
Learn PostgreSQL - Second Edition - . - Instant Download
100% (3)
Learn PostgreSQL - Second Edition - . - Instant Download
46 pages
0292 Introduction Postgresql
No ratings yet
0292 Introduction Postgresql
91 pages
PostgreSQL Cookbook Sample Chapter
No ratings yet
PostgreSQL Cookbook Sample Chapter
24 pages
Introbook v4 en
No ratings yet
Introbook v4 en
145 pages
PostgreSQL For Beginners
100% (6)
PostgreSQL For Beginners
142 pages
Resumo Postgresql
No ratings yet
Resumo Postgresql
6 pages
PostgreSQL For Data Architects - Sample Chapter
No ratings yet
PostgreSQL For Data Architects - Sample Chapter
23 pages
Making Postgres Central in Your Data Center
No ratings yet
Making Postgres Central in Your Data Center
39 pages
How To Manage PostgreSQL Databases From The Command Line With PSQL
No ratings yet
How To Manage PostgreSQL Databases From The Command Line With PSQL
1 page
Admin Workshop
No ratings yet
Admin Workshop
117 pages
DBA Roadmap - Learn To Become A Database Administrator With Postg
No ratings yet
DBA Roadmap - Learn To Become A Database Administrator With Postg
8 pages
Foundations PostgreSQL Administration 13
100% (1)
Foundations PostgreSQL Administration 13
307 pages
Postgresql 7.2 Tutorial: The Postgresql Global Development Group
No ratings yet
Postgresql 7.2 Tutorial: The Postgresql Global Development Group
35 pages
Postgre SQL
100% (2)
Postgre SQL
85 pages
PostgreSQL Proficiency For Python People
No ratings yet
PostgreSQL Proficiency For Python People
215 pages
PostgeeSQL7 2
No ratings yet
PostgeeSQL7 2
36 pages
PostgreSQL Administration Essentials Sample Chapter
No ratings yet
PostgreSQL Administration Essentials Sample Chapter
25 pages
CRUD REST API With Node - JS, Express, and PostgreS
No ratings yet
CRUD REST API With Node - JS, Express, and PostgreS
4 pages
Monitoring Postgresql
No ratings yet
Monitoring Postgresql
38 pages
Postgre SQL
No ratings yet
Postgre SQL
21 pages
Postgre SQL
No ratings yet
Postgre SQL
35 pages
Troubleshooting PostgreSQL - Sample Chapter
100% (1)
Troubleshooting PostgreSQL - Sample Chapter
15 pages
Lab1: Accessing The Database: Objectives
No ratings yet
Lab1: Accessing The Database: Objectives
22 pages
Postgre SQL
No ratings yet
Postgre SQL
18 pages
PostgreSQL Server Programming 2nd Edition Usama Dar - Download The Ebook and Start Exploring Right Away
100% (1)
PostgreSQL Server Programming 2nd Edition Usama Dar - Download The Ebook and Start Exploring Right Away
81 pages
Chapter 8 SQL Semester II 2023 For Students Final
No ratings yet
Chapter 8 SQL Semester II 2023 For Students Final
239 pages
CSE-6001 PostgreSQL Tutorial
No ratings yet
CSE-6001 PostgreSQL Tutorial
34 pages
Hackingpg Present
No ratings yet
Hackingpg Present
36 pages
Introduction Postgre SQLAdministration V11
No ratings yet
Introduction Postgre SQLAdministration V11
274 pages
A Tour of Postgresql Internals: Tom Lane Great Bridge, LLC Tgl@Sss - Pgh.Pa - Us 1
No ratings yet
A Tour of Postgresql Internals: Tom Lane Great Bridge, LLC Tgl@Sss - Pgh.Pa - Us 1
25 pages
Learning PostgreSQL - Sample Chapter
100% (1)
Learning PostgreSQL - Sample Chapter
39 pages
PostgreSQL Developer's Guide - Sample Chapter
100% (1)
PostgreSQL Developer's Guide - Sample Chapter
24 pages
PostgreSQL Essentials v16 Student
No ratings yet
PostgreSQL Essentials v16 Student
400 pages
Postgresql 95231693412781986
No ratings yet
Postgresql 95231693412781986
218 pages
Postgres Postgresql Downloads
No ratings yet
Postgres Postgresql Downloads
10 pages
1 Introduction2PostgreSQL
No ratings yet
1 Introduction2PostgreSQL
27 pages
Having Fun With PostgreSQL
No ratings yet
Having Fun With PostgreSQL
11 pages
Postgres PDF
No ratings yet
Postgres PDF
6 pages
Setting Up Databases With Postgresql, Psequel, and Python
No ratings yet
Setting Up Databases With Postgresql, Psequel, and Python
54 pages
Postgresql PDF
No ratings yet
Postgresql PDF
93 pages
CENG301 DBMS - Session-3
100% (1)
CENG301 DBMS - Session-3
13 pages
52492-rc071 Postgresql PDF
No ratings yet
52492-rc071 Postgresql PDF
11 pages
Introduction To Postgresql
100% (1)
Introduction To Postgresql
54 pages
M2hY7oj - QsuoWO6I - 8LLcg - Weeks 1 and 2 Introduction To SQL and Single Table SQL
No ratings yet
M2hY7oj - QsuoWO6I - 8LLcg - Weeks 1 and 2 Introduction To SQL and Single Table SQL
46 pages
Basic Part 1
No ratings yet
Basic Part 1
46 pages
PosTGreSQL PDF
No ratings yet
PosTGreSQL PDF
170 pages
Top Best Practices SSIS PDF
No ratings yet
Top Best Practices SSIS PDF
10 pages
SSIS Best Practices
No ratings yet
SSIS Best Practices
13 pages
ETLDesignMethodologyDocument SSIS PDF
No ratings yet
ETLDesignMethodologyDocument SSIS PDF
12 pages
Mydbr Doc PDF
No ratings yet
Mydbr Doc PDF
322 pages
Power Bi PDF
No ratings yet
Power Bi PDF
22 pages
BI Solutions Using SSAS Tabular Model Succinctly
No ratings yet
BI Solutions Using SSAS Tabular Model Succinctly
174 pages
Image Classification Using Convolutional Neural Network With Python
No ratings yet
Image Classification Using Convolutional Neural Network With Python
8 pages
Bca Part 1
No ratings yet
Bca Part 1
6 pages
OS PYQ Questions and Answers
No ratings yet
OS PYQ Questions and Answers
5 pages
Js Problem Solving
No ratings yet
Js Problem Solving
7 pages
07 Festo CPX - Lib en PDF
No ratings yet
07 Festo CPX - Lib en PDF
17 pages
IT112: Computer Systems Lab (End Sem Exam Questions Set - April 2022)
No ratings yet
IT112: Computer Systems Lab (End Sem Exam Questions Set - April 2022)
2 pages
Industry Skill Demand Collection
No ratings yet
Industry Skill Demand Collection
8 pages
HW 1 Ans
No ratings yet
HW 1 Ans
7 pages
LAB 7 - Modern Javascript (ES5 - ES6 - ESNEXT)
No ratings yet
LAB 7 - Modern Javascript (ES5 - ES6 - ESNEXT)
11 pages
Lab 2 Grouping and Aggregation
No ratings yet
Lab 2 Grouping and Aggregation
10 pages
Mapping Tool Ariba
No ratings yet
Mapping Tool Ariba
58 pages
RMI Calculator
No ratings yet
RMI Calculator
3 pages
Developing Flowcharts
No ratings yet
Developing Flowcharts
6 pages
Ooad MCQ
70% (10)
Ooad MCQ
18 pages
Oop C++ Book Solution All Mcq's PDF
No ratings yet
Oop C++ Book Solution All Mcq's PDF
37 pages
Python MCQ With Answer TEST-1
100% (1)
Python MCQ With Answer TEST-1
18 pages
Quiz1 - Conditional Control If Statements
No ratings yet
Quiz1 - Conditional Control If Statements
5 pages
Data Warehousing Mining MCQs
No ratings yet
Data Warehousing Mining MCQs
12 pages
MVJ22IS42-Advanced Java-LAB MANUAL
No ratings yet
MVJ22IS42-Advanced Java-LAB MANUAL
30 pages
Lab#05
No ratings yet
Lab#05
2 pages
CS301P Lab Exercises
No ratings yet
CS301P Lab Exercises
77 pages
Cucumber With Selenium Java BDD Testing
No ratings yet
Cucumber With Selenium Java BDD Testing
9 pages
NetSDK Programming Manual
No ratings yet
NetSDK Programming Manual
49 pages
07 Memory Management Basics
No ratings yet
07 Memory Management Basics
33 pages
Wchar C
No ratings yet
Wchar C
15 pages
01 - Memory System PDF
No ratings yet
01 - Memory System PDF
11 pages
Binary Search Algorithm
No ratings yet
Binary Search Algorithm
6 pages