Just a Theory

By David E. Wheeler

Posts about MySQL

Sqitch 1.5.0

Sqitch Logo

Released yesterday: Sqitch v1.5.0. This version the MySQL driver DBD::mysql with DBD::MariaDB, both for its better backward compatibility with MySQL as well as MariaDB driver libraries and for its improved Unicode handling. The Docker image likewise switched to the MariaDB mysql client. I expect no compatibility issues, but you never know! Please file an issue should you find any.

V1.5.0 also features a fixes for Yugabyte deployment, Oracle error handling, existing Snowflake schemas, connecting to MySQL/MariaDB without a database name, and omitting the checkit MySQL/MariaDB function when the Sqitch user lacks sufficient permission to create it. Sqitch now will also complain when deploying with --log-only and a deployment file is missing.

Find it in the usual places:

Many thanks to everyone who has enjoyed using Sqitch and let me know in person, via email Mastodon, bug reports, and patches. It gratifies me how useful people find it.

Should URI::mysql Switch to DBD::MariaDB?

I seek the wisdom of the Perl Monks:

The Sqitch project got a request to switch from DBD::mysql to DBD::MariaDB. DBD::mysql 5’s requirement to build from the MySQL 8 client library provides the impetus for the request, but in poking around, I found a blogs.perl.org post highlighting some Unicode fixes in DBD::MariaDB, as well.

Now, Sqitch likely doesn’t have the Unicode issue (it always works with Perl Unicode strings), but it depends on URI::db to provide the DBI connection string. For MySQL URIs, the URI::mysql dbi_driver method returns mysql.

Should it be changed to return MariaDB, instead? Is there general community consensus that DBD::MariaDB provides better compatibility with both MySQL and MariaDB these days?

I’m also curious what the impact of this change would be for Sqitch. Presumably, if DBD::MariaDB can build against either the MariaDB or MySQL client library, it is the more flexible choice to continue supporting both databases going forward.

Feedback appreciated via PerlMonks or the Sqitch issue.

Update 2025-01-08

URI-db 0.23 uses DBD::MariaDB instead of DBD::mysql for both URI::mysql and URI::MariaDB.

Similarly, Sqitch v1.5.0 always uses DBD::MariaDB when connecting to MySQL or MariaDB, even when using older versions of URI::db. Thanks everyone for the feedback and suggestions!

Sqitch Goes Vertical

I released Sqitch v0.996 today. Despite the minor version increase, this is a pretty big release. I’m busy knocking out all the stuff I want to get done for 1.0, but the version space is running out, so just a minor version jump from v0.995 to v0.996. But a lot changed. A couple the biggies:

Goodbye Mouse and Moose, Hello Moo

If you’re not a Perl programmer, you probably aren’t familiar with Moose or its derivatives Mouse and Moo. Briefly, it’s an object system. Great interface and features, but freaking huge—and slow. Mouse is a lighter version, and when we (mostly) switched to it last year, it yielded a 20-30% speed improvement.

Still wasn’t great, though. So on a day off recently, I switched to Moo, which implements most of Moose but without a lot of the baggage. At first, there wasn’t much difference in performance, but as I profiled it (Devel::NYTProf is indispensable for profiling Perl apps, BTW), I was able to root out all trace of Moose or Mouse, including in CPAN modules Sqitch depends on. The result is around a 40% speedup over what we had before. Honestly, it feels like a new app, it’s so fast. I’m really happy with how it turned out, and to have shed some of the baggage from the code base.

The downside is that package maintainers will need to do some work to get the new dependencies built. Have a look at the RPM spec changes I made to get our internal Sqitch RPMs to build v0.996.

MySQL Password Handling

The handling of MySQL passwords has also been improved. Sqitch now uses the $MYSQL_PWD environment variable if a password is provided in a target. This should simplify authentication when running MySQL change scripts through the mysql client client.

Furthermore, if MySQL::Config is installed, Sqitch will look for passwords in the client and mysql sections of your MySQL configuration files (~/.my.cnf, /etc/my.cnf). This should already happen automatically when executing scripts, but Sqitch now tries to replicate that behavior when connecting to the database via DBI.

Spotting the $MYSQL_PWD commit, Ștefan Suciu updated the Firebird engine to use the $ISC_PASSWORD when running scripts. Awesome.

Vertically Integrated

And finally, another big change: I added support for Vertica, a very nice commercial column-store database that features partitioning and sharding, among other OLAP-style functionality. It was originally forked from PostgreSQL, so it was fairly straight-forward to port, though I did have to borrow a bit from the Oracle and SQLite engines, too. This port was essential for work, as we’re starting to use Vertical more and more, and need ways to manage changes.

If you’re using Vertica, peruse the tutorial to get a feel for what it’s all about. If you want to install it, you can get it from CPAN:

cpan install App::Sqitch BDD::ODBC

Or, if you’re on Homebrew:

brew tap theory/sqitch
brew install sqitch_vertica

Be warned that there’s a minor bug in v0.996, though. Apply this diff to fix it:

 @@ -16,7 +16,7 @@ our $VERSION = '0.996';
 
 sub key    { 'vertica' }
 sub name   { 'Vertica' }
-sub driver { 'DBD::Pg 2.0' }
+sub driver { 'DBD::ODBC 1.43' }
 sub default_client { 'vsql' }
 
 has '+destination' => (

That fix will be in the next release, of course, as will support for Vertica 6.

What Next?

I need to focus on some other work stuff for a few weeks, but then I expect to come back to Sqitch again. I’d like to get 1.0 shipped before the end of the year. To that end, next up I will be rationalizing configuration hierarchies to make engine selection and deploy-time configuration more sensible. I hope to get that done by early October.

Notes on Upcoming Sqitch Improvements

I was traveling last week, and knowing I would be offline a fair bit, not to mention seriously jet-lagged, I put my hacking efforts into getting MySQL support into Sqitch. I merged it in yesterday; check out the tutorial if you’re interested in it. I expect to release v0.980 with the MySQL support in a couple of weeks; testing and feedback would most appreciated.

There is a caveat, though: it requires MySQL v5.6.4. So if you’re stuck with an older MySQL, it won’t work. There are two reasons to require v5.6.4:

  • The microsecond precision support in DATETIME values, added in v5.6.4. This makes it much easier to keep things in the proper order (deployments usually take less than a second).
  • The SIGNAL functionality, introduced in v5.5. This allows the schema to mock a check constraint in the Sqitch database, as well as make it much easier to write verify tests (as described in the tutorial and figured out on StackOverflow).

But if you can afford to take advantage of a relatively modern MySQL, give it a shot!

The next release also makes a backwards-incompatible change to the SQLite engine: the default Sqitch database is no longer $db_dir/$db_name-sqitch.$suffix, but $db_dir/sqitch.$suffix. In other words, if you were deploying to a db named /var/db/myapp.db, Sqitch previously kept its metadata in /var/db/myapp-sqitch.db, but now will keep it in /var/db/sqitch.db. This is to make it more like the other engines (MySQL defaults to a database named “sqitch”, and Postgres and Oracle default to a schema named “sqitch”).

It’s also useful if you use the SQLite ATTACHDATABASE command to manage multiple database files in a single project. In that case, you will want to use the same metadata file for all the databases. Keep them all in the same directory with the same suffix and you get just that with the default sqitch database.

If you’d like it to have a different name, use sqitch config core.sqlite.sqitch_db $name to configure it. This will be useful if you don’t want to use the same Sqitch database to manage multiple databases, or if you do, but they live in different directories.

I haven’t released this change yet, and I am not a big-time SQLite user. So if this makes no sense, please comment on this issue. It’ll be a couple of weeks before I release v0.980, so there is time to reverse if if there’s consensus that it’s a bad idea.

But given another idea I’ve had, I suspect it will be okay. The idea is to expand on the concept of a Sqitch “target” by giving it its own command and configuration settings. Basically, it would be sort of like Git remotes: use URIs to specify database connection and parameter info (such as the sqitch database name for SQLite). These can be passed to database-touching commands, such as deploy, revert, log, and the like. They can also be given names and stored in the configuration file. The upshot is that it would enable invocations such as

sqitch deploy production
sqitch log qa
sqitch status pg://localhost/flipr?sqitch_schema=meta

See the GitHub issue for a fuller description of this feature. I’m certain that this would be useful at work, as we have a limited number of databases that we deploy each Sqitch project to, and it’s more of a PITA for my co-workers to remember to use different values for the --db-host, --db-user, --db-name and friends options. The project itself would just store the named list of relevant deployment targets.

And it alleviates the issue of specifying a different Sqitch database on SQLite or MySQL, as one can just create a named target that specifies it in the URI.

Not sure when I will get to this feature, though. I think it would be great to have, and maybe iovation would want me to spend some time on it in the next couple of months. But it might also be a great place for someone else to get started adding functionality to Sqitch.

Oh, and before I forget: it looks like Sqitch might soon get CUBRID support, too, thanks to Ștefan Suciu. Stay tuned!

Looking for the comments? Try the old layout.

Introducing MyTAP

I gave my OSCON tutorial (slides) last week. It went okay. I spent way too much time helping to get everyone set up with pgTAP, and then didn’t have time to have the attendees do the exercises, and I had to rush through 2.5 hours of material in 1.5 hours. Yikes! At least the video will be better when it’s released (more when that happens).

But as often happens, I was asked whether something like pgTAP exists for MySQL. But this time I was asked by MySQL Community Manager Giuseppe Maxia, who also said that he’d tried to create a test framework himself (a fellow Perl hacker!), but that it wasn’t as nice as pgTAP. Well, since I was at OSCON and tend to like to hack on side projects while at conferences, and since I hoped that Giuseppe will happily take it over once I’ve implemented the core, I started hacking on it myself. And today, I’m pleased to announce the release of MyTAP 0.01 (downloads).

Once you’ve downloaded it, install it against your MySQL server like so:

mysql -u root < mytap.sql

Here’s a very simple example script:

-- Start a transaction.
BEGIN;

-- Plan the tests.
SELECT tap.plan(1);

-- Run the tests.
SELECT tap.pass( 'My test passed, w00t!' );

-- Finish the tests and clean up.
CALL tap.finish();
ROLLBACK;

You can run this test from a .sql file using the mysql client like so:

mysql -u root --disable-pager --batch --raw --skip-column-names --unbuffered --database try --execute 'source test.sql'

But that’s a PITA and can only run one test at a time. Instead, put all of your tests into a directory, perhaps named tests, each with the suffix “.my”, and use my_prove (install TAP::Parser::SourceHandler::MyTAP from CPAN to get it) instead:

my_prove -u root --database try tests/

For MyTAP’s own tests, the output looks like this:

tests/eq.my ........ ok
tests/hastap.my .... ok
tests/matching.my .. ok
tests/moretap.my ... ok
tests/todotap.my ... ok
tests/utils.my ..... ok
All tests successful.
Files=6, Tests=137,  1 wallclock secs
(0.06 usr  0.03 sys +  0.01 cusr  0.02 csys =  0.12 CPU)
Result: PASS

Nice, eh? Of course there are quite a few more assertion functions. See the complete documentation for details.

Now, I did my best to keep the interface the same as pgTAP, but there are a few differences:

  • MySQL temporary tables are teh suck, so I had to use permanent tables to track test state. To make this more feasible, MyTAP is always installed in its own database, (named “tap” by default), and you must always schema-qualify your use of the MyTAP functions.
  • Another side-effect of permanent tables is that MyTAP must keep track of test outcomes without colliding with the state from tests running in multiple concurrent connections. So MyTAP uses connection_id() to keep track of state for a single test run. It also deletes the state when tests finish(), but if there’s a crash before then, data can be left in those tables. If the connection ID is ever re-used, this can lead to conflicts. This seems mostly avoidable by using InnoDB tables and transactions in the tests.
  • The word “is” is strictly reserved by MySQL, so the function that corresponds to pgTAP’s is() is eq() in MyTAP. Similarly, isnt() is called not_eq() in MyTAP.
  • There is no way to throw an exception in MySQL functions an procedures, so the code cheats by instead performing an illegal operation: selecting from a non-existent column, where the name of that column is the error message. Hinky, but should get the point across.

Other than these issues, things went fairly smoothly. I finished up the 0.01 version last night and released it today with most of the core functionality in place. And now I want to find others to take over, as I am not a MySQL hacker myself and thus unlikely ever to use it. If you’re interested, my recommendations for things to do next are:

So fork on GitHub or contact me if you’d like to be added as a collaborator (I’m looking at you, Giuseppe!).

Hope you find it useful.

Looking for the comments? Try the old layout.

Bricolage 2.0 Drops

Bricolage 2.0 was released today. This is a pretty big deal, and a long time coming. The most important changes, from my point of view, are:

  • Revamped UI. As a 2006 Google Summer of Code project, Marshall Roch added a slew of Ajaxy goodness to the Bricolage UI. It used to be that, to dig down into a document, you needed to click through reloads for every level. Now the entire structure of a document is available on a single screen, and digging down takes place in situ. This means faster, easier document editing.

    There are other niceties too, thanks to Marshall, like as-you-type autocompletion of category URIs and keywords, popups for associating related documents, dynamic field generation for document keywords and user contacts, and animated workflow actions for moving, deleting, and publishing documents.

    These changes mark a truly impressive improvement in usability for the people who use Bricolage every day, and will by far be the most welcome change for our users.

  • Finer content control. Thanks to another 2006 Google Summer of Code project, Christian Muise implemented what we call “element occurrence specification.” Bricolage document structure is controlled by administrators creating document types with hierarchies of elements. Elements may contain fields—the types and value of which may also be specified (text, textarea, select list, etc.)—and other elements.

    In previous versions of Bricolage, if an element was a subelement of a document, one could add any number of that element to a document. Fields were a bit more controlled: you could only say whether one or many instances of a field were allowed in a given element.

    Element occurrence specification allows administrators to have much finer control over document elements by specifying the minimum and maximum number of instances of an element or field may occur. For example, one can say that a document may have only one instance of a field, or must have three, or may have between 3 and 5, or may have at least 3, or may have any number, including none.

    Bret Dawson put it really well in the Bricolage 2.0 Changes:

    Want every book review you publish to contain at least three but no more than 10 ISBN numbers? Want exactly four pull-quotes in every article? You can do that in Bricolage 2.

  • MySQL support. This, too, was a 2006 Google Summer of Code project, by Andrei Arsu. Yes, you can run Bricolage 2.0 on MySQL 5.0 if you want. This was a pretty big project, and I’m surprisingly pleased at how well it works now that all the kinks have been worked out (special thanks to Waldo Jaquith for being brave (foolish?) enough to start a Bricolage project on MySQL and thus to shake out some bugs).

  • Apache 2 support. This was started quite some time ago by Chris Heiland, hacked on later by Scott Lanning, and finally finished by yours truly. I look forward to dumping Apache 1 in the future.

There’s other stuff, too, lots of little things and not-so-little things. Altogether they go a long way toward making Bricolage better.

It’s been quite a day, and I’m glad to have it out the door. Four years is a long time to wait for a major release, and it happened not because of me, but thanks to the work of others who have picked up the gauntlet. Huge thanks especially to:

Many others provided feedback, patches, and bug reports, and I appreciate all the help. I hope to see you all for Bricolage 2.2!

Looking for the comments? Try the old layout.

RFC: A Simple Markdown Table Format

I’ve been thinking about markdown tables a bit lately. I’ve had in mind to follow up on my definition list proposal with a second proposal for the creation and editing of simple tables in Markdown. For better or for worse, an aside on the markdown-discuss mail list led to a longish thread about a syntax for continuing lines in tables (not to mention a long aside on the use of monospaced fonts, but I digress), wherein I realized, after an open-minded post from MultiMarkdown’s Fletcher Penney, that I needed to set to working up this request for comments sooner rather than later.

Requirements

All of which is to say that this blog entry is a request for comments on a proposed syntax for simple tables in Markdown. The requirements for such a feature, to my thinking, are:

  • Simple tables only
  • Formatting should be implicit
  • Support for a simple caption
  • Support for representing column headers
  • Support for left, right, and center alignment
  • Support for multicolumn cells
  • Support for empty cells
  • Support for multiline (wrapping) cells
  • Support for multiple table bodies
  • Support for inline Markdown (spans, lists, etc.)
  • Support for all features (if not syntax) of MultiMarkdown tables.

By “simple tables” in that first bullet, I mean that they should look good in 78 character-wide monospaced plain text. Anything more complicated should just be done in XHTML. My goal is to be able to handle the vast majority of simple cases, not to handle every kind of table. That’s not to say that one won’t be able to use the syntax to create more complicated tables, just that it might not be appropriate to do so, and many more advanced features of tables will just have to be done in XHTML.

And by “implicit formatting” in the second bullet, I mean that the syntax should use the bare minimum number of punctuation characters to provide hints about formatting. Another way to think about it is that formatting hints should be completely invisible to a casual reader of the Markdown text.

Most of the rest of the requirements I borrowed from MultiMarkdown, with the last bullet thrown in just to cover anything I might have missed. The MultiMarkdown syntax appears to be a superset of the PHP Markdown Extra syntax, so that’s covered, too.

Prior Art: Databases

When I think about the display of tables in plain text, the first piece of prior art I think of is the output from command-line database clients. Database developers have been thinking about tables since, well, the beginning, so it makes sense to see what they’re doing. So I wrote a bit of SQL and ran it in three databases. The SQL builds a table with an integer, a short name, a textual description, and a decimal number. Here’s the code:

CREATE TEMPORARY TABLE widgets (
    id          integer,
    name        text,
    description text,
    price       numeric(6,2)
);

INSERT INTO widgets VALUES( 1, 'gizmo', 'Takes care of the doohickies', 1.99);
INSERT INTO widgets VALUES( 2, 'doodad', 'Collects *gizmos*', 23.8);
INSERT INTO widgets VALUES( 10, 'dojigger', 'Handles:
*   gizmos
*   doodads
*   thingamobobs', 102.98);
INSERT INTO widgets VALUES(1024, 'thingamabob', 'Self-explanatory, no?', 0.99);

SELECT * FROM widgets;

My goal here was to see how the database client would format a variety of data formats, as well as a textual column (“description”) with newlines in it (and a Markdown list, no less!), as the newlines will force the output to appear on multiple lines for a single row. This is one of the features that is missing from the existing Markdown implementations, which all require that the text all be on a single line.

The first database client in which I ran this code was psql 8.3, the interactive terminal for PostgreSQL 8.3. Its output looks like this:

  id  |    name     |         description          | price  
------+-------------+------------------------------+--------
    1 | gizmo       | Takes care of the doohickies |   1.99
    2 | doodad      | Collects *gizmos*            |  23.80
   10 | dojigger    | Handles:                     | 102.98
                    : * gizmos                       
                    : * doodads                      
                    : * thingamobobs                 
 1024 | thingamabob | Self-explanatory, no?        |   0.99

As you can see, PostgreSQL properly right-aligned the integer and numeric columns. It also has a very nice syntax for demonstrating continuing lines for a given column: the colon. The colon is really nice here because it looks kind of like a broken pipe character, which is an excellent mnemonic for a string of text that breaks over multiple lines. Really, this is just a very nice output format overall.

The next database client I tried was mysql 5.0, the command-line client for MySQL 5.0. Its output looks like this:

+------+-------------+--------------------------------------------+--------+
| id   | name        | description                                | price  |
+------+-------------+--------------------------------------------+--------+
|    1 | gizmo       | Takes care of the doohickies               |   1.99 | 
|    2 | doodad      | Collects *gizmos*                          |  23.80 | 
|   10 | dojigger    | Handles:
* gizmos
* doodads
* thingamobobs | 102.98 | 
| 1024 | thingamabob | Self-explanatory, no?                      |   0.99 | 
+------+-------------+--------------------------------------------+--------+

Once again we have very good alignment of the numeric data types. Furthermore, MySQL uses exactly the same syntax as PostgreSQL to represent the separation between column headers and column rows, although the PostgreSQL version is a bit more minimalist. The MySQL version just hast a little more stuff in it

Where the MySQL version fails, however, is in the representation of the continuing lines for the “dojigger” row. First of all, it set the width of the “description” column to the longest value in that column, but since that longest value includes newlines, it actually ends up being much too long—much longer than PostgreSQL’s representation of the same column. And second, as a symptom of that problem, nothing special is done with the wrapped lines. The newlines are simply output like any other character, with no attempt to line up the column. This has the side effect of orphaning the price for the “dojigger” after the last line of the continuing description. So its alignment is shot, too.

To be fair, PostgreSQL’s display featured almost exactly the same handling of continuing columns prior to version 8.2. But I do think that their solution featuring the colons is a good one.

The last database client I tried was SQLite 3.6. This client is the most different of all. I set .header ON and .mode column and got this output:

id          name        description                   price     
----------  ----------  ----------------------------  ----------
1           gizmo       Takes care of the doohickies  1.99      
2           doodad      Collects *gizmos*             23.8      
10          dojigger    Handles:
* gizmos
* doodads
  102.98    
1024        thingamabo  Self-explanatory, no?         0.99      

I don’t think this is at all useful for Markdown.

Prior Art: MultiMarkdown

Getting back to Markdown now, here is the MultiMarkdown syntax, borrowed from the documentation:

|             |          Grouping           ||
First Header  | Second Header | Third Header |
 ------------ | :-----------: | -----------: |
Content       |          *Long Cell*        ||
Content       |   **Cell**    |         Cell |

New section   |     More      |         Data |
And more      |            And more          |
[Prototype table]

There are a few interesting features to this syntax, including support for multiple lines of headers, multicolumn cells alignment queues, and captions. I like nearly everything about this syntax, except for two things:

  1. There is no support for multiline cell values.
  2. The explicit alignment queues are, to my eye, distracting.

The first issue can be solved rather nicely with PostgreSQL’s use of the colon to indicate continued lines. I think it could even optionally use colons to highlight all rows in the output, not just the continuing one, as suggested by Benoit Perdu on the markdown-discuss list:

  id  |    name     |         description          | price  
------+-------------+------------------------------+--------
    1 | gizmo       | Takes care of the doohickies |   1.99
    2 | doodad      | Collects *gizmos*            |  23.80
   10 | dojigger    | Handles:                     | 102.98
      :             : * gizmos                     : 
      :             : * doodads                    : 
      :             : * thingamobobs               : 
 1024 | thingamabob | Self-explanatory, no?        |   0.99

I think I prefer the colon only in front of the continuing cell, but see no reason why both couldn’t be supported.

The second issue is a bit more subtle. My problem with the alignment hints, embodied by the colons in the header line, is that to the reader of the plain-text Markdown they fill no obvious purpose, but are provided purely for the convenience of the parser. In my opinion, if there is some part of the Markdown syntax that provides no obvious meaning to the user, it should be omitted. I take this point of view not only for my own selfish purposes, which are, of course, many and rampant, but from John Gruber’s original design goal for Markdown, which was:

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. While Markdown’s syntax has been influenced by several existing text-to-HTML filters, the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.

To me, those colons are formatting instructions. So, how else could we support alignment of cells but with formatting instructions? Why, by formatting the cells themselves, of course. Take a look again at the PostgreSQL and MySQL outputs. both simply align values in their cells. There is absolutely no reason why a decent parser couldn’t do the same on a cell-by-cell basis if the table Markdown follows these simple rules:

  • For a left-aligned cell, the content should have no more than one space between the pipe character that precedes it, or the beginning of the line.
  • For a right-aligned cell, the content should have no more than one space between itself and the pipe character that succeeds it, or the end of the line.
  • For a centered cell, the content should have at least two characters between itself and both its left and right borders.
  • If a cell has one space before and one space after its content, it is assumed to be left-aligned unless the cell that precedes it or, in the case of the first cell, the cell that succeeds it, is right-aligned.

What this means, in effect, is that you can create tables wherein you line things up for proper display with a proportional font and, in general, the Markdown parser will know what you mean. A quick example, borrowing from the PostgreSQL output:

  id  |    name     |         description          |  price  
------+-------------+------------------------------+--------
    1 | gizmo       | Takes care of the doohickies |   1.99 
    2 | doodad      | Collects *gizmos*            |  23.80 
   10 | dojigger    | Handles stuff                | 102.98 
 1024 | thingamabob | Self-explanatory, no?        |   0.99 

The outcome for this example is that:

  • The table headers are all center-aligned, because they all have 2 or more spaces on each side of their values
  • The contents of the “id” column are all right-aligned. This includes 1024, which ambiguously has only one space on each side of it, so it makes the determination based on the preceding line.
  • The contents of the “name” column are all left-aligned. This includes “thingamabob”, which ambiguously has only one space on each side of it, so it makes the determination based on the preceding line.
  • The contents of the “description” column are also all left-aligned. This includes first row, which ambiguously has only one space on each side of it, so it makes the determination based on the succeeding line.
  • And finally, the contents of the “price” column are all right-aligned. This includes 102.98, which ambiguously has only one space on each side of it, so it makes the determination based on the preceding line.

And that’s it. The alignments are perfectly clear to the parser and highly legible to the reader. No further markup is required.

Proposed Syntax

So, with this review, I’d like to propose the following syntax. It is inspired largely by a combination of PostgreSQL and MySQL’s output, as well as by MultiMarkdown’s syntax.

  • A table row is identifiable by the use of one or more pipe (|) characters in a line of text, aside from those found in a literal span (backticks).
  • Table headers are identified as a table row with the immediately-following line containing only -, |, +, :or spaces. (This is the same as the MultiMarkdown syntax, but with the addition fo the plus sign.)
  • Columns are separated by |, except on the header underline, where they may optionally be separated by +, and on continuing lines (see next point).
  • Lines that continue content from one or more cells from a previous line must use : to separate cells with continued content. The content of such cells must line up with the cell width on the first line, determined by the number of spaces (tabs won’t work). They may optionally demarcate all cells on continued lines, or just the cells that contain continued content.
  • Alignment of cell content is to be determined on a cell-by-cell basis, with reference to the same cell on the preceding or succeeding line as necessary to resolve ambiguities.
  • To indicate that a cell should span multiple columns, there should be additional pipes (|) at the end of the cell, as in MultiMarkdown. If the cell in question is at the end of the row, then of course that means that pipes are not optional at the end of that row.
  • You can use normal Markdown markup within the table cells, including multiline formats such as lists, as long as they are properly indented and denoted by colons on succeeding lines.
  • Captions are optional, but if present must be at the beginning of the line immediately preceding or following the table, start with [ and end with ], as in MultiMarkdown. If you have a caption before and after the table, only the first match will be used.
  • If you have a caption, you can also have a label, allowing you to create anchors pointing to the table, as in MultiMarkdown. If there is no label, then the caption acts as the label.
  • Cells may not be empty, except as represented by the appropriate number of space characters to match the width of the cell in all rows.
  • As in MultiMarkdown. You can create multiple <tbody> tags within a table by having a single empty line between rows of the table.

Sound like a lot? Well, if you’re acquainted with MultiMarkdown’s syntax, it’s essentially the same, but with these few changes:

  • Implicit cell alignment
  • Cell content continuation
  • Stricter use of space, for proper alignment in plain text (which all of the MultiMarkdown examples I’ve seen tend to do anyway)
  • Allow + to separate columns in the header-demarking lines
  • A table does not have to start right at the beginning of a line

I think that, for purposes of backwards compatibility, we could still allow the use of : in the header lines to indicate alignment, thus also providing a method to override implicit alignment in those rare cases where you really need to do so. I think that the only other change I would make is to eliminate the requirement that the first row be made the table header row if now header line is present. But that’s a gimme, really.

Taking the original MultiMarkdown example and rework it with these changes yields:

|               |            Grouping            ||
+---------------+---------------------------------+
| First Header  |  Second Header  |  Third Header |
+---------------+-----------------+---------------+
| Content       |           *Long Cell*          ||
: continued     :                                ::
: content       :                                ::
| Content       |    **Cell**     |          Cell |
: continued     :                 :               :
: content       :                 :               :

| New section   |      More       |          Data |
| And more      |             And more           ||
 [Prototype table]

Comments?

I think I’ve gone on long enough here, especially since it ultimately comes down to some refinements to the MultiMarkdown syntax. Ultimately, what I’m trying to do here is to push MultiMarkdown to be just a bit more Markdownish (by which I mean that it’s more natural to read as plain text), as well as to add a little more support for some advanced features. The fact that I’ll be able to cut-and-paste the output from my favorite database utilities is a handy bonus.

As it happens, John Gruber today posted a comment to the markdown-discuss mail list in which he says (not for the first time, I expect):

A hypothetical official table syntax for Markdown will almost certainly look very much, if not exactly, like Michel’s table syntax in PHP Markdown Extra.

I hope that he finds this post in that vein, as my goals here were to embrace the PHP Markdown Extra and MultiMarkdown formats, make a few tweaks, and see what people think, with an eye toward contributing toward a (currently hypothetical) official table syntax.

So what do you think? Please leave a comment, or comment on the markdown-discuss list, where I’ll post a synopsis of my proposal and a link to this entry. What have I missed? What mistakes have I made? What do you like? What do you hate? Please do let me know.

Thanks!

Looking for the comments? Try the old layout.

Moving Towards Bricolage 2.0

Today I’ve finished just about over two and a half weeks of hacking on Bricolage. It has been a couple of years since I gave it much attention, but there was so much good stuff that other people have contributed that, since I had a little time, it seemed worth it to give it some love. So here’s a quick list of all that I’ve done in the last two weeks:

  • Fixed all reported issues with Bricolage 1.10. Scott Lanning kindly released 1.10.5 yesterday with all of those fixes.

  • I integrated the element occurrence branch that Christian Muise had worked on as his 2006 Google Summer of Code project. Christian’s project added support for maximum and minimum specifications for subelements in Bricolage, which allows administrators to define how many fields and elements can occur in a story or media document. All I had to do was add a few UI tweaks to support the new fields and their specification in the story profile, and all was ready to go. Oh, and I did have to go back and make the SOAP interface work with the feature, but the only reason it never did was lazy hacking of the SOAP interface (way before Christian’s time). Nice work, Christian, and thank you for your contribution!

  • I fixed a few bugs with Arsu Andrei’s port of Bricolage to MySQL, which was his 2006 Google Summer of Code project. Arsu did a terrific job with the port, with only a few minor things missed that he likely could not have caught anyway. This work had already been merged into the trunk. Thanks Arsu!

  • I fixed a bunch of bugs from Marshall Roch’s AJAXification of Bricolage, carried out during his 2006 Google Summer of Code project. Marshall actually did a lot more stuff than he’d planned, as it all went quite smoothly. I found only a few minor oversights that I was able to easily address. This work represents the single most visible change to how users user Bricolage since we launched the project back in 2001. Editing stories, in particular, is now a lot cleaner, with far fewer page loads. Thanks a million, Marshall!

  • I completed the work started by Chris Heiland of the University of Washington, Bothell, and Scott Lanning of the World Health Organization to port Bricolage to Apache 2. They really did most of the hard work, and I just spent several days integrating everything, making sure all the features work, and updating the installer to handle differences in configuration. I thought this would take me a day or two, but it actually took the better part of a week! So much has changed, but in truth Bricolage is now better for running on mod_perl 2. Expect to see Apache 2 bet the recommended platform for Bricolage in a release in the near future.

  • I integrated a number of patches from Brian Smith of Gossamer Threads to allow the installer to be run as a non-root user. The key here is if the installer has to become the database super user, which is required for ident authentication, and of course whether files are to be installed somewhere on the system requiring super user access. This work is not done, yet, as make upgrade and make uninstall are not quite there yet. But we’re getting there, and it should be all done in time for 2.0, thanks to Brian.

  • I added support for a whole slew of environment variables to the installer. Now you can set environment variables to override default settings for installation parameters, such as choice of RDBMS, Apache, location of an SSL cert and key, whether to support SLL, and lots of other stuff, besides. This is all documented in the “Quick Installation Instructions” section of Bric::Admin/INSTALL.

  • I fully tested and fixed a lot of bugs leftover from making the installer database- and Apache-neutral. Now all of these commands should work perfectly:

    • make
    • make cpan
    • make test
    • make install
    • make devtest
    • make clone
    • make uninstall
  • I improved the DHTML functionality of the “Add More” widget, which is used to add contact information to users and contributors, rules to alert types, and extensions to media types. I think it’s pretty slick, now! This was built on Marshall’s AJAX work.

All of these changes have been integrated into the Bricolage trunk and I’ve pushed out a developer release today. Please do check out all the goodness on a test box and send feedback or file bug reports! There are only a couple of other features waiting to go into Bricolage before we start the release candidate process. And, oh yeah, tht title of this blog post? It’s not a lie. The next production release of Bricolage, based on all this work, will be Bricolage 2.0. Enough of the features we’d planned for Bricolage lo these many years ago are in the trunk that the new version number is warranted. I for one will be thrilled to see 2.0 ship in time for OSCON.

And in case it isn’t already clear, many thanks to the Google Summer of Code and participating students for the great contributions! This release would not have been possible without them.

Also in the news today, the Bricolage server has been replaced! The new server, which hosts the Web site, the wiki and the instance of Bricolage used to manage the site itself, is graciously provided by the kind folks at Gossamer Threads. The server is Gossamer Threads’s way of giving back to the Bricolage community as they prepare to launch a hosted Bricolage solution. Thaks GT!

The old Bricolage server was provided by pair Networds for the last five years. I’d just like to thank pair for the generous five-year loan of that box, which helped provided infrastructure for both Bricolage and Kineticode. Thank you, pair!

And with that, I’m going heads-down on some other projects. I’ll pop back up to make sure that Bricolage 2.0 is released in a few months, but otherwise, I’m on to other things again for a while. Watch this space for details!

Looking for the comments? Try the old layout.

Bricolage GSoC Projects Completed

I’m very pleased to report that the Google Summer of Code Bricolage projects have all been successfully completed. The contributions of the three Summer of Coders, Marshall Roch, Christian Muise, and Andrei Arsu, will be included in the next major release of Bricolage. On behalf of the Bricolage community, like to extend my gratitude to Google for sponsoring these three excellent students to dramatically improve the interface, capabilities, and compatibility of Bricolage.

So what got done? Here’s a rundown:

  • Marshall Roch added many slick Ajax features to Bricolage. The story profile now manages the editing of all elements and subelements in a single screen, with no loading of a separate screen for subelements. You can navigate to subelements by clicking on a tree structure right in the story profile. Subelements more than three levels down will be loaded dynamically when you get to them. You can also drag and drop fields and elements to reorder them.

    Other stuff that Marshall Ajaxified:

    • Document and category keyword editing
    • Document category association
    • Document output channel associations
    • Organizations in the source profile
    • The “Add More” sections of the user, contributor, media type, and alert type profiles
    • Roles in the contributor profile
    • Assets on desks and My Workspace

    Marshall worked hard to integrate more interactive features into this 2000-era application, and I, for one, appreciate his hard work. Great job, Marshall!

  • Christian Muise added support for an occurrence specification to element types and field types. That means that when you make an element type a subelement of another element type, you can specify the minimum and/or maximum number of times that it can be a subelement. So when an element of the parent type is created, it will automatically add the minimum number of instances of a subelement specified for that parent type. This will allow an entire element tree to be pre-populated as soon as you create a new story or media document. Leaving the min and max occurrence set to 0 (zero) maintains the old behavior (no required subelements and an unlimited number can be added).

    Christian did the same for field types, too. The old “Required” and “Repeatable” attributes are gone; now you just specify a minimum number to require that number of instances of a field, and a maximum number to limit the number of instances. Together with the element type occurrence specification, this functionality allows Bricolage administrators to have a lot more control over the structure of the documents created by editors.

    Christian worked hard to complete this project, despite other huge demands on his time this summer (including a full-time job!). But thanks to his active participation on the developer mail list and his willingness to ask questions of his mentor, Scott Lanning, and myself, he overcame all obstacles to implement these features. He even wrote a number of new tests to ensure that it works properly and will continue to do so for the foreseeable future.

    Excellent work, Christian, and thank you so much for your contribution!

  • Andrei Arsu ported Bricolage to MySQL 5. Bricolage has always run on PostgreSQL and used a number of PostgreSQL-specific features to ensure that it ran properly and well. Andrei took these on, converting the existing PostgreSQL DDL files to work on MySQL, figuring out how to convince MySQL to work with some of their idiosyncrasies, and writing compatibility functions in the MySQL driver and upgrade module so that things should largely “just work.” As a result, for the first time ever, you can now build and run Bricolage on MySQL. Can compatibility with other databases be far behind?

    Andrei picked up Perl very quickly during this project, and was able to understand how such horrible code as the Bricolage installer worked without running screaming from the project. His code was well-written and his approaches to compatibility flexible and scalable. Well done, Andrei!

Future Plans

The next tasks toward getting this code integrated and released are as follows:

  • Andrei will merge his MySQL port into subversion trunk. This should actually be fairly straight-forward.

  • Marshall will merge his Ajaxification work into trunk. I don’t expect that there will be any conflicts with Andrei’s work, as the two projects were orthogonal.

  • Christian will merge his occurrence specification work into trunk. This will require that he work some with Andrei to ensure that his changes to the PostgreSQL DDLs are propagated to the new MySQL DDLS. He will also then need to work with Marshall to make sure that the occurrence specification works properly with the Ajaxified UI.

Once these tasks have been completed, we’ll be ready to release a development version of Bricolage with all three of these major improvements. The development release will allow members of the Bricolage community to start to play with the new features, report bugs, and make further suggestions for improvement. Expect the release sometime in the next six weeks or so.

Again, my thanks to Marshall, Christian, and Andrei for their hard work this summer, and for all that they have contributed to the Bricolage community and project. I hope that each will remain involved in the community, not only to support the features they’ve added, but to work with other members of the community to add new features, help newbies, and generally to spread the word.

Looking for the comments? Try the old layout.

Intelligent MySQL Configuration

James Duncan Davidson’s Configuring MySQL on MacOS X post earlier today reminded me that I wanted to blog about the configuration I came up with while installing MySQL 5 on my box. Nothing has irritated me more than when MySQL’s syntax has violated the ANSI SQL standards in the most blatant ways, or when transactions have appeared to work, but mysteriously not worked. Yes, I use Duncan’s settings to make sure that the MySQL box on my PowerBook only listens on local sockets, but I additionally add this configuration to /etc/my.cnf:

[mysqld]
sql-mode=ansi,strict_trans_tables,no_auto_value_on_zero,no_zero_date,no_zero_in_date,only_full_group_by
character-set-server=utf8
default-storage-engine=InnoDB
default-time-zone=utc

That last configuration can actually only be added after running this command:

/usr/local/mysql/bin/mysql_tzinfo_to_sql /usr/share/zoneinfo | \
mysql -u root mysql

But then the upshot is that I have everything configured to be as compliant as possible (although the time zone stuff is just my personal preference):

mysql> SELECT @@global.sql_mode;
mysql> SELECT @@global.sql_mode;
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| @@global.sql_mode                                                                                                                                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| REAL_AS_FLOAT,PIPES_AS_CONCAT,ANSI_QUOTES,IGNORE_SPACE,ONLY_FULL_GROUP_BY,ANSI,NO_AUTO_VALUE_ON_ZERO,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show variables like '%character_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |
| character_set_connection | utf8   |
| character_set_database   | utf8   |
| character_set_filesystem | binary |
| character_set_results    | utf8   |
| character_set_server     | utf8   |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.01 sec)

mysql> show variables like '%table_ty%';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| table_type    | InnoDB |
+---------------+--------+
1 row in set (0.00 sec)

mysql> show variables like 'time_zone%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| time_zone     | utc   |
+---------------+-------+
1 row in set (0.00 sec)

Now that’s the way things should be! Or at least as close as I’m going to get to it in MySQL 5.

Update 2006-11-04: Ask Bjørn Hansen turned me on to the “strict_trans_tables” mode, which prevents MySQL from trying to guess what you mean when you leave out a value for a required column. So I’ve now updated my configuration with sql-mode=ansi,strict_trans_tables.

Update 2009-11-05: I found myself configuring MySQL again today, and there were some other settings I found it useful to add:

  • no_auto_value_on_zero forces AUTO_INCREMENT columns to increment only when inserting a NULL, rather than when inserting a NULL or a zero(!).
  • no_zero_date and no_zero_in_date disallow dates where the the year or month are set to 0.
  • only_full_group_by requires that non-aggregated columns in a select list be included in a GROUP BY clause, as is mandated by the SQL standard. This only applies if an aggregate function is used in a query

I’ve added all of these to the example above.

Looking for the comments? Try the old layout.

What Advanced SQL Book Should I Buy?

So, what advanced SQL book should I buy? I’ve learned a lot about SQL over the last year or so, but I’m sure that Josh Berkus is tired of being my own personal advanced SQL reference. So I’d like to really learn more about triggers, stored procedures, rules, views, and whatnot, what they’re best used for and when to use them. And other typical database features that I’m not familiar with, of course.

What I don’t need is an introduction to SQL. There are a million of those, and they all have much the same stuff. I want to really get into advanced concepts.

So what’s the best choice? Leave me a comment with your opinion. Thanks!

Looking for the comments? Try the old layout.

MySQL’s REPLACE Considered Harmful

So we’ve set up a client with an online poll application using MySQL. Polls are created in Bricolage, and when they’re published, rather than writing data to files, the template writes data to the MySQL database. PHP code on the front-end server then uses the database records to manage the polls.

On the recommendation of one of my colleagues, I was using the MySQL REPLACE statement to insert and update poll answers in the database. At first, this seemed like a cool idea. All I had to do was create a unique index on the story_id and ord (for answer order) columns and I was set. Any time someone reordered the answers or changed their wording in Bricolage, the REPLACE statement would change the appropriate records and just do the right thing.

Or so I thought.

Come the day after the launch of the new site, I get a complaint from the customer that the percentage spread between the answers doesn’t add up to 100%. After some investigation, I realized that the poll_results table is using the ID of each question to identify the votes submitted by readers. This makes sense, of course, and is excellent relational practice, but I have overlooked the fact that REPLACE essentially replaces rows every time it is used. This means that even when a poll answer hasn’t changed, it gets a new ID. Yes, that’s right, its primary key value was changing. Yow!

Now we might have caught this earlier, but the database was developed on MySQL 3.23.58 and, as is conventional among MySQL developers, there were no foreign key constraints. So the poll results were still happily pointing to non-existent records. So a poll might appear to have 800 votes, but the percentages might be counted for only 50 votes. Hence the problem with the percentages not adding up to 100% (nowhere near it, in fact).

Fortunately, the production application is on a MySQL 4.1 server, so I made a number of changes to correct this issue:

  • Added foreign key constraints
  • Exploited a little-known (mis)feature of Bricolage to store primary keys for all poll answers (and questions, for that matter)
  • Switched from REPLACE to INSERT, UPDATE, and DELETE statements using the primary keys

I also started using transactions when making all these updates when a poll is published so that changes are always atomic. Now it works beautifully.

But the lesson learned is that REPLACE is a harmful construct. Yes, it was my responsibility to recognize that it would create new rows and therefore new primary keys. But any construct that changes primary keys should be stricken from any database developer’s toolbox. The fact that MySQL convention omits the use of foreign key constraints makes this a particularly serious issue that can appear to have mysterious consequences.

So my advice to you, gentle reader, is don’t use it.

Looking for the comments? Try the old layout.

My Adventures with Mac OS X

I recently decided to make the leap from Yellow Dog Linux to Mac OS X on my Titanium PowerBook. Getting everything to work the way I wanted proved to be a challenge, but well worth it. This document outlines all that I learned, so that neither you nor I will have to experience such pain again. The overall goal was to get Bricolage up and running, figuring that if it worked, then just about any mod_perl based solution would run. I’m happy to say that I was ultimately successful. You can be, too.

In the descriptions below, I provide links to download the software you’ll need, as well as the shell commands I used to compile and install each package. In all cases (except for the installation of the Developer Tools), I saved each package’s sources to /usr/local/src and gunzipped and untarred them there. I also carried out each step as root, by running sudo -s. If you’re not comfortable using a Unix shell, you might want to read up on it, first. All of my examples also assume a sh-compatible shell, such as bash or zsh. Fortunately, zsh comes with OS X, so you can just enable it for yourself in NetInfo Manager by setting users -> <username> -> shell to “/bin/zsh”, where <username> is your user name.

Developer Tools

All of the software that I describe installing below must be compiled. To compile software on Mac OS X, you need to install the Mac OS X Developer Tools. These provide the cc compiler and many required libraries. Conveniently, these come on a CD-ROM with the Mac OS X Version 10.1 upgrade kit. I just popped in the CD and installed them like you’d install any other OS X software. I needed administrative access to OS X to install the Developer Tools (or, indeed, to install any of the other software I describe below), but otherwise it posed no problems.

The best time to install the Developer Tools is immediately after upgrading to OS X version 10.1. Then run the Software Update applet in the System preferences to get your system completely up-to-date. By the time I was done, I had the system updated to version 10.1.3.

Emacs

The first step I took in the process of moving to OS X was to get working the tools I needed most. Essentially, what this meant was GNU Emacs. Now I happen to be a fan of the X version of Emacs – not XEmacs, but GNU Emacs with X support built in. I wasn’t relishing the idea of having to install X on OS X (although there are XFree86 ports that do this), so I was really pleased to discover the Mac-Emacs project. All I had to do was patch the GNU Emacs 21.1 sources and compile them, and I was ready to go! GNU Emacs works beautifully with the OS X Aqua interface.

There were a few configuration issues for me to work out, however. I have become addicted to the green background that an old RedHat .XConfig file had set, and I wanted this feature in OS X, too. Plus, the default font was really ugly (well, too big, really – anyone know how to make it smaller in Emacs?) and the Mac command key was working as the Emacs META key, rather than the option key. So I poked around the net until I found the settings I needed and put them into my .emacs file:

(custom-set-faces
'(default ((t (:stipple nil
  :background "DarkSlateGrey"
  :foreground "Wheat"
  :inverse-video nil
  :box nil
  :strike-through nil
  :overline nil
  :underline nil
  :slant normal
  :weight normal
  :height 116
  :width normal
  :family "apple-andale mono"))))
'(cursor ((t (:background "Wheat"))))
; Use option for the meta key.
(setq mac-command-key-is-meta nil)

Installing Emacs is not required for installing any of the other packages described below – it just happens to be my favorite text editor and IDE. So I don’t provide the instructions here; the Mac-Emacs project does a plenty good job. If you’re not comfortable with Unix editors, you can use whatever editor you like. BBEdit is a good choice.

GDBM

Mac OS X doesn’t come with a DBM! But since mod_ssl needs it, we have to install it. Fortunately, I found this PDF detailing someone else’s adventures with mod_ssl on OS X, and it provided decent instructions for installing GDBM. First, I created a new user for GDBM. In NetInfoManager, I created a duplicate of the “unknown” user and named it “bin”. Then, I downloaded GDBM from the FSF, and installed it like this:

cd /usr/local/src/gdbm-1.8.0
cp /usr/libexec/config* .
./configure
make
make install
ln -s /usr/local/lib/libgdbm.a /usr/local/lib/libdbm.a

That did the trick. Nothing else was involved, fortunately.

Expat

Who doesn’t do something with XML these days? If your answer is, “not me!”, then you’ll need to install the Expat library in order to work with XML::Parser in Perl. Fortunately it’s relatively easy to install, although support for the -static flag appears to be broken in cc on OS X, so it needs to be stripped out. I downloaded it from its project bpage, and then did this:

cd /usr/local/src/expat-1.95.2
./configure
perl -i.bak -p -e \
  's/LDFLAGS\s*=\s*-static/LDFLAGS=/' \
  examples/Makefile
perl -i.bak -p -e \
    's/LDFLAGS\s*=\s*-static/LDFLAGS=/' \
    xmlwf/Makefile
make
make install

Perl

Although Mac OS X ships with Perl (Yay!), it’s the older 5.6.0 version. There have been many bug fixes included in 5.6.1, so I wanted to make sure I got the latest stable version before I built anything else around it (mod_perl, modules, etc.).

Being a Unix program, Perl doesn’t expect to run into the problems associated with a case-insensitive file system like that Mac OS X’s HFS Plus. So there are a couple of tweaks to the install process that make it slightly more complicated than you would typically expect. Fortunately, many have preceded us in doing this, and the work-arounds are well-known. Basically, it comes down to this:

cd /usr/local/src/perl-5.6.1/
export LC_ALL=C
export LANG=en_US
perl -i.bak -p -e 's|Local/Library|Library|g' hints/darwin.sh
sh Configure -des -Dfirstmakefile=GNUmakefile -Dldflags="-flat_namespace"
make
make test
make install

There were a few errors during make test, but none of them seems to be significant. Hopefully, in the next version of Perl, the build will work just as it does on other platforms.

Downloads

Before installing Open SSL, mod_ssl, mod_perl, and Apache, I needed to get all the right pieces in place. The mod_ssl and mod_perl configure processes patch the Apache sources, so the Apache sources have to be downloaded and gunzipped and untarred into an adjacent directory. Furthermore, the mod_ssl version number corresponds to the Apache version number, so you have to be sure that they match up. Normally, I would just download the latest versions of all of these pieces and run with it.

However, Bricolage requires the libapreq library and its supporting Perl modules to run, and these libraries have not yet been successfully ported to Mac OS X. But worry not; fearless mod_perl hackers are working on the problem even as we speak, and there is an interim solution to get everything working.

As of this writing, the latest version of Apache is 1.3.24. But because I needed libapreq, I had to use an experimental version of Apache modified to statically compile in libapreq. Currently, only version 1.3.23 has been patched for libapreq, so that’s what I had to use. I discovered this experimental path thanks to a discussion on the Mac OS X Perl mail list.

So essentially what I did was download the experimental apache.tar.gz and the experimental lightweight apreq.tar.gz packages and gunzip and untar them into /usr/local/src. Then I was ready to move on to Open SSL, mod_ssl, and mod_perl.

Open SSL

Compiling Open SSL was pretty painless. One of the tests fails, but it all seems to work out, anyway. I download the sources from the Open SSL site, and did this:

cd /usr/local/src/openssl-0.9.6c
./config
make
make test

mod_ssl

The mod_ssl Apache module poses no problems whatsoever. I simply downloaded mod_ssl-2.8.7-1.3.23 from the mod_ssl site (note that the “1.3.23” at the end matches the version of Apache I downloaded) and gunzipped and untarred it into /usr/local/src/. Then I simply excuted:

./configure --with-apache=/usr/local/src/apache_1.3.23

mod_perl

Configuring and installing mod_ssl was, fortunately, a relatively straight-forward process. Getting Apache compiled with mod_perl and mod_ssl, however, was quite tricky, as you’ll see below. A number of braver folks than I have preceded me in installing mod_perl, so I was able to rely on their hard-earned knowledge to get the job done. For example, Randal Schwartz posted instructions to the mod_perl mail list, and his instructions worked well for me. So I downloaded the sources from the mod_perl site, and did this:

cd /usr/local/src/mod_perl-1.26
perl Makefile.PL \
  APACHE_SRC=/usr/local/src/apache_1.3.23/src \
  NO_HTTPD=1 \
  USE_APACI=1 \
  PREP_HTTPD=1 \
  EVERYTHING=1
make
make install

Apache

Getting Apache compiled just right was the most time-consuming part of this process for me. Although many had gone before me in this task, everybody seems to do it differently. I had become accustomed to just allowing Apache to use most of its defaults when I compiled under Linux, but now I was getting all kinds of errors while following different instructions from different authorities from around the web. Sometimes Apache wouldn’t compile at all, and I’d get strange errors. Other times it would compile, pass all of its tests, and install, only to offer up errors such as

dyld: /usr/local/apache/bin/httpd Undefined symbols: _log_config_module

when I tried to start it. It turns out that the problem there was that I had a number of modules compiled as DSOs – that is, libraries that can be loaded into Apache dynamically – but wasn’t loading them properly in my httpd.conf. This was mainly because I’ve grown accustomed to Apache having all the libraries I needed compiled in statically, so I simply didn’t have to worry about them.

But I finally hit on the right incantation to get Apache to compile with everything I need added statically, but still with support for DSOs by compiling in mod_so. I present it here for your viewing pleasure:

SSL_BASE=/usr/local/src/openssl-0.9.6c/ \
  ./configure \
  --with-layout=Apache \
  --enable-module=ssl \
  --enable-module=rewrite \
  --enable-module=so \
  --activate-module=src/modules/perl/libperl.a \
  --disable-shared=perl \
  --without-execstrip
make
make certificate TYPE=custom 
make install

This series of commands successfully compiled Apache with mod_perl and mod_ssl support statically compiled in, along with most of the other default modules that come with Apache. In short, everything is there that you need to run a major application with security such as Bricolage.

Note that make certificate will lead you through the process of creating an SSL certificate. I like to use the “custom” type so that it reflects the name of my organization. But you can use whatever approach you’re most comfortable with. Consult the mod_ssl INSTALL file for more information.

libapreq

Once Apache is installed with mod_perl and mod_ssl, the rest is gravy! The experimental libapreq library I downloaded installed without a hitch:

cd /usr/local/src/httpd-apreq
perl Makefile.PL
make
make install

PostgreSQL

PostgreSQL is a sophisticated open-source Object-Relational DBMS. I use it a lot in my application development, and it, too, is required by Bricolage. I was a bit concerned about how well it would compile and work on Mac OS X, but I needn’t have worried. First of all, Apple has provided some pretty decent instructions. Although they mainly document how to install MySQL, a competing open-source RDBMS, many of the same concepts apply to PostgreSQL.

The first thing I had to do was to create the “postgres” user. This is the system user that PostgreSQL typically runs as. I followed Apple’s instructions, using NetInfo Manager to duplicate the default “www” group and “www” user and give the copies the name “postgres” and a new gid and uid, respectively.

Next I downloaded the PostgreSQL version 7.2.1 sources. Version 7.2 is the first to specifically support Mac OS X, so going about the install was as simple as it is on any Unix system:

./configure --enable-multibyte=UNICODE
make
make install

That was it! PostgreSQL was now installed. Next I had to initialize the PostgreSQL database directory. Again, this works much the same as it does on any Unix system:

sudo -u postgres /usr/local/pgsql/bin/initdb \
  -D /usr/local/pgsql/data

The final step was to start PostgreSQL and try to connect to it:

sudo -u postgres /usr/local/pgsql/bin/pg_ctl start \
  -D /usr/local/pgsql/data /usr/local/pgsql/bin/psql -U postgres template1

If you follow the above steps and find yourself at a psql prompt, you’re in business! Because I tend to use PostgreSQL over TCP, I also enabled TCP connectivity by enabling the “tcpip_socket” option in the postgresql.conf file in the data directory created by initdb:

tcpip_socket = true

If you’re like me, you like to have servers such as PostgreSQL start when your computer starts. I enabled this by creating a Mac OS X PostgreSQL startup bundle. It may or may not be included in a future version of PostgreSQL, but in the meantime, you can download it from here. Simply download it, gunzip and untar it into /Library/StartupItems, restart OS X, and you’ll see it start up during the normal Mac OS X startup sequence. I built this startup bundle by borrowing from the existing FreeBSD PostgreSQL startup script, the Apache startup script that ships with OS X, and by reading the Creating SystemStarter Startup Item Bundles HOWTO.

XML::Parser

At this point, I had most

of the major pieces in place, and it was time for me to install the Perl modules I needed. First up was XML::Parser. For some reason, XML::Parser can’t find the expat libraries, even though the location in which I installed them is pretty common. I got around this by installing XML::Parser like this:

perl Makefile.PL EXPATLIBPATH=/usr/local/lib \
  EXPATINCPATH=/usr/local/include
make
make test
make install

Text::Iconv

In Bricolage, Text::Iconv does all the work of converting text between character sets. This is because all of the data is stored in the database in Unicode, but we wanted to allow users to use the character set with which they’re accustomed in the UI. So I needed to install Text::Iconv. Naturally, Mac OS X doesn’t come with libiconv – a library on which Text::Iconv depends – so I had to install it. Fortunately, it was a simple process to download it and do a normal build:

cd /usr/local/src/libiconv-1.7
./configure
make
make install

Now, Text::Iconv itself was a little more problematic. You have to tell it to look for libiconv by adding the -liconv option to the LIBS key in Makefile.PL. I’ve simplified doing this with the usual Perl magic:

perl -i.bak -p -e \
  "s/'LIBS'\s*=>\s*\[''\]/'LIBS' => \['-liconv'\]/" \
  Makefile.PL
perl Makefile.PL
make
make test
make install

DBD::Pg

Although the DBI installed via the CPAN module without problem, DBD::Pg wanted to play a little less nice. Of course I specified the proper environment variables to install it (anyone know why DBD::Pg’s Makefile.PL script can’t try to figure those out on its own?), but still I got this error during make:

/usr/bin/ld: table of contents for archive:
/usr/local/pgsql/lib/libpq.a is out of date;
rerun  ranlib(1) (can't load from it)

But this was one of those unusual situations in which the error message was helpful. So I took the error message’s advice, and successfully compiled and installed DBD::Pg like this:

ranlib /usr/local/pgsql/lib/libpq.a
export POSTGRES_INCLUDE=/usr/local/pgsql/include
export POSTGRES_LIB=/usr/local/pgsql/lib
perl Makefile.PL
make
make test
make install

LWP

The last piece I needed to worry about customizing when I installed it was LWP. Before installing, back up /usr/bin/head. The reason for this is that LWP will install /usr/bin/HEAD, and because HFS Plus is a case-insensitive file system, it’ll overwrite /usr/bin/head! This is a pretty significant issue, since many configure scripts use /usr/bin/head. So after installing LWP, move /usr/bin/HEAD, GET, & POST to /usr/local/bin. Also move /usr/bin/lwp* to /usr/local/bin. Then move your backed-up copy of head back to /usr/bin.

Naturally, I didn’t realize that this was necessary until it was too late. I installed LWP with the CPAN module, and it wiped out /usr/bin/head. Fortunately, all was not lost (though it took me a while to figure out why my Apache compiles were failing!): I was able to restore head by copying it from the Mac OS X installer CD. I Just popped it in an executed the command:

cp "/Volumes/Mac OS X Install CD/usr/bin/head" /usr/bin

And then everything was happy again.

Bricolage

And finally, the pièce de résistance: Bricolage! All of the other required Perl modules installed fine from Bundle::Bricolage:

perl -MCPAN -e 'install Bundle::Bricolage'

Then I simply followed the directions in Bricolage’s INSTALL file, and started ’er up! I would document those steps here, but the install process is currently in flux and likely to change soon. The INSTALL file should always be current, however – check it out!

To Be Continued

No doubt my adventures with Unix tools on Mac OS X are far from over. I’ve reported to various authors on the issues I’ve described above, and most will soon be releasing new versions to address those issues. As they do, I’ll endeavor to keep this page up-to-date. In the meantime, I am thoroughly enjoying working with the first really solid OS that Apple has released in years, and thrilled that I can finally have the best of both worlds: a good, reliable, and elegant UI, and all the Unix power tools I can stand! I hope you do, too.

Looking for the comments? Try the old layout.