CF Web>Database>PostgreSQL>PostgresSchemaUpdate (revision 1)~~EditAttach~~

Updating a Postgres Schema

It is easy to edit the .sql files which specify the tables and other schema entries for an application. In development, it is even easy to drop and re-create the schema from these files. But when it comes time to apply the changes to an application in production, it can be a bit trickier. This is especially so if the application provides tables or other objects directly to other applications; in this case, dumping the data, dropping and re-creating the schema, and restoring the data may not work.

Instead, what we do is write an update script, which is just an SQL file which issues appropriate SQL commands to change the existing tables to match the schema creation file. I believe there are automatic tools for doing this but I haven‘t investigated them. This is a job I want to understand how to do manually before I start trusting automation—the whole point is to update a production schema in place, so it is important that one understands exactly what the automatic tool is going to do.

Starting Out

The first thing to do is to dump the schema of the production schema. So once we have set our environment variables to cause us to connect to the production database, we use pg_dump to get the existing schema and save it in a file:

pg_dump --schema=uwdata --schema-only >old_uwdata.dump

Now we set our environment variables back to our usual values so the rest of the work connects to the development database. The rest of the work is done using exclusively the development database until we have a migration script we are confident is correct.

Basic Idea

We want to find the differences between our updated schema and the schema actually in use on the production server. The diff tool is nice for finding differences between files, but it performs a fairly unsophisticated textual diff—as far as it is concerned, every file is an sequence of atoms (lines). We need to translate the schemas into a form which diff can handle. pg_dump almost does this, but it can put items out of order, which typically isn't a real difference but will certainly be reported as one by diff.

The solution (which is probably only partial) is to use diff only on the result of using pg_dump on a schema created by loading the output of pg_dump.

-- IsaacMorland - 14 Jul 2009

Topic revision: r1 - 2009-07-14 - IsaacMorland

Information in this area is meant for use by CSCF staff and is not official documentation, but anybody who is interested is welcome to use it if they find it useful.

Other Webs

My links
- People
- CERAS
- WatForm
- Tetherless lab
- Ubuntu Main.HowTo
- eDocs
- RGG NE notes
- RGG
- CS infrastructure
- Grad images

Edit