TWiki
>
CF Web
>
Database
>
PostgreSQL
>
PostgresSchemaUpdate
(2009-07-20,
IsaacMorland
)
(raw view)
E
dit
A
ttach
---+ Updating a Postgres Schema It is easy to edit the =.sql= files which specify the tables and other schema entries for an application. In development, it is even easy to drop and re-create the schema from these files. But when it comes time to apply the changes to an application in production, it can be a bit trickier. This is especially so if the application provides tables or other objects directly to other applications; in this case, dumping the data, dropping and re-creating the schema, and restoring the data may not work. Instead, what we do is write an update script, which is just an SQL file which issues appropriate SQL commands to change the existing tables to match the schema creation file. I believe there are automatic tools for doing this but I haven’t investigated them. This is a job I want to understand how to do manually before I start trusting automation—the whole point is to update a production schema in place, so it is important that one understands *exactly* what the automatic tool is going to do. ---++ Starting Out The first thing to do is to dump the schema of the production schema. So once we have set our environment variables to cause us to connect to the production database, we use =pg_dump= to get the existing schema and save it in a file: <verbatim> pg_dump --schema=uwdata --schema-only >old_uwdata.dump </verbatim> Now we set our environment variables back to our usual values so the rest of the work connects to the development database. The rest of the work is done using exclusively the development database until we have a migration script we are confident is correct. Make sure you don’t set your connection information to connect to the production database while experimenting, otherwise you will need PostgresBackupRecovery at some point. ---++ Basic Idea We want to find the differences between our updated schema and the schema actually in use on the production server. The =diff= tool is nice for finding differences between files, but it performs a fairly unsophisticated textual diff—as far as it is concerned, every file is an sequence of atoms (lines). We need to translate the schemas into a form which =diff= can handle. =pg_dump= almost does this, but it can put items out of order, which typically isn't a real difference but will certainly be reported as one by =diff=. The solution (which is not quite complete) is to use =diff= only on the result of using =pg_dump= on a schema created by loading the output of =pg_dump=. We need a script to obtain an appropriate schema dump based on our version-controlled schema files, and another script to obtain an appropriate schema dump based on the production schema, adjusted by our proposed migration script. ---++ Example Scripts The schema names and file names will have to be adjusted for specific conditions. Make sure that environment variables are set for the development instance before running these, otherwise you will need PostgresBackupRecovery. Here is an example script to get the version-controlled target schema: <verbatim> #!/bin/bash psql <<EOF set role dba; drop schema uwdata cascade; \i init_uwdata.sql EOF pg_dump --schema=uwdata --schema-only >fresh_uwdata.dump psql <<EOF set role dba; drop schema uwdata cascade; \i fresh_uwdata.dump EOF pg_dump --schema=uwdata --schema-only >fresh_uwdata_reloaded.dump </verbatim> Here is a script to test the migration and check the results: <verbatim> #!/bin/bash psql <<EOF set role dba; drop schema uwdata cascade; \i old_prod_uwdata_reloaded.dump \i migrate_regdata.sql EOF pg_dump --schema=uwdata --schema-only >migrated_regdata.sql diff migrated_regdata.sql fresh_uwdata_reloaded.dump diff migrated_regdata.sql fresh_uwdata_reloaded.dump | wc </verbatim> The =old_prod_uwdata_reloaded.dump= file is obtained by loading =old_uwdata.dump=, generated earlier, and dumping using =pg_dump=. This is the same as the technique used to obtain =fresh_uwdata_reloaded.dump= in the first script. ---++ Deficiencies I know of the following differences that cannot be resolved with this technique: * Fields cannot be re-ordered. When an =ALTER TABLE= command adds a field, it does so as the last field in the table. Therefore, fields added to schema files need to be added as the last field also in order to match. * Permissions seem to be dumped in an order that is related to the order in which the permissions were assigned. Typically this appears in the =diff= output as re-ordered lines, which are fairly easy to recognize as long as the total number of them is manageable. -- Main.IsaacMorland - 14 Jul 2009
E
dit
|
A
ttach
|
Watch
|
P
rint version
|
H
istory
: r4
<
r3
<
r2
<
r1
|
B
acklinks
|
V
iew topic
|
WYSIWYG
|
M
ore topic actions
Topic revision: r4 - 2009-07-20
-
IsaacMorland
CF
Information in this area is meant for use by CSCF staff and is not official documentation, but anybody who is interested is welcome to use it if they find it useful.
CF Web
CF Web Home
Changes
Index
Search
Administration
Communication
Email
Hardware
HelpDeskGuide
Infrastructure
InternalProjects
Linux
MachineNotes
Macintosh
Management
Networking
Printing
Research
Security
Software
Solaris
StaffStuff
TaskGroups
TermGoals
Teaching
UserSupport
Vendors
Windows
XHier
Other Webs
CSEveryBody
Main
Sandbox
TWiki
UW
My links
People
CERAS
WatForm
Tetherless lab
Ubuntu Main.HowTo
eDocs
RGG NE notes
RGG
CS infrastructure
Grad images
Edit
Copyright © 2008-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback