05-15-08
--------
- DB migration
  . Binaries
  . Migration glitches
  . New setup
  . Migration casualties
  . Aftermath
- Bug fix

DB migration
------------
. Binaries: compiled 8.3.1 with modified src on both the old and new
  machines; ran the 8.3.1 version of pg_dumpall on the old machine.

. Migration glitches
  1. The default locale setup on the machine is not C.

     hmidb:~$ locale
     LANG=en_US.UTF-8
     LC_CTYPE="en_US.UTF-8"
     LC_NUMERIC="en_US.UTF-8"
     LC_TIME="en_US.UTF-8"
     LC_COLLATE="en_US.UTF-8"
     LC_MONETARY="en_US.UTF-8"
     LC_MESSAGES="en_US.UTF-8"
     LC_PAPER="en_US.UTF-8"
     LC_NAME="en_US.UTF-8"
     LC_ADDRESS="en_US.UTF-8"
     LC_TELEPHONE="en_US.UTF-8"
     LC_MEASUREMENT="en_US.UTF-8"
     LC_IDENTIFICATION="en_US.UTF-8"
     LC_ALL=

     This default setting does not allow a database with encoding
     LATIN1, and jsoc uses LATIN1. I did the following:

     initdb --locale=C -D /usr/local/pgsql/data
     initdb --locale=C -D /usr/local/pgsql/data_sums

  2. Checkpoints were too frequent for the migration task. When I
     raised the checkpoint setting to 10 times its default value, the
     warning message disappeared.

----------------------------------------------------------------------

. New setup
  The new machine is now called hmidb, and the old machine has been
  renamed hmidb2.

  Two database servers:
  1. Port 5432 serves DRMS
     /d/pgsql/data         for the database
     /d/pgsql/backup       for archiving WALs, will move to
                           /c/pgsql/backup
  2. Port 5434 serves SUMS
     /d/pgsql/data_sums    for the database
     /d/pgsql/backup_sums  for archiving WALs, will move to
                           /c/pgsql/backup_sums

  Both servers have the same pg_hba.conf. The server configuration
  varies, with the DRMS DB server getting more memory. Both have
  archiving turned on. Neither server is in /etc/init.d.

. Migration casualties
  SID database browser

. Aftermath
  - hmidb2 may still run postmaster with listen_addresses set to
    localhost
  - removed SUMS tables from DB jsoc, which got transferred with other
  - home directories on old db machine. Brian moved them
  - found out that Jim backed up DCS stuff into the postgres directory,
    not a good idea.
  - cron scripts need rework.

Bug fix
-------
Art noticed that transient records were not working for direct-connect
modules. I located the problem in drms_client.c:drms_alloc_recnum(),
which did not record the recnums in templist for direct-connect
modules. While fixing this problem, I introduced another one that made
all records temporary. Art fixed that bug.

The diff below runs from Art's fixed revision (1.15) back to my buggy
revision (1.14), so the lines marked '-' are the ones present in the
fix and the line marked '+' is the buggy unconditional call.

--- drms_client.c	10 May 2008 00:02:24 -0000	1.15
+++ drms_client.c	9 May 2008 19:25:18 -0000	1.14
@@ -771,10 +771,7 @@
   if (env->session->db_direct)
   {
     seqnums = db_sequence_getnext_n(env->session->db_handle, series, n);
-    if (lifetime == DRMS_TRANSIENT)
-    {
-      drms_server_transient_records(env, series, n, seqnums);
-    }
+    drms_server_transient_records(env, series, n, seqnums);
     return seqnums;
   }
   else

If you used the buggy code (revision 1.14), your records were not
committed to the database, because it treated all records as
transient. The segments associated with those records become orphans.
Other than that, you just wasted some recnums.
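
For anyone who wants to see the corrected logic in isolation, here is
a minimal, self-contained sketch of the guard that revision 1.15
restores. It is not the DRMS code: every name prefixed with sketch_ or
SKETCH_ is a hypothetical stand-in, the session struct is invented for
illustration, and the fixed-size templist is an assumption. Only the
shape of the check (register recnums as transient only when the
requested lifetime is transient) is taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real DRMS types are not shown in these notes. */
typedef enum { SKETCH_PERMANENT, SKETCH_TRANSIENT } sketch_lifetime_t;

#define SKETCH_MAX_TEMP 64

typedef struct {
    long long templist[SKETCH_MAX_TEMP]; /* recnums to discard at session end */
    size_t    ntemp;                     /* number of entries in templist */
    long long next_seq;                  /* stand-in for the DB sequence */
} sketch_session_t;

/* Hand out n consecutive recnums (plays the role of db_sequence_getnext_n). */
static void sketch_getnext_n(sketch_session_t *s, size_t n, long long *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = s->next_seq++;
}

/* Append recnums to templist (plays the role of
 * drms_server_transient_records). Returns 0 on success, -1 if full. */
static int sketch_mark_transient(sketch_session_t *s, size_t n,
                                 const long long *recnums)
{
    if (n > SKETCH_MAX_TEMP - s->ntemp)
        return -1;
    for (size_t i = 0; i < n; i++)
        s->templist[s->ntemp++] = recnums[i];
    return 0;
}

/* Direct-connect allocation with the guard in place: only transient
 * records are added to templist; permanent ones are left alone. */
static int sketch_alloc_recnum(sketch_session_t *s, sketch_lifetime_t lifetime,
                               size_t n, long long *out)
{
    assert(s != NULL && out != NULL);
    sketch_getnext_n(s, n, out);
    if (lifetime == SKETCH_TRANSIENT)
        return sketch_mark_transient(s, n, out);
    return 0;
}
```

Dropping the lifetime check in sketch_alloc_recnum() reproduces the
1.14 behaviour: every allocated recnum lands in templist, so permanent
records get treated as temporary as well.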