04-17-08
--------

Summary
- Trouble with drms_series() PL/pgSQL function
- Improve prototype of a perl interface to SUMS API
- SU export script
- New db machine
- Writeup Slony-I logshipping status and todos
- Slow queries
- Sub-record idea to reduce memory usage for drms_open_record()

Trouble with drms_series() PL/pgSQL function
--------------------------------------------
Ran into some problems when adding new columns to drms_series tables.
All drms_series tables must be structurally identical: the columns must
appear in the same order in every table definition. Learned a useful
debugging tool, 'RAISE NOTICE'. Fixed up masterlists.c to ensure the
column order of future drms_series tables.

Improve prototype of a perl interface to SUMS API
-------------------------------------------------
In my last report I mentioned that I managed to crash sum_svc while
testing SUM_put(). This is a check the SUMS API can do itself; Jim has
agreed to add it.

New improvements:
- moved the wrapper functions, without printf's, out of sum_open.c and
  into the interface spec.
- use uint64_t instead of unsigned long long.
- put the swig command into the makefile.

To do:
- select a location for the perl module.
- test with gcc (currently using icc).

SU export script
----------------
The SU export script now uses the perl interface to the SUMS API.
Status may be queried via:
- a separate SU export log table
- sum_main.online_status (most accurate)

Open question: what needs to be logged for SU export?

Todo: SU ingest. This depends on Jim's new function SUM_alloc2(), which
takes a sunum as an input parameter.
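For the "put the swig command into the makefile" item, a minimal sketch
might look like the following. The interface file name sums.i, the
module name SUMS, and the compiler flags are all assumptions; the real
names and options in the build will differ:

```make
# Hypothetical rule: generate the perl wrapper from the SWIG interface
# spec.  'swig -perl5 -module SUMS sums.i' emits sums_wrap.c and the
# perl module SUMS.pm.
sums_wrap.c SUMS.pm: sums.i
	swig -perl5 -module SUMS sums.i

# Compile the wrapper against the perl core headers (gcc still to be
# tested; icc is what is currently used).
sums_wrap.o: sums_wrap.c
	$(CC) -c -fPIC `perl -MExtUtils::Embed -e ccopts` sums_wrap.c
```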
New db machine
--------------
DB jsoc is currently about 15G. Installed postgres from src in
/usr/local/:
- 8.3.1 uses the default port 5432
- 8.1.11 is configured to run on port 5434

Migration test plan:
- mount the disk from hmidb
- run 8.1.11 from this mounted disk on port 5434
- run 8.3.1 on port 5432
- move the data from 8.1.11 to 8.3.1

Writeup Slony-I logshipping status and todos
--------------------------------------------
In my email on Mar 19, I listed the following targets for Slony log
shipping:
1. the order of log application: search for the right log file using a
   script similar to the one given in the Slony documentation.
2. dynamically add and delete a series in a running replication
   cluster.
3. run the slony_logshipper tool.

Jennifer has since done 1 and part of 2 (add only, no delete yet). Here
is what I think should be done next:
1. run the slony_logshipper tool (Jennifer, we talked about it before
   the group meeting. I forgot to add it to your handwritten notes.
   Please add it).
2. run the log ingest script without the target table on the log slave
   site. This is to handle adding a series on the log slave. We need to
   be able to gracefully bail out of the ingest script, manually put in
   the new table, and resume the ingest script.
3. write a script for moving logs from the log-generating site to the
   remote site. This can be simulated by moving logs between two
   directories using cp, which can be changed to scp/sftp later.
   Deciding when to move logs can be timer based or log-volume based; a
   better choice is both, whichever meets its threshold first. We also
   need to keep a log of what has been transferred.

After all these pieces are ready, we should conduct a more realistic
test. I have outlined it to Jennifer; here it is more formally. Let M
be the master DB, S the Slony slave, and LS the log shipping slave. Let
T1 and T2 be two series on M.
0. Run pgbench on M to update T1 and T2 continuously.
1. Configure S to replicate T1 and generate logs (not yet used).
2. Configure LS to replicate T1 (this tests the creation of a new
   remote DRMS replica). This might use slony1_dump.sh to bring LS
   up-to-date, then use log shipping for incremental updates.
3. Add T2 to the replication set on S.
4. Add T2 to the replication set on LS.
5. Drop T2 from the replication set on S.
...

I recommend running this test sequence on the toy pgbench database
first, before any real data is involved. In fact, I think we should
keep this pgbench toy database permanently. Any time we want to
perform a slony configuration change, we should first practice it on
this toy database, simulating the real case as closely as possible,
before trying it on the real database.

Slow queries
------------
I saw a few slow queries (each of them repeats). I'll dig in more
later.
1. LOG: duration: 28030.756 ms statement: update SUM_PARTN_AVAIL set
   AVAIL_BYTES = 31685997259489 where PARTN_NAME = '/SUM1'
2. LOG: duration: 1436.054 ms statement: fetch reqcursor
   I don't know where this one came from. No cursor is used in DRMS
   queries, so it most likely comes from SUMS.
3. LOG: duration: 1394.979 ms statement: select ds_index from
   sum_partn_alloc where wd like '/SUM3/D1130392%'
   There is no index on sum_partn_alloc.wd. Jim, since you are using
   'like', the standard "create index" won't work. I showed you before
   how to do it; let me know if you need those notes.
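For reference, those notes boil down to something like the sketch
below: with a non-C locale, a plain btree index on wd cannot serve
anchored LIKE patterns, but an index built with the pattern operator
class can (the index name here is made up):

```sql
-- Plain "CREATE INDEX ... (wd)" is ignored by the planner for LIKE in
-- a non-C locale; the varchar_pattern_ops operator class fixes that.
CREATE INDEX sum_partn_alloc_wd_idx
    ON sum_partn_alloc (wd varchar_pattern_ops);

-- The planner can then turn an anchored pattern such as
--   SELECT ds_index FROM sum_partn_alloc WHERE wd LIKE '/SUM3/D1130392%';
-- into a range scan on the new index.
```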
jsoc=> \d sum_partn_alloc
          Table "public.sum_partn_alloc"
      Column       |         Type          | Modifiers
-------------------+-----------------------+-----------
 wd                | character varying(80) | not null
 sumid             | bigint                | not null
 status            | integer               | not null
 bytes             | bigint                |
 effective_date    | character varying(20) |
 archive_substatus | integer               |
 group_id          | integer               |
 ds_index          | bigint                | not null
 safe_id           | integer               |
Indexes:
    "sum_partn_alloc_ds_index_idx" btree (ds_index)
    "sum_partn_alloc_effdate_idx" btree (effective_date)
    "sum_partn_alloc_sumid_idx" btree (sumid)

I vacuumed the sum_partn_avail table; not sure why it was so slow.

Sub-record idea to reduce memory usage for drms_open_record()
-------------------------------------------------------------
I had an idea: create a view of the record template to cut down memory
usage when only a subset of keywords is of interest. Since we do use
the template to create records, why not create a smaller template, a
sub-template, in place of the original one.
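On the database side, the sub-template idea could be sketched as a view
exposing only the columns of interest. The series and keyword names
below are invented for illustration; a real sub-template would be built
from whatever keywords the caller asks for:

```sql
-- Hypothetical series table su_arta.testseries; the view keeps only
-- the record index plus the few keywords the caller cares about, so a
-- template built from it is correspondingly smaller.
CREATE VIEW su_arta.testseries_sub AS
    SELECT recnum, sunum, t_obs, quality
    FROM su_arta.testseries;
```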