03-13-08
--------
Summary
- drms_query_string()
- create_series hangs upon permission problem
- Gcc compilation with cfitsio library
- Slony-I log shipping woes

drms_query_string()
-------------------
Moved the query-string generation part out of drms_retrieve_record(). It
becomes a function by itself to facilitate more query types. We define
three types of queries:

DRMS_QUERY_COUNT: return the record count for a given dsname. No
    "ORDER BY" or "LIMIT" clause in the query string.
DRMS_QUERY_FL: return the selected keyword.
DRMS_QUERY_ALL: return all keywords. This is the one used by
    drms_open_records().

create_series hangs upon permission problem
-------------------------------------------
create_series hangs if it does not have permission to write to any of
the master tables drms_*. I am still debugging it.

Gcc compilation with cfitsio library
------------------------------------
Now that cfitsio has been added to DRMS, gcc had trouble compiling
against the icc-compiled cfitsio library. Keh-Cheng took the trouble to
recompile the cfitsio library using gcc. Keh-Cheng recommends
maintaining two separate cfitsio libraries, one for gcc and one for icc,
in order to get better performance.

Slony-I log shipping woes
-------------------------
I have done some experiments and found that the sequence (the recnum
counter) may not be needed for queries. This is good news for us because
a sequence is not a good candidate for replication.

Since we must have a regular slave node, I propose to have one slony
slave that replicates a union of the series that remote DRMS's want. The
alternative is to maintain a slave for each remote DRMS, thus incurring
the overhead of potentially duplicated replication effort. With only one
slave, we'll need to apply our own filter to parcel out slony logs to
remote DRMS's according to their subscriptions. This is based on the
assumption that the slony logs can be filtered, and it would avoid
duplicating replication effort.
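The per-subscription filtering proposed above could be prototyped with
ordinary text tools. This is only a sketch, not how we will necessarily
do it: the subscription layout (one file per remote site listing the
table names it wants) is a made-up convention, and it assumes each SQL
statement in a slony archive log occupies a single line.

```shell
#!/bin/sh
# Hypothetical sketch: split one slony archive log into per-site logs
# according to each remote DRMS's subscription list.
#   filter_log LOGFILE SUBDIR OUTDIR
# SUBDIR holds one file per remote site, listing subscribed table names
# (one per line) -- an assumed layout, not an existing convention.
filter_log() {
    logfile="$1"; subdir="$2"; outdir="$3"
    for sub in "$subdir"/*; do
        site=`basename "$sub"`
        mkdir -p "$outdir/$site"
        # Build an alternation pattern of this site's table names.
        pat=`paste -sd'|' "$sub"`
        # Keep statements touching subscribed tables, plus transaction
        # bookkeeping (BEGIN/COMMIT); drop everything else. Assumes one
        # statement per line.
        outfile=`basename "$logfile"`
        egrep "^(BEGIN|COMMIT)|($pat)" "$logfile" > "$outdir/$site/$outfile"
    done
}
```

A site subscribed only to series_a would then receive a copy of each log
with the series_b traffic stripped out, e.g.
`filter_log slony1_log_2_00000000000000000123.sql subscriptions outgoing`.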
The advantage of replicating a "union" vs the whole database is to
minimize unnecessary replication. In other words, our slony slave
replicates only what's needed, and only once even if it's needed by
multiple remote DRMS's. A create_series event in our master database
leads to a slony configuration change only if some remote DRMS wants the
new series. The same can be said about delete_series.

----------------------------------------------------------------------
From the Slony-I documentation:

The log shipping functionality amounts to "sniffing" the data applied
at a particular subscriber node. As a result, you must have at least one
"regular" node; you cannot have a cluster that consists solely of an
origin and a set of "log shipping nodes."

The "log shipping node" tracks the entirety of the traffic going to a
subscriber. You cannot separate things out if there are multiple
replication sets.

The "log shipping node" presently only fully tracks SYNC events. This
should be sufficient to cope with some changes in cluster configuration,
but not others.
----------------------------------------------------------------------

The following is also from the Slony-I documentation. It is different
from what Jennifer presented on Tuesday, i.e., the application of logs
can't be arbitrary.

As of 1.2.11, there is an even better idea for application of logs, as
the sequencing of their names becomes more predictable.

* The table sl_archive_tracking, on the log-shipped node, tracks which
  log was most recently applied. Thus, you may predict the ID number of
  the next file by taking the latest counter from this table and adding
  1.

* There is still variation as to the filename, depending on what the
  overall set of nodes in the cluster is. All nodes periodically
  generate SYNC events, even if they are not an origin node, and the
  log shipping system does generate logs for such events.
As a result, when searching for the next file, it is necessary to search
for files in a manner similar to the following:

ARCHIVEDIR=/var/spool/slony/archivelogs/node4
SLONYCLUSTER=mycluster
PGDATABASE=logshipdb
PGHOST=logshiphost
NEXTQUERY="select at_counter+1 from \"_${SLONYCLUSTER}\".sl_archive_tracking;"
nextseq=`psql -d ${PGDATABASE} -h ${PGHOST} -A -t -c "${NEXTQUERY}"`
filespec=`printf "slony1_log_*_%020d.sql" ${nextseq}`
for file in `find $ARCHIVEDIR -name "${filespec}"`; do
    psql -d ${PGDATABASE} -h ${PGHOST} -f ${file}
done
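Slony-I names archive logs with the sequence number zero-padded to 20
digits, so the next file name can be predicted directly from the
sl_archive_tracking counter. A quick illustration of the naming (the
node number 2 and counter value 123 are made-up values for the example):

```shell
# Predict the next archive log name from the at_counter value.
# Node number and counter are illustration values only; slony-I
# zero-pads the sequence number to 20 digits.
at_counter=123
nextseq=`expr $at_counter + 1`
next=`printf "slony1_log_2_%020d.sql" $nextseq`
echo $next   # -> slony1_log_2_00000000000000000124.sql
```

In practice the node number varies (any node can emit SYNC events),
which is why the search above uses a `*` wildcard in that position.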