03-13-08
--------
Summary
- drms_query_string()
- create_series hangs upon permission problem
- Gcc compilation with cfitsio library
- Slony-I log shipping woes

drms_query_string()
-------------------
Moved the query-string generation part out of drms_retrieve_record(). It
becomes a function by itself to facilitate more query types. We define
three types of queries:

DRMS_QUERY_COUNT: return the record count for a given dsname. No
    "ORDER BY" or "LIMIT" clause in the query string.
DRMS_QUERY_FL: return the selected keyword.
DRMS_QUERY_ALL: return all keywords. This is the one used by
    drms_open_records().

create_series hangs upon permission problem
-------------------------------------------
create_series hangs if it does not have permission to write to any of
the master tables drms_*. I am still debugging it.

Gcc compilation with cfitsio library
------------------------------------
Now that cfitsio has been added to DRMS, gcc had trouble compiling
against the icc-compiled cfitsio library. Keh-Cheng took the trouble to
recompile the cfitsio library using gcc. Keh-Cheng recommends
maintaining two separate cfitsio libraries, one for gcc and one for icc,
in order to get better performance.

Slony-I log shipping woes
-------------------------
I have done some experiments and found that the sequence (the recnum
counter) may not be needed for queries. This is good news for us because
a sequence is not a good candidate for replication.

Since we must have a regular slave node, I propose to have one slony
slave that replicates a union of the series that remote DRMS's want. The
alternative is to maintain a slave for each remote DRMS, thus incurring
the overhead of potentially duplicated replication effort. With only one
slave, we'll need to apply our own filter to parcel out slony logs to
remote DRMS's according to their subscriptions. This is based on the
assumption that the slony logs can be filtered, and it would avoid
duplicating replication effort.
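The per-subscription filtering proposed above could be prototyped with
ordinary text tools. This is only a sketch, not how we will necessarily
do it: the subscription layout (one file per remote site listing the
table names it wants) is a made-up convention, and it assumes each SQL
statement in a slony archive log occupies a single line.

```shell
#!/bin/sh
# Hypothetical sketch: split one slony archive log into per-site logs
# according to each remote DRMS's subscription list.
#   filter_log LOGFILE SUBDIR OUTDIR
# SUBDIR holds one file per remote site, listing subscribed table names
# (one per line) -- an assumed layout, not an existing convention.
filter_log() {
    logfile="$1"; subdir="$2"; outdir="$3"
    for sub in "$subdir"/*; do
        site=`basename "$sub"`
        mkdir -p "$outdir/$site"
        # Build an alternation pattern of this site's table names.
        pat=`paste -sd'|' "$sub"`
        # Keep statements touching subscribed tables, plus transaction
        # bookkeeping (BEGIN/COMMIT); drop everything else. Assumes one
        # statement per line.
        outfile=`basename "$logfile"`
        egrep "^(BEGIN|COMMIT)|($pat)" "$logfile" > "$outdir/$site/$outfile"
    done
}
```

A site subscribed only to series_a would then receive a copy of each log
with the series_b traffic stripped out, e.g.
`filter_log slony1_log_2_00000000000000000123.sql subscriptions outgoing`.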
The advantage of replicating a "union" vs the whole database is to
minimize unnecessary replication. In other words, our slony slave
replicates only what's needed, and only once even if it's needed by
multiple remote DRMS's. A create_series event in our master database
leads to a slony configuration change only if some remote DRMS wants the
new series. The same can be said about delete_series.

----------------------------------------------------------------------
From the Slony-I documentation:

The log shipping functionality amounts to "sniffing" the data applied
at a particular subscriber node. As a result, you must have at least one
"regular" node; you cannot have a cluster that consists solely of an
origin and a set of "log shipping nodes."

The "log shipping node" tracks the entirety of the traffic going to a
subscriber. You cannot separate things out if there are multiple
replication sets.

The "log shipping node" presently only fully tracks SYNC events. This
should be sufficient to cope with some changes in cluster configuration,
but not others.
----------------------------------------------------------------------

The following is also from the Slony-I documentation. It is different
from what Jennifer presented on Tuesday, i.e., the application of logs
can't be arbitrary.

As of 1.2.11, there is an even better idea for application of logs, as
the sequencing of their names becomes more predictable.

* The table sl_archive_tracking, on the log-shipped node, tracks which
  log was most recently applied. Thus, you may predict the ID number of
  the next file by taking the latest counter from this table and adding
  1.

* There is still variation as to the filename, depending on what the
  overall set of nodes in the cluster is. All nodes periodically
  generate SYNC events, even if they are not an origin node, and the
  log shipping system does generate logs for such events.
As a result, when searching for the next file, it is necessary to search
for files in a manner similar to the following:

ARCHIVEDIR=/var/spool/slony/archivelogs/node4
SLONYCLUSTER=mycluster
PGDATABASE=logshipdb
PGHOST=logshiphost
NEXTQUERY="select at_counter+1 from \"_${SLONYCLUSTER}\".sl_archive_tracking;"
nextseq=`psql -d ${PGDATABASE} -h ${PGHOST} -A -t -c "${NEXTQUERY}"`
filespec=`printf "slony1_log_*_%020d.sql" ${nextseq}`
for file in `find $ARCHIVEDIR -name "${filespec}"`; do
    psql -d ${PGDATABASE} -h ${PGHOST} -f ${file}
done
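Slony-I names archive logs with the sequence number zero-padded to 20
digits, so the next file name can be predicted directly from the
sl_archive_tracking counter. A quick illustration of the naming (the
node number 2 and counter value 123 are made-up values for the example):

```shell
# Predict the next archive log name from the at_counter value.
# Node number and counter are illustration values only; slony-I
# zero-pads the sequence number to 20 digits.
at_counter=123
nextseq=`expr $at_counter + 1`
next=`printf "slony1_log_2_%020d.sql" $nextseq`
echo $next   # -> slony1_log_2_00000000000000000124.sql
```

In practice the node number varies (any node can emit SYNC events),
which is why the search above uses a `*` wildcard in that position.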