
11. Splitting a Handle Server

This chapter describes how to split a single handle server into multiple handle servers. It does not cover splitting sites that consist of multiple servers, which is a much more complicated process.

The goal of this procedure is to minimize downtime for the primary site. For this reason, the database split is first performed on a checkpoint/backup of the source database and then brought up to date using the transaction log of the source database.

Here are the steps for splitting a single-server site:

  1. Create the directories and configuration files for the new servers. This is done by running the net.handle.server.SimpleSetup program to create a new directory, config file, and `siteinfo.bin' for each new server.

  2. Collect the `siteinfo.bin' file from each of the new servers. Combine them into one HS_SITE record using the Handle Admin application. Specifically:
    1. open the create-handle window
    2. click the "Add Custom Data" button
    3. select the HS_SITE value type, and click the "Value Data" button
    4. perform sub-steps 1 through 2 again to create a new window
    5. for each of the new `siteinfo.bin' files, enter its information (IP address, public key, and ports) into the first site-info window
    6. make sure to set the protocol version for the new site info to 2 and 1 (major and minor version, that is, 2.1)
    7. make sure to set the serial number for the new site to something greater than the serial number of the existing server
    8. make sure to set the ServerID for each server to a unique number. This number will need to be specified in the `config.dct' file for each corresponding server so that the server knows which server it is relative to the other servers in the `siteinfo.bin' file.

    Save the new site info value to a new `siteinfo.bin' file. Copy this new file into the directory of each of the new servers. Also make sure to add the this_server_id setting to the server_config section of the `config.dct' file of each new handle server.

  3. Run a checkpoint/backup on the server that is to be split. Then wait until the checkpoint/backup operation is complete. The old primary server should still be running at this point.

  4. Decrease the timeout/TTL value for the HS_SITE value that is being modified in the naming authority or service handle. A good setting is probably 1200 seconds (20 minutes). This ensures that, once the HS_SITE value is changed, clients do not keep trying to access the old server for an entire day.

  5. Run the net.handle.apps.tools.SplitServer program to split the `handles.bak' file into the new server directories. Make sure to specify the new server directories in the correct order on the command line. For example, this phase of the DOI split was run like so:
     
        java -cp handle.jar net.handle.apps.tools.SplitServer /usr/local/doi_hs \
              /usr/local/doi_hs1 /usr/local/doi_hs2 \
              /usr/local/doi_hs3 /usr/local/doi_hs4
    

    This phase can take a long time when splitting large databases. For the DOI database, this step took several days.

    The old server should still be running at this point.

  6. Run the net.handle.apps.tools.SplitRecoveryLog program to process the `dbtxns.log' file. This will scan all of the handle modifications, creations, and deletions that have taken place since the backup/checkpoint operation. When this is finished, the date of the last transaction processed will be printed. Record this for future use.

  7. Only perform this step if you are splitting a primary server.
    Shut down the old handle server. Copy the `txn_id' file from the old server to each of the new server directories. Only one of the new servers will use this file, but it is hard to tell which one, so copy it to every server directory to be safe.

  8. Run the net.handle.apps.tools.SplitRecoveryLog program again, this time providing the date of the last transaction processed in step 6 on the command line. To see the syntax of the command, run the program with no arguments.

    This step will quickly bring the new databases into sync with the single primary by processing the transactions that have occurred on the primary since step 6.

  9. Start the new servers and make sure they come up without any problems.

  10. Update the HS_SITE value of the naming authority or service handle with the new `siteinfo.bin' file.

  11. Wait at least 20 minutes so that the old HS_SITE value times out in the caches of all possible handle clients/administrators.

  12. Only perform this step if you are splitting a primary server.
    Copy the new `siteinfo.bin' file into the directory of the old handle server. Start the old handle server back up. Hopefully no administration requests will arrive (in theory they shouldn't).

    Performing this step should cause any secondary servers to notice the site-info version number change and retrieve the new site-info record from the old primary.

  13. After verifying for a while that everything is working, change the timeout/TTL of the modified HS_SITE value back to 86400 (one day).
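
Steps 2 and 5 both depend on values that must not repeat: each server needs its own ServerID, and each new server directory should appear only once on the SplitServer command line. As a minimal sketch, a shell function can flag accidental duplicates before a long-running split is started. The helper name `check_unique' is hypothetical and is not part of the handle distribution.

```shell
# check_unique VALUE...: succeed only if no value is repeated.
# Hypothetical helper for illustration; not part of the handle software.
check_unique() {
    # uniq -d prints one copy of each line that occurs more than once.
    dups=$(printf '%s\n' "$@" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "duplicate value(s): $dups" >&2
        return 1
    fi
}
```

For example, `check_unique 1 2 3 4' succeeds, while `check_unique 1 2 2 4' reports the repeated value and returns a non-zero status. The same call works for a list of directory paths.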

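Steps 2 and 7 each copy one file (first the combined `siteinfo.bin', later `txn_id') into every new server directory. The following is a sketch of that copy loop, assuming all new server directories are reachable as local paths. The function name `copy_to_servers' is made up for this example.

```shell
# copy_to_servers FILE DIR...: copy FILE into every DIR.
# Hypothetical helper for illustration; not part of the handle software.
copy_to_servers() {
    src=$1
    shift
    for dir in "$@"; do
        # Refuse to continue if a target directory is missing, so a
        # partially updated set of servers does not go unnoticed.
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            return 1
        fi
        cp "$src" "$dir/" || return 1
    done
}
```

With the directory layout from the DOI example in step 5, the copy in step 7 could be written as `copy_to_servers /usr/local/doi_hs/txn_id /usr/local/doi_hs1 /usr/local/doi_hs2', followed by the remaining new server directories.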

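The 20-minute wait in step 11 follows from the TTL chosen in step 4: a client that cached the HS_SITE value just before the update may keep using it for one full TTL. The sketch below only does that arithmetic, and `ttl_minutes' is a hypothetical helper name. By the same reasoning, and assuming clients honor the TTL they received with the value, a client that cached the value before step 4 may hold it for up to the old TTL, so lowering the TTL well ahead of the switch is the cautious choice.

```shell
# ttl_minutes SECONDS: print a TTL in whole minutes, rounded up.
# Hypothetical helper for illustration; not part of the handle software.
ttl_minutes() {
    echo $(( ($1 + 59) / 60 ))
}

LOWERED_TTL=1200    # seconds, the value suggested in step 4
NORMAL_TTL=86400    # seconds, the value restored in step 13

echo "lowered TTL: wait at least $(ttl_minutes "$LOWERED_TTL") minutes"
echo "normal TTL: caches could stay stale for $(ttl_minutes "$NORMAL_TTL") minutes"
```
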