System Fundamentals

Handle System Architecture
Handle Syntax
Handles as Persistent Identifiers
Comparing the Handle System and DNS
Handle Resolution
Administering Handles
Authentication
Scalability
Handle Server Replication
Prefix Handle Values
Handle Value Types

 

Introduction

The following summaries cover a range of Handle System technology topics, including system architecture, security, identifiers and identifier services, handle and handle server administration, and other aspects of the HANDLE.NET software and its functionality.

 

Handle System Architecture

The Handle System has a two-level hierarchical service model. The top level consists of a single global service, known as the Global Handle Registry®. The lower level consists of all other handle services, which are generically known as local handle services (LHS). The global service can be used to manage any namespace. It is unique among handle services only in that it provides the service used to manage the namespace of handle prefixes, all of which are managed as handles. The state information of these prefixes is the service information that clients can use to access and utilize associated local services.

The local handle service layer consists of all local handle services managing all identifiers under their prefixes, providing resolution and administration service for these local names. Local services are intended to be hosted by organizations with administrative responsibility for the identifiers within the service or acting on behalf of the responsible organizations. The way to define local namespaces, and the way to optimize overall Handle System performance, is by prefix. All identifiers under a given prefix must be maintained in one service. Handle services may be responsible for more than one prefix.

A second important component of Handle System architecture is distribution. The Handle System as a whole consists of a number of individual handle services, each of which consists of one or more handle service sites, where each site replicates the complete individual handle service, at least for the purposes of identifier resolution. Each handle service site in turn consists of one or more handle servers. There are no design limits on the total number of handle services which constitute the Handle System; there are no design limits on the number of sites which make up each service; and there are no design limits on the number of servers which make up each site. Replication by site, within a service, does not require that each site contain the same number of servers; that is, while each site will have the same replicated set of identifiers, each site may allocate that set of identifiers across a different number of handle servers. This distributed approach is intended to aid scalability and to mitigate problems of single point failure.

To improve resolution performance, any client may select to cache the service information returned from the global service, and/or the resolution result from any local service. A separate handle caching server, either stand-alone or as a piece of a general caching mechanism, may also be used to provide shared caching within a local community. Given a cached resolution result, subsequent queries of the same identifier may be answered locally without contacting any handle service. Given cached service information, clients can send their requests directly to the responsible local service without contacting the global service.
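The two caching levels described above can be sketched as follows. This is a minimal illustration, not HANDLE.NET client code: the `TTLCache` class, the `resolve` function, and the `query_global` and `query_local` callables are hypothetical names, and the time-to-live policy is an assumption.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny time-to-live cache keyed by string (hypothetical helper)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]      # expired: forget it
            return None
        return value

    def put(self, key: str, value: object) -> None:
        self._entries[key] = (self.clock() + self.ttl, value)


def resolve(handle, results, services, query_global, query_local):
    """Resolve a handle, consulting both caches before any handle service."""
    cached = results.get(handle)
    if cached is not None:
        return cached                       # answered locally, no service contacted
    prefix = handle.split("/", 1)[0]
    service = services.get(prefix)
    if service is None:
        service = query_global(prefix)      # one trip to the global service
        services.put(prefix, service)
    values = query_local(service, handle)   # one trip to the local service
    results.put(handle, values)
    return values
```

With both caches warm, a repeated query for the same identifier contacts no service at all; with only the service information cached, a query for a new identifier under a known prefix skips the global service.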

 

Handle Syntax

Within the handle namespace, every identifier consists of two parts: its handle prefix, and a suffix or unique "local name" under the prefix. The prefix and suffix are separated by the ASCII character "/". An identifier may thus be defined as

<handle> ::= <handle prefix> "/"<handle suffix>

For example, "10.1045/january2010-reilly" is an identifier (also known as a Digital Object Identifier (DOI) name, an implementation of the Handle System) for an article published in D-Lib Magazine. It is defined under the prefix "10.1045", and its suffix is "january2010-reilly".
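The grammar above can be expressed as a short parsing sketch. The `Handle` type and `parse_handle` function are hypothetical names; splitting at the first "/" assumes that a prefix never contains a slash while a suffix may.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    prefix: str
    suffix: str


def parse_handle(text: str) -> Handle:
    """Split a handle at the first "/" into its prefix and suffix."""
    prefix, sep, suffix = text.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a handle: {text!r}")
    return Handle(prefix, suffix)


article = parse_handle("10.1045/january2010-reilly")
print(article.prefix)   # 10.1045
print(article.suffix)   # january2010-reilly
```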

Identifiers may consist of any printable characters from the Universal Character Set, two-octet form (UCS-2) of ISO/IEC 10646, which is the exact character set defined by Unicode v2.0. The UCS-2 character set encompasses most characters used in every major language written today. To allow compatibility with most existing systems and prevent ambiguity among different encodings, the handle protocol mandates UTF-8 as the only encoding used for handles. The UTF-8 encoding preserves any ASCII-encoded names, which allows maximum compatibility with existing systems without causing naming conflicts.

By default, handles are case sensitive. However, any handle service, including the global service, may define its namespace such that all ASCII characters within any handle are case insensitive.
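As a rough sketch of the encoding and case rules above, the following compares two handles by their UTF-8 octets, optionally folding only ASCII letters. The function names are hypothetical, and folding to upper case (rather than lower case) is an arbitrary choice made for this illustration.

```python
def to_wire(handle: str) -> bytes:
    """Encode a handle as UTF-8, the only encoding the protocol allows."""
    return handle.encode("utf-8")


def fold_ascii(handle: str) -> str:
    """Upper-case ASCII letters only; non-ASCII characters are left untouched."""
    return "".join(c.upper() if "a" <= c <= "z" else c for c in handle)


def same_handle(a: str, b: str, case_insensitive: bool = False) -> bool:
    """Compare two handles, optionally ignoring case of ASCII characters."""
    if case_insensitive:
        a, b = fold_ascii(a), fold_ascii(b)
    return to_wire(a) == to_wire(b)
```

Under the default rule, `same_handle("10.1045/ABC", "10.1045/abc")` is false; a service that declares its namespace case insensitive would treat the two as the same identifier.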

The handle namespace can be considered as a superset of many local namespaces, with each local namespace having its own unique prefix. The prefix identifies the administrative unit of creation, although not necessarily continuing administration, of the associated handles. Each prefix is guaranteed to be globally unique within the Handle System. Any existing local namespace can join the global handle namespace by obtaining a unique prefix, with the resulting identifiers being a combination of prefix and local name as shown above.

Each prefix may have "derived" prefixes. For example, once the prefix 12345 has been created, 12345.1 can be created. Derived prefix 12345.1 is therefore defined under prefix 12345. The syntax can be represented as "string.derivedstring". In terms of Handle System technology, a derived prefix is a prefix in its own right and can be used any way that any other prefix can be used, but typically it is used as one of a set of connected prefixes.
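The "string.derivedstring" relationship is purely syntactic and can be checked with simple string operations. The helper names below are hypothetical and are shown only to illustrate the naming pattern.

```python
from typing import List


def is_derived_from(candidate: str, parent: str) -> bool:
    """True if candidate has the form "<parent>.<something>"."""
    return candidate.startswith(parent + ".") and len(candidate) > len(parent) + 1


def parent_prefixes(prefix: str) -> List[str]:
    """List the prefixes a given prefix is derived from, nearest first."""
    parts = prefix.split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]
```

For instance, `is_derived_from("12345.1", "12345")` holds, while `is_derived_from("123456", "12345")` does not, because the two strings merely share leading characters.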

Derived prefixes are sometimes used by organizations that assign identifiers to different categories of content or objects that they wish to keep separate. They are also used for test purposes. There is no Registration Fee for derived prefixes; only an Annual Service Fee. Note that the use of derived prefixes is controlled by the Handle System Service Agreement.

The prefix and the suffix, or local name, are separated by the octet used for ASCII character "/" (0x2F). The collection of local names under a prefix is the local namespace for that prefix. Any local name must be unique under its local namespace. The uniqueness of a prefix and a local name under that prefix ensures that any identifier is globally unique within the context of the Handle System.

 

Handles as Persistent Identifiers

Handles are persistent identifiers for Internet resources. A handle does not have to be derived in any way from the entity that it names — the connection is maintained within the Handle System. This allows the name to persist over changes of location, ownership, and other 'current state' conditions. When a named resource moves from one location to another, e.g., from an old server to a new server, the handle is kept current by updating its value in the Handle System to reflect the new location.

The Handle System is designed to meet the following requirements for persistence.

Handles are:

  • not based on any changeable attributes of the entity they identify (location, ownership, or any other attribute that may change over time);
  • opaque, preferably 'dumb numbers' from which no potentially confusing meaning can be drawn, and from which no assumptions about ownership or use can be made;
  • unique within the Handle System, avoiding collisions and referential uncertainty;
  • easy to make user-friendly and human-readable, easy to cut and paste, and embeddable if needed;
  • easy to fit into common systems, e.g., the URI specification.

Handle resolution is:

  • reliable, using redundancy, with no single points of failure and resolution time fast enough never to appear broken;
  • scalable, so that higher loads are easily managed with more computers;
  • flexible and easily adapted to changing computing environments and new applications;
  • trusted, with both resolution and administration built on proven trust methods;
  • built on an open architecture that encourages the community of users to build applications on top of the infrastructure;
  • transparent to users who don't need to know the infrastructure details.
 

Comparing the Handle System and DNS

The Domain Name System (DNS), originally designed and used for mapping domain names into IP Addresses for network routing purposes, is one of a number of existing Internet identifier services or specifications that provide some of the functionalities of the Handle System. It is also the one to which the Handle System is most frequently compared. However, there are similarities and differences in both the design and intended use of the two systems. (Note that HANDLE.NET Software Version 7.1 includes a DNS interface to translate DNS resolution requests to handle resolution requests. This includes support for translating DNS names to handles, including decoding Internationalized Domain Names.)

Naming

The DNS naming hierarchy reflects a control hierarchy. That is, whoever runs .com controls who runs mybusiness.com and whoever controls mybusiness.com controls who runs branch.mybusiness.com, etc. This is not necessarily true of the Handle System. Any prefix can be, and at the moment all are, at the same level. So administration of 20.1.2.3 can be completely separate from 20.1.2 which can be completely separate from 20.1 and so on. They can all live in root and all be controlled by different sets of administrators and all point to different handle services.

Two related points:

  • while no one has implemented it, the specification allows for delegation DNS-style, that is, 20.1.2.3 can exist but not live on the global root, and to find out anything about it you have to talk to the service responsible for 20.1.2;
  • the naming hierarchy can be used in permissions to create new prefixes. That is, prefix 20.1 can be set up to give the administrators of 20.1 permission on the Global Handle Service to create new prefixes that begin with 20.1, e.g., 20.1.2, but not the ability to create new prefixes that start with 20.

Distributed Administration

Each identifier and prefix can have its own set of administrators independent from the system administrator. Handle administrators can add/delete identifier and identifier values via the handle system protocol securely over the public Internet. DNS systems may have ad hoc mechanisms for updating records, but there is a difference in perspective on data ownership. In DNS, the system administrator is generally considered the owner of the data, while in the Handle System the prefix administrator is considered the owner. In cases where there are many users creating data, with only a few servers, having prefix-level data ownership is desirable. Having a consistent administration protocol also makes it easier to develop programs for creating and modifying data, independent of any particular server implementation.

Proxies

Making DNS resolution work behind SOCKS proxies may be difficult, depending on the DNS library used. The handle library supports SOCKS proxies. Making DNS resolution work from behind HTTP proxies is probably impossible. The handle library supports HTTP proxies.

Unicode

The Handle System is 8-bit clean, so full Unicode is supported. There are hacks to make DNS support 8-bit character sets, but they are not widely implemented.

Replication

Mirroring in the Handle System has fine granularity. If a single record is updated, the server will copy only that record to the mirror servers. In DNS, if a single record is updated, the entire zone is invalidated, and all records must be copied to mirror servers.

Certification

DNS has to be fast, especially at the root. This makes it tend toward policies that aren't very good for alternative uses. For example, certificates aren't as robust as in the Handle System, because a design constraint of DNS-SEC was that all signatures had to be pre-generated. DNS-SEC also depends on X.509, which may or may not be desirable. Finally, DNS-SEC may not be present in all DNS implementations. The Handle System has more flexible and robust certification support.

Access Control

The Handle System has support for access control and authentication. DNS does not.

Record Size

The DNS protocol defaults to UDP, but if a record is greater than 512 bytes, the server returns an error requiring the client to resend the request over TCP, making for two round trips. If you are storing a lot of metadata, that's two round trips for every message. If you are storing extremely large amounts of data, DNS has a 64K limit, while the Handle System has a limit closer to 4G. The handle protocol supports UDP chunking, so larger responses are possible over UDP. The handle library also makes it possible to exclusively use TCP, eliminating the issue altogether. Some DNS libraries may also allow forced TCP, but at the cost of losing the speed of UDP. A lot of DNS servers don't support TCP at all, and if your organization's DNS servers don't, you will end up losing the DNS hierarchy and put a greater burden on the primary servers and the global DNS roots. Some more draconian ISPs don't allow users to bypass their DNS. If these ISPs don't support TCP-DNS, there is no way to resolve DNS records larger than 512 bytes.

 

Handle Resolution

The Handle System allows identifiers (handles) to be resolved in a distributed fashion, using dedicated clients, common clients such as web browsers using special extensions or plug-ins, or unextended clients going through various proxies. In all cases, communication with the Handle System is carried out using Handle System protocols, and in all cases, those protocols have both a formal specification and some specific implementations.

Figure 1 below shows a client sending a request to the Handle System for the data associated with identifier 123/456.

 
Figure showing handle resolution process

Figure 1: Handle Resolution

 

As illustrated above:

  • A client such as a web browser encounters a handle, e.g., 123/456, on the Internet or an individual intranet, typically as a hyperlink or other kind of reference. The client sends the handle to the Handle System for resolution. This can be done directly by a client which understands the handle resolution protocol natively, or through a proxy server by a client which doesn't.
  • The Handle System consists of a collection of local handle services. Each service consists of at least one primary site and any number of secondary sites, with each site containing any number of handle servers. (See Handle System Scalability.) For resolution, each site replicates all of the identifiers in that handle service.
  • The Global Handle Registry is responsible for knowing the locations and namespace responsibilities of all of the local handle services. Each of these local services knows how to access the Global Handle Registry. This allows a resolution query to enter the Handle System at any point, and be routed to the server that knows the answer in any site within the responsible service.
  • Each identifier can be associated with one or more pieces of typed data. In this example, the handle 123/456 is associated with, and so resolves to two URLs (it would also be possible to associate multiple instances of the same data type), and also XML and binary data. The client can request that the handle server return to the client all of the data associated with that identifier, or all of the data of a specific type, or with a specific index value. The Handle System is a pure resolution system and carries no assumptions on what the client will or will not do with the resolution information, thus maximizing the flexibility of applications which use the Handle System as an infrastructure for naming.

Handles are often used to identify objects retrieved via web browsers. CNRI maintains a proxy server that understands both the handle protocol and HTTP, to which any web browser may be directed for handle resolution.
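For web clients, resolution through the proxy amounts to building a URL from the proxy address and the handle. The sketch below only constructs the URL and makes no network request; the `proxy_url` name is hypothetical, and percent-encoding the handle's UTF-8 octets is an assumption about how a handle with unusual characters would be written in a URL.

```python
from urllib.parse import quote

PROXY = "http://hdl.handle.net/"   # the CNRI proxy server


def proxy_url(handle: str, proxy: str = PROXY) -> str:
    """Build a proxy URL for a handle, keeping the "/" separator literal."""
    return proxy + quote(handle, safe="/")


print(proxy_url("10.1045/january2010-reilly"))
# http://hdl.handle.net/10.1045/january2010-reilly
```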

 

Administering Handles

Conducting handle administration (i.e., creating, modifying, and deleting individual handles) requires that you authenticate yourself to the Handle System by proving that you are who you claim to be. To authenticate yourself, you need to have an ID that uniquely identifies you, and since the Handle System is global in nature, your ID must also be globally unique. Since globally unique identifiers are the Handle System's specialty, it is natural that administrators should be identified by handles.

An administrator handle contains either a public key or a secret key (password) that authenticates the individual identified by that handle. If an administrator handle is specified with permission to perform some operation in the Handle System, then that administrator can perform that operation as long as he can authenticate himself against the public or secret key in the administrator handle.

When you request your own prefix, a prefix will be created that will also serve as the administrator handle for that prefix, so prefixes (such as 0.NA/123456) serve double-duty as administrator handles and as prefixes. In this discussion we will be focusing on the administrator functions of the naming authority handle.

An administrator handle can be queried and the values viewed using a handle client, or by using the form on the "Resolve a Handle and View the Values" page at http://hdl.handle.net, the URL for the proxy server run by CNRI. (Access the form at http://hdl.handle.net/. Note that if you append a handle to the proxy server address http://hdl.handle.net/, the proxy server will resolve the handle to its associated URL.) Your public or secret key will be associated with the administrator handle. When you query the handle, you will notice that there are several values associated with it. In addition, each handle value has a unique (within the handle) numeric index, as well as a type identifier. Some of the handle values have special meaning within the Handle System:

  • Admin Value. Every handle must have an admin value associated with it. An admin value is of type HS_ADMIN and specifies the permissions and handle of the administrator who is allowed to make changes to or delete that handle. For consistency, admin values are given an index of 100 (see note 1); if there are multiple admin values, the additional ones are given indexes 101, 102, 103, and so on. Admin values specify who can perform administration by the handle and value index of either the administrator's authentication value or an admin group value.
  • Public Key Value. If you have a public key associated with the handle, it will be in a handle value with type HS_PUBKEY. For consistency, public key values are given an index of 300 (or 301, 302, 303, etc., if you have multiple keys). It is important to remember the index because you will need to specify it along with your handle when you authenticate yourself.
  • Secret Key Value. If you have a secret key associated with your handle, it will be in a handle value with type HS_SECKEY. For consistency, secret key values are given an index of 300 (or 301, 302, 303, etc., if you have multiple keys). It is important to remember this index because you will need to specify it along with your handle when asked to authenticate yourself. This handle value should not be publicly readable, which is why secret key values do not appear in non-authenticated queries for your administrative handle.
  • Group Value. A group value contains a list of handle values that identify public keys, secret keys, or other groups and is type HS_VLIST. If an admin value specifies a group value as an administrator, then every value in the group is considered an administrator for the handle.
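The way a group value widens the set of administrators can be sketched as a recursive expansion of references. This is an illustration only: references are modeled as hypothetical `(handle, index)` pairs, the `groups` mapping stands in for looked-up HS_VLIST values, and the guard against groups that refer to each other is an addition made for the sketch.

```python
from typing import Dict, List, Optional, Set, Tuple

Ref = Tuple[str, int]   # (handle, value index)


def expand(ref: Ref, groups: Dict[Ref, List[Ref]],
           seen: Optional[Set[Ref]] = None) -> Set[Ref]:
    """Return every key reference reachable from ref, following group values."""
    seen = set() if seen is None else seen
    if ref in seen:
        return set()            # already visited; avoids looping on cycles
    seen.add(ref)
    if ref not in groups:
        return {ref}            # a plain public or secret key value
    members: Set[Ref] = set()
    for member in groups[ref]:
        members |= expand(member, groups, seen)
    return members


def may_administer(identity: Ref, admin_refs: List[Ref],
                   groups: Dict[Ref, List[Ref]]) -> bool:
    """True if identity is named by an admin value directly or via a group."""
    return any(identity in expand(ref, groups) for ref in admin_refs)
```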

Handle administration requires an administrator to authenticate himself by providing the following information:

  • Your admin handle (your prefix) and the index of your public or secret key value within that handle.
  • Your private or secret key. Note: your private and secret key will *never* be sent over the Internet by the Handle System. You shouldn't send either private or secret keys over the Internet. You only need to provide this to the Handle System client software so that it can prove to any handle server that you have this information.

In order to create an identifier under a given prefix, the owner of the prefix (the part of the handle before the slash) must give you permission to create identifiers under that prefix. He can give you permission to create identifiers by adding your admin handle and the index for your key value to a list of administrators who have permission to create identifiers under that prefix. When you send the 'create-handle' request to the Handle System, you must provide your authentication information. If the server can verify that you are the individual identified by the admin handle (your private key matches your public key, or you enter the correct secret key) then the requested identifier will be created.

Note 1: The Handle System does not require these particular index values. The index values just need to be unique within the handle.

 

Authentication

The security of the Handle System depends on both client and server host security, and depends heavily on the integrity of the Global Handle Registry service information. Extreme care is taken to protect the service information and the public key pair used to sign the global service information. Client applications should only accept the global service information from the Global Handle Registry. They should check its integrity upon each update.

For efficiency, handle servers will not generate or return a digital signature for every service response, unless specifically requested by clients. To assure data integrity, clients must explicitly ask the server to return the digital signature. To protect sensitive data from exposure, clients may establish a communication session with the server and ask the server to encrypt any data using the session key.

Types of Authentication

The handle protocol allows handle servers to authenticate their clients and to provide data integrity service upon client request. Public key and/or secret key cryptography may be used. Server authentication may be used to prevent eavesdroppers from forging client requests or tampering with server responses.

The Handle System provides the authentication and data integrity services, depending on client request. By default, the handle resolution service does not require any client authentication. However, resolution requests for confidential data assigned to any handle (by its administrator), as well as all administration requests (e.g., adding or deleting handle values) require authentication of the client as having the requisite authority. When authentication is required, the responsible handle server will issue a challenge to the requesting client before carrying out the client's request. To satisfy the authentication requirement, the client must send back the correct response that identifies itself as the administrator, or that it otherwise is in possession of the appropriate credentials. The handle server will respond to the initial request only after successful authentication of the client. Handle clients may choose to use either secret key or public key cryptography for authentication.
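The challenge-response pattern for secret key authentication can be sketched as follows. This is not the handle protocol's wire format or its actual digest construction: HMAC-SHA256, the 16-byte nonce, and all function names are assumptions chosen to show the general idea that the secret itself never crosses the network.

```python
import hashlib
import hmac
import os


def make_challenge() -> bytes:
    """Server side: produce a fresh random nonce for this request."""
    return os.urandom(16)


def answer_challenge(secret_key: bytes, challenge: bytes) -> bytes:
    """Client side: prove knowledge of the secret key without sending it."""
    return hmac.new(secret_key, challenge, hashlib.sha256).digest()


def verify_answer(secret_key: bytes, challenge: bytes, answer: bytes) -> bool:
    """Server side: recompute the expected answer and compare in constant time."""
    expected = hmac.new(secret_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, answer)


nonce = make_challenge()
reply = answer_challenge(b"my secret key", nonce)
print(verify_answer(b"my secret key", nonce, reply))   # True
```

Because each challenge is fresh, a recorded reply is of no use to an eavesdropper for a later request.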

Figure 2 below illustrates authentication by a handle client using public/private key.

 
Figure showing authentication using public private key

Figure 2: Authentication Using Public/Private Key

 

Certification

Clients can request that a server cryptographically certify its messages with its private key. This certification can be used to verify the authenticity of handle server transmissions. The current implementation of the Handle System uses DSA for this purpose. The DSA public key for a handle server is stored in its site information record.

Sessions

The Handle System allows for encryption of communication after establishing a session with a handle server. This is equivalent to SSL or TLS as used in protocols such as HTTPS, as it affords protection from eavesdropping and man-in-the-middle attacks. The current implementation of the Handle System encrypts session communications using 56-bit DES. Sessions reduce the authentication processing time for performing a sequence of administrative operations. They allow sharing of authentication information for multiple message exchanges between client and server. For example, a prefix administrator may authenticate itself once through the session setup, and then register multiple handles under the same session. A batch of CREATE_HANDLE requests for a given naming authority submitted without the establishment of a session requires administrator authentication for each request. Establishing a session when the first handle in the batch is created, and using a session key for authentication for each subsequent handle, eliminates the need for multiple authentication message exchanges.

Sessions also enable encrypting transactions between the client and hosting server. The following diagram illustrates the exchanges between client and server when a client initiates a session:

 
Figure showing session exchanges

Figure 3: Session Exchanges

 
 

Scalability

Scalability was a critical design criterion for the Handle System. The problem can be divided into storage and performance. That is, is there some limit to the number of identifiers (handles) that can be added? And does performance go down, or do some functions simply break, with increased numbers of identifiers, such that at some point the system becomes unusable? Specific details on this are given below, but it is important to keep two higher-level issues in mind. First, it is important here, as in many other places, to distinguish between Handle System design and any given implementation. Scalability in design may or may not work out as expected in any given implementation, but if the design is fundamentally scalable, specific implementation problems can be corrected as they are encountered. Second, use of the Handle System through some other service, e.g., an HTTP proxy, may well introduce other scalability issues which the basic Handle System design does not and cannot address.

Storage

The Handle System has been designed at a very basic level as a distributed system, that is, it will run across as many computers as are required to provide the desired functionality. Figure 4 illustrates two possible configurations.

 
Figure showing scalability

Figure 4: Example Handle Site Configurations

 

Identifiers are held in and resolved by handle servers and handle servers are grouped into one or more handle sites within each handle service. There are no design limits on the total number of handle services which constitute the Handle System, there are no design limits on the number of sites which make up each service, and there are no limits on the number of servers which make up each site. Replication by site, within a service, does not require that each site contain the same number of servers; that is, while each site will have the same replicated set of identifiers, each site may allocate that set of identifiers across a different number of servers. Thus increased numbers of identifiers within a site can be accommodated by adding additional servers, either on the same or additional computers, additional sites can be added to a service at any time, and additional services can be created. Every service must be registered with the Global Handle Registry, but that service can also have as many sites with as many servers as needed. The result is that the number of identifiers that can be accommodated in the current system is limited only by the number of computers available.

Performance

Constant performance across increasing numbers of identifiers is addressed by hashing, replication, and caching. Hashing, a technique well known to database designers, is used in the Handle System to evenly allocate any number of identifiers across any number of servers within a site, and allows a single computation to determine on which server within a set of servers a given identifier is located, regardless of the number of identifiers or the number of servers. Each server within a site is responsible for a subset of identifiers managed by that site. Given a specific identifier and knowledge of the service responsible for that identifier, a handle client selects a site within that service and can perform a single computation on the identifier to determine which server within the site contains the identifier. The result of the computation becomes a pointer into a hash table, which is unique to each handle site and which can be thought of as a map of the given site, mapping which identifiers belong to which servers. The computation is independent of the number of servers and identifiers, and it will not take a client any longer to locate and query the correct server for an identifier within a service that contains billions of identifiers and hundreds of servers, than for a service that contains only millions of identifiers and only a few servers.
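The single computation that locates a server can be sketched as a hash of the identifier reduced to a server position. The actual hash function and site hash table layout used by handle servers are not given here, so SHA-256, the use of the first four digest bytes, and the `server_for` name are assumptions for illustration.

```python
import hashlib


def server_for(handle: str, server_count: int) -> int:
    """Map a handle to a server position within a site with one computation."""
    if server_count < 1:
        raise ValueError("a site needs at least one server")
    digest = hashlib.sha256(handle.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % server_count


# The same handle always maps to the same server of a given site,
# and the cost does not depend on how many identifiers the site holds.
position = server_for("123/456", server_count=8)
```

Two sites of the same service may use different server counts, so the same handle can map to a different position at each site while still being found with one computation.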

The connection between a given identifier and the responsible handle service is determined by prefix. Prefix records are maintained by the Global Handle Registry as handles, and these handles are hashed across the Global Handle Registry sites in the same way that all other identifiers are hashed across their respective service sites. The only hierarchy in Handle System services is the two level distinction between a single global and all locals, which means that the worst case resolution would be that a client with no built-in or cached knowledge would have to consult Global and one local.

Another aspect of Handle System scalability is replication. The individual handle services within the Handle System each consist of one or more handle service sites, where each site replicates the complete individual handle service, at least for the purposes of handle resolution. Thus, increased demand on a given handle service can be met with additional sites, and increased demand on a given site can be met with additional servers. This also opens up the option, so far not implemented by any existing clients, of optimizing resolution performance by selecting the "best" server from a group of replicated servers.

Handle clients may optimize performance across parallel service sites and, given a choice of multiple sites, will largely ignore sites which are slow or completely unresponsive, either because of server problems or because of network problems. Any given handle service can thus be made more robust both in terms of performance and reliability, through the addition of servers and collections of servers.

Caching may also be used to improve performance and reduce the possibility of bottleneck situations in the Handle System, as is the case in many distributed systems. The Handle System data model and protocol design includes a space for cache time-outs and handle caching servers have been developed and are in use.

 

Replication

Replication is the process by which changes in a primary handle site are communicated to one or more 'secondary' sites. A handle service has a single 'primary' site and zero or more 'secondary' sites that are simple mirrors of the primary. The number of servers in each site may vary. Clients are required to send all administrative messages (such as create/modify/delete-handle requests) only to a primary site.

When a new secondary server is started, it requests all handles from the primary server(s). This is called a "dump"; for some primary servers the list can be very large, and listing the handles takes time. Once the complete list is received, the secondary performs incremental replication.

When a primary handle server receives a request to add, modify or delete a handle, it records an entry in a transaction log just prior to modifying the database. This transaction log can be viewed in the "txns" subdirectory of the primary handle server. There is a transaction log file for each calendar day, with one transaction per line. Each transaction consists of an encoded handle, the encoded type of change (add, delete, modify, home-prefix, unhome-prefix), a time-stamp and transaction ID. There is also an "index" file that contains the first time-stamp and transaction ID for each daily log.

Secondary sites poll the primary (or another intermediate) site every n minutes (where n is generally between 1 and 5). The poll message includes the last transaction ID retrieved by the secondary, and the date of the last poll. If the last poll occurred before the replication source log begins, the response tells the secondary to skip incremental replication and re-copy the entire database from the source. Otherwise, the source returns the transactions that have occurred since the given transaction ID and includes the latest transaction ID. For transactions other than delete-handle, home-prefix and unhome-prefix, the current handle values are included with the transaction listing.
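The decision the replication source makes on each poll can be sketched as follows. This is a minimal illustration in Python, not the HANDLE.NET implementation: the `Txn` class, the `answer_poll` function, the in-memory log and the status strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    txn_id: int    # transaction ID
    date_ms: int   # time-stamp in milliseconds
    handle: str
    action: str


def answer_poll(log, last_txn_id, last_poll_ms):
    """Return (status, transactions, latest_txn_id) for one poll.

    `log` is the source's retained transaction log, oldest entry first.
    """
    latest_id = log[-1].txn_id if log else last_txn_id
    # The secondary last polled before the retained log begins, so the log
    # cannot be trusted to contain everything it missed: ask for a full copy.
    if log and last_poll_ms < log[0].date_ms:
        return ("redump", [], latest_id)
    # Otherwise return only the transactions after the given transaction ID.
    newer = [t for t in log if t.txn_id > last_txn_id]
    return ("transactions", newer, latest_id)
```

A secondary that receives "redump" would discard its incremental state and re-copy the database; one that receives "transactions" would apply them in order and remember the returned latest transaction ID for its next poll.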

Replication only works if handles are changed through an interaction with the primary handle server using the handle administrative protocol. That means that if you run a multi-server handle service, and your handle server is configured to use an SQL database as the back end, you will need to (1) take care of replication at the database level or (2) ensure that all changes are performed through the handle server.

For more information on replication, see the Interface Specification: Handle System Protocol (ver 2.1) Specification, RFC 3652.

How is Replication Accomplished?

To do replication, a secondary needs to have and keep track of the following:

  1. The site information (including public key, etc.) of the site it will be replicating.
  2. The last transaction ID it received from each server in the site.
  3. The timestamp of the last transaction it received from each server in the site.
  4. Authentication credentials that give the secondary permission to do replication with the primary.

In the handle server, server replication is done in a separate thread. The replication daemon is a thread that retrieves handle transactions from the primary servers or some other source (depending on the server configuration). The replication daemon should only run on secondaries, not on primary servers.

The replication daemon does the following:

  1. Retrieves transactions by sending a request to the server being replicated.
  2. The primary server responds to the secondary server's request by returning a stream containing all the transactions that have occurred since the secondary's last request. The daemon processes this stream and updates the secondary server with the incoming transactions.
  3. The primary server will continue to stream transactions until the secondary server is up to date. The response status is continuously monitored by the replication daemon to determine if transactions are still being sent or if the secondary server needs to re-request all the handles.
       a. If the response status indicates that a 're-dump' is needed, the replication daemon sends a 'dump' request to all the servers being replicated:

          (1) All the currently stored handles are deleted.
          (2) The server sends a dump handles request to all primary servers that are being replicated.
          (3) The response to the dump handles request is processed to dump the databases from each primary server being replicated.
          (4) The new replication information is saved.

       b. If the response status indicates that transactions are being sent, the replication daemon goes ahead and processes the incoming transactions.
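The control flow described above can be sketched as a single pass of a replication daemon. This is a simplified Python illustration under stated assumptions: `FakePrimary`, `replicate_once`, the dictionary-based handle store and the tuple layout of a transaction are all hypothetical, and a real daemon would talk to each server in the site over the handle protocol rather than to an in-memory object.

```python
class FakePrimary:
    """In-memory stand-in for a primary server (hypothetical)."""

    def __init__(self, handles, txns, log_start, now):
        self.handles = handles      # handle -> list of values
        self.txns = txns            # list of (txn_id, handle, action, values)
        self.log_start = log_start  # date at which the retained log begins
        self.now = now              # date to use for the next request

    def retrieve_txns(self, last_id, last_date):
        if last_date < self.log_start:
            return "NEED_TO_REDUMP", [], self.now
        new = [t for t in self.txns if t[0] > last_id]
        return "SENDING_TRANSACTIONS", new, self.now

    def dump_handles(self):
        last_id = self.txns[-1][0] if self.txns else 0
        return dict(self.handles), last_id


def replicate_once(source, store, state):
    """Run one replication pass on a secondary and return the status seen."""
    status, txns, next_date = source.retrieve_txns(
        state["last_txn_id"], state["last_date"])
    if status == "NEED_TO_REDUMP":
        store.clear()                               # delete stored handles
        handles, last_id = source.dump_handles()    # request and read the dump
        store.update(handles)
        state["last_txn_id"] = last_id              # save replication info
    else:
        for txn_id, handle, action, values in txns:
            if action == "delete":
                store.pop(handle, None)
            else:
                store[handle] = values
            state["last_txn_id"] = txn_id
    state["last_date"] = next_date
    return status
```

Calling `replicate_once` repeatedly on a timer approximates the polling behavior; the first call from an empty state triggers a dump, and later calls apply only new transactions.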

Details

Handle server replication communication is based on two request/response pairs. The secondary server sends out a request for new transactions or it sends a request for a dump of all the handles in the primary server. What follows is a description of the request and associated responses.

Retrieve Transactions Request:

This is the request used to retrieve any new transactions from a server. This request is used for server<->server (or replicator<->server) communication. The request needs to provide the following information to the server being queried:

  • Last transaction query ID
  • Last transaction query date
  • Information specifying which server in the site is requesting new transactions:
            Requestor's server number
            Requestor's hash type
            Requestor's number of servers

The last transaction ID will allow the server being queried to determine which transactions need to be returned. The queried server will send every transaction that has a transaction ID greater than the last transaction ID and hashes to the requesting server. Knowing the last time the transactions were queried will allow the server being queried to determine if the entire set of handles needs to be "dumped" again.

The following describes the body of the Retrieve Transactions Request handle protocol message as defined in Section 2 of RFC 3652.

The Message Header of any Retrieve Transactions Request must set its <OpCode> to OC_RETRIEVE_TXN_LOG and its <ResponseCode> to 0.

 
<Message Body of Retrieve Transactions Request> ::= <LastTransactionID>
                                                    <LastQueryDate>
                                                    <ReceiverHashType>
                                                    <NumberOfServers>
                                                    <ServerNumber>

where:

<LastTransactionID>
An 8-byte unsigned integer that specifies the last Transaction ID.

<LastQueryDate>
An 8-byte unsigned integer that specifies the last query date. The date unit is milliseconds. The value of the date is the milliseconds since January 1, 1970, 00:00:00 GMT.

<ReceiverHashType>
A 1-byte value that identifies how the handles are hashed on the server.

<NumberOfServers>
A 4-byte unsigned integer that specifies the number of servers in the site.

<ServerNumber>
A 4-byte unsigned integer that specifies which server in the site is sending the request.
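As an illustration of the field layout above, the request body can be packed with Python's `struct` module. This sketch assumes the fields are laid out back to back with no padding and that integers are in network (big-endian) byte order; the function names are hypothetical, and the message header and any credentials are not shown.

```python
import struct

# Q = 8-byte unsigned, B = 1-byte unsigned, I = 4-byte unsigned.
# ">" selects big-endian with no alignment padding (an assumption here).
_RETRIEVE_TXN_BODY = struct.Struct(">QQBII")


def pack_retrieve_txn_body(last_txn_id, last_query_ms, hash_type,
                           num_servers, server_number):
    """Build the body in the order the fields are listed above."""
    return _RETRIEVE_TXN_BODY.pack(
        last_txn_id, last_query_ms, hash_type, num_servers, server_number)


def unpack_retrieve_txn_body(body):
    """Return (last_txn_id, last_query_ms, hash_type, num_servers, server_number)."""
    return _RETRIEVE_TXN_BODY.unpack(body)
```

Under these assumptions the body is always 25 bytes long (8 + 8 + 1 + 4 + 4).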

 

Retrieve Transactions Response:

This is the response used to forward new transactions to a replicated site/server. This response is used for server<->server (or replicator<->server) communication. The response has two valid states: it will either be SENDING_TRANSACTIONS or it will indicate a NEED_TO_REDUMP of all the handles for the servers being replicated. If NEED_TO_REDUMP is returned, the secondary site/server will request all the handles from all the servers in the primary site. If the Retrieve Transactions Response status is SENDING_TRANSACTIONS, the primary server will stream all new transactions to the requesting secondary server. The following describes the body of the handle protocol message as defined in Section 2 of RFC 3652.

The Message Header of any Retrieve Transactions Response must set its <OpCode> to OC_RETRIEVE_TXN_LOG. A successful Retrieve Transactions Response must set its <ResponseCode> to RC_SUCCESS. This message is streamable.

 
<Message Body of Retrieve Transactions Response> ::= <RequestDigest>
                                                     <DataFormatVersion>
                                                     <Status>
                                                     <Stream>
                                                     <EndTransmissionRecord>
                                                     <LastQueryDate>

where:

<RequestDigest>
Optional field as defined in section 2.2.3 of RFC 3652. Including the request digest enables the client to ensure that the response being received is actually in response to the request that was sent. This prevents a malevolent entity from returning a different response (a response previously signed by the server as a valid response to a different request) to the client, and giving the client bad information.

<DataFormatVersion>
A 4-byte unsigned integer that specifies the data format version. The value of this integer is set to 1. This record is signed using the primary server's private key.

<Status>
A 4-byte unsigned integer. This record is signed using the primary server's private key. This integer specifies the status code that indicates if transactions are being sent or if a redump of all handles is needed.

If the date of the specified last transaction ID is later than the date of the last replication request, then the requesting server has missed some transactions. The requesting server needs to request a dump of all handles.

To indicate a redump is needed, the status code integer is set to 1.

To indicate transactions are being sent, the status code integer is set to 2.

<Stream>
A stream composed of all the new transactions that hash to the requesting server. If the handle referred to in the transaction hashes to the requesting server then the transaction is sent to the requestor.

If the handle hashes to the requesting server, the following data is sent to the requestor for each transaction.

<Transaction Data> ::= <RecordType>
                       <TransactionID>
                       <Handle>
                       <TransactionAction>
                       <TransactionDate>
                       <HandleValue>

where:

<RecordType>
A 1-byte unsigned integer that specifies the type of the record being streamed. The value of this integer is set to 1 for handle records.

<TransactionID>
An 8-byte unsigned integer that specifies the Transaction ID.

<Handle>
A 4-byte unsigned integer followed by the bytes that specify the name of the handle. The integer specifies the length of the handle name.

<TransactionAction>
A 1-byte unsigned integer that specifies the action being carried out in the transaction. The possible integer values are listed in the table under "More Information on Transaction Types".

<TransactionDate>
An 8-byte unsigned integer that specifies the transaction date. The value of the date is the milliseconds since January 1, 1970, 00:00:00 GMT.

<HandleValue>
A 4-byte unsigned integer followed by a list of handle values. The integer indicates the number of handle values in the list. Handle values are sent only if the transaction is a handle create or a handle update, in which case the values being created or updated are included.

<EndTransmissionRecord>
A 1-byte unsigned integer whose value is 0. This byte represents the ending summary record. This byte is sent when the server finishes streaming the handles that hash to the requesting server.

<LastQueryDate>
An 8-byte unsigned integer. This integer is the date the requestor should use for the next RetrieveTxnRequest. This is needed so that in the event that there have been no transactions in a while, the requestor does not have to redump the entire database. This record is signed using the primary server's private key. The date unit is milliseconds. The value of the date is the milliseconds since January 1, 1970, 00:00:00 GMT.
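To make the per-transaction layout concrete, the following Python sketch encodes and decodes the fixed part of one transaction record, from <RecordType> through <TransactionDate>. It is an illustration only: it assumes big-endian integers and UTF-8 handle names, it omits the <HandleValue> list (whose encoding is not described here) and the signatures, and the function names are hypothetical.

```python
import struct


def encode_txn_record(txn_id, handle, action, date_ms):
    """Encode RecordType, TransactionID, Handle, TransactionAction, TransactionDate."""
    name = handle.encode("utf-8")
    # RecordType is 1 for handle records; Handle is a 4-byte length plus bytes.
    head = struct.pack(">BQI", 1, txn_id, len(name))
    tail = struct.pack(">BQ", action, date_ms)
    return head + name + tail


def decode_txn_record(buf, offset=0):
    """Decode one record starting at `offset`; return (fields, next_offset)."""
    record_type, txn_id, name_len = struct.unpack_from(">BQI", buf, offset)
    offset += 13                              # 1 + 8 + 4 bytes
    handle = buf[offset:offset + name_len].decode("utf-8")
    offset += name_len
    action, date_ms = struct.unpack_from(">BQ", buf, offset)
    offset += 9                               # 1 + 8 bytes
    fields = {
        "record_type": record_type,
        "txn_id": txn_id,
        "handle": handle,
        "action": action,
        "date_ms": date_ms,
    }
    return fields, offset
```

For a delete-handle transaction, which carries no handle values, this fixed part is the whole record under the stated assumptions; for create and update transactions a reader would continue with the value list at the returned offset.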

 

Dump Handles Request:

This is the request used to retrieve all handles from a server. This request is used for server<->server (or replicator<->server) communication. The requesting server needs to specify which handles to send (filtered by how the handles are hashed), by providing the following:

  • Requestor's server number
  • Requestor's receiver hash type
  • Requestor's number of servers

The following describes the body of the Dump Handles Request handle protocol message as defined in Section 2 of RFC 3652.

The Message Header of any Dump Handles Request must set its <OpCode> to OC_DUMP_HANDLES and its <ResponseCode> to 0.

 
<Message Body of Dump Handles Request> ::= <ReceiverHashType>
                                           <NumberOfServers>
                                           <ServerNumber>

where:

<ReceiverHashType>
A 1-byte value that identifies how the handles are hashed on the server.

<NumberOfServers>
A 4-byte unsigned integer that specifies the number of servers in the site.

<ServerNumber>
A 4-byte unsigned integer that specifies which server this is in the site.

 

Dump Handles Response:

This is the response used to send all handles in the database to a replicated site/server. This response is used for server<->server (or replicator<->server) communication. This response is used by the primary server to send all of the handles that hash to the requestor beginning with the transaction ID specified in the Dump Handles Request. The message is signed using the normal handle response signature format as defined in Section 2.2.4 of RFC 3652. The following describes the body of the Dump Handles Response handle protocol message as defined in Section 2 of RFC 3652.

The Message Header of any Dump Handles Response must set its <OpCode> to OC_DUMP_HANDLES. A successful Dump Handles Response must set its <ResponseCode> to RC_SUCCESS. This message is streamable.

 
<Message Body of Dump Handles Response> ::= <RequestDigest>
                                            <DataFormatVersion>
                                            <Stream>
                                            <EndTransmissionRecord>
                                            <LastTxnId>

where:

<RequestDigest>
Optional field as defined in section 2.2.3 of RFC 3652. Including the request digest ensures that the response received is actually in response to the request that was sent. This prevents a malevolent entity from returning a different response (a response previously signed by the server as a valid response to a different request) to the client, and giving the client bad information.

<DataFormatVersion>
A 4-byte unsigned integer that specifies the data format version. The value of this integer is set to 1. This value is signed using the primary server's private key.

<Stream>
A stream composed of the handle values that hash to the requesting server. Each block that is sent is signed using the primary server's private key. Only a handle that hashes to the requesting server is streamed. The value being streamed will either be used to populate the handle database or the naming authority database.

If a handle hashes to the requesting server, the data sent to the requestor for each record is defined as follows:

<Data returned for each database record> ::= <RecordType>
                                             <RecordData>

For Handle Records:

<RecordType>
A 1-byte integer that specifies that the record being streamed is a handle record. The value of this integer is set to 1 for handle records.

<RecordData>
A 4-byte unsigned integer followed by a list of handle values. The integer indicates the number of handle values in the list.

For Prefix (Naming Authority) Records:

<RecordType>
A 1-byte integer that specifies that the record being streamed is a prefix (naming authority) record. The value of this integer is set to 2 for prefix records.

<RecordData>
A 4-byte unsigned integer followed by the prefix value. The integer indicates the length of the prefix value.

<EndTransmissionRecord>
A 1-byte unsigned integer whose value is 0. This byte represents the ending summary record. This byte is sent when the server finishes streaming the handles that hash to the requesting server.

<LastTxnId>
An 8-byte unsigned integer that specifies the transaction ID the requestor should use for the next Retrieve Transactions Request to the primary server. This is needed so that the server does not have to redump all the handles at each request.

 

<OpCode> used for handle Server Replication:

   
      Op_Code | Symbolic Name       | Remark
      --------|---------------------|-------------------------
       1001   | OC_RETRIEVE_TXN_LOG | Retrieve Transaction Log
       1002   | OC_DUMP_HANDLES     | Dump Handles

 

More Information on Transaction Types:

The Retrieve Transactions Response message streams transactions to the secondary server. This section provides a little more detail on each of these transaction types.

   
      Transaction Actions       | Integer Value
      --------------------------|--------------
       ACTION_PLACEHOLDER       |       0
       ACTION_CREATE_HANDLE     |       1
       ACTION_DELETE_HANDLE     |       2
       ACTION_UPDATE_HANDLE     |       3
       ACTION_HOME_NA           |       4
       ACTION_UNHOME_NA         |       5
       ACTION_DELETE_ALL        |       6

ACTION_PLACEHOLDER
A dummy value that doesn't represent any actual transaction.

ACTION_CREATE_HANDLE
Specifies a transaction that creates a handle.

ACTION_DELETE_HANDLE
Specifies a transaction that deletes a handle.

ACTION_UPDATE_HANDLE
Specifies a transaction that updates the value(s) of a handle.

ACTION_HOME_NA
Specifies a transaction that homes a prefix on a handle server.

ACTION_UNHOME_NA
Specifies a transaction that unhomes a prefix from a handle server.

ACTION_DELETE_ALL
Specifies a transaction that deletes all handles from a handle server.
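The action codes in the table above map naturally onto an enumeration. The Python sketch below mirrors the table's integer values; the class name `TxnAction` and the helper `carries_handle_values` are hypothetical, and the helper encodes only the rule stated earlier that handle values accompany create and update transactions.

```python
from enum import IntEnum


class TxnAction(IntEnum):
    """Transaction action codes, mirroring the table above."""
    PLACEHOLDER = 0
    CREATE_HANDLE = 1
    DELETE_HANDLE = 2
    UPDATE_HANDLE = 3
    HOME_NA = 4
    UNHOME_NA = 5
    DELETE_ALL = 6


def carries_handle_values(action):
    """True if a transaction with this action code includes handle values."""
    return TxnAction(action) in (TxnAction.CREATE_HANDLE, TxnAction.UPDATE_HANDLE)
```

Passing an integer outside 0 to 6 to `TxnAction(...)` raises `ValueError`, which gives a decoder a simple way to reject unknown action codes.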

 

Additional Considerations:

The Dump Handles Request/Response and the ACTION_DELETE_ALL transaction should be used carefully. Setting up secondary servers such that they alert an administrator when a Dump Handles Request or ACTION_DELETE_ALL transaction is received, and requiring administrator confirmation of these actions, are recommended.
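One way to follow this recommendation is to route destructive replication events through a guard that alerts an administrator and waits for confirmation. The Python sketch below is a hypothetical illustration, not a feature of the handle server: the event names, `apply_guarded`, and the `confirm` and `alert` callbacks are all assumptions made for the example.

```python
# Events that should not be applied without an administrator's approval.
DESTRUCTIVE_EVENTS = {"DUMP_HANDLES", "ACTION_DELETE_ALL"}


def apply_guarded(event, apply, confirm, alert):
    """Apply `event`, asking for confirmation first if it is destructive.

    `apply(event)` performs the change, `alert(message)` notifies an
    administrator, and `confirm(event)` returns True to approve.
    Returns True if the event was applied, False if it was held back.
    """
    if event in DESTRUCTIVE_EVENTS:
        alert(f"Replication event {event} requires confirmation")
        if not confirm(event):
            return False
    apply(event)
    return True
```

Ordinary create, update and delete transactions pass straight through, so the guard adds no delay to normal incremental replication.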

 

Handle Value Types

A handle has a set of values assigned to it and may be thought of as a record that consists of a group of fields. Each handle value must have a data type specified in its <type> field, that defines the syntax and semantics of its data, and a unique <index> value that distinguishes it from the other values of the set.
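The record-like structure of a handle can be sketched as follows. This Python illustration is an assumption-laden simplification: the `HandleValue` class carries only the <index>, <type> and data fields mentioned here (real handle values have further fields), and `build_record` is a hypothetical helper that enforces the two rules stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleValue:
    index: int   # unique within the handle's set of values
    type: str    # e.g. "URL" or "HS_ADMIN"
    data: bytes


def build_record(values):
    """Return {index: value}, rejecting untyped values and duplicate indexes."""
    record = {}
    for value in values:
        if not value.type:
            raise ValueError(f"value at index {value.index} has no type")
        if value.index in record:
            raise ValueError(f"duplicate index {value.index}")
        record[value.index] = value
    return record
```

For example, a handle might hold a URL value at one index and an HS_ADMIN value at another; two values may share a type, but never an index.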

Types are identified by handles and can be any UTF8-string. Handle System users acknowledge, however, that there are potential conflicts for handle clients if users assign types that are not registered and recognized across the user community. How types should be defined and how they should be registered and used is currently under discussion.

Table 1: Pre-defined Administrative Types

Type Handle      | Description
-----------------|------------
0.TYPE/HS_ADMIN  | Values of type HS_ADMIN are encoded representations of the admin record for a handle. This admin record defines the administrator as well as the permissions that the administrator has over the handle.
0.TYPE/HS_PUBKEY | Values of type HS_PUBKEY contain encoded information describing a public key that can be used to authenticate entities in the Handle System.
0.TYPE/HS_SECKEY | Values of type HS_SECKEY are used to store UTF8-encoded text that is used as a password to access some service. Values of type HS_SECKEY should usually not be publicly readable.
0.TYPE/HS_VLIST  | Values of type HS_VLIST contain a list of handle value references. Each handle value reference consists of a handle and an index of the value being referenced.
0.TYPE/HS_SERV   | Values of type HS_SERV contain UTF8-encoded handles that identify a handle service (i.e., a set of HS_SITE values). When HS_SERV values are contained in a prefix handle, resolvers retrieve the service information from the handle referenced in the HS_SERV value.
0.TYPE/HS_ALIAS  | Values of type HS_ALIAS contain a UTF8-encoded handle. The handle identified by the HS_ALIAS value should be resolved, and its data should be used in place of the data in the handle containing the HS_ALIAS value. This redirection should be handled at the application level.

 

Table 2: Non-administrative Types

Type Handle  | Description
-------------|------------
0.TYPE/URL   | Values of type URL are UTF8-encoded URIs that specify the location of the object identified by a handle.
0.TYPE/EMAIL | Values of type EMAIL are UTF8-encoded email addresses.
0.TYPE/DESC  | Values of type DESC are UTF8-encoded text descriptions of the object identified by the handle.
10320/loc    | Values of type 10320/loc specify an XML-formatted list of locations.

 

Among the handle values stored in every prefix are some that directly impact the behavior of clients, servers, and the proxies. They are described below.

Table 3: Prefix Handle Values

No. | Value                                             | Use
----|---------------------------------------------------|----
1   | HS_SITE and HS_SERV values                        | These values determine which LHS a client will use to resolve handles under a prefix.
2   | HS_ADMIN value                                    | An LHS server will use this value to determine which administrators are authorized to create handles under the prefix.
3   | HS_VLIST, HS_PUBKEY, HS_SECKEY values             | These values are referenced by HS_ADMIN values and so are used by servers (and clients) for authentication and authorization. It has become customary to use the prefix handle itself as the handle an LHS administrator uses to authenticate.
4   | An HS_NAMESPACE value with several subcomponents: |
4a  | Delegation information                            | Used by clients to determine when derived prefixes should be resolved via delegation.
4b  | Template handle information                       | Used by servers to determine how to configure themselves to respond to handle resolution requests via template.
4c  | Status information                                | Used by the proxy when it is unable to resolve a handle under the prefix.
4d  | Multiple redirection information                  | Used by the proxy when it is unable to resolve a handle under the prefix.
 

More information on handle types can be found in the Technical Manual and the Handle System RFCs that make up the Interface Specification.

 
 

October 2012