The Handle System is a comprehensive system for assigning, managing, and resolving persistent identifiers, known as "handles," for digital objects and other resources on the Internet. Handles can be used as Uniform Resource Names (URNs). The Handle System includes an open set of protocols, a namespace, and an implementation of the protocols. The protocols enable a distributed computer system to store handles of digital resources and resolve those handles into the information necessary to locate and access the resources. This associated information can be changed as needed to reflect the current state of the identified resource without changing the handle, thus allowing the name of the item to persist over changes of location and other state information.
Within the handle namespace, every handle consists of two parts: its naming authority, otherwise known as its prefix, and a unique local name under the naming authority, otherwise known as its suffix. The naming authority and local name are separated by the ASCII character "/". A handle may thus be defined as
<Handle> ::= <Handle Naming Authority> "/" <Handle Local Name>
For example, "10.1045/january99-bearman" is a handle for an article published in D-Lib Magazine. It is defined under the Handle Naming Authority "10.1045", and its Handle Local Name is "january99-bearman".
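The syntax above can be sketched in a few lines of Python. This is only an illustration, not part of any Handle System software: the names `Handle` and `parse_handle` are hypothetical, and the sketch assumes that the first "/" is the separator, so that any later "/" belongs to the local name.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    naming_authority: str  # the prefix, e.g. "10.1045"
    local_name: str        # the suffix, e.g. "january99-bearman"


def parse_handle(text: str) -> Handle:
    """Split a handle into naming authority and local name.

    Assumption: the first "/" separates the two parts.
    """
    naming_authority, separator, local_name = text.partition("/")
    if not separator or not naming_authority or not local_name:
        raise ValueError(f"not a valid handle: {text!r}")
    return Handle(naming_authority, local_name)
```

Calling `parse_handle("10.1045/january99-bearman")` yields the naming authority "10.1045" and the local name "january99-bearman", matching the example above.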
Handles may consist of any printable characters from the Universal Character Set, two-octet form (UCS-2) of ISO/IEC 10646, which is the exact character set defined by Unicode v2.0. The UCS-2 character set encompasses most characters used in every major language written today. To allow compatibility with most existing systems and to prevent ambiguity among different encodings, the handle protocol mandates UTF-8 as the only encoding used for handles. The UTF-8 encoding preserves any ASCII-encoded names, which allows maximum compatibility with existing systems without causing naming conflicts.
By default, handles are case sensitive. However, any handle service, including the global service, may define its namespace such that all ASCII characters within any handle are case insensitive.
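A minimal sketch of the encoding and case rules follows. The function names are hypothetical, and folding to upper case (rather than lower case) is an assumption made for illustration; the text only states that ASCII characters may be treated as case insensitive.

```python
def encode_handle(handle: str) -> bytes:
    """Encode a handle as UTF-8; ASCII-only handles keep their ASCII bytes."""
    return handle.encode("utf-8")


def fold_ascii_case(handle: str) -> str:
    """Upper-case only the ASCII characters, leaving all others untouched."""
    return "".join(ch.upper() if ch.isascii() else ch for ch in handle)


def same_handle(a: str, b: str, case_sensitive: bool = True) -> bool:
    """Compare two handles under a service's case rule.

    case_sensitive=True models the default; False models a service that
    treats ASCII characters as case insensitive.
    """
    if case_sensitive:
        return a == b
    return fold_ascii_case(a) == fold_ascii_case(b)
```

Under the default rule, "10.1045/ABC" and "10.1045/abc" are different handles; a service that declares its namespace case insensitive for ASCII would treat them as the same.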
The handle namespace can be considered as a superset of many local namespaces, with each local namespace having its own unique handle naming authority. The naming authority identifies the administrative unit of creation, although not necessarily continuing administration, of the associated handles. Each naming authority is guaranteed to be globally unique within the Handle System. Any existing local namespace can join the global handle namespace by obtaining a unique naming authority, with the resulting handles being a combination of naming authority and local name as shown above.
Each naming authority may have many child naming authorities registered underneath. Any child naming authority can only be registered by its parent after its parent naming authority is registered. Every handle is then defined under a naming authority. The naming authority and the local name are separated by the octet used for ASCII character "/" (0x2F). The collection of local names under a naming authority is the local namespace for that naming authority. Any local name must be unique under its local namespace. The uniqueness of a naming authority and a local name under that authority ensures that any handle is globally unique within the context of the Handle System.
The Handle System allows handles to be resolved in a distributed fashion, using dedicated clients, common clients such as web browsers using special extensions or plug-ins, or unextended clients going through various proxies. In all cases, communication with the Handle System is carried out using Handle System protocols, and in all cases, those protocols have both a formal specification and some specific implementations.
As illustrated above:
The handle protocol allows handle servers to authenticate their clients and to provide data integrity service upon client request. Public key and/or secret key cryptography may be used. Server authentication may be used to prevent eavesdroppers from forging client requests or tampering with server responses.
The Handle System provides authentication and data integrity services on client request. By default, the handle resolution service does not require any client authentication. However, resolution requests for confidential data assigned to any handle (by its administrator), as well as all administration requests (e.g., adding or deleting handle values), require authentication of the client as having the requisite authority. When authentication is required, the responsible handle server will issue a challenge to the requesting client before carrying out the client's request. To satisfy the authentication requirement, the client must send back a correct response that identifies it as the administrator or otherwise shows that it possesses the appropriate credentials. The handle server will respond to the initial request only after successful authentication of the client. Handle clients may choose to use either secret key or public key cryptography for authentication.
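The challenge-response exchange described above can be sketched for the secret key case. This is an illustration only: the class and function names are hypothetical, and the use of a random nonce with HMAC-SHA256 is an assumption, not the message format or digest algorithm defined by the handle protocol.

```python
import hashlib
import hmac
import secrets
from typing import Dict


class SketchServer:
    """Toy server that challenges a client before honoring a request."""

    def __init__(self, admin_secrets: Dict[str, bytes]) -> None:
        self._admin_secrets = admin_secrets      # shared secret per administrator
        self._pending: Dict[str, bytes] = {}     # outstanding challenge per administrator

    def issue_challenge(self, admin_id: str) -> bytes:
        nonce = secrets.token_bytes(16)          # fresh random challenge
        self._pending[admin_id] = nonce
        return nonce

    def verify(self, admin_id: str, response: bytes) -> bool:
        nonce = self._pending.pop(admin_id, None)  # each challenge is single-use
        secret = self._admin_secrets.get(admin_id)
        if nonce is None or secret is None:
            return False
        expected = hmac.new(secret, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def answer_challenge(secret: bytes, nonce: bytes) -> bytes:
    """Client side: prove possession of the secret without sending it."""
    return hmac.new(secret, nonce, hashlib.sha256).digest()
```

In this sketch the server would carry out the original request only when `verify` returns true, mirroring the rule that the server responds to the initial request only after successful authentication.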
The figure below illustrates authentication by a handle client using public/private key.
The handle administration service deals with client requests to manage handles, including adding handles, deleting handles or updating their values. It also deals with naming authority administration via naming authority handles. Each handle can define its own administrator(s) and each administrator is granted a certain set of permissions. The Handle System authentication protocol authenticates the handle administrator before fulfilling any administrative request.
Handles can therefore be managed securely over the public network by authorized administrators at any network location.
A handle has a set of values assigned to it and may be thought of as a record that consists of a group of fields. Each handle value must have a data type, specified in its <type> field, which defines the syntax and semantics of its data, and a unique <index> value that distinguishes it from the other values of the set. A set of handle data types has been pre-defined for administrative use. (See Handle System Namespace and Service Definition.)
<type> can be any UTF-8 string. Handle System users acknowledge, however, that there are potential conflicts for handle clients if users assign types that are not registered and recognized across the user community. How types should be defined and how they should be used is currently under discussion. The non-administrative types that have been registered and defined to date are listed below.
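The record structure described above can be modeled as follows. This is a sketch under stated assumptions: the class names are hypothetical, the index is assumed to be an integer, the data is assumed to be a byte string, and any type names used with it are purely illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class HandleValue:
    index: int   # distinguishes this value from the others in the set (assumed integer)
    type: str    # names the syntax and semantics of the data
    data: bytes  # the value itself (assumed to be raw bytes)


@dataclass
class HandleRecord:
    handle: str
    values: Dict[int, HandleValue] = field(default_factory=dict)

    def add(self, value: HandleValue) -> None:
        """Add a value, enforcing that each index is unique within the handle."""
        if value.index in self.values:
            raise ValueError(f"index {value.index} already used in {self.handle}")
        self.values[value.index] = value

    def of_type(self, type_name: str) -> List[HandleValue]:
        """Return every value whose type matches, in index order."""
        return [v for _, v in sorted(self.values.items()) if v.type == type_name]
```

Keying the values by index makes the uniqueness rule explicit, while `of_type` reflects that several values of a handle may share one type.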
Scalability was a critical design criterion for the Handle System. The problem can be divided into storage and performance. That is, is there some limit to the number of handles that can be added? And does performance degrade, or do some functions simply break, as the number of handles increases, such that at some point the system becomes unusable? Specific details on this are given below, but it is important to keep two higher-level issues in mind. First, it is important here, as in many other places, to distinguish between Handle System design and any given implementation. Scalability in design may or may not work out as expected in any given implementation, but if the design is fundamentally scalable, specific implementation problems can be corrected as they are encountered. Second, use of the Handle System through some other service, e.g., an HTTP proxy, may well introduce other scalability issues which the basic Handle System design does not and cannot address.
The Handle System has been designed at a very basic level as a distributed system, that is, it will run across as many computers as are required to provide the desired functionality.
Handles are held in and resolved by handle servers, and handle servers are grouped into one or more handle sites within each handle service. There are no design limits on the total number of handle services that constitute the Handle System, on the number of sites that make up each service, or on the number of servers that make up each site. Replication by site, within a service, does not require that each site contain the same number of servers; that is, while each site will have the same replicated set of handles, each site may allocate that set of handles across a different number of servers. Thus, increased numbers of handles within a site can be accommodated by adding servers, either on the same or on additional computers; additional sites can be added to a service at any time; and additional services can be created. Every service must be registered with the Global Handle Registry, but that service can also have as many sites with as many servers as needed. The result is that the number of handles that can be accommodated in the current system is limited only by the number of computers available.
Constant performance across increasing numbers of handles is addressed by hashing, replication, and caching.
Hashing, a technique well known to database designers, is used in the Handle System to evenly allocate any number of handles across any number of servers within a site, and allows a single computation to determine on which server within a set of servers a given handle is located, regardless of the number of handles or the number of servers. Each server within a site is responsible for a subset of handles managed by that site. Given a specific handle and knowledge of the service responsible for that handle, a handle client selects a site within that service and can perform a single computation on the handle to determine which server within the site contains the handle. The result of the computation becomes a pointer into a hash table, which is unique to each handle site and which can be thought of as a map of the given site, mapping which handles belong to which servers. The computation is independent of the number of servers and handles, and it will not take a client any longer to locate and query the correct server for a handle within a service that contains billions of handles and hundreds of servers, than for a service that contains only millions of handles and only a few servers.
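The single computation described above might look like the following sketch. The hash function (SHA-256), the use of the first eight digest bytes, and the representation of a site's hash table as a list of server names are all assumptions made for illustration; they are not taken from the handle protocol, and `locate_server` is a hypothetical name.

```python
import hashlib
from typing import List


def locate_server(handle: str, site_table: List[str]) -> str:
    """Pick the server responsible for a handle within one site.

    site_table stands in for the site's hash table: slot i names the
    server that holds every handle hashing to slot i.
    """
    digest = hashlib.sha256(handle.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % len(site_table)
    return site_table[slot]


# Two hypothetical sites replicating the same service with different server counts.
site_a = ["a-server-0", "a-server-1", "a-server-2"]
site_b = ["b-server-0"]
```

The cost of `locate_server` depends only on the length of the handle, not on how many handles or servers exist. Because each site has its own table, the same handle can map to a different server at `site_a` than at `site_b`, which is how sites with different numbers of servers can replicate the same set of handles.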
The connection between a given handle and the responsible handle service is determined by naming authority. Naming authority records are maintained by the Global Handle Registry as handles, and these handles are hashed across the Global Handle Registry sites in the same way that all other handles are hashed across their respective service sites. The only hierarchy in Handle System services is the two level distinction between a single global and all locals, which means that the worst case resolution would be that a client with no built in or cached knowledge would have to consult Global and one local.
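The two-level lookup can be sketched with plain dictionaries. Everything named here is hypothetical: `resolve`, the service name, and the dictionary layout are illustrative stand-ins for the Global Handle Registry and a local handle service, not a real client API.

```python
from typing import Dict, List, Optional, Tuple


def resolve(
    handle: str,
    global_registry: Dict[str, str],
    local_services: Dict[str, Dict[str, List[str]]],
    authority_cache: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], List[str]]:
    """Resolve a handle and report which services were consulted.

    global_registry maps a naming authority to the name of its service;
    local_services maps a service name to the handles it holds.
    """
    naming_authority = handle.partition("/")[0]
    consulted: List[str] = []

    service = None if authority_cache is None else authority_cache.get(naming_authority)
    if service is None:
        # Worst case: no built-in or cached knowledge, so ask Global first.
        consulted.append("global")
        service = global_registry[naming_authority]
        if authority_cache is not None:
            authority_cache[naming_authority] = service

    consulted.append(service)
    return local_services[service][handle], consulted
```

A first call with an empty cache consults Global and then one local service; a second call for any handle under the same naming authority goes straight to the local service.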
Another aspect of Handle System scalability is replication. The individual handle services within the Handle System each consist of one or more handle service sites, where each site replicates the complete individual handle service, at least for the purposes of handle resolution. Thus, increased demand on a given service can be met with additional sites, and increased demand on a given site can be met with additional servers. This also opens up the option, so far not implemented by any existing clients, of optimizing resolution performance by selecting the "best" server from a group of replicated servers.
Caching may also be used to improve performance and reduce the possibility of bottleneck situations in the Handle System, as is the case in many distributed systems. The Handle System data model and protocol design include space for cache time-outs, and handle caching servers have been developed and are in use.
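A cache that honors time-outs could be sketched as below. The class name `HandleCache`, the per-entry time-out in seconds, and the injectable clock are assumptions for illustration; they do not describe the existing handle caching servers.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class HandleCache:
    """Keep resolved values until their time-out passes."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so that expiry can be tested
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def put(self, handle: str, values: List[str], ttl_seconds: float) -> None:
        self._entries[handle] = (self._clock() + ttl_seconds, values)

    def get(self, handle: str) -> Optional[List[str]]:
        entry = self._entries.get(handle)
        if entry is None:
            return None
        expires_at, values = entry
        if self._clock() >= expires_at:
            del self._entries[handle]  # stale: force a fresh resolution
            return None
        return values
```

A client would call `get` first and fall back to a real resolution, followed by `put`, only when `get` returns `None`.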
Sessions reduce the authentication processing time for performing a sequence of administrative operations. They allow sharing of authentication information for multiple message exchanges between client and server. For example, a naming authority administrator may authenticate itself once through the session setup, and then register multiple handles under the same session. A batch of CREATE_HANDLE requests for a given naming authority submitted without the establishment of a session requires administrator authentication for each request. Establishing a session when the first handle in the batch is created, and using a session key for authentication for each subsequent handle, eliminates the need for multiple authentication message exchanges.
Sessions also enable encrypting transactions between the client and hosting server.
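The saving from a session can be made concrete with a toy model. This sketch only counts authentication exchanges; the class and method names are hypothetical, the HMAC-SHA256 request tag is an assumed stand-in for session-key authentication, and handing the session key straight back to the client glosses over how a real session setup would exchange it securely.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Set, Tuple


class SessionServer:
    def __init__(self) -> None:
        self.challenge_exchanges = 0          # full authentication exchanges performed
        self.handles: Set[str] = set()
        self._session_keys: Dict[str, bytes] = {}

    def _authenticate(self) -> None:
        # Stands in for a complete challenge-response exchange.
        self.challenge_exchanges += 1

    def create_handle(self, handle: str) -> None:
        """Create a handle without a session: authenticate every time."""
        self._authenticate()
        self.handles.add(handle)

    def open_session(self) -> Tuple[str, bytes]:
        """Authenticate once and hand back a session id and session key."""
        self._authenticate()
        session_id = secrets.token_hex(8)
        key = secrets.token_bytes(32)
        self._session_keys[session_id] = key
        return session_id, key

    def create_handle_in_session(self, session_id: str, handle: str, tag: bytes) -> None:
        """Create a handle inside a session: check the session-key tag only."""
        key = self._session_keys[session_id]
        expected = hmac.new(key, handle.encode("utf-8"), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            raise PermissionError("request not authenticated by session key")
        self.handles.add(handle)


def sign(key: bytes, handle: str) -> bytes:
    """Client side: tag a request with the session key."""
    return hmac.new(key, handle.encode("utf-8"), hashlib.sha256).digest()
```

In this model, creating a batch of N handles without a session costs N authentication exchanges, while the same batch inside a session costs one exchange plus a locally computed tag per request.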
The following diagram illustrates the exchanges between client and server when a client initiates a session: