An overview of technical projects at CHS:
phase 2, services using resources other than texts

Our work in phase 1 focused on developing a set of network services for working with classical texts, and for relating those texts, through the CTS Indexing conventions, to other kinds of information.

In phase 2, we have begun to add support for three other kinds of information sources:

Collections of information records

A special class of textual data is made up of collections of identically structured records. These records are likely to be managed with some kind of database system (relational, XML, or other). In a network service, the data will be represented in XML and could simply be treated as generic "textual" data. The concept of a set of records defined by query criteria, however, is valuable and applies to a wide enough range of applications to justify a specific service.

There are of course many projects that provide some kind of support for networked access to database systems. These include projects that define a persistence layer for data in relational database systems, expressed in XML. (For examples, see the Apache project's Torque persistence layer, or the Middlegen system for generating persistence layers.) Projects working with XML databases may offer direct networked access to their contents. See for example the eXist XML database system.

All of these projects provide useful code bases for working with a variety of back-end data sources, but they are not designed to cover the full range of issues a network service must address. They do not, for example, support a service's data discovery needs. The protocol of a service for collections of records will have to include methods for announcing the existence of collections and for describing their structure.

Our initial work is especially closely modelled on another persistence project, Hibernate (http://www.hibernate.org), and draws directly on ideas in Hibernate's mapping of object/relational/XML data types.
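
The discovery requirement described above (announcing which collections exist, describing their structure, and selecting records by query criteria) can be illustrated with a minimal in-memory sketch. This is not the CHS protocol: the class names, method names, and sample records are all hypothetical, and a real service would expose these operations over the network and return XML.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A set of identically structured records (hypothetical model)."""
    name: str
    description: str
    structure: dict                      # field name -> type label
    records: list = field(default_factory=list)


class CollectionService:
    """In-memory stand-in for a networked collection service."""

    def __init__(self):
        self._collections = {}

    def register(self, collection):
        self._collections[collection.name] = collection

    def list_collections(self):
        """Discovery: announce which collections exist."""
        return sorted(self._collections)

    def describe(self, name):
        """Discovery: report the record structure of one collection."""
        c = self._collections[name]
        return {"name": c.name,
                "description": c.description,
                "structure": dict(c.structure)}

    def query(self, name, criteria):
        """Return the records whose fields equal every given criterion."""
        c = self._collections[name]
        return [r for r in c.records
                if all(r.get(k) == v for k, v in criteria.items())]


# Made-up sample data, for illustration only.
service = CollectionService()
service.register(Collection(
    name="inscriptions",
    description="Sample records",
    structure={"id": "string", "site": "string", "century": "integer"},
    records=[
        {"id": "i1", "site": "Athens", "century": -5},
        {"id": "i2", "site": "Delphi", "century": -4},
        {"id": "i3", "site": "Athens", "century": -4},
    ],
))
athens = service.query("inscriptions", {"site": "Athens"})
```

A client that knows nothing about the service in advance can call list_collections() and describe() first, and only then build a query against the field names it has discovered.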

Digital images

For working with image data, a very useful set of operations has been specified by ImageJ, the NIH's image processing program. The ImageJ API defines operations covering resizing, scaling and other basic uses of image data: see http://rsb.info.nih.gov/ij/.

Parts of the ImageJ API have been repackaged as a servlet as part of the Fedora Digital Repository Management System. In its present form, the servlet operates on images identified by URL; to meet the needs of a publication service, images would have to be identified instead by a permanent, abstract identifier. Since the Fedora source code is available under an open-source license, it should be relatively easy to develop an adequate service by wrapping the Fedora service in a layer that translates abstract IDs into URLs. For the Fedora servlet interface to the ImageJ API, see http://www.fedora.info/release/1.2.1/userdocs/server/localservices/imagemanip/ .
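
The wrapping layer suggested above could be as small as a lookup that turns an abstract identifier into the URL the existing servlet expects. The following is only a sketch under stated assumptions: the servlet address, the identifier scheme, the registry contents, and the query parameter names ("url", "op", "width") are placeholders, not taken from the Fedora documentation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and registry; neither comes from the Fedora docs.
SERVLET_BASE = "http://localhost:8080/imagemanip"
IMAGE_REGISTRY = {
    "urn:example:image:0001": "http://images.example.org/store/0001.jpg",
}


def resolve(image_id, registry=IMAGE_REGISTRY):
    """Translate a permanent, abstract identifier into the image's current URL."""
    try:
        return registry[image_id]
    except KeyError:
        raise LookupError(f"unknown image identifier: {image_id}") from None


def manipulation_url(image_id, operation, **params):
    """Build a request to the URL-based servlet from an abstract identifier.

    The parameter names 'url' and 'op' are placeholders for whatever the
    wrapped servlet actually expects.
    """
    query = {"url": resolve(image_id), "op": operation}
    query.update(params)
    return SERVLET_BASE + "?" + urlencode(query)


request_url = manipulation_url("urn:example:image:0001", "resize", width=200)
```

Because clients only ever see the abstract identifier, an image can move to a new location by updating the registry, without breaking any published reference.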

GIS

For spatial or geographic information, clear and thoroughly thought-out standards for network services have been defined by the Open Geospatial Consortium. We plan to adopt the Web Map Service and Web Feature Service standards. (All OGC standards are available at http://www.opengeospatial.org/specs/?page=specs.)
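
To give a sense of what adopting the Web Map Service involves on the client side, here is a small sketch that assembles a GetMap request from the standard key-value parameters of WMS 1.1.1. The endpoint address, layer name, and bounding box are hypothetical; only the parameter names come from the WMS specification.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap request URL."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                                 # empty: default styles
        "SRS": srs,                                   # WMS 1.3.0 uses CRS instead
        "BBOX": ",".join(str(v) for v in bbox),       # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical server and layer; the bounding box is an arbitrary example.
map_url = wms_getmap_url(
    "http://maps.example.org/wms",
    layers=["ancient_sites"],
    bbox=(19.0, 34.0, 29.0, 42.0),
    width=600,
    height=480,
)
```

The server replies with a rendered map image, so a publication service can embed maps without handling the underlying spatial data itself.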

Links