
The OpenMail™ Interface Access Service

Working Document

Last Updated: January 27, 2010 


Authors:

Deborra Zukowski

Mike Swenson

Introduction

Since the emergence of the Web, application technologies that leverage the Web have undergone two distinct incarnations, and are poised for a third.  First, applications leveraged the Web to exchange documents using the HTTP and FTP protocols. Now, with the maturity of XML and Web Services, the Web can support collaborative, computational applications. And soon, web technologies will mature so that the promise envisioned by Tim Berners-Lee, that of a semantically-aware web that supports distributed/evolving knowledge and collaborative sense-making based on that knowledge, will become robust enough for use in business applications. Until then, however, business applications are leveraging the power and robustness of XML for describing information.

Over the last few months, the FMT team has been investigating how we might describe and use structured information (e.g., W3C XML Schema-based descriptions) in a way that mitigates the complexity of migrating applications that use those descriptions into applications that instead use descriptions based on semantic languages like OWL, a knowledge representation language being developed within the W3C Semantic Web Activity[1].  We have developed an API and application framework that abstracts the description of information from the use of that information. This API and framework is specifically targeted to support the migration of information represented in hierarchical XML to a triple-based representation of knowledge.

This interface is intended primarily for migrating XML descriptions to representations that use future semantic web languages. Hierarchical information representations, like XML, can be mapped to directed, acyclic graphs (DAG), e.g., DOM trees. Arbitrary triple-based representations can introduce cycles, e.g., an OWL representation can include a property "hasFriend" that could loop back to an originating node.  At this point, we are assuming that the represented information can be cast into a DAG, which limits the general application of the framework when migrating from OWL into XML.

Motivation

The OpenMail™ program is a reframing of the Mailstream in the context of people, work, time, and place. It focuses on engaging user, developer, and service provider communities to work together, building a set of practices and standards for announcing, accessing, and using available Mailstream products and services in a simpler and more effective manner. Envision a world where mailers could request carrier products based on what they need to do and their preferences on how to accomplish it, as shown in Figure 1 by the Service Request arc. Carriers could create or customize services and announce the updated services (via the ePPML/CPML arc). The mailer's mailing system would automatically adapt as needed to help the mailer carry out the requirements needed for using their requested service. This is a part of the vision underlying the OpenMail™ program.

Figure 1: OpenMail™ participants and examples of information exchanged amongst them.

As shown in Figure 1, the definition and standardization of information exchanged amongst participants is a core feature needed to accomplish the program. For example, Figure 1 shows that carriers provide information conforming to the ePPML[2]/CPML interface when describing a type of postal service and the requirements needed for mailers to use the service.  Equipment manufacturers need to be able to consume such documents, as automatically as possible, to ensure that their products (i.e., mailing systems used by the mailers) can support the new products in a timely manner.

At this time, information interfaces are defined using XML Schema[3], and our current tools and infrastructure rely on the use of XML technology. However, as the OpenMail™ program continues into the future, we envision new interface documents leveraging the power of the W3C Semantic Web Technology Stack[1], where semantic knowledge also becomes a part of the definition. In addition, we fully expect that some of our initial interfaces will also migrate to semantic web technologies, like OWL. As we go forward, we want to ensure that we can build infrastructure, tools, and prototypes that can weather such migration. The proposed Information Access API and Framework is our first prototype for designing representation-agnostic applications. We will be using the API in our rating and provisioning prototypes to better learn how to improve the API for use in such business applications.

Underlying Concepts

The OpenMail™ Interface Access Framework is heavily derived from the Device Description Repository Simple API (DDR) [4] standard recommended by the W3C Device Description Working Group in December 2008. The API provides a means to identify property names and access associated values, using a service object that supports simple “getPropertyValue” methods. It is grounded in several underlying concepts, including the notion of evidence, aspects (and correspondingly, unique property references) and property values.

·         Evidence provides sufficient information to resolve to a particular description file that adheres to the vocabulary (e.g., a description file for a specific device).

·         Aspect identifies a part of the namespace(s) that has to do with a specific kind of information. For example, a namespace that supports describing mobile devices may describe the different aspects of a mobile device, e.g., hardware, software, connectivity, etc.  These aspects may be a part of the top-level namespace or may be defined with specialized namespaces. Operationally, an aspect defines a scoping of property names for which uniqueness can be guaranteed.

·         Property Reference wraps a Property Name object (that includes both namespace and local name), with an aspect name to provide unique naming.

·         Data Property Value provides explicit type transformation, though only for simple XML types, including String, Float, Double, Integer, Long, Enumeration, and Boolean. This means that the API is only useful for accessing data property values, i.e., leaf nodes of the description tree.

·         Object Property Value provides access handles to intermediate nodes of the graph.


Extensions to the DDR API

The Information Access API is an extension of the DDR API in the following ways: 

·         The DDR focuses on accessing leaf nodes only. The OpenMail™ Information Access API supports accessing values for both leaf nodes (data properties) and intermediate nodes (object properties).

·         Many times, there is some information that cannot be uniquely named, even within an aspect. Often, this information is more "information about the information," (i.e., metadata), like name, vendor, and id. The OpenMail™ Information Access API allows both data property values and object property values to include metadata tags that can be accessed directly through the returned value.

·         The DDR provides an access paradigm based on a unique property name. Oftentimes, we may want to access a part of the information based on information about it, e.g., we may want to access a particular connectivity object value based on the type of protocol it supports. We've extended the access methods to include finding a value based on specific sub-property or metadata values.

·         Since the DDR assumed unique property names, the definitions of information could not use recursive structures. In our OpenMail™ interfaces, we have experienced several places where information is best described using structural recursion, e.g., we can have a postal product that defines a pallet containing boxes (also postal products) that contain letters (also postal products). To support structural recursion, we've augmented the property access mechanism with an XPath-like reference node. That is, while DDR property access methods always search from the root of a selected document, we've added the ability to search from a referenced node, specifically a previously retrieved Object Property Value.

·         The DDR was written assuming a large database of documents. It uses the concept of Evidence to help select one document from many. Sometimes, we know a priori which document to use. Therefore, the Framework provides a way to explicitly provide and cache a document.

These extensions are called out in the Design and Implementation section, below.

Design and Implementation

This section lists the DDR interfaces and our extensions. All interfaces that have package names starting with org.w3c.ddr.simple have been downloaded from the W3C, and included in the OpenMail™ Information Access API as is. All other interfaces are extensions added as part of our OpenMail™ E3[1] infrastructure.   

Evidence

The Evidence interface defines a simple name/value mechanism for identifying the appropriate document from a set of managed documents.

org.w3c.ddr.simple.Evidence

public void put(String key, String value);

public boolean exists(String key);

public String get(String key);
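
As a minimal sketch of the name/value mechanism, the three Evidence methods can be backed by a simple map. The nested Evidence interface below only mirrors the signatures listed above, and MapEvidence, the "documentId" key, and its value are hypothetical names used for illustration; this is not the DDR reference code or the E3 implementation.

```java
import java.util.HashMap;
import java.util.Map;

class EvidenceSketch {
    /** Mirrors the three methods listed for org.w3c.ddr.simple.Evidence. */
    interface Evidence {
        void put(String key, String value);
        boolean exists(String key);
        String get(String key);
    }

    /** Hypothetical in-memory implementation, for illustration only. */
    static final class MapEvidence implements Evidence {
        private final Map<String, String> entries = new HashMap<>();

        @Override public void put(String key, String value) { entries.put(key, value); }
        @Override public boolean exists(String key) { return entries.containsKey(key); }
        @Override public String get(String key) { return entries.get(key); }
    }

    public static void main(String[] args) {
        Evidence evidence = new MapEvidence();
        // Hypothetical key/value pair that a repository could use to pick one document.
        evidence.put("documentId", "rate-chart");
        System.out.println(evidence.exists("documentId") + " " + evidence.get("documentId"));
    }
}
```

A repository would then match the stored pairs against the documents it manages; how that matching is done is left to the implementation.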

Properties

The RDF/OWL semantic web language supports two types of properties, data properties and object properties. DDR supports a simple interface for describing a given property called PropertyName. It augments this interface with the aspect name, ensuring that the property names are uniquely identifiable.  Default values can be used for both namespaces and aspects, to simplify the use of the API for documents that include only a single namespace and core aspect. The PropertyName and PropertyRef interfaces define objects that provide simple access to the parts of the property. 

org.w3c.ddr.simple.PropertyName

public String getLocalPropertyName();

public String getNamespace();

org.w3c.ddr.simple.PropertyRef

public String getLocalPropertyName();

public String getAspectName();

public String getNamespace();
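
The following sketch illustrates how a namespace, an aspect name, and a local name together can identify a property, and how defaults can shorten the common case. SimplePropertyRef, DEFAULT_NAMESPACE, and DEFAULT_ASPECT are hypothetical names with made-up values; in the DDR the defaults are established by the service, not by constants like these.

```java
import java.util.Objects;

class PropertyRefSketch {
    // Hypothetical defaults, chosen only for this illustration.
    static final String DEFAULT_NAMESPACE = "http://example.org/vocab";
    static final String DEFAULT_ASPECT = "core";

    /** Immutable reference: namespace + aspect + local name identify one property. */
    static final class SimplePropertyRef {
        private final String namespace;
        private final String aspectName;
        private final String localName;

        /** Uses the default namespace and aspect. */
        SimplePropertyRef(String localName) {
            this(DEFAULT_NAMESPACE, DEFAULT_ASPECT, localName);
        }

        SimplePropertyRef(String namespace, String aspectName, String localName) {
            this.namespace = Objects.requireNonNull(namespace);
            this.aspectName = Objects.requireNonNull(aspectName);
            this.localName = Objects.requireNonNull(localName);
        }

        public String getLocalPropertyName() { return localName; }
        public String getAspectName() { return aspectName; }
        public String getNamespace() { return namespace; }

        @Override public boolean equals(Object other) {
            if (!(other instanceof SimplePropertyRef)) return false;
            SimplePropertyRef that = (SimplePropertyRef) other;
            return namespace.equals(that.namespace)
                    && aspectName.equals(that.aspectName)
                    && localName.equals(that.localName);
        }

        @Override public int hashCode() { return Objects.hash(namespace, aspectName, localName); }
    }

    public static void main(String[] args) {
        // The same local name in two aspects yields two distinct references.
        SimplePropertyRef hardwareName = new SimplePropertyRef(DEFAULT_NAMESPACE, "hardware", "name");
        SimplePropertyRef softwareName = new SimplePropertyRef(DEFAULT_NAMESPACE, "software", "name");
        System.out.println(hardwareName.equals(softwareName)); // prints "false"
    }
}
```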

Data/Leaf Nodes

A single interface is used to provide access to all data properties. The interface supports translation to the expected type of the accessed data property. We extended the DDR PropertyValue interface to support a way to store and access metadata for properties. Also, we've renamed the DDR PropertyValue to the E3 DataPropertyValue to help differentiate it from the ObjectPropertyValue, below.

org.w3c.ddr.simple.PropertyValue

public double getDouble() throws ValueException;

public boolean getBoolean() throws ValueException;

public int getInteger() throws ValueException;

public String[] getEnumeration() throws ValueException;

public PropertyRef getPropertyRef();

public String getString() throws ValueException;

public boolean exists();

com.pb.act.e3.repository.DataPropertyValue

Often, information is annotated with information about the information, something called metadata. In OWL, this could be class inheritance or property constraints. In XML, the definition of metadata is often left to the information author. Examples of metadata could include IDs, measurement units, etc. In OpenMail™, we use XML attributes for describing metadata. Should we convert to OWL, we'd still like to be able to differentiate between the actual information and its metadata, and so would recommend that some properties be "tagged" as metadata in the API implementation, enabling this method to provide application developers a consistent conceptual model.

public String getMetaData(String attributeName);

public String getMetaData(String namespaceURI, String attributeName);
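
To make the combination of type translation and metadata access concrete, here is a small sketch of a value object that holds the raw text of a leaf node plus its attributes. SimpleDataPropertyValue, withMetaData, and the nested ValueException are hypothetical stand-ins, and the "unit" attribute is an invented example; the actual E3 DataPropertyValue may be structured quite differently.

```java
import java.util.HashMap;
import java.util.Map;

class DataPropertyValueSketch {
    /** Stand-in for the DDR ValueException, reduced to what this sketch needs. */
    static class ValueException extends Exception {
        ValueException(String message, Throwable cause) { super(message, cause); }
    }

    /** Hypothetical leaf value: raw text plus attribute-style metadata. */
    static final class SimpleDataPropertyValue {
        private final String raw; // text of the leaf node, or null when absent
        private final Map<String, String> metaData = new HashMap<>();

        SimpleDataPropertyValue(String raw) { this.raw = raw; }

        SimpleDataPropertyValue withMetaData(String attributeName, String value) {
            metaData.put(attributeName, value);
            return this;
        }

        public boolean exists() { return raw != null; }

        public String getString() throws ValueException {
            if (raw == null) throw new ValueException("property has no value", null);
            return raw;
        }

        public double getDouble() throws ValueException {
            try {
                return Double.parseDouble(getString().trim());
            } catch (NumberFormatException e) {
                throw new ValueException("not a double: " + raw, e);
            }
        }

        public int getInteger() throws ValueException {
            try {
                return Integer.parseInt(getString().trim());
            } catch (NumberFormatException e) {
                throw new ValueException("not an integer: " + raw, e);
            }
        }

        /** Returns the metadata value, or null when the attribute is not present. */
        public String getMetaData(String attributeName) { return metaData.get(attributeName); }
    }

    public static void main(String[] args) throws ValueException {
        SimpleDataPropertyValue weight = new SimpleDataPropertyValue("12.5").withMetaData("unit", "oz");
        System.out.println(weight.getDouble() + " " + weight.getMetaData("unit")); // prints "12.5 oz"
    }
}
```

Keeping the metadata on the returned value means a caller can read the number and its unit from a single object, without a second lookup by name.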

org.w3c.ddr.simple.PropertyValues

This interface is not yet supported.

public PropertyValue[] getAll();

public PropertyValue getValue(PropertyRef prop) throws NameException;

Objects/Intermediate Nodes

We expanded the DDR API to support access to and use of internal XML element nodes and OWL individuals. For XML, the notion of object properties is implicit within the hierarchical structure of the documents. The object property values are equivalent to the DOM elements. In OWL, these elements would be modeled with instances, with the property being "has<elementName>." 

To access object property values, we leveraged the PropertyName and PropertyRef interfaces for uniquely naming a node of interest, and added an interface for holding the node information. This interface is similar to the PropertyValue class. It provides a way to get the original naming information, though it does not perform type translation. Instead, it supports incremental navigation by storing a reference to the service that provided it (see below).  

Note that this interface is premised on an underlying tree-based data representation. The underlying information model for OWL is triples. This interface will work with OWL description files, provided that the information represented can be cast to a directed, acyclic graph (DAG). Transforming XML instance documents to OWL documents meets this constraint.
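
One way to test the DAG assumption before handing triple-based data to a tree-oriented interface is a depth-first search for cycles over the subject-to-object edges. The sketch below is an illustration only: it represents each triple as a plain String array of subject, predicate, and object, and isAcyclic is a hypothetical helper that is not part of the API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DagCheckSketch {
    /** Returns true when the subject -> object edges of the triples contain no cycle. */
    static boolean isAcyclic(List<String[]> triples) {
        Map<String, List<String>> edges = new HashMap<>();
        for (String[] triple : triples) {
            edges.computeIfAbsent(triple[0], k -> new ArrayList<>()).add(triple[2]);
        }
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (hasCycle(node, edges, onPath, visited)) return false;
        }
        return true;
    }

    private static boolean hasCycle(String node, Map<String, List<String>> edges,
                                    Set<String> onPath, Set<String> visited) {
        if (onPath.contains(node)) return true;  // reached a node already on the current path
        if (!visited.add(node)) return false;    // explored earlier without finding a cycle
        onPath.add(node);
        for (String next : edges.getOrDefault(node, Collections.<String>emptyList())) {
            if (hasCycle(next, edges, onPath, visited)) return true;
        }
        onPath.remove(node);
        return false;
    }

    public static void main(String[] args) {
        List<String[]> containment = Arrays.asList(
                new String[] {"pallet", "hasBox", "box1"},
                new String[] {"box1", "hasLetter", "letter1"});
        List<String[]> friends = Arrays.asList(
                new String[] {"alice", "hasFriend", "bob"},
                new String[] {"bob", "hasFriend", "alice"});
        System.out.println(isAcyclic(containment)); // prints "true"
        System.out.println(isAcyclic(friends));     // prints "false"
    }
}
```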

com.pb.act.e3.repository.ObjectPropertyValue

public String getAspectName();

public String getLocalName();

public String getNamespace();

public String getMetaData(String attributeName);

public String getMetaData(String namespaceURI, String attributeName);

public String getTextContext();

public ArrayList<PropertyName> listChildNames() throws NameException;

public E3Service getE3Service();

Access Service

The Service interface is the workhorse of the DDR. Service objects are created for this interface using a Factory pattern. This interface provides methods for building PropertyName and PropertyRef objects and specialized HTTP-based Evidence objects. It also provides the methods for using those objects to access data from the document. More details about this interface and the associated Factory implementation are provided in the DDR code. Note that when the DDR uses the term vocabulary, it maps to the XML notion of a namespace. Also, the E3 implementation of the DDR Service API returns E3 DataPropertyValue objects, which extend the DDR PropertyValue. Access to the augmented interface is gained by casting the PropertyValue returned by the DDR interface methods to DataPropertyValue.

org.w3c.ddr.simple.Service

public void initialize(String defaultVocabularyIRI, Properties props)

      throws NameException,InitializationException;

public String getImplementationVersion();

public String getDataVersion();

public PropertyRef[] listPropertyRefs();

public PropertyValue getPropertyValue(Evidence evidence, PropertyRef propertyRef)

      throws NameException, ValueException;

public PropertyValue getPropertyValue(Evidence evidence, PropertyName propertyName)

      throws NameException, ValueException;

public PropertyValue getPropertyValue(Evidence evidence, String localPropertyName)

      throws NameException, ValueException;

public PropertyValue getPropertyValue(Evidence evidence, String localPropertyName,

      String localAspectName, String vocabularyIRI)

      throws NameException, ValueException;

public PropertyValues getPropertyValues(Evidence evidence)

      throws NameException;

public PropertyValues getPropertyValues(Evidence evidence, PropertyRef[] propertyRefs)

      throws NameException;

public PropertyValues getPropertyValues(Evidence evidence, String localAspectName)

      throws NameException;

public PropertyValues getPropertyValues(Evidence evidence, String localAspectName,

      String vocabularyIRI)

      throws NameException;

public PropertyName newPropertyName(String localPropertyName)

      throws NameException;

public PropertyName newPropertyName(String localPropertyName, String vocabularyIRI)

      throws NameException;

public PropertyRef newPropertyRef(String localPropertyName)

      throws NameException;

public PropertyRef newPropertyRef(PropertyName propertyName)

      throws NameException;

public PropertyRef newPropertyRef(PropertyName propertyName, String localAspectName)

      throws NameException;

public Evidence newHTTPEvidence();

public Evidence newHTTPEvidence(Map<String,String> map);

 

com.pb.act.e3.repository.E3Service

The E3Service interface provides methods for constructing Evidence based on E3 assumptions, rather than the expected HTTP-based evidence.

public Evidence newE3Evidence();

public Evidence newE3Evidence(Map<String,String> map);

The E3Service interface provides methods for accessing internal nodes, similar to the methods used for accessing data properties.  A second method that limits search depth was added to better tolerate recursive structures. For example, the searchDepth argument can be set to access object properties within a bounded depth of the document root. Two constants are provided for “shortcuts” to SHALLOW and DEEP searching.

public ArrayList<ObjectPropertyValue> getObjectPropertyValues(Evidence evidence,

      PropertyName objectPropertyValueName, String localAspectName)

      throws NameException, ValueException;

public ArrayList<ObjectPropertyValue> getObjectPropertyValues(Evidence evidence,

      PropertyName objectPropertyValueName, String localAspectName, int searchDepth)

      throws NameException, ValueException;
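
The effect of a search-depth limit on a recursive structure can be sketched directly on a DOM tree. The code below is not the E3 implementation: find and parse are hypothetical helpers, the values chosen for SHALLOW and DEEP are assumptions, and the nested product document is invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DepthSearchSketch {
    // Assumed values for the two shortcut constants; the real ones may differ.
    static final int SHALLOW = 1;
    static final int DEEP = Integer.MAX_VALUE;

    /** Collects descendants of start with the given name, at most maxDepth levels below it. */
    static List<Element> find(Element start, String name, int maxDepth) {
        List<Element> found = new ArrayList<>();
        collect(start, name, maxDepth, found);
        return found;
    }

    private static void collect(Element parent, String name, int remaining, List<Element> found) {
        if (remaining <= 0) return;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element child = (Element) n;
                if (name.equals(child.getTagName())) found.add(child);
                collect(child, name, remaining - 1, found);
            }
        }
    }

    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // A pallet that contains a box that contains a letter, all modeled as products.
        Element pallet = parse(
                "<product id='pallet'><product id='box'><product id='letter'/></product></product>");
        System.out.println(find(pallet, "product", SHALLOW).size()); // prints "1" (the box only)
        System.out.println(find(pallet, "product", DEEP).size());    // prints "2" (box and letter)
    }
}
```

With an unbounded search, every nested product is returned; the bounded search returns only the products directly contained in the starting node, which is usually what a caller walking a containment hierarchy wants.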

More powerful filter methods were added, based on our OpenMail™ experience. We have found that we often need to find a specific set of object property values (frequently of size 1) based on the value of one of their sub-properties or metadata.

public ArrayList<ObjectPropertyValue> getObjectPropertyValuesByPropertyReference(

      Evidence evidence, PropertyName objectPropertyValueName,

      PropertyName referencePropertyName, String referencePropertyValue,

      String localAspectName)

      throws NameException, ValueException;

public ArrayList<ObjectPropertyValue> getObjectPropertyValuesByMetaData(

      Evidence evidence, PropertyName objectPropertyValueName,

      PropertyName attributeName, String attributeValue,

      String localAspectName)

      throws NameException, ValueException;
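
The two filter styles can be illustrated on a plain DOM tree, with XML attributes standing in for metadata and child elements standing in for sub-properties. This is a sketch under those assumptions: byMetaData, byPropertyReference, and parse are hypothetical helpers that take plain strings rather than PropertyName objects, and the device document is invented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FilterSketch {
    /** Elements with the given name whose attribute (metadata) equals the given value. */
    static List<Element> byMetaData(Element root, String name, String attribute, String value) {
        List<Element> matches = new ArrayList<>();
        NodeList candidates = root.getElementsByTagName(name);
        for (int i = 0; i < candidates.getLength(); i++) {
            Element candidate = (Element) candidates.item(i);
            if (value.equals(candidate.getAttribute(attribute))) matches.add(candidate);
        }
        return matches;
    }

    /** Elements with the given name that have a direct child (sub-property) with the given text. */
    static List<Element> byPropertyReference(Element root, String name, String child, String value) {
        List<Element> matches = new ArrayList<>();
        NodeList candidates = root.getElementsByTagName(name);
        for (int i = 0; i < candidates.getLength(); i++) {
            Element candidate = (Element) candidates.item(i);
            for (Node n = candidate.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element && child.equals(n.getNodeName())
                        && value.equals(n.getTextContent().trim())) {
                    matches.add(candidate);
                    break;
                }
            }
        }
        return matches;
    }

    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element device = parse("<device>"
                + "<connectivity vendor='acme'><protocol>bluetooth</protocol></connectivity>"
                + "<connectivity vendor='other'><protocol>wifi</protocol></connectivity>"
                + "</device>");
        System.out.println(byMetaData(device, "connectivity", "vendor", "acme").size());            // prints "1"
        System.out.println(byPropertyReference(device, "connectivity", "protocol", "wifi").size()); // prints "1"
    }
}
```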

Support for accessing data and object properties from a reference node has also been added. This support is especially useful when sub-properties of a specific element are needed. Two calls can then be made to reach a specific property of a specific element, e.g., one to get the element that has a specific network protocol, and one to get that element's preferred network address. These methods are comparable to existing methods, with a modification of the first argument from Evidence to the referenced node.

public PropertyValue getPropertyValue(ObjectPropertyValue refObjectPropertyValue,

      PropertyName propName)

      throws NameException, ValueException;

public ArrayList<ObjectPropertyValue> getObjectPropertyValues(

      ObjectPropertyValue objectPropertyValue, PropertyName refObjectPropertyValueName)

      throws NameException, ValueException;

public ArrayList<ObjectPropertyValue> getObjectPropertyValues(

      ObjectPropertyValue refObjectPropertyValue, PropertyName objectPropertyValueName,

      int searchDepth)

      throws NameException, ValueException;

public ArrayList<ObjectPropertyValue> getObjectPropertyValuesByPropertyReference(

      ObjectPropertyValue refObjectPropertyValue, PropertyName objectPropertyValueName,

      PropertyName referencePropertyName, String referencePropertyValue)

      throws NameException, ValueException;

public ArrayList<ObjectPropertyValue> getObjectPropertyValuesByMetaData(

      ObjectPropertyValue refObjectPropertyValue, PropertyName objectPropertyValueName,

      PropertyName attributeName, String attributeValue)

      throws NameException, ValueException;
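
The two-call pattern described above (first locate an element, then read one of its sub-properties relative to that element) can be sketched on a DOM tree as follows. The helpers childByProperty and propertyValue are hypothetical and use plain strings and DOM elements in place of PropertyName and ObjectPropertyValue; the device document and its element names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ReferenceNodeSketch {
    /** Text of the named direct child of the reference node, or null when absent. */
    static String propertyValue(Element ref, String propertyName) {
        for (Node n = ref.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && propertyName.equals(n.getNodeName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    /** First child of ref with the given name whose sub-property has the given value, or null. */
    static Element childByProperty(Element ref, String name, String propertyName, String value) {
        for (Node n = ref.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())
                    && value.equals(propertyValue((Element) n, propertyName))) {
                return (Element) n;
            }
        }
        return null;
    }

    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element device = parse("<device>"
                + "<connectivity><protocol>bluetooth</protocol>"
                + "<preferredAddress>00:11:22:33:44:55</preferredAddress></connectivity>"
                + "<connectivity><protocol>wifi</protocol>"
                + "<preferredAddress>10.0.0.7</preferredAddress></connectivity>"
                + "</device>");
        // Call 1: find the element that supports the wifi protocol.
        Element wifi = childByProperty(device, "connectivity", "protocol", "wifi");
        // Call 2: read a sub-property relative to that element, not the document root.
        System.out.println(propertyValue(wifi, "preferredAddress")); // prints "10.0.0.7"
    }
}
```

Because both connectivity elements carry a preferredAddress, a lookup by name from the document root would be ambiguous; starting from the reference node removes that ambiguity.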

Examples

Please excuse…  This section needs to be redone with externally available schemas.

 

To use this API, each interface must outline the supported aspects, properties and metadata. This section uses the ### as an example. 

Initialization

The initialization of the framework expects input information to be passed via a properties file, as shown here:

E3Service.properties

The properties file contains at least two entries, one that indicates where the document comes from and the other(s) that provide the class name for a Parser that provides the top nodes for all aspects within a namespace. The framework can access documents one of two ways. First, it can access a document directly, as shown above. Alternatively, it can access the document through a document proxy, e.g., a J2EE session bean. If the latter, then the crudRef property would be set to the name of the session bean, without the ‘file:’ prefix. When a document is accessed directly, that document is automatically cached within the framework. Otherwise, document caching must be explicitly set.

The following code illustrates the factory-based initialization of the framework. 

private static final String ###_NAMESPACE = "http://www.openmail.org/###";

 

// code block showing how to construct a service object

E3Service e3service;

try {

    e3service = (E3Service)E3ServiceFactory.newService(

                     "com.pb.act.e3.repository.E3ServiceImpl",###_NAMESPACE,props);

    // information access happens here....

} catch (InitializationException e) {

    e.printStackTrace();

} catch (NameException e) {

    e.printStackTrace();

} catch (ValueException e) {

    e.printStackTrace();

}  

 

The factory requires the class name for the E3Service implementation, the namespace to set as default, and the properties file described above.

Accessing and Using Data Property Values

Accessing and Using Object Property Values

Handling Recursive Structures

Bibliography

[1] W3C Semantic Web Activity - http://www.w3.org/2001/sw/Activity.html

[2] UPU ePPML Standard - http://www.upu.int/document/2008/an/cep_c_4_gn_ep_3-4/src/d008_ad00_an00_p00_r00.pdf

[3] W3C XML Activity - http://www.w3.org/XML/Activity.html

[4] The Device Description Repository - http://www.w3.org/TR/2008/REC-DDR-Simple-API-20081205/

[5] RDML Documentation - http://act-e3-openmail/rating

 



[1] E3 stands for ePPML enabled ecosystem, a phrase that represented a focus early in our work.  We use it to represent infrastructure and tools.