Simulation Data Access Protocol (SimDAP) Draft

This page is for outlining the initial draft of the SimDAP Note. Once we've reached agreement on the outline, we will move content to an HTML version in the volute repository.

The current outline is in; I'll try to flesh it out over the weekend.

-- RickWagner - 14 Nov 2008

1. Introduction

2. SimDAP Data Model

3. Interface Overview

3.1 Architecture

A SimDAP service provides access to initial or derived files from simulations.

3.2 Service Operations

A SimDAP service implements multiple service operations, each of which performs some well-defined function when invoked by a client application. The service operations described here use HTTP GET and POST as the low-level communications protocol. The functionality of each operation is defined independently of the low-level communications protocol, and semantically equivalent operations could be implemented in the future via other protocols.

SimDAP defines the following standard service operations:

GetCapabilities
Return a standardized XML description of the capabilities of the service instance, describing what the service is capable of doing (VOSI compliant, registry cacheable and searchable).
GetAvailability
Return a standardized XML description of the runtime status of the service, describing the state and availability of the service (VOSI compliant).
ListExperiments
Return a list of the experiments served by this SimDAP instance.
ListSnapshots
For a given experiment, list the available snapshots.
QueryData
The QueryData operation requires only the basic EXPERIMENT_ID and SNAP_ID parameters to work. The FIELDS parameter may be used if the corresponding field selection service is available (otherwise it is discarded). A missing or blank FIELDS parameter is interpreted as a request to download all available fields. If FIELDS names unavailable quantities, the corresponding request is discarded.
Preview
The preview can be implemented in different ways, depending on the specific data involved. In all cases, if the operation is supported, a getPreview method MUST be implemented. The input of this method is the basic pair EXPERIMENT_ID and SNAP_ID. The FIELDS parameter may be used to specify which fields to preview (if supported, otherwise it is discarded). A missing or blank FIELDS parameter is interpreted as a request to preview all available fields. If FIELDS names unavailable quantities, the corresponding request is discarded. If the cutout service is available, the preview service MUST provide a means to select the fields of interest and the cutout region. This will result in setting the parameters presented in Section 5.3.
Cutout
The goal of the cutout service is to select and extract a sub-volume of data from a given snapshot. The operation applies to a single snapshot. Cutouts spanning multiple sources, such as several time steps of the same simulation, are not supported by the protocol. Implementing them is left to the client, for example as a sequence of requests with the same subbox and fields but different datasets.
Custom
A service-defined custom operation.
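The FIELDS handling described for QueryData and Preview above can be sketched as follows. This is only an illustration under stated assumptions: the function name resolve_fields is hypothetical, and the draft does not say how multiple field names are separated, so a comma-separated list is assumed.

```python
from typing import Iterable, List, Optional


def resolve_fields(fields_param: Optional[str],
                   available: Iterable[str]) -> Optional[List[str]]:
    """Return the fields to serve, or None if the request is discarded."""
    available_list = list(available)
    # A missing or blank FIELDS parameter means "all available fields".
    if fields_param is None or fields_param.strip() == "":
        return available_list
    # Assumption: field names are comma-separated (not stated in the draft).
    requested = [name.strip() for name in fields_param.split(",") if name.strip()]
    if not requested:
        return available_list
    # A request naming any unavailable quantity is discarded as a whole.
    if any(name not in available_list for name in requested):
        return None
    return requested
```

A service that does not support field selection would skip this step and ignore FIELDS entirely.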

3.3 Service Profile

The basic form of a SimDAP service is specified in detail in section ??. In the current section we merely summarize the elements of the basic service interface.

3.3.1 Request Format

A service may implement multiple service operations, such as download or preview; these define the service interface. Interfaces may change with time and hence are versioned. It is possible for a given service instance to simultaneously expose multiple interfaces or versions of interfaces.

The SimDAP interface described in this document is based upon a distributed computing platform (DCP) comprising Internet hosts that support the Hypertext Transfer Protocol (HTTP). Thus, the online representation of each operation supported by a service is composed as an HTTP Uniform Resource Locator (URL).

A request URL is formed by concatenating a baseURL with zero or more operation-defined request parameters. The baseURL defines the network address to which request messages are to be sent for a particular operation of a particular service instance on a particular server. Service operations generally share the same baseURL but this is not required.

Example:

    $baseURL/sync?REQUEST=download&EXPERIMENT=...

SimDAP defines two versions of the baseURL, one for synchronous operations and another for asynchronous operations. These are formed by concatenating the service-baseURL with either /sync? or /async?. Hence for synchronous operations we have a full baseURL of

    <service-baseURL>/sync?

and for asynchronous operations the full baseURL is

    <service-baseURL>/async?

In general the service operation is much the same whether it executes synchronously or asynchronously. Minor differences in service operation function or input parameters are pointed out in the description of each individual service operation below.

Note that since a URI pathname segment is appended to the service baseURL, the service baseURL may not contain any HTTP GET parameters and must be a fixed URI.
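As an illustration of the rules in this section, a client might compose request URLs roughly as below. The function name build_request_url and the example host are hypothetical; the sketch only shows the concatenation of a fixed service baseURL, the /sync? or /async? segment, and the request parameters.

```python
from urllib.parse import urlencode


def build_request_url(service_base_url: str, params: dict,
                      asynchronous: bool = False) -> str:
    """Compose a request URL from a service baseURL and request parameters."""
    # The service baseURL must be a fixed URI without GET parameters.
    if "?" in service_base_url:
        raise ValueError("service baseURL must not contain GET parameters")
    segment = "/async?" if asynchronous else "/sync?"
    # urlencode also applies URL encoding to parameter names and values.
    return service_base_url.rstrip("/") + segment + urlencode(params)
```

For example, build_request_url("http://example.org/simdap", {"REQUEST": "download", "EXPERIMENT": "exp1"}) yields a URL ending in /sync?REQUEST=download&EXPERIMENT=exp1.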

3.3.2 Parameters

Parameters may appear in any order. If the same parameter appears multiple times in a request, the behavior of the operation is undefined (if alternate values for a parameter are desired, the range-list syntax may be used instead). Parameter names are case-insensitive. Parameter values are case-sensitive unless defined otherwise in the description of an individual parameter.
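A service-side sketch of these rules is given below. The function name parse_parameters is hypothetical. Since the draft leaves repeated parameters undefined, rejecting them is a choice made for this example, not a requirement of the protocol.

```python
from urllib.parse import parse_qsl


def parse_parameters(query_string: str) -> dict:
    """Parse a query string into a dict keyed by upper-cased parameter name."""
    params = {}
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        key = name.upper()  # parameter names are case-insensitive
        if key in params:
            # Behavior is undefined by the draft; this sketch chooses to reject.
            raise ValueError(f"parameter {key} appears more than once")
        params[key] = value  # values remain case-sensitive
    return params
```

Because the result is a dictionary, the order in which parameters appear in the request does not matter.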

All service operations define the following standard parameters, which are part of the basic service profile:

REQUEST
The request or operation name (mandatory).
VERSION
The version number of the interface (optional).

The REQUEST parameter specifies the service operation to be executed. VERSION allows a specific version of the interface to be requested. The values of both the REQUEST and VERSION parameters are case-insensitive.

A given service instance may support multiple versions of the SimDAP interface, and by default the service assumes the highest standard version which is implemented (access to any experimental versions supported by a service requires explicit specification of the version by the client). Explicit specification of the interface version assumed by the client is necessary to ensure against a runtime version mismatch, e.g., if the client caches the service endpoint but a newer version of the service is subsequently deployed. If desired the client can omit the VERSION parameter to disable runtime version checking, and default to the highest version standard interface implemented by the service.
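The version selection described above might be sketched as follows. The names select_version and _version_key are hypothetical, and the sketch assumes that standard versions are dotted integers such as "1.0", which this draft does not specify.

```python
from typing import Iterable, Optional, Tuple


def _version_key(version: str) -> Tuple[int, ...]:
    # Assumption: standard versions are dotted integers such as "1.0".
    return tuple(int(part) for part in version.split("."))


def select_version(requested: Optional[str],
                   standard: Iterable[str],
                   experimental: Iterable[str] = ()) -> str:
    """Pick the interface version used to serve a request."""
    standard_list = list(standard)
    if requested is None:
        # No VERSION given: default to the highest standard version.
        return max(standard_list, key=_version_key)
    wanted = requested.strip().lower()  # VERSION values are case-insensitive
    for version in standard_list + list(experimental):
        if version.lower() == wanted:
            return version
    # Explicit VERSION that is not implemented: report a mismatch.
    raise ValueError(f"unsupported interface version: {requested}")
```

Experimental versions are reachable only through an explicit VERSION value, which matches the default described in the text.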

All other request parameters are defined separately for each operation.

3.3.3 Parameter Values

Integer numbers are represented as defined in the specification of integers in XML Schema Datatypes. Real numbers are represented as specified for double precision numbers in XML Schema Datatypes. Sexagesimal formatting is not permitted, either for parameter input or in formal output metadata, other than in ISO 8601 formatted time strings (sexagesimal format is permitted in any informal output intended for a human, e.g., text or HTML formatted tables).

SimDAP defines a special range-list format for specifying numerical ranges or lists of ranges as parameter values. For example, 1E-7/3E-6 specifies a closed range from 1E-7 to 3E-6 inclusive. The syntax supports both open and closed ranges. Ranges or range lists are permitted only when explicitly indicated in the definition of an individual parameter. A variant of the range list is the value of the WHERE parameter, used to specify the query constraint for a ParamQuery operation. For a full description of range list syntax refer to section 3.3.1.
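A minimal sketch of a range-list parser follows. Only the closed range form 1E-7/3E-6 is taken from the text above. The remaining conventions are assumptions for illustration: that an omitted endpoint denotes an open range, that list items are separated by commas, and that a bare number is a single-value range. The name parse_range_list is hypothetical.

```python
from typing import List, Optional, Tuple

# (low, high); None marks an open end.
Range = Tuple[Optional[float], Optional[float]]


def parse_range_list(text: str) -> List[Range]:
    """Parse a range-list string into (low, high) pairs."""
    ranges: List[Range] = []
    for item in text.split(","):  # assumption: comma separates list items
        item = item.strip()
        if "/" in item:
            low, high = item.split("/", 1)
            # Assumption: an omitted endpoint denotes an open range.
            ranges.append((float(low) if low else None,
                           float(high) if high else None))
        else:
            value = float(item)
            ranges.append((value, value))  # single value as a degenerate range
    return ranges
```

The endpoints are parsed with float, which accepts the exponent notation used in the example.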

3.3.4 Use of GET and POST

Where specified, individual service operations may provide both HTTP GET and POST forms for issuing the service request. Both forms share the same input parameters and operation semantics, being merely two different ways of invoking the same service operation. In general, the GET form is used for synchronous operations which are idempotent (have no side effects, the result is cacheable, multiple instances may be simultaneously active and will return the same result). POST is used for any request which has a side effect, e.g., initiation of an asynchronous job, or which needs to pass a large amount of data to the service, e.g., uploading a table or region mask to be used within a query.
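The choice between the two forms could look like the sketch below, which builds a request object without sending it. The function name make_request and the parameter endpoint_url (the operation URL without a trailing "?") are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def make_request(endpoint_url: str, params: dict,
                 has_side_effect: bool = False) -> Request:
    """Build a GET request for idempotent calls, a POST request otherwise."""
    query = urlencode(params)
    if has_side_effect:
        # POST: the same parameters travel in the request body.
        return Request(endpoint_url, data=query.encode("ascii"), method="POST")
    # GET: the parameters travel in the URL.
    return Request(endpoint_url + "?" + query, method="GET")
```

Both branches carry the same parameters, reflecting that GET and POST are two ways of invoking the same operation.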

3.3.5 URL Encoding

URL encoding (see section 7.3.2) is a standard technique used to encode characters appearing in HTTP requests, such as a GET URL, to pass characters which are not otherwise legal and could interfere with the HTTP protocol. By using URL encoding it is possible to pass arbitrary character data to a service in an HTTP request; for example, an arbitrary ADQL statement could be passed in a simple GET request so long as it is not too large for a GET URL (2K or so characters).
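As a small illustration, parameter values can be URL encoded with the Python standard library as shown below. The helper names are hypothetical, and the 2000-character limit is an assumption standing in for the "2K or so" figure mentioned above.

```python
from urllib.parse import quote_plus

# Assumption: stands in for the "2K or so characters" mentioned in the text.
MAX_GET_URL_LENGTH = 2000


def encode_value(value: str) -> str:
    """URL encode one parameter value; spaces become '+', '*' is left as-is."""
    return quote_plus(value, safe="*")


def fits_in_get(url: str) -> bool:
    """Rough check that a URL is short enough to send as a GET request."""
    return len(url) <= MAX_GET_URL_LENGTH
```

With these helpers, encode_value("select * FROM foo") gives select+*+FROM+foo, while characters such as & and = inside a value are percent-encoded so they cannot be mistaken for parameter separators.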

3.3.6 Error Response

In the case of an error, service operations should return a VOTable containing an INFO element with name QUERY_STATUS and the value set to ERROR. More fundamental service or protocol errors may result in an HTTP level protocol error, hence a client program should be prepared to handle either response. A null query, that is a QueryData request which does not find any data, is not considered an error; likewise an overflow condition is distinguished from an error. More information on error responses is given in section 7.
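A client-side check for the error INFO element might be sketched as follows. The function name query_status and the sample document are hypothetical; only the INFO element with name QUERY_STATUS and value ERROR comes from the text above. HTTP level errors would have to be handled separately by the client.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def query_status(votable_text: str) -> Optional[str]:
    """Return the value of the QUERY_STATUS INFO element, or None if absent."""
    root = ET.fromstring(votable_text)
    for element in root.iter():
        # Strip any XML namespace so the check works with or without one.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "INFO" and element.get("name") == "QUERY_STATUS":
            return element.get("value")
    return None


# Hypothetical sample response, for illustration only.
SAMPLE = """<VOTABLE version="1.1">
  <RESOURCE type="results">
    <INFO name="QUERY_STATUS" value="ERROR">hypothetical error message</INFO>
  </RESOURCE>
</VOTABLE>"""
```

A client would treat a return value of "ERROR" as a failed operation and any other value, or None, according to the rules given in section 7.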

3.4 Request Examples

Some examples of simple SimDAP requests follow. These are intended only to help introduce and illustrate basic usage of the SimDAP service interface; the details are specified in the following sections of this document.

Synchronous parametric query performing a simple cone search of a table (baseURL would be replaced with the actual service base URL):

    $baseURL/sync?REQUEST=paramquery&POS=12,34&SIZE=0.5&FROM=foo 

Synchronous ADQL query returning all data from a table:

    $baseURL/sync?REQUEST=adqlquery&QUERY=select+*+FROM+foo 

Simple cone search query executed asynchronously:

    curl -d 'REQUEST=paramquery&POS=12,34&SIZE=0.5&FROM=foo' \
        $baseURL/async
    curl -d 'PHASE=RUN' $baseURL/$jobID
    [wait]
    curl $baseURL/$jobID/results

In this example the commonly available curl application (wget or a browser could also be used) is used to issue HTTP GET and POST requests to the remote UWS-based job manager. The query may run for an arbitrarily long time. When the job completes the output can be retrieved.

Asynchronous version of our simple ADQL-based query:

    curl -d 'REQUEST=adqlquery&QUERY=select+*+FROM+foo' $baseURL/async 
    curl -d 'PHASE=RUN' $baseURL/$jobID 
    [wait]
    curl $baseURL/$jobID/results 
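The three steps of the asynchronous examples above can also be written down as data, which may help when implementing a client. The sketch below sends nothing; it only lists the requests to issue. The function name async_job_steps is hypothetical, and job_id is taken as an input because this section does not specify how the client learns it after creating the job.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode

# (HTTP method, URL, form-encoded body or None)
Step = Tuple[str, str, Optional[str]]


def async_job_steps(base_url: str, params: dict, job_id: str) -> List[Step]:
    """List the requests of an asynchronous query, mirroring the curl steps."""
    job_url = f"{base_url}/{job_id}"
    return [
        ("POST", f"{base_url}/async", urlencode(params)),  # create the job
        ("POST", job_url, "PHASE=RUN"),                    # start the job
        ("GET", f"{job_url}/results", None),               # fetch results when done
    ]
```

Between the second and third step the client waits for the job to complete, as indicated by [wait] in the curl examples.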

4. SimDAP Service Operations

4.1 GetAvailability

4.1.1 Query Response

4.2 GetCapabilities

4.2.1 Query Response

4.3 ListExperiments

4.3.1 Input Parameters

4.3.2 Query Response

4.4 ListSnapshots

4.4.1 Input Parameters

4.4.1.1 EXPERIMENT

4.4.2 Query Response

4.5 QueryData

4.5.1 Input Parameters

4.5.1.1 EXPERIMENT

4.5.1.2 SNAPSHOT

4.5.1.3 PROPERTIES

4.5.2 Query Response

4.6 Cutout

4.6.1 Input Parameters

4.6.1.1 EXPERIMENT

4.6.1.2 SNAPSHOT

4.6.1.3 PROPERTIES

4.6.1.4 VOLUME

4.6.2 Query Response

4.7 Preview

4.7.1 Input Parameters

4.7.1.1 EXPERIMENT

4.7.1.2 SNAPSHOT

4.7.1.3 PROPERTIES

4.7.2 Query Response

4.8 Custom

4.8.1 Input Parameters

4.8.1.1 EXPERIMENT

4.8.1.2 SNAPSHOT

4.8.1.3 PROPERTIES

4.8.2 Query Response


Topic revision: r2 - 2008-11-15 - RickWagner