VOSpace 1.1 specification 
This contains details of the proposed specification for VOSpace 1.1.
 Abstract 
VOSpace is the IVOA interface to distributed storage. This version extends the existing VOSpace 1.0 specification to support containers, links between individual VOSpace instances, third party APIs, and a find mechanism.
 Introduction 
VOSpace is the IVOA interface to distributed storage. VOSpace 1.0 [REF] defined a flat, unconnected data space. VOSpace 1.1 builds on top of this and introduces the following new functionality:
 
-  containers - this allows the grouping of data in a hierarchical fashion
  -  links - this allows the federation of distinct VOSpace services
  -  third party APIs - this allows data objects and collections to be exposed through other interfaces
  -  find - this offers a more extensive search capability than is provided by list with wildcard support
 
 
 Document roadmap 
The rest of this document is structured as follows:
[TO BE DONE FOR PAPER COPY]
 VOSpace data model 
VOSpace 1.1 extends the VOSpace 1.0 data model by introducing two new node types: ContainerNode and LinkNode. A new element, capabilities, is also added to the DataNode node type.
[DISCUSSION POINT: Should capabilities only exist on the ContainerNode?]
[NEW DIAGRAM]
ContainerNode describes a data item that can contain other data items. These can be of any type including other ContainerNodes. A ContainerNode has no data bytes associated with it directly but only with its contents - in a tree representation, a ContainerNode is a branch whereas data objects are leaves.
ContainerNode extends DataNode and so has the following elements:
 
-  uri: the vos:// identifier for the node, URI-encoded according to RFC2396 [REF]. 
  -  properties: a set of metadata properties for the node. 
  -  accepts: a list of the views (data formats) that the node can accept. 
  -  provides: a list of the views (data formats) that the node can provide. 
  -  busy: a boolean flag to indicate that the data associated with the node or its children cannot be accessed. 
 
 
[DISCUSSION POINT: capabilities should be in this list as well since we are adding to the standard definition of DataNode.]
The 
busy flag is used to indicate that an internal operation, such as the service implementation unpacking an archive format, is in progress and so none of the node data are available.
 Container identifiers 
Slashes in the URI path imply a hierarchical arrangement of data: the data object identified by  
vos://nvo.caltech!vospace/tables/myTable1 is within the container identified by 
vos://nvo.caltech!vospace/tables. In fact, all ancestors in the hierarchy will be resolvable containers back to the root node of the space (this precludes any system of implied hierarchy in the naming scheme for nodes with ancestors that are just logical entities and cannot be reified, e.g. the Amazon S3 system). 
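The hierarchy implied by slashes can be illustrated with a minimal Python sketch that lists the containers enclosing a given identifier. The function name ancestor_containers is hypothetical and not part of the specification; the sketch assumes the root of the space is the part of the identifier before the first slash of the path.

```python
from typing import List


def ancestor_containers(uri: str) -> List[str]:
    """Return the containers enclosing `uri`, nearest first, ending at the root."""
    prefix = "vos://"
    if not uri.startswith(prefix):
        raise ValueError("not a vos:// identifier: " + uri)
    authority, _, path = uri[len(prefix):].partition("/")
    segments = [s for s in path.split("/") if s]
    root = prefix + authority
    if not segments:
        return []  # the root node has no enclosing container
    ancestors = []
    # Drop the last segment (the node itself), then peel off one level at a time.
    for depth in range(len(segments) - 1, 0, -1):
        ancestors.append(root + "/" + "/".join(segments[:depth]))
    ancestors.append(root)
    return ancestors
```

For the example above, vos://nvo.caltech!vospace/tables/myTable1 yields the container vos://nvo.caltech!vospace/tables followed by the root vos://nvo.caltech!vospace, each of which must be a resolvable container.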
[DISCUSSION POINT: The root node for a VOSpace must be represented by a 
ContainerNode]
 Inheritable properties 
Properties on a ContainerNode may be designated as inheritable and will propagate to children nodes of the container if they are specified in the accepts or provides list for this node.
[DISCUSSION POINT: If a property is also declared on a child, which value takes priority? How are properties registered as inheritable?]
 Container views 
For VOSpace 1.1, a view is the data representation (format) of the file that is transferred. If the view is an archive format (tar, zip, etc.) then the space will provide access to the archive contents as children nodes of the container. Whether or not the space actually unpacks the archive is implementation dependent but the service will behave as though it has done so. For example, a client wishes to upload a tar file containing several images to a VOSpace service. If the client associates it with (uploads it to) a Structured/UnstructuredDataNode then it will be treated as a blob and its contents will not be available. However, if the client uses a ContainerNode with an accepts view of "tar" then the image files within the tar file will be represented as children nodes of the ContainerNode and accessible like any other data object within the space.
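As an illustration only, the following Python sketch shows how a service might derive children node identifiers from an uploaded tar file without unpacking it to storage. The function names are hypothetical, and naming each child after its archive member is an assumption: the specification leaves the naming of children nodes open.

```python
import io
import tarfile
from typing import Dict, List


def child_uris_for_tar(container_uri: str, tar_bytes: bytes) -> List[str]:
    """List the child node URIs a service might expose for an uploaded tar file."""
    children = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            if member.isfile():
                # Assumption: child names are taken from the archive member names.
                children.append(container_uri.rstrip("/") + "/" + member.name)
    return children


def make_example_tar(files: Dict[str, bytes]) -> bytes:
    """Build a small in-memory tar file for demonstration purposes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

Because the archive is only read, not extracted, the sketch is consistent with the statement that actual unpacking is implementation dependent.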
[DISCUSSION POINT: What are the names of the children nodes? Are these Structured/UnstructuredDataNodes? What is the default? How is this set?] 
[DISCUSSION POINT: How does the service identify what it considers to be archive formats?]
If a provides view is an archive format (tar, zip, etc.) then the space will package the container and all its children nodes in the specified format.
LinkNode describes a node that points to another node. The target can be of any type, including another LinkNode. A LinkNode has no data bytes associated with it.
LinkNode extends Node and so has the following elements associated with it:
 
-  uri: the vos:// identifier for the node, URI-encoded according to RFC2396 [REF]
  -  properties: a set of metadata properties for the node. The properties do not propagate to the target of the LinkNode. One use case is to enable third-party annotations to be associated with a data object but without the data object itself getting cluttered with unnecessary metadata. In this case, the client creates a LinkNode pointing to the data object in question and then adds the annotations as properties of the LinkNode. 
  -  target: the identifier, URI-encoded according to RFC2396, for the data object to which the LinkNode points.
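The node types described in this section can be summarised in a short Python sketch. This is not the normative schema: the class layout mirrors the "extends" relationships stated above, while the children list on ContainerNode and the placement of capabilities on DataNode are assumptions made for illustration (the latter is still a discussion point).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    uri: str                                                  # vos:// identifier
    properties: Dict[str, str] = field(default_factory=dict)  # metadata properties


@dataclass
class DataNode(Node):
    accepts: List[str] = field(default_factory=list)       # views the node can accept
    provides: List[str] = field(default_factory=list)      # views the node can provide
    busy: bool = False                                      # data temporarily unavailable
    capabilities: List[str] = field(default_factory=list)  # new in 1.1 (assumed placement)


@dataclass
class ContainerNode(DataNode):
    # Assumption: children are held in memory; the text only says that a
    # container can contain other data items of any type.
    children: List[Node] = field(default_factory=list)


@dataclass
class LinkNode(Node):
    target: str = ""  # identifier of the node to which the link points
```

Note that LinkNode derives from Node rather than DataNode, so it carries properties of its own but no views, busy flag or data bytes.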
 
 
 Capabilities 
A 
Capability is a third-party interface to a data object. It enables data access using other non-VOSpace methods.
A 
Capability has the following members: 
-  uri: the Capability identifier
  -  endpoint: the endpoint URL to use for the third-party interface
 
 
[DISCUSSION POINT: Should there be any more members to a 
Capability, e.g. 
param to specify additional arguments that might be required for access?]
 Example use cases 
A 
ContainerNode contains image files and has a DAL SIAP capability so that the images in the container can also be accessed using a SIAP service. In this way, a user could create a container in VOSpace, drop some images into it and then query the set of images using the SIAP interface. 
Another example is a 
DataNode with an iRODS capability so that the data replication for this data object can be handled using the iRODS service API located at the specified endpoint.
 Capability identifiers 
Every new type of 
Capability requires a unique URI to identify the 
Capability.
The rules for the 
Capability identifiers are similar to the rules for namespace URIs in XML schema. The only restriction is that it must be a valid (unique) URI.
 
-  An XML schema namespace identifier can be just a simple URN, e.g. urn:my-namespace
  -  Within the IVOA, the convention for namespace identifiers is to use a HTTP URL pointing to the namespace schema, or a resource describing it.
 
 
The current VOSpace schema defines 
Capability identifiers as 
anyURI [TBD]. The only restriction is that it must be a valid (unique) URI.
 
-  A Capability URI can be a simple URN, e.g. urn:my-capability
 
 
This may be sufficient for testing and development on a private system, but it is not scalable for use on a public service.
For a production system, any new 
Capabilities should have unique URIs that can be resolved into a description of the 
Capability. 
Ideally, these should be IVO registry URIs that point to a description registered in the IVO registry:
 
-  ivo://my-registry/vospace/capabilities#my-capability
 
 
Using an IVO registry URI to identify 
Capabilities has two main advantages:
 
-  IVO registry URIs are by their nature unique, which makes it easy to ensure that different teams do not accidentally use the same URI
  -  If the IVO registry URI points to a description registered in the IVO registry, this provides a mechanism to discover how to use the Capability.
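A client may want to distinguish between these forms of Capability identifier. The following Python sketch classifies an identifier by its URI scheme; the function names and the returned labels are hypothetical, and treating only ivo and http(s) identifiers as resolvable is an assumption based on the text of this section.

```python
from urllib.parse import urlparse


def capability_uri_kind(uri: str) -> str:
    """Classify a Capability identifier by its URI scheme."""
    scheme = urlparse(uri).scheme.lower()
    if scheme == "ivo":
        return "ivo-registry"
    if scheme in ("http", "https"):
        return "http"
    if scheme == "urn":
        return "urn"
    if scheme:
        return "other"
    raise ValueError("not an absolute URI: " + uri)


def is_resolvable(uri: str) -> bool:
    """True if the identifier could be resolved into a description."""
    return capability_uri_kind(uri) in ("ivo-registry", "http")
```

A simple URN such as urn:my-capability is classified as "urn" and is not resolvable, which matches the advice that it is suitable only for private testing and development.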
 
 
 Capability descriptions 
If the URI for a particular Capability is resolvable, i.e. it is an IVO registry identifier or an HTTP URL, then it should point to an XML resource that describes the Capability.
A 
CapabilityDescription should describe the third-party interface and how it should be used in this context. 
A 
CapabilityDescription should have the following members:
 
-  uri: the formal URI of the Capability
  -  DisplayName: a simple display name of the Capability.
  -  Description: a text block describing the third-party interface and how it should be used in this context.
 
 
Note that at the time of writing, the schema for registering 
CapabilityDescriptions in the IVO registry has not been finalized. 
 UI display name 
If a client is unable to resolve a 
Capability identifier into a description then it may just display the identifier as a text string: 
-   Access data using urn:edu.sdsc.irods
 
 
If a client can resolve the 
Capability identifier into a description then the client may use the information in the description to display a human readable name and description of the 
Capability: 
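The fallback behaviour can be sketched in Python as follows. This is a sketch under stated assumptions: the resolve callable is a hypothetical stand-in for a registry lookup, and the wording of the label built from the DisplayName is an assumption, since only the unresolved form is given in this document.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CapabilityDescription:
    uri: str           # formal URI of the Capability
    display_name: str  # simple display name
    description: str   # how the third-party interface should be used


def display_label(
    uri: str, resolve: Callable[[str], Optional[CapabilityDescription]]
) -> str:
    """Build the text a client might show for a Capability."""
    description = resolve(uri)
    if description is None:
        # Unresolvable identifier: show it as a plain text string.
        return "Access data using " + uri
    # Assumption: the resolved label reuses the same phrasing with the DisplayName.
    return "Access data using " + description.display_name
```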
 
 Standard capabilities 
The VOSpace team intend to register 
Capability URIs and 
CapabilityDescriptions for the core set of 
Capabilities, e.g.  
-  Cone Search
  -  SIAP
  -  SSAP
  -  TAP
 
 
However, this is not intended to be a closed list and different implementations are free to define and use their own 
Capabilities. 
 Web service operations 
A VOSpace 1.1 service shall be a SOAP service with the following operations:
 Service metadata 
 getProtocols 
This is unchanged from VOSpace 1.0 (Sec 5.1.1).
 getViews 
This is unchanged from VOSpace 1.0 (Sec 5.1.2).
 getProperties 
This is unchanged from VOSpace 1.0 (Sec 5.1.3).
[DISCUSSION POINT: Is this true - do we want to denote inheritable properties in some fashion?]
 getCapabilities 
[DISCUSSION POINT: Do we want this operation?]
 Creating and manipulating data nodes 
 createNode 
Create a new node at a specified location.
 Parameters 
This is the same as VOSpace 1.0 (Sec 5.2.1.1) except that: 
-  the permitted values of xsi:type are: 
-  vos:Node
  -  vos:DataNode
  -  vos:UnstructuredDataNode
  -  vos:StructuredDataNode
  -  vos:ContainerNode
  -  vos:LinkNode
 
 
 
 
  -  .auto replaces vos://null as the reserved URI to indicate an auto-generated URI for the destination, i.e. vos://service/path/.auto will cause a new unique URI for the node within vos://service/path to be generated.
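One way a service might handle the .auto reserved name is sketched below in Python. The function name and the use of a random UUID for the generated name are assumptions: the specification only requires that the generated URI be new and unique within the parent path.

```python
import uuid
from typing import Set

AUTO_SUFFIX = "/.auto"


def resolve_auto_uri(requested_uri: str, existing: Set[str]) -> str:
    """Replace a trailing .auto with a new unique name under the same parent."""
    if not requested_uri.endswith(AUTO_SUFFIX):
        return requested_uri  # an ordinary URI is used as given
    parent = requested_uri[: -len(AUTO_SUFFIX)]
    while True:
        # Assumption: any name not already in use is acceptable.
        candidate = parent + "/" + uuid.uuid4().hex
        if candidate not in existing:
            return candidate
```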
The 
capabilities list for the 
Node cannot be set using this method.
 Returns 
This is the same as VOSpace 1.0 (Sec 5.2.1.2) except that: 
-  the capabilities list for the Node may not be filled in until some data has been imported into the Node.
 
 
 Faults 
This is the same as VOSpace 1.0 (Sec 5.2.1.3) except that: 
-  The service shall throw a LinkFound exception if the parent path includes a link.
  -  The service shall throw a LinkFound exception if the parent node is a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes
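The parent path checks listed above can be sketched in Python as a walk from the root of the space down to the immediate parent of the new node. The function name, the exception classes and the dictionary of node types are hypothetical; the sketch also assumes that the root of the space is itself registered as a ContainerNode, which is still a discussion point.

```python
from typing import Dict


class LinkFound(Exception):
    """Raised with the URI of the LinkNode that was found."""


class ContainerNotFound(Exception):
    """Raised with the URI of the missing ContainerNode."""


def check_parent_path(uri: str, node_types: Dict[str, str]) -> None:
    """Verify that every ancestor of `uri` is an existing ContainerNode."""
    prefix = "vos://"
    authority, _, path = uri[len(prefix):].partition("/")
    segments = [s for s in path.split("/") if s]
    # Build the ancestor URIs from the root down to the immediate parent.
    ancestors = [prefix + authority]
    for segment in segments[:-1]:
        ancestors.append(ancestors[-1] + "/" + segment)
    for ancestor in ancestors:
        kind = node_types.get(ancestor)
        if kind == "LinkNode":
            raise LinkFound(ancestor)
        if kind != "ContainerNode":
            raise ContainerNotFound(ancestor)
```

Each exception carries the URI of the offending node, so a caller can report exactly where the path stopped being a chain of containers.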
 
 
[DISCUSSION POINT: Do we need both a 
LinkFound and a 
ContainerNotFound exception or does the latter work for both cases?]
 deleteNode 
Delete a node.
When the target is a 
ContainerNode, all its children (the contents of the container) will also be deleted.
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.2.2.1).
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.2.2.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.2.2.3) except that: 
-  The service shall throw a LinkFound exception if the parent path includes a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes.
 
 
 listNodes 
List nodes in a space.
When a target URI is a 
ContainerNode, only 
direct (first generation) children of the node will be listed.
 Parameters 
This is the same as VOSpace 1.0 (Sec 5.2.3.1) except that: 
-  Wild cards can only be used in the final part of the URL path: for example, a/b/c/*.txt is allowed but a/*/c/*.txt is not.
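The wildcard restriction can be checked with a few lines of Python. This is a sketch, not the service implementation: the function names are hypothetical, and the use of shell-style matching from the standard fnmatch module is an assumption, since the wildcard syntax is not defined here beyond the * examples.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def wildcard_allowed(path: str) -> bool:
    """True if wildcards appear only in the final segment of the path."""
    *parents, _last = path.split("/")
    return not any("*" in segment for segment in parents)


def list_matching(pattern: str, node_paths: Iterable[str]) -> List[str]:
    """Return the direct children of the pattern's parent that match it."""
    if not wildcard_allowed(pattern):
        raise ValueError("wildcard outside the final path segment: " + pattern)
    parent, _, name_pattern = pattern.rpartition("/")
    matches = []
    for path in node_paths:
        node_parent, _, name = path.rpartition("/")
        # Only direct (first generation) children are candidates.
        if node_parent == parent and fnmatchcase(name, name_pattern):
            matches.append(path)
    return matches
```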
 
 
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.2.3.2).
 Faults 
This is unchanged from VOSpace 1.0 (Sec 5.2.3.3).
 findNodes 
Find nodes whose properties match the specified values.
 Parameters 
 
-  token: An optional continuation token from a previous request 
-  No token indicates a request for a new find operation. 
 
 
 
 
The server may impose a limited lifetime on the continuation token. If a token has expired, the server will throw an exception, and the client will have to make a new request.
 
-  limit: An optional limit indicating the maximum number of results in the response 
-  No limit indicates a request for an unpaged response. However the server may still impose its own limit on the size of an individual response, splitting the results into more than one page if required.
 
 
  -  detail: The level of detail in the returned response 
-  min: The response contains the minimum detail for each Node with all optional parts removed - the node type should be returned 
-  e.g. <node uri="vos://service/name" xsi:type="Node"/>
 
 
  -  max: The response contains the maximum detail for each Node, including any xsi:type specific extensions
  -  properties: The response contains a basic node element with a list of properties for each Node with no xsi:type specific extensions.
 
 
  -  matches: A list of match elements identifying the properties and values to match against and whether these should be applied in conjunction (and) or disjunction (or). 
 
 
The match element has a uri attribute to identify the property to which it applies. The regular expression against which the property values are to be matched is then specified as the value of the match element:
<match uri="..."> regex </match>
The match elements can be combined in conjunction or disjunction by specifying them as subelements of <and> or <or> respectively. For example, the predicate "(property1 and property2) or property3" would be specified as:
<or>
    <and>
        <match uri="property1"> regex </match>
        <match uri="property2"> regex </match>
    </and>
    <match uri="property3"> regex </match>
</or>
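To make the semantics of nested match elements concrete, the following Python sketch evaluates such a predicate against the properties of a single node. It is illustrative only: the function name is hypothetical, the elements are assumed to carry no XML namespace, a node lacking the property is assumed never to match, and the regular expression is assumed to match anywhere in the value.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict


def evaluate(element: ET.Element, properties: Dict[str, str]) -> bool:
    """Evaluate a match/and/or predicate tree against a node's properties."""
    if element.tag == "match":
        value = properties.get(element.get("uri"))
        pattern = (element.text or "").strip()
        # Assumption: a missing property never matches; re.search is used.
        return value is not None and re.search(pattern, value) is not None
    if element.tag == "and":
        return all(evaluate(child, properties) for child in element)
    if element.tag == "or":
        return any(evaluate(child, properties) for child in element)
    raise ValueError("unexpected element: " + element.tag)
```

The recursion follows the structure of the XML directly, so arbitrarily deep combinations of and/or are handled without extra code.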
[DISCUSSION POINT: Are wildcards allowed in the property URIs - find me all nodes where any property matches this regular expression? ]
An empty list of 
<matches> implies a full listing of the space.
 Returns 
 
-  token: An optional continuation token, indicating that the response is incomplete 
-  The client may use this token to request the next block of Nodes in the sequence
  -  No token indicates that the list is complete.
 
 
  -  limit: An optional limit which must be present if a limit parameter was used in the request 
-  If present, the value is the value from the original request and not any limit imposed by the service
 
 
  -  nodes: A list of the Nodes matching the requested properties
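The continuation token protocol can be illustrated with a small Python client loop. The find_nodes callable is a hypothetical stand-in for the SOAP operation, reduced here to a token and a limit; the fake service that encodes an offset in the token exists only to make the sketch runnable and says nothing about how a real service builds its tokens.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]  # (nodes, continuation token)


def iterate_nodes(
    find_nodes: Callable[[Optional[str], int], Page], limit: int
) -> Iterator[str]:
    """Yield every matching node, following continuation tokens."""
    token = None  # no token: start a new find operation
    while True:
        nodes, token = find_nodes(token, limit)
        yield from nodes
        if token is None:  # no token in the response: the list is complete
            break


def make_fake_service(all_nodes: List[str]) -> Callable[[Optional[str], int], Page]:
    """Build an in-memory stand-in whose tokens are simply offsets."""
    def find_nodes(token: Optional[str], limit: int) -> Page:
        start = int(token) if token is not None else 0
        end = start + limit
        next_token = str(end) if end < len(all_nodes) else None
        return all_nodes[start:end], next_token
    return find_nodes
```

A real client would also need to handle the InvalidToken fault by starting a new find operation, since the server may expire tokens.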
 
 
 Faults 
 
-  The service shall throw an InternalFault exception if the operation fails
  -  The service shall throw a PermissionDenied exception if the user does not have permissions to perform the operation
  -  The service shall throw a PropertyNotFound exception if a particular property is specified and does not exist in the space 
-  This does not apply if wildcards are allowed in the property URIs
 
 
  -  The service shall throw an InvalidToken exception if it does not recognize the continuation token
  -  The service shall throw an InvalidToken exception if the continuation token has expired
 
 
 moveNode 
Move a node within a VOSpace service.
When the source is a 
ContainerNode, all its children (the contents of the container) will also be moved to the new destination.
When the destination is an existing 
ContainerNode, the source will be placed under it (i.e. within the container).
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.2.4.1).
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.2.4.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.2.4.3) except that: 
-  The service shall throw a DuplicateNode exception if a Node already exists at the destination unless it is a ContainerNode.
  -  The service shall throw a LinkFound exception if the target path includes a link.
  -  The service shall throw a LinkFound exception if the target node is a link.
  -  The service shall throw a LinkFound exception if the parent path includes a link.
  -  The service shall throw a LinkFound exception if the parent node is a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes
  -  The service shall throw an InvalidArgument exception if the source is a ContainerNode and the destination is not.
  -  The service shall throw a ContainerNotFound exception if the target node is a  ContainerNode and does not exist.
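The destination rules for moveNode can be sketched in Python as follows. The function name, exception classes and node type dictionary are hypothetical. Two points are assumptions rather than statements of the specification: a node placed under an existing ContainerNode keeps the final segment of its source URI as its name, and InvalidArgument takes precedence over DuplicateNode when a container is moved onto an existing non-container node.

```python
from typing import Dict


class DuplicateNode(Exception):
    """A node already exists at the destination."""


class InvalidArgument(Exception):
    """The source is a ContainerNode and the destination is not."""


def resolve_destination(source: str, destination: str, node_types: Dict[str, str]) -> str:
    """Return the URI at which the moved node will end up."""
    destination_kind = node_types.get(destination)
    if destination_kind is None:
        return destination  # nothing there yet: use the URI as given
    if destination_kind == "ContainerNode":
        # Assumption: the node keeps its own name inside the container.
        return destination + "/" + source.rsplit("/", 1)[-1]
    if node_types.get(source) == "ContainerNode":
        raise InvalidArgument(destination)
    raise DuplicateNode(destination)
```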
 
 
 copyNode 
Copy a node within a VOSpace service. 
When the source is a 
ContainerNode, all its children (the full contents of the container) get copied, i.e. this is a deep recursive copy.
When the destination is an existing 
ContainerNode, the copy will be placed under it (i.e. within the container).
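A deep recursive copy can be sketched in Python over a flat mapping from node URIs to their content. The function name and the flat dictionary representation are assumptions made for illustration; a real service would copy node metadata as well as data.

```python
from typing import Dict, Optional


def copy_subtree(
    source: str, destination: str, nodes: Dict[str, Optional[bytes]]
) -> Dict[str, Optional[bytes]]:
    """Return the new entries created by copying `source` and all its descendants."""
    copied = {}
    for uri, content in nodes.items():
        # The trailing slash avoids matching siblings that merely share a prefix.
        if uri == source or uri.startswith(source + "/"):
            copied[destination + uri[len(source):]] = content
    return copied
```

Because every descendant is selected by its URI prefix, children at any depth are copied, which is what distinguishes this from a copy of the direct children only.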
 Parameters 
This is the same as VOSpace 1.0 (Sec 5.2.5.1) except that: 
-  .auto replaces vos://null as the reserved URI to indicate an auto-generated URI for the destination, i.e. vos://service/path/.auto will cause  a new unique URI for the node within vos://service/path to be generated.
 
 
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.2.5.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.2.5.3) except that: 
-  The service shall throw a DuplicateNode exception if a Node already exists at the destination unless it is a ContainerNode.
  -  The service shall throw a LinkFound exception if the target path includes a link.
  -  The service shall throw a LinkFound exception if the target node is a link.
  -  The service shall throw a LinkFound exception if the parent path includes a link.
  -  The service shall throw a LinkFound exception if the parent node is a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes
  -  The service shall throw an InvalidArgument exception if the source is a ContainerNode and the destination is not.
  -  The service shall throw a ContainerNotFound exception if the target node is a  ContainerNode and does not exist.
 
 
 Accessing metadata 
 getNode 
Get the details for a specific Node.
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.3.1.1).
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.3.1.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.3.1.3) except that: 
-  The service shall throw a LinkFound exception if the target path includes a link.
 
 
 setNode 
Set the property values for a specific node.
Changes to inheritable properties on 
ContainerNodes will propagate to children nodes of the container where applicable.
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.3.2.1).
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.3.2.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.3.2.3) except that: 
-  The service shall throw a LinkFound exception if the target path includes a link.
 
 
 Transferring data 
 pushToVoSpace 
Request a list of URLs to send data to a VOSpace node.
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.4.1.1).
[DISCUSSION POINT: If a 
Node already exists at the target URI and it is a 
ContainerNode, should it be overwritten by the target 
Node or should the target 
Node become a child of the 
ContainerNode? This also applies to pullToVoSpace.] 
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.4.1.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.4.1.3) except that: 
-  The service shall throw a LinkFound exception if the target path includes a link.
  -  The service shall throw a LinkFound exception if the target node is a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes.
 
 
 pullToVoSpace 
Import data into a VOSpace node.
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.4.2.1).
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.4.2.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.4.2.3) except that: 
-  The service shall throw a LinkFound exception if the target path includes a link.
  -  The service shall throw a LinkFound exception if the target node is a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes.
 
 
 pullFromVoSpace 
Request a set of URLs that the client can read data from.
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.4.3.1).
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.4.3.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.4.3.3) except that: 
-  The service shall throw a LinkFound exception if the target path includes a link.
  -  The service shall throw a LinkFound exception if the target node is a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes.
 
 
 pushFromVoSpace 
Ask the server to send data to a remote location.
 Parameters 
This is unchanged from VOSpace 1.0 (Sec 5.4.4.1).
 Returns 
This is unchanged from VOSpace 1.0 (Sec 5.4.4.2).
 Faults 
This is the same as VOSpace 1.0 (Sec 5.4.4.3) except that: 
-  The service shall throw a LinkFound exception if the target path includes a link.
  -  The service shall throw a LinkFound exception if the target node is a link.
  -  The service shall throw a ContainerNotFound exception if the parent path is not composed solely of ContainerNodes.
 
 
 Fault arguments 
This is the same as VOSpace 1.0 (Sec 5.5) with the addition of:
 ContainerNotFound 
This is thrown with the URI of the missing ContainerNode.
 LinkFound 
This is thrown with the URI of the found LinkNode.
 References 
[VOSpace] Matthew Graham, Paul Harrison, Dave Morris, Guy Rixon, VOSpace service specification v1.02, IVOA Recommendation 2007 October 01, 
http://www.ivoa.net/Documents/latest/VOSpace.html
-- 
MatthewGraham - 07 Jan 2008