Discussion of the VOSpace 1.0 specification Document
This is a discussion page for the VOSpace-1.0 service specification document.
This is a place where we can post proposals and where interested parties can discuss the different versions.
For each version there is a Change request section - please add to this and vote on other suggestions
- +1 if you agree
- -1 if you disagree
- 0 if you have no particular preference
Details and discussion of implementation plans are here.
Version 0.21
This document was produced as a result of the discussions that occurred at the Victoria Interop meeting.
Change Requests
Mandate and define at least one transport protocol
without this a compliant VOSpace will not be able to transfer data to another compliant VOSpace.
I recommend http be mandatory.
see detailed discussion in email thread
http://www.ivoa.net/forum/vospace/0606/0097.htm
--
PaulHarrison
- Which version, http-1.0 or http-1.1 ?
- Which methods http-get, http-put or both ?
- For a space that stores public images, then http-1.1-get makes sense.
- For a space that allows upload to sensitive database tables, then http-1.x-put does not have sufficient authentication.
--
DaveMorris - 16 Jun 2006
I think that where VOSpace is acting as an http server it would be good to mandate http-1.1 compliance - it does fix issues with 1.0 and, after all, it is a 7 year old specification now....
As far as a compliance statement goes - how about
(http-1.1-get or https-1.1-get) and (http-1.1-put or https-1.1-put)
--
PaulHarrison - 19 Jun 2006
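As a rough illustration of the compliance statement above, the check could be sketched as follows. The function name and the representation of supported protocols as a set of strings are assumptions made for illustration only; nothing here is taken from the specification.

```python
def is_transport_compliant(protocols):
    """Check the proposed rule:
    (http-1.1-get or https-1.1-get) and (http-1.1-put or https-1.1-put).
    """
    supported = set(protocols)
    # At least one of the GET variants must be offered.
    can_get = bool(supported & {"http-1.1-get", "https-1.1-get"})
    # At least one of the PUT variants must be offered.
    can_put = bool(supported & {"http-1.1-put", "https-1.1-put"})
    return can_get and can_put
```

Under this rule a service offering only https for upload and plain http for download would still count as compliant, while one offering only ftp and http-1.1-get would not.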
Making some form of HTTP mandatory makes it easier to write clients and harder to write services. VOSpace fails unless we get useful services, and services are already harder to write than clients.
--
GuyRixon - 03 Jul 2006
Votes
Specify as optional a small list of well known transport protocols
so that at least implementations will do the same thing for common protocols - this list to include
- ftp
- gridftp
- file - direct access to the file system
--
PaulHarrison
Agree with the list of common protocols.
Defined in an annex to the main specification, including a standard URI and details of the protocol specification (can be just a short note and a reference to the external specification).
e.g.
HTTP 1.1 get
URI
ivo://....vospace/protocols/http-1.1-get
Description :
Get data using the HTTP-1.1 GET method as defined in RFC2616.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.3
--
DaveMorris - 16 Jun 2006
Votes
Specify the key names and meanings for a small group of "essential" property keys
The minimum set which I would say we need mandatory names are
- vos.Owner
- vos.ModificationDate
- vos.Size
- vos.MimeType
--
PaulHarrison
Agree with the list of common properties.
Defined in an annex to the main specification, including a standard URI and details of what each property means and how it is represented.
e.g.
Data created date
KEY
vos.data.created.date
Description :
A read-only property generated by the server.
Indicating when the data contents were originally created.
Formatted as an ISO 8601 date-time [yyyy-mm-ddTHH:MM:SS.SSS]
----
Data modified date
KEY
vos.data.modified.date
Description :
A read-only property generated by the server.
Indicating when the data contents were last modified.
Formatted as an ISO 8601 date-time [yyyy-mm-ddTHH:MM:SS.SSS]
Note - As VOSpace-1.0 does not support append, the only way the created and modified dates will be different is if the server modifies the data underneath.
--
DaveMorris - 16 Jun 2006
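For reference, a server could produce the date-time format suggested above roughly as follows. This is a minimal sketch: the function name is hypothetical, and it assumes the server holds a naive (timezone-free) timestamp, since the proposed format does not carry a timezone designator.

```python
from datetime import datetime

def format_vos_date(moment):
    """Render a datetime as yyyy-mm-ddTHH:MM:SS.SSS (millisecond precision)."""
    # timespec="milliseconds" truncates the microsecond field to three digits.
    return moment.isoformat(timespec="milliseconds")

created = datetime(2006, 6, 16, 14, 30, 5, 123000)
print(format_vos_date(created))  # 2006-06-16T14:30:05.123
```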
Votes
Clarify the use of the Format parameter in some calls
I believe that the original intention of Format was to specify a possible transformation of the data (mainly on export of table data from RDBMS based stores). In the current version of the document it reads as if this parameter is merely a description of the data - in which case mime-type suffices. In addition I am not sure if this parameter has any meaning for import operations.
Votes
Change "exception" to "fault"
The document uses the term "exception" where it should use "fault".
--
PaulHarrison
Votes
Consider adding an optional wildcard matching identifier to parameters for ListNodes
This would allow the client to specify a subset of the VOSpace to be listed - in effect the behaviour would be similar to the "ls" command in unix, with typical simple shell wildcard semantics.
Reason for change: improved efficiency - if ListNodes always has to list the whole VOSpace then it is a pretty blunt instrument, especially as the number of data objects in the space increases.
Use Case
Suppose that there is a 1.0 VOSpace containing 5000 data items and a workflow step is writing results into the VOSpace using a common prefix. The next step in the workflow wants to process all of the files produced, but only knows the prefix - without wild card matching the whole of the VOSpace needs to be listed to find the files.
--
PaulHarrison 19 Jun 2006
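The workflow use case above could look something like this on the server side. This is only a sketch under stated assumptions: the `list_nodes` function, its `pattern` parameter and the file names are all hypothetical, and Python's `fnmatch` module stands in for "typical simple shell wildcard semantics".

```python
from fnmatch import fnmatchcase

def list_nodes(identifiers, pattern=None):
    """Return all identifiers, or only those matching a shell-style wildcard."""
    if pattern is None:
        # No wildcard supplied: behave like the current ListNodes.
        return list(identifiers)
    # fnmatchcase gives case-sensitive matching with *, ? and [...] wildcards.
    return [name for name in identifiers if fnmatchcase(name, pattern)]

nodes = ["step1-result-001.fits", "step1-result-002.fits", "catalogue.vot"]
print(list_nodes(nodes, "step1-result-*"))
```

The next workflow step then only needs to know the common prefix, and the server returns just the matching entries instead of the whole space.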
Votes
Consider adding an optional identifier list to the parameters for ListNodes
This would allow the client to specify a subset of the VOSpace to be listed - in effect the behaviour would be similar to the "ls" command in unix.
Reason for change: improved efficiency - if ListNodes always has to list the whole VOSpace then it is a pretty blunt instrument, especially as the number of data objects in the space increases.
Use Case
Suppose that there is a 1.0 VOSpace containing 5000 data items and a client is currently interacting with a V2.0 VOSpace that has links to a small subset (e.g. 100 data items) - the client needs to make 100 getNodeProperties SOAP calls to the V1.0 space to get the latest metadata about the data objects - with an optional identifier list it can make one call to ListNodes.
--
PaulHarrison 19 Jun 2006
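A sketch of how the optional identifier list might behave, assuming the space can be modelled as a mapping from identifier to property dictionary. The function name, the data layout and the choice to skip unknown identifiers silently are all assumptions for illustration; the proposal does not say how missing identifiers should be reported.

```python
def list_nodes_by_id(space, identifiers=None):
    """Return metadata for the whole space, or only for the listed identifiers."""
    if identifiers is None:
        # No list supplied: return everything, as ListNodes does today.
        return dict(space)
    # One call replaces one getNodeProperties call per identifier.
    return {ident: space[ident] for ident in identifiers if ident in space}

space = {
    "a.fits": {"vos.Size": 1024},
    "b.fits": {"vos.Size": 2048},
    "c.fits": {"vos.Size": 4096},
}
print(list_nodes_by_id(space, ["a.fits", "c.fits"]))
```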
Votes
Reconsider getNodeProperties and setNodeProperties
The naming of these operations makes them appear as a pair, where in fact they are not - you get a Node out of the getNodeProperties and have to put a PropertyPairList into the setNodeProperties. If the change to ListNodes were made as above, then getNodeProperties would be redundant anyway.
--
PaulHarrison
Votes
| name | vote | comment |
| PaulHarrison | +1 | the benefit is improved clarity - in addition there is the slightly irritating issue (see below) that you cannot take the property list from the "get" and use it in the "set" because of read-only properties |
| DaveMorris | -1 | I can't see a significant benefit from changing this |
Semantics not clear for setNodeProperties
It is not clear
- how to delete a property - does a null value of the property pair denote this?
- does the whole set of node properties given as an argument replace the whole set for the node, or is a union operation performed?
- interaction with the "read-only" properties in these scenarios...
--
PaulHarrison
Agree, need to make this clearer.
- Null value deletes property.
- Operation is union not replace.
- Server throws PermissionDenied for read-only properties.
- what if only one of the list of properties is read-only? - need to signal which are the bad properties in the exception - also PermissionDenied could be confusing as this normally refers to the permissions on the data object which could potentially be changed in future versions of VOSpace - certain properties are fundamentally readOnly and so would
--
DaveMorris - 16 Jun 2006
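To make the three suggested rules concrete (null deletes, union rather than replace, a fault for read-only properties), here is a minimal sketch. The function, the exception class and the set of read-only keys are assumptions for illustration. Rejecting the whole call when any key is read-only, and listing the offending keys on the fault, is one possible answer to the open question above, not an agreed behaviour.

```python
class PermissionDenied(Exception):
    """Hypothetical fault that names the read-only keys the client tried to set."""
    def __init__(self, keys):
        super().__init__("read-only properties: " + ", ".join(keys))
        self.keys = keys

# Assumed read-only keys, taken from the examples earlier on this page.
READ_ONLY = {"vos.data.created.date", "vos.data.modified.date"}

def set_node_properties(current, updates, read_only=READ_ONLY):
    """Merge updates into current: union semantics, None deletes a key."""
    rejected = sorted(key for key in updates if key in read_only)
    if rejected:
        # Fail the whole call and report every offending key.
        raise PermissionDenied(rejected)
    merged = dict(current)
    for key, value in updates.items():
        if value is None:
            merged.pop(key, None)  # null value deletes the property
        else:
            merged[key] = value    # new or changed property is merged in
    return merged
```

With these rules, properties that are not mentioned in the call are left untouched, so a client can change a single property without first fetching the full set.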
Votes
Change names of parameters for moveNode and copyNode
Target is a confusing name for the "source" part of a move operation; it sounds more like "destination". Recommend using "source" and "destination".
--
PaulHarrison
Yep, source and destination are probably better. As long as we make it clear that these are internal locations, not references to external locations.
--
DaveMorris - 16 Jun 2006
Votes
Rename bulk data transfer operations
- I still find the data transfer operations confusing - I think that the basic problem is that the push and pull verbs are opposite in meaning and viewed from opposite perspectives. I have a new set of proposals
| old name | new name |
| pushDataToVoSpace | importDataClientPush |
| pullDataToVoSpace | importDataServerPull |
| pushDataFromVoSpace | exportDataServerPush |
| pullDataFromVoSpace | exportDataClientPull |
which I think are better because
- the overall objective of the operation is the first verb
- the active party in the transfer is identified
- the direction of the transfer is then related to the active party.
--
PaulHarrison - 13 Jun 2006
This is a change to the specification not the WSDL, move this to the specification change page.
--
DaveMorris - 16 Jun 2006
I find the newly-proposed names more confusing. It's not obvious to me whether "import" moves data into VOSpace or into the client of VOSpace; similarly with "export".
--
GuyRixon - 04 Jul 2006
Votes
New GetPropertyKeys operation
This proposal is
basically because I am still a little worried about interoperability
problems with the completely untyped nature of the property-key pairs
- particularly as they are expected to carry some fundamental
metadata about the data objects in the current implementation. This
call would return the complete list of key names that have been used
in the VOSpace, which would then allow clients to attempt to be
consistent in the use of key names - it is not much but at least it
does provide a mechanism to voluntarily avoid complete anarchy.
Should return a list of the keys with an indication of which ones relate to read-only properties.
--
PaulHarrison - 13 Jun 2006
I agree with adding the method.
However, this should be changed in the specification first, not the WSDL. Move this to the specification change page.
--
DaveMorris - 16 Jun 2006
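A possible shape for the proposed operation, sketched under the assumption that the service can enumerate the properties of every node and knows which keys are read-only. The function name, the argument layout and the `(key, is_read_only)` return format are illustrative only.

```python
def get_property_keys(nodes, read_only):
    """Return every property key used in the space, flagged if read-only.

    nodes: mapping of node identifier -> property dictionary (assumed layout).
    read_only: set of keys the server treats as read-only.
    """
    keys = set()
    for properties in nodes.values():
        keys.update(properties)
    # Sorted so that clients see a stable listing.
    return [(key, key in read_only) for key in sorted(keys)]

nodes = {
    "a.fits": {"vos.Owner": "paul", "vos.Size": 1024},
    "b.fits": {"vos.Owner": "dave", "vos.MimeType": "image/fits"},
}
print(get_property_keys(nodes, read_only={"vos.Size"}))
```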
Votes
Re-assess what metadata should be returned with various faults
Some faults need to return metadata beyond simply their type to convey meaningful information about exactly what has gone wrong. e.g. a setDataNodeProperties call could specify several read-only properties amongst a larger number of properties, and the client would not know which properties were in error without a list being returned in the fault.
Votes
Re-assess the semantics of creating a server named data object.
see discussion on mailing list for more background
http://www.ivoa.net/forum/vospace/0606/0095.htm
Votes
Rename Status (of Node object) to TransferStatus
The Status member of the Node object is really referring to the transfer status.
Votes
Consider interactions when two clients attempt operations on same data object
use case
client 2 attempts to delete a file that client 1 is currently downloading via a "pull" transfer.
As specified at the moment, client 1 could have a data transfer cut off without any real knowledge of why, or client 2 could receive a PermissionDenied fault.
Possible solutions
- add a DataInUse fault to the destructive manipulation methods
- add a dataBeingRead to the list of Status values
This use case is actually quite an implementation challenge....involves close interaction with the states of the actual data streams for the transports.
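The first possible solution could be sketched as a reader count guarded by a lock. This is a toy model under stated assumptions: the class, its method names and the `DataInUse` exception are hypothetical, and a real service would have to tie `begin_read` and `end_read` to the actual transport streams, which is the hard part noted above.

```python
import threading

class DataInUse(Exception):
    """Hypothetical fault for destructive operations on data being read."""

class DataNode:
    def __init__(self):
        self._lock = threading.Lock()
        self._readers = 0
        self.deleted = False

    def begin_read(self):
        # Called when a "pull" transfer starts streaming the data.
        with self._lock:
            if self.deleted:
                raise LookupError("node no longer exists")
            self._readers += 1

    def end_read(self):
        # Called when the transfer finishes or is abandoned.
        with self._lock:
            self._readers -= 1

    def delete(self):
        with self._lock:
            if self._readers > 0:
                # Client 2 gets an explicit reason instead of PermissionDenied.
                raise DataInUse("data object is currently being read")
            self.deleted = True
```

In this model client 1's download is never cut off: client 2's delete fails with DataInUse until the transfer has ended, and can then be retried.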
Votes
Expand "expired" Node status to include "failed"
Perhaps "failed" is a better name; it can then encompass the case where a partial data transfer has occurred and failed half way, for some reason other than expiry.
Votes
Different type of list operation that returns only the uris...
Efficiency...
Votes
Confusing that pullDataToVoSpace, pushDataFromVoSpace return transfer object.
The transfer object contains a location element which is intended to be the uri that pullDataFromVoSpace and pushDataToVoSpace supply as an output parameter, so it is confusing that pullDataToVoSpace, pushDataFromVoSpace return a uri for a transfer when for those methods the uri for transfer is an input parameter.
Recommend that the pullDataToVoSpace, pushDataFromVoSpace return only what is necessary for the client to know - this might mean removing the transfer object from the return list, or refactoring the transfer object - global analysis needed.
Votes
Need to describe how VOSpace should be registered
The registration of VOSpace servers will be crucial to their operation - we need some discussion of this and preferably a VOResource extension schema to accompany it.
PaulHarrison - 27 Jun 2006
Votes