
Discussion of the VOSpace 1.1 features

This is a discussion page for VOSpace-1.1 service features.

Version 1.1 aims to extend the VOSpace-1.0 specification to include links and containers. The proposed mechanism is that we introduce two new node types:

  • LinkNode - this is like a Node but also has a URI to where the link is pointing
  • ContainerNode - this cannot hold any data (no bytes) but can have children nodes (of any type) and views for container level formatting (aggregate zip/gzip).
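As a rough sketch only, the two proposed node types could be modelled along the following lines. The class and field names (target, children, views) are assumptions made for illustration and are not taken from the VOSpace schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Base node: identified by a URI, with properties keyed by URI."""
    uri: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class LinkNode(Node):
    """Like a Node, but also records the URI that the link points to."""
    target: str = ""  # hypothetical field name


@dataclass
class ContainerNode(Node):
    """Holds no bytes itself, but may have child nodes of any type."""
    children: List[Node] = field(default_factory=list)
    views: List[str] = field(default_factory=list)  # e.g. aggregate zip/gzip views


# Build a small tree: one container holding a plain node and a link.
root = ContainerNode(uri="vos://service/path")
root.children.append(Node(uri="vos://service/path/data.xml"))
root.children.append(LinkNode(uri="vos://service/path/link", target="vos://other/data"))
```

Note that nothing in this sketch stops a container from containing another container, which matches the "children nodes (of any type)" wording above.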

At the May 2007 interop, we identified the following items as requiring resolution for VOSpace 1.1:

  • Container level metadata - how to distinguish metadata that applies to the contents of the container (through inheritance) from metadata that applies to the container itself
  • Generated names - vos://null does not work for containers so use ".auto" or "/" as an alternative
  • Typing of protocol and view parameters - these are currently designated as "string" but normal parameters are "URI"
  • ACL - although this forms part of a wider SSO context, should VOSpace have some notion of ACL control ?
  • Find - an equivalent to the Unix find command is desired


This page is where we can post proposals and where interested parties can discuss the changes.

Please add your suggestion to this page and vote on other suggestions :

  • +1 if you agree
  • 0 if you think the proposal is useful but needs more work
  • -1 if you disagree with the proposal

If you register a 0 or -1 vote, then please add a link to a page outlining your objections or comments.


Logical storage units

A request has come from our friends at SDSC to include references to the actual storage units that data is being deposited on. The use case is data replication: for example, I want to move/copy a data object from a slow tape archive to an ultrafast disk where both hardware units are within the same VOSpace, or I want to retrieve a data object from the ultrafast disk copy and not the slow tape one.

I think that we can incorporate this easily into our existing data model. We will refer to hardware units as logical storage units with the implication that they are identified via a logical identifier (URI) that is set by the particular VOSpace implementation. To get the list of available storage units from a VOSpace, we will need a method: getLogicalStorageUnits() which will return a list of URIs. These URIs may be resolvable to a description of the storage unit.

The logical storage unit identifier will be an optional argument in the entity so that as part of the data transfer negotiation, the user can specify a list of storage units that they want the data transferred to/from. The identifier will also be an optional argument in the entity so that specific hardware can be targeted in moving and copying data. (MatthewGraham - 13 Aug 2007)


I'm not sure there is a strong science use case for this. Turn your example round the other way, and what is the science use case for explicitly wanting to get the data from the slow tape store rather than the fast disk store ?

Adding references to the storage units will add a whole load of complexity to VOSpace, that is already handled by other tools and services. As soon as we start to deal with things like replication, we will need to define the expected behaviour in a lot more detail than just simply adding references to logical storage units.

Some of the questions that we would need to answer (not a complete list) :

  • If the data for a node is stored on more than one storage unit and I change the data on one unit, are the changes reflected in the other 'copy' ?
  • How does this affect something like tabular data stored in a StructuredData node ?
    • Can the data for a StructuredData node be stored in a database table and as a file on disk at the same time ?
    • If so, what kind of validation is applied when I import data to the disk copy ?
    • If I run a SQL statement that modifies the database table, are the changes replicated to the copy on disk ?

These are all solvable; in fact they have all been solved by systems such as SRB and iRODS. In which case, why try to re-invent the wheel ? If we try to solve these issues in VOSpace, then I am concerned that we will end up doing one of two things.

  1. We base our solutions on how SRB and iRODS have solved the problems.
    • In which case we are effectively saying "a VOSpace service must handle replication the same way that SRB does".
    • This would make it much more difficult to implement a VOSpace service that uses an alternative replication mechanism.
  2. We come up with our own solutions that behave slightly differently to the way that SRB and iRODS have solved the problems.
    • This would make it much more difficult to implement a VOSpace based on SRB and iRODS.

-- DaveMorris - 25 Sep 2007

Votes

name vote comment
MatthewGraham +1 proposer
DaveMorris -1 I'd rather handle this in a separate interface

Alternative service interfaces

In reference to the above suggestion of adding references to logical storage units to support data replication: why attempt to re-invent the wheel ?

If a VOSpace service is based on a SRB or iRODS system, then provide a way for the user to access the SRB or iRODS service interface directly.

If a VOSpace service uses a different replication mechanism, then provide a way for the user to control the replication using that mechanism instead.

The suggestion is that we add a list of alternative service interfaces for accessing the node. These can either be added to the existing provides list, or put in a specific list of alternative service capabilities.

Take the specific example of data replication using SRB or iRODS.

If we define a URI that means 'access the data using the iRODS service interface', then a VOSpace service that is based on an SRB or iRODS server can add the iRODS service interface to the provides list for a node.

    <node uri="vos://xxxx">
        ....
        <provides>
            ....
            <!-- Access the data using an iRODS service -->
            <view uri="ivo://irods.sdsc.edu/interface/irods-v0.9">
                <endpoint>.....</endpoint>
            </view>
        </provides>
    </node>

In effect the VOSpace service is saying, "the data replication for this node can be handled using the iRODS service API at [endpoint]".

In order to implement this, we need to allow the view element to contain an endpoint URL.

A slight tweak to the VOSpace provides and view elements, and we get access to all of the iRODS service API for free.
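From the client side, picking up the alternative interface could be as simple as the sketch below. It assumes the proposed view | endpoint syntax with no XML namespaces, and the endpoint URL in the sample document is a made-up placeholder, since the example above leaves it elided.

```python
import xml.etree.ElementTree as ET

# Sample node description using the proposed syntax.
# The endpoint URL is a hypothetical placeholder.
NODE_XML = """
<node uri="vos://xxxx">
    <provides>
        <view uri="ivo://irods.sdsc.edu/interface/irods-v0.9">
            <endpoint>http://host/irods</endpoint>
        </view>
    </provides>
</node>
"""


def find_endpoints(node_xml, view_uri):
    """Return the endpoint URLs listed under <provides> for the given view URI."""
    root = ET.fromstring(node_xml)
    endpoints = []
    for view in root.findall("./provides/view"):
        if view.get("uri") == view_uri:
            for endpoint in view.findall("endpoint"):
                if endpoint.text:
                    endpoints.append(endpoint.text.strip())
    return endpoints


irods_endpoints = find_endpoints(NODE_XML, "ivo://irods.sdsc.edu/interface/irods-v0.9")
```

A client that does not recognise the view URI simply gets an empty list back and carries on using the standard VOSpace transfer methods.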

Votes

name vote comment
DaveMorris +1 proposer

Data access using DAL services

The same view | endpoint syntax would also enable us to provide data access using other IVOA services.

If we had a container node that contained images, then the following example says that the images in the container can also be accessed using a DAL SIAP service.

    <node uri="vos://xxxx">
        ....
        <provides>
            <!--+
                | Provide access to the images in this container using a SIAP service.
                +-->
            <view uri="ivo://net.ivoa.dal/core/siap-1.0">
                <endpoint>http://host/service</endpoint>
            </view>
        </provides>

    </node>

This would enable an astronomer to create a container in vospace, drop some images into it, and then query the set of images using the SIAP interface.

Votes

name vote comment
DaveMorris +1 proposer

ContainerNode

From the introduction above : this cannot hold any data (no bytes) but can have .... views for container level formatting (aggregate zip/gzip)

So, we have :

  • A ContainerNode may have child nodes
  • A ContainerNode cannot hold any data
  • A ContainerNode may have a list of views for accessing aggregated data.

The 'no data' part is (backend) storage specific, and should not be part of the external interface. The specification should define what an external actor sees, not the internal implementation details.

Note - if it has a list of accepts and provides views, then to an external Actor a ContainerNode behaves the same way as a DataNode, and does indeed appear to handle data.

So the definition becomes :

  • A ContainerNode may have child nodes
  • A ContainerNode may have a list of views for accessing aggregated data.

However, I don't see the need to specify what type of data the views may or may not provide. A view that provided additional DublinCore metadata about the container itself is perfectly valid, but would be excluded by the 'aggregated data' clause.

So the definition simply becomes :

  • A ContainerNode may have child nodes
  • A ContainerNode may have a list of views.

Votes

name vote comment
DaveMorris +1 proposer

Separation of views and formats

I think it may be useful to re-visit how we represent views and data formats.

In the current schema, we attempt to combine the two concepts in one view URI. We can represent either what the object is or how the data is formatted. However, I don't think the combined form is capable of representing both concepts clearly enough.

The proposal is that we separate the two concepts, by representing the data format using a separate XML element and URI.

    <node uri="vos://xxxx">
        ....
        <accepts>
            <!--+
                | This node accepts images.
                +-->
            <view uri="ivo://vospace/core/view/image">
                <!--+
                    | Formatted as PNG or FITS files.
                    +-->
                <format uri="ivo://vospace/core/format/image-png"/>
                <format uri="ivo://vospace/core/format/image-fits"/>
            </view>
        </accepts>
        ....
    </node>

Votes

name vote comment
DaveMorris +1 proposer

Container views

If we registered a view URI that means "access the container as an aggregate" then we would be able to describe views of a container as an archive file.

In this example, the server is saying it can accept a zip or tar archive, and will unpack the file to create the container contents on the server.

    <node uri="vos://xxxx">
        ....
        <accepts>
            <!--+
                | This (container) node can accept an archive file.
                +-->
            <view uri="ivo://vospace/core/view/container-archive">
                <!--+
                    | Formatted as a ZIP or TAR file.
                    +-->
                <format uri="ivo://vospace/core/format/archive-zip"/>
                <format uri="ivo://vospace/core/format/archive-tar"/>
            </view>
        </accepts>
        ....
    </node>

In this example, the server can provide a zip or tar archive of the container contents.

    <node uri="vos://xxxx">
        ....
        <provides>
            <!--+
                | This (container) node can provide an archive file of the contents.
                +-->
            <view uri="ivo://vospace/core/view/container-archive">
                <!--+
                    | Formatted as a ZIP or TAR file.
                    +-->
                <format uri="ivo://vospace/core/format/archive-zip"/>
                <format uri="ivo://vospace/core/format/archive-tar"/>
            </view>
        </provides>
        ....
    </node>
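To illustrate the accepts case, here is a minimal sketch of what "unpack the file to create the container contents" might amount to on the server side for a ZIP archive. The function name and the rule for building child URIs are assumptions for illustration; a real service would also have to deal with nested directories, name clashes and so on.

```python
import io
import zipfile


def unpack_archive(container_uri, archive_bytes):
    """Return the child node URIs that unpacking a ZIP archive would create."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [
            container_uri.rstrip("/") + "/" + name
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        ]


# Build a small in-memory archive to stand in for an uploaded file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("image-001.fits", b"placeholder")
    archive.writestr("image-002.png", b"placeholder")

children = unpack_archive("vos://xxxx", buffer.getvalue())
```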

Nested views and formats

If we allow archive formats to contain nested views and formats, then we can specify the type of files within the archive.

In this example, the server can provide a zip archive containing FITS or PNG image files.

    <node uri="vos://xxxx">
        ....
        <provides>
            <!--+
                | This (container) node can provide an archive file of the contents.
                +-->
            <view uri="ivo://vospace/core/view/container-archive">
                <!--+
                    | Formatted as a ZIP file.
                    +-->
                <format uri="ivo://vospace/core/format/archive-zip">
                    <!--+
                        | Containing images.
                        +-->
                    <view uri="ivo://vospace/core/view/image">
                        <!--+
                            | Formatted as PNG or FITS files.
                            +-->
                        <format uri="ivo://vospace/core/format/image-png"/>
                        <format uri="ivo://vospace/core/format/image-fits"/>
                    </view> 
                </format>
            </view>
        </provides>
        ....
    </node>
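Since views and formats can nest to any depth under this proposal, a client would need to walk the structure recursively. The following is a sketch only, assuming the element names shown in the example above and no XML namespaces.

```python
import xml.etree.ElementTree as ET

PROVIDES_XML = """
<provides>
    <view uri="ivo://vospace/core/view/container-archive">
        <format uri="ivo://vospace/core/format/archive-zip">
            <view uri="ivo://vospace/core/view/image">
                <format uri="ivo://vospace/core/format/image-png"/>
                <format uri="ivo://vospace/core/format/image-fits"/>
            </view>
        </format>
    </view>
</provides>
"""


def describe(element, depth=0):
    """Return (depth, tag, uri) tuples for every nested view and format."""
    rows = []
    for child in element:
        if child.tag in ("view", "format"):
            rows.append((depth, child.tag, child.get("uri")))
            rows.extend(describe(child, depth + 1))
    return rows


rows = describe(ET.fromstring(PROVIDES_XML))
for depth, tag, uri in rows:
    # Indent each line by its nesting depth.
    print("  " * depth + tag + ": " + uri)
```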

Votes

name vote comment
DaveMorris +1 proposer

Generated filenames

At the last IVOA meeting, we discussed service generated names: the fact that vos://null does not work for containers, and the suggestion to use ".auto" or "/" as an alternative.

As far as I can remember, the decision at the time was to use a trailing "/" on the destination filename. I would like to raise one possible issue.

If we have a command line client then, based on the conventional behavior for most command line applications, the user would probably expect the following command

    cp data.xml vos://service/path/

to create a file called vos://service/path/data.xml in the vospace, rather than generate a new filename for it such as vos://service/path/5116-8621
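
The two readings of a trailing "/" can be put side by side in a short sketch. Both function names are made up for illustration, and the name generator is a stand-in for whatever scheme a service would actually use.

```python
import posixpath
import uuid


def resolve_cp_style(source_path, destination):
    """Trailing "/" means: put the file in this container, keeping its name."""
    if destination.endswith("/"):
        return destination + posixpath.basename(source_path)
    return destination


def resolve_generated(source_path, destination, make_name=lambda: uuid.uuid4().hex[:8]):
    """Trailing "/" means: let the service generate a name in this container."""
    if destination.endswith("/"):
        return destination + make_name()
    return destination


cp_result = resolve_cp_style("data.xml", "vos://service/path/")
gen_result = resolve_generated(
    "data.xml", "vos://service/path/", make_name=lambda: "5116-8621"
)
```

With the same command line, the first reading gives vos://service/path/data.xml and the second gives vos://service/path/5116-8621, which is the source of the possible confusion.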

-- DaveMorris - 26 Sep 2007

Protocol and view parameters

In the current schema, Protocol and View parameters are referred to by name :
    <view uri="ivo://net.ivoa.vospace/views#example">
        <param name="user-dn">cn=xx,dc=yy,dc=zz</param>
    </view>

In addition, the recommended practice is to register a set of View or Protocol definitions in the same registry document, using the # fragment identifier to select a specific View within the document.

Having used the # fragment identifier to select the View, this leaves us with a minor problem of how to refer to a specific parameter.

In addition, looking forward to the possibility of refactoring VOSpace as a REST service in the future, it may be useful to model a View using the same syntax as a Node with Property(s). At the moment, this would not be possible, as a Node Property is identified by a URI, but a View parameter is identified by name.

The proposal is to change the schema to use the same syntax as Node Properties for both View and Protocol Parameters. So the above example becomes :

    <view uri="ivo://net.ivoa.vospace/views#example">
        <property uri="ivo://net.ivoa.vospace/property#user-dn">cn=xx,dc=yy,dc=zz</property>
    </view>
This changes the View param element into a standard property element.

Defining the parameters as properties, outside the scope of a specific View or Protocol allows us to use the same definition for common properties that occur in different Views or Protocols, without having to re-define them each time.
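
One practical consequence is that a client could read View, Protocol and Node properties with the same code. A minimal sketch, assuming the proposed property syntax and no XML namespaces (the helper name is made up for illustration):

```python
import xml.etree.ElementTree as ET

VIEW_XML = """
<view uri="ivo://net.ivoa.vospace/views#example">
    <property uri="ivo://net.ivoa.vospace/property#user-dn">cn=xx,dc=yy,dc=zz</property>
</view>
"""


def read_properties(element):
    """Map property URI to value for any element with <property> children."""
    return {prop.get("uri"): (prop.text or "") for prop in element.findall("property")}


view = ET.fromstring(VIEW_XML)
view_properties = read_properties(view)
```

The same read_properties call would work unchanged on a node or protocol element, because the lookup key is the property URI and not a name local to one View.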

Votes

name vote comment
DaveMorris +1 proposer

Core set of properties, protocols and views

Part of the GWS session at the interop meeting will discuss what should be included in the core set of vospace properties, protocols and views.

Discussion page for the core list is here

-- DaveMorris - 27 Sep 2007


Topic revision: r13 - 2008-01-29 - MatthewGraham
 