DatasetMetadata model version 1.0

Goal

All IVOA datasets must contain a common set of metadata elements to facilitate their registration, discovery, and interoperability. To date, individual IVOA data models have each defined this metadata independently in separate documents, which has led to inconsistencies between models as well as document bloat. This document describes a high-level model of generic dataset metadata for the IVOA. The descriptions of many elements of this model result from reviewing and combining those contained in the ObsCore (1.0), Spectral (2.0), and Characterisation (1.33) models. As such, it provides a uniform, consistent set of descriptions capable of supporting the use cases covered by those models.

Participants

MarkCresitelloDittmar (editor); FrancoisBonnarel, OmarLaurino, GerardLemson, MireilleLouys, ArnoldRots, DougTody

Use Cases

As a high-level model, this model will support, either directly or indirectly, a wide range of use cases. It is being developed in conjunction with the NDCube data model, with the following primary goals:

1) Provide the framework model behind ObsCore, thereby indirectly supporting the use cases of that document and the ObsTap access protocol. To clarify, ObsCore describes a flattened view of a fuller Observation/ObsDataset model, including only those portions that are necessary for interoperability and useful in support of the ObsTap access protocol. This document provides the model for datasets derived from observations. By being compatible with ObsCore, we create a consistent path from datasets to the access protocol(s) used to discover them.

2) Provide the high-level dataset metadata model in support of the NDCube model, which describes generic N-dimensional datasets, ranging from pixelated images to sparse hypercubes. The model must be generic and extensible so that it can serve the same purpose for a wide variety of datasets.

3) Support SIAv2.1

Requirements

  • Structure
    • The model shall be vo-dml compliant, producing a validated vo-dml XML description.
    • shall produce documentation in standard pdf format, and as vo-dml HTML
    • shall re-use, or refer to, dependent models. Additional descriptive or context specific information may be included in the specification, but the element definitions and core descriptions remain with the originating document.
  • Scope
    • shall define high level dataset metadata suitable for re-use by specific dataset models. Specifically, it shall immediately support the NDCube model requirements.
    • dataset metadata shall be consistent in form and content with the VO Resource Metadata document but, to facilitate reconciliation with existing usage, is not required to conform strictly to its structure.
    • this model shall define a placeholder Observation model and Observation Dataset specification containing additional metadata related to datasets derived from an Observation and associated with the dataset via Provenance. Examples include elements such as the Target of the observation or Instrument taking the data. The scope of this metadata shall be limited to the set required to support existing related models, namely ObsCore and Spectrum.
    • the model shall reconcile, as much as possible, the existing representations in the ObsCore, Spectral and Characterisation models to facilitate migration of IVOA models to a common framework.
  • Content
    • High level dataset metadata should primarily relate to the identification of a specific dataset in a curated resource (e.g. archive).
    • This includes static metadata, typically assigned by the creator
      • identifies the type of dataset
      • title, version, etc
      • who/what created it, and when
      • standard/global identifiers associated with the dataset.
    • and metadata assigned by the curating resource or repository
      • identifies the curator and provides authority metadata
      • access privileges
      • local identifier associated with the dataset to locate it within the repository
    • Observation metadata provides specific metadata related to the content of the dataset, and how it was generated
      • identify the observation
      • identify the target of the observation
      • identify the facility which performed the observation and the instrument used to collect the data
    • ObservationDataset provides additional metadata related to the content of the dataset
      • coordinate system specifications
      • general domain characterisation, giving coverage information in all physical domain spaces
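
The content requirements above can be pictured as a small class hierarchy. The following is a minimal Python sketch, not the normative model: all class and field names are illustrative assumptions chosen to mirror the bullets above, and they are not taken from the vo-dml specification of this model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataID:
    """Static metadata typically assigned by the creator (names are assumptions)."""
    title: str
    dataset_type: str                   # e.g. "image" or "spectrum"
    creator: Optional[str] = None       # who/what created the dataset
    created: Optional[str] = None       # creation date as an ISO 8601 string
    version: Optional[str] = None
    creator_did: Optional[str] = None   # global identifier from the creator


@dataclass
class Curation:
    """Metadata assigned by the curating resource or repository."""
    publisher: str                      # the curator
    publisher_id: Optional[str] = None  # authority metadata
    rights: Optional[str] = None        # access privileges
    local_id: Optional[str] = None      # locates the dataset in the repository


@dataclass
class Dataset:
    """High-level metadata common to every dataset."""
    data_id: DataID
    curation: Curation


@dataclass
class Observation:
    """How the data were generated."""
    observation_id: str
    target_name: Optional[str] = None
    facility: Optional[str] = None
    instrument: Optional[str] = None


@dataclass
class ObsDataset(Dataset):
    """A dataset derived from an observation."""
    observation: Optional[Observation] = None
    coord_systems: List[str] = field(default_factory=list)
    # Coverage per physical domain, e.g. {"spatial": "...", "time": "..."}
    characterisation: Dict[str, str] = field(default_factory=dict)


# Example instance; every value here is made up for illustration.
example = ObsDataset(
    data_id=DataID(title="Example image", dataset_type="image"),
    curation=Curation(publisher="Example Archive", rights="public"),
    observation=Observation(observation_id="obs-001", target_name="M31"),
)
```

Keeping the generic Dataset separate from ObsDataset reflects the split in the requirements: the generic part identifies a dataset within a curated resource, while the observation-specific part is only present for datasets derived from an observation.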

Documents

Latest document:

The latest released document can be found on the IVOA Documents and Standards page.

Volute:

The volute repository holds all revisions of the document, as well as the source open office document, diagrams, etc. here

UML Model

The model is being developed in UML, using Modelio-3.0. The model project can be obtained as a zip file in the volute repository, here.

We also provide an export of the UML specification in XMI format (version 2.4.1), which is compatible with the vo-dml xslt scripts for generating the vo-dml XML representation.

Discussion Topics

This model grew out of earlier work on the Spectral-2.0 and Image-1.0 models.

The bulk of the earlier discussions defining the separation of metadata components into Dataset, Observation, and ObsDataset took place at that time and is described on the ImageDM page.

Since that time, focus has concentrated on the areas of vo-dml compliance and the generation of the STC2 model.

More recent WG topics include:

  1. Clarification of various Dataset IDs: March 2015
  2. Working draft review comments: Aug 2015 - no issues, comments folded into document
  3. ResolvedTarget proposal for EPNCore-2.0: Oct 2015 - not incorporated as outside current model scope. Proposed object is a compatible extension of the Target elements and can be defined in EPNCore until the scope of this document is expanded.
  4. vo-dml multiplicity constraints: Dec 2015 - resolved in Feb. focus meeting. Model updated to remove instances as recommended; added Party model
  5. vo-dml enumeration specification: Jan 2016 - resolved in Feb. focus meeting. Modified enumeration description for non-compliant literal values.
  6. Summary of Bandpass element discussion: Mar 2016 - tweaked ObsConfig elements to better reflect intent of the fields
  7. Updated document feedback: Mar 2016 - minor rollbacks, moved ObsDataset into Observation package
  8. Request to add Identifier object: Apr 2016 - OPEN

Implementations

Validators

The requirements for validating a data model are still unclear. At this stage, we will provide the following:

  • VODML compliance validation via xslt script evaluation of the vodml-xml model.
  • Sample serializations generated by model-aware software. This will show the model is serializable and implementable.

User Implementations

  • Provide a mapping of ObsCore elements to the Dataset model. This is not an implementation per se, but illustrates that the requirements of use case 1 have been met. Indirectly, it could be interpreted to show that any ObsTap response can be cast to a Dataset instance.
  • Serialization of ObsDataset using the VODML Mapping specification currently in WD stage. This example will serve prototype efforts for server/parser interoperability using this specification.
  • Serialization of ObsDataset using the 'utype' convention. Content will match the above VODML Mapping example. No formal set of utypes is defined for this component model, so this serialization will define an arbitrary set of utype strings. The primary purpose here is to have an example, but it can also serve to aid the prototyping of transition plans.
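
As a rough illustration of the first and last items above, the sketch below casts a flat ObsCore-style row into a nested Dataset-like structure and then flattens it again into utype-like strings. The column names are ObsCore columns; the model paths, the `ds:dataset` prefix, and the sample values are assumptions made up for this example, since no formal utypes are defined for this model.

```python
from typing import Any, Dict

# ObsCore column -> hypothetical path in the Dataset model.
# Only the left-hand column names come from ObsCore; the paths are assumed.
OBSCORE_TO_DATASET = {
    "dataproduct_type": "dataProductType",
    "obs_collection": "dataID.collection",
    "obs_publisher_did": "curation.publisherDID",
    "obs_id": "dataID.observationID",
    "target_name": "target.name",
    "facility_name": "obsConfig.facility.name",
    "instrument_name": "obsConfig.instrument.name",
}


def to_dataset(row: Dict[str, Any]) -> Dict[str, Any]:
    """Build a nested dict from a flat ObsCore-style row."""
    dataset: Dict[str, Any] = {}
    for column, path in OBSCORE_TO_DATASET.items():
        if row.get(column) is None:
            continue  # skip columns that are absent or null
        *parents, leaf = path.split(".")
        node = dataset
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = row[column]
    return dataset


def to_utypes(node: Dict[str, Any], prefix: str = "ds:dataset") -> Dict[str, Any]:
    """Flatten a nested dict into utype-like 'prefix.path' -> value pairs."""
    flat: Dict[str, Any] = {}
    for key, value in node.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            flat.update(to_utypes(value, name))
        else:
            flat[name] = value
    return flat


# Sample row with made-up values.
row = {
    "dataproduct_type": "image",
    "obs_id": "obs-001",
    "target_name": "M31",
    "facility_name": None,
}
dataset = to_dataset(row)
utypes = to_utypes(dataset)
```

Here `dataset["target"]["name"]` and `utypes["ds:dataset.target.name"]` both hold the target name from the row, while the null `facility_name` column is simply omitted from both structures.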

Interoperable Implementations

Dataset is a component model; as such, it cannot be directly implemented in a science use case. Instead, it will be used in conjunction with other models to satisfy the requirements of a use case. The following are examples of threads/cases which the model is intended to support.

  • To be used in the SIAv2.1 thread for discovery and transport of Image data... details TBD
  • To be used in the generation of NDCube instances: NDImage - USVOA; Event - USVOA.Chandra
Topic revision: r4 - 2017-09-27 - MarkCresitelloDittmar
 