MeasRFC2 < IVOA

IVOA Web>IvoaDataModel>MeasRFC2 (2022-07-21, MarkCresitelloDittmar)

STC2:Meas Proposed Recommendation: Request for Comments

NOTICE : This RFC page replaces RFC#1

Rationale for a second RFC round:

Many comments have been collected after RFC #1. Some of them just required text improvements but some others implied significant model changes, especially for Coords.
A global RFC answer has been sent to WG mailing list in February 2020: answer
These data models should support the description of datasets and DAL responses, by defining fundamental elements which are commonly used in many contexts. The intent is that they be imported as components in these more complex models so that they all build from the same basis, thereby enhancing interoperability.
They should NOT be expected to fully support any use-cases outside of the described set. For example, they cannot currently support the complex error types found in various Catalogs. These use-cases are to be considered in future updates to the models.
Measurements cannot be used without Coords, because the latter is imported by Measurements; Coords, however, can be used in other contexts, e.g. Transform.
The nature of the changes in Coords, which in turn affect Measurements, led the TCG to decide on 2020-10-08 to hold a second RFC round for both models.

Summary

Version 1 of STC was developed in 2007, prior to the development and adoption of vo-dml modeling practices. As we progress to the development of vo-dml compliant component models, it is necessary to revisit those models which define core content. Additionally, the scope of the STC-1.0 model is very broad, making a complete implementation, and the development of validators, very difficult. As such, it may be prudent to break the content of STC-1.0 into component models which, as a group, cover the scope of the original.

This effort will start from first principles with respect to defining a specific project use-case, from which requirements will be drawn, satisfied by the model, and implemented in the use-case. We will make use of the original model to ensure that the coverage of concepts is complete and that the models will be compatible. However, the form and structure may be quite different. This model will use vo-dml modeling practices, and model elements may be structured differently to more efficiently represent the concepts.

This model covers the description of measured or determined astronomical data, and includes the following concepts:

The association of the determined ’value’ with corresponding errors. In this model, the ’value’ is given by the various Coordinate types of the Coordinates data model (Rots and Cresitello-Dittmar et al., 2019).
A description of the Error model.

The latest version of the model and supporting documents:

RFC2 Document version (IVOA repository)
- PDF, schema, vodml/xml, html (hosted on volute)
Post-RFC2 Document version (IVOA repository)
- PDF, schema, vodml/xml, html
Development Version: (Git repository)
- Post RFC2 Tag

Implementation Requirements

(from DM Working group twiki):

The "IVOA Document Standards" standard has a broad outline of the implementation requirements for IVOA standards. These requirements fit higher level standards for applications and protocols better than they fit data models themselves. At the Oct 2017 interop in Trieste, the following implementation requirements for Data Model Standards were agreed upon; they allow the models to be vetted against their requirements and use cases, without needing full science use cases to be implemented.

VO-DML models must validate against schema
Serializations which touch each entity of the model. These serializations may be 'fake' (ie: not based on actual data files), and are to be produced by the modeler as unit tests/examples.
Real world serializations covering use cases, produced by others following the model, in a mutually agreed upon format.
Software which interprets these serializations and demonstrates proper interpretation of the content

Serializations:

Example serializations:
- Annotated VOTables:
  - all model elements as VOTable files annotated to the VODML Mapping Syntax ( WD:20170323), produced by Jovial software package.
    - coordinates model elements: here
      - includes xml, and jovial dsl files
    - measurement model elements: here
      - includes xml, and jovial dsl files
    - transform model elements: here
      - includes xml, and jovial dsl files
- Various Formats:
  - independent python code, generated example serializations spanning all elements of the models in 4 formats:
    - *.vot: VOTable-1.3 standard syntax
      - Validates using votlint
    - *.avot: VOTable-1.3 annotated with VO-DML/Mapping syntax
      - Validates using xmllint to a VOTable-1.3 schema enhanced with an imported VO-DML mapping syntax schema
    - *.xml: XML format
      - Validates against the model schema
    - *.xxx: An internal DOC format
      - XML/DOM structure representing the instances generated when interpreting the templates.
  - measurement model elements: here
  - coordinates model elements: here
Usage implementations include annotated serializations of the case data files
Cube: Example files for Cube (nD-Image and Sparse Cube) incorporate Measurement and Coordinate model instances

Software:

A detailed study was performed to determine the compatibility of the Meas/Coords data models to the AstroPy package, a popular Python package with intensive support for Space and Time coordinates.

The results of this comparison are described in the STC2 wiki page.

In addition, several software packages have been developed which generate/manipulate Coordinates model elements.

Jovial: A Java toolset that helps build and generate serializations for VODML compliant data models.
- Version used for DM workshop and example file generation: https://github.com/mcdittmar/jovial
- Original implementation by Omar Laurino: https://github.com/olaurino/jovial
Rama: Python package, parses annotation and instantiates instances of model classes. Includes adaptors to AstroPy classes.
- Version used for DM workshop: https://github.com/mcdittmar/rama
- Original implementation by Omar Laurino: https://github.com/olaurino/rama
- Understands 'VODML Mapping syntax' annotations ( WD:20170323)
ModelInstanceInVOT Code: Python package for processing annotated VOTables
- Implementation by Laurent Michel: https://github.com/ivoa/modelinstanceinvot-code
- Understands 'Merged syntax' annotations ( WD:in progress )
- A notebook shows how that code can be used.
TDIG: Working project of Time Series as Cube.
- An effort to enhance SPLAT to load/interpret/analyze TimeSeries data using data annotation
  - the tool was enhanced to use new annotations (eg: TIMESYS, UTypes) to identify and interpret the data automatically.
- Delays in settling on a standard annotation syntax have hindered progress on this project and kept it from fully realizing the possibilities. This is a high priority for upcoming work.
pyVO: extract_skycoord_from_votable()
- Demonstrated in Paris, this product of the hack-a-thon generates AstroPy SkyCoord instances from VOTables using various elements embedded in the VOTable.
  - Interrogates a VOTable, identifies key information and uses that to automatically generate instances of SkyCoord.
    - UCD: 'pos.eq.ra', 'pos.eq.dec'
    - COOSYS.system: "ICRS", "FK4", "FK5"
    - COOSYS.equinox
  - The COOSYS maps directly to SpaceFrame, and the value of the system
  - The UCD 'pos.eq' maps directly to meas:EquatorialPosition; with 'pos.eq.ra|dec' identifying the corresponding attributes (EquatorialPosition.ra|dec) as coordinates coords:Longitude and coords:Latitude.
  - This illustrates that even with minimal annotation, this sort of automatic discovery/instantiation can take place. With a defined annotation syntax, this utility could be expanded to generate other AstroPy objects very easily.
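
The discovery step described for extract_skycoord_from_votable() can be sketched roughly as follows. This is not the pyVO implementation: it is a minimal illustration using only the Python standard library, the sample VOTable and the function names (local_name, discover_position) are hypothetical, and it assumes that the first word of a FIELD's UCD identifies the column.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal VOTable used only for illustration.
SAMPLE_VOTABLE = """<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <COOSYS ID="sys" system="ICRS"/>
    <TABLE>
      <FIELD name="RA" datatype="double" unit="deg" ucd="pos.eq.ra;meta.main" ref="sys"/>
      <FIELD name="DEC" datatype="double" unit="deg" ucd="pos.eq.dec;meta.main" ref="sys"/>
      <FIELD name="Vmag" datatype="float" unit="mag" ucd="phot.mag;em.opt.V"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def discover_position(votable_xml):
    """Collect the frame (COOSYS) and the RA/Dec column names (by UCD)."""
    root = ET.fromstring(votable_xml)
    found = {"system": None, "equinox": None, "ra": None, "dec": None}
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "COOSYS" and found["system"] is None:
            # Only the first COOSYS is used in this sketch.
            found["system"] = elem.get("system")
            found["equinox"] = elem.get("equinox")
        elif name == "FIELD":
            # Assumption: the first UCD word identifies the column.
            primary = elem.get("ucd", "").split(";")[0].strip().lower()
            if primary == "pos.eq.ra":
                found["ra"] = elem.get("name")
            elif primary == "pos.eq.dec":
                found["dec"] = elem.get("name")
    return found


if __name__ == "__main__":
    print(discover_position(SAMPLE_VOTABLE))
```

The returned dictionary holds what a client would need in order to build a position object: the frame from COOSYS and the names of the columns carrying longitude and latitude.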

Validators

As noted above, the serializations may be validated to various degrees using the corresponding schema

VOTable-1.3 using votlint: verifies the serialization complies with VOTable syntax
VOTable-1.3 + VODML: verifies the serialization is properly annotated
XML using xmllint with model schema: verifies the serialization is a valid instance of the model.
NOTE: The modeler examples undergo all levels of validation, showing that the VOTable serializations are also valid instances of the model.
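
As a rough illustration of the three validation levels listed above, the following sketch only assembles the command lines; it does not run them. It assumes that votlint is invoked through STILTS and that xmllint is installed. The file stem and schema file names are placeholders, not the names used in the model repositories.

```python
import shlex


def votlint_command(votable_path):
    """STILTS votlint: checks that a file complies with VOTable syntax."""
    return ["stilts", "votlint", f"in={votable_path}"]


def xmllint_command(schema_path, instance_path):
    """xmllint: validates an XML instance against an XSD schema."""
    return ["xmllint", "--noout", "--schema", schema_path, instance_path]


def validation_plan(stem, annotated_votable_schema, model_schema):
    """One command per serialization format of a single example."""
    return [
        # *.vot: plain VOTable-1.3
        votlint_command(f"{stem}.vot"),
        # *.avot: VOTable-1.3 annotated with VO-DML mapping syntax
        xmllint_command(annotated_votable_schema, f"{stem}.avot"),
        # *.xml: instance validated against the model schema
        xmllint_command(model_schema, f"{stem}.xml"),
    ]


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    for command in validation_plan("example", "votable-vodml.xsd", "meas.xsd"):
        print(shlex.join(command))
```

Each command list can be passed to subprocess.run() on a system where the tools are available.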

I don't believe there are validators for the various software utilities. Their purpose is to show that given an agreed serialization which can be mapped to the model(s), the data can be interpreted in an accurate and useful manner.

Usage

In the period since the close of the RFC2 review, a great deal of effort has been made to illustrate the usability of the Meas/Coords models in the context of real world scenarios. Each has confirmed the usability of the data models, and illustrates how annotating data to models can facilitate interoperability.

These include:

Data Model Workshop - May 2021
- This Git repository contains original implementations from all participants.
DM Case Implementations
- This Git repository contains a set of use case implementations maintained to the current model suite.
  - each workshop usecase implemented using Jovial and Rama tools.
  - Jupyter notebook illustrates the case thread.
- Highlights
  - Proper Motion slider
  - Uniform TimeSeries extraction from multiple data providers
AstroPy Wrapper
- Using an AstroPy wrapper in the ModelInstanceInVOT code (see Software)
- This Git repository holds case implementations
  - Meas/Coords model elements are mapped in VOTable
  - parser interprets annotation to generate model instances, and converts them to SkyCoord instances.
  - Threads:
    - Extract positions, parallax and proper motions from ESAC archive; generate 3D plot of source positions
      - using direct Measurements model instances
      - using converted AstroPy SkyCoord instances
    - Identify annotated source positions and reconcile the coordinate frames.
    - Extract observation history of a source from ESAC XMM TAP archive, track source movement over 20 year period.
- Examples
  - https://github.com/ivoa/modelinstanceinvot-code/tree/merge-syntax/python/examples
- Notebook
  - https://mybinder.org/v2/gh/ivoa/modelinstanceinvot-code/merge-syntax
ADASS 2021 BoF - TAP and the Data Models
- This BoF discussed the possibility and benefits for TAP services to apply on-the-fly annotation of the query responses to serve not only the data, but real model instances.
- Annotated TAP responses can be consumed by software such as those described below, to interpret the content in terms of IVOA data models, greatly enhancing the interoperability of manipulating query responses from various services in science threads.
- Conclusions of the BoF include: "This session and the following discussions highlighted that TAP services can already serve hierarchical data and that serving legacy data with annotations or even Provenance instances is within our reach."
- Resources
  - ADASS programme overview
  - video recording of BoF
  - twiki page (presentations and notes)
  - proceedings on Astro-ph

Links with Coords

The Measurement model is heavily dependent on the Coordinates model (also in RFC) for its core elements. Information about its relation to the Coordinates model, and how the requirements are distributed, can be found on the STC2 page.

Comments from the IVOA Community during RFC/TCG review period: 2020-10-26 - 2020-12-07

Comments by Markus Demleitner

The introduction and points (1), (4), (7), (11) from my RFC 1 comments I'd still retain, and I essentially stand by the summary, which I'd update to:

I'd make the model a lot smaller, thus creating space we'll need once we tackle strict errors (i.e., explicit distributions) in earnest one day. We ought to have, perhaps, NaiveMeasurement (a pseudo-distribution saying "value is something like an expectation or perhaps a median, and error is something like a first moment or something like that"), and perhaps AsymmetricNaiveMeasure. I'm not even convinced we have credible use cases for statError and sysError. And if we want correlations, these should be explicit as relations between, I guess, measurements (or their parameters?).

The Measurement types included in this model are those needed to cover the basic properties of the data cube model: Space (position and velocity), Time, Polarization, and Generic (for most other physical properties). The error model covers the basic error patterns. We will, no doubt, see more complex error types in the future, so the model is designed to be extensible without necessarily affecting the other elements of the model (i.e., it is easy to do a minor version update to accommodate new content). -- MarkCresitelloDittmar - 2022-02-15

The way to exercise that is to have a few data centers annotate a representative part of their tables; I'm not saying they need to be able to precisely annotate everything -- if it's enough for the "automatically plot error bars" use case, I'd say that's a success on which we can build.

The DM workshop cases include examples where Symmetrical, Bounds, and Ellipse are used. The TimeSeries case in particular, plots data with their associated errors. -- MarkCresitelloDittmar - 2022-02-15

On the changes since PR1, the recent addition "The current model assumes Gaussian distributions with shapes defined at the 68% confidence level" we really shouldn't do -- it's a lot more than we can confidently claim about most of our data holdings. At this point, I think we can only say "what we're doing here are rough error bars". Going for actual distributions is something for when we know what we'll do with them, and an "automatic error propagation" use case, to me, is a bit ambitious when we can't even plot error bars at this time.

This text was specifically requested to be added to the model description in the PR1 comments. -- MarkCresitelloDittmar - 2022-02-15

Adding to point (4) of the original RFC comments -- that we shouldn't have Time, Position, Velocity, ProperMotion, and Polarization as separate classes, but instead distinguish physics by UCD as elsewhere in the VO -- people have said that's a serialisation issue; well, it's not in our current plans, where the DM types would directly sit on VOTable elements. This means that both UCD and DM type would do about the same thing. Let's avoid that.

We have added a 'ucd' element to the Measure class to allow users to specify the physical nature of the GenericMeasure. However, this does not negate the need/benefits of having specific classes for the Time, Position, etc. types which have additional associated metadata which are not available to the Generic type. It is also worth noting that any UCD included in the VOTable element would, generally, not be appropriate for usage as the GenericMeasure type specification since these often have more complicated expressions which relate to the role they play, or the frame. -- MarkCresitelloDittmar - 2022-02-15

Since I can't make out a use case for why something figuring out distributions would actually need to know physics, my preference would be to keep the whole notion outside of measurements in the first place. If such a use case were identified, I'd say adding a UCD attribute would be ok, with the understanding that some magic in our VOTable mapping would make that the UCD of the annotated FIELD, PARAM, or GROUP.

Modeling the physics in this model is the feature that will most improve our ability to interoperate. Without this, a Cube is just a series of Measured data. With it, we can see it has Time, 2 Positions, some form of Energy, etc. As stated above, the UCD associated with the FIELD is typically not going to be appropriate for the Measure. For example, a magnitude field may have 'phot.mag;em.opt.V', which includes information about the associated band. For the Measure, we simply want 'phot.mag' to identify the nature of the measure as a magnitude. The band information is found in associated model elements. -- MarkCresitelloDittmar - 2022-02-15

-- MarkusDemleitner - 2020-11-25
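
The magnitude example in the response above (a FIELD carrying 'phot.mag;em.opt.V', while the Measure only needs 'phot.mag') can be sketched as follows. This is an assumption-laden illustration, not a rule stated by the model: it simply takes the first word of a semicolon-separated UCD, and the function name primary_ucd_word is hypothetical.

```python
def primary_ucd_word(ucd):
    """Return the first word of a semicolon-separated UCD string.

    Assumption for this sketch: the first word describes the nature of
    the quantity, and later words qualify it (band, role, frame).
    """
    words = [word.strip() for word in ucd.split(";") if word.strip()]
    if not words:
        raise ValueError("empty UCD")
    return words[0]


if __name__ == "__main__":
    print(primary_ucd_word("phot.mag;em.opt.V"))
```

In this reading, the band information dropped from the UCD is expected to be carried by associated model elements, as the response describes.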

Comments from TCG member during the RFC/TCG Review Period: 2022-02-24 - 2022-04-10

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

TCG coordination is fine with this Proposed Recommendation because it has gone through a thorough review by the TCG and the community (twice), has answered the comments, is accompanied by a proper set of serializations, validators and tools, and updates a central component of the IVOA architecture using the VO-DML standardised approach.

The TCG review and vote have been closed; the document is fine to be pushed further along the REC path to Exec evaluation. This will happen as soon as the Coordinates DM reaches the same stage, given the direct dependency of the Measurements DM on it.

-- JanetEvans, MarcoMolinaro - 2022-07-15

Applications Working Group

very minor comments

Previous version: 0.x, if it led to nothing

from my validator in http://www.ivoa.net/xml/VODML/20180519/IVOA-v1.0.vo-dml.xml
tag
<identifier></identifier>
<uri> </uri>
are required before
<title>IVOA Reference Types Data Model ala VO-URP</title>

Response:

I have not seen this in my validation runs. However, this is identifying an error in the IVOA model, not in this one. The IVOA model vo-dml/XML file seems to be missing the uri node, which is required (and required to be before title as you say).

As for the previous version.. I'm not sure if you are suggesting this be removed/empty since there is no previous version?

-- MarkCresitelloDittmar - 2022-07-21
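
The ordering issue discussed above (a uri node that must be present, and must come before title) could be spot-checked with a few lines of standard-library Python. This is only a sketch and not a substitute for schema validation. The two XML fragments are simplified, hypothetical stand-ins for a vo-dml/XML model file, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragments; real vo-dml/XML files are larger
# and use a namespaced root element.
WITH_URI = """<model>
  <name>ivoa</name>
  <identifier/>
  <uri/>
  <title>IVOA Reference Types Data Model ala VO-URP</title>
</model>"""

WITHOUT_URI = """<model>
  <name>ivoa</name>
  <title>IVOA Reference Types Data Model ala VO-URP</title>
</model>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def uri_precedes_title(model_xml):
    """True if the root has both 'uri' and 'title' children, in that order."""
    root = ET.fromstring(model_xml)
    children = [local_name(child.tag) for child in root]
    if "uri" not in children or "title" not in children:
        return False
    return children.index("uri") < children.index("title")


if __name__ == "__main__":
    print(uri_precedes_title(WITH_URI), uri_precedes_title(WITHOUT_URI))
```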

Data Access Layer Working Group

A couple of minor comments only, the standard looks good otherwise

Acknowledgements
- Still has a TBD
2.1.1 - Errors (Uncertainties)
- Suggest a comma after "In multi-dimensional data" in first sentence

Response: These have been corrected with commit ( 4b81c39) in the model document repository.

-- MarkCresitelloDittmar - 2022-06-22

-- JamesDempsey - 2022-05-23

Data Model Working Group

Being closely involved in the different steps of the model construction, I have no specific remarks to make here.

-- LaurentMichel - 2022-05-19

Grid & Web Services Working Group

Given the answer to the Semantic WG, I do not have any particular comment on this standard.

-- GiulianoTaffoni - 2022-06-27

Registry Working Group

No particular comment on this standard. -- RenaudSavalle - 2022-07-05

Semantics Working Group

By reading the proposed recommendation I (Carlo writing) am not sure about the scope of the document. Does it concern only astro-objects (galaxies, planets, stars) or also phenomena? It is not clearly stated.

Response:

The Measurement model is a very generic base model, and may be used wherever the concepts apply.
In 'Context and Scope':

we state that the model is targeted primarily to representing Measurements in 'datasets, catalogs, and queries',
with focus on the stated use cases; which is to support N-Dimensional Cube data (images, sparse cubes, etc.).

It has also been exercised in the context of 'Catalogs' and annotating TAP responses, which is described above.

-- MarkCresitelloDittmar - 2022-05-18

For example: may I use this DM to describe a set of measurements made in the case of a laboratory astrophysics experiment? In this case all the coordinates part is useless and I will need something to describe the experimental setup (conditions, instruments). This part is missing.

Response:

I'm not sure I understand the case. The model should be valid for describing simulated results, since the model describes only the 'result' (i.e., the measured/determined value), and not how it was obtained (conditions, instruments, settings). That information would be stored in the Observation/Dataset metadata and/or with Provenance.

-- MarkCresitelloDittmar - 2022-05-18

Also if I assume that this DM is proposed only for phenomena observed up in the sky, how can we deal with measured spectral line (we have the case in the ongoing LineTAP working draft): we don’t care very much about the coordinates of the measured line, but we need to know the energy states and the emitting element. We cannot describe this measurement with this proposed DM model.

Response:

We do not expect this model to serve all possible cases; it is a Version 1. For example, the DM workshop had cases where MANGO is looking into how to support Quality flags. The error model is one area where we expect a lot of growth in the future as we support more complex cases. If LineTAP has a case which 'should' be supported by this model, we can take it as a project to fold into an update.

-- MarkCresitelloDittmar - 2022-05-18

Could you please explain what is the scope and the validity/limitation domain for this recommendation? What is this for?

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Radioastronomy Interest Group

I think I understand why radioastronomy needs something like the Meas data model to annotate datasets and describe the internal relationships of the data. I think the job is done.
4.7.2: "so that the magnitude of the quantity is not affected by its longitudinal position" looks odd. Isn't it "latitudinal position", or some equivalent, that we need?
Bounds1D, 2D, 3D: I'm not sure I quite get what this is for. Is it because we don't have an accurate estimate of the uncertainty, only a range in which it lies? And is this common to the whole distribution of measurements, so that it is not an individual quantity? If this is what you mean, it could be clearer (maybe with an example). If not, then explain it differently.
Many measurements in astronomy come with an epoch of measurement. I understand that it is outside the scope of the data model to describe the measurement/epoch relationship. Probably that is some work for MANGO, right?

Response:

- 4.7.2: Well.. I'd definitely like to have the right direction in there before going live; the language came directly from the last RFC comments from Operations.
  - I think it's a typo in Operation comment. The magnitude of the quantity is affected by the latitude, not by the longitude. Let's check with Mark. -- FrancoisBonnarel - 2022-07-06
- Bounds: this allows you to specify the uncertainty as a value range (A to B) rather than a delta from the Coordinate value. The associated Coordinate holds all the relevant Frame info. The velocity instances in the example suite use Bounds.
  - I find the text in 5.6, 5.7, 5.8 a little confusing in that case. The word "space" is pushing me toward the wrong interpretation. Suggestion: "Provide the edges of the uncertainty range. Rather than being relative to the associated Coordinate, these represent a range in which this Coordinate value lies." Or something like that. -- FrancoisBonnarel - 2022-07-06
- Epoch: yes, I think the association between Measures and Epoch is best left until there are specific threads being implemented which need that information.
-- MarkCresitelloDittmar - 2022-07-05

-- FrancoisBonnarel - 2022-07-05
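
The distinction drawn in the Bounds response above (a value range A to B, rather than a delta from the Coordinate value) can be illustrated with a small sketch. The class and attribute names below (radius, lo_limit, hi_limit, interval) are hypothetical and chosen for readability; they are not taken from the model document.

```python
from dataclasses import dataclass


@dataclass
class Symmetrical:
    """Error given as a delta relative to the coordinate value."""
    radius: float

    def interval(self, value):
        return (value - self.radius, value + self.radius)


@dataclass
class Bounds1D:
    """Error given as the edges of a range, independent of the value."""
    lo_limit: float
    hi_limit: float

    def interval(self, value):
        # The coordinate value is not used: the limits are the range itself.
        return (self.lo_limit, self.hi_limit)


if __name__ == "__main__":
    velocity = 10.0
    print(Symmetrical(0.5).interval(velocity))
    print(Bounds1D(9.0, 10.75).interval(velocity))
```

Both forms end up describing an interval around the measured value; they differ only in whether the numbers are offsets or edges.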

Solar System Interest Group

Points of confusion for me:

Section 2.1.1, "Pixelated Image Cube": I am not at all clear on what the term "pixel domain" means. In my mind, pixels are "image cells".

Response:

Exactly. The pixel domain describes the image cells (which are outside of the Measurement scope, and handled by the Coordinates model). The image cell VALUE however, is within the Measurement scope.

-- MarkCresitelloDittmar - 2022-05-18

Section 2.1.1 and subsequently: What does it mean to be "dimensionally compatible"? Are Angstroms "dimensionally compatible" with Astronomical Units, for example, because they both measure length? Are percentages "dimensionally compatible" with everything, or nothing?

Response:

I think this may be more clear after reading the document.

It simply means that a 1-dimensional measure is expected to have a 1D error (e.g. Symmetrical, Bounds1D), not a 2D (Ellipse) or 3D (Ellipsoid) one.
The question of "Angstrom" vs "AU" is a unit-compatibility one, which is outside the scope of this model. We expect the units associated with any given Measure are compatible with the domain, and the client/consumer is responsible for verifying this.

-- MarkCresitelloDittmar - 2022-05-18
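
A client-side check of "dimensional compatibility" in the sense of the response above might look like the sketch below. The lookup table is an assumption built only from the error types named on this page (Symmetrical, Bounds1D/2D/3D, Ellipse, Ellipsoid); the model document remains the authority on which error types apply to which measures. Unit compatibility (Angstrom vs AU) is deliberately not checked, since the response places it outside the model's scope.

```python
# Assumed dimensionality of each error type, following the examples
# given in the response; not copied from the model document.
ERROR_DIMENSIONS = {
    "Symmetrical": 1,
    "Bounds1D": 1,
    "Bounds2D": 2,
    "Ellipse": 2,
    "Bounds3D": 3,
    "Ellipsoid": 3,
}


def is_dimensionally_compatible(measure_ndim, error_type):
    """True if the error type has the same dimensionality as the measure."""
    if error_type not in ERROR_DIMENSIONS:
        raise ValueError(f"unknown error type: {error_type}")
    return ERROR_DIMENSIONS[error_type] == measure_ndim


if __name__ == "__main__":
    print(is_dimensionally_compatible(1, "Symmetrical"))
    print(is_dimensionally_compatible(1, "Ellipse"))
```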

Typos:

Section 1.1, first bullet: "instramental" should be "instrumental".
Section 1.1, fourth bullet: "collections properties" should probably be "collections of properties"; "that source" should be "those sources" (grammatically - if the intention of the sentence is different it needs rewriting).
Section 1.1, paragraph following bullets: "and different locations" should be "and from different locations" or "and at different locations".
Section 5.1, second paragraph: "teh" should be "the"; "shpes" should be "shapes".

Response: Corrected.

-- MarkCresitelloDittmar - 2022-05-18

-- AnneRaugh - 2022-04-20

Theory Interest Group

Time Domain Interest Group

Operations

Standards and Processes Committee

TCG Vote : Vote_start_date - Vote_end_date

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group	Yes	No	Abstain	Comments
TCG	X
Apps	X
DAL	X
DM	X
GWS	X
Registry	X
Semantics			X
DCP	X
KDIG
RIG	X
SSIG	X
Theory
TD	X
Ops
StdProc

Topic revision: r27 - 2022-07-21 - MarkCresitelloDittmar
