Ensuring consistency among Data Models.
The Problem
Introduction
This particular discussion arose from three threads in progress within the IVOA DM working group:
- Dataset, Cube, Meas, Coords, etc. data models in support of representing Data Products for analysis threads
- MANGO: implementation exercise with applications of the Epoch Propagation thread
- CAOM: Community interest in having this model integrated into the IVOA model landscape
These projects have brought into focus a shortcoming in the Data Model workflow that needs to be addressed if we are to continue improving interoperability while serving these targeted models to the community.
Overview
Models which are targeted to support the representation of data products must be very flexible and detailed in order to properly support the variety and complexity of those data products. However, other usages, such as data access, impose constraints which can allow for simpler model representations of the same domain space (e.g. ObsCore). If these different 'levels' of data models covering the same domain space are to coexist in the data model ecosystem, we need a mechanism to ensure that the simpler models are consistent with their more detailed counterparts. Without this verification, inconsistencies will creep into the models which will hinder our core mission of enabling interoperability.
Additionally, having multiple models covering the same domain space will make it more difficult for users searching for 'the data model for X' to know which model applies to their needs. Having a formal mechanism relating model A to model B would help to illustrate the hierarchy in the models and the clients they are targeted to serve.
References
- IVOA Interop May 2024: Joint session presentation (https://wiki.ivoa.net/internal/IVOA/InterOpMay2024DM/session_intro.pdf)
- email from Marco regarding “Crosswalk” concept
- Table describing translation from one schema to another
- sounds similar to xslt scripts
- these apply to instances?
- DM mail list:
- Joint Session Followup: http://mail.ivoa.net/pipermail/dm/2024-May/006475.html
- PH: “it is generally easier to write a DM that focusses on a specific area without trying to find all the commonalities with other data models - I have done it myself with ProposalDM - in the end the only standard model that I felt that I could import was COORDS and I went ahead are defined things like Observatory and TargetSource that do appear in some other IVOA models - however, I would have been much happier if I could have imported smaller standard models for those concept.”
- Path to CAOM standardization: http://mail.ivoa.net/pipermail/dm/2024-June/006487.html
- I (PD) have consulted with new and current CAOM stakeholders and we have collectively agreed on the following path forward:
- 1. modify the current CAOM (2.4) model to align better with other IVOA models
- 2. modify CAOM to better support radio metadata (CADC and SRCNet use cases)
- 3. clear alignment so that ObsCore is a "data model view" of CAOM
- 4. deliver prototype implementations asap
- deliver WD-CAOM-2.5 by approx Sept (2024)
Use Cases
- Consistency Verification
- Model X defines an object which consolidates elements from Models A and B, which is suitable for use when certain constraints are applied. This is in keeping with the paradigm of "make simple things simple, and more complex things possible". If Model X can formally describe the relation between its consolidated object and the corresponding elements from Models A and B, then automated validation can be applied which ensures the consistency and completeness of the consolidated object.
- e.g. the MANGO EpochPosition object consolidating elements from Measurements and Coordinates.
- Translation (hypothetical)
- Carrying on from the above: with a formal relation defined between these elements, code could be automatically generated to convert instances of one to the other.
- e.g. an application using the MANGO EpochPosition object could make use of a conversion method when accessing a catalogue annotated to the full Measurements/Coordinates model descriptions.
- e.g. populate an ObsCore table from a collection of data products.
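To make the two use cases concrete, here is a minimal Python sketch of how a single declared relation could drive both consistency verification and translation. Everything in it is an assumption for illustration: the classes Coordinate, Measure and SimplePosition are heavily simplified stand-ins and do not reproduce the actual Coordinates, Measurements or MANGO definitions, and the RELATION table format and the verify/translate helpers are hypothetical, not an existing IVOA mechanism.

```python
from dataclasses import dataclass, fields, is_dataclass
from functools import reduce


# Hypothetical, simplified stand-ins for elements of the detailed models.
@dataclass
class Coordinate:
    value: float
    unit: str


@dataclass
class Measure:
    coord: Coordinate
    error: float


# Hypothetical consolidated object, in the spirit of a simplified position.
@dataclass
class SimplePosition:
    longitude: float
    latitude: float
    longitude_error: float
    latitude_error: float


# The declared relation: simplified field -> (role of the detailed Measure,
# dotted attribute path inside that Measure).
RELATION = {
    "longitude": ("lon", "coord.value"),
    "latitude": ("lat", "coord.value"),
    "longitude_error": ("lon", "error"),
    "latitude_error": ("lat", "error"),
}


def path_exists(cls, path):
    """Check that a dotted attribute path is declared on a dataclass definition."""
    for name in path.split("."):
        if not is_dataclass(cls):
            return False
        declared = {f.name: f.type for f in fields(cls)}
        if name not in declared:
            return False
        cls = declared[name]
    return True


def verify(simple_cls, detailed_cls, relation):
    """Return a list of consistency problems; an empty list means consistent."""
    problems = []
    simple_fields = {f.name for f in fields(simple_cls)}
    mapped = set(relation)
    for name in sorted(simple_fields - mapped):
        problems.append(f"unmapped field: {name}")
    for name in sorted(mapped - simple_fields):
        problems.append(f"relation names unknown field: {name}")
    for name, (_, path) in relation.items():
        if not path_exists(detailed_cls, path):
            problems.append(f"{name}: no such path {path!r}")
    return problems


def translate(measures, simple_cls, relation):
    """Build a simplified instance from detailed instances keyed by role."""
    values = {
        name: reduce(getattr, path.split("."), measures[role])
        for name, (role, path) in relation.items()
    }
    return simple_cls(**values)


# Usage: verify the relation against the model definitions, then convert.
assert verify(SimplePosition, Measure, RELATION) == []
detailed = {
    "lon": Measure(Coordinate(10.5, "deg"), 0.1),
    "lat": Measure(Coordinate(-3.2, "deg"), 0.2),
}
position = translate(detailed, SimplePosition, RELATION)
```

The point of the sketch is that the relation is data rather than code: verification operates on the model definitions alone, while translation applies the same table to instances, so the two cannot drift apart.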
Examples
Epoch Propagation:
- MANGO:EpochPosition
- Is modeled essentially as a Measure (ObjectType containing Coordinates, with associated Errors), but is not described as such.
- Diagrams:
- MANGO object
- Corresponding Meas/Coords object(s)
- Connecting the dots.
CAOM:
- This is a much more complex case, with significant domain overlap. The following diagram illustrates the overlaps at a high level. Each will need to be examined in more detail.
The Solution:
Requirements
Questions/Issues