Minutes of Data Modelling Meeting Oct 16th 2003
Agenda
- Summary so far
- Use Cases - Alberto
- Reports from sub-projects decided at Cambridge meeting
- (further discussion on Spectra will continue in DAL group)
Status report
- Data Model provides abstract description of data.
- Provides basis for interoperable exchange of data and queries
- Provides basis for DAL interface design and XML metadata choices
- Process:
- White paper
- UML model
- XML representation - reference implementation
- DAL implements interface
- Summary of convergence efforts for Observation in various data models, which come from different points of view e.g. archive vs analysis tools
- IDHA, CfA, Canadian VO, GAVO, DG, (Brian Thomas - building blocks), and several others
- Common Areas:
- Processing
- Container
- Instrument
- Coverage
- Issues: Scope and Level of detail and naming
- Need to work on White Paper then iterate with UML diagram
- People are encouraged to send links to DM mailing list, showing their own special data model, preferably as UML plus documentation, pointing out special features, and the mapping to common areas
- Common concepts:
Keywords | Concept/Description
Dataset Observation Result | Abstract top level container for astronomical data. Could be a values of these
Provenance obtained with experiment | Abstract class describing how file was created
Observatory Observing platform "Mission" | Location where data was taken/first acquired
Coverage Meta-coverage Sample | Limits/boundaries on any parameters (possibly numeric with scientific units)
Data container Quantity NDArray | Holds information array having the same "type" - type = numeric/string/enumerated type, eventually classes - with units
Data acquisition |
Coordinate system pixel mapping Reference system Mapping WCS:Frame | Set of labels and parameters that uniquely define the context of some values
Mapping WCS:Mapping | Defines transformation of values between coordinate systems
Frameset Collection of standards (Type coords) | Contains one or more coordinate systems and zero or more mappings between them
History | The sequence/ensemble of "provenance" items
Accuracy Errors | A precise numerical measurement of the fidelity of one or more values. Example: systematic and/or statistical errors
Quality Flagged accuracy Feature | A classification of the reliability/fidelity of one or more values. The classification should contain two or more choices.
We are starting with highest level concepts; once they are agreed, we will define lower level concepts. The level of detail will be guided by the requirements for querying. The model must be extensible to allow lower level commonalities to be captured. In addition this should encourage sharing of concepts.
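As an illustration only, a few of the high-level concepts listed above could be sketched as simple classes. This is not a model agreed at the meeting: every class, field and method name below is hypothetical, and only Quantity, Coverage, Provenance, History and Observation are covered.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Quantity:
    """Data container: values of one type, with units (hypothetical fields)."""
    values: Sequence[float]
    unit: str


@dataclass
class Coverage:
    """Limits/boundaries on one parameter, expressed in a given unit."""
    parameter: str
    lower: float
    upper: float
    unit: str

    def contains(self, value: float) -> bool:
        # Inclusive range check against the stored limits.
        return self.lower <= value <= self.upper


@dataclass
class Provenance:
    """One item describing how the data was created."""
    description: str


@dataclass
class Observation:
    """Top-level container tying the other concepts together."""
    data: Quantity
    coverage: List[Coverage] = field(default_factory=list)
    # History is modelled here as the sequence of provenance items.
    history: List[Provenance] = field(default_factory=list)

    def covers(self, parameter: str, value: float) -> bool:
        """Query helper: is `value` inside the coverage for `parameter`?"""
        return any(
            c.parameter == parameter and c.contains(value)
            for c in self.coverage
        )


obs = Observation(
    data=Quantity(values=[1.0, 2.5, 3.2], unit="Jy"),
    coverage=[Coverage("wavelength", 400.0, 700.0, "nm")],
    history=[Provenance("raw exposure"), Provenance("flat-fielded")],
)
print(obs.covers("wavelength", 550.0))  # True
print(obs.covers("wavelength", 900.0))  # False
```

The `covers` helper reflects the point that the level of detail is to be guided by query requirements; lower-level concepts could be added later by subclassing.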
Use Cases (Alberto Micol)
- Overview:
- Focus on most important aspects
- Need to provide a means to measure progress
- Use cases
- What should a standard data description enable?
- E.g. Data visualisation - coverage
- Various points of view:
- Queries to identify datasets of interest (CVO)
- Visualisation of data (IDHA)
- Raw or processed?
- Instrument model (ETC, LMC)
- Data Analysis (coverage, WCS, astro-spectro-photometric accuracy)
- Data packaging and formats
- How to specify interrelationships between components in file or between files
- Data provenance
- Data processing
- Observing programmes (and other Data Flow aspects)
- and many more
- Different models are different views
- AM's proposals:
- Data Packaging & formats (for data readability)
- Data coverage (for query, visualisation and analysis)
- Instrument model (registry)
- Ideas:
- Create checklist/questionnaire (example will be posted)
- Example datasets (FITS or ASCII) from an observation, either representative or extreme
- E.g. WFPC2 associations
- Translations between data models
- Scientific use cases to provide breadth of view e.g. NVO use cases
- E.g. create catalogue on the fly from raw data
- Planetary data and other non-fixed position objects
- Multiple formats
- Theory/Simulations/Modelling
- Different types of use case:
- Software use cases - for s/w developers
- Action: BT/AM/DG to collect use cases
- Analysis also needed eventually
- People should contribute ideas to [dm-use cases] by end Nov.
Reports on mini-models
Quantity - Pat D will post summary
- There is an extensive discussion on the mailing list on how to model a small amount of data (Quantity)
Transformations: (David Berry)
- transforming values
- transforming between coordinate systems
A language was proposed on the mailing list involving abstract classes and subclasses, including compound mappings so that arbitrarily complex mappings can be built.
DT proposed that such transformations are the remit of analysis software; it was decided to let work be driven by requirements.
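The idea of an abstract mapping class with subclasses and compound mappings can be sketched roughly as follows. This is not the language proposed on the mailing list: the class names (`Mapping`, `Scale`, `Shift`, `CompoundMapping`) and the one-dimensional `forward` method are assumptions made purely for illustration.

```python
from abc import ABC, abstractmethod


class Mapping(ABC):
    """Abstract transformation of a value between coordinate systems."""

    @abstractmethod
    def forward(self, x: float) -> float:
        ...


class Scale(Mapping):
    """Multiply the input by a constant factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def forward(self, x: float) -> float:
        return x * self.factor


class Shift(Mapping):
    """Add a constant offset to the input."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def forward(self, x: float) -> float:
        return x + self.offset


class CompoundMapping(Mapping):
    """Apply several mappings in series; itself a Mapping, so it nests."""

    def __init__(self, *steps: Mapping) -> None:
        self.steps = list(steps)

    def forward(self, x: float) -> float:
        for step in self.steps:
            x = step.forward(x)
        return x


# Hypothetical pixel -> wavelength (nm) mapping built from simple pieces.
pixel_to_nm = CompoundMapping(Shift(-1.0), Scale(0.5), Shift(400.0))
print(pixel_to_nm.forward(11.0))  # 405.0

# Compound mappings can themselves be combined: nm -> Angstrom.
pixel_to_angstrom = CompoundMapping(pixel_to_nm, Scale(10.0))
print(pixel_to_angstrom.forward(11.0))  # 4050.0
```

Because a compound mapping is itself a mapping, arbitrarily complex transformations can be assembled from a small set of primitive subclasses.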
Additional ideas can be found in radio data models:
- the AIPS++ MeasurementSet
- the ALMA data format and associated model
Roadmap and Action Plans
Priority:
- Spectral Data Model for SSA
- Observation data model
- Incremental "deliveries" are needed in order to have useable pieces
- Consider Packaging as a possible area of work
- Mosaic images
- There are several different ways of packaging these currently
BT proposes a process for modelling to help set priorities
Observation & Spectra etc - SLOW (some disagreement on this)
^
|
Components
^
|
Quantity & Transformations - FAST (some disagreement on this)
Use of namespaces can keep local models separate from the IVOA model.
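A minimal sketch of the namespace idea, assuming an XML representation: two elements share the local name `Coverage` but live in different namespaces, so a reader of the common model can ignore project-specific extensions. Both namespace URIs and all element names are hypothetical placeholders, not real IVOA identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, for illustration only.
IVOA_NS = "http://example.org/ivoa-dm"
LOCAL_NS = "http://example.org/local-dm"

# Build a document mixing common-model and local-model elements.
doc = ET.Element(f"{{{IVOA_NS}}}Observation")
ET.SubElement(doc, f"{{{IVOA_NS}}}Coverage").text = "400-700 nm"
ET.SubElement(doc, f"{{{LOCAL_NS}}}Coverage").text = "detector chip 3"

# Select elements by namespace: the same local name never collides.
ns = {"ivoa": IVOA_NS, "local": LOCAL_NS}
common = [e.text for e in doc.findall("ivoa:Coverage", ns)]
local = [e.text for e in doc.findall("local:Coverage", ns)]

print(common)  # ['400-700 nm']
print(local)   # ['detector chip 3']
```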