Difference: EpnTapWordLists (1 vs. 2)

Revision 2 - 2023-12-11 - MarkusDemleitner

 
META TOPICPARENT name="InterOpNov2023SSIG"

Word Lists in EPN-TAP

This is a list of EPN-TAP columns that would profit from or already have well-defined lists of concept or entity names. The hope is that we can identify, in this session, particular spots of trouble and perhaps even realistic steps for mitigation.

This effort started during the SSIG session at the Fall 2023 Tucson interop. This page is intended to let us track the progress of the various efforts.


spatial_coordinate_description

"Spatial_coordinate_description provides an acronym of the Coordinate Reference System as discussed in the extension and vocabulary page (http://hdl.handle.net/21.15110/epn_tap_extensions) under Planetary Coordinate Systems."

Current Steps

Task: identify the recommended refframe per body from what OGC (or the other bodies BC has listed above) does.

Action: BC: will do a VEP for heliophysics RSN.

Discussion

(please crop after a while -- the Wiki has history if push comes to shove)

Markus

 If we want to enable meaningful and halfway exhaustive spatial queries in EPN-TAP (and the Registry), we have to force people to use designated frames per body. This is discovery: some loss of precision is acceptable. But we need to define the canonical frames, presumably in http://www.ivoa.net/rdf/refframe.

Baptiste


Stéphane

For discovery on surfaces, something like Mars_IAU could do (successive versions are more and more refined; we currently have some Mars_IAU2000 because this version solved an ambiguity). This is not necessarily true for small bodies however, never resolved before a flyby/orbit.

Need to have frame names consistent / agreed with OGC, at least for body-fixed frames on major bodies - we want them to implement our frames in Earth Observation tools! As Baptiste says, this is in progress (led by USGS and CNES), but anything not Earth is not foreseen in OGC. There was a proposal by Trent Hare some years ago, which also covered planetocentric coordinates, etc. (Hare et al. 2018, doi: 10.1016/j.pss.2017.04.004, plus the Malapert & Hare presentation at the IVOA spring Interop 2023).

We probably need spacecraft frames for in situ measurements (plasmas, but also landers/rovers); this becomes tricky (even if we use SPICE frames we still have many variations, starting with s/c vs instrument). [BC: For plasma data at Mars, IAU_MARS is probably not used at all. Need to define the recommended frame for a given field.]

[MD: also note that for "exotic" frames there's probably no discovery case anyway; people would probably look for "data for probe X at time Y" rather than the actual coordinates, so the basic use case here would probably be "keep this from contaminating unrelated searches". It would still be good if that data turned up in sufficiently fuzzy searches from outsiders...]

Anne

Attempting to require a specific frame for reference is problematic, in that frames are defined and used for a number of reasons - in particular because of chronology, e.g. for small bodies. For discovery, reference frame refinements frequently are not significant, but the frame might be dictated by the kind of data being sought (fields data vs surface geology, for example).


spatial_origin

"Spatial_origin may be used to identify frame center in specific situations, using either target_name (referring to target center) or the IVOA vocabulary (http://www.ivoa.net/rdf/refposition)"

Current Action

As obsfacil drafts come online (BC), create wider terms as appropriate (MD). Add refpositions for the major planets where data is already present in EPNTAP (MD)

Discussion

MD has talked about that in Paris: https://wiki.ivoa.net/internal/IVOA/InterOpMay2019TDIG/topocenter.pdf.

There is http://www.ivoa.net/rdf/refposition;

  • extend to all IAU names (planetary bodies) ?
  • extend to some ObsFacility names (e.g., spacecraft names)?

The rough terms ("L1") are useful and already used in SPASE. But we really need spacecraft; so, we'll need links from obsfacil into these.

SE: Remember that this doesn't tell where the observer is, but how C1/2/3 coordinates are provided. In general, I would expect spatial_coordinate_description to provide the complete information. Spatial_origin is intended to solve ambiguities or variations on spatial_coordinate_description - eg, BARYCENTER vs Sun center (currently only used to tell that C3 is measured from the surface, not the center). Therefore we need to have spacecraft frames available in spatial_coordinate_description.

time_refposition

This one tells where the time is measured - e.g., can be on Earth or at the spacecraft. The default is the observer location but we need to accommodate other origins, eg, to remove ambiguities when comparing multiple observations of the same event from different locations. We also have target_time_min/max to provide time at target (the common reference).

See spatial_origin

time_scale

"values are preferably taken from the IVOA vocabulary: http://www.ivoa.net/rdf/timescale"
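A minimal sketch of what checking time_scale values against the vocabulary could look like on the provider side. The term set below is an illustrative assumption, not the authoritative list; a real check would read the current terms from http://www.ivoa.net/rdf/timescale. The function name and the sample values are hypothetical.

```python
# Assumed, illustrative subset of terms; the authoritative list is the
# vocabulary at http://www.ivoa.net/rdf/timescale and may differ.
ASSUMED_TIMESCALE_TERMS = frozenset(
    {"TAI", "TT", "UT", "UTC", "GPS", "TCG", "TCB", "TDB", "UNKNOWN"}
)


def unknown_time_scales(values, known_terms=ASSUMED_TIMESCALE_TERMS):
    """Return the sorted, distinct time_scale values not found in known_terms.

    Values are stripped of surrounding whitespace but compared
    case-sensitively, so "utc" is reported as unknown.
    """
    stripped = {value.strip() for value in values}
    return sorted(value for value in stripped if value not in known_terms)


if __name__ == "__main__":
    # Hypothetical column content from a service under test.
    column = ["UTC", "TT", "utc ", "Mars Local"]
    for value in unknown_time_scales(column):
        print(f"time_scale value not in vocabulary: {value!r}")
```

Values reported here would be candidates either for correction in the service or for a new term proposed to the vocabulary.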

Current action

MD would like to make this a MUST and tell people to complete the timescale vocabulary as needed. This is EPN-TAP PR #36: https://github.com/ivoa-std/EPNTAP/pull/36
 

map_projection

"Provides map projection description (preferably as FITS name or code) or parameters as a free string [...proj4]"

Action

MD writes a PR to the effect that this parameter is not for discovery. We should explain that much. Let's see whether that then gets flamed - if so, perhaps we can define the discovery cases.

Discussion

Anne: Map projection is a selection criterion for mosaics, maps, and Analysis-Ready Data (ARD) in PDS. I am rather surprised that there seems to be a general feeling that map projections cannot be itemized and identified by their standard names as used by planetary cartographers. Like: https://www.icsm.gov.au/education/fundamentals-mapping/overview-fundamentals-mapping, https://futuremaps.com/blogs/news/top-10-world-map-projections, https://www.usgs.gov/publications/map-projections#:~:text=A%20map%20projection%20is%20used,is%20no%20%22best%22%20projection. etc.

Stéphane: We expected a limited number of string values (as in Marmo et al 2018); this is a discovery parameter, as Anne says. But 1) some of these projections require definition parameters; 2) some providers insist on putting WKT or sets of projection parameters there. This is apparently important to them, and probably required to send data to GIS applications.

dataproduct_type

(im,ma,sp,ds,sc,pr,pf...) vs. http://www.ivoa.net/rdf/product-type

Action

Let's make the vocabulary machine-readable and put it up on a handle-managed PID. Then we can do skos:exactMatch links from product-type. BC will try it. MD may come to Paris for that.

Discussion

AR: This list makes me nervous about sustainability because it conflates data structure (table, cube, image) with observation type (spectrum, imaging, particle counts, field strength, etc.). Use cases are not a substitute for analysis!!!

SE: EPNCore dataproduct_types are defined according to their dimensions and the nature of their axes; there are no measurement types here (eg, spectrum is any vector with a single axis in frequency). EPNCore types are often different from ObsCore ones.

BC: see example here: http://voparis-ontoportal-dev.obspm.fr/ontologies/EPN_DATAPRODTYPE/?p=classes&conceptid=http%3A%2F%2Fvoparis-ns.obspm.fr%2Fepn%2Fdataproduct_type%23ca I have included links to http://www.ivoa.net/rdf/product-type when applicable, as well as "seeAlso" relations for similar concepts. Note: The URI (http://voparis-ns.obspm.fr/epn/dataproduct_type#) can't be resolved at this point, but we will set this up, if we are happy.


target_class

asteroid, dwarf_planet, planet, satellite, comet, exoplanet, interplanetary_medium, sample, sky, spacecraft, spacejunk, star, calibration

Current Action

Baptiste will brainstorm about ways to integrate this with object-type.

Discussion

Markus dreams of having this unified with the draft http://www.ivoa.net/rdf/object-type

Note that target_class is multi-valued in EPN-TAP.
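Because the column is multi-valued, any check against the list above has to split the cell first. A minimal sketch, assuming that multiple values are written as a hash-separated list (an assumption here; the separator is a parameter) and using the class names listed at the top of this section. The function names are hypothetical.

```python
# Class names as listed at the top of this section.
TARGET_CLASSES = frozenset({
    "asteroid", "dwarf_planet", "planet", "satellite", "comet",
    "exoplanet", "interplanetary_medium", "sample", "sky",
    "spacecraft", "spacejunk", "star", "calibration",
})


def split_target_class(cell, sep="#"):
    """Split a multi-valued target_class cell into its individual values.

    The "#" separator is an assumption; empty items are dropped.
    """
    return [part.strip() for part in cell.split(sep) if part.strip()]


def invalid_target_classes(cell, sep="#"):
    """Return the values in the cell that are not in TARGET_CLASSES."""
    return [t for t in split_target_class(cell, sep) if t not in TARGET_CLASSES]


if __name__ == "__main__":
    print(split_target_class("planet#satellite"))
    print(invalid_target_classes("planet#ring"))
```

A value such as "ring" is flagged here, in line with rings being regions of a target rather than a target type.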

Stéphane:

We need to remain compliant with the types used in the quaero resolver

  • Rings are regions of a target in EPNCore (not a type)
  • Planetary fields are also considered region of a target rather than a target; solar wind and interplanetary fields are covered by interplanetary_medium
  • Asteroids with satellites are asteroids - Retrieving all asteroids with satellites is expected to be the scope of a service (with a column providing the satellites), rather than a parameter value
  • EPNCore has calibration as target_type, indeed important in a dataset (if only to filter them) - this also covers dark frames, flat-fields, etc
  • What is probably missing here are disks (exodisks)
  • But we don't want to open this parameter to the complexity of object_type - this is all covered by star (calibration targets) and sky.

event_type

"A more complete list of values will be maintained separately."

Current Action

We're waiting for VOEvent

Discussion

Markus: We should work with TDIG to come up with an IVOA vocabulary for that.

Stéphane: Waiting for a stable version of VOEvent; we can contribute.

species

"The formatting is very basic and simply uses the standard formula in ascii, e.g., H2O for water, CO2 for carbon dioxide or Fe for iron."

Markus: Writing chemical species is one of the main concerns in LineTAP. Since there's also species_inchikey, perhaps we should define this r/o and (effectively) freetext?

filter

"There is no predefined list, because of the large variety of possible denominations, but the best practice is to use a short and accurate ID."

There is the filter profile service, which is the obvious solution. However, we cannot impose an ID from an external service, because we also aim at hosting data from small telescopes, including amateurs'.


feature_name

"Use of official features names defined by IAU (http://planetarynames.wr.usgs.gov/) is preferred when relevant."

Markus: Perhaps stronger language than "is preferred" is possible?


target_region

"Values are best selected from the IVOA rendition of the Unified Astronomy Thesaurus: http://www.ivoa.net/rdf/uat"

Markus would rather use stronger language here and push people to complete the UAT.


dynamical_class

(of small bodies): "See the extension and vocabulary page (https://hdl.handle.net/21.15110/epn_tap_extensions) under Small bodies sub-types."

Markus thinks that's too exotic for an IVOA vocabulary; but making this properly machine readable (perhaps using vocabulary tooling) would still be nice.

Like all current extensions, this is in discussion and not finalized - waiting for inputs from enough teams / services. The idea is to keep it human-editable in this period, if we hope to gather inputs from non-VO people...

dynamical_type

"introduces a subdivision of the above, from an enumerated list"

Markus sees this as for dynamical_class, only more so.


geometry_type

(of how a spectrum was taken) "Possible values are maintained on the extension and vocabulary page"

Markus: as for dynamical_class


spectrum_type

Markus: as for dynamical_class


observer_institute, observer_country

Markus on country: if relevant, we should use ISO country codes


producer_institute

Markus: obsfacil?


instrument_host_name

Markus is hoping for obsfacil here.

Stéphane does not believe we can expect data providers to conform to a VO standard here (space agencies, research teams…). Variations / aliases should be handled by a name resolver, like target_name.

instrument_name

Let's talk about this when we have something tangible for instrument_host_name


sample_classification

(of a spectroscopy specimen) "Provides composition as group, class, sub-class, etc, of sample", example given:

natural#solid#earth#mineral#unclassified#nesosilicate#unclassified#olivine

Stéphane: hash-lists are a very handy way to avoid hierarchical descriptions, though - those never work.
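To illustrate how such a hash-list can still be queried level by level without a formal hierarchy, here is a minimal sketch using the example value above. The function names are hypothetical, and the idea of matching on a leading prefix of levels is an assumption for illustration, not something EPN-TAP prescribes.

```python
def parse_sample_classification(value, sep="#"):
    """Split a sample_classification string into its ordered levels."""
    return [level.strip() for level in value.split(sep)]


def classification_matches(value, prefix, sep="#"):
    """Return True if value starts with all the levels given in prefix."""
    levels = parse_sample_classification(value, sep)
    wanted = parse_sample_classification(prefix, sep)
    return levels[:len(wanted)] == wanted


if __name__ == "__main__":
    example = ("natural#solid#earth#mineral#unclassified"
               "#nesosilicate#unclassified#olivine")
    print(parse_sample_classification(example))
    print(classification_matches(example, "natural#solid#earth"))
```

Note that the repeated "unclassified" levels mean a position-independent search for a single term would be ambiguous; matching on an ordered prefix avoids that.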

processing_level

Table p. 13: (1..6)

Nobody wants to touch this for now.

target_name

"The best practice is to use the official designation of the target as defined by IAU [...] Other best practicies are listed below:"

Stéphane: This refers to different kinds of targets - IAU is not in charge of everything, certainly not samples.

messenger

Already using an IVOA vocabulary.

<-- * Set ALLOWTOPICRENAME = TWikiAdminGroup -->

Revision 1 - 2023-11-27 - MarkusDemleitner

 
META TOPICPARENT name="InterOpNov2023SSIG"

Word Lists in EPN-TAP

This is a list of EPN-TAP columns that would profit from or already have well-defined lists of concept or entity names. The hope is that we can identify, in this session, particular spots of trouble and perhaps even realistic steps for mitigation.

This effort started during the SSIG session at the Fall 2023 Tucson interop. This page is intended to let us track the progress of the various efforts.

spatial_coordinate_description

"Spatial_coordinate_description provides an acronym of the Coordinate Reference System as discussed in the extension and vocabulary page (http://hdl.handle.net/21.15110/epn_tap_extensions) under Planetary Coordinate Systems."

Current Steps

Task: identify the recommended refframe per body from what OGC (or the other bodies BC has listed above) does.

Action: BC: will do a VEP for heliophysics RSN.

Discussion

(please crop after a while -- the Wiki has history if push comes to shove)

Markus

If we want to enable meaningful and halfway exhaustive spatial queries in EPN-TAP (and the Registry), we have to force people to use designated frames per body. This is discovery: some loss of precision is acceptable. But we need to define the canonical frames, presumably in http://www.ivoa.net/rdf/refframe.

Baptiste

Stéphane

For discovery on surfaces, something like Mars_IAU could do (successive versions are more and more refined; we currently have some Mars_IAU2000 because this version solved an ambiguity). This is not necessarily true for small bodies however, never resolved before a flyby/orbit.

Need to have frame names consistent / agreed with OGC, at least for body-fixed frames on major bodies - we want them to implement our frames in Earth Observation tools! As Baptiste says, this is in progress (led by USGS and CNES), but anything not Earth is not foreseen in OGC. There was a proposal by Trent Hare some years ago, which also covered planetocentric coordinates, etc. (Hare et al. 2018, doi: 10.1016/j.pss.2017.04.004, plus the Malapert & Hare presentation at the IVOA spring Interop 2023).

We probably need spacecraft frames for in situ measurements (plasmas, but also landers/rovers); this becomes tricky (even if we use SPICE frames we still have many variations, starting with s/c vs instrument). [BC: For plasma data at Mars, IAU_MARS is probably not used at all. Need to define the recommended frame for a given field.]

[MD: also note that for "exotic" frames there's probably no discovery case anyway; people would probably look for "data for probe X at time Y" rather than the actual coordinates, so the basic use case here would probably be "keep this from contaminating unrelated searches". It would still be good if that data turned up in sufficiently fuzzy searches from outsiders...]

Anne

Attempting to require a specific frame for reference is problematic, in that frames are defined and used for a number of reasons - in particular because of chronology, e.g. for small bodies. For discovery, reference frame refinements frequently are not significant, but the frame might be dictated by the kind of data being sought (fields data vs surface geology, for example).

spatial_origin

"Spatial_origin may be used to identify frame center in specific situations, using either target_name (referring to target center) or the IVOA vocabulary (http://www.ivoa.net/rdf/refposition)"

Current Action

As obsfacil drafts come online (BC), create wider terms as appropriate (MD). Add refpositions for the major planets where data is already present in EPNTAP (MD)

Discussion

MD has talked about that in Paris: https://wiki.ivoa.net/internal/IVOA/InterOpMay2019TDIG/topocenter.pdf.

There is http://www.ivoa.net/rdf/refposition;

  • extend to all IAU names (planetary bodies) ?
  • extend to some ObsFacility names (e.g., spacecraft names)?

The rough terms ("L1") are useful and already used in SPASE. But we really need spacecraft; so, we'll need links from obsfacil into these.

SE: Remember that this doesn't tell where the observer is, but how C1/2/3 coordinates are provided. In general, I would expect spatial_coordinate_description to provide the complete information. Spatial_origin is intended to solve ambiguities or variations on spatial_coordinate_description - eg, BARYCENTER vs Sun center (currently only used to tell that C3 is measured from the surface, not the center). Therefore we need to have spacecraft frames available in spatial_coordinate_description.

time_refposition

This one tells where the time is measured - e.g., can be on Earth or at the spacecraft. The default is the observer location but we need to accommodate other origins, eg, to remove ambiguities when comparing multiple observations of the same event from different locations. We also have target_time_min/max to provide time at target (the common reference).

See spatial_origin

time_scale

"values are preferably taken from the IVOA vocabulary: http://www.ivoa.net/rdf/timescale"

Current action

MD would like to make this a MUST and tell people to complete the timescale vocabulary as needed; he'll do a PR to that effect.

map_projection

"Provides map projection description (preferably as FITS name or code) or parameters as a free string [...proj4]"

Action

MD writes a PR to the effect that this parameter is not for discovery. We should explain that much. Let's see whether that then gets flamed - if so, perhaps we can define the discovery cases.

Discussion

Anne: Map projection is a selection criterion for mosaics, maps, and Analysis-Ready Data (ARD) in PDS. I am rather surprised that there seems to be a general feeling that map projections cannot be itemized and identified by their standard names as used by planetary cartographers. Like: https://www.icsm.gov.au/education/fundamentals-mapping/overview-fundamentals-mapping, https://futuremaps.com/blogs/news/top-10-world-map-projections, https://www.usgs.gov/publications/map-projections#:~:text=A%20map%20projection%20is%20used,is%20no%20%22best%22%20projection. etc.

Stéphane: We expected a limited number of string values (as in Marmo et al 2018); this is a discovery parameter, as Anne says. But 1) some of these projections require definition parameters; 2) some providers insist on putting WKT or sets of projection parameters there. This is apparently important to them, and probably required to send data to GIS applications.

dataproduct_type

(im,ma,sp,ds,sc,pr,pf...) vs. http://www.ivoa.net/rdf/product-type

Action

Let's make the vocabulary machine-readable and put it up on a handle-managed PID. Then we can do skos:exactMatch links from product-type. BC will try it. MD may come to Paris for that.

Discussion

AR: This list makes me nervous about sustainability because it conflates data structure (table, cube, image) with observation type (spectrum, imaging, particle counts, field strength, etc.). Use cases are not a substitute for analysis!!!

SE: EPNCore dataproduct_types are defined according to their dimensions and the nature of their axes; there are no measurement types here (eg, spectrum is any vector with a single axis in frequency). EPNCore types are often different from ObsCore ones.

BC: see example here: http://voparis-ontoportal-dev.obspm.fr/ontologies/EPN_DATAPRODTYPE/?p=classes&conceptid=http%3A%2F%2Fvoparis-ns.obspm.fr%2Fepn%2Fdataproduct_type%23ca I have included links to http://www.ivoa.net/rdf/product-type when applicable, as well as "seeAlso" relations for similar concepts. Note: The URI (http://voparis-ns.obspm.fr/epn/dataproduct_type#) can't be resolved at this point, but we will set this up, if we are happy.

target_class

asteroid, dwarf_planet, planet, satellite, comet, exoplanet, interplanetary_medium, sample, sky, spacecraft, spacejunk, star, calibration

Current Action

Baptiste will brainstorm about ways to integrate this with object-type.

Discussion

Markus dreams of having this unified with the draft http://www.ivoa.net/rdf/object-type

Note that target_class is multi-valued in EPN-TAP.

Stéphane:

We need to remain compliant with the types used in the quaero resolver

  • Rings are regions of a target in EPNCore (not a type)
  • Planetary fields are also considered region of a target rather than a target; solar wind and interplanetary fields are covered by interplanetary_medium
  • Asteroids with satellites are asteroids - Retrieving all asteroids with satellites is expected to be the scope of a service (with a column providing the satellites), rather than a parameter value
  • EPNCore has calibration as target_type, indeed important in a dataset (if only to filter them) - this also covers dark frames, flat-fields, etc
  • What is probably missing here are disks (exodisks)
  • But we don't want to open this parameter to the complexity of object_type - this is all covered by star (calibration targets) and sky.

event_type

"A more complete list of values will be maintained separately."

Current Action

We're waiting for VOEvent

Discussion

Markus: We should work with TDIG to come up with an IVOA vocabulary for that.

Stéphane: Waiting for a stable version of VOEvent; we can contribute.

species

"The formatting is very basic and simply uses the standard formula in ascii, e.g., H2O for water, CO2 for carbon dioxide or Fe for iron."

Markus: Writing chemical species is one of the main concerns in LineTAP. Since there's also species_inchikey, perhaps we should define this r/o and (effectively) freetext?

filter

"There is no predefined list, because of the large variety of possible denominations, but the best practice is to use a short and accurate ID."

There is the filter profile service, which is the obvious solution. However, we cannot impose an ID from an external service, because we also aim at hosting data from small telescopes, including amateurs'.

feature_name

"Use of official features names defined by IAU (http://planetarynames.wr.usgs.gov/) is preferred when relevant."

Markus: Perhaps stronger language than "is preferred" is possible?

target_region

"Values are best selected from the IVOA rendition of the Unified Astronomy Thesaurus: http://www.ivoa.net/rdf/uat"

Markus would rather use stronger language here and push people to complete the UAT.

dynamical_class

(of small bodies): "See the extension and vocabulary page (https://hdl.handle.net/21.15110/epn_tap_extensions) under Small bodies sub-types."

Markus thinks that's too exotic for an IVOA vocabulary; but making this properly machine readable (perhaps using vocabulary tooling) would still be nice.

Like all current extensions, this is in discussion and not finalized - waiting for inputs from enough teams / services. The idea is to keep it human-editable in this period, if we hope to gather inputs from non-VO people...

dynamical_type

"introduces a subdivision of the above, from an enumerated list"

Markus sees this as for dynamical_class, only more so.

geometry_type

(of how a spectrum was taken) "Possible values are maintained on the extension and vocabulary page"

Markus: as for dynamical_class

spectrum_type

Markus: as for dynamical_class

observer_institute, observer_country

Markus on country: if relevant, we should use ISO country codes

producer_institute

Markus: obsfacil?

instrument_host_name

Markus is hoping for obsfacil here.

Stéphane does not believe we can expect data providers to conform to a VO standard here (space agencies, research teams…). Variations / aliases should be handled by a name resolver, like target_name.

instrument_name

Let's talk about this when we have something tangible for instrument_host_name

sample_classification

(of a spectroscopy specimen) "Provides composition as group, class, sub-class, etc, of sample", example given:

natural#solid#earth#mineral#unclassified#nesosilicate#unclassified#olivine

Stéphane: hash-lists are a very handy way to avoid hierarchical descriptions, though - those never work.

processing_level

Table p. 13: (1..6)

Nobody wants to touch this for now.

target_name

"The best practice is to use the official designation of the target as defined by IAU [...] Other best practicies are listed below:"

Stéphane: This refers to different kinds of targets - IAU is not in charge of everything, certainly not samples.

messenger

Already using an IVOA vocabulary.


<-- * Set ALLOWTOPICRENAME = TWikiAdminGroup -->
 