Difference: ObsCoreExtensionForRadioData (2 vs. 3)

Revision 3 2023-02-10 - FrancoisBonnarel

 

ObsCore for radio data:

This project is also discussed on GitHub.

Some considerations: Authors: A. Zanichelli, V. Galluzzi, M. Molinaro (INAF)

Below we report some considerations, mainly focused on single dish data, on the two draft documents: IVOA Obscore Extension for Radio data Version 1.0 (IVOA Note 2022-10-14) and Pulsar and FRB Radio Data Discovery and Access Version 1.0 (IVOA Note 2022-09-22).

These comments are intended for further discussion within the IVOA Radio Interest Group.

dataproduct_type

Current values for dataproduct_type as in the preliminary document Data Product Type vocabulary do not seem suitable to describe single dish observational products in order to allow efficient/successful data discovery.

Following the same “parent / narrower Term” classification, we propose the value “sdradio” to be used 1) as a parent Term for any type of single dish data or 2) as parent Term associated with a set of more specific, narrower terms identifying more precisely the various data products coming from the possible observing modes.

The value “sdradio” identifies the electromagnetic domain of the data product. We would prefer not to use a more generic “singledish”, which would relate to the instrument rather than to the physical observable (also, single dish instruments are not used in the radio domain only).

FB answer I think you are right that the data product type of single dish data has to be discussed. However, I don't think creating a specific data product type for radio single dish data is consistent with the current concept of dataproduct_type, which is more about the dataset structure with respect to the nature of the axes (because it is important for the tools which may display, render or analyse them). So I think, according to the following discussion, things like "cube", "spectra", "timeseries" or "dynamic spectra" should be OK. We are still looking for a term for spectropolarimetry products. My feeling is that what we need to better describe single dish datasets is new parameters such as observing type or modes, or scan modes, or whatever. FrancoisBonnarel - 2023-02-09

BC answer I agree with François: the dataproduct_type is about the organization of the data dimensionalities, its axes, etc., not the way it is recorded, nor the spectral range. Mixing the "radio domain", "instrument type" and "data dimensionalities" will make it difficult to separate the various semantic components. BaptisteCecconi - 2023-02-09

The optional, free-text dataproduct_subtype parameter could be used for a more detailed description of the data content. A better solution could be to use the sky_scan_mode parameter proposed in Table 1 of IVOA Obscore Extension for Radio data Version 1.0 (IVOA Note 2022-10-14). This last parameter offers the advantage of using a predefined vocabulary, thus avoiding the use of free text.

FB answer Yes, dataproduct_subtype is free text. As written above, we probably have to introduce new parameters in this extension. FrancoisBonnarel - 2023-02-09

ML answer About scan mode and other information about the way the data were obtained: these metadata belong to the observing configuration applied in the instrument to obtain the data. It makes a category by itself. This need is also characterised for high energy data, and it is worth describing those parameters separately from the dataproduct_type. It is no longer a core property in terms of data discovery, but it is very useful to radio astronomers. MireilleLouys - 2023-02-09

BC answer There are already 3 types of pointing listed in the "ObsLocTAP" standard, in a "tracking_type" keyword. The current values are: "tracking", "solar-system-object-tracking", "fixed-az-el-transit". This seems to call for a list of external terms, which would be maintained with the Semantics WG. BaptisteCecconi - 2023-02-09

The following figure shows the main single dish observing modes:



The following table summarizes the possible values for dataproduct_type and narrower/parent terms associations.

| Term | Parent | dataproduct_subtype |
| sdradio | | |
| spatial-profile | #sdradio | skydip |
| map (*) | #cube, #sdradio | on-the-fly map, raster map |
| on-source | #spectrum, #sdradio | frequency switching, position switching, tracking |
| crosscan (**) | #spectrum, #sdradio | on-the-fly cross scan, raster cross scan |

FB answer for sdradio: I would say this is more something like an "observation_type". sdradio will differ from interferometry. FrancoisBonnarel - 2023-02-09

(*) Single dish radio maps cannot be considered as “image” dataproducts. Data are typically written in one or more tables, each row containing coordinate positions, timestamp and raw intensity (raw counts), and further processing is required to obtain a proper image. Also, in the more general case data are not acquired on a regular 2D grid in a single map. Typical observations consist of more than one map, to be combined to recover the final image. Maps can be obtained in spectropolarimetric mode, so the most appropriate parent term seems to be “cube”.

FB answer Sure, this looks like a (sparse) cube. FrancoisBonnarel - 2023-02-10

(**) In principle the crosscan can be executed in raster mode instead of on-the-fly. For this reason the narrower term has been left more generic and the specific description is delegated to dataproduct_subtype.

 

Note that INAF has no Phased Array Feed (PAF) receivers on board its radio telescopes, so we are not taking into account cases specific to beamforming techniques. Thus, more values could be needed. Do we have any PAF expert in the RadioIG?

FB answer I think LOFAR and NenuFAR people do have this; Yan? Baptiste? Alan? FrancoisBonnarel - 2023-02-10

BC answer About phased arrays, yes, I think Alan and I can give some input. BaptisteCecconi - 2023-02-10

This approach has some advantages: the narrower terms are in principle usable also in other spectral domains, associated with appropriate parent values/dataproduct_subtype. A query may happen in a two-level mode: a generic one can be done on “sdradio” getting back all the data products associated to any narrower term; alternatively a more detailed query can be done directly on one of the narrower terms.
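The two-level query described above can be sketched as follows. This is only an illustration under stated assumptions: the PARENTS mapping is transcribed from the proposed table, while the record layout, the obs_id values and the function names are hypothetical. An actual service would more likely run such a selection in ADQL against a maintained vocabulary.

```python
# Parent/narrower-term associations transcribed from the proposal above.
PARENTS = {
    "sdradio": [],
    "spatial-profile": ["sdradio"],
    "map": ["cube", "sdradio"],
    "on-source": ["spectrum", "sdradio"],
    "crosscan": ["spectrum", "sdradio"],
}


def matches(term, wanted):
    """Return True if term is wanted or has wanted among its ancestors."""
    if term == wanted:
        return True
    return any(matches(parent, wanted) for parent in PARENTS.get(term, []))


def select(records, wanted):
    """Keep the records whose dataproduct_type is wanted or narrower than it."""
    return [r for r in records if matches(r["dataproduct_type"], wanted)]


# Hypothetical records, for illustration only.
records = [
    {"obs_id": "a", "dataproduct_type": "map"},
    {"obs_id": "b", "dataproduct_type": "on-source"},
    {"obs_id": "c", "dataproduct_type": "image"},
]

generic = select(records, "sdradio")  # every single dish product
detailed = select(records, "map")     # only the map products
```

The generic query on "sdradio" returns every record tagged with one of its narrower terms, while the detailed query returns only the records tagged with that specific term.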

FB answer If we consider that all this is done by one or more new parameters describing the observation (and not the product type), do we prefer several parameters or one single parameter with a hierarchy of terms? FrancoisBonnarel - 2023-02-10

We are aware that this proposal is somewhat different from the general VO approach because it is strongly related to a particular instrument/telescope design. However, we are motivated by the need to make single dish data discoverable in an effective manner, which could hardly be achieved by using the current ObsCore dataproduct_type values.

FB answer In other words: we have to distinguish the description of the observation (which is something like a provenance) from the type of the data, which is important for the usage of the data. So really, again, I think we need one or more new parameters in the extension. FrancoisBonnarel - 2023-02-10

ML answer I suggest having a special extension for observing configuration. This would also fit other domains like X-rays, high energy, etc. MireilleLouys - 2023-02-10

 

em_xel and spectral MOCs

Single dish radio data may contain a multifrequency setup, that is, many spectral windows that are disjoint on the spectral axis and have different resolutions (i.e. different numbers of spectral channels). In such a case em_xel can be computed but could lead to an incorrect interpretation of the actual spectral sampling of the dataset. In this respect, current efforts towards the creation of energy/frequency MOCs by the IVOA Applications Working Group could represent a solution. We note that frequency MOCs would also offer a comprehensive representation of the em_min, em_max and overall spectral coverage/properties of the dataset.

A further extension of the work on energy/frequency MOCs could be developed to include also the polarimetric information. In fact, two polarisation states in the same dataset could in principle have different frequency setups.
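To make the em_xel concern concrete, the sketch below collapses a multifrequency setup into single em_min, em_max and em_xel values and compares them with the fraction of the band actually observed. The two spectral windows and the function name are invented for illustration; only the ObsCore column names come from the standard (em_min and em_max are wavelengths in metres).

```python
C = 299_792_458.0  # speed of light in m/s

# Hypothetical spectral windows: (f_min in Hz, f_max in Hz, number of channels).
windows = [
    (18.0e9, 18.5e9, 1024),
    (22.0e9, 22.0625e9, 4096),
]


def summarise(windows):
    """Collapse all windows into one set of ObsCore-style spectral values."""
    f_lo = min(w[0] for w in windows)
    f_hi = max(w[1] for w in windows)
    em_min = C / f_hi  # shortest wavelength, in metres
    em_max = C / f_lo  # longest wavelength, in metres
    em_xel = sum(w[2] for w in windows)
    covered_hz = sum(w[1] - w[0] for w in windows)
    covered_fraction = covered_hz / (f_hi - f_lo)
    return em_min, em_max, em_xel, covered_fraction


em_min, em_max, em_xel, covered_fraction = summarise(windows)

# Channel width of each window, which a single em_xel value cannot convey.
channel_widths_hz = [(w[1] - w[0]) / w[2] for w in windows]
```

Here the bounds span about 4 GHz, but only a small fraction of that range is sampled, and the two windows have very different channel widths. A set of intervals (or a frequency MOC) would carry this information, whereas the bounds plus a single em_xel do not.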

FB answer From the ObsCore/Characterisation point of view, what you describe is the "support" concept, which is more accurate than "bounds" (= min/max). A support is either a detailed spatial field of view or a set of intervals. MOC can be used to render those things. It would not be a new extension concept but a coding format for it. In VOTable it would be rendered by the xtype attribute (xtype="moc" or "stmoc" or "emoc"). FrancoisBonnarel - 2023-02-10

BC answer Maybe a solution would be to have a generic "multi_dim_coverage" column (or just "coverage"), in which there would be a MOC (with spatial, temporal and/or spectral domains, defined by an xtype in the column header). This may cover the other comment later in the text about the variation of the field of view across the spectral window. This should be further discussed, of course, but it might be useful here. Baptiste Cecconi - 2023-02-10

 

Therefore, ObsCore mapping on radio data in general represents a proof of concept for current developments on the MOC standard.

An example of multifrequency setup is shown in the following figure: a spectroscopic observation in the so-called “zoom mode” with the Xarcos spectrograph, delivering the two circular polarisations for each spectral window.



Comments on other ObsCore parameters

We collected two examples of ObsCore fields whose interpretation appears to be different from the original IVOA prescriptions.

Facility_name and instrument

In our view these parameters should have the same meaning irrespective of whether they refer to space-, atmosphere- or ground-based instrumentation. Typically a spacecraft hosts a number of different instruments, similarly to what happens with ground-based telescopes. The same applies to modern balloon-borne experiments. Facilities like the ISS can be seen as equivalent to ground-based observatories in the sense that they host different telescopes/experiments.

We propose that facility_name identifies the (observatory + telescope) hosting the instrument used to acquire the dataset, while instrument describes the acquisition system used among all those available on that telescope.

For instance: facility_name=ESO-VISTA, instrument=VIRCAM

For the radio domain, generally speaking, an instrument would be composed of a number of tokens, e.g. specific filters between the frontend and the backend, the frontend and backend themselves, as well as the beamformer/correlator used.

For single dish data we are currently describing the acquisition system with the combination frontend+backend.

 
FB answer I think we have to follow what you recommend here; FrancoisBonnarel - 2023-02-10
 

s_fov

In the draft IVOA Obscore Extension for Radio data Version 1.0 s_fov is defined as “A typical value for the field of view size … λ/D where λ is the mid value of the spectral range and D is the diameter of the telescope or the largest diameter of the array antennae or telescopes.”

This appears in contrast with the ObsCore definition: “The s_fov column (1D size of the field of view) contains the approximate size of the region covered by the data product. For a circular region, this is the diameter (not the radius). For most data products the value given should be large enough to include the entire area of the observation; coverage within the bounded region need not be complete, for example if the specified FOV encompasses a rotated rectangular region. For observations which do not have a well-defined boundary, e.g. radio or high energy observations, a characteristic value should be given.”

FB answer This definition is loose enough that the mid value could also be valid. The resulting point for discussion would be "what do we need this value for"; then we can decide which definition we take in our radio case. FrancoisBonnarel - 2023-02-10

BC answer The radio extension of ObsTAP already includes s_fov_min and s_fov_max, which is good. For compatibility and consistency with other services, s_fov should be filled with the "representative" value, that is, the value that the provider considers most representative of the fov. The provider should have a good idea of the typical value that a user would use to filter data. BaptisteCecconi - 2023-02-10

The former is in fact computed by means of the mid value of the wavelength range, while the latter is computed by means of the maximum value of the wavelength range.

Also, we note that for low frequency aperture arrays whose stations are composed of dipole antennas the diameter D could not be given simply as a dish size but may be defined as a function of the number of dipoles, geometry, etc.
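The difference between the candidate definitions can be made concrete with a small worked example of λ/D. This is a sketch under stated assumptions: the 32 m aperture and the 18 to 26.5 GHz band are invented values, the "mid value" is taken as the midpoint in wavelength (it could equally be defined in frequency), and s_fov_mid is the name proposed in these comments rather than an adopted column.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def fov_family(f_min_hz, f_max_hz, aperture_m):
    """Return lambda/D in degrees at the shortest, middle and longest wavelength."""
    lam_min = C / f_max_hz
    lam_max = C / f_min_hz
    lam_mid = 0.5 * (lam_min + lam_max)  # assumption: midpoint taken in wavelength
    candidates = (
        ("s_fov_min", lam_min),
        ("s_fov_mid", lam_mid),
        ("s_fov_max", lam_max),
    )
    return {name: math.degrees(lam / aperture_m) for name, lam in candidates}


# Hypothetical 32 m dish observing between 18 and 26.5 GHz.
fov = fov_family(18.0e9, 26.5e9, 32.0)
for name, value in fov.items():
    print(f"{name} = {value:.4f} deg")
```

For a wide band the three values differ noticeably, which is why the choice of the value stored in s_fov depends on whether it is meant for discovery or for description.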

o_ucd

In the current UCD vocabulary (UCD1+ Controlled Vocabulary 1.4 https://ivoa.net/documents/UCD1+/20210616/index.html) there appear to be no primary words suitable to describe raw single dish radio data.

For pulsar data and transient radio data, o_ucd=stat.Fourier could be used, as proposed for visibility data in the Obscore extension for radio data document (v 1.0).

The use of o_ucd=phot.flux.density for raw single dish data does not seem appropriate, since the single dish measured quantity is expressed in raw counts. These counts come from the digitisation of a voltage signal generated in the receiver chain by the incoming electromagnetic field.

FB answer there is phot.count : wouldn't that be ok for raw single dish data ? FrancoisBonnarel - 2023-02-10

MM answer I think it doesn't because it's not photons that are recorded by the ADC conversion of the EM field. This looks semantically different. But I need @alessandra.zanichelli@inaf.it to check if my comment is right. MarcoMolinaro - 2023-02-10

BC answer There might indeed be some semantics issue here. We had this discussion a few years ago in the Semantics WG, and the proposed solution was to use "phot.flux.density" for both photometric flux density and EM wave flux density, since there would be no sense in having 2 terms in this case. I would say that the raw counts issue is different: "phot.count" means "Flux expressed in counts" (and this is really counting photon hits), whereas the output of an RF ADC is not photon hit counts. I'm submitting a new term for ADU (i.e., analogue to digital converter units) to the UCD group. Baptiste Cecconi - 2023-02-10

 



Specific comments on IVOA Obscore Extension for Radio data Version 1.0 (IVOA Note 2022-10-14)

  • Introduction (page 3) and thereon: when referring to the single dish data format we would avoid being too specific in referring to the SDFITS format. SDFITS is a registered standard but is not the only possible data format. Other FITS flavours which follow the FITS prescriptions (without being registered standards) are delivered by the various telescopes.

It was given as an example; we may add others to be more realistic. FrancoisBonnarel - 2023-02-10
 
  • Sect 2.1:

  1. single dishes may be equipped with multifeed receivers; we would avoid mentioning the acquisition of signal through a central beam only, because this is a particular case of a more general scenario.

  2. there seems to be a mix between the definitions of s_fov and primary beam. In ObsCore s_fov is defined as the “approximate size of the region covered by the data product”. This is not the primary beam size of the antenna; for instance, a map observation has an s_fov described by the approximate size of the mapped region on the sky. See also the note on Sect 4.1 below.

This part has to be rewritten. FrancoisBonnarel - 2023-02-10
 
  3. the typical case is not the one stated at the beginning of the section (spectrum), since the acquisition of emission for each spectral sample in the spectral band and polarization can generally be done in more complex modes (for instance, map). The following sentences summarize the variety of cases well. We should rephrase the content, starting from the general statement and later describing specific cases.

OK FrancoisBonnarel - 2023-02-10
 
  4. single dish multifeed receivers typically allow larger spatial regions to be covered, simultaneously acquiring spectra from different positions on the sky that still share the same spectral setup.

OK FrancoisBonnarel - 2023-02-10
 
  • Sect 3: a typo in the first sentence: for radio data (not only for visibilities)

OK FrancoisBonnarel - 2023-02-10

  • Sect 3.2: a corresponding definition for single dish data should be given.

Of course. FrancoisBonnarel - 2023-02-10

Also: with ultrawide band receivers (for instance: 20 GHz bandwidth at 7 mm) the number of spectral windows (each with its own setup) may increase greatly, multiplying the number of entry lines in ObsCore for the same observation. How do we plan to deal with these cases? Are we happy to have a large number of records in such cases?

I think it's difficult to avoid. Otherwise we have to group several spectral windows together and use the multi-interval support concept. FrancoisBonnarel - 2023-02-10

 
  • Sect 3.3: the definition of D as the largest diameter of the array antennas or telescopes is not correct for every telescope type. In fact, there is no main mirror size in the case of dipole arrays or beamformed data. Instead, D should be defined as the measuring system aperture scale, as already written in Sect 2.

OK FrancoisBonnarel - 2023-02-10
 
  • Sect 3.4: it should be clarified that s_resolution is the best (smallest) spatial resolution achieved during the observation. The minimum of the spectral range should be used instead. This holds true for any radio data. See also the note on Sect 4.1 below.

I think it's the same discussion as for the s_fov family. What do we expect from such a value? FrancoisBonnarel - 2023-02-10
 
  • Sect 3.5: last sentence could be changed to mention MOCs: “the use of MOC for s_region is strongly encouraged to gather the more accurate description of the spatial coverage”

See my comment above: MOC is an xtype. FrancoisBonnarel - 2023-02-10

 
  • Sect 3.6: see note above on single dish values for o_ucd.

OK FrancoisBonnarel - 2023-02-10

  • Sect 4.1: by ObsCore definition s_fov coincides with s_fov_max. We suggest introducing s_fov_mid and s_fov_min. Similarly, s_resolution coincides with s_resolution_min and we suggest introducing s_resolution_min and s_resolution_max.

We agree that in the radio case we need three concepts for fov and resolution. Which of them we call s_resolution and s_fov has to be discussed according to what usage we want to make of it (discovery or description). FrancoisBonnarel - 2023-02-10

 

Comments on ObsCore extension and Pulsar/FRB draft documents. John Tobin

1. spelling of names/definitions/utypes - antennae -> antennas, excentricity -> eccentricity or ellipticity, hence uv_distribution_exc might not be appropriately named.

OK, thanks. FrancoisBonnarel - 2023-02-10

 2. uv_distribution_fill - this definition seemed quite confusing to me. As written, it seems like you would always get an answer of 1/N_samples; I think it needs to be a summation over i,j for n_cells with n_points >=1/n_cells. Maybe I am reading the definition incorrectly. The other issue with this definition is that it does not account for the fact that a dataset can have a large number of channels, where each is actually its own uv point. Each entry in a VOTable will be split into some number of channels, so this might need to be addressed and perhaps requires its own field. Finally, the uv filling factor will also be different depending on whether a user has continuum or spectral line observations in mind: continuum will have multi-frequency synthesis, which implicitly increases its uv-coverage, while spectral line applications will implicitly have worse uv-coverage.

I see your point, and would like to have comments coming from our Astron and JIVE colleagues who originally proposed to characterize the uv coverage this way. FrancoisBonnarel - 2023-02-10

 3. uv_distance_max, uv_distance_min; This might not quite be fine-grained enough because you might have one really long baseline and one very short baseline, but an array is actually configured somewhere in between. Perhaps also adding a 75th percentile baseline and 50th percentile baseline distance would be useful to add to this since those values would provide more information about where most of the uv-coverage is concentrated.

Good point, we were already wondering how to estimate "effective numbers" for these two quantities in order to avoid "outliers". Your percentile is an interesting proposal to investigate. Or can we find another significant minimum and maximum estimation? FrancoisBonnarel - 2023-02-10
 
Well, for dense-core arrays, there might be very few "outlier" baselines, but those are a very significant addition to the core. Hence, we (NenuFAR team) would like to keep the min and max values as they are. Remember that this metadata should be filled for each observation, hence those values should contain the actual baseline min and max values for an observation, not a generic value for the instrument. Since we are building data discovery metadata, the uv coverage keywords should be consistent with each shared dataset. BaptisteCecconi - 2023-02-10

The 75th percentile uv distance is not a generic value, but could be calculated for each dataset. I get your point that dense core arrays will have fewer outlier baselines, but the density of uv points will be such that the beam one gets from imaging a dataset would be more reliably characterized by something like the 75th percentile baseline rather than the max uv distance. I think there would be value in min, max, and something in between like the 75th percentile. JohnTobin - 2023-02-10

 4. sky_scan_mode - this has applications beyond just single-dish as listed because interferometry data also have single-pointing, mosaic, and on the fly mosaics. Also, for single-dish there may be other modes that are not covered like drift-scan.

OK, thanks for that. In that case we have to restructure the text. FrancoisBonnarel - 2023-02-10

 5. s_resolution_beam_dirty - it is unclear what is intended here, whether it's to be a map of the dirty beam or the resolution of the dirty beam. If it's the resolution, this is somewhat redundant because the resolution of a cleaned map is derived from a Gaussian fit to the central core of the dirty beam, and there would be a degeneracy with what is provided by s_resolution. The dirty beam image, or psf image as CASA refers to it, is not always archived. ALMA and the NRAO do not include it in their standard image products for instance, so it is unclear how readily available this information would be for most archives.

Obviously the idea was to add a FIELD containing a link to the dirty beam map. The idea was to display it to help the user figure out the level of quality of the data. And this is not a queryable column, of course. FrancoisBonnarel - 2023-02-10
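As an illustration of the percentile proposal discussed under point 3, the sketch below computes minimum, median, 75th percentile and maximum baseline lengths for one dataset. It is a simplification under stated assumptions: the antenna layout (a dense 3 x 3 core plus one distant antenna) is invented, physical baseline lengths stand in for projected uv distances, and the key names p50 and p75 are hypothetical, not proposed column names.

```python
import math


def percentile(sorted_values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    if not sorted_values:
        raise ValueError("no baselines given")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def baseline_summary(antennas):
    """Summarise the lengths (in metres) of all antenna pairs."""
    lengths = sorted(
        math.dist(a, b)
        for i, a in enumerate(antennas)
        for b in antennas[i + 1:]
    )
    return {
        "min": lengths[0],
        "p50": percentile(lengths, 50),
        "p75": percentile(lengths, 75),
        "max": lengths[-1],
    }


# Hypothetical layout: 3 x 3 core with 10 m spacing plus one antenna 1 km away.
core = [(10.0 * i, 10.0 * j) for i in range(3) for j in range(3)]
summary = baseline_summary(core + [(1000.0, 0.0)])
```

With this layout the maximum is set entirely by the single distant antenna, while the 75th percentile stays within the core, so the two values describe different properties of the same dataset. This supports keeping min and max as they are and adding an intermediate value rather than replacing them.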
 
 