| ||||||||
Changed: | ||||||||
< < | VOSpace home page | |||||||
> > | VOSpace home page | |||||||
VO and VOSpace data formats
This is a discussion page looking at defining and registering a list of standard VO data formats. Although the VOSpace team will probably want to register VOStandard resources for the VO data formats, this discussion is not intended to be about using registry URIs vs MIME types to describe data formats. All I am suggesting at the moment is that we work together to produce a list of the core VO data formats. As far as I know, we don't have a list of the common data formats used within the VO. If we do already have one, then please let me know. Initially, this can be just a wiki page that describes the formats, provides links to the relevant RFC documents, lists the common MIME types associated with each format, specifies the VO standard MIME type(s) we want to use and describes how the formats are used within astronomy and the VO. The information collected on the wiki may form the basis for a VOStandard resource that registers the common data formats, but the initial step is to collect the information.
-- DaveMorris - 29 May 2007
Concepts
The recent discussion thread about data formats and MIME types on the DAL mailing list started with a question about "gzipped images in SIAP 1.0" and has opened up into a wider discussion about MIME types and HTTP headers. The thread has highlighted the fact that there are at least three different concepts that we need to represent when transferring data.
Data, container and transport formats
My initial guess is that we need two (possibly three) types of format: a data format, a container format and possibly a transport format.
Data formats
Container formats
Transport formats
Specialization and inheritance
Some data formats may be specializations of existing formats, and may inherit the MIME type or other details from their parent format.
Java JAR format
An example of this is the Java Archive format. From the JAR file specification :
VOSpace archive format
What I think will be useful for astronomers is to be able to say
| ||||||||
Added: | ||||||||
> > | The following is probably not best represented by a data format. It may be better to add a separate source data set tag to the file, rather than trying to encode too much metadata in the file type. | |||||||
Data set specific FITS format
Having talked about this with some of our astronomers, one thing they suggested would be useful is to be able to define types that represent data from specific data sets. If we had a VO data type that represented FITS image, then they would like to be able to define a new type that represented a FITS image from a specific data set. This new type would extend the definition of FITS image, inheriting all of the properties and behaviour defined for a generic FITS image, and add details of the specific FITS header fields that that particular data set used in their files. Defining this as an extension type means that tools and applications that don't understand the specific type could handle the data as generic FITS, but tools that did understand the more specific type would be able to make use of the additional metadata that described what the data set specific header fields meant. At this point I don't know if this is a GoodIdea or not, or whether this information should be encoded in the content type or in a separate property. However, our astronomers seemed to think that the ability to distinguish between a generic FITS file and a FITS file from a specific data set was important. How we enable them to make this distinction is up for discussion.
<--
|
VO and VOSpace data formats
This is a discussion page looking at defining and registering a list of standard VO data formats. Although the VOSpace team will probably want to register VOStandard resources for the VO data formats, this discussion is not intended to be about using registry URIs vs MIME types to describe data formats. All I am suggesting at the moment is that we work together to produce a list of the core VO data formats. | ||||||||
Changed: | ||||||||
< < | As far as I know, at the moment we don't have a list of the common data formats used within the VO. If we do already have one, then please let me know. | |||||||
> > | As far as I know, we don't have a list of the common data formats used within the VO. If we do already have one, then please let me know. | |||||||
Initially, this can be just a wiki page that describes the formats, provides links to the relevant RFC documents,
lists the common MIME types associated with each format, specifies the VO standard MIME type(s) we want to use
and describes how the formats are used within astronomy and the VO.
The information collected on the wiki may form the basis for a VOStandard resource that registers the common data formats,
but the initial step is to collect the information.
-- DaveMorris - 29 May 2007
Concepts
The recent discussion thread about data formats and MIME types on the DAL mailing list started with a question about "gzipped images in SIAP 1.0" and has opened up into a wider discussion about MIME types and HTTP headers. The thread has highlighted the fact that there are at least three different concepts that we need to represent when transferring data.
| ||||||||
Changed: | ||||||||
< < | Data formats and container formats | |||||||
> > | Data, container and transport formats | |||||||
Added: | ||||||||
> > | My initial guess is that we need two (possibly three) types of format, a data format, a container format and possibly a transport format. | |||||||
Changed: | ||||||||
< < | ||||||||
> > | Data formats | |||||||
Deleted: | ||||||||
< < | My initial guess is that we need two types of format, a data format and a container format. | |||||||
| ||||||||
Added: | ||||||||
> > |
Many of the common data formats have multiple MIME types associated with them, and the meaning may depend on the context.
It would be useful to bring together the information about the different data formats and MIME types from the existing VO standards.
This may make it easier for new VO standards and services to refer to, and preferably re-use, information about MIME types and data formats from the existing VO standards.
One example of this is that we may already have three or four different recommended MIME type strings that mean ASCII VOTable.
Which MIME type string is used depends on what type of service you are using, and what the VOTable file contains (metadata about a set of results or the actual data itself).
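As a rough illustration of the point above, a lookup table could map several MIME type strings onto one logical format name, so that services agree on what a given string means. This is only a sketch: the format names and the `normalise` and `format_for` helpers are hypothetical, and the MIME strings are examples of the kind seen in VO services rather than an agreed list.

```python
# Hypothetical alias table: several MIME type strings, one logical format name.
# The entries are illustrative assumptions, not a list agreed on this page.
MIME_ALIASES = {
    "application/x-votable+xml": "votable",
    "text/xml;content=x-votable": "votable",
    "text/xml": "xml",
    "application/fits": "fits",
}


def normalise(mime_type):
    """Lower-case a MIME type string and drop whitespace around ';' and '='."""
    parts = [part.strip() for part in mime_type.lower().split(";")]
    return ";".join(
        "=".join(piece.strip() for piece in part.split("=")) for part in parts
    )


def format_for(mime_type):
    """Return the logical format name for a MIME type string, or None if unknown."""
    return MIME_ALIASES.get(normalise(mime_type))
```

With a table like this, `format_for("text/xml; content=x-votable")` and `format_for("application/x-votable+xml")` would resolve to the same format name, while an unlisted string returns `None`.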
Container formats | |||||||
| ||||||||
Changed: | ||||||||
< < | Ideally, we would want to be able to come up with a standard vocabulary that could represent the following concepts : | |||||||
> > | To meet the design goals of VOSpace, we need to be able to represent the following concepts to the client application : | |||||||
| ||||||||
Changed: | ||||||||
< < | I defer to others who have a much better understanding of the FITS format and its usage | |||||||
> > | I hope others who have a much better understanding of the FITS format and its usage within astronomy will be able to define this in more detail. | |||||||
Deleted: | ||||||||
< < | within astronomy to define this in more detail. | |||||||
Changed: | ||||||||
< < | Ideally, we would want to be able to come up with a standard vocabulary that could represent the following concepts : | |||||||
> > | Ideally, we would want to be able to distinguish between the following concepts : | |||||||
| ||||||||
Changed: | ||||||||
< < | I don't know if we can define a simple MIME type that distinguishes between these two. | |||||||
> > | ||||||||
Added: | ||||||||
> > | Transport formats
| |||||||
Changed: | ||||||||
< < | At the moment I'm looking at building a common list of what formats we want to be able to represent, and our initial best-guess at how we want to describe them. If this particular case is overly complex and rarely used, then we label it as 'there be dragons' and move on. | |||||||
> > | Transport formats will be specific to the particular transport protocol used. The most common examples are the content-encoding options that can be applied to HTTP data streams. I don't know if any other transport protocols support equivalent options for processing or compressing the data stream. | |||||||
Added: | ||||||||
> > | However, it may be useful to at least list the main content-encoding options that we expect VO services to be able to apply to HTTP data streams and identify the standard MIME type strings that VO services should use to refer to them. | |||||||
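To make the distinction concrete, here is a minimal sketch of a client that treats the HTTP content-encoding as a transport-level detail, separate from the data format named in the content-type. The `decode_body` helper is hypothetical, and it assumes the headers arrive as a dict with lower-case names.

```python
import gzip
import zlib


def decode_body(headers, body):
    """Undo the HTTP content-encoding and return (content_type, decoded_bytes).

    `headers` is assumed to be a dict with lower-case header names.
    """
    encoding = headers.get("content-encoding", "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        data = gzip.decompress(body)
    elif encoding == "deflate":
        data = zlib.decompress(body)
    elif encoding == "identity":
        data = body
    else:
        raise ValueError("unsupported content-encoding: " + encoding)
    # The data format is described by content-type; the encoding does not change it.
    return headers.get("content-type", "application/octet-stream"), data
```

In this sketch a gzipped FITS image sent with `content-type: image/fits` and `content-encoding: gzip` comes back as `image/fits` plus the uncompressed bytes, so the data format seen by the application is the same whether or not the stream was compressed in transit.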
Specialization and inheritance
Some data formats may be specializations of existing formats, and may inherit the MIME type or other details from their parent format.
Java JAR format
An example of this is the Java Archive format. From the JAR file specification :
VOSpace archive format
What I think will be useful for astronomers is to be able to say
Data set specific FITS format
Having talked about this with some of our astronomers, one thing they suggested would be useful is to be able to define types that represent data from specific data sets. If we had a VO data type that represented FITS image, then they would like to be able to define a new type that represented a FITS image from a specific data set. This new type would extend the definition of FITS image, inheriting all of the properties and behaviour defined for a generic FITS image, and add details of the specific FITS header fields that that particular data set used in their files. Defining this as an extension type means that tools and applications that don't understand the specific type could handle the data as generic FITS, but tools that did understand the more specific type would be able to make use of the additional metadata that described what the data set specific header fields meant. At this point I don't know if this is a GoodIdea or not, or whether this information should be encoded in the content type or in a separate property. However, our astronomers seemed to think that the ability to distinguish between a generic FITS file and a FITS file from a specific data set was important. How we enable them to make this distinction is up for discussion.
<--
|
VO and VOSpace data formats
This is a discussion page looking at defining and registering a list of standard VO data formats. | ||||||||
Added: | ||||||||
> > | Although the VOSpace team will probably want to register VOStandard resources for the VO data formats, this discussion is not intended to be about using registry URIs vs MIME types to describe data formats. All I am suggesting at the moment is that we work together to produce a list of the core VO data formats. As far as I know, at the moment we don't have a list of the common data formats used within the VO. If we do already have one, then please let me know. Initially, this can be just a wiki page that describes the formats, provides links to the relevant RFC documents, lists the common MIME types associated with each format, specifies the VO standard MIME type(s) we want to use and describes how the formats are used within astronomy and the VO. The information collected on the wiki may form the basis for a VOStandard resource that registers the common data formats, but the initial step is to collect the information. | |||||||
-- DaveMorris - 29 May 2007
| ||||||||
Added: | ||||||||
> > | Concepts | |||||||
The recent discussion thread about data formats and MIME types on the DAL mailing list started with a question about "gzipped images in SIAP 1.0" and has opened up into a wider discussion about MIME types and HTTP headers.
The thread has highlighted the fact that there are at least three different concepts that we need to represent when transferring data.
Data formats and container formats | ||||||||
Changed: | ||||||||
< < | ||||||||
> > | ||||||||
Added: | ||||||||
> > | ||||||||
My initial guess is that we need two types of format, a data format and a container format.
Specialization and inheritance | ||||||||
Added: | ||||||||
> > | ||||||||
Some data formats may be specializations of existing formats, and may inherit the MIME type or other details from their parent format.
Java JAR format | ||||||||
Added: | ||||||||
> > | ||||||||
An example of this is the Java Archive format.
From the JAR file specification :
VOSpace archive format | ||||||||
Added: | ||||||||
> > | ||||||||
What I think will be useful for astronomers is to be able to say
Data set specific FITS format | ||||||||
Added: | ||||||||
> > | ||||||||
Having talked about this with some of our astronomers, one thing they suggested would be useful is to be able to define
types that represent data from specific data sets.
If we had a VO data type that represented FITS image, then they would like to be able to define a new type
that represented a FITS image from a specific data set.
This new type would extend the definition of FITS image, inheriting all of the properties and behaviour defined for a generic FITS image, and add details of the specific FITS header fields that that particular data set used in their files.
Defining this as an extension type means that tools and applications that don't understand the specific type could handle the data
as generic FITS, but tools that did understand the more specific type would be able to make use of the additional metadata
that described what the data set specific header fields meant.
At this point I don't know if this is a GoodIdea or not, or whether this information should be encoded in the content type or
in a separate property.
However, our astronomers seemed to think that the ability to distinguish between a generic FITS file and a FITS file from a specific data set
was important.
How we enable them to make this distinction is up for discussion.
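One way to picture the extension-type idea is a small parent chain, where a specific type inherits its MIME type from the generic one and a tool falls back to the most specific type it understands. This is a sketch only: the `DataFormat` class, the `best_handler` helper and the format names are hypothetical, and whether this information belongs in the content type at all is still open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataFormat:
    name: str
    mime_type: Optional[str] = None        # None means "inherit from the parent"
    parent: Optional["DataFormat"] = None  # None means this is a root format

    def effective_mime_type(self):
        """Walk up the parent chain until a format defines a MIME type."""
        fmt = self
        while fmt is not None:
            if fmt.mime_type is not None:
                return fmt.mime_type
            fmt = fmt.parent
        return None

    def lineage(self):
        """Names from the most specific format up to the root."""
        names, fmt = [], self
        while fmt is not None:
            names.append(fmt.name)
            fmt = fmt.parent
        return names


def best_handler(fmt, handlers):
    """Pick the handler for the most specific format name the tool knows."""
    for name in fmt.lineage():
        if name in handlers:
            return handlers[name]
    return None


# Hypothetical formats: a generic FITS image and a data set specific extension.
FITS_IMAGE = DataFormat("fits-image", mime_type="image/fits")
SURVEY_IMAGE = DataFormat("example-dataset-fits-image", parent=FITS_IMAGE)
```

A tool that only registers a handler for `fits-image` would still be handed `SURVEY_IMAGE` data as generic FITS, while a tool that also registers `example-dataset-fits-image` would get the more specific handler and could interpret the extra header fields.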
<--
|
VO and VOSpace data formats
This is a discussion page looking at defining and registering a list of standard VO data formats. | ||||||||
Added: | ||||||||
> > | -- DaveMorris - 29 May 2007 | |||||||
The recent discussion thread about data formats and MIME types on the DAL mailing list started with a question about "gzipped images in SIAP 1.0" and has opened up into a wider discussion about MIME types and HTTP headers. The thread has highlighted the fact that there are at least three different concepts that we need to represent when transferring data. | ||||||||
Changed: | ||||||||
< < |
| |||||||
> > |
| |||||||
Data formats and container formats
My initial guess is that we need two types of format, a data format and a container format.
Specialization and inheritance
Some data formats may be specializations of existing formats, and may inherit the MIME type or other details from their parent format.
Java JAR format
An example of this is the Java Archive format. | ||||||||
Changed: | ||||||||
< < | From the JAR file [[http://java.sun.com/j2se/1.4.2/docs/guide/jar/jar.html][specification] : | |||||||
> > | From the JAR file specification : | |||||||
VOSpace archive format
What I think will be useful for astronomers is to be able to say
| ||||||||
Changed: | ||||||||
< < | Survey specific FITS format | |||||||
> > | Data set specific FITS format | |||||||
Changed: | ||||||||
< < | Having talked about this with some of our astronomers, one thing they did mention would be useful is to be able to define extension types that represent data from specific surveys. | |||||||
> > | Having talked about this with some of our astronomers, one thing they suggested would be useful is to be able to define types that represent data from specific data sets. | |||||||
If we had a VO data type that represented FITS image, then they would like to be able to define a new type | ||||||||
Changed: | ||||||||
< < | that represented a FITS image from a specific survey. | |||||||
> > | that represented a FITS image from a specific data set. | |||||||
Changed: | ||||||||
< < | This new type would extend the standard FITS image, and describe the specific FITS header fields that | |||||||
> > | This new type would extend the definition of FITS image, inheriting all of the properties and behaviour defined for a generic FITS image, and add details of the specific FITS header fields that that particular data set used in their files. | |||||||
Deleted: | ||||||||
< < | that particular survey used in their files. | |||||||
Changed: | ||||||||
< < | The extension type would not define a new MIME type, so files of this type would inherit the standard MIME type from FITS image. However, the more specific content URI would point to the extension type, enabling users and software tools to be able to process the data more accurately. | |||||||
> > | Defining this as an extension type means that tools and applications that don't understand the specific type could handle the data as generic FITS, but tools that did understand the more specific type would be able to make use of the additional metadata that described what the data set specific header fields meant. | |||||||
At this point I don't know if this is a GoodIdea or not, or whether this information should be encoded in the content type or | ||||||||
Changed: | ||||||||
< < | in a separate field. However, our astronomers seemed to think that the ability to distinguish between a generic FITS file and a FITS file from a specific survey | |||||||
> > | in a separate property. However, our astronomers seemed to think that the ability to distinguish between a generic FITS file and a FITS file from a specific data set | |||||||
was important.
How we enable them to make this distinction is up for discussion.
<--
|
VO and VOSpace data formats
This is a discussion page looking at defining and registering a list of standard VO data formats.
The recent discussion thread about data formats and MIME types on the DAL mailing list started with a question about "gzipped images in SIAP 1.0" and has opened up into a wider discussion about MIME types and HTTP headers. The thread has highlighted the fact that there are at least three different concepts that we need to represent when transferring data.
| ||||||||
Changed: | ||||||||
< < | However, even if we eventually decide not to use the registered URIs in services and applications, creating a list of the main VO data formats and their corresponding MIME types will provide a useful resource for developers working on VO projects. | |||||||
> > | Even if we eventually decide not to use the registered URIs in services and applications, creating a list of the main VO data formats and their corresponding MIME types will provide a useful resource for developers working on VO projects. | |||||||
The first step is to make an initial list of the formats, define the corresponding MIME types and the inheritance hierarchy.
| ||||||||
Deleted: | ||||||||
< < | ||||||||
Data formats and container formats
My initial guess is that we need two types of format, a data format and a container format.
| ||||||||
Changed: | ||||||||
< < |
| |||||||
> > |
| |||||||
In this model FITS may be a special case. From my admittedly basic understanding of FITS, it is possible for FITS to be both a data format (a FITS file containing tabular or image data) and a container format (one FITS file containing multiple images or tables). I defer to others who have a much better understanding of the FITS format and its usage within astronomy to define this in more detail. Ideally, we would want to be able to come up with a standard vocabulary that could represent the following concepts : | ||||||||
Changed: | ||||||||
< < |
| |||||||
> > |
| |||||||
| ||||||||
Deleted: | ||||||||
< < | All I am looking for at the moment is a list of what we want to be able to represent, and our initial best-guess at how we want to refer to them. | |||||||
Added: | ||||||||
> > | At the moment I'm looking at building a common list of what formats we want to be able to represent, and our initial best-guess at how we want to describe them. If this particular case is overly complex and rarely used, then we label it as 'there be dragons' and move on. | |||||||
Specialization and inheritance
Some data formats may be specializations of existing formats, and may inherit the MIME type or other details from their parent format. | ||||||||
Added: | ||||||||
> > | Java JAR format | |||||||
An example of this is the Java Archive format. | ||||||||
Changed: | ||||||||
< < | From the JAR file specification : | |||||||
> > | From the JAR file [[http://java.sun.com/j2se/1.4.2/docs/guide/jar/jar.html][specification] : | |||||||
| ||||||||
Changed: | ||||||||
< < | This particular example may not be useful for astronomers, but I listed it here because it is an already established extension format that developers will be famiular with. | |||||||
> > | This particular example may not be useful for astronomers, but I listed it here because it is an already established extension format that developers will be familiar with. | |||||||
Added: | ||||||||
> > | VOSpace archive format | |||||||
What I think will be useful for astronomers is to be able to say
| ||||||||
Added: | ||||||||
> > | Survey specific FITS format | |||||||
Added: | ||||||||
> > | Having talked about this with some of our astronomers, one thing they did mention would be useful is to be able to define extension types that represent data from specific surveys. | |||||||
Added: | ||||||||
> > | If we had a VO data type that represented FITS image, then they would like to be able to define a new type that represented a FITS image from a specific survey. | |||||||
Added: | ||||||||
> > | This new type would extend the standard FITS image, and describe the specific FITS header fields that that particular survey used in their files. | |||||||
Added: | ||||||||
> > | The extension type would not define a new MIME type, so files of this type would inherit the standard MIME type from FITS image. However, the more specific content URI would point to the extension type, enabling users and software tools to be able to process the data more accurately. | |||||||
Added: | ||||||||
> > | At this point I don't know if this is a GoodIdea or not, or whether this information should be encoded in the content type or in a separate field. However, our astronomers seemed to think that the ability to distinguish between a generic FITS file and a FITS file from a specific survey was important. | |||||||
Changed: | ||||||||
< < | ||||||||
> > | How we enable them to make this distinction is up for discussion. | |||||||
Deleted: | ||||||||
< < | ||||||||
<--
|
VO and VOSpace data formats
This is a discussion page looking at defining and registering a list of standard VO data formats.
The recent discussion thread about data formats and MIME types on the DAL mailing list started with a question about "gzipped images in SIAP 1.0" and has opened up into a wider discussion about MIME types and HTTP headers. The thread has highlighted the fact that there are at least three different concepts that we need to represent when transferring data.
| ||||||||
Added: | ||||||||
> > | Data formats and container formats
My initial guess is that we need two types of format, a data format and a container format.
Specialization and inheritance
Some data formats may be specializations of existing formats, and may inherit the MIME type or other details from their parent format. An example of this is the Java Archive format. From the JAR file specification :
| |||||||
<--
|
VO and VOSpace data formats
This is a discussion page looking at defining and registering a list of standard VO data formats.
The recent discussion thread about data formats and MIME types on the DAL mailing list started with a question about "gzipped images in SIAP 1.0" and has opened up into a wider discussion about MIME types and HTTP headers. The thread has highlighted the fact that there are at least three different concepts that we need to represent when transferring data.
<--
|