Theory IG Meeting, 27-28 February 2006, IoA, University of Cambridge
Background
Over the past year, significant progress has been made by the
International Virtual Observatory Alliance (IVOA) in developing
standards and protocols for astronomical data storage, discovery and
retrieval. Applications and webservices implementing these advances
are now being developed to allow astronomers to query, crossmatch and
analyse data from many of the survey catalogues that are publicly
available.
At the IVOA interoperability meeting in Kyoto (see
InterOpMay2005), the Theory
Interest Group - charged with ensuring that the VO standards also meet
the requirements of theoretical (or simulated) data - outlined a set
of near-term targets with the aim of determining where the existing
IVOA standards and implementations must be updated to allow the
discovery, exchange and analysis of simulated data using VO services.
Computational Cosmology is one area of astronomy that should be a
major beneficiary of an international virtual observatory. N-body
simulations are now pushing available technological resources to their
limits in the drive to understand how large scale structure formed in
the universe. Many independent groups are working towards solving the
problems of hierarchical structure formation, performing large-scale
simulations as an integral part of their investigations. However, the results of many of these simulations are not publicly
available. Furthermore, of those that are, the data are stored in a wide variety of (mostly undocumented) formats and systems, often chosen because they were convenient at the time. The interchange and direct comparison of results by independent groups is therefore uncommon, as much effort is often required to obtain, understand and translate the data before it is of any use.
A primary goal of the IVOA's Theory Interest group is thus to decide
upon a standard file format for raw simulation data and a metadata
language (based on the Universal Content Descriptors used for
observational data) with which to describe the contents and which can
later be expanded to encompass other forms of astrophysical
simulations. Based on this, a set of requirements of the Data Access
Layer (DAL) and Virtual Observatory Query Language (VOQL) working
groups must be identified so that data from cosmological simulations can
seamlessly be discovered and retrieved through VO
portals. Additionally, grid-enabled analysis packages - such as halo identification
algorithms or routines to perform 'virtual observations of
simulations' - need to be developed and published as online tools through VO portals so that simulated data can be
analysed and compared to observed data within the VO environment.
We therefore propose a workshop to bring together current members of the Theory Interest Group with members of the various groups involved in cosmological simulations, to initiate discussions and propose solutions to the issues outlined above. We have arranged a two-day meeting, on 27-28 Feb 2006, at the
IoA, University of Cambridge. The main purpose of the meeting is to identify the requirements that simulated data and services impose on the existing architecture, in order to ensure their successful incorporation into the VO. Hence the format of the meeting will be discussion-oriented.
To register, please email Nicholas Walton (
naw@ast.cam.ac.uk), or add your name to the list of Attendees below.
We hope to see you in Cambridge in February!
Nic Walton and Laurie Shaw
Workshop Aims
- Proposing and determining a standard data format and file structure for particle data files extracted from Nbody/SPH simulations
- Development of a meta-data model for the discovery and querying of published data
- Compiling a technical 'note' of the main requirements of simulated data for the other IVOA working groups, especially with regard to data discovery, querying, and retrieval, and the registration of selected data products and services. To be presented at the IVOA Interop in Victoria in May 2006.
- Identifying the type of services to be made available for the analysis/visualisation of simulated data (e.g. virtual telescopes, online semi-analytical galaxy formation)
Background Links and Work
Registration
Please email Nicholas Walton (naw@ast.cam.ac.uk) (Subject: Theory IG- Registration) if you plan to attend this meeting or simply add your name to the list of Attendees below
Agenda and Presentations
Meeting to open 10.00 am, 27 Feb 2006 and close 16.00, 28 Feb 2006
Agenda and Presentations: NB - Preliminary Draft (subject to change, open to suggestions)
Mon 27 Feb 2006
| Time | Title | Speaker/Chair | Comments | Document |
| 10.00 | Welcome Address | Nicholas Walton | | |
| 10.10 | Theory in the Virtual Observatory | Laurie Shaw | Current status of theory in the VO. Goals for this workshop and the Interop in Victoria | [.pdf] |
| 10.40 | Talks and Demos | | Overview of Ongoing Theory Projects | |
| | UK | Laurie Shaw | | |
| 11.30 | Break | | | |
| 11.45 | Talks and Demos | | Overview of Ongoing Theory Projects (contd) | |
| | Italy | Claudio Gheller | | [.pdf] |
| | Italy | Riccardo Smareglia | | [.pdf] |
| | USA | Peter Teuben | | [.pdf] |
| 13.00 | Lunch | | | |
| 11.45 | Talks and Demos | | Overview of Ongoing Theory Projects (contd) | |
| | France | Herve Wozniak | | [.pdf] [.ppt] |
| 14.30 | Discussion Session 1 | | Theory VO Data Products: Data Models, Data formats and metadata | |
| 15.30 | Break | | | |
| 15.45 | Discussion Session 2 | | Theory VO Data Products (contd) | |
| 17.00 | Close - day 1 | | | |
Tue 28 Feb 2006
| Time | Title | Speaker/Chair | Comments |
| 09.30 | Discussion Session 1 | Peter Teuben | Data Access: Simple Particle Access, developing a theory analogue to simple cone search, requirements of ADQL for theory queries |
| 11.00 | Break | | |
| 11.30 | Discussion Session 2 | | Theory Services: Virtual Telescopes and Analysis tools |
| 12.30 | Lunch | | |
| 13.30 | Discussion Session 3 | Nicholas Walton | Requirements of other IVOA working groups: road plan for Victoria |
| 15.30 | Close | | |
Attendees
If you plan to attend, please add your name to this list (or, if not a twiki devotee, send an email to Nic Walton at
naw@ast.cam.ac.uk)
| Name | Institute | Comments |
| NicholasWalton | IoA, Cambridge, UK | |
| LaurieShaw | IoA, Cambridge, UK | |
| Kona Andrews | IoA, Cambridge, UK | |
| John Helly | Durham, UK | |
| Sverre Aarseth | IoA, Cambridge, UK | mon a.m. |
| Gerard Lemson | MPE, Munich, D | apologies |
| PeterTeuben | UMaryland, USA | |
| Dave De Young | NOAO, USA | apologies |
| Herve Wozniak | Lyon, F | |
| Claudio Gheller | CINECA - Bologna, I | |
| Riccardo Smareglia | Trieste, I | |
| Fabio Pasian | Trieste, I | |
| Valeria Manna | Trieste, I | |
| Ugo Becciani | Catania, I | apologies |
Notes
Welcome
NW introduced the attendees and noted the meeting goals
Theory-Group to date
LDS gave a brief overview of TIG activities to date [slides]
Meeting aims:
- create a group interested in promoting theory data
- specify data products worth publishing
- e.g. halo catalogues, semi-analytical galaxy catalogues
- requirements for registering data products
- description of data products
- querying of data products
- formats, FITS, HDF5
- special theory protocols - simple theory access protocol?
- additional services
- analysis tools
- incorporation of semi-analytic models
- virtual telescopes
Aim to create an IVOA technical note - including a few science cases which define the processes and interrelationships of services required to facilitate those cases.
Discussion Points
- authorisation
- registry
- what is the theory equivalent of RA/Dec
- searching large theory models vs creating new simulations by running the application
- integration of applications to analyse theory data
Theory Projects
UK:
- LDS - see slides
- JH - VirtU initiative - links Virgo collaborators.
GAVO
see
http://www.g-vo.org/portal/
I-VObs Theory
CG: [slides]
- ENZO
- cosmological simulations metadata
- MHD simulations
NVO
PT: [slides]
- Services for Theory in the VO
- Services - based on provenance
- data extraction
- SnapShots (PAT)
- P: n families of particles
- A: attributes (x,v, M, R, T, ...)
- T: times
- nbody FITS
- Worldlines (TAP)
- native format for collisional dynamics
- observer centric time delayed Cosmological?
- APT cubes
- object memory model - e.g. NEMO - most codes don't use this.
- Gridding
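As a rough illustration of the SnapShots (PAT) and Worldlines (TAP) views noted above, the following Python sketch stores attribute columns per snapshot time and then reads one particle back across all times. The class name `PATCube`, its methods and the sample values are hypothetical and are not taken from NEMO or from the slides; this is one possible layout under those assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PATCube:
    """Toy (P, A, T) container: one dict of attribute columns per snapshot time."""
    attributes: tuple                           # A: attribute names, e.g. ("x", "M")
    times: list = field(default_factory=list)   # T: snapshot times
    values: list = field(default_factory=list)  # values[t][attribute][particle]

    def add_snapshot(self, time, columns):
        """Append one snapshot; columns maps attribute name -> list over particles (P)."""
        if set(columns) != set(self.attributes):
            raise ValueError("snapshot must supply exactly the declared attributes")
        if len({len(col) for col in columns.values()}) > 1:
            raise ValueError("all attribute columns must have the same length")
        self.times.append(time)
        self.values.append({name: list(col) for name, col in columns.items()})

    def worldline(self, particle, attribute):
        """Transposed view: follow one particle's attribute through every stored time."""
        return [(t, snap[attribute][particle])
                for t, snap in zip(self.times, self.values)]


# Hypothetical two-particle example with positions x and masses M.
cube = PATCube(attributes=("x", "M"))
cube.add_snapshot(0.0, {"x": [0.0, 1.0], "M": [1.0, 2.0]})
cube.add_snapshot(0.5, {"x": [0.1, 0.9], "M": [1.0, 2.0]})
track = cube.worldline(1, "x")
```

Here `track` holds the (time, x) pairs for particle 1, which is the kind of per-particle history the worldline view is meant to expose.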
Horizon
Discussion Points
Millennium run ~20TB consisting of 64 snapshots.
Technical Issues
Data Models
CG: Italian view on a data model for numerical simulation data.
Tracking 'worldlines' - so e.g. what happens to a particular particle. Compare with tracking data of a GRB - link to VOEvent?
UCDs
It was AGREED that theory UCDs should be submitted to the UCD1+ (
http://cdsweb.u-strasbg.fr/UCD/tools.htx) site. (ACTION: LDS)
It was further AGREED that a top level UCD tree of 'sim.' could be appropriate to allow for the description of simulation UCDs. (c.f. UCD suggestions of LDS - see
http://wiki.ivoa.net/internal/IVOA/IvoaTheory/UCDproposal.pdf).
ACTION LDS: Add UCDs that are urgent for sim data to IVOA Theory pages at
IvoaTheory. If no comments are received, that UCD will be considered okay. Review and agree in Victoria (May 2006).
Data Set Access
It was AGREED that the meeting participants would attempt to provide access to a number of datasets through standard existing VO access mechanisms
- JH: some Millennium simulation catalogues from Durham/MPE
- LDS: will put up some TPM sim catalogues using AG DSA access
- CG: sim data catalogue using VOTECH access (based on AG DSA):
- HW: GalICS as a possible test data case.
- PT: cluster data from Sverre's code - via NVO OpenSkyNode.
By Victoria we would like to have examples of 3-4 datasets available through 'std' VO interfaces - discoverable via the registry and queryable via VO interfaces.
Role of the Theory Interest Group in the IVOA
- it was AGREED that there seemed to be a need to generate theory-specific standards.
- Simple Numerical Access Protocol - a service that would then allow resources to appear in the registry tagged as 'theory' (c.f. images-SIAP, catalogues-Cone Search).
Towards a Simple Numerical Access Protocol
PT/LDS noted some initial efforts towards a SNAP at the 2004 NVO summer school.
Data file types
- FITS - not yet implemented for nbody files
- HDF5
- Gadget - note: limited metadata
We would need to separate out the metadata describing the files and put it in a catalogue, which would then index the actual data files.
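A minimal sketch of that split, assuming a catalogue of small metadata records that each point at one data file. The record fields, file paths and the `find_snapshots` helper are all hypothetical, chosen only to illustrate the idea; the meeting did not agree on a schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotRecord:
    """One catalogue row: metadata about a data file, not the data itself."""
    path: str            # where the actual (binary) file lives; hypothetical paths
    file_format: str     # e.g. "HDF5", "Gadget", "FITS"
    redshift: float
    n_particles: int
    box_size_mpc: float


def find_snapshots(catalogue, file_format=None, max_redshift=None):
    """Search the metadata only and return the paths of matching data files."""
    hits = []
    for rec in catalogue:
        if file_format is not None and rec.file_format != file_format:
            continue
        if max_redshift is not None and rec.redshift > max_redshift:
            continue
        hits.append(rec.path)
    return hits


# Hypothetical catalogue indexing three simulation outputs.
catalogue = [
    SnapshotRecord("runs/a/snap_000.hdf5", "HDF5", 3.0, 1000, 100.0),
    SnapshotRecord("runs/a/snap_063.hdf5", "HDF5", 0.0, 1000, 100.0),
    SnapshotRecord("runs/b/snapshot_010", "Gadget", 0.5, 500, 50.0),
]
low_z_hdf5 = find_snapshots(catalogue, file_format="HDF5", max_redshift=1.0)
```

The query touches only the catalogue, so the large binary files are opened only once a user has chosen which ones to retrieve.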
Use case examples
- CG gave an example of VisIVO selecting simulation data points - based on position
Q: Why isn't sim data in a database? A: It's faster to work on binary data.
common selection criteria:
- subset on the number of particles
- subset on time
- returns based on a range in another column - e.g. select on M > 10^13
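The three selection criteria listed above can be sketched as a single filter. This is illustrative only: the function name, the dictionary-per-row layout and the column names (`time`, `M`) are assumptions, and the range bounds are treated as inclusive.

```python
def select_particles(rows, time=None, column=None, minimum=None, maximum=None, limit=None):
    """Filter rows (dicts) by time, by a range on one column, and by a maximum count."""
    selected = []
    for row in rows:
        if limit is not None and len(selected) >= limit:
            break                      # subset on the number of particles
        if time is not None and row["time"] != time:
            continue                   # subset on time
        if column is not None:
            value = row[column]
            if minimum is not None and value < minimum:
                continue               # range selection, lower bound (inclusive)
            if maximum is not None and value > maximum:
                continue               # range selection, upper bound (inclusive)
        selected.append(row)
    return selected


# Hypothetical halo rows: masses M in solar masses at two output times.
rows = [
    {"id": 1, "time": 0.0, "M": 5e12},
    {"id": 2, "time": 0.0, "M": 2e13},
    {"id": 3, "time": 0.0, "M": 8e13},
    {"id": 4, "time": 1.0, "M": 3e13},
]
massive = select_particles(rows, time=0.0, column="M", minimum=1e13)
first_only = select_particles(rows, time=0.0, column="M", minimum=1e13, limit=1)
```

In this example `massive` contains the rows with ids 2 and 3, and `first_only` stops after the first match.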
analogy with SIAP - here a simple service returns a VOTable giving pointers to the data. In the SPAP case we'd
need to mandate the equivalent 'cutout service' type such that the user is able to access the data cutout. This reflects the few providers of very large simulations in Theory.
- submit query - I need a simulation on this type of problem - e.g. cluster
- query the simulation
- cut out the clusters
Developing SNAP
It was AGREED that SNAP should be developed.
outline draft for SNAP
- 1. Overview: In operation SNAP represents a negotiation between the client and the theory dataset service. The client describes the ideal theory data subset - what it would like to get back from the theory dataset service - and the theory dataset service returns a list, encoded as a VOTable, of the (often virtual) theory sub-datasets it can actually return. The client then examines this to determine if it is interested in any of the available theory sub-datasets, possibly iterates with the service to refine the query, then issues a series of getSnap requests to retrieve the selected theory sub-datasets. A key point is that it is entirely up to the service what datasets, if any, it offers to the client. These datasets may range from a simple list of static archive datasets which intersect the region of interest defined by the client, to a synthetic dataset matching the ideal dataset requested by the client. In the latter case the dataset is not computed until it is actually accessed or retrieved by the client. The bulk of this document represents a technical specification for the simple numerical access interface. For examples of how the interface is intended to be used, please refer to the Usage Examples section below.
- 2. Requirements for Compliance
- 2.1 Compliance: The keywords "MUST", "REQUIRED", "SHOULD", and "MAY" as used in this document are to be interpreted as described in RFC 2119. An implementation is compliant if it satisfies all the MUST or REQUIRED level requirements for the protocols it implements. An implementation that satisfies all the MUST or REQUIRED level and all the SHOULD level requirements for its protocols is said to be "unconditionally compliant"; one that satisfies all the MUST level requirements but not all the SHOULD level requirements for its protocols is said to be "conditionally compliant".
- 3. Image Service Types: It is assumed that compliant image services fall into one of n categories. These categories are used primarily to characterize the service as part of service registration, to aid clients in discovering the services best suited to their needs. How an image service implements this specification depends on which category it falls in. The categories are defined as follows:
- 3.1 SNAP Cutout Service: This is a service which extracts or "cuts out" rectangular regions of some larger theory dataset, returning a subset of the requested size to the client. Such subsets are usually drawn from a database or a collection of large numerical simulation outputs. To be considered a cutout service, the returned subset should closely approximate (or at least not exceed) the size of the requested region; however, a cutout service may resample the pixel data.
- 3.2 SNAP stacking service: query the same volume but at different time steps. [DEFINE]
- 3.3 others??
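To make the cutout and stacking service types concrete, here is a small Python sketch under stated assumptions: particles are tuples `(x, y, z, mass)`, the requested region is an axis-aligned box with a half-open upper edge, and "stacking" is read as applying the same cutout at each time step. That reading of 3.2 is a guess, since the draft leaves it to be defined, and the function names are hypothetical.

```python
def cutout(particles, lower, upper):
    """Return the particles whose (x, y, z) lies inside the box [lower, upper)."""
    return [p for p in particles
            if all(lo <= c < hi for c, lo, hi in zip(p[:3], lower, upper))]


def stack(snapshots, lower, upper):
    """Apply the same cutout to every time step; snapshots maps time -> particle list."""
    return {t: cutout(parts, lower, upper) for t, parts in sorted(snapshots.items())}


# Hypothetical snapshots: (x, y, z, mass) tuples with positions in Mpc.
snapshots = {
    0.0: [(1.0, 1.0, 1.0, 5.0), (9.0, 9.0, 9.0, 2.0)],
    1.0: [(1.5, 1.0, 1.0, 5.0), (2.5, 1.0, 1.0, 2.0)],
}
region = stack(snapshots, lower=(0.0, 0.0, 0.0), upper=(2.0, 2.0, 2.0))
```

Because the cutout never returns particles outside the requested box, the result cannot exceed the requested region, which matches the size condition stated for a cutout service.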
Datasets can be returned as
???. HDF5?, VOTable (binary)?
Ultimately, VO data models will provide a means to describe more complex data objects within the VO than can be directly addressed by the SNAP prototype. VO data models are needed for many data objects or attributes such as virial mass of a model ... and so forth.
- 4. Image Query
- 4.1 Input: what would the minimum parameters be? X,Y,Z and time - and need to specify the unit - thus volume in Mpc^3 for instance
- 4.2 Successful Output
- 4.3 Error Responses and Other Unsuccessful Results
- 5. Image Staging
- 5.1 Staging and Messaging Protocols (Preliminary)
- 6. Image Retrieval
- 6.1 Input
- 6.2 Successful Output
- 6.3 Error Response
- 7. Service Metadata
- 7.1 Metadata Query
- 7.2 Registering a Compliant Service
- 8. Usage Examples
- 8.1 Client Examples
- 8.2 Service Examples
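The negotiation described in the Overview (query the service, inspect the list of offered sub-datasets, then retrieve the chosen ones) can be sketched in Python as follows. Everything here is hypothetical: the draft defines no interface yet, so the class `ToySnapService`, its `query` and `get_snap` methods, the X, Y, Z plus time inputs with a length unit, and the plain dictionaries standing in for VOTable rows are assumptions made for illustration.

```python
class ToySnapService:
    """In-memory stand-in for a theory dataset service (all names hypothetical)."""

    def __init__(self, datasets, unit="Mpc"):
        # datasets: id -> {"lower": (x, y, z), "upper": (x, y, z), "time": t, "particles": [...]}
        self._datasets = datasets
        self._unit = unit

    def query(self, centre, size, time, unit="Mpc"):
        """Step 1: list the sub-datasets the service is prepared to return."""
        if unit != self._unit:
            raise ValueError(f"service only understands lengths in {self._unit}")
        lower = [c - size / 2 for c in centre]
        upper = [c + size / 2 for c in centre]
        rows = []
        for ds_id, ds in sorted(self._datasets.items()):
            overlaps = all(dl < u and l < du
                           for dl, du, l, u in zip(ds["lower"], ds["upper"], lower, upper))
            if overlaps and ds["time"] == time:
                rows.append({"id": ds_id, "time": ds["time"],
                             "n_particles": len(ds["particles"])})
        return rows

    def get_snap(self, dataset_id):
        """Step 2: retrieve one sub-dataset chosen from the query response."""
        return self._datasets[dataset_id]["particles"]


def fetch_largest(service, centre, size, time):
    """Client side: query, pick the offer with the most particles, retrieve it."""
    offers = service.query(centre, size, time)
    if not offers:
        return None
    best = max(offers, key=lambda row: row["n_particles"])
    return service.get_snap(best["id"])


service = ToySnapService({
    "a": {"lower": (0, 0, 0), "upper": (10, 10, 10), "time": 0.0, "particles": [1, 2, 3]},
    "b": {"lower": (5, 5, 5), "upper": (20, 20, 20), "time": 0.0, "particles": [4]},
    "c": {"lower": (50, 50, 50), "upper": (60, 60, 60), "time": 0.0, "particles": [5, 6]},
})
offers = service.query(centre=(8, 8, 8), size=4, time=0.0)
data = fetch_largest(service, centre=(8, 8, 8), size=4, time=0.0)
```

The service alone decides what appears in `offers`; the client only chooses among the rows it is given, which mirrors the key point of the Overview.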
DRAFT TIGER TEAM: Gerard, CG, RS, PT, LDS, NAW, HW, JH, UB (NAW to circulate initial draft).
SNAP implementations
CG, PT and (potentially) Gerard Lemson will endeavour to make early implementations of a SNAP service.
ADQL for theory
Referring to the current draft version 1.01 at
http://www.ivoa.net/Documents/WD/ADQL/ADQL-20050624.pdf - the meeting decided that this was probably appropriate for theory use. There may be meta functions that could be useful for theory (c.f. 'region') but it was not clear exactly what these would be at this point.
ACTION: discuss in Victoria
Registry
The registry structure is probably okay, but there is a need to better characterise theory data and applications.
Coordinate systems
Referring to the
STC draft
http://www.ivoa.net/Documents/PR/STC/STC-20050315.html - a key theory issue could be that of 'units'.
Virtual telescopes
longer term work - see e.g. Bouwens' Universe Construction Set (BUCS) -
http://www.ucolick.org/~bouwens/bucs/index.html
Planning for Victoria
- NW: email the theory and ivoa list with the outcome of this meeting.
- Victoria Interop, 15-19 May 2006 - see InterOpMay2006:
- Theory-IG meeting should happen - suggest the Mon 15 May 2006.
- Topics:
- SNAP protocol
- reports on experience with accessing theory catalogues
- ADQL extension
- Theory UCDs
- Technical note - this would describe T-IG activities, and note that there is a need for a Theory working group charged with producing e.g. SNAP (aim to submit a request for a Theory-WG to the July 06 IVOA exec meeting - thus the first Theory-WG meeting would be Sep 2006).
- Timescale: draft SNAP by May Victoria meeting.
Logistical Information
Meeting Equipment
Wireless networking will be available in the meeting rooms. Digital projectors and OHPs will be provided.
Taxis
There will be a sign up sheet where you can give details of times required for taxis to the airport, railway station, etc, on the front reception desk in the Hoyle building. Our receptionist will then make the necessary booking for you.
Location
The meeting will be held in the Hoyle building - primarily the committee meeting room - a
map of the IoA gives the local location.
Travel and Weather
Information as to how to get to the
IoA can be found at
http://www.ast.cam.ac.uk/contact/map/
The latest weather forecast for
Cambridge is available from the BBC Weather Office.
Accommodation
If you require accommodation for this meeting, please check the Cambridge Tourist office - they maintain a comprehensive list of hotels at
http://www.visitcambridge.org/visitors/wheretostay.php. The Cambridge Lodge Hotel on the Huntingdon Road is the closest to the
IoA - a 10 minute walk.
Accommodation queries
If you have problems with booking accommodation for this meeting, please let Judith Moss know (
mailto:jm@ast.cam.ac.uk); she may be able to help.
Meeting Dinner
19.30 at Cafe Naz - see
http://www.cafenaz.co.uk/welcome.htm - Address: 45 / 47 Castle Street, Cambridge, Cambridgeshire CB3 0AH
Tel: 01223 363 666, Tel: 01223 302 687
--
NicholasWalton - 28 Feb 2006