Theory IG Meeting, 27-28 February 2006, IoA, University of Cambridge



Workshop Participants


Background

Over the past year, significant progress has been made by the International Virtual Observatory Alliance (IVOA) in developing standards and protocols for astronomical data storage, discovery and retrieval. Applications and web services implementing these advances are now being developed to allow astronomers to query, crossmatch and analyse data from many of the survey catalogues that are publicly available.

At the IVOA interoperability meeting in Kyoto (see InterOpMay2005), the Theory Interest Group - charged with ensuring that the VO standards also meet the requirements of theoretical (or simulated) data - outlined a set of near-term targets with the aim of determining where the existing IVOA standards and implementations must be updated to allow the discovery, exchange and analysis of simulated data using VO services.

Computational Cosmology is one area of astronomy that should be a major beneficiary of an international virtual observatory. N-body simulations are now pushing available technological resources to their limits in the drive to understand how large scale structure formed in the universe. Many independent groups are working towards solving the problems of hierarchical structure formation, performing large-scale simulations as an integral part of their investigations. However, the results of many of these simulations are not publicly available. Furthermore, of those that are, the data is stored in a wide variety of (mostly undocumented) formats and systems, often chosen as having been convenient at the time. Therefore the interchange and direct comparison of results by independent groups is uncommon, as much effort is often required to obtain, understand and translate the data before it is of any use.

A primary goal of the IVOA's Theory Interest Group is thus to decide upon a standard file format for raw simulation data and a metadata language (based on the Universal Content Descriptors used for observational data) with which to describe the contents and which can later be expanded to encompass other forms of astrophysical simulations. Based on this, a set of requirements of the Data Access Layer (DAL) and Virtual Observatory Query Language (VOQL) working groups must be identified so that data from cosmological simulations can seamlessly be discovered and retrieved through VO portals. Additionally, grid-enabled analysis packages - such as halo identification algorithms or routines to perform 'virtual observations of simulations' - need to be developed and published as online tools through VO portals so that simulated data can be analysed and compared to observed data within the VO environment.

We thus propose a workshop to bring together current members of the Theory Interest Group with members of the various groups involved in cosmological simulations, to initiate discussions and propose solutions to the issues outlined above. We have therefore arranged a two-day meeting, on 27-28 Feb 2006, at the IoA, University of Cambridge. The main purpose of the meeting is to identify the main requirements that simulated data and services impose on the existing architecture in order to ensure their successful incorporation into the VO. Hence the format of the meeting will be discussion orientated.

To register, please email Nicholas Walton (naw@ast.cam.ac.uk), or add your name to the list of Attendees below.

We hope to see you in Cambridge in February!

Nic Walton and Laurie Shaw

Workshop Aims

  • Proposing and determining a standard data format and file structure for particle data files extracted from Nbody/SPH simulations
  • Development of a meta-data model for the discovery and querying of published data
  • Compiling a technical 'note' of the main requirements of simulated data for the other IVOA working groups, especially with regard to data discovery, querying, and retrieval, and the registration of selected data products and services. To be presented at the IVOA Interop in Victoria in May 2006.
  • Identifying the type of services to be made available for the analysis/visualisation of simulated data (e.g. virtual telescopes, online semi-analytical galaxy formation)

Background Links and Work

Registration

Please email Nicholas Walton (naw@ast.cam.ac.uk) (Subject: Theory IG - Registration) if you plan to attend this meeting, or simply add your name to the list of Attendees below.


Agenda and Presentations

Meeting to open 10.00 am, 27 Feb 2006 and close 16.00, 28 Feb 2006

Agenda and Presentations: NB - Preliminary Draft (subject to change, open to suggestions)

Mon 27 Feb 2006
Time Title Speaker/Chair Comments Document
10.00 Welcome Address Nicholas Walton    
10.10 Theory in the Virtual Observatory Laurie Shaw Current status of theory in the VO. Goals for this workshop and the Interop in Victoria [.pdf]
10.40 Talks and Demos   Overview of Ongoing Theory Projects
  UK Laurie Shaw    
11.30 Break
11.45 Talks and Demos   Overview of Ongoing Theory Projects (contd)
  Italy Claudio Gheller   [.pdf]
  Italy Riccardo Smareglia   [.pdf]
  USA Peter Teuben   [.pdf]
13.00 Lunch
11.45 Talks and Demos   Overview of Ongoing Theory Projects (contd)
  France Herve Wozniak   [.pdf] [.ppt]
14.30 Discussion Session 1   Theory VO Data Products: Data Models, Data formats and metadata
15.30 Break
15.45 Discussion Session 2   Theory VO Data Products (contd)
17.00 Close - day 1

Tue 28 Feb 2006
Time Title Speaker/Chair Comments
09.30 Discussion Session 1 Peter Teuben Data Access: Simple Particle Access, developing a theory analogue to simple cone search, requirements of ADQL for theory queries
11.00 Break
11.30 Discussion Session 2   Theory Services: Virtual Telescopes and Analysis tools
12.30 Lunch  
13.30 Discussion Session 3 Nicholas Walton Requirements of other IVOA working groups: road plan for Victoria
15.30 Close


Attendees

If you plan to attend, please add your name to this list (or, if not a twiki devotee, send an email to Nic Walton at naw@ast.cam.ac.uk).

Name Institute Comments
Nicholas Walton IoA, Cambridge, UK  
Laurie Shaw IoA, Cambridge, UK  
Kona Andrews IoA, Cambridge, UK  
John Helly Durham, UK  
Sverre Aarseth IoA, Cambridge, UK mon a.m.
Gerard Lemson MPE, Munich, D apologies
Peter Teuben UMaryland, USA  
Dave De Young NOAO, USA apologies
Herve Wozniak Lyon, F  
Claudio Gheller CINECA - Bologna, I  
Riccardo Smareglia Trieste, I  
Fabio Pasian Trieste, I  
Valeria Manna Trieste, I  
Ugo Becciani Catania, I apologies

Notes

Welcome

NW introduced the attendees and noted the meeting goals

Theory-Group to date

LDS gave a brief overview of TIG activities to date [slides]

Meeting aims:

  • create a group interested in pushing theory data
  • specify data products worth publishing
    • e.g. halo cats, semi-analytical galaxy catalogues
  • requirements for registering data products
  • description of data products
  • querying of data products
  • formats, FITS, HDF5
  • special theory protocols - simple theory access protocol?
  • additional services
    • analysis tools
    • incorporation of semi-analytic models
    • virtual telescopes

Aim to create an IVOA technical note - including a few science cases which define the processes and interrelationships of services required to facilitate those cases.

Discussion Points

  • authorisation
  • registry
  • what is the theory equivalent of RA/Dec
  • searching large theory models vs creating new simulations via running the application
  • integration of applications to analyse theory data

Theory Projects

UK:

  • LDS - see slides
  • JH - VirtU initiative - links Virgo collaborators.

GAVO

see http://www.g-vo.org/portal/

I-VObs Theory

CG: [slides]

  • ENZO
  • cosmological simulations metadata
  • MHD simulations

NVO

PT: [slides]

Horizon

Discussion Points

Millennium run ~20TB consisting of 64 snapshots.

Technical Issues

Data Models

CG: Italian view on a data model for numerical simulation data.

Tracking 'worldlines' - so e.g. what happens to a particular particle. Compare with tracking data of a GRB - link to VOEvent?

UCDs

It was AGREED that theory UCDs should be submitted to the UCD1+ (http://cdsweb.u-strasbg.fr/UCD/tools.htx) site. (ACTION: LDS)

It was further AGREED that a top level UCD tree of 'sim.' could be appropriate to allow for the description of simulation UCDs. (c.f. UCD suggestions of LDS - see http://wiki.ivoa.net/internal/IVOA/IvoaTheory/UCDproposal.pdf).
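
As a rough illustration of how a top-level 'sim.' branch could be used in practice, the sketch below checks whether a column's UCD falls under that branch. It relies only on the UCD1+ convention that words are separated by ';' and atoms within a word by '.'. The specific 'sim.' words and column names shown are hypothetical placeholders, not agreed UCDs.

```python
def ucd_words(ucd):
    """Split a UCD1+ string into its semicolon-separated words."""
    return [word.strip() for word in ucd.split(";") if word.strip()]


def is_simulation_ucd(ucd, root="sim"):
    """Return True if the primary (first) word sits under the given top-level branch."""
    words = ucd_words(ucd)
    if not words:
        return False
    return words[0].split(".")[0].lower() == root


# Hypothetical column descriptions: the 'sim.*' words are placeholders only.
columns = {
    "halo_mass": "sim.halo.mass",
    "snapshot_z": "sim.snapshot.redshift",
    "ra": "pos.eq.ra;meta.main",
}

# Names of the columns whose primary UCD word is in the 'sim.' branch.
sim_columns = sorted(name for name, ucd in columns.items() if is_simulation_ucd(ucd))
```

A registry or client could use a check of this kind to pick out simulation-specific columns without knowing each individual UCD in advance.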

ACTION LDS: Add UCDs that are urgent for sim data to the IVOA Theory pages at IvoaTheory. If no comments are received, the UCD will be taken as okay. Review and agree in Victoria (May 2006).

Data Set Access

It was AGREED that the meeting participants would attempt to provide access to a number of datasets through standard existing VO access mechanisms

  • JH: some Millennium simulation cats from Durham/MPE
  • LDS: will put up some TPM sim catalogues using AG DSA access
  • CG: sim data catalogue using VOTECH access (based on AG DSA)
  • HW: GalICS as a possible test data case.
  • PT: cluster data from Sverre's code - via NVO OpenSkyNode.

By Victoria we would like to have examples of 3-4 datasets available through 'std' VO interfaces - discoverable via the registry and queryable via VO interfaces.

Role of the Theory Interest Group in the IVOA

  • it was AGREED that there seemed to be a need to generate theory specific standards.
    • Simple Numerical Access Protocol - a service that would then allow resources to appear in the registry as tagged 'theory' (c.f. images-SIAP, catalogues-Cone Search).

Towards a Simple Numerical Access Protocol.

PT/LDS noted some initial efforts towards a SNAP at the 2004 NVO summer school.

Data file types

  • FITS - not yet implemented for nbody files
  • HDF5
  • Gadget - note: limited metadata

Would need to separate the metadata describing the files and put those in a catalogue. These catalogue entries would index the actual data files.
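
A minimal sketch of this separation, assuming a flat catalogue of per-snapshot records that point at the binary files. The record fields, simulation name and file paths are illustrative assumptions, not a proposed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotRecord:
    """Metadata for one data file, held apart from the file itself."""
    simulation: str
    snapshot: int
    redshift: float
    n_particles: int
    file_format: str
    path: str  # pointer to the actual (binary) data file


# Hypothetical catalogue entries; names and paths are made up for illustration.
catalogue = [
    SnapshotRecord("demo_run", 0, 2.0, 1_000_000, "HDF5", "demo_run/snap_000.h5"),
    SnapshotRecord("demo_run", 1, 1.0, 1_000_000, "HDF5", "demo_run/snap_001.h5"),
    SnapshotRecord("demo_run", 2, 0.0, 1_000_000, "Gadget", "demo_run/snap_002"),
]


def find_files(records, z_min, z_max, file_format=None):
    """Query the metadata only, returning paths to the matching data files."""
    return [
        r.path
        for r in records
        if z_min <= r.redshift <= z_max
        and (file_format is None or r.file_format == file_format)
    ]
```

The point of the design is that discovery queries touch only the small catalogue, while the large binary files are opened only once a match has been found.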

Use case examples

  • CG gave an example of VisIVO selecting simulation data points - based on position

Q: Why isn't sim data in a database? A: It's faster to work on binary data.

common selection criteria:

  • subset on the number of particles
  • subset on time

  • subset by a range on another column - e.g. select on M > 10^13.
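
The selection criteria above can be sketched as three small filters over particle (or halo) rows. The in-memory list of dictionaries, the column names and the values are assumptions made for illustration. A real service would apply the same cuts to binary files or catalogue tables.

```python
# Hypothetical rows; 'time' and 'mass' are in arbitrary, unspecified units.
particles = [
    {"id": 1, "time": 0.5, "mass": 5e12},
    {"id": 2, "time": 0.5, "mass": 2e13},
    {"id": 3, "time": 1.0, "mass": 8e13},
    {"id": 4, "time": 1.0, "mass": 1e12},
]


def limit_count(rows, max_rows):
    """Subset on the number of particles: keep at most max_rows rows."""
    return rows[:max_rows]


def at_time(rows, time, tol=1e-9):
    """Subset on time: keep rows belonging to one output time."""
    return [r for r in rows if abs(r["time"] - time) <= tol]


def in_range(rows, column, low=None, high=None):
    """Subset by an open range (low, high) on any other column."""
    return [
        r
        for r in rows
        if (low is None or r[column] > low) and (high is None or r[column] < high)
    ]


massive = in_range(particles, "mass", low=1e13)  # the M > 10^13 cut
late_massive = at_time(massive, 1.0)             # combined with a time cut
first_two = limit_count(particles, 2)            # cap on returned rows
```

Because each filter takes and returns a plain list of rows, the cuts can be chained in any order.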

Analogy with SIAP - here a simple service returns a VOTable giving pointers to the data. In the SPAP case we'd need to mandate the equivalent 'cutout service' type so that the user is able to access the data cutout. This reflects the few providers of very large simulations in Theory.

  • submit query - i need a simulation on this type of problem - e.g. cluster
  • query the simulation
  • cut out the clusters

Developing SNAP

It was AGREED that SNAP should be developed.

outline draft for SNAP

  • 1. Overview: In operation SNAP represents a negotiation between the client and the theory dataset service. The client describes the ideal theory data subset - what it would like to get back from the theory dataset service - and the theory dataset service returns a list, encoded as a VOTable, of the (often virtual) theory sub-datasets it can actually return. The client then examines this to determine if it is interested in any of the available theory sub-datasets, possibly iterates with the service to refine the query, then issues a series of getSnap requests to retrieve the selected theory sub-datasets. A key point is that it is entirely up to the service what datasets, if any, it offers to the client. These datasets may range from a simple list of static archive datasets which intersect the region of interest defined by the client, to a synthetic dataset matching the ideal dataset requested by the client. In the latter case the dataset is not computed until it is actually accessed or retrieved by the client. The bulk of this document represents a technical specification for the simple numerical access interface. For examples of how the interface is intended to be used, please refer to the Usage Examples section below.
  • 2. Requirements for Compliance
    • 2.1 Compliance: The keywords "MUST", "REQUIRED", "SHOULD", and "MAY" as used in this document are to be interpreted as described in RFC 2119. An implementation is compliant if it satisfies all the MUST or REQUIRED level requirements for the protocols it implements. An implementation that satisfies all the MUST or REQUIRED level and all the SHOULD level requirements for its protocols is said to be "unconditionally compliant"; one that satisfies all the MUST level requirements but not all the SHOULD level requirements for its protocols is said to be "conditionally compliant".
  • 3. Image Service Types: It is assumed that compliant image services fall into one of n categories. These categories are used primarily to characterize the service as part of service registration, to aid clients in discovering the services best suited to their needs. How an image service implements this specification depends on which category it falls in. The categories are defined as follows:
    • 3.1 SNAP Cutout Service: This is a service which extracts or "cuts out" rectangular regions of some larger theory dataset, returning a subset of the requested size to the client. Such subsets are usually drawn from a database or a collection of large numerical simulation outputs. To be considered a cutout service, the returned subset should closely approximate (or at least not exceed) the size of the requested region; however, a cutout service may resample the pixel data.
    • 3.2 SNAP stacking service: query the same volume but at different time steps. [DEFINE]
    • 3.3 others??
Datasets can be returned as ???. HDF5?, VOTable (binary)?

Ultimately, VO data models will provide a means to describe more complex data objects within the VO than can be directly addressed by the SNAP prototype. VO data models are needed for many data objects or attributes such as virial mass of a model ... and so forth.
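
The cutout and stacking service types in items 3.1 and 3.2 of the outline could be sketched as follows. This is only a toy under stated assumptions: particles are held in memory as dictionaries with a 'pos' tuple, the region is an axis-aligned cube given by a centre and side length, and periodic boundaries are ignored. None of these choices comes from the draft itself.

```python
def in_box(pos, centre, size):
    """True if a position lies inside a cube of side 'size' centred on 'centre'."""
    half = size / 2.0
    return all(abs(p - c) <= half for p, c in zip(pos, centre))


def cutout(particles, centre, size):
    """Cutout: return only the particles inside the requested cube."""
    return [p for p in particles if in_box(p["pos"], centre, size)]


def stack(snapshots, centre, size):
    """Stacking: apply the same cutout volume to every time step."""
    return {t: cutout(parts, centre, size) for t, parts in sorted(snapshots.items())}


# Hypothetical snapshots keyed by output time; positions in arbitrary units.
snapshots = {
    1.0: [{"id": 1, "pos": (1.5, 1.0, 1.0)}, {"id": 2, "pos": (2.0, 2.0, 2.0)}],
    0.0: [{"id": 1, "pos": (1.0, 1.0, 1.0)}, {"id": 2, "pos": (9.0, 9.0, 9.0)}],
}

stacked = stack(snapshots, centre=(1.0, 1.0, 1.0), size=3.0)
```

Here particle 2 is outside the cube at the first time step and inside it at the second, which is the kind of behaviour a stacking query would expose.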

  • 4. Image Query
    • 4.1 Input: what would the minimum parameters be? X,Y,Z and time - and need to specify the unit - thus volume in Mpc^3 for instance
    • 4.2 Successful Output
    • 4.3 Error Responses and Other Unsuccessful Results
  • 5. Image Staging
    • 5.1 Staging and Messaging Protocols (Preliminary)
  • 6. Image Retrieval
    • 6.1 Input
    • 6.2 Successful Output
    • 6.3 Error Response
  • 7. Service Metadata
    • 7.1 Metadata Query
    • 7.2 Registering a Compliant Service
  • 8. Usage Examples
    • 8.1 Client Examples
    • 8.2 Service Examples
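
The negotiation described in item 1 of the outline (query, inspect the returned list, then issue getSnap requests) can be sketched with a toy in-memory service. Everything here is an assumption for illustration: the class and method names, the use of a list of dictionaries in place of a VOTable, the X,Y,Z plus time query parameters with a declared length unit (as suggested in item 4.1), and the sample datasets.

```python
def overlaps(box, centre, size):
    """True if a dataset's bounding box intersects the requested cube."""
    lo, hi = box
    half = size / 2.0
    return all(l <= c + half and h >= c - half for l, h, c in zip(lo, hi, centre))


class ToyTheoryService:
    """Stand-in for a theory dataset service; not a real SNAP implementation."""

    def __init__(self, datasets):
        self._datasets = datasets

    def query(self, centre, size, time):
        """Return rows (standing in for a VOTable) describing what can be offered."""
        return [
            {"id": ds_id, "time": ds["time"], "unit": ds["unit"],
             "n_particles": len(ds["particles"])}
            for ds_id, ds in self._datasets.items()
            if ds["time"] == time and overlaps(ds["box"], centre, size)
        ]

    def get_snap(self, ds_id):
        """Retrieve one of the sub-datasets listed by query()."""
        return self._datasets[ds_id]["particles"]


# Hypothetical datasets: bounding box as (min corner, max corner), unit assumed Mpc.
service = ToyTheoryService({
    "box_a": {"time": 0.0, "unit": "Mpc", "box": ((0, 0, 0), (10, 10, 10)),
              "particles": [(1, 1, 1), (2, 2, 2)]},
    "box_b": {"time": 0.0, "unit": "Mpc", "box": ((50, 50, 50), (60, 60, 60)),
              "particles": [(55, 55, 55)]},
    "box_c": {"time": 1.0, "unit": "Mpc", "box": ((0, 0, 0), (10, 10, 10)),
              "particles": [(3, 3, 3)]},
})

# Client side: describe the ideal subset, inspect the offer, then retrieve.
rows = service.query(centre=(5, 5, 5), size=4, time=0.0)
chosen = [row for row in rows if row["n_particles"] <= 1000]
retrieved = {row["id"]: service.get_snap(row["id"]) for row in chosen}
```

As in the overview, the service alone decides which sub-datasets to list. The client only filters the offer and asks for the entries it wants.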

DRAFT TIGER TEAM: Gerard, CG, RS, PT, LDS, NAW, HW, JH, UB (NAW to circulate initial draft).

SNAP implementations

CG, PT and (potentially) Gerard Lemson will endeavour to make early implementations of a SNAP service.

ADQL for theory

Referring to the current draft version 1.01 at http://www.ivoa.net/Documents/WD/ADQL/ADQL-20050624.pdf - the meeting decided that this was probably appropriate for theory use. There may be meta functions that could be useful for theory (c.f. 'region') but it was not clear exactly what these would be at this point.

ACTION: discuss in Victoria

Registry

The registry structure is probably okay, but there is a need to better characterise theory data and applications.

Coordinate systems

Referring to the STC draft http://www.ivoa.net/Documents/PR/STC/STC-20050315.html - a key theory issue could be the issue of 'units'.

Virtual telescopes

longer term work - see e.g. Bouwens' Universe Construction Set (BUCS) - http://www.ucolick.org/~bouwens/bucs/index.html

Planning for Victoria

  • NW: email the theory and ivoa list with the outcome of this meeting.
  • Victoria Interop, 15-19 May 2006 - see InterOpMay2006:
    • Theory-IG meeting should happen - suggest the Mon 15 May 2006.
  • Topics:
    • SNAP protocol
    • reports on experience with accessing theory catalogues
    • ADQL extension
    • Theory UCDs
    • Technical note - this would describe T-IG activities, and note that there is a need for a Theory working group charged with producing e.g. SNAP. (Aim to submit request for Theory-WG to the July 06 IVOA exec meeting - thus Theory-WG first meeting would be Sep 2006.)
  • Timescale: draft SNAP by May Victoria meeting.


Logistical Information

Meeting Equipment

Wireless networking will be available in the meeting rooms. Digital projectors and OHPs will be provided.

Taxis

There will be a sign up sheet where you can give details of times required for taxis to the airport, railway station, etc, on the front reception desk in the Hoyle building. Our receptionist will then make the necessary booking for you.

Location

The meeting will be held in the Hoyle building - primarily the committee meeting room - a map of the IoA gives the local location.

Travel and Weather

Information as to how to get to the IoA can be found at http://www.ast.cam.ac.uk/contact/map/

The latest weather forecast for Cambridge from the BBC Weather Office

Accommodation

If you require accommodation for this meeting, please check the Cambridge Tourist office - they maintain a comprehensive list of hotels at http://www.visitcambridge.org/visitors/wheretostay.php. The Cambridge Lodge Hotel on the Huntingdon Road is the closest to the IoA - a 10 minute walk.

Accommodation queries

If you have problems with booking accommodation for this meeting, please let Judith Moss know (mailto:jm@ast.cam.ac.uk); she may be able to help.

Meeting Dinner

19.30 at Cafe Naz - see http://www.cafenaz.co.uk/welcome.htm - Address: 45 / 47 Castle Street, Cambridge, Cambridgeshire CB3 0AH Tel: 01223 363 666, Tel: 01223 302 687


-- NicholasWalton - 28 Feb 2006


Topic revision: r6 - 2007-01-19 - BrunoRino
 