Rwp02 Toward Registry Requirements
===================================

May 09, 2003.

Intro
------

The Registry kick-off meeting, London 19-20, 2003 set up a number of working
groups for development of IVOA registries.

Rwp02 - Requirements, Science Cases, Use Cases, Test Cases has the charter to
focus on science case scenarios and use cases related specifically to VO
registries; in particular, it is to develop use cases that will help define
requirements for VO registries.
( webpage: http://www.ivoa.net/twiki/bin/view/IVOA/IVOARegWp02 )

This document represents the current status of Rwp02, including:

+ Work Plan
+ Scope
+ Working Conceptual Definition of a Registry for Science
+ Broad Operational Requirements
+ Key Science Case Candidates
+ Requirements Imposed by Science Cases
+ Use Cases extracted from Science Cases

The intention is that this draft provides a starting point for discussion at
the Registry Working Group session at the Cambridge Interoperability Meeting,
May 12-16, 2003.
( Registry WG session is Wed, May 14, 0900-1300, 1400-1700 )

Work Plan
---------

- Define Scope         draft v0.1    11 April  - posted to IVOARegWp02
- Key Science Cases    draft v0.1    11 April  - posted to IVOARegWp02
  ( includes a brief review of VO Science Cases in relation to registries )
- Use Cases            draft v0.1    25 April  - not done
- Requirements         draft v0.1    09 May    - this draft document
  (Cambridge Meeting May 12-16)
- Requirements         drafts v0.2   28 June
  Science Cases
  Use Cases
  Test cases

Scope of Work Package
----------------------

1. Identify Key Science Cases to be used as drivers for defining registry
   requirements. These should be illustrative of the range of requests that
   may be sent to a registry.

2. Describe a set of Use Cases which are representative of envisaged registry
   usage, and include the use cases required to execute the Key Science Cases.

3. Define a set of actual queries against which registry implementations may
   be tested.

4. Requirements.
   Define a set of requirements that are necessary to execute the Use Cases
   and that will be useful in the actual development of registries.

Working Definition(s) of Registries
-----------------------------------

A registry is a queryable service/resource that responds with a structured
description of other services/resources. Registered services/resources are
described by various types of metadata as outlined in RSM v6.

Registries are a way of narrowing down the search for resources to a
manageable subset. A registry is not intended to replicate all data into a
central repository, but rather to hold a condensed set of metadata.

A registry is a dynamic database of metadata describing a set of
Internet-available resources. A registry is used to identify and locate
resources satisfying user-specified criteria, and to direct more detailed
information requests to the relevant services.

Broad Operational Requirements
------------------------------

(Adopted from NVO draft registries requirements - R. Plante)

A. Registration Contents:

   1. Descriptions of Resources (Resource metadata as described in RSM v6)
   2. Descriptions of Services:
      + how to invoke the service, including inputs and outputs
      + other characteristics of the service; in particular, metadata about
        the kind of data returned by the service.
      + compliance details when the service is meant to be an implementation
        of a standard service.
   3. The metadata structures supported by the registry should be consistent
      with those used by IVOA at large.
   4. The registry needs to support a hierarchical notion of resources in
      order to describe sites that manage multiple collections, missions,
      services, etc.

B. Registry Queries

   1. Resources and Services can be searched for based on characteristics.
   2. Query results are returned in machine-interpretable form.
   3. Can search for:
      + resources
      + services of particular types
   4. It should be possible to uniquely retrieve a description of a resource
      or service either via a unique ID or via a small and predictable set of
      metadata.

C. Registration Process and Registry Evolution

   1. It should be easy for a data/service provider to register a new
      service. It should not require one to resubmit information about the
      resource that was registered before for another service.
   2. It should be easy or automatic to update the metadata associated with a
      resource or service.
   3. It should be easy to unregister a resource or service.
   4. The registry should expect that registered services may become
      temporarily unavailable.
   5. The registry should account for the possibility that a service will
      become permanently unavailable without it being explicitly unregistered.
   6. The registry system shall support classes of services sharing common
      characteristics, where these shared characteristics can be specified
      once and then referred to from the implementing services. Updates to
      shared characteristics can also be made using a single update request
      to the registry.

This is based on the idea that registries are used to support coarse searching
for resources that might have what the user wants; a definitive search would
be accomplished by querying the candidate resources directly.

Key Science Cases
-----------------

Key science cases are to be used as drivers for defining registry
requirements. These are intended to be illustrative of the range of requests
that may be sent to a registry.

Following a review of the current science cases across the various VO
initiatives
(http://www.ivoa.net/internal/IVOA/IVOARegWp02/KeySciCaseReview.txt),
a 1st draft set of science cases has been selected.

Selected Science Cases:

1. AstroGrid: Brown Dwarf Selection
   NVO: Select Dwarf Galaxies by Colour for Observational Follow-up
   Chosen as representative of parameter-constrained catalog search scenarios.

2. AstroGrid: Deep Field Surveys
   Chosen as representative of a data search scenario, with use of coverage
   (spatial and temporal) constraints.

3. NVO: Gamma Ray Burst +
   Chosen because NVO is planning registry-specific developments for this
   science case to turn their GRB demo into a more general "show me the sky"
   tool.

4. NVO: Find Super Novae Pre-Burst Observation
   Chosen as representative of 'find all data at this point' type scenarios.
   (Perhaps this is the same as the GRB general tool)

5. AstroVirtel: Luminosity functions of Star Clusters in Nearby Galaxies
   Chosen because of detailed requirements for registry functions.
   See: http://www.ivoa.net/internal/IVOA/IVOARegWp02/astrovirtel_use_case.pdf

Requirements Imposed by Key Science Cases
-----------------------------------------

At present the requirements imposed by the science cases are only in the form
of a list of registry requests drawn from the science cases. This is meant as
a representative list, not in any way complete.

It is envisaged that this should lead to a hierarchical list of science
metadata that needs to be in a registry. Most importantly, it should identify
the minimal set.

Working list of example registry requests::

    Catalogs relevant to Galaxy Clusters
    Catalogs with coverage of I,K,R at given locations
    Catalogs of dwarf galaxies with color or magnitude measurements
    Identify resources relevant to Galactic Clusters
    Identify Deep Field Survey Resources
    Identify which parameters can be queried for a given resource
    Data Provider Services requests on
      - coverage - space, time, energy
      - resolution, field of view, pixel scale, limiting
      - calibration, data quality
      - requests for data of unique provenance (not multiple datasets
        or catalogs derived from the same observations)

    ...need help here to expand list and make into requirements...

Interpretive capabilities.
--------------------------

A number of the science case scenarios imply a kind of interpretive capability
for registries.
Some of these are familiar, but others are much more challenging, and it is
not clear whether they belong as a registry requirement.
( Rather, the requirement on the registry might be that the metadata is listed
in such a way as to allow construction of interpretive capabilities as
registered services )

The most familiar of these are coordinate transformations and object name
resolving. These are often the starting points for searches. Current
archives/catalog browsers handle this fairly well, so it is assumed that
registries will be able to do the same. (In the registry framework it is often
assumed this will be handled by a registered coordinate/astronomical name
resolver service)

Higher level interpretive capabilities assumed in some of the science case
scenarios include the ability to, for example, search by REDSHIFT, not only
returning results where redshift is explicit, but also results where REDSHIFT
may be recognised and calculated using explicit VELOCITY values. This implies
that astronomical concepts and their relationships are somehow encoded within
a registry.

Further, there has been a suggestion to incorporate "Expert Knowledge
Registries" into the registry so that a registered service "knows": "what is a
galaxy", "what is a de Vaucouleurs profile", "what is an elliptical galaxy",
etc.

The issue of international language support has also been mentioned.

- Having interpretive capabilities actually within a registry seems to imply
  the need for a library of functions for the registry.

- The complexity could easily get out of hand, so suggest a minimal set of
  functions to do Coordinates and Name Resolving (as implemented in GLU, for
  example)
  or ...
  only have interpretive capabilities as calls to registered services.

Use Cases
---------

+ Use cases extracted from Key Science Cases
  - requirements for registries of different granularity
  ...

+ Categorized Use Cases (adapted from R. Plante use cases)

  1) Locating Data Collections

     Find collections that may contain desired data.
     Return possibilities
       * the type of resource (e.g. archive, survey, catalog)
       * Collection's Home Page
       * a description of supported (access) services

     Possible Search criteria
       * the type of resource
       * type of data included (images, spectra, catalog, ...)
       * frequency waveband
       * sky coverage
       * time coverage
       * the types of services supported (SIA, Cone search, etc.)
       * descriptive keywords

  2) Locating Services

     A. Data Access Services

        Examples:
          Find all optical, ground-based image archives that support SIA.
          Find all mosaicing services that cover the north galactic pole.

        Return:
          * service description (describing supported inputs, outputs, and
            access URL)

        Possible Search criteria:
          * type of service
          * type of data included (images, spectra, catalog, ...)
          * frequency waveband
          * sky coverage
          * time coverage
          * descriptive keywords
          * access type (i.e. what level of authorization is required;
            e.g. public vs. authenticated or proprietary)

     B. Generic/Processing Services

        Examples:
          Find a name-resolver
          Find a coordinate converter

        Return
          * service description

        Possible Search criteria:
          * type of service
          * input/output objects
          * descriptive keywords
          * access type (i.e. what level of authorization is required;
            e.g. public vs. authenticated or proprietary)

  3) Locating Resources

     Return
       * description of resource
       * contact information
       * URL to Resource's home page

     Possible Search criteria:
       * resource name or identifier
       * descriptive keywords

Registry Science Metadata
-------------------------

The proposals for resource metadata in RSM v6 and elsewhere have been compared
against some scientific criteria.
See Anita Richards: http://www.ivoa.net/forum/registry/0200.htm
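
As a discussion aid, the "Locating Services" use case (2A) and the broad
operational requirements B.4 and C.1-C.3 can be pictured with a minimal Python
sketch. Everything in it is an assumption made for illustration: the class
names ``ServiceRecord`` and ``Registry``, the field names, and the
``example.org`` URLs are hypothetical and are not taken from RSM v6 or from any
existing registry implementation. It shows only the coarse-search idea: the
registry holds condensed metadata and returns candidate services, which would
then be queried directly.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class ServiceRecord:
    """Condensed metadata for one registered service (hypothetical fields)."""
    identifier: str              # unique ID (requirement B.4)
    service_type: str            # e.g. "SIA", "ConeSearch"
    data_types: FrozenSet[str]   # e.g. {"images"}, {"catalog"}
    waveband: str                # e.g. "optical", "radio"
    access_url: str              # where the service itself is invoked
    access: str = "public"       # access type: public, authenticated, ...
    keywords: FrozenSet[str] = field(default_factory=frozenset)


class Registry:
    """Toy in-memory registry: stores metadata only, never the data itself."""

    def __init__(self) -> None:
        self._records: Dict[str, ServiceRecord] = {}

    def register(self, record: ServiceRecord) -> None:
        # Registering again under the same identifier acts as an update (C.2).
        self._records[record.identifier] = record

    def unregister(self, identifier: str) -> None:
        # Removing an unknown identifier is silently ignored (C.3).
        self._records.pop(identifier, None)

    def get(self, identifier: str) -> Optional[ServiceRecord]:
        # Unique retrieval by ID (B.4); returns None if not registered.
        return self._records.get(identifier)

    def search(self, service_type: Optional[str] = None,
               data_type: Optional[str] = None,
               waveband: Optional[str] = None,
               keyword: Optional[str] = None) -> List[ServiceRecord]:
        # Coarse search: every criterion that is given must match.
        hits = []
        for rec in self._records.values():
            if service_type is not None and rec.service_type != service_type:
                continue
            if data_type is not None and data_type not in rec.data_types:
                continue
            if waveband is not None and rec.waveband != waveband:
                continue
            if keyword is not None and keyword not in rec.keywords:
                continue
            hits.append(rec)
        return sorted(hits, key=lambda r: r.identifier)


# Example: "find all optical image archives that support SIA".
registry = Registry()
registry.register(ServiceRecord("example/optical-images", "SIA",
                                frozenset({"images"}), "optical",
                                "http://example.org/sia"))
registry.register(ServiceRecord("example/radio-images", "SIA",
                                frozenset({"images"}), "radio",
                                "http://example.org/radio-sia"))
registry.register(ServiceRecord("example/star-catalog", "ConeSearch",
                                frozenset({"catalog"}), "optical",
                                "http://example.org/cone"))

hits = registry.search(service_type="SIA", data_type="images",
                       waveband="optical")
print([rec.identifier for rec in hits])
```

The sketch deliberately leaves out sky and time coverage, hierarchical
resources (A.4), shared service classes (C.6), and the handling of unavailable
services (C.4, C.5); those would need decisions that this draft has not yet
made.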