SimDAL 1.0 Proposed Recommendation: Request for Comments

Public discussion page for the IVOA SimDAL 1.0 Proposed Recommendation.

The latest version of the SimDAL Specification can be found at:

http://www.ivoa.net/documents/SimDAL/index.html

Reference Interoperable Implementations

SimDAL Reference Implementations

Comments from the IVOA Community and TCG members during RFC period: 2016-07-08 - 2016-08-22

Comments from Mark Taylor

I don't have a strong interest in SimDAL, and I have not thoroughly reviewed this draft, but I read it and have some comments.

This document departs from usual VO procedures in various ways, apparently reinventing the capabilities of TAP and the Registry for its own purposes. There is a rationale provided in Appendix B for avoiding use of TAP, which I'm not sure I find convincing, but I haven't gone into the requirements of simulation data access carefully enough to want to comment further on that.

Answer:

The notion of views may present similarities with TAP/TAP schemas. TAP has not been chosen as a solution because it does not fulfill the requirements for Theory. Theoretical services will publish very different kinds of numerical models and simulations (N-body / SPH / MHD simulations, asteroseismology models, radiative transfer codes, astrochemistry models, ...). Some of these theoretical results have a lot of properties characterizing simulated objects (> 100 000 in one of the SimDAL implementations). These numbers are growing due to the progress in numerical models.

Storing such data the TAP way would require each property to be a column of a table in a relational database, since TAP is strongly coupled to SQL and hence to the relational model. This is simply not possible with the majority of the RDBMSs currently in use (PostgreSQL, MySQL), which cannot manage tables with that many columns. High-dimensional data and its use are much better served by other types of storage architecture, which publishers cannot use with TAP (or only with great difficulty, to the point of nonsense) when they do not have an SQL compatibility layer or adapter.

Note that if the definition of SimDAL has taken so long, it is because many technological solutions were tested (and implemented) before reaching the present proposal. Among them, TAP was tested on various data management systems / storage architectures. The conclusion of these implementations is that TAP is not an option. The views solution adopted in SimDAL has two benefits:
1 - it decouples the standard VO interface from the technology used to store the data (so publishers can choose the technology they prefer depending on the particularities of their data)
2 - it is as similar as possible to TAP (virtual table + view schema), so that publishers already familiar with the VO should not be lost.

Concerning the SimDAL Repository part:
First, note that SimDAL components (and among them the SimDAL Repositories) are registered in the IVOA registries.
Unlike the registries, SimDAL Repositories describe resources (protocols / codes, projects, etc.) with the semantics defined in the Simulation Data Model. So it is only with SimDAL Repositories that a search for resources can be done using the SimDM semantics. Moreover, SimDAL Repositories are places where the SimDM XML serializations of projects and protocols (codes) are stored. These serializations are the descriptions of theoretical projects and codes that are published in the VO. IVOA registries do not have functionalities to store and query such serializations, whereas SimDAL Repositories do.
Discussions with Markus (for the Registry W.G.) showed that some parts of these serializations could be transformed and ingested in the IVOA registries. Nevertheless, this would be done at the cost of losing the relationships between SimDM classes, and so losing the hierarchy of the model and a part of the SimDM semantics.
Presently, the SimDAL Repository search API does not allow users to fully benefit from the SimDM XML serializations, even though most scientific use cases would require fine-grained search in these serializations to efficiently discover protocols and projects of interest. This has been a choice for version 1.0 of SimDAL. Indeed, in the coming months / years we do not expect to have a lot of registered IVOA theory services, so it should be easy for users to discover theoretical services with the SimDAL Repositories as presently defined. Nevertheless, as more and more theoretical services are registered, finer-grained search will become necessary. SimDAL Repositories as defined in version 1.0, storing the full XML serializations of projects and protocols, contain all the information, and the standardized relationships between its elements, needed to answer these use cases. It will then be time to extend the capabilities of the Search API.

Answer to the answer: Well, ok. If you've found by experience that TAP is insufficient for SimDAL's requirements I believe you, and I'm well prepared also to believe that inventing some custom discovery/search API may be a better way to proceed than trying to generalise the more mainstream VO technologies to cope with SimDAL's specific requirements. However, a couple of comments on that:

One downside of this approach is that you don't benefit from the large amount of scrutiny and testing that have gone into existing TAP/Registry protocols, and this means that you may have to take extra care in specifying exactly how the APIs you're defining here are supposed to behave (this is one reason that validators are valuable tools, to check where that hasn't happened). One example that springs to mind: it looks like the intention of the JSON query language described in sec 5.1 is that constraints in the "where" list are ANDed together. But I don't see that written down explicitly (apologies if I've missed it), and I don't see any provision for OR logic. I also don't see discussion here of case sensitivity for field names. If things like this are not specified explicitly it may inhibit interoperability in implementations.
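
To make the ambiguity concrete, here is a minimal Python sketch of one possible reading: every constraint in the "where" list must hold (AND), and field names are compared case-insensitively. Only the "where" list itself comes from the draft as discussed above; the constraint keys ("field", "op", "value"), the operator set and the sample records are hypothetical and are not taken from the SimDAL specification.

```python
import json
import operator

# Hypothetical constraint layout: only the "where" list is mentioned in the
# discussion; the keys "field", "op" and "value" are assumptions.
QUERY = json.loads("""
{
  "where": [
    {"field": "Mass", "op": ">", "value": 10},
    {"field": "redshift", "op": "<=", "value": 0.5}
  ]
}
""")

# Assumed operator set, for illustration only.
OPS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(record, query):
    """Return True if the record satisfies ALL constraints (AND semantics)."""
    # Assumed policy: field names are case-insensitive.
    lowered = {name.lower(): value for name, value in record.items()}
    return all(
        OPS[c["op"]](lowered[c["field"].lower()], c["value"])
        for c in query["where"]
    )


print(matches({"mass": 12, "Redshift": 0.3}, QUERY))
print(matches({"mass": 12, "redshift": 0.9}, QUERY))
```

An implementation that treated the list as OR, or that compared field names case-sensitively, would give different answers for the same query, which is why the intended behaviour needs to be stated in the text.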

Your response here is not what's written down in Appendix B, which mentions other reasons for avoiding TAP. It would be useful if you summarised in Appendix B all the reasons that SimDAL decided on non-TAP/non-Registry solutions.

-- MarkTaylor - 2016-08-09

Section 1.1: Only a specimen IVOA architecture diagram is included; a real one should be used. In view of the unusual content of this standard, as I discussed above, there should be some more detailed discussion here of which IVOA standards this document uses, which ones it avoids in favour of its own ways of doing similar things, and why.

Answer: Indeed. The diagram has been replaced. If a diagram with all the standards is required, it will be introduced in the corrected version of the document.

Section 3.2: The use of VOTable to encode errors here says it follows DALI, but in fact it looks different from the usual way that DALI-compliant services do it. The specification in this document encodes errors as a sequence of multiple (error_msg,error_code) pairs as rows within a TABLE, while DALI encodes an error as a single INFO element outside the TABLE element. I suspect this is a misunderstanding of DALI intention, but maybe it's deliberate because of the need to report sequences of errors rather than single ones. It should either be changed to match standard DALI practice, or if not it should be clear from the text that this is not DALI standard.
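
For comparison, the usual DALI convention can be sketched as follows: the error is carried by a single INFO element with name="QUERY_STATUS" and value="ERROR" inside the results RESOURCE, outside any TABLE. The Python below is only an illustration using the standard library; the sample document, the error text and the function name are made up for this example and do not come from the SimDAL draft.

```python
import xml.etree.ElementTree as ET

VOTABLE_NS = "http://www.ivoa.net/xml/VOTable/v1.3"

# Illustrative DALI-style error document; the message text is invented.
DALI_ERROR_DOC = f"""
<VOTABLE version="1.3" xmlns="{VOTABLE_NS}">
  <RESOURCE type="results">
    <INFO name="QUERY_STATUS" value="ERROR">Unknown field: foo</INFO>
  </RESOURCE>
</VOTABLE>
"""

# Illustrative successful response with no rows, for contrast.
DALI_OK_DOC = f"""
<VOTABLE version="1.3" xmlns="{VOTABLE_NS}">
  <RESOURCE type="results">
    <INFO name="QUERY_STATUS" value="OK"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>
"""


def dali_error_message(votable_text):
    """Return the DALI error message, or None if no error is reported."""
    root = ET.fromstring(votable_text.strip())
    for info in root.iter(f"{{{VOTABLE_NS}}}INFO"):
        if info.get("name") == "QUERY_STATUS" and info.get("value") == "ERROR":
            return (info.text or "").strip()
    return None


print(dali_error_message(DALI_ERROR_DOC))
print(dali_error_message(DALI_OK_DOC))
```

A generic DALI client typically checks for this INFO element, so a service that reports errors only as (error_msg, error_code) rows in a TABLE would probably look like a successful query to it.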

Answer: Thank you. That has been corrected.

Section 4.2: "The response schema of the results table is (FIELD IDs):" but the following table has FIELDs with name attributes as listed rather than ID attributes. Some of the VOTable samples use lower-case element names, which is not permitted in VOTable.

Answer: Thank you. Also corrected.

There are reference implementations listed, which is good. However, I don't see any validators. I played around a bit with the implementations (not really understanding how to drive them properly); quite a few links in the obspm implementation lead to error pages. Validation tools should be provided by this stage of the review process, and ought to help in identifying missing/broken functionality like that I currently see in the obspm implementation.

Answer: At the Sesto InterOp in May 2015, when the procedure to finalize SimDAL was launched, Severin Gaudet (as chair of the TCG) asked for a validator but said that a client compatible with the reference implementations counts as a validator. So a client, rather than a simple validator, has been developed. It is compatible with the two reference implementations.

We tested the client (https://app.ism.obspm.fr/simdal-client/) and it seems to work properly.
A few comments on its use:
1 - To search for simulations, follow the order of the top menu: Search in the Repository, then do a SimDAL Search, and finally search in Access data. Each step provides the URIs for the next one.
2 - In the repository search, first select a SimDAL Repository before doing a {search} or asking for the list of {projects}.
3 - At each step, after a search, the system provides the URIs of the services. These URIs have to be copied and pasted into the next step.

-- MarkTaylor - 2016-07-14

Answers: --IVOA.FranckLePetit and DavidLanguignon - 2016-08-09