Simulation Data Model Proposed Recommendation: Request for Comments

This wiki page will act as the RFC center for the Proposed Recommendation entitled "Simulation Data Model v1.0". The specification can be found below as attached files.

NB: The specification includes additional documents beyond the PR document itself. Links to these additional documents can be found here: http://wiki.ivoa.net/twiki/bin/view/IVOA/IVOATheorySimDMspec.
In particular, note the HTML document, which contains the data model in full detail.

Reference Implementations

Several Simulation DM 1.0 reference implementations are documented in the following IVOA Note, 02/April/2012

RFC Review period: 04 May 2011 to 12 October 2011

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your WikiName so authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the Data Model and Dal mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.

TCG Review Period: 20 Oct 2011 - 20 Nov 2011



Thank you for adding your comments below...

Comments

I think all implementation examples can be gathered in the same Implementation Note. This means moving Appendix C for the Millennium Simulations from the 'Appendix' document to the Implementation Note, as suggested in the Appendix text.

-- MireilleLouys - 04 May 2011

Most of the Utypes have a SimDB: prefix in the accompanying documentation and in the implementation note. I suggest changing this prefix to SimDM: in order to distinguish the use case of a database of simulations (discussed in the theory group) from other use cases dealing with individual serialisations of SimDM. "simdb" also appears as the main package name in the UML representation: this could be discussed and clarified.

-- MireilleLouys - 04 May 2011

REPLY:
I have changed the prefix to SimDM in the accompanying documents, in particular the HTML file with the detailed description of the model. If UTYPE prefixes should be lower case, that can be changed.

The root package name could (should?) be changed to SimDM without any problems. But as you suggest this may be discussed in Naples.

-- GerardLemson - 14 May 2011

UPDATE: it seems that volute does not contain the latest version with SimDM. I think I know why and will fix this asap.

-- GerardLemson - 17 May 2011

Which document/s are we discussing?

The document that was submitted to the list (and is available in the wiki page), that is, the Data Model document, makes reference to other "accompanying documents" very often. For instance, it mentions the Appendix many times.

This Appendix is not part of the document and was not submitted to the list together with the SimDM document.

The important question is: what are we revising here? The submitted document alone, or the document together with the mentioned set of "Accompanying documents"?

In fact, when I've tried to use the links at the end of the document, they don't work (at least for all the cases that I've tried). Navigating the volute web page I've been able to find documents with similar names that I assume are the ones referred to, but I can't be sure.

In the particular case of the "Appendix", it seems clear that it is a draft (with comments like "TBD review this section", "@@TODO discuss this further @@", etc). This is ok in a draft, but it shows that it is not a final document. So I assume that it is not part of this discussion, is it?

REPLY:
This was commented upon by Miguel Cervino as well during the WG review. The top of this RFC page therefore explicitly states that all documents to which links can be found here are part of the specification.

-- GerardLemson - 14 May 2011

REPLY:

OK, I'll send my comments to the appendix as soon as possible.

I wasn't aware of this new (?) practice of discussing things on a wiki page, so I didn't read those comments. I will look for it.

But I really think that it is, at least, not very practical for a standard to be distributed across so many documents, some of them unfinished, that must be found in different places.

By the way, I have tried the 6 links to the accompanying documents at the end of the SimDM document and 5 of them don't work. Maybe you could take a look at it.

-- CarlosRodrigoBlanco - 14 May 2011

Isn't it too summarized?

If this is the document for the Simulation data model I imagine that it should contain a description of all the elements in the model, and that if something is not described here, it is not part of the model.

4.2 Things like "There is the Party class, which represents an individual or organisation, and is not so important for the moment." and then no other comment is made about this class in the document (although it seems to be seen as part of the data model).

In other words: is the "party class" part of the model? or it isn't yet but we intend to include it in future versions? or is it explained in an "accompanying document"?

In general, I have the impression that this document tries to be a friendly general description of the model, without going to the hard details, probably tending to summarize with the intention that it is easier to understand. But I think that the details should be part of the document.

REPLY:
See the comment above. In particular the HTML document contains documentation about each and every element that is part of the model.

-- GerardLemson - 14 May 2011

Specific comments

The first thing is again: couldn't we find another word for (experimental) Protocol?? not just adding a "(experimental)" in italics before it?

It was very clear at Nara that using this word in a VO context is extremely confusing. And, if we are going to continue using it, we should write the "(experimental)" adjective everywhere, without exceptions (including figures).

In general, I think that there are too many references to protocols and simDB in the document. I understand that some comments must be made because the modelling is done in such a way that the data can be accessed. But the data model is about how the data is represented/modelled and not about how the data is accessed. And I think that it would be better to have as few references to protocols as possible.

RESPONSE:
We can discuss this; however, you would not expect a data model that deals with the provenance of images, for example, not to contain the term Image. So why should we forbid a data model that deals with special types of Protocols (such as simulation codes) from containing a concept with that name? Note that "protocols" in the "IVOA sense", for example the DAL protocols, are special instances of the same Protocol concept, namely the method by which one defines that a certain action is executed. In our model Experiments are performed according to a Protocol; there is nothing wrong with the choice of name in my opinion.

I think in general the context within which the words are used will make clear what is meant.

-- GerardLemson - 14 May 2011

REPLY:

The truth is that when the word "protocol" is heard in a VO meeting, almost everybody thinks of "DAL protocols", not "the way a theoretical experiment is done". And trying to have a conversation on how a (DAL) protocol will deal with (experimental) protocols becomes quite hard. That happened in Nara and it led to a very confusing discussion.

That wouldn't happen with the word image in a data model for images. There, I cannot imagine how the word would be misunderstood.

The point is not whether the word is correct or not. I agree that it is correct. The point is whether it is misleading, given that it is a word that is broadly used in the VO with a different meaning (even if it is based on the same semantic concept).

But it is a minor point. If people agree on using this term, let's use it.

-- CarlosRodrigoBlanco - 14 May 2011

Service/Webservice

I've found it quite difficult to understand where the "service" class is located in the model (I'm not completely sure that I understand it yet). Looking at figures 1 and 2 I assumed at first that a service is specified for each experiment (which would mean that there is one service for each run of the code). But looking at figure 3 and section 3, and then section 4.8, I see that both "Service" and Experiment (and Protocol and Project) are subclasses of Resource. For me this would mean that there is one service for each resource (although the resource contains several experiments run under a given protocol) and that this service can be used to access the results of all the experiments. Am I right? If so, I think that figures 1 and 2 are a little misleading.

RESPONSE:
You are mistaken. The fact that a Service "is a" Resource and so is an Experiment has nothing to say about the number of resources per service. In fact the model itself explicitly indicates the relation between a Service and other resources through its collection of AccessibleResource-s. This indicates that a service can give access to any desired number of resources.
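
The relation described in this response can be illustrated with a minimal Python sketch. It is not taken from the SimDM specification: the class layout is simplified, and the attribute name `accessible_resources` as well as the example names are hypothetical stand-ins for the collection of AccessibleResource-s mentioned above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Base class: anything that can be described in the model."""
    name: str


@dataclass
class Experiment(Resource):
    """One run performed according to a Protocol (e.g. a simulation code)."""
    protocol_name: str = ""


@dataclass
class Service(Resource):
    """A (web) service. It is itself a Resource."""
    # Hypothetical attribute standing in for the collection of
    # AccessibleResource-s: any number of resources, not exactly one.
    accessible_resources: List[Resource] = field(default_factory=list)


# Three runs of the same (made-up) code, all reachable through one service.
runs = [Experiment(name=f"run-{i}", protocol_name="example-code") for i in range(3)]
service = Service(name="example-access-service", accessible_resources=runs)
```

In this sketch the inheritance ("a Service is a Resource") and the cardinality ("a Service gives access to N resources") are independent: the first is expressed by subclassing, the second by the list attribute.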

Note that the Service concept is on purpose kept rather vague. The goal was simply to allow users to find (web) services that give access to (results of) experiments or sets of experiments. This can be achieved in a database containing descriptions of experiments and services linked to them. One would go to the service itself to find out how the access is achieved.

-- GerardLemson - 14 May 2011

Section 2: History

I don't think this section is very important, but given that it is part of the document, I would make a couple of small changes.

(a) It is true that some types of simulations (the cosmological ones, for instance) are often made in big collaborations. But it is also true that other types of simulations are performed by a team of one astronomer plus one student. And, at least, I wouldn't say that the first case is more usual. In fact I'm convinced that this is a point that should be considered: many (if not most) theoretical simulations are made by small groups.

Thus, I would delete the sentence "and is these days often performed in large collaborations".

(b) When it starts talking about S3 it says: "A recent effort has been"... I don't think it is so recent.

The Note is more than two years old and the idea had already been presented in March 2007 at the "Astronomical Spectroscopy and the Virtual Observatory" workshop (without the S3 name). Actually it was a parallel effort, for microsimulations, when other people were focused on 3+1 simulations. I would just change it to "Another effort", "A different approach" or something like that. And, by the way, it is not a result of an investigation started at Cambridge, as we were working on it before Cambridge (and had a first version implemented for isochrones and evolutionary tracks, not just theoretical spectra).

"S3 is actually a direct reworking " to "S3 is a generalisation ". I don't know what "a direct reworking" really means, but I don't think S3 is a direct reworking of TSAP at this stage (maybe it was at the end of 2006 when we first implemented it).

(c) We didn't really decide at Victoria that S3 and SimDAP would be merged into a single protocol named SimDAL. We decided that it would be nice to do so and that we would investigate whether it is possible or not. Actually I don't know enough about what SimDAP means to be able to say how both protocols could be merged, or whether they can be merged, or whether it is a good idea to merge them into a single protocol. But, in any case, this is not the subject of this document.

In fact I think that stating the intentions of the TIG about protocols and so on (and what is decided or not about that) is out of the scope of a data model document. Even though this is the historical introduction, I don't think it is necessary (or even good) to include such statements. Thus, rather than discussing it, I would drop the last paragraph.

Section 3.1

"and SimDAP has merged with S3 to form SimDAL: a family of access protocols for theory data"

I would change "merged" to "joined" or something like that. At this stage, both things are included under the SimDAL concept, but we don't really know yet if they will be a single protocol, two protocols, two flavours of the same thing or what. Actually, the word "family" seems to imply that there will be more than one "flavour" and, to me, that seems contradictory if we say that they are merged.

REPLY:
I would like to ask Herve to comment on this.

-- GerardLemson - 14 May 2011

How to use the model

It is quite clear that this data model has been made mostly with two ideas in mind: cosmological 3+1 simulations and a database of simulations (mostly of this type). This is quite obvious throughout the document (and it's perfectly understandable given the history).

But, given that this is intended to be THE model for simulations and that I have implemented more than 20 services for theoretical data, I have to try to find the correspondence between the concepts that I use for those services and the ones in the data model. And I must say that it is not easy even for the most simple models. And what worries me is that, if it is not easy for me (I'm not an expert in data modelling, object oriented programming, uml and so, but I've been attending talks and discussions about this for a long time), how will it be for scientists who have their grid of models and want to make them available in the VO?

I think that we very much need, at least, a simple recipe on how to implement this data model for simple cases (let's say: those usually called microsimulations). And probably some examples should be included in the document so that data providers have a starting point for this complex standard.

And I assume that it is partly my assignment to do such a thing (or at least I would like to be able to do it). But I'm still not sure if I am able to figure out how to do it.

REPLY:
Honestly I have difficulty responding to this again, as I think I have done so repeatedly before, which always included the offer to try to assist you in concrete cases. This might have improved the model if necessary, or at least its explanation!

Though a long time ago the model started out with the aim "only" to describe simulations with a spatial/temporal nature, ranging from stellar systems to large scale structure, it was found by all active participants that without undue difficulty other simulations could be described as well.

In the most recent Strasbourg interop, for example, I gave a presentation on how S3 concepts map onto the SimDM. The fact that you cannot see how that could work in practice for concrete examples may also be due to a lack of understanding of the model. Therefore I have offered more than once to help you do this for your particular examples. Though in Nara you said you would try, I have not heard back from you. Maybe the examples below can be used for this? I will have a look at them and maybe we can try to work on them in Naples?

Without you trying to do this mapping, or if you have difficulties trying to get assistance, I find this criticism unjust.

-- GerardLemson - 14 May 2011

REPLY:

Honestly, I don't understand that acrimony.

First, this paragraph in particular wasn't a criticism.

I don't think that it is a good idea to take this to the personal level, but about the personal comments:

Of course I have a lack of understanding of the data model. I think I have acknowledged it time and again.

By your presentation in Strasbourg you mean this one: http://wiki.ivoa.net/internal/IVOA/InterOpMay2009Theory/SimDB-S3.ppt ? It mostly says "S3 does not need the whole data model but there is a subset of it that can be useful". It doesn't say how. And it doesn't answer any of the questions that I'm making.

In Nara, when I said that I would try to make the matching between the data model and the "microsimulations", you told me that you were going to send me some kind of form (or something like that) so that I could try to fill it in with the information for some of the models that I implement. And that information would be useful to address this point in detail (or something like that is what I understood). I don't know exactly what all this means and what you intended to send me, but I was waiting for it. Finally I imagined that you had forgotten about it and were focused on other things (I understand it because I also was). And I made my attempt at some example serialisations of a simple case. And of course I think it would be nice to discuss that in Naples.

In any case, this is not the point. I think you have always been as helpful as possible and I don't have any criticism about that.

The point is: am I the only one who finds this data model (with all the accompanying documents, some here, some on the web, some on volute...) so difficult to understand, especially when trying to use it?

Because one thing is getting a general idea like "the model describes theory metadata, the protocols, the experiments, the results and so". And another different thing is trying to use it in practice. I mean, one thing is understanding the meaning of each box (classes, instances and so), another one is finding which one corresponds to each concept that one wants to use and another different one is trying to, in practice, serialise it.

I am sure that you understand all this because you have created it. I just want to point out that I don't. I'm sorry. And I wonder if I am the only one.

-- CarlosRodrigoBlanco - 14 May 2011

A try to make a couple of examples

When writing a votable containing data for a theoretical simulation, I assume that the data model should be useful to better characterise the content of that votable (concepts and relations between them).

Thus, my main exercise has been trying to use this data model in an extremely simple case. I have a collection of theoretical isochrones and I want to rewrite it adding utypes from the data model.

And I must say that I'm not sure at all about how to do that.

The main idea is to be able to say:

  • This is an isochrone
  • It is the isochrone for a star (or set of stars, let's say "star" for simplicity)
  • It has been calculated with the Baraffe et al model
  • The parameter is the star age.

And, even:

  • the isochrone is a table with four columns:
    • mass
    • effective temperature
    • logarithm of gravity
    • bolometric luminosity

In a simple model I would imagine something like this:

<PARAM name="model"  utype="SimpleSimDM:model" value="Baraffe"/>
<PARAM name="object" utype="SimpleSimDM:targetObject" value="star"/>
<PARAM name="INPUT:age" value="0.00100" unit="Gyr" ucd="phys.age" utype="SimpleSimDM:inputParameter"/>

<TABLE>
<PARAM utype="SimpleSimDM:product" value="isochrone"/>
<FIELD name="mass" utype="SimpleSimDM:product.property"/>
<FIELD name="teff" utype="SimpleSimDM:product.property"/>
<FIELD name="logg" utype="SimpleSimDM:product.property"/>
<FIELD name="Lum" utype="SimpleSimDM:product.property"/>

(...)

But this SimDM model is not simple. It is complex and very hierarchical. And it is known that this hierarchical structure is not easy to represent in a flat document such as a VOTable.

I just try to identify the most appropriate utypes for each concept and add relations (by grouping) with the semantic labels so that eventually they can be linked to some vocabulary.
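
To make the flat "simple model" above concrete, here is a minimal Python sketch that reads such a fragment and indexes its PARAM and FIELD elements by utype. Several things are assumptions made only for illustration: the `SimpleSimDM:` utypes are the hypothetical ones from the fragment above (they are not SimDM utypes), the enclosing `RESOURCE` element and the closing `</TABLE>` are added so that the snippet is well-formed XML, and the result is not a valid VOTable (namespace and `datatype` attributes are omitted).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Well-formed stand-in for the fragment above (hypothetical utypes,
# wrapper element and closing tag added for this sketch only).
FRAGMENT = """
<RESOURCE>
  <PARAM name="model" utype="SimpleSimDM:model" value="Baraffe"/>
  <PARAM name="object" utype="SimpleSimDM:targetObject" value="star"/>
  <PARAM name="INPUT:age" value="0.00100" unit="Gyr" ucd="phys.age"
         utype="SimpleSimDM:inputParameter"/>
  <TABLE>
    <PARAM utype="SimpleSimDM:product" value="isochrone"/>
    <FIELD name="mass" utype="SimpleSimDM:product.property"/>
    <FIELD name="teff" utype="SimpleSimDM:product.property"/>
    <FIELD name="logg" utype="SimpleSimDM:product.property"/>
    <FIELD name="Lum" utype="SimpleSimDM:product.property"/>
  </TABLE>
</RESOURCE>
"""


def index_by_utype(xml_text):
    """Map each utype to the values (PARAM) or names (FIELD) that carry it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for element in root.iter():  # document order
        utype = element.get("utype")
        if utype is not None:
            # PARAMs carry a value; FIELDs only have a name.
            index[utype].append(element.get("value") or element.get("name"))
    return dict(index)


index = index_by_utype(FRAGMENT)
```

With a flat annotation like this, a client only needs one lookup per utype. The hierarchical structure of SimDM has no such direct counterpart, which is the difficulty raised here.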

And I even get a little lost here. I get confused by the ObjectType, TargetObjectType, RepresentationObjectType, ExperimentRepresentationObject, RepresentationObject... I'm not very sure what I should use for just saying "this is a star", what for saying "this is an isochrone", what for representing the object contained in the isochrone (with its properties, mass, logg, etc...). I get the feeling that this is quite a flexible and interesting idea, but I don't really understand much about all these classes named Object without some examples.

Finally:

I have written two quite simple votables:

  • the first one contains one isochrone and I try to use the datamodel in it.
  • the second contains a list of isochrones for a given range of ages.

Could someone tell me (probably Gerard) if they make sense?

(In some cases I have added groups with only one param inside, which is quite unnecessary. I do it just to show the idea that something else could be added, if needed, as related to that param)

If you make some comments and help me about this, I would try to do something similar for the much more complicated case of asteroseismology simulations.

By the way, I don't find in the model any "box" to specify a bibliographic reference. I think it would be important to address that too.

-- CarlosRodrigoBlanco, 12 May 2011

REPLY:

Dear Carlos,

To map your isochrone models on SimDM, you should have :

  • TargetObjectType : Star
  • InputParameter : Age
  • OutputDataObjectType : isochrone
  • Property : mass, effective temperature, gravity, bolometric luminosity

Moreover, you can use the Resource description to mention all the references you wish, such as Baraffe et al.
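
The mapping listed above can be written down as a small lookup table. This is only a sketch in Python: the dictionary and the helper `simdm_class_for` are hypothetical names introduced here, and the entries simply restate the four bullets.

```python
# SimDM class name -> isochrone concepts that map onto it (from the list above).
ISOCHRONE_TO_SIMDM = {
    "TargetObjectType": ["star"],
    "InputParameter": ["age"],
    "OutputDataObjectType": ["isochrone"],
    "Property": [
        "mass",
        "effective temperature",
        "gravity",
        "bolometric luminosity",
    ],
}


def simdm_class_for(concept):
    """Return the SimDM class a concept was mapped to, or None if unmapped."""
    for simdm_class, concepts in ISOCHRONE_TO_SIMDM.items():
        if concept in concepts:
            return simdm_class
    return None


print(simdm_class_for("age"))      # InputParameter
print(simdm_class_for("gravity"))  # Property
```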

Concerning the "Object" classes, if I can give you a piece of advice, you should only care about those with SKOS concepts. Browsing the vocabularies can help to understand how to do the mapping.

It is not very useful to try to serialize this instance of SimDM in a VO-Table. As you said, in most cases this cannot be done because of the hierarchy of SimDM. Such a serialization effort is only useful for communication between a server and a client. That is the purpose of S3 but not of SimDM, which just aims at describing protocols and experiments. It will be easier to do such a comparison between S3 and the access protocol.

Best regards.

-- FranckLePetit, 17 October 2011

Comments from Mark Taylor (for Apps WG)

(This comment is on the 20110428 version; I have just seen a reference to a 20110520 version on the IVOATheorySimDMspec page, but the link is broken. I have not read all of the supplementary material, some of which appears to be at the end of broken links.)

The basic content of the data model seems OK to me, though being quite ignorant about astrophysical simulations I'm not qualified to say whether it does a good job of capturing the relevant concepts.

There are a few issues with the organisation of the document that warrant comment:

  • The large number of supplementary documents is, as noted by others, somewhat confusing. These may be useful or necessary supplements to the standard, but it's not clear how they are to be assessed within the framework of document review process, or whether they form part of the standard itself. This question, which may affect other documents in the future, may be something which should be considered at the TCG level. It would be useful at least to have near the start of the main document an outline of what the supplementary material is and where it fits in to the context of the standard.
  • I'm not sure what the status or purpose of the Appendix (Appendices) supplementary document is. I would normally expect an appendix to be part of the document itself.
  • The History section (also historical material e.g. in section 3.1) seems to be of marginal use in a standards document. Although it does contain some relevant context, the details of what meetings spawned what developments don't seem to be that relevant to people who want to use the standard.

REPLY:

For the version dated 2011.09.06, the appendix has been included in the main document. According to another comment we have moved a few sections to the Implementation Note. The History section has been moved to Appendix A.

-- HerveWozniak - 17 Oct 2011

The standard might be more comprehensible to potential adopters if a short (perhaps incomplete) example or two appeared in the body of the text. I appreciate that some of the supplementary material contains examples, and perhaps this is the best way to do it if the examples are necessarily bulky. If that's the case, a pointer in the body of the text at appropriate points to the relevant external document would help (apologies if that's there and I've missed it).

REPLY:

A full example can hardly be given in the main body of the document since any XML document could be as long as 10,000 lines. However, we believe that the Implementation Note provides strong help for future implementers. Examples are given there for Protocol and Experiment classes.

-- HerveWozniak - 17 Oct 2011

There are also a few editorial issues:

  • A couple of paragraphs just stop mid-sentence (sections 4.7, 6.3.3).
  • There is a missing reference to a (missing?) chapter ("...will have to be registered (see chapter for that discussion)") in sec 6.1.
  • The document refers to itself in several places as a Working Draft or pre-WD (e.g. Note 8).
  • A small number of spelling mistakes (similarioty, serialiased, UYUPE).
  • Reference [10] to TAP can be updated to the REC version of TAP 1.0.

-- MarkTaylor - 08 Jul 2011

REPLY:

We have updated the document according to your comments for the release of the 2011.09.06.doc PR, before the extension of the RFC. Apologies for having replied so late.

-- HerveWozniak - 17 Oct 2011


Comments from Enrique Solano (for Apps WG) Oct 12th ANSWERS REQUESTED

  • In my opinion, there is a fundamental issue still missing in the document: Specific examples where the data model has been used. This is a complex model and without examples it will be very difficult for the data providers to implement it in their data collections.

REPLY:
Implementation examples can be found in the Implementation Note (see the comment of May 4th by MireilleLouys, DM WG at that time) in a form that the editors found to be smarter than simply providing an endless XML file. The Implementation Note has been provided as an accompanying document of the PR, has been updated for the RFC extension and will be published as an IVOA Note soon after REC.

-- HerveWozniak - 18 Oct 2011

For the Naples Interop. (May 2011) Carlos Rodrigo uploaded a couple of examples on the use of the data model in a very simple case (a collection of isochrones). He identified a number of problems (even in this very simple case) and asked for help and comments. No replies yet.

REPLY:

F. LePetit has answered and has given advice. It is not the purpose of SimDM to be serialized in a VOTable. For PDR "micro-"simulations the XML file is 13000 lines long!

-- HerveWozniak - 18 Oct 2011

  • Other comments on the document

Page 39.

> A recent effort has been the proposal for a simpler access standard for small
>scale simulation, the Simple Self-describing Service protocol (S3, [14]).

--> Remove "recent" or replace it with "parallel". The S3 Note was published in 2008.

REPLY:

OK.

Action: change 'recent' to 'parallel'

-- HerveWozniak - 18 Oct 2011

> This was a result of an investigation started in the Cambridge 2007 interoperability meeting
> whether “micro-physics” simulations as they are sometimes called require special
>attention.

This is not correct. A description of how to handle theoretical spectra (a clear example of "microsimulations") appears in an SSAP Proposed Recommendation issued on Sep 17, 2007. The Cambridge Interop. took place on Sep 27-28. What is included in the Proposed Recommendation is the result of a joint collaboration between ESA-VO and SVO started some months before. Moreover, already in 2004, Spanish and Mexican groups were working on this topic in the framework of the PGos3 project.

REPLY:

Handling theoretical spectra is related to handling the output of a modeling process or simulation (a DataObjectType in SimDM). The full description of the model (or the simulation) that produces such spectra is a much larger topic. Describing all kinds of microsimulations, in a generic approach, is also a much larger topic, and tackling it was decided during the Cambridge 2007 INTEROP.

-- HerveWozniak - 18 Oct 2011

> The SimDM was shown to be able to incorporate the metadata for
> S3-like services, and indeed proposes extensions
> of that.

Well. Not yet proven. Actually, this is why we are asking for comments about the couple of examples (using the model for isochrones) Carlos uploaded to the Twiki page some months ago. Simply because we do not know whether SimDM is able to incorporate the S3 metadata.

REPLY:

This has been discussed in various past INTEROP sessions and dedicated meetings (such as Sep 2010 in Strasbourg). SimDM has been improved to handle the needs of S3 services. The Implementation Note (section 5) contains an in-depth analysis of this problem.

-- HerveWozniak - 18 Oct 2011

> It was decided that the S3 protocol should be merged
> with/incorporated into the SimDAP standard

Clearly, this sentence has to be rephrased. It is quite weird to say that it was decided to merge/incorporate S3 into the SimDAP standard if there is NOT a single reference to SimDAP in the whole IVOA Documents and standards page. On the contrary, you'll find the S3 Note there.

REPLY:

S3 is indeed described in an IVOA Note but is not an IVOA standard. SimDAP is a long evolution from the preliminary SNAP initiative and its goal is described in "The IVOA in 2008: Technical Assessment and Roadmap", released in August 2008. The decision to have only one access protocol for accessing simulations was taken in Victoria in 2010. SimDAL (which is the name of the common effort to define a DAL protocol) has been in the IVOA roadmap since then.

Action: rephrase: “…should be merged with SimDAP protocol to create the SimDAL standard”

-- HerveWozniak - 18 Oct 2011

> The appendix document addresses this question from a formal point of
> view, namely by defining how S3-like
> services can be described by the data model.

If I am correct, the appendix is not included in the document. And, again, whether S3 services can be described by the data model or not is still an open question (see above).

REPLY:

This is a mistake. Appendices have been included in the last document (20110906).

Action: replace “appendix document” with “implementation note”

-- HerveWozniak - 18 Oct 2011

> We then propose how also the S3 proposal could use the model.

To my knowledge, not in the present version of the document.

REPLY:

Right. This part has been moved to the Implementation Note in order to collect all points related to implementation in a single document (see MireilleLouys comment on May 4th).

-- HerveWozniak - 18 Oct 2011



Comments from TCG members during the TCG Review Period: 20 Oct 2011 - 20 Nov 2011

WG chairs or vice chairs must read the Document, provide comments if any and formally indicate if they approve or not the Standard.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair (Christophe Arviset, Séverin Gaudet)

First, I want to thank the authors of the document for the hard work to put this SimDM document together, through the many IVOA Interop planned and unplanned sessions and any other discussions and meetings that took place over the last years since this was initiated. Let me also apologize for bringing these comments so late in the process.

I know that before SimDM became what it is, there were many discussions on many other potential standards (e.g. SNAP, SimDM, SimDB, SimDAP, SimDAL, SimTAP, and I might be forgetting some…) and these are reflected in many places throughout the SimDM document.

As we discussed in the context of other standards, the final REC document is supposed to reflect WHAT the standard is, and not really HOW the standard has reached its mature state.

My understanding of how the Simulations within the VO are going to be published, registered and accessed in the VO is through the context of SimDAL / SimDM / SimDB, with:

  • SimDM being the model for Simulation (this document)
  • SimDAL being the Data Access Protocol to access SimDM. SimDAL is still being discussed and will probably be based on TAP
  • SimDB, and one still needs to determine how the required functionalities can be addressed by the existing VO Registries (with some potential extensions) and / or through TAP/VOSI interfaces

Having said all the above, to improve its readability in the overall IVOA context, the SimDM document would need some editorial cleaning and should more clearly

  • introduce in section “Link to IVOA Architecture” the SimDAL/SimDM/SimDB context

REPLY
The section has been extended to better introduce the various initiatives. -- HerveWozniak - 2 Mar 2012

  • in the rest of the document, concentrate on the MODEL for Simulation (with less constant references to SimDAL and SimDB)

REPLY
We have rigorously kept the number of references to SimDB as low as possible. However, SimDM also aims to support SimDB, a repository of simulation metadata, so we cannot fully avoid the word SimDB! The number of references to SimDAL is even smaller. -- HerveWozniak - 2 Mar 2012

  • in the main text of the document, remove ALL other references to any previously envisaged but finally not put forward IVOA standards (ie SNAP, SimDAP, SimTAP, VO-URP, …). The historical background of decisions, meetings, agreements, etc… can be added in Appendix A for reference but should not confuse the reader through the rest of the text.

REPLY
done -- HerveWozniak - 2 Mar 2012

The SimDM main document remains the document to be reviewed by the TCG to ensure consistency with other IVOA standards, through the comments and approval of the WG chairs. The Implementation Note is very welcome and demonstrates the usability of the model. Nonetheless, it does not represent the normative document to be formally reviewed. If there are arguments about S3 (more in the area of access protocols, but not a formal IVOA standard anyway), we should remove these references from the main document and from the implementation note.

I feel that an updated PR document should be issued with these editorial updates, and then we should have another 2 weeks review within the TCG before the document goes to the EXEC for final approval.

-- ChristopheArviset, 6 Feb 2012

Thanks to the authors for all the updates, which address my points and make the DM clearer. I approve the document.

-- ChristopheArviset, 8 March 2012

Applications Working Group (Mark Taylor, Enrique Solano)

# Comments on the main document (simulation data model). Enrique Solano.

* Pag 39:

+ "A parallel effort has been the proposal for a simpler access standard for small scale simulation, the Simple Self-describing Service protocol (S3, [14]). This was a result of an investigation started in the Cambridge 2007 interoperability meeting..."

--> This is not correct. S3 is the natural evolution of TSAP, a protocol to handle theoretical spectra which appeared in a SSAP Proposed Recommendation issued on Sep 17, 2007. TSAP was the result of a joint collaboration between the ESA-VO and SVO projects.

REPLY :

Let me quote again what I have already written above in reply to the same question (on 18 Oct 2011): "Handling theoretical spectra is related to handling the output of a modelling process or simulation (a DataObjectType in SimDM). The full description of the model (or the simulation) that produces such spectra is a much larger topic. Describing all kind of microsimulations, in a generic approach, is also a much larger topic that was decided during the Cambridge 2007 INTEROP."

By the way, I have changed "recent" to "parallel" as YOU suggested (see your previous comment during RFC).

Last but not least, does the Applications Working Group approve or disapprove the document? This is unclear to me since the discussion definitely focused on only one sentence of the first Appendix...

The rest of the comments are related to the implementation note, which is a Note, not the PR. I will let Franck answer them. I even wonder whether we have to discuss a Note on this page...

-- HerveWozniak - 15 Dec 2011

# Comments on the Implementation Note (Enrique Solano):

* Section 5 (S3)

+ "Most of the S3 protocol can be described using SimDB/DM concepts".

SimDM is a Data Model and S3 is an access protocol --> The data accessed through the S3 protocol may (or may not) be described using SimDM but clearly not the protocol itself.

REPLY (by Franck) :
I agree with this point. S3 is more related to an access protocol than to a DM. We decided to add this part to the document after a VO-Theory meeting in Strasbourg. During this meeting, S3 authors wished to understand the potential link between some aspects of S3 and SimDM. The present text is the answer to this reflection. Since S3 is closer to DAL than to DM, we may remove this section of the implementation note.

+ "It is useful to interpret S3 as a TAP service"

Meaningless. What is the use of comparing simple protocols (SIAP, SSAP, ConeSearch, S3) with more complex ones (TAP)?

In general, this section is dangerously mixing different concepts and should be profoundly rewritten.

We've made the exercise of implementing SimDM in different theoretical collections accessed using S3. According to this, we think that section 5 of the Implementation Note should be rewritten like this:

REPLY (by Franck) :
As mentioned above by Enrique, S3 is closer to an access protocol than to a DM. As a consequence there is no need to mention S3 in the implementation note of SimDM.

++++++++++++

5.- S3

S3 (http://ivoa.net/Documents/latest/S3TheoreticalData.html) is a proposal for a simple protocol to provide access to theoretical data in the framework of the Virtual Observatory. S3 stands for Simple Self-described Service. This name reflects the ability of the data server to describe itself in a simple standardized way. S3 is defined by a number of simple HTTP GET requests:

  1. Metadata: Returns the parameters defining the service.
  2. Data query: Returns information about files that exist for given (ranges of) parameters.
  3. Retrieve file: Returns a particular file (or a cutout of the file).

The result of a Metadata query is a VOTable describing the service in natural language and a list of PARAM elements, one for each parameter accepted by the Data query request. The data query finds models corresponding to particular (ranges of) values for these parameters, selected ones of which can be fully or partially downloaded using the Retrieve file request.
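As a rough illustration of the three requests described above, the following Python sketch builds the corresponding GET URLs. The base URL, the `request` key and the parameter names (`teff_min`, `teff_max`, `file_id`) are placeholders invented for this example; the S3 proposal defines its own parameter names, which are not reproduced here.

```python
from urllib.parse import urlencode

# Hypothetical labels for the three S3 request kinds (Metadata, Data query,
# Retrieve file). They are NOT taken from the S3 proposal.
S3_REQUESTS = ("metadata", "query", "retrieve")


def build_s3_url(base_url, request, **params):
    """Build an HTTP GET URL for one of the three request kinds.

    base_url -- service endpoint (placeholder in the examples below)
    request  -- one of S3_REQUESTS
    params   -- extra query parameters, e.g. parameter ranges or a file id
    """
    if request not in S3_REQUESTS:
        raise ValueError("unknown request kind: %r" % (request,))
    query = {"request": request}
    query.update(params)
    return base_url + "?" + urlencode(query)


BASE = "http://example.org/s3"  # placeholder endpoint

# 1. Ask the service to describe its own parameters.
metadata_url = build_s3_url(BASE, "metadata")
# 2. Find files matching a (hypothetical) parameter range.
query_url = build_s3_url(BASE, "query", teff_min=5000, teff_max=6000)
# 3. Retrieve one particular file found by the data query.
retrieve_url = build_s3_url(BASE, "retrieve", file_id="model42")
```

A client would typically issue the metadata request first, read the PARAM elements of the returned VOTable, and only then construct the data query with the parameter names the service actually advertises.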

5.1 SimDM implementation on theoretical collections accessed through S3

In this section we provide some examples on how to represent different S3-accessed theoretical collections using SimDM (an isochrone, a list of isochrones, a stellar structure model, a pulsation model and an asteroseismic model).

A key question to be answered is the degree of interoperability among these representations, i.e., will a VO-tool be able to handle different theoretical collections available in different services, identify common parameters and compare them? In our opinion, neither the UCDs nor the SKOS vocabulary are detailed enough to fulfil this requirement, which can only be accomplished using appropriate links to an underlying physical data model (see the examples on asteroseismology). Attached you will find:

  • An isochrone
  • A list of isochrones
  • A stellar structure model
  • A pulsation model
  • A stellar + pulsation list of models

REPLY (by Franck) :
In some of the examples, definitions of concepts are provided but are not defined by SKOS concepts as recommended by the Semantics WG. For example: http://svo.cab.inta-csic.es/theory/sisms3/concepts.php

Remember that concepts used in a SimDM implementation can refer to "general" vocabularies, such as the ones provided at http://votheory.obspm.fr, or to "specific" vocabularies developed by publishers of VO-Theory services. Specific vocabularies are aimed at allowing publishers to define precisely the concepts used in their numerical codes that may be too specific to be in the general vocabulary. Nevertheless, up to now, we have never refused to add new concepts (even specific ones) to the general vocabulary. A single vocabulary would be simpler to maintain and to use in the VO architecture.

Finally, I would also like to recall that, as mentioned in the Semantics session of the Napoli InterOp, an evolution of the SKOS concepts used in SimDM could be to use OWL or RDF schemas to describe finer concepts and link together concepts from different SKOS vocabularies.

A few typographical and editorial errors, mostly easily fixed (Mark Taylor):

  • Table of Contents: it would be nice if the Appendices were listed here as well as the main sections.
  • p.3: RFC2119 has the citation index "[0]" and does not appear in the bibliography
  • Sec 1: "This Note deals with the data model..." it's not a Note it's a standards track document
  • The associated Implementation Note is referred to a couple of times in the text. This should have a bibliography entry so that readers can find out what/where it is.
  • Footnote 8 says "once the specification becomes a Proposed Recommendation..."; it already is. But presumably this comment will be removed before REC in any case.
  • Sec 3.2: "Here bwe only..."
  • Sec 3.2: "...specification thata is..."
  • Sec 3.3: "Algorithms ore contained..."
  • Sec 4.1: "...a UTYPE existed of a word..." - should that be "consisted"?
  • Sec 5.3.1: text "Erreur ! Source du renvoi introuvable." should be corrected
  • Sec 5.3.1: document refers to itself as a WD
  • Sec 5.3.3: "In that case the only requirement would be ... and that" sentence just stops. Correct this, it's not clear how much of the preceding text is not supposed to be there.
  • Appendix C.1: "collection off Experiment"
  • Appendix C.2: "epxlicitly"
  • Appendix C.2: "But in the We propose in SimDB..."
  • Appendix D: "We then propose how also the S3 proposal could use the model..." - doesn't appear to be the case, I think this refers to text removed from an earlier version.

REPLY :
Thanks for reading the document so carefully. The corrections will be implemented in the final version. -- HerveWozniak - 15 Jan 2012

Finally:

As reported in earlier comments on this page, the Applications WG has taken issue with the stated details of how the Simulation Data Model relates to other protocols. We thank the authors for taking steps to address this, though there remain reservations about some of the current wording. However, the contended text does not affect the normative part of the standard, and Applications therefore recommends acceptance of SimDM in its current form.

-- MarkTaylor - 19 Apr 2012

Data Access Layer Working Group (Patrick Dowler, Mike Fitzpatrick)

* Introduction There are comments in the text and table 1 about the related document eventually moving to the IVOA repository; these must be part of the set of documents found on the IVOA Documents page - likely the generated page with the abstract and links to various versions should have links to all documents and clearly show which are different documents and which are alternative formats. In general, I find it acceptable to organise the content in this way and for extra documents to be part of the spec and provided in whatever form best suits the subject matter... but this is not just something convenient to do after the approval.

Q. When can this be resolved?

REPLY :
This issue is being fixed (thanks to your private advice). We are just checking the consistency between the various documents (e.g. same date) and I'll ask the document coordinator to upload all documents on the IVOA site. So tomorrow maybe? -- HerveWozniak - 15 Mar 2012

* figure 1, page 12 I find it very hard to get a grasp of a data model when no cardinality information is expressed in the diagram. Specifically, how many parameters, targets, input data, and results does an Experiment have? I can guess and make assumptions (well, I suspect I would be wrong to assume that all of them are cardinality 1), but this diagram could be much more explicit and informative using standard notations. A few characters in the diagram are much better than a sentence or two in the text. I see that some later diagrams do show cardinality (e.g. figure 7)... this applies to some extent to other figures, but since the goal of those figures is to illustrate specific uses of the model, the lack of extraneous details can be at the authors' discretion.

Q. Can you add this information at least to the primary UML diagram (fig 1)?

REPLY :
In fact the full information is in the accompanying document html/SimDM.html (presently available at http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/specification/html/SimDM.html until I upload it on the IVOA site, and linked from the .doc itself). We think it preferable not to have too much detail in the .doc, since it has been designed so as to be legible for people not fully familiar with data modelling.

-- HerveWozniak - 15 Mar 2012

* 3.8 Data access services This section is clearly looking forward to SimDAL and trying to provide some hooks, but it seems premature. The exact form and features of the service(s) are not known (that is even allowed for) so all this seems to do is say that one can find a base URL and registry ID and go look the thing up. VOSI-capabilities allows one to specify related services, but this may not be at the fine-grained granularity required in SimDM. This also seems closely related to the DataLink discussions, where one could find, for a specific "dataset" a set of different links... to different services/urls providing access to the dataset.

Q. Can the authors comment on this?

REPLY :
It is true that the web service part of the data model has little detail. This is on purpose for precisely the reasons stated, we do not know a lot about the services that a user might provide to give access to his/her simulation results. Some of these may be standardized under SimDAL, others may be custom. For either of these the main important information within the context of SimDM is which SimDM/Resource-s the service gives access to, and then the url where to find the service.

This supports the workflow where within (say) a SimDB one can find SimDM/Resource-s one is interested in. The reasons for being interested are what the simulations have produced, how etc. Having found these Experiments or Protocols or Projects one can look for SimDM/Service-s referencing them using a straightforward query. Having found such services in the SimDB one can use the baseURL to access them, or look up their registryID in a registry to get more information about the service. If such a service is a SimDAL service the user will know that a specification is followed. I guess we can state confidently that such services will exist, even if we don't know their details yet.

-- HerveWozniak - 15 Mar 2012
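The lookup workflow described in the reply above (find a Resource, find the Services referencing it, then use the baseURL or the registryID) can be sketched in Python as follows. This is only an illustration under stated assumptions: the class layout, the field names and the `is_simdal` flag are invented for the example and are not taken from the SimDM specification, which describes Service in terms of the Resources it gives access to, a base URL and a registry identifier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    """Hypothetical, simplified stand-in for a SimDM Service entry."""
    name: str
    base_url: str                 # where the service can be reached
    registry_id: str              # identifier to look up in a registry
    resource_ids: List[str] = field(default_factory=list)
    is_simdal: bool = False       # assumption: flag for SimDAL compliance


def services_for(resource_id, services):
    """Return the services that reference the given Resource identifier."""
    return [s for s in services if resource_id in s.resource_ids]


# Placeholder content of a SimDB-like repository.
catalogue = [
    Service("cutout", "http://example.org/cutout", "ivo://example/cutout",
            resource_ids=["exp-1", "exp-2"], is_simdal=True),
    Service("custom-viewer", "http://example.org/view", "ivo://example/view",
            resource_ids=["exp-2"]),
]

# Having found Experiment "exp-1", look for services giving access to it.
found = services_for("exp-1", catalogue)
access_urls = [s.base_url for s in found]
```

The point of the sketch is that the model only needs to record which Resources a Service references and where the Service lives; everything else about the service can be discovered later through the registry.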

* 4.1 utypes

Oh man... The utype style adopted here is quite different in syntax from what we have seen before: it involves packages and classes, where previous uses of the utype concept only really had classes, fields, and references (the latter two looking essentially the same). I fear that this will put another de facto utype standard in place by virtue of having been used in a specific standard without general design and thought; yes, that has happened already with more than just utypes. I understand the caveats in 5.3.3 and I understand that the authors want to complete the model, which means assigning utypes.

Q. Does that mean that TCG review is validating and accepting this utype design, syntax and set of rules?

REPLY :
I'm not sure the question is for SimDM's editors. Anyway, let me point out that we have thought about SimDM's utypes for a very long time, discussed our options with the DM WG, and presented our ideas during an INTEROP (I can try to recover which one). We even wrote in the PR that we are ready to adopt another way of specifying utypes as soon as the IVOA is able to define a standard on that point. So, we are not trying to enforce any de facto standard. -- HerveWozniak - 15 Mar 2012

* editorial: In the abstract: "protocols are in the make and" would be better expressed as "protocol(s) are being developed and" (this sentence is also in the introduction, page 7)

The reference to ADQL in 4.1 (page 31) does not include the reference number [24]... I only noticed because I saw ADQL in the reference list as I was looking for TAP.

REPLY :
Done. Thanks. -- HerveWozniak - 15 Mar 2012

Approved - PatrickDowler 2012-03-20

Data Model Working Group (Jesus Salgado, Omar Laurino)

  • The latest revision is more focused on the data model itself which, although I am not an expert on simulations, looks like a (complex) data model written by experts able to describe a (complex) problem like theoretical simulations.
  • Issues and discussions, mainly affecting data access protocols to be used to consume SimDM, are now more clearly outside the present spec.
  • Historical background was relegated to an appendix. Although this part does not look to me really needed by a possible future implementer, the authors have decided to maintain it for future reference, and its location does not interfere with the reading.

The only open technical issue that I can foresee is a possible impact on the definition of utypes (more important here than for other IVOA DMs) from the definition of utypes in the note that the group is developing. This should not delay the present DM, and a future update (if needed) on this particular point could be considered at a later stage.

I approve. -- JesusSalgado - 08 Mar 2012

Grid & Web Services Working Group (Andreas Wicenec, Andre Schaaff)

This is one of the most discussed standards I've seen and it is probably the most complex data model in the IVOA. As far as I can judge it seems to serve the needs of this community, although I'm not quite sure how biased it is towards a certain group of astrophysical simulations. I guess time will tell and we will have to see how people go about adopting this standard. I approve.

Registry Working Group (Gretchen Greene, Pierre Le Sidaner)

  • The terms used in SimDB should be "compliant" with other standards: Service -> BaseURL should be Service -> referenceURL or accessURL (e.g. coming from VOResource)

REPLY :
Let us discuss this point when we submit SimDB. -- HerveWozniak - 15 Mar 2012

  • The main difficulty found with the model is that it is generic and therefore very complex to understand. Could you please provide a simple example of the use of this model, specifically an instance combining the DM, DAL, and registry standards? An example in the document would help to clarify the combined implementations.

REPLY :
Many examples have been introduced under the form of instance diagrams which we considered much more useful for the reader than XML examples. We are sure this will allow the document to be more legible. -- HerveWozniak - 15 Mar 2012

Approved. -- GretchenGreene - 12 Apr 2012

Semantics Working Group (Sebastien Derriere, Norman Gray)

The PR has been the result of extensive discussions in order to reach a consensus, which might explain why the document still feels in some places more like a "work in progress, with room for discussion", than a normative reference. For example, the first sentence of section 3 : "The data model that we propose here...". Well, if it's a recommendation, we do more than proposing it.

But the latest version was greatly improved, with many historical discussions moved to the appendix, which improved readability.

Semantics approves the current document, pending a few typos :

  • p7 Section 5 deals on -> deals with
  • p10 Can include -> Can I include
  • p13 there is a reference to 6.3B.14, should be B.14
  • p14 aims to describe -> aims at describing
  • p14 there actually -> there are actually
  • p18 to models the internals -> to model the internals
  • p18 (missing) in/for other parts of the model.
  • p22 by a attributes -> by the attributes
  • p27, in fig 12 there is twice the property redshift in the box for Snapshot:OutputDataObjectType
  • p28 to assigns -> to assign
  • p29 the mass unit looks strange... confusion between the two numericValues in figure 14 ?
  • p29 left out for -> left out of
  • p36 This in contrast -> This is in contrast
  • p45 First paragraph is not finished ??
  • p46 Fig 20 caption : form -> from
  • p47 Fig 21 caption : the the -> the
  • p48 Third bulleted paragraph is not finished ??
  • p50 Fig 25 caption : sterotype -> stereotype

Also, could the authors please use a different font (not Arial!) for the final document (or embed fonts in the PDF)? This font is not freely available, and the default substitution font makes the printed document really ugly and hard to read !

-- SebastienDerriere - 26 Mar 2012

REPLY :
I wonder whether I already read the document... hum ! OK I will make the change everywhere (even the first sentence of Sect. 3) for the last version (how often I used the word 'last' ?). Thanks for your careful reading (and approval). Btw, Msun/h is indeed a mass unit mainly used in collisionless cosmological simulations (pure N-body) since the results are scalable with h (=H0/100, unitless). -- HerveWozniak - 11 Apr 2012

VOEvent Working Group (Matthew Graham, John Swinbank)

There are no interactions with the standards within our WG, so we trust the experts who have worked on the document and approve this document.

-- MatthewGraham - 11 Apr 2012

Data Curation & Preservation Interest Group (Alberto Accomazzi)

Knowledge Discovery in Databases Interest Group (Giuseppe Longo)

Theory Interest Group (Herve Wozniak, Franck Le Petit)

I approve (obviously). -- HerveWozniak - 11 Apr 2012

Standards and Processes Committee (Francoise Genova)



Topic attachments
| Attachment | Size | Date | Who | Comment |
| NoteImplementationSimDB_4.pdf | 2462.4 K | 2011-05-09 - 12:38 | MireilleLouys | Simulation Data model Implementation /Ivoa Note |
| isochrone-list.xml | 2.3 K | 2011-05-11 - 22:54 | CarlosRodrigoBlanco | Isochrones in a given parameter range votable |
| PR-SimulationDataModel-v.1.00-20110428.doc | 1178.0 K | 2011-05-16 - 06:08 | HerveWozniak | Simulation Data Model v1.0 / Doc file |
| PR-SimulationDataModel-v.1.00-20110428.pdf | 523.5 K | 2011-05-16 - 06:07 | HerveWozniak | Simulation data model/ Proposed Recommendation |
| PR-SimulationDataModel-v.1.00-20110906.doc | 1159.5 K | 2011-09-13 - 08:39 | HerveWozniak | Simulation data model version 2011.09.06 |
| PR-SimulationDataModel-v.1.00-20110906.pdf | 856.0 K | 2011-09-13 - 08:38 | HerveWozniak | Simulation data model version 2011.09.06 |
| NoteImplementationSimDB_1.0-20110910.pdf | 785.4 K | 2011-10-18 - 04:57 | HerveWozniak | Simulation Data model Implementation Note Update |
| PR-SimulationDataModel-v.1.00-20111019.doc | 1156.5 K | 2011-10-20 - 03:41 | HerveWozniak | SimDM 1.0 PR release Oct 19 for TCG review |
| PR-SimulationDataModel-v.1.00-20111019.pdf | 1043.8 K | 2011-10-20 - 03:41 | HerveWozniak | SimDM 1.0 PR release Oct 19 for TCG review |
| isochrone.xml | 2.6 K | 2011-11-15 - 22:50 | EnriqueSolano | isochrone |
| list_of_isochrones.xml | 3.3 K | 2011-11-15 - 22:49 | EnriqueSolano | list of isochrones |
| pulsation.xml | 3.5 K | 2011-11-15 - 22:45 | EnriqueSolano | Pulsation model |
| stellar_structure.xml | 3.4 K | 2011-11-15 - 22:46 | EnriqueSolano | Stellar structure model |
| stellar_structure_list_of_models.xml | 10.5 K | 2011-11-15 - 22:47 | EnriqueSolano | List of stellar and pulsation models |
| PR-SimulationDataModel-1.00-20120302.doc | 1615.0 K | 2012-03-07 - 14:36 | HerveWozniak | Simulation Data Model 1.0 20120302 / Doc file |
| PR-SimulationDataModel-1.00-20120302.pdf | 1541.2 K | 2012-03-07 - 14:36 | HerveWozniak | Simulation Data Model 1.0 20120302 |
Topic revision: r59 - 2012-04-19 - MarkTaylor
 