---------------------------------------------------------------------------
Thursday, January 18
Provenance DM effort - Day 1 - ProvenanceDay in Potsdam
M. Louys
---------------------------------------------------------------------------
Minutes, afternoon

Participants: Kristin Riebe, Michèle Sanguillon, Anastasia Galkin, Ole Streicher, François Bonnarel, Mathieu Servillat, Markus Nullmeier, Mireille Louys

Morning Jan 18: Get organized with the topics to discuss

#Changes to the PROV-DM document
Agreed: create 2 DAL documents
- PROV-DAL, to go quickly to REC with the DM doc
- PROV-TAP, waiting for consistency and validation of TAP 1.1

#List and content of documents to issue
- Implementation note to guide the implementers
- Discussion on DAL docs:
  * Advantage of 1 doc: visibility.
    Markus D. worries about having 2 access protocols and wishes everything to be unified under TAP. But this needs more infrastructure, more homogeneity and consensus across different projects.
  * PROV-DAL seems easier to set up: if you have TAP already implemented, it is easy to translate a PROV-DAL query into a TAP ADQL query and provide the response.
    AIP has a DB implementation. It can be turned from SQL to ADQL.
    AIP has a PROV-DAL implementation in progress. Remapping is needed to have clear TAP support. Ole ??

Who will implement a Prov-access client?
History, for example: Aladin, SVOM
- There are some dedicated access pages in the Aladin API, designed for some projects. E.g. GAIA in Aladin: a special form to access GAIA data that converts to TAP queries for the VizieR server.

How is it for providers?
- RAVE: query interface not yet TAP
- SVOM, CTA: provenance stored close to the entities --> focus on history of entities
- Inside a consortium: sharing the provenance between partners is another situation

A wider set of possibilities with PROV-TAP: you do not need to anticipate all parameter combinations.
Reminder of the W3C note for access: PROV-AQ.
Would it be the same visibility for the IVOA note?
With PROV-TAP, asking for all features on all classes is possible. For PROV-DAL you have a limited set of possible queries.

All agreed:
- W3C compatibility for visualisation should be kept.
- 4 documents planned:
    provdm            DM       REC
    provdal           DAL WG   REC
    provtap           DAL WG   REC
    implementations            NOTE

#Meetings to consider for participation
- RDA, Berlin, March 21-23
- Provenance Week, announcement by Harry Enke / W3C: a conf. in London, 2018 July 9-13. We should try to send 2 people. Deadline for abstracts March 12, paper submission March 19.
- Consider the ASTERICS meeting in Edinburgh in April for intermediate work before the next Interop. Dates not yet fixed.

#Discussion on outputs compatibility with W3C
Mathieu: PROV-DAL as W3C interface / PROV-TAP as IVOA interface. Parameter, the hadStep relation and ActivityFlow can be added.
Ole: ESO provenance info is difficult to pull out and refurbish as PROV metadata. It needs some translation and knowledge.
Mireille: how can we plan the W3C extension for our needs?
voprov (Michèle) has some strategy for this:
- you can add attributes to existing classes
- you cannot add new classes or new relations, e.g. hadStep
Kristin has experienced this by expressing hadStep through wasInformedBy.
The idea is to allow W3C *serialisation* with all the features of the IVOA PROV DM.

Why do we need W3C compatibility? Mainly for the visualisation, for the visibility of this effort, ...

CTA needs (Mathieu):
- needs PROV-TAP as part of ObsTAP
- search for datasets on provenance features ...

Currently PROV-DAL serves W3C serialisations and the IVOA serialisation.
Should we keep PROV-DAL for W3C serialisation only? No.
We want PROV-DAL to help engage people in provenance, so it must clearly be part of the IVOA effort.
It should be part of the IVOA description.
It looks like we provide a double-layer protocol:
- through PROV-DAL: visualize, query from regular scenarios
- through PROV-TAP: query on all features

Afternoon Jan 18: Handling different provenance dialects: IVOA and W3C

Ole: a proposal to use XSL to translate from one to another.
Markus: who should do this translation? The provider, before delivering one version or the other, and not the client.

What does PROV-TAP deliver in terms of serialisation format?
François: VOTable output is mandatory but you may have other formats provided.
Markus: ADQL user defined functions can be implemented in TAP. They can provide the history of an entity in just one TAP query ... The columns need to be defined.

PROV-DAL should have data model compatibility for both IVOA and W3C.
Which formats are required?
Mireille: all for PROV-DAL, even VOTable, because the user may want to re-use this info by searching the result tables only, and not by looking at the graphs.
Usage of VO tools like TOPCAT is encouraged for very large datasets.
Which model expressions are required? W3C and IVOA.
PROV-JSON seems better accepted and widespread today.
Can we translate every time to and from VOTable? Yes, if we include or refer to a JSON/VOTable mapping table.
Kristin, Michèle: see what the voprov library can do for this kind of translation.
Validation can be done through JSON schema validation. To do so we would have to define our JSON schema for the IVOA PROV in PROV-JSON.

DAL access in the ProvenanceDM document:
we agreed to move the details out into two DAL specifications, the PROV-DAL and PROV-TAP docs, but a small DAL paragraph on access is still needed to introduce the serialisation section.

Where do we stand in the IVOA recommendation process?
#should provide a valid VO-DML expression for the model
- almost done
- checked by Kristin
#reference implementations
cross validation of serialisation:
#examples of usage in various projects not directly involved in the DM elaboration
- Astro-WISE: Hugo Buddelmeijer
- Granada University: Jose Enrique Ruiz

All aspects of the model should be checked... at least in fake serialisation.
voprov as a tool which can read any prov-format ... useful for exchanging serialisation examples.

Scenarios of usage of Prov-DM:
how long does it take to generate a provenance for a light curve?
- provenance of the points
- activity: extraction, calibration, zero point calibration, etc.
- datalinks for original images
Action proposed by François: describe datasets considered for TD models. Explain the various data attached.

#discussion point: the 'rights' attribute in Entity Class
rights --> VOResource rights
M. Demleitner currently suggests changing the values and mentions licence style instead, e.g. CC0 (Creative Commons).
Still unstable and may not be generally adopted among the community.

What is modeled is more than what is implemented.
VOSI endpoints can show what your service is serving:
- /tables can allow you to distinguish between a table view and a relation table of the data base
- how the view is generated can also be documented
Services can be stacked together, e.g. provide a PROV-DAL interface on top of the TAP service serving PROV-TAP tables.
Action: look again at /tables endpoints in the DAL standards documents.
For TAP services: you can have several levels, by using the datamodel param and /capabilities.
How can you declare it in PROV-DAL (VOSI?): no way.

#the data model implementation profiles
proposed by mail by Mireille.
Activity flow: groups of activities:
- can share different activities ...
- a group has a shared property among its members: time stamp, pipeline version, etc.
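
The remark that a group of activities has a shared property among its members can be sketched in a few lines of Python. This is only an illustration, not part of the model: the record layout, the ids and the helper name group_by_shared_property are assumptions made for the example. The activity names are taken from the light curve scenario above.

```python
from collections import defaultdict

# Hypothetical activity records: ids and field names are invented for
# this sketch and are not defined by the IVOA provenance data model.
activities = [
    {"id": "act1", "name": "extraction", "pipeline_version": "1.0"},
    {"id": "act2", "name": "calibration", "pipeline_version": "1.0"},
    {"id": "act3", "name": "zero point calibration", "pipeline_version": "1.1"},
]


def group_by_shared_property(records, prop):
    """Return {property value: [ids of the activities sharing that value]}."""
    groups = defaultdict(list)
    for record in records:
        groups[record[prop]].append(record["id"])
    return dict(groups)


groups = group_by_shared_property(activities, "pipeline_version")
for version, members in groups.items():
    print(version, members)
```

The same helper could group on a time stamp instead of the pipeline version by passing another field name.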
---------------------------------------------------------------------------
Friday, January 19
Provenance DM effort - Day 2 - ProvenanceDay in Potsdam
M. Louys
---------------------------------------------------------------------------

#Entity rights
Do I have access to all entities if I have the provenance?
public / not public, but it changes through time
- define an attribute for this: public_release_date
- this is the date that the consortium decides to use for publication
- can be NULL, and in this case the data file will not be public

#was derived from, was informed by
- these relations are redundant with the used/activity/wasGeneratedBy cycle
- the relation instances can be rebuilt from the used and wasGeneratedBy relations in the data base
- represent them as a view on the main classes/tables of the data base: a different color in the UML diagram, for instance
- in the TAP schema they will have table type = view
- in Modelio? How can we express constraints to mention this?

#levels of the model
- is it a capability?
- do we need several TAP_SCHEMA versions depending on what is supported?
- should we standardize the levels? How to describe them?

Here is a suggestion of DM levels from Mireille, after the discussion in Potsdam.
#Simple Core Ac/E/Ag + relations

#Core Prov template
Core:
- Single individual Activity / Entity / Agent
- relations A/E: used, wasGeneratedBy
- relations A/Ag: wasAssociatedWith
- relations E/Ag: wasAttributedTo
nb: Activities are chained by the Entities relations
- Parameter
Description classes:
- ActivityDescription (this is the recipe applied when running an Activity)
- EntityDescription
- ParameterDescription
- UsedDescription
- wasGeneratedByDescription
Each class {C} has a double link --> & <-- to its description class {C}Description.

#WorkflowProv template
Core Prov template enriched with:
- ActivityFlow
- relation ActivityFlow/A: hadStep
- relation A/A: wasInformedBy

#DataFlow template
Core Prov template enriched with:
- Collection class (of Entities)
- relation E/Collection: isMember
- relation E/E: wasDerivedFrom

#Full template: all

I suppose we can describe each implementation with a feature matrix, containing a tick for each implemented feature.
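
A feature matrix of this kind could look like the Python sketch below. The feature names follow the template lists above (description classes left out for brevity). The two implementations, service_a and service_b, and the helpers supports and feature_matrix are hypothetical and only serve the illustration.

```python
# Feature names follow the template lists in these minutes. The two
# implementations and their feature sets are invented for illustration.
CORE = ["Activity", "Entity", "Agent", "used", "wasGeneratedBy",
        "wasAssociatedWith", "wasAttributedTo"]
WORKFLOW = ["ActivityFlow", "hadStep", "wasInformedBy"]
DATAFLOW = ["Collection", "isMember", "wasDerivedFrom"]
FEATURES = CORE + WORKFLOW + DATAFLOW


def supports(implemented, template):
    """True if every feature of the template is implemented."""
    return set(template) <= set(implemented)


def feature_matrix(implementations):
    """Return a text matrix: one row per feature, one column per implementation."""
    names = list(implementations)
    width = max(len(feature) for feature in FEATURES)
    lines = [" " * width + "  " + "  ".join(names)]
    for feature in FEATURES:
        cells = []
        for name in names:
            tick = "x" if feature in implementations[name] else "-"
            cells.append(tick.center(len(name)))
        lines.append(feature.ljust(width) + "  " + "  ".join(cells))
    return "\n".join(lines)


implementations = {
    "service_a": set(CORE),                  # Core Prov template only
    "service_b": set(CORE) | set(DATAFLOW),  # Core + DataFlow template
}
print(feature_matrix(implementations))
```

With such a matrix, a template level is just a named subset of rows: supports(implementations["service_b"], CORE + DATAFLOW) is true, while the same check against WORKFLOW is false for both hypothetical services.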