Difference: IvoaKDD (1 vs. 44)

Revision 44 2022-09-07 - YihanTao

 

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month on specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedbacks to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potentialities of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be brought under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, their own KD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |
| IVOA.InterOpOCt2017KDD | InterOpOct2017 | October 2017 | Santiago |
| IVOA.InterOpMay2018KDD | InterOpMay2018 | May 2018 | Victoria |
| IVOA.InterOpMay2019KDD | InterOpMay2019 | May 2019 | Paris |
| IVOA.InterOpMay2020KDD | InterOpMay2020 | May 2020 | Sydney virtual |
| IVOA.InterOpNov2021KDD | InterOpNov2021 | Nov 2021 | virtual |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

Several hot topics were identified in the past. They follow the priorities that emerged during the first KD-IG meeting, held at the IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: RaffaeleDAbrusco

Changed:
<
<
Vice Chair: n/a
>
>
Vice Chair: Yihan Tao
 
<--  
-->

Revision 43 2022-04-28 - RaffaeleDAbrusco

 

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month on specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedbacks to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potentialities of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be brought under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, their own KD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |
| IVOA.InterOpOCt2017KDD | InterOpOct2017 | October 2017 | Santiago |
| IVOA.InterOpMay2018KDD | InterOpMay2018 | May 2018 | Victoria |
| IVOA.InterOpMay2019KDD | InterOpMay2019 | May 2019 | Paris |
| IVOA.InterOpMay2020KDD | InterOpMay2020 | May 2020 | Sydney virtual |
| IVOA.InterOpNov2021KDD | InterOpNov2021 | Nov 2021 | virtual |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

Several hot topics were identified in the past. They follow the priorities that emerged during the first KD-IG meeting, held at the IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: RaffaeleDAbrusco

Vice Chair: n/a

Deleted:
<
<
Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao

 
<--  
-->

Revision 42 2022-04-26 - PetrSkoda

 

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month on specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedbacks to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potentialities of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be brought under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, their own KD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |
| IVOA.InterOpOCt2017KDD | InterOpOct2017 | October 2017 | Santiago |
| IVOA.InterOpMay2018KDD | InterOpMay2018 | May 2018 | Victoria |
| IVOA.InterOpMay2019KDD | InterOpMay2019 | May 2019 | Paris |
| IVOA.InterOpMay2020KDD | InterOpMay2020 | May 2020 | Sydney virtual |
Added:
>
>
| IVOA.InterOpNov2021KDD | InterOpNov2021 | Nov 2021 | virtual |
 

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

Several hot topics were identified in the past. They follow the priorities that emerged during the first KD-IG meeting, held at the IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: RaffaeleDAbrusco

Vice Chair: n/a

Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 41 2021-11-02 - RaffaeleDAbrusco

 

IVOA Knowledge Discovery Interest Group



Charter

Changed:
<
<
Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disceplins, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.
>
>
Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.
  The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month on specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedbacks to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potentialities of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, their own KD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |
| IVOA.InterOpOCt2017KDD | InterOpOct2017 | October 2017 | Santiago |
| IVOA.InterOpMay2018KDD | InterOpMay2018 | May 2018 | Victoria |
| IVOA.InterOpMay2019KDD | InterOpMay2019 | May 2019 | Paris |
| IVOA.InterOpMay2020KDD | InterOpMay2020 | May 2020 | Sydney (virtual) |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

A number of hot topics were identified in the past. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and that were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: RaffaeleDAbrusco

Vice Chair: n/a

Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao



Revision 402021-06-16 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats with a view to supporting machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Provide information to, and receive information from, the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, their own KD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |
| IVOA.InterOpOCt2017KDD | InterOpOct2017 | October 2017 | Santiago |
| IVOA.InterOpMay2018KDD | InterOpMay2018 | May 2018 | Victoria |
| IVOA.InterOpMay2019KDD | InterOpMay2019 | May 2019 | Paris |
| IVOA.InterOpMay2020KDD | InterOpMay2020 | May 2020 | Sydney (virtual) |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

A number of hot topics were identified in the past. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and that were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Changed:
<
<
Chair: Kai Lars Polsterer
>
>
Chair: RaffaeleDAbrusco
  Vice Chair: n/a

Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao



Revision 392020-05-08 - KaiLarsPolsterer

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats with a view to supporting machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Provide information to, and receive information from, the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, their own KD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |
| IVOA.InterOpOCt2017KDD | InterOpOct2017 | October 2017 | Santiago |
| IVOA.InterOpMay2018KDD | InterOpMay2018 | May 2018 | Victoria |
| IVOA.InterOpMay2019KDD | InterOpMay2019 | May 2019 | Paris |
Added:
>
>
| IVOA.InterOpMay2020KDD | InterOpMay2020 | May 2020 | Sydney (virtual) |
 

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

A number of hot topics were identified in the past. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and that were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Vice Chair: n/a

Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao



Revision 382019-05-12 - KaiLarsPolsterer

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats with a view to supporting machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Provide information to, and receive information from, the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, their own KD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
IVOA.InterOpMay2016-KDD InterOpMay2016 May 2016 Cape Town
IVOA.InterOpMay2017-KDD InterOpMay2017 May 2017 Shanghai
IVOA.InterOpOCt2017KDD InterOpOct2017 October 2017 Santiago
IVOA.InterOpMay2018KDD InterOpMay2018 May 2018 Victoria
Added:
>
>
IVOA.InterOpMay2019KDD InterOpMay2019 May 2019 Paris
 

Other Interesting Meetings for KD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, several hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Vice Chair: n/a

Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 37 2018-06-01 - KaiLarsPolsterer

 

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KD-IG requires strong and continuous interaction with the scientific community, which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-discipline nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that even an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
IVOA.InterOpMay2016-KDD InterOpMay2016 May 2016 Cape Town
IVOA.InterOpMay2017-KDD InterOpMay2017 May 2017 Shanghai
IVOA.InterOpOCt2017KDD InterOpOct2017 October 2017 Santiago
Added:
>
>
IVOA.InterOpMay2018KDD InterOpMay2018 May 2018 Victoria
 

Other Interesting Meetings for KD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, several hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Vice Chair: n/a

Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 36 2017-10-27 - KaiLarsPolsterer

 

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KD-IG requires strong and continuous interaction with the scientific community, which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-discipline nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that even an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
IVOA.InterOpMay2016-KDD InterOpMay2016 May 2016 Cape Town
IVOA.InterOpMay2017-KDD InterOpMay2017 May 2017 Shanghai
Added:
>
>
IVOA.InterOpOCt2017KDD InterOpOct2017 October 2017 Santiago
 

Other Interesting Meetings for KD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, several hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Vice Chair: n/a

Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 35 2017-05-18 - KaiLarsPolsterer

 

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points:

  1. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services.

  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KD-IG requires strong and continuous interaction with the scientific community, which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-discipline nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that even an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
IVOA.InterOpMay2016-KDD InterOpMay2016 May 2016 Cape Town
IVOA.InterOpMay2017-KDD InterOpMay2017 May 2017 Shanghai

Other Interesting Meetings for KD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, several hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Vice Chair: n/a

Task Force Members:

Changed:
<
<
RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak
>
>
RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Franck Le Petit
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak
  Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 34 2017-05-18 - KaiLarsPolsterer

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

Changed:
<
<
The activities of the KD-IG include the following items with a strong emphasis on the first two points :
>
>
The activities of the KD-IG include the following items with a strong emphasis on the first two points:
Added:
>
>
 
  1. Participating in the definition of new data preservation and exchange formats with respect to support machine learning algorithms.
Changed:
<
<
  1. Introducing uncertainties and probabilistic description to VO standards and services
>
>
  1. Introducing uncertainties and probabilistic description to VO standards and services.

 
  1. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  2. Defining requirements for implementing and adding machine learning capabilities to services.
  3. Coordinating and unifying the access to data visualization functionalities.
  4. Discussing the aspect of data provenance with respect to data used to derive/train models.
  5. Introducing proper statistical scoring and evaluation methods as services.
  6. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  7. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-discipline nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, several hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Vice Chair: n/a

Task Force Members:

Changed:
<
<
RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Kenny Lo
Kai Lars Polsterer
PetrSkoda
HerveWozniak
>
>
RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Kenny Lo
Jiri Nadvornik
Kai Lars Polsterer
PetrSkoda
HerveWozniak
  Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 33 2017-05-18 - HerveWozniak

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include the following items with a strong emphasis on the first two points :

  1. Participating in the definition of new data preservation and exchange formats with respect to support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services
  3. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  4. Defining requirements for implementing and adding machine learning capabilities to services.
  5. Coordinating and unifying the access to data visualization functionalities.
  6. Discussing the aspect of data provenance with respect to data used to derive/train models.
  7. Introducing proper statistical scoring and evaluation methods as services.
  8. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  9. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-discipline nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, several hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Vice Chair: n/a

Task Force Members:

Changed:
<
<
RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Kenny Lo
Kai Lars Polsterer
PetrSkoda
HerveWozniak
>
>
RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Kenny Lo
Kai Lars Polsterer
PetrSkoda
HerveWozniak
  Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 32 2017-05-18 - KaiLarsPolsterer

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

Changed:
<
<
The activities of the KD-IG include:
>
>
The activities of the KD-IG include the following items with a strong emphasis on the first two points :
Added:
>
>
  1. Participating in the definition of new data preservation and exchange formats with respect to support machine learning algorithms.
  2. Introducing uncertainties and probabilistic description to VO standards and services
 
  1. Presenting and collecting best practice examples of scientific data analytics in astronomy.
Deleted:
<
<
  1. Participating in the definition of new data preservation and exchange formats with respect to support machine learning algorithms.
 
  1. Defining requirements for implementing and adding machine learning capabilities to services.
  2. Coordinating and unifying the access to data visualization functionalities.
  3. Discussing the aspect of data provenance with respect to data used to derive/train models.
  4. Introducing proper statistical scoring and evaluation methods as services.
  5. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  6. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-discipline nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, several hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Changed:
<
<
Chair: Kai Lars Polsterer
>
>
Chair: Kai Lars Polsterer
 
Changed:
<
<
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Rick Ebert
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
HerveWozniak
>
>
Vice Chair: n/a
Added:
>
>
Task Force Members:

RaffaeleDAbrusco
Rick Ebert
MatthewGraham
Steve Groom
Kenny Lo
Kai Lars Polsterer
PetrSkoda
HerveWozniak

Passive Members:

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Manuel Luis Sarro Baro
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao

 
<--  
-->

Revision 31 2017-05-18 - HerveWozniak

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include:

  1. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  2. Participating in the definition of new data preservation and exchange formats with respect to support machine learning algorithms.
  3. Defining requirements for implementing and adding machine learning capabilities to services.
  4. Coordinating and unifying the access to data visualization functionalities.
  5. Discussing the aspect of data provenance with respect to data used to derive/train models.
  6. Introducing proper statistical scoring and evaluation methods as services.
  7. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  8. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Exchange feedback with the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KD-IG will illustrate the power and performance of data mining algorithms in facilitating and accelerating astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, hot topics were identified. These follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Changed:
<
<
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Rick Ebert
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
>
>
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Rick Ebert
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
HerveWozniak
 
<--  
-->

Revision 302017-05-18 - RickEbert

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include:

  1. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  2. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  3. Defining requirements for implementing and adding machine learning capabilities to services.
  4. Coordinating and unifying the access to data visualization functionalities.
  5. Discussing the aspect of data provenance with respect to data used to derive/train models.
  6. Introducing proper statistical scoring and evaluation methods as services.
  7. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  8. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Exchange feedback with the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KD-IG will illustrate the power and performance of data mining algorithms in facilitating and accelerating astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, hot topics were identified. These follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Changed:
<
<
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
>
>
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Rick Ebert
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
 
<--  
-->

Revision 292017-05-18 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.

The activities of the KD-IG include:

  1. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  2. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  3. Defining requirements for implementing and adding machine learning capabilities to services.
  4. Coordinating and unifying the access to data visualization functionalities.
  5. Discussing the aspect of data provenance with respect to data used to derive/train models.
  6. Introducing proper statistical scoring and evaluation methods as services.
  7. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  8. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Exchange feedback with the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."

<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KD-IG will illustrate the power and performance of data mining algorithms in facilitating and accelerating astronomical discovery within this data universe.

KD-IG Meetings

| IG Session | At Meeting | When | Where |
| IVOA.InterOpMay2010KDD | InterOpMay2010 | May 2010 | Victoria |
| IVOA.InterOpMay2011KDD | InterOpMay2011 | May 2011 | Naples |
| IVOA.InterOpMay2016-KDD | InterOpMay2016 | May 2016 | Cape Town |
| IVOA.InterOpMay2017-KDD | InterOpMay2017 | May 2017 | Shanghai |

Other Interesting Meetings for KD-IG

| Meeting | When | Where | Docs |
| Challenges and Methods for Massive Astronomical Data | August 2010 | CfA | Slides |

<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->

Related Topics

In the past, hot topics were identified. These follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Changed:
<
<
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
>
>
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
 
<--  
-->

Revision 282017-05-16 - KaiLarsPolsterer

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery Interest Group



Charter

Changed:
<
<
Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including not only visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements from the scientific community.
>
>
Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements of the scientific community.
  The activities of the KD-IG include:
  1. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  2. Participating in the definition of new data preservation and exchange formats that support machine learning algorithms.
  3. Defining requirements for implementing and adding machine learning capabilities to services.
  4. Coordinating and unifying the access to data visualization functionalities.
  5. Discussing the aspect of data provenance with respect to data used to derive/train models.
  6. Introducing proper statistical scoring and evaluation methods as services.
  7. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  8. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Exchange feedback with the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->

Motivations

Changed:
<
<
During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."
>
>
During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data...."
<-- 

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

-->
 
Deleted:
<
<
As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.
 Data Mining is not just another application but a rapidly evolving set of methodologies that needs to be brought under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.
Changed:
<
<
KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.
>
>
KD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.
 
Changed:
<
<
The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.
>
>
The KD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, Time Domain, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.
 
Changed:
<
<
We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.
>
>
We also wish to stress that, ultimately, the goal of the KD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.
 

KD-IG Meetings

IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
IVOA.InterOpMay2016-KDD InterOpMay2016 May 2016 Cape Town
IVOA.InterOpMay2017-KDD InterOpMay2017 May 2017 Shanghai

Other Interesting Meetings for KD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides
Changed:
<
<

Next tasks

>
>
<-- 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
-->
 
Changed:
<
<
  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
>
>

Related Topics

Deleted:
<
<
  1. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  2. Inventory of existing Data Mining models of relevant astrophysical interest.
  3. Definition of standard template data sets for Data Mining models test and debugging.
  4. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  5. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  6. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
 
Changed:
<
<

Priorities

>
>
In the past, hot topics were identified. They follow the priorities that emerged during the first KD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and were singled out by the Chair in his welcome message to the members.
Deleted:
<
<
These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.
  Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao



Revision 272017-05-15 - KaiLarsPolsterer

 
META TOPICPARENT name="WebHome"
Changed:
<
<

IVOA Knowledge Discovery in Databases

>
>

IVOA Knowledge Discovery Interest Group

 


Charter

Changed:
<
<
We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:
>
>
Knowledge discovery is the task of processing and analyzing data-sets with the aim of extracting new knowledge. This area spans widely across multiple disciplines, including visualization, remote data exploration, machine learning techniques, statistical methods, workflow orchestration, and polymorphic data access. To support the process of discovery, the KD-IG interacts closely with the other working/interest groups and feeds back requirements from the scientific community.
Deleted:
<
<
  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide to and receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.
More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.
 
Added:
>
>
The activities of the KD-IG include:
  1. Presenting and collecting best practice examples of scientific data analytics in astronomy.
  2. Participating in the definition of new data preservation and exchange formats with a view to supporting machine learning algorithms.
  3. Defining requirements for implementing and adding machine learning capabilities to services.
  4. Coordinating and unifying the access to data visualization functionalities.
  5. Discussing the aspect of data provenance with respect to data used to derive/train models.
  6. Introducing proper statistical scoring and evaluation methods as services.
  7. Contributing to the discussion on scripting and orchestrating the scientific discovery workflow.
  8. Supporting the development of dedicated knowledge discovery applications.
<-- 

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide to and receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

-->
 

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies that needs to be brought under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

Changed:
<
<

KDD-IG Meetings

>
>

KD-IG Meetings

 
IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
Changed:
<
<

Other Interesting Meetings for KDD-IG

>
>
IVOA.InterOpMay2016-KDD InterOpMay2016 May 2016 Cape Town
Added:
>
>
IVOA.InterOpMay2017-KDD InterOpMay2017 May 2017 Shanghai

Other Interesting Meetings for KD-IG

 
Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: Kai Lars Polsterer

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao



Revision 262016-01-28 - KaiLarsPolsterer

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide to and receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.
More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies that needs to be brought under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Meetings

IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples

Other Interesting Meetings for KDD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Changed:
<
<
Chair: George Djorgovski
>
>
Chair: Kai Lars Polsterer
 
Changed:
<
<
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
>
>
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Kai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
 

Revision 252015-06-18 - FabioPasian

 
META TOPICPARENT name="WebHome"
Changed:
<
<

IVOA Knowledge Discovery in Databases

>
>

IVOA Knowledge Discovery in Databases

 
Added:
>
>
 

Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

Changed:
<
<
  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide to and receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.
>
>
  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide to and receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.
Deleted:
<
<
 More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.
Deleted:
<
<
 

Motivations

Changed:
<
<
During the Strasbourg InterOp Meeting it emerged the need for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, "...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”
>
>
During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”
 
Changed:
<
<
As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.
>
>
As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.
 
Changed:
<
<
Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.
>
>
Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.
  KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.
Changed:
<
<
The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.
>
>
The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.
  We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.
Deleted:
<
<
 

KDD-IG Meetings

Changed:
<
<
IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
>
>
IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
 

Other Interesting Meetings for KDD-IG

Meeting When Where Docs
Changed:
<
<
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides
>
>
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides
 

Next tasks

Changed:
<
<
  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
>
>
  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
 

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and that were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

Changed:
<
<
  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;
>
>
  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;
 

Members

Changed:
<
<
Chair:
>
>
Chair: George Djorgovski
Deleted:
<
<
GiuseppeLongo
 
Changed:
<
<
Sheelu Abraham
>
>
Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
GiuseppeLongo
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
Deleted:
<
<
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao
 
Changed:
<
<
>
>

Deleted:
<
<

Revision 242012-06-26 - root

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Meetings

IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples

Other Interesting Meetings for KDD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and that were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: GiuseppeLongo

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


Revision 232011-05-17 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

Changed:
<
<

KDD-IG Group Meetings

>
>

KDD-IG Meetings

 
Changed:
<
<
Theory IG Session At Meeting When Where
>
>
IG Session At Meeting When Where
 
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
Changed:
<
<
>
>
IVOA.InterOpMay2011KDD InterOpMay2011 May 2011 Naples
 

Other Interesting Meetings for KDD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting, held at IVOA.InterOpMay2010KDD in Victoria, and that were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: GiuseppeLongo

Sheelu Abraham
Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao



Revision 222010-11-15 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies that needs to be brought under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which are very often far from optimal. The synergy of the different kinds of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements that a Data Mining model must meet in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: GiuseppeLongo

Added:
>
>
Sheelu Abraham
 Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
CiroDonalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 21 2010-10-13 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive information to/from the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies that needs to be imported under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which very often are far from optimal. The synergy of the different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that even an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements that a Data Mining model must meet in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: GiuseppeLongo

Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski

Changed:
<
<
Ciro Donalek
>
>
CiroDonalek
 Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
NicholasWalton
Yongheng Zhao


<--  
-->

Revision 20 2010-09-05 - NicholasWalton

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive information to/from the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies that needs to be imported under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which very often are far from optimal. The synergy of the different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that even an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements that a Data Mining model must meet in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: GiuseppeLongo

Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
Ciro Donalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs

Added:
>
>
NicholasWalton
 Yongheng Zhao


<--  
-->

Revision 19 2010-08-25 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive information to/from the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies that needs to be imported under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which very often are far from optimal. The synergy of the different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that even an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Changed:
<
<
Meeting When Where
Challenges and Methods for Massive Astronomical Data August 2010 CfA
>
>
Meeting When Where Docs
Challenges and Methods for Massive Astronomical Data August 2010 CfA Slides
 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements that a Data Mining model must meet in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Knowledge Discovery in Databases in Astronomy;
  5. Knowledge Discovery in Databases and VO standards;
  6. Specific fields of applications of KDD in astronomical research;

Members

Chair: GiuseppeLongo

Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
Ciro Donalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
Yongheng Zhao


<--  
-->

Revision 18 2010-08-25 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive information to/from the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies that needs to be imported under the VO umbrella, not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools, which very often are far from optimal. The synergy of the different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous modes of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that even an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This places strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where
Challenges and Methods for Massive Astronomical Data August 2010 CfA

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements that a Data Mining model must meet in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and were singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
Changed:
<
<
  1. A user guide for Data Mining in Astronomy;
  2. Data Mining and VO standards;
>
>
  1. A user guide for Knowledge Discovery in Databases in Astronomy;
  2. Knowledge Discovery in Databases and VO standards;
Added:
>
>
  1. Specific fields of applications of KDD in astronomical research;
 

Members

Chair: GiuseppeLongo

Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
Ciro Donalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
Yongheng Zhao



Revision 172010-08-19 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
Changed:
<
<
  1. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general userto seamlessly exploit the complex data repositories offered by the VO.
>
>
  1. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
 
  1. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  2. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  3. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where
Challenges and Methods for Massive Astronomical Data August 2010 CfA

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;

Members

Chair: GiuseppeLongo

Lauretta Auvil
NickBall
ThomasBoch
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
George Djorgovski
Ciro Donalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
OmarLaurino
Ann Lee
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
RoyWilliams
Chris Stubbs
Yongheng Zhao



Revision 162010-08-09 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general userto seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where
Challenges and Methods for Massive Astronomical Data August 2010 CfA

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;
Changed:
<
<

Members (Incomplete)

>
>

Members

  Chair: GiuseppeLongo
Added:
>
>
Lauretta Auvil
 NickBall
Changed:
<
<
Ciro Donalek
>
>
ThomasBoch
Added:
>
>
Sen Bodhisattva
KirkBorne
MaxBrescia
Robert Brunner
RaffaeleDAbrusco
Darren Davis
ReinaldoDeCarvalho
DaveDeYoung
SebastienDerriere
 George Djorgovski
Added:
>
>
Ciro Donalek
Paul Eglitis
Pepi Fabbiano
PierreFernique
Peter Freeman
Fabian Gieseke
Alyssa Goodman
MatthewGraham
Alexander Gray
Paul Green
BobHanisch
Sheth Kartik
AjitKembhavi
 OmarLaurino
Changed:
<
<
RaffaeleDAbrusco
>
>
Ann Lee
Added:
>
>
JeffLusted
AshishMahabal
BobMann
William March
Joseph Mazzarella
SabineMcConnell
Fionn Murtagh
NinanSajeethPhilip
Rebecca Nugent
PaoloPadovani
FabioPasian
Misha Pesenson
Hai Lars Polsterer
Manuel Luis Sarro Baro
PetrSkoda
RiccardoSmareglia
Antonino Staiano
 RoyWilliams
Added:
>
>
Chris Stubbs
Yongheng Zhao
 



Revision 152010-08-05 - NickBall

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general userto seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where
Challenges and Methods for Massive Astronomical Data August 2010 CfA

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;

Members (Incomplete)

Chair: GiuseppeLongo

Changed:
<
<
Nicholas Ball
>
>
NickBall
 Ciro Donalek
George Djorgovski
OmarLaurino
RaffaeleDAbrusco
RoyWilliams



Revision 142010-08-04 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general userto seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Other Interesting Meetings for KDD-IG

Meeting When Where
Challenges and Methods for Massive Astronomical Data August 2010 CfA

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;
Changed:
<
<

Members

>
>

Members (Incomplete)

  Chair: GiuseppeLongo

Nicholas Ball
Ciro Donalek
George Djorgovski
OmarLaurino
RaffaeleDAbrusco
RoyWilliams



Revision 132010-08-03 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general purpose data exploration and data mining methods which will allow the general userto seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies which needs to be imported under the VO umbrella and not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
Added:
>
>

Other Interesting Meetings for KDD-IG

Meeting When Where
Challenges and Methods for Massive Astronomical Data August 2010 CfA
 

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;

Members

Chair: GiuseppeLongo

Nicholas Ball
Ciro Donalek
George Djorgovski
OmarLaurino
RaffaeleDAbrusco
RoyWilliams



Revision 122010-07-28 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies which needs to be imported under the VO umbrella and not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;
Changed:
<
<

Members

>
>

Members

  Chair: GiuseppeLongo
Changed:
<
<
RaffaeleDAbrusco
>
>
Nicholas Ball
Added:
>
>
Ciro Donalek
George Djorgovski
 OmarLaurino
Changed:
<
<
>
>
RaffaeleDAbrusco
Added:
>
>
RoyWilliams
 



Revision 112010-07-21 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies which needs to be imported under the VO umbrella and not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD-IG Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;

Members

Added:
>
>
Chair: GiuseppeLongo
 
Changed:
<
<
>
>
RaffaeleDAbrusco
Added:
>
>
OmarLaurino
 



Revision 102010-07-16 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies which needs to be imported under the VO umbrella and not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

Changed:
<
<

KDD Group Meetings

>
>

KDD-IG Group Meetings

 
Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting held at the IVOA.InterOpMay2010KDD in Victoria and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;
Added:
>
>

Members

 
Deleted:
<
<

Members

Coming soon

 



Revision 92010-07-16 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide/receive feedback to/from the WGs in order to improve the usability of VO tools and standards.
  6. Provide/receive from the community information to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory Infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered as the “frontier” of VO enabled science since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is a rapidly evolving set of methodologies which needs to be imported under the VO umbrella and not just another application. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of different expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

KDD-IG requires a strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and inputs aiming at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require a careful orchestration and fine-tuning of standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; the transparent access to large computational facilities regardless of the computational paradigm; the automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires the possibility for an inexperienced user to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security issues and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, in the final analysis, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data and, on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for Data Mining models test and debugging.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Priorities

Changed:
<
<
These are the hot topics that will be tackled in the early stages of the IG-KDD, following the priorities that emerged during the first IG-KDD meeting, held at the IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.
>
>
These are the hot topics that will be tackled in the early stages of the KDD-IG, following the priorities that emerged during the first KDD-IG meeting, held at the IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.
  Please follow the links and edit the specific pages:
Changed:
<
<
  1. IvoaKDDDictionary|Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user-guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;
>
>
  1. Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;
 

Members

Coming soon


<--  
-->

Revision 82010-07-16 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest.
  4. Definition of standard template data sets for testing and debugging Data Mining models.
  5. Definition of standard data sets to be used as bases of knowledge for debugging and testing supervised methods.
  6. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  7. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
Added:
>
>

Priorities

These are the hot topics that will be tackled in the early stages of the IG-KDD, following the priorities that emerged during the first IG-KDD meeting, held at the IVOA.InterOpMay2010KDD in Victoria, and singled out by the Chair in his welcome message to the members.

Please follow the links and edit the specific pages:

  1. IvoaKDDDictionary|Dictionary of Data Mining terms;
  2. Census of Data Mining and Machine Learning tools and methods of astronomical interest;
  3. Template datasets for algorithm benchmarking;
  4. A user-guide for Data Mining in Astronomy;
  5. Data Mining and VO standards;
 

Members

Coming soon


<--  
-->

Revision 72010-07-16 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD Group Meetings

Theory IG Session At Meeting When Where
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
Changed:
<
<
  1. Inventory of existing Data Mining models of relevant astrophysical interest. 1.4 Definition of standard template data sets for Data Mining models test and debugging.
>
>
  1. Inventory of existing Data Mining models of relevant astrophysical interest.
Added:
>
>
  1. Definition of standard template data sets for Data Mining models test and debugging.
 
  1. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  2. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  3. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Members

Coming soon


<--  
-->

Revision 62010-07-15 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

KDD Group Meetings

Changed:
<
<
Theory IG Session At Meeting When Comments
>
>
Theory IG Session At Meeting When Where
 
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest. 1.4 Definition of standard template data sets for Data Mining models test and debugging.
  4. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  5. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  6. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Members

Coming soon


<--  
-->

Revision 52010-07-15 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Deleted:
<
<

Welcome message

 

Charter

We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:

  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.

More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.

Added:
>
>

Motivations

During the Strasbourg InterOp Meeting, the need emerged for an Interest Group on Data Mining (KDD-IG) as an indispensable step to bridge the Virtual Observatory infrastructure with the expected VO science. In fact, “...Data mining, or KDD, is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. In other words, traditional data analysis is assumption driven as a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery driven as the patterns are automatically extracted from data....”

As such, Data Mining (DM) can be considered the “frontier” of VO-enabled science, since it represents the only way to capture and reveal the scientific knowledge (patterns, trends, correlations, etc.) hidden behind the complexity of Massive Data Sets.

Data Mining is not just another application but a rapidly evolving set of methodologies which needs to be imported under the VO umbrella. As such, DM cannot be just a tool or a suite of tools offered by a group of developers to a “passive community”. Data Mining involves a large number of researchers across many domains. The astronomical community, which has only recently entered the Massive Data Sets era, makes use of just a handful of methods and tools which very often are far from optimal. The synergy of the different areas of expertise present in the IVOA makes it the ideal arena for exploring new and more modern approaches.

The KDD-IG requires strong and continuous interaction with the scientific community which, besides testing the proposed solutions, methods, and tools, will also provide feedback and input aimed at extending the scientific capabilities of the VO.

The KDD-IG will interface with many other IVOA working and interest groups: Applications, Semantics, VOEvent, Data Models, Grid & Web Services, and Resource Registry. This cross-disciplinary nature is also a primary reason to create a specific IG. Data Mining, in fact, addresses sophisticated and extreme modes of usage which require careful orchestration and fine-tuning of the standards, methods, and tools provided by the other IVOA WGs and IGs. Typical examples are the automatic extraction of bases of knowledge from VO archives using VO ontologies; transparent access to large computational facilities regardless of the computational paradigm; automated switching from asynchronous to synchronous mode of data access; and the extreme usage of workflows and advanced visualization methods. Furthermore, effective KDD requires that an inexperienced user be able to contribute, or at least seamlessly use under the VO infrastructure, his/her own KDD routines and methods. This situation puts strong requirements on security and opens new problems for ticketing and scheduling. In other words, the KDD-IG will provide feedback on the solutions implemented by the WGs and, by posing new operational problems, will stimulate the development and adoption of new solutions and standards.

We also wish to stress that, ultimately, the goal of the KDD-IG is to allow the VO to produce new scientific knowledge publishable in astronomical journals. On the one hand, its activities will help demonstrate to the community the power and necessity of federated access to the vast VO universe of data; on the other, the KDD-IG will illustrate the power and performance of data mining algorithms to facilitate and accelerate astronomical discovery within this data universe.

 

KDD Group Meetings

Theory IG Session At Meeting When Comments
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria

Next tasks

  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest. 1.4 Definition of standard template data sets for Data Mining models test and debugging.
  4. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  5. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  6. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).

Members

Coming soon


<--  
-->

Revision 42010-07-15 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Added:
>
>

Welcome message

 

Charter

Deleted:
<
<
Test test test ....
 
Changed:
<
<

Topic 1

>
>
We will develop and test scalable data mining algorithms and the accompanying new standards for VO interfaces and protocols, so that these algorithms can be discovered and used transparently within VO science workflows or in standalone data exploration applications. Therefore the activities of the KDD-IG will be:
Added:
>
>
  1. Support the definition of an ontology of the KDD tasks required by the astronomical community. This ontology will be used to define programming and documentation standards.
  2. Make an inventory of existing methods relevant for astrophysical applications (more than 100 new KDD models and methods appear every month in specialized journals).
  3. Identify reference data sets to be used for comparing, debugging and testing methods and tools.
  4. Foster the implementation, using available VO standards and methods, of general-purpose data exploration and data mining methods which will allow the general user to seamlessly exploit the complex data repositories offered by the VO.
  5. Provide feedback to, and receive feedback from, the WGs in order to improve the usability of VO tools and standards.
  6. Exchange information with the community to improve both the usability and the potential of Data Mining tools under the VO.
  7. Define and pursue specific science cases which will be used to showcase the VO capabilities to the community.
 
Added:
>
>
More important than anything else, we wish to use this IG as an arena where different groups can share experiences and plan future developments.
 
Changed:
<
<
....
>
>

KDD Group Meetings

 
Added:
>
>
Theory IG Session At Meeting When Comments
IVOA.InterOpMay2010KDD InterOpMay2010 May 2010 Victoria
 
Changed:
<
<

Topic 2

>
>

Next tasks

 
Added:
>
>
  1. Definition of a taxonomy of Data Mining models. This taxonomy will contribute to the Standard Vocabulary of the Semantics WG2.
  2. Definition of the requirements which a Data Mining model needs to match in order to be imported under the VObs standards.
  3. Inventory of existing Data Mining models of relevant astrophysical interest. 1.4 Definition of standard template data sets for Data Mining models test and debugging.
  4. Definition of standard data sets to be used as bases of knowledge for debugging and test of supervised methods.
  5. Definition of procedures to extract and validate robust bases of knowledge from the VObs data archives using the VObs ontology.
  6. Study of the scalability of Data Mining models under different computing infrastructures (definition of best benchmarks).
 
Changed:
<
<
....
>
>
Added:
>
>

Members

Coming soon

 


<--  
-->

Revision 32010-07-13 - RaffaeleDAbrusco

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases



Charter

Changed:
<
<
>
>
Test test test
 ....

Topic 1

....

Topic 2

....


<--  
-->

Revision 22010-06-22 - BrunoRino

 
META TOPICPARENT name="WebHome"
Changed:
<
<

IVOA Knowledge Discovery in Databases

>
>

IVOA Knowledge Discovery in Databases

 
Changed:
<
<
>
>

 
Added:
>
>

 

Charter

....

Topic 1

....

Topic 2

....


<--  
-->

Revision 12010-06-21 - BrunoRino

 
META TOPICPARENT name="WebHome"

IVOA Knowledge Discovery in Databases

Charter

....

Topic 1

....

Topic 2

....


<--  
-->
 