Difference: InterOpMay2023KD (1 vs. 21)

Revision 21 2023-05-09 - ChristopheArviset

 
META TOPICPARENT name="InterOpNov2021"
 

Knowledge Discovery

Time: Tuesday May 09 11:00 CEST

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Changed:
<
<
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
>
>
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
 
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. .pdf
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
André Schaaff AI in querying astronomical data services 5' This short presentation is mainly set in the context of querying astronomical data services, as we have experienced it through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is evaluating how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. By examining its responses within a multi-document context (ten distilled documents), we find that the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA. pdf
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="ChatGPTIVOA.pdf" attr="" comment="" date="1683613514" name="ChatGPTIVOA.pdf" path="ChatGPTIVOA.pdf" size="190569" user="AdrianDamian" version="1"
META FILEATTACHMENT attachment="Foundation_models_for_Astronomy-Yihan_Tao.pdf" attr="" comment="" date="1683615506" name="Foundation_models_for_Astronomy-Yihan_Tao.pdf" path="Foundation_models_for_Astronomy-Yihan_Tao.pdf" size="378343" user="YihanTao" version="1"
META FILEATTACHMENT attachment="KDIG_IC_pdf.pdf" attr="" comment="" date="1683618659" name="KDIG_IC_pdf.pdf" path="KDIG_IC_pdf.pdf" size="2350052" user="RaffaeleDAbrusco" version="2"
META FILEATTACHMENT attachment="martinezgalarza_ivoa_interop_2023.pdf" attr="" comment="" date="1683618576" name="martinezgalarza_ivoa_interop_2023.pdf" path="martinezgalarza_ivoa_interop_2023.pdf" size="8135955" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_IVOAInterOp_Bologna_2023.pdf" attr="" comment="" date="1683619114" name="slides_session_IVOAInterOp_Bologna_2023.pdf" path="slides_session_IVOAInterOp_Bologna_2023.pdf" size="114543" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" attr="" comment="" date="1683619728" name="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" path="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" size="9154855" user="AndreSchaaff" version="1"
Added:
>
>
META FILEATTACHMENT attachment="20230509-Kruk-IVOABologna-Exploring_astronomy_data_archives_at_scale_using_deep_learning_and_crowdsourcing.pdf" attr="" comment="" date="1683635293" name="20230509-Kruk-IVOABologna-Exploring_astronomy_data_archives_at_scale_using_deep_learning_and_crowdsourcing.pdf" path="20230509-Kruk-IVOABologna-Exploring_astronomy_data_archives_at_scale_using_deep_learning_and_crowdsourcing.pdf" size="6668868" user="ChristopheArviset" version="1"
 

Revision 20 2023-05-09 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"
Changed:
<
<
ChatGPTIVOA.pdf
>
>
 

Knowledge Discovery

Time: Tuesday May 09 11:00 CEST

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. .pdf
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
Changed:
<
<
André Schaaff AI in querying astronomical data services 5' This short presentation is mainly set in the context of querying astronomical data services, as we have experienced it through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is evaluating how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
>
>
André Schaaff AI in querying astronomical data services 5' This short presentation is mainly set in the context of querying astronomical data services, as we have experienced it through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is evaluating how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
 
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. By examining its responses within a multi-document context (ten distilled documents), we find that the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA. pdf
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="ChatGPTIVOA.pdf" attr="" comment="" date="1683613514" name="ChatGPTIVOA.pdf" path="ChatGPTIVOA.pdf" size="190569" user="AdrianDamian" version="1"
META FILEATTACHMENT attachment="Foundation_models_for_Astronomy-Yihan_Tao.pdf" attr="" comment="" date="1683615506" name="Foundation_models_for_Astronomy-Yihan_Tao.pdf" path="Foundation_models_for_Astronomy-Yihan_Tao.pdf" size="378343" user="YihanTao" version="1"
META FILEATTACHMENT attachment="KDIG_IC_pdf.pdf" attr="" comment="" date="1683618659" name="KDIG_IC_pdf.pdf" path="KDIG_IC_pdf.pdf" size="2350052" user="RaffaeleDAbrusco" version="2"
META FILEATTACHMENT attachment="martinezgalarza_ivoa_interop_2023.pdf" attr="" comment="" date="1683618576" name="martinezgalarza_ivoa_interop_2023.pdf" path="martinezgalarza_ivoa_interop_2023.pdf" size="8135955" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_IVOAInterOp_Bologna_2023.pdf" attr="" comment="" date="1683619114" name="slides_session_IVOAInterOp_Bologna_2023.pdf" path="slides_session_IVOAInterOp_Bologna_2023.pdf" size="114543" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" attr="" comment="" date="1683619728" name="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" path="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" size="9154855" user="AndreSchaaff" version="1"

Revision 19 2023-05-09 - AndreSchaaff

 
META TOPICPARENT name="InterOpNov2021"
ChatGPTIVOA.pdf

Knowledge Discovery

Time: Tuesday May 09 11:00 CEST

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. .pdf
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
André Schaaff AI in querying astronomical data services 5' This short presentation is mainly set in the context of querying astronomical data services, as we have experienced it through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is evaluating how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. By examining its responses within a multi-document context (ten distilled documents), we find that the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA. pdf
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link
Added:
>
>
 
META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="ChatGPTIVOA.pdf" attr="" comment="" date="1683613514" name="ChatGPTIVOA.pdf" path="ChatGPTIVOA.pdf" size="190569" user="AdrianDamian" version="1"
META FILEATTACHMENT attachment="Foundation_models_for_Astronomy-Yihan_Tao.pdf" attr="" comment="" date="1683615506" name="Foundation_models_for_Astronomy-Yihan_Tao.pdf" path="Foundation_models_for_Astronomy-Yihan_Tao.pdf" size="378343" user="YihanTao" version="1"
META FILEATTACHMENT attachment="KDIG_IC_pdf.pdf" attr="" comment="" date="1683618659" name="KDIG_IC_pdf.pdf" path="KDIG_IC_pdf.pdf" size="2350052" user="RaffaeleDAbrusco" version="2"
META FILEATTACHMENT attachment="martinezgalarza_ivoa_interop_2023.pdf" attr="" comment="" date="1683618576" name="martinezgalarza_ivoa_interop_2023.pdf" path="martinezgalarza_ivoa_interop_2023.pdf" size="8135955" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_IVOAInterOp_Bologna_2023.pdf" attr="" comment="" date="1683619114" name="slides_session_IVOAInterOp_Bologna_2023.pdf" path="slides_session_IVOAInterOp_Bologna_2023.pdf" size="114543" user="RaffaeleDAbrusco" version="1"
Added:
>
>
META FILEATTACHMENT attachment="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" attr="" comment="" date="1683619728" name="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" path="IVOA-Bologna-2023-KDIG-SchaaffA-final.pdf" size="9154855" user="AndreSchaaff" version="1"
 

Revision 18 2023-05-09 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"
Added:
>
>
ChatGPTIVOA.pdf
 

Knowledge Discovery

Time: Tuesday May 09 11:00 CEST

Speaker Title Time Abstract Material
Changed:
<
<
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
>
>
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
 
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Changed:
<
<
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
>
>
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. .pdf
 
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
André Schaaff AI in querying astronomical data services 5' This short presentation is mainly set in the context of querying astronomical data services, as we have experienced it through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is evaluating how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Changed:
<
<
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. By examining its responses within a multi-document context (ten distilled documents), we find that the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
>
>
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. By examining its responses within a multi-document context (ten distilled documents), we find that the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
 
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA. pdf
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="ChatGPTIVOA.pdf" attr="" comment="" date="1683613514" name="ChatGPTIVOA.pdf" path="ChatGPTIVOA.pdf" size="190569" user="AdrianDamian" version="1"
META FILEATTACHMENT attachment="Foundation_models_for_Astronomy-Yihan_Tao.pdf" attr="" comment="" date="1683615506" name="Foundation_models_for_Astronomy-Yihan_Tao.pdf" path="Foundation_models_for_Astronomy-Yihan_Tao.pdf" size="378343" user="YihanTao" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="KDIG_IC_pdf.pdf" attr="" comment="" date="1683618223" name="KDIG_IC_pdf.pdf" path="KDIG_IC_pdf.pdf" size="2350052" user="YihanTao" version="1"
>
>
META FILEATTACHMENT attachment="KDIG_IC_pdf.pdf" attr="" comment="" date="1683618659" name="KDIG_IC_pdf.pdf" path="KDIG_IC_pdf.pdf" size="2350052" user="RaffaeleDAbrusco" version="2"
Added:
>
>
META FILEATTACHMENT attachment="martinezgalarza_ivoa_interop_2023.pdf" attr="" comment="" date="1683618576" name="martinezgalarza_ivoa_interop_2023.pdf" path="martinezgalarza_ivoa_interop_2023.pdf" size="8135955" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_IVOAInterOp_Bologna_2023.pdf" attr="" comment="" date="1683619114" name="slides_session_IVOAInterOp_Bologna_2023.pdf" path="slides_session_IVOAInterOp_Bologna_2023.pdf" size="114543" user="RaffaeleDAbrusco" version="1"
 

Revision 17 2023-05-09 - YihanTao

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 CEST

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
Changed:
<
<
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
>
>
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
 
André Schaaff AI in querying astronomical data services 5' This short presentation focuses mainly on querying astronomical data services, as we have experienced through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is to evaluate how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. Examined within a multi-document context (ten distilled documents), the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA. pdf
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="ChatGPTIVOA.pdf" attr="" comment="" date="1683613514" name="ChatGPTIVOA.pdf" path="ChatGPTIVOA.pdf" size="190569" user="AdrianDamian" version="1"
Added:
>
>
META FILEATTACHMENT attachment="Foundation_models_for_Astronomy-Yihan_Tao.pdf" attr="" comment="" date="1683615506" name="Foundation_models_for_Astronomy-Yihan_Tao.pdf" path="Foundation_models_for_Astronomy-Yihan_Tao.pdf" size="378343" user="YihanTao" version="1"
META FILEATTACHMENT attachment="KDIG_IC_pdf.pdf" attr="" comment="" date="1683618223" name="KDIG_IC_pdf.pdf" path="KDIG_IC_pdf.pdf" size="2350052" user="YihanTao" version="1"
 

Revision 16 2023-05-09 - AdrianDamian

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 CEST

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
André Schaaff AI in querying astronomical data services 5' This short presentation focuses mainly on querying astronomical data services, as we have experienced through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is to evaluate how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. Examined within a multi-document context (ten distilled documents), the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Changed:
<
<
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA.  
>
>
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA. pdf
 
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"
Added:
>
>
META FILEATTACHMENT attachment="ChatGPTIVOA.pdf" attr="" comment="" date="1683613514" name="ChatGPTIVOA.pdf" path="ChatGPTIVOA.pdf" size="190569" user="AdrianDamian" version="1"
 

Revision 15 2023-05-07 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 CEST

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
André Schaaff AI in querying astronomical data services 5' This short presentation focuses mainly on querying astronomical data services, as we have experienced through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is to evaluate how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. Examined within a multi-document context (ten distilled documents), the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Changed:
<
<
Adrian Damian   5'    
>
>
Adrian Damian Discover IVOA with ChatGPT 5' Discovering the International Virtual Observatory Alliance (IVOA) with ChatGPT is a seamless and insightful experience. Powered by its advanced natural language processing capabilities and extensive knowledge, ChatGPT can provide comprehensive guidance on exploring IVOA. By interacting with ChatGPT, users can ask questions, seek explanations, and receive step-by-step assistance in utilizing IVOA services. From understanding the standards and protocols to connecting with the Table Access Protocol (TAP) service, ChatGPT offers intuitive explanations and practical examples. With its ability to provide tailored responses and address individual queries, ChatGPT serves as a valuable companion in unraveling the world of IVOA.  
 
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 14 2023-05-07 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Changed:
<
<
Time: Tuesday May 09 11:00 UTC

>
>
Time: Tuesday May 09 11:00 CEST

 
Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Added:
>
>
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
 
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
André Schaaff AI in querying astronomical data services 5' This short presentation focuses mainly on querying astronomical data services, as we have experienced through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is to evaluate how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Deleted:
<
<
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
 
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. Examined within a multi-document context (ten distilled documents), the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Adrian Damian   5'    
Panel + audience Discussion - panel comprising speakers 25'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 13 2023-05-07 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 UTC

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
André Schaaff AI in querying astronomical data services 5' This short presentation focuses mainly on querying astronomical data services, as we have experienced through a Rasa-based chatbot for several years. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment is to evaluate how to add "astronomical" skills to Alexa, OK Google, Siri, etc. pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. Examined within a multi-document context (ten distilled documents), the in-context model excels at providing detailed answers contextualised by related research findings. Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Changed:
<
<
Panel + audience Discussion - panel comprising speakers 30'   pdf
>
>
Adrian Damian   5'    
Added:
>
>
Panel + audience Discussion - panel comprising speakers 25'   pdf
 Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 12 2023-05-07 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 UTC

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
Changed:
<
<
André Schaaff   5'   pdf
>
>
André Schaaff AI in querying astronomical data services 5' This short presentation draws on several years of experience querying astronomical data services through a Rasa-based chatbot. It is now possible to integrate ChatGPT into a Rasa chatbot, an interesting path to explore. In addition, another ongoing experiment evaluates how to add "astronomical" skills to voice assistants such as Alexa, OK Google and Siri. pdf
 
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. The in-context model excels at providing detailed answers contextualised by related research findings by examining its responses within a multi-document context (ten distilled documents). Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Panel + audience Discussion - panel comprising speakers 30'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 11 2023-05-03 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 UTC

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Changed:
<
<
Yihan Tao Foundation models for Astronomy 5'   pdf
>
>
Yihan Tao Foundation models for Astronomy 5' While ChatGPT provides powerful conversational AI services utilizing large language models (LLMs), this talk focuses on how we can leverage the advances in AI technology to help astronomy research. "Foundation models" are models trained on large, broad data and can be adapted to a wide range of downstream tasks. LLMs are a special case of foundation models trained on language data for natural language processing tasks. This talk will introduce the concept of foundation models and some existing works that use astronomical data (such as light curves and images) to build foundation models for astronomy tasks. The potential of foundation models for advancing astronomy research will also be discussed. pdf
 
André Schaaff   5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. The in-context model excels at providing detailed answers contextualised by related research findings by examining its responses within a multi-document context (ten distilled documents). Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Panel + audience Discussion - panel comprising speakers 30'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 10 2023-05-01 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 UTC

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Yihan Tao Foundation models for Astronomy 5'   pdf
André Schaaff   5'   pdf
Changed:
<
<
Rafael Galarza-Martinez Intro to Transformers 10'   pdf
>
>
Rafael Galarza-Martinez Intro to Transformers 10' Transformers are the type of machine learning algorithm behind the highly successful generative pretrained transformer models that allow for tools such as ChatGPT. In this talk I give a very general introduction to transformers, present their architecture, describe their advantages over other algorithms used in language processing, such as recurrent neural networks (RNNs), and focus on how their self-attention module enables the most complicated language tasks. I will also present a small example of how Transformers are being used in the analysis of astrophysical data. pdf
 
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of OpenAI's large language model GPT-4 for meaningful engagement with astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. The in-context model excels at providing detailed answers contextualised by related research findings by examining its responses within a multi-document context (ten distilled documents). Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Panel + audience Discussion - panel comprising speakers 30'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 9 2023-04-30 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery

Time: Tuesday May 09 11:00 UTC

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Yihan Tao Foundation models for Astronomy 5'   pdf
Added:
>
>
André Schaaff   5'   pdf
 
Rafael Galarza-Martinez Intro to Transformers 10'   pdf
Changed:
<
<
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of the OpenAI's large language model GPT-4 for meaningful engagement with Astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. The in-context model excels at providing detailed answers contextualised by related research findings by examining its responses within a multi-document context (ten distilled documents). Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
>
>
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of the OpenAI 's large language model GPT-4 for meaningful engagement with Astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. The in-context model excels at providing detailed answers contextualised by related research findings by examining its responses within a multi-document context (ten distilled documents). Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
 
Panel + audience Discussion - panel comprising speakers 30'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 8 2023-04-26 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"
Changed:
<
<

Knowledge Discovery 1

>
>

Knowledge Discovery

 
Changed:
<
<
Time: Wednesday Nov 03 22:00 UTC

Raffaele D'Abrusco

Introduction

5' pdf

Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not fully matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have recently emerged in data science: datasheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). Just as each component in the electronics industry comes with a datasheet that describes its operating characteristics, test results, recommended use, etc., we recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence call for added caution when used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' pdf
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars that are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' pdf
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods, including ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain expertise with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' pdf

Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

>
>
Time: Tuesday May 09 11:00 UTC

Speaker Title Time Abstract Material
Raffaele D'Abrusco Greetings and Introduction 10'   pdf
Sandor Kruk Exploring astronomy data archives at large scales using deep learning and crowdsourcing 15'+5' The vast amount of data in astronomy archives presents an opportunity for new discoveries. Deep learning combined with crowdsourcing provides an efficient way to explore this data using the intuition of the human brain and the processing power of machines. In the Hubble Asteroid Hunter project, we used a deep learning algorithm on Google Cloud, trained on volunteer classifications from the asteroidhunter.org Zooniverse project, to search two decades of Hubble Space Telescope (HST) observations from the ESA HST archives for objects not targeted by the Hubble observations. The project, which was set up as a collaboration between Zooniverse, ESAC Science Data Center and engineers at Google, led to the discovery of 1700 asteroids (Kruk et al. 2022), including 1031 previously unknown ones (Garcia Martin et al., in prep.), and 198 new strong gravitational lenses (Garvin et al. 2022), and quantified the impact of artificial satellites on HST observations (Kruk et al. 2023). In this talk, we will present the results of this project and highlight the benefits of scientifically exploiting the vast amounts of data available in astronomy data archives using novel techniques. pdf
Raffaele/Yihan Introduction to mini-session on generative AI and language models 5'   pdf
Yihan Tao Foundation models for Astronomy 5'   pdf
Rafael Galarza-Martinez Intro to Transformers 10'   pdf
Ioana Ciucă Galactic ChitChat: Using Large Language Models to Engage with Astronomy Literature 5' We showcase the capacity of the OpenAI's large language model GPT-4 for meaningful engagement with Astronomy papers using in-context prompting. We employ a distillation technique to optimise efficiency, reducing the input paper size by 50% while preserving paragraph structure and semantic integrity. The in-context model excels at providing detailed answers contextualised by related research findings by examining its responses within a multi-document context (ten distilled documents). Our investigation highlights the potential of foundation models for the astronomical community. For example, they can help researchers gain insights from astronomical literature, such as validating new scientific hypotheses or proposing novel ideas. pdf
Added:
>
>
Panel + audience Discussion - panel comprising speakers 30'   pdf
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link
 
META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"

Revision 7 2021-11-04 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery 1

Time: Wednesday Nov 03 22:00 UTC

Changed:
<
<

Raffaele D'Abrusco

Introduction

5' pdf

Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not fully matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have recently emerged in data science: datasheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). Just as each component in the electronics industry comes with a datasheet that describes its operating characteristics, test results, recommended use, etc., we recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence call for added caution when used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' pdf
>
>

Raffaele D'Abrusco

Introduction

5' pdf

Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not fully matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have recently emerged in data science: datasheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). Just as each component in the electronics industry comes with a datasheet that describes its operating characteristics, test results, recommended use, etc., we recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence call for added caution when used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' pdf
 
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars that are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' pdf
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods, including ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain expertise with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' pdf

Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
Added:
>
>
META FILEATTACHMENT attachment="slides_session_KDIG_IVOA_2021FallInterOp.pdf" attr="" comment="" date="1636049545" name="slides_session_KDIG_IVOA_2021FallInterOp.pdf" path="slides_session_KDIG_IVOA_2021FallInterOp.pdf" size="197990" user="RaffaeleDAbrusco" version="1"
 

Revision 6 - 2021-11-03 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery 1

Time: Wednesday Nov 03 22:00 UTC

Raffaele D'Abrusco

Introduction

5' pdf
Changed:
<
<

Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' pdf
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check the suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars which are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' pdf
>
>

Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' pdf
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check the suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars which are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' pdf
 
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods that include ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain knowledge with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' pdf

Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
Added:
>
>
META FILEATTACHMENT attachment="skoda-bayesian-redshift.pdf" attr="" comment="" date="1635970467" name="skoda-bayesian-redshift.pdf" path="skoda-bayesian-redshift.pdf" size="1855828" user="RaffaeleDAbrusco" version="1"
META FILEATTACHMENT attachment="Mahabal_IVOA_20211103.pdf" attr="" comment="" date="1635970664" name="Mahabal_IVOA_20211103.pdf" path="Mahabal_IVOA_20211103.pdf" size="60610" user="RaffaeleDAbrusco" version="1"
 

Revision 5 - 2021-11-03 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery 1

Time: Wednesday Nov 03 22:00 UTC

Changed:
<
<
Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' TBD
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check the suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars which are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' TBD
>
>

Raffaele D'Abrusco

Introduction

5' pdf

Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' pdf
Added:
>
>
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check the suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars which are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' pdf
 
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods that include ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain knowledge with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' pdf

Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link

META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"

Revision 4 - 2021-11-03 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery 1

Time: Wednesday Nov 03 22:00 UTC

Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' TBD
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check the suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars which are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' TBD
Changed:
<
<
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods that include ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain knowledge with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' TBD
>
>
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods that include ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain knowledge with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' pdf
  Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link
Added:
>
>
META FILEATTACHMENT attachment="KD-IG_anomalies.pdf" attr="" comment="" date="1635945677" name="KD-IG_anomalies.pdf" path="KD-IG_anomalies.pdf" size="35575806" user="RaffaeleDAbrusco" version="1"
 

Revision 3 - 2021-10-27 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery 1

Changed:
<
<
Time: Wednesday Nov 03 22:00 UTC
>
>
Time: Wednesday Nov 03 22:00 UTC
 
Changed:
<
<
Speaker(s) Title and Abstract Time Material
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods that include ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain knowledge with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' TBD
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

7' TBD
>
>
Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).

7' TBD
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

Bayesian deep learning is a relatively new approach that is starting to enter astronomy. Unlike the majority of current methods, it provides the uncertainty of its predictions, so we can visually check the suspicious cases with high uncertainty. We demonstrate this in an experiment with spectroscopic redshift prediction from SDSS quasar catalogues. This allowed us to find a number of quasars which are probably normal stars with a wrong redshift estimate from the SDSS pipeline.

7' TBD
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods that include ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain knowledge with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' TBD
Deleted:
<
<
Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).


7' TBD
 
Changed:
<
<
Moderator: TBD, Notetaker: TBD, Etherpad link
>
>
Moderator: Raffaele D'Abrusco, Notetaker: TBD, Etherpad link
 

Revision 2 - 2021-10-26 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery 1

Time: Wednesday Nov 03 22:00 UTC

Speaker(s) Title and Abstract Time Material
Changed:
<
<
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective
at identifying anomalous objects in large astronomical datasets. So
far, that has meant "finding objects in sparsely populated regions of
a multidimensional feature space". This is done using a number of
methods that include ensemble methods such as random forest
searches and, more recently, generative models that identify anomalies
as those objects that are more difficult for the trained model to
reconstruct. This has produced huge lists of anomalies in diverse datasets
that include SDSS galaxy spectra, Kepler and TESS light curves, and
X-ray catalogs. Yet most of those anomalies are not followed up,
because of a cultural difficulty for scientists in interpreting
multi-dimensional scatter plots that have no labels on their axes. We
argue that such a cultural barrier can be overcome with novel ways to
combine domain knowledge with data visualization, or even
by incorporating domain knowledge directly into the anomaly detection
algorithms. We would like to discuss ways in which VO tools can help
in the identification of anomalies that represent true astronomical
discoveries, by harvesting the currently publicly available catalogs
of anomalies.

7' TBD
>
>
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective at identifying anomalous objects in large astronomical datasets. So far, that has meant "finding objects in sparsely populated regions of a multidimensional feature space". This is done using a number of methods that include ensemble methods such as random forest searches and, more recently, generative models that identify anomalies as those objects that are more difficult for the trained model to reconstruct. This has produced huge lists of anomalies in diverse datasets that include SDSS galaxy spectra, Kepler and TESS light curves, and X-ray catalogs. Yet most of those anomalies are not followed up, because of a cultural difficulty for scientists in interpreting multi-dimensional scatter plots that have no labels on their axes. We argue that such a cultural barrier can be overcome with novel ways to combine domain knowledge with data visualization, or even by incorporating domain knowledge directly into the anomaly detection algorithms. We would like to discuss ways in which VO tools can help in the identification of anomalies that represent true astronomical discoveries, by harvesting the currently publicly available catalogs of anomalies.

7' TBD
 
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

7' TBD
Changed:
<
<
Ashish Mahabal

7' TBD
>
>
Ashish Mahabal

Data Sheets and Model Cards

Astronomy datasets have been growing, and so have the attempts to use them with a variety of machine learning techniques. While we would like to use all data, data fusion for diverse, uneven, or not-fully-matched datasets can be a challenge. Creating machine learning and artificial intelligence models for such datasets, and validating them afterwards, can be challenging owing to the lack of large labeled training datasets. To address this, two related concepts have emerged recently in data science: Data Sheets for datasets (Gebru et al., arXiv:1803.09010) and model cards for models (Mitchell et al., arXiv:1810.03993). This is just as each component in the electronics industry comes with a datasheet that describes operating characteristics, test results, recommended use, etc. We recommend that uniform and standardized datasheets advertising similar meta-properties be created for each astronomy dataset, stating not just what “is” but also where each dataset could go, much like Lego blocks. This will enable data fusion and also thwart misguided use of datasets. Similarly, the models that we build will carry not just the usual provenance but also explicit characteristics displaying known biases, and hence added caution when being used in certain ways. While this trend started in social fields where bias is explicit, it has been successfully applied in the Planetary Data System (PDS) setup for identifying key descriptors in an equally diverse dataspace (https://pds.nasa.gov/datastandards/documents/im/v1/index_1G00.html#10.31%C2%A0%C2%A0class_pds_observation_area).


7' TBD
  Moderator: TBD, Notetaker: TBD, Etherpad link

Revision 1 - 2021-10-25 - RaffaeleDAbrusco

 
META TOPICPARENT name="InterOpNov2021"

Knowledge Discovery 1

Time: Wednesday Nov 03 22:00 UTC

Speaker(s) Title and Abstract Time Material
Rafael Martinez Galarza

Harvesting outliers: data barriers to turn anomalies into discoveries

Over the last few years, astronomers have become increasingly effective
at identifying anomalous objects in large astronomical datasets. So
far, that has meant "finding objects in sparsely populated regions of
a multidimensional feature space". This is done using a number of
methods that include ensemble methods such as random forest
searches and, more recently, generative models that identify anomalies
as those objects that are more difficult for the trained model to
reconstruct. This has produced huge lists of anomalies in diverse datasets
that include SDSS galaxy spectra, Kepler and TESS light curves, and
X-ray catalogs. Yet most of those anomalies are not followed up,
because of a cultural difficulty for scientists in interpreting
multi-dimensional scatter plots that have no labels on their axes. We
argue that such a cultural barrier can be overcome with novel ways to
combine domain knowledge with data visualization, or even
by incorporating domain knowledge directly into the anomaly detection
algorithms. We would like to discuss ways in which VO tools can help
in the identification of anomalies that represent true astronomical
discoveries, by harvesting the currently publicly available catalogs
of anomalies.

7' TBD
Petr Skoda

SDSS redshift prediction based on Bayesian Deep Learning

7' TBD
Ashish Mahabal

7' TBD

Moderator: TBD, Notetaker: TBD, Etherpad link

 