Speaker | Title | Time | Material | Abstract |
---|---|---|---|---|
Andre Schaaff | NLP-chatbot R&D at CDS | 10'+5' | | Over the past years, the CDS has carried out long-term R&D work on Natural Language Processing applied to the querying of astronomical data services. The motivation was to enable new ways of interaction, especially a chatbot, as an alternative to the traditional forms, with the aim of returning query results that satisfy professional astronomers. The Virtual Observatory (VO) brought us standards like TAP, UCDs, ..., implemented in the CDS services, which help us query our services and open the door to querying the whole VO. We will give a quick reminder and status of this chatbot work. In 2023 we started to explore how to improve it with the OpenAI API. We are now branching out from this initial work to study how to apply it to improving our services, in a wider use of AI. We will give a first overview of this new R&D study. |
Sebastien Trujillo | ‘Spherinator + HiPSter: from the known unknowns to the unknown unknowns’ | 10'+5' | | Current applications of machine learning to astrophysics focus on teaching machines to perform domain-expert tasks accurately and efficiently across enormous datasets. Although essential in the big data era, this approach is limited by our own intuitions and expectations, and provides at most only answers to the ‘known unknowns’. To address this, we are developing a new conceptual framework and software tools to help astronomers maximize scientific breakthroughs by letting the machine learn unbiased interpretable representations of complex data ranging from observational surveys to simulations. Our tools automatically learn low-dimensional representations of complex objects such as galaxies in multimodal data (e.g. images, spectra, datacubes, simulated point clouds, etc.), and provide interactive explorative access to arbitrarily large datasets using a simple graphical interface. Our framework is designed to be interpretable, work seamlessly across datasets regardless of their origin, and provide a path towards discovering the ‘unknown unknowns’. |
Giuseppe Riccio | | 10'+5' | | |
John Abela | The Computational Evolution of Human Intelligence in AI | 10'+5' | | |
Sara Shishehchi | Leveraging Large Language Model (LLM)-based Agents with Multiple Tool Integration for Enhanced Search in the Canadian Astronomy Data Centre | | | Searching for data, including images, using the advanced search tool on the Canadian Astronomy Data Centre (CADC) website can be difficult for users, as it requires knowledge of ADQL and involves multiple steps to narrow and refine search queries. The goal of this project is to leverage Large Language Models (LLMs) and autonomous agents to create a chatbot that assists users in searching for images in the CADC database using natural language. Our LLM-based agent accepts queries in English, converts them to ADQL code, and returns the results after executing the query against the database. The system is designed to handle common user errors, such as spelling mistakes, incorrect column names, and incorrect values. In such cases, the chatbot suggests a shortlist of similar but correct values that the user might have intended, and the user's feedback is then used to retrieve the correct content. This robustness was achieved by incorporating Retrieval-Augmented Generation (RAG) and semantic search tools, which verify query components with the user before execution and test them against the database. To evaluate the performance of our system, we created a dataset of questions across four categories: standard questions, spelling errors, incorrect columns, and incorrect values. The system reaches 80-90% accuracy on these benchmarks, a significant improvement over existing systems built using OpenAI’s custom GPT, which achieved less than 20% accuracy on the same tests. Our solution streamlines the search process for CADC users, making data retrieval more efficient and accessible. |

Panel: Andre Schaaff, Sebastien Trujillo, Giuseppe Riccio, John Abela, Chenzhou Cui

Moderators: Yihan Tao, Sara Bertocco, Jesus Salgado

Discussion on the use of AI in astronomy and its impact on IVOA standards