Scientists articulate new data standards for AI models


Scientific visualization of a Bragg diffraction peak in a 15×15 pixel patch of an undeformed bi-crystal gold sample. Height represents photon counts. This data was collected at the Advanced Photon Source and processed on the ThetaGPU supercomputer. Credit: Argonne National Laboratory / Eliu Huerta.
Aspiring bakers are often called upon to adapt award-winning recipes to different kitchen setups. For instance, someone might use a whisk instead of an electric mixer to make prize-winning chocolate chip cookies.
The ability to reproduce a recipe in different settings and with different equipment is important for talented bakers and computational scientists alike. Computational scientists face a similar problem of adapting and recreating their own “recipes” when trying to validate and work with new AI models. These models have applications in science ranging from climate analysis to brain research.
“When we talk about data, we have a shared understanding of the digital assets we deal with,” said Eliu Huerta, scientist and lead for Translational AI at the U.S. Department of Energy’s (DOE) Argonne National Laboratory. “With an AI model, it’s a little less clear; are we talking about intelligently structured data, or is it computing, or software, or a mixture?”
In a new study, Huerta and his colleagues have articulated a new set of standards for managing AI models. Adapted from recent research on automated data management, these standards are called FAIR, which stands for findable, accessible, interoperable and reusable.
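To make the four properties concrete, here is a minimal sketch of what a FAIR-style metadata record for a published AI model might contain. The field names, identifiers and values are illustrative assumptions, not the schema used in the study.

```python
# Illustrative sketch of a FAIR-style metadata record for an AI model.
# All field names and values are hypothetical examples, not the schema
# used by the Argonne study.
model_record = {
    # Findable: a persistent, globally unique identifier
    "identifier": "doi:10.xxxx/example-model",      # hypothetical DOI
    "name": "bragg-peak-detector",
    # Accessible: where and how the model can be retrieved
    "access_url": "https://example.org/models/bragg-peak-detector",
    "access_protocol": "https",
    # Interoperable: standard formats and documented interfaces
    "format": "ONNX",
    "input_schema": {"patch": "float32[15, 15]"},   # 15x15 pixel patch
    "output_schema": {"peak_position": "float32[2]"},
    # Reusable: license, provenance and training context
    "license": "CC-BY-4.0",
    "training_data": "doi:10.xxxx/example-dataset", # hypothetical DOI
    "description": "Detects Bragg diffraction peaks in detector patches.",
}
```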
“By creating FAIR AI models, we no longer have to build each system from scratch,” said Argonne computer scientist Ben Blaiszik. “It becomes easier to reuse concepts from different groups, helping to create cross-pollination between groups.”
According to Huerta, the fact that many AI models do not currently meet FAIR standards poses a challenge to scientific discovery. “For many of the studies published to date, it has been difficult to access and reproduce the AI models referenced in the literature,” he said. “By creating and sharing FAIR AI models, we can reduce the duplication of effort and share best practices on how to use these models to enable great science.”
To meet the needs of a diverse user community, Huerta and his colleagues combined a unique set of data management and high-performance computing platforms to establish a FAIR protocol and to quantify the “FAIRness” of AI models. The researchers paired FAIR data published in an online repository called the Materials Data Facility with FAIR AI models published in another online repository called the Data and Learning Hub for Science, as well as with AI and supercomputing resources at the Argonne Leadership Computing Facility (ALCF).
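As a sketch of how a model published to the Data and Learning Hub for Science might be invoked, the snippet below uses its Python client, dlhub_sdk; the model identifier and input here are hypothetical placeholders, and the exact call details may differ from those used in the study.

```python
# Sketch: invoking a model published on the Data and Learning Hub for
# Science via its Python client. The model identifier and the input are
# hypothetical placeholders, not the actual model from the study.
from dlhub_sdk.client import DLHubClient

client = DLHubClient()

# A 15x15 pixel diffraction patch as a nested list (dummy data standing
# in for real Advanced Photon Source detector output).
patch = [[0.0] * 15 for _ in range(15)]

# Run the published model by its "owner/name" identifier.
result = client.run("example_user/bragg_peak_model", [patch])
print(result)
```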
In this way, the researchers created a computational framework that can help bridge disparate hardware and software, produce AI models that run similarly across platforms, and deliver reproducible results. The ALCF is a DOE Office of Science user facility.
The two keys to creating this framework were platforms called funcX and Globus, which allow researchers to access high-performance computing resources right from their laptops. “FuncX and Globus can help overcome differences in hardware architecture,” said co-author Ian Foster, director of Argonne’s Data Science and Learning division. “If someone is using one computer architecture and someone else is using another, we now have a way to speak a common AI language. That’s an important part of making AI more interoperable.”
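To illustrate the kind of remote execution funcX enables, here is a minimal sketch that registers a Python function and dispatches it to a compute endpoint from a laptop. The endpoint ID and the function itself are hypothetical, and the calls reflect the funcX SDK as it existed around 2022 (the project has since become Globus Compute).

```python
# Sketch: dispatching work to a remote HPC endpoint with the funcX SDK
# (circa 2022; the project is now Globus Compute). The endpoint ID and
# the function body are hypothetical placeholders.
from funcx.sdk.client import FuncXClient

def count_photons(patch):
    # Toy stand-in for a real analysis or inference step.
    return sum(sum(row) for row in patch)

fxc = FuncXClient()

# Register the function once; funcX returns a reusable function ID.
function_id = fxc.register_function(count_photons)

# A hypothetical endpoint ID for a machine at an HPC facility.
endpoint_id = "00000000-0000-0000-0000-000000000000"

patch = [[1] * 15 for _ in range(15)]
task_id = fxc.run(patch, endpoint_id=endpoint_id, function_id=function_id)

# Fetch the result once the remote task completes (raises if pending).
result = fxc.get_result(task_id)
print(result)  # 225 for this dummy patch
```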
For the study, the researchers used as an example dataset an AI model that draws on diffraction data from Argonne’s Advanced Photon Source, also a DOE Office of Science user facility. To perform the computations, the team used the SambaNova system at the ALCF AI Testbed and the NVIDIA graphics processing units (GPUs) of the Theta supercomputer.
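In the spirit of models that run similarly across platforms, the snippet below sketches device-agnostic inference on a 15×15 diffraction patch; the tiny PyTorch network is a placeholder assumption, not the model from the paper.

```python
# Sketch: device-agnostic inference on a 15x15 diffraction patch.
# The tiny network is a placeholder, not the model from the study.
import torch
import torch.nn as nn

# Pick whatever accelerator the current platform provides.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model: one conv layer and a regression head that might,
# for example, predict a Bragg peak position within the patch.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 15 * 15, 2),
).to(device)
model.eval()

# Dummy 15x15 patch standing in for real detector data.
patch = torch.rand(1, 1, 15, 15, device=device)

with torch.no_grad():
    peak_xy = model(patch)
print(peak_xy.cpu().numpy())
```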
“We are pleased to see the productivity benefits of FAIR sharing of models and data, which gives more researchers access to high-performance computing resources,” said Marc Hamilton, NVIDIA vice president of solutions architecture and engineering. “Together, we are supporting the expanding universe of high-performance computing centers that combine experimental data and advanced instrumentation with AI to accelerate scientific discovery.”
“SambaNova is excited to partner with researchers at Argonne National Laboratory to pursue innovation at the intersection of AI and emerging hardware architectures,” said Jennifer Glore, vice president of Customer Engineering at SambaNova Systems. “AI will have an important role in the future of scientific computing, and the development of FAIR principles for AI models, along with new tools, will empower researchers to enable automated discovery at scale. We look forward to continued collaboration and development at the ALCF AI Testbed.”
A paper based on the research, “FAIR principles for AI models with a practical application for accelerated high energy diffraction microscopy,” appeared in Scientific Data on November 10, 2022.
Nikil Ravi et al, FAIR principles for AI models with a practical application for accelerated high energy diffraction microscopy, Scientific Data (2022). DOI: 10.1038/s41597-022-01712-9
Provided by Argonne National Laboratory
Citation: Scientists articulate new data standards for AI models (2022, November 10) retrieved November 10, 2022 from https://techxplore.com/news/2022-11-science-articulate-standards-ai.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.