Researchers at the National Science Foundation’s newly created institute designed to improve artificial intelligence algorithms for scientific research say they expect to make dramatic advances in the institute’s first year of operation.
The Accelerated Artificial Intelligence Algorithms for Data-Driven Discovery (A3D3) Institute is a multidisciplinary organization designed to “lead a paradigm shift in the application of real-time artificial intelligence (AI) at scale to advance scientific knowledge and accelerate discovery,” according to the institute’s website. Institute researchers will target three fields of study—high-energy physics, multimessenger astrophysics and systems neuroscience—that require massive reams of data.
As scientific data sets become progressively larger, algorithms to process this data quickly become more complex. AI has emerged as a solution to efficiently analyze these massive data sets, the institute’s website explains. “Emerging processor technologies, such as graphics processing units [GPUs] and field-programmable gate arrays [FPGAs], allow AI algorithms to be greatly accelerated. The combination of AI and these processors is leading to a revolution in the way we analyze data, minimizing the time needed to perform the most advanced of analyses, and allowing us to address the challenges brought about by the omnipresent onslaught of data,” the website states.
To harness new developments and use them to advance scientific knowledge, the National Science Foundation provided a $15 million, five-year grant to create the A3D3 Institute under the Harnessing the Data Revolution program. The organization includes principal investigators from California Institute of Technology, Duke University, Massachusetts Institute of Technology (MIT), Purdue University, University of California San Diego, University of Illinois at Urbana-Champaign, University of Minnesota, University of Washington and University of Wisconsin-Madison.
In the area of high-energy physics, the institute will work with the Large Hadron Collider, which processes more data than any other scientific instrument on Earth, according to an MIT press release announcing the A3D3’s creation. The Large Hadron Collider is the world’s largest particle accelerator and offers revolutionary scientific insights into particle science, including such areas as dark energy and dark matter.
Thanks to advances in techniques such as medical imaging and electrical recordings from implanted electrodes, neuroscience is also gathering larger amounts of data on how the brain’s neural networks respond to stimuli and process motor information. A3D3 plans to develop and implement high-throughput and low-latency AI algorithms to process, organize and analyze massive neural data sets in real time to probe brain function and enable new experiments and therapies.
Multimessenger astrophysics includes the identification of astronomical events by efficiently processing data from gravitational waves, gamma-ray bursts and neutrinos picked up by telescopes and detectors, the MIT press release explains.
“We cover these three scientific areas to touch the fundamental curiosity of human beings—who we are, where we came from and where we are going to,” says Shih-Chieh Hsu, associate professor, Department of Physics, University of Washington, and A3D3 director and principal investigator. “This is a fundamental part of physics we’re trying to address under the high-energy physics area. Multimessenger astrophysics is really crucial to unlock the most dramatic cosmic events, and neuroscience is really to unlock how our brains work.”
Philip Harris, assistant professor, Department of Physics, MIT, and A3D3 deputy director, adds that AI already is becoming integral to many areas of scientific research, but the AI algorithms themselves need to advance. “The idea is that a lot of people are now using AI for many things, and it’s kind of taking over many domains. Our focus is to make AI run really fast. So, the point of A3D3 is to take a bunch of critical scientific problems that need real-time AI and focus on both developing the AI technology and developing the use cases to solve some very important scientific problems.”
Despite the complex scientific topics and the deluge of data required for each, A3D3 scientists say they should be able to report major accomplishments in the institute’s first year of operation.
In December, the group likely will deploy an AI-based alert system to read data from the gravitational wave detector known as the Laser Interferometer Gravitational-Wave Observatory, or LIGO, which is supported by the National Science Foundation. The alert system also may be used with Japan’s Kamioka Gravitational Wave Detector, which is better known as KAGRA. Gravitational waves are “ripples” in space-time, according to the LIGO website, and are used to detect colliding black holes, supernovae and other major cosmic events.
Additionally, the A3D3 team has created a software tool that will be used for real-time data processing for the Large Hadron Collider. “The data rates we’re talking about here are larger than any other device in the world. It’s hundreds of terabytes per second,” Harris explains. “Our AI toolkit is going to be deployed in the next run of the Large Hadron Collider starting next summer.”
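To put that data rate in perspective, the short calculation below converts a per-second rate into a daily volume. It is only a sketch: the 100 terabytes per second figure is an assumed lower bound for “hundreds of terabytes per second,” decimal units are used, and the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def terabytes_per_day(rate_tb_per_s: float) -> float:
    """Total terabytes produced in one day at a constant rate."""
    return rate_tb_per_s * SECONDS_PER_DAY


def to_exabytes(terabytes: float) -> float:
    """Convert terabytes to exabytes (decimal units: 1 EB = 1,000,000 TB)."""
    return terabytes / 1_000_000


# Assumption: 100 TB/s as the low end of "hundreds of terabytes per second".
daily_tb = terabytes_per_day(100)
print(f"{daily_tb:,.0f} TB per day, about {to_exabytes(daily_tb):.2f} EB")
```

Even at the assumed low end, a full day of running at that rate would amount to several exabytes, which is why the data must be filtered in real time rather than stored in full.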
The rapid advances are possible because many of the researchers with the institute were working together as an informal cooperative before the A3D3 was officially formed in October. The informal community of scientists developed what they called the Fast Machine Learning Lab, which attracted some initial National Science Foundation funding before morphing into the A3D3 with the five-year grant.
One of the group’s innovations is to take algorithms developed for self-driving cars and repurpose them to study particle collisions. For physics research, however, the algorithms must run much faster than they do in cars. Additionally, even though autonomous car algorithms identify objects that could be involved in collisions, including trees and people, the collisions studied in physics research are much more complex, Harris explains. “You get these sprays of particles, and you can have one spray of particles on top of another spray of particles, so they are intermixed into a single object,” he says. “But you cannot have a human and a tree mixed into each other.”
Prior to formally creating the institute, the scientists built a computer program known as a compiler and dubbed it High-Level Synthesis for Machine Learning (HLS4ML). Compilers essentially convert one computer language into another. The HLS4ML compiler translates models built with traditional, open-source machine learning packages into high-level synthesis (HLS) code that can be configured for specific scientific use cases. “It’s a compiler for programming these specialized processors … specifically designed so that you can do parallel processing,” Harris explains.
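The general idea of translating a trained model into hardware-oriented source code can be sketched with a toy example. The code below is not HLS4ML and does not reproduce its output; the function name `emit_dense_layer`, the example weights and the generated C-like text are all hypothetical, chosen only to show how a layer’s arithmetic can be written out explicitly so that hardware tools could, in principle, perform every multiplication in parallel.

```python
from typing import List


def emit_dense_layer(name: str, weights: List[List[float]], biases: List[float]) -> str:
    """Emit C-like source text for one fully connected layer.

    weights[i][j] multiplies input j when computing output i.
    Every multiply-add is written out explicitly (no loops), a fully
    unrolled form that suits parallel hardware.
    """
    n_out = len(weights)
    n_in = len(weights[0])
    lines = [f"void {name}(const float in[{n_in}], float out[{n_out}]) {{"]
    for i, (row, bias) in enumerate(zip(weights, biases)):
        # One product term per input, with the weight baked in as a constant.
        terms = " + ".join(f"{float(w)!r}f * in[{j}]" for j, w in enumerate(row))
        lines.append(f"    out[{i}] = {terms} + {float(bias)!r}f;")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical 2-input, 2-output layer used purely for illustration.
source = emit_dense_layer("dense0", [[0.5, -1.0], [2.0, 0.25]], [0.0, 1.5])
print(source)
```

A real tool also has to choose number formats, manage memory and respect the limits of the target chip, none of which this sketch attempts.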
That allows much faster data processing, but how much faster is difficult to quantify. “You couldn’t make AI algorithms run this fast before. The algorithms that are running at the Hadron Collider run at hundreds of nanoseconds. Normal AI algorithms run in milliseconds,” Harris notes. “But it’s somewhat of an unfair comparison because normal AI algorithms can be very large, and we focus on smaller algorithms and run them really fast.”
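The rough size of that gap can be worked out with simple arithmetic. The figures below are assumptions chosen to match the orders of magnitude Harris cites (1 millisecond for a conventional algorithm, and 100 to 900 nanoseconds for “hundreds of nanoseconds”); they are not measurements, and the function names are illustrative.

```python
NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def speedup(slow_ns: float, fast_ns: float) -> float:
    """How many times shorter the fast latency is than the slow one."""
    return slow_ns / fast_ns


def max_sequential_rate(latency_ns: float) -> float:
    """Inferences per second if each must finish before the next starts."""
    return 1_000_000_000 / latency_ns


conventional_ns = 1 * NS_PER_MS  # assumed: 1 ms for a "normal" AI algorithm
for fast_ns in (100, 500, 900):  # assumed range for "hundreds of nanoseconds"
    ratio = speedup(conventional_ns, fast_ns)
    print(f"{fast_ns} ns vs. 1 ms: about {ratio:,.0f} times faster")
```

Under these assumptions the difference is on the order of 1,000 to 10,000 times, although, as Harris cautions, the two kinds of algorithms differ in size and are not directly comparable.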
Hsu also touts the HLS4ML’s flexibility and ease of use. “This is a framework that is so friendly and easy to pick up. It’s portable. That’s why it attracts people beyond our original target of particle physics,” he says, adding that the tool has drawn international interest. With the resources provided for the institute, the team will be better able to support new features for scientists working in other areas of research.
The advances primarily focus on FPGAs and GPUs. Both types of processors allow AI algorithms to run much faster. The A3D3 team developed a software tool known as Sonic that moves AI algorithms written for central processing units (CPUs) onto GPUs for faster processing. Another version of the tool used specifically for astronomy is called Hermes. “We’re coming up with ways to transition from central processing units to GPUs to exploit the fact that AI systems can run really fast on GPUs,” Harris says.
The A3D3 researchers have not yet worked with the Defense Department or broader national security community, but they suggest their research can provide benefits outside of the organization’s three focus areas. It could be used to process satellite imagery much more quickly, for example. Or it could be helpful for AI algorithms used in the cyber arena.
A3D3 researchers at the University of California San Diego have developed a first-of-its-kind testbed for cybersecurity known as Voyager, according to a university press release. Amit Majumdar, who leads the San Diego Supercomputer Center at the university, indicates in the release that a lot of synergy exists between the Voyager project and A3D3 given the institute’s focus on FPGAs and GPUs.
Harris calls Voyager a complementary effort and suggests the institute’s algorithms may run on the Voyager system to provide greater throughput. “The National Science Foundation giving us this funding allows us to really expand the scope and change the dynamic so that we can do a lot more. This is just the beginning. There are a lot of different areas we can explore,” he asserts.