PELICAN takes flight

June 5, 2024

The ATLAS detector at CERN (Photo courtesy of CERN)

A novel algorithm that harnesses math and physics principles offers enhanced pattern recognition capabilities.

Using data collected by the ATLAS experiment at the Large Hadron Collider (LHC), David W. Miller investigates the fundamental structure of matter by studying quarks and gluons (the elementary particles that make up protons and neutrons), as well as by searching for never-before-seen particles, such as those that may make up dark matter. These experiments detect 40 million collisions per second and produce about a petabyte of data per minute—a fraction of what the LHC is ultimately capable of producing. This is an almost incomprehensibly vast amount of data, but physicists can leverage machine learning to help recognize patterns in the sea of information.

Miller, Associate Professor in Physics, the Enrico Fermi Institute, and the Kavli Institute for Cosmological Physics, teamed up with computer scientists, mathematicians, and physicists in other fields to improve algorithms designed to sift through the noise. PELICAN (the Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network) is a new type of algorithm that uses a special architecture based on mathematical symmetry groups, making it simpler, easier to understand, and more effective. The algorithm is detailed in a paper published in the Journal of High Energy Physics.

We spoke with the PELICAN team, which includes Miller, Flatiron Institute postdoctoral fellow Alexander Bogatskiy, UChicago PhD students Timothy Hoffman and Jan Offermann, and undergraduate researcher Xiaoyang Liu. This interview has been edited and condensed.

Please describe the research in plain language.

We’ve developed a novel pattern recognition algorithm for identifying and measuring sprays of particles produced in the energetic collisions of particle collider experiments, like those at the LHC. Our work takes a new approach to fully combining fundamental principles of physics and mathematical symmetries with novel artificial intelligence techniques to address open questions in particle physics. These models are no longer the “black box” stereotype of traditional machine learning. Instead, they’re designed from first principles to yield mathematically rigorous and physically meaningful insight into not just what prediction the model produces, but also how.

Can you explain the black box stereotype?

Part of the reason applications in machine learning get labeled as “black boxes” is because many are focused on some question that doesn’t have a fundamental answer. There is no true answer for how many clicks an ad should generate. It’s just: more is better. The ends justify the means. 

We know the perils of that approach—people can develop extraordinarily bad, even destructive habits.  It’s in part because of those tendencies that people often consider these sorts of algorithms as black boxes. “I don’t care what’s happening or why, I just want more clicks.”  

That perception has leaked into circumstances where there is a fundamentally true or false dichotomy in the answers. Gravity either always pulls things down or it doesn’t. In the research that we do into particle collisions where we’re trying to measure features, dynamics, and properties of the fundamental universe, we firmly believe as scientists that there are right and wrong answers.

So there’s a lot of incentive for trying to break that stereotype of the black box algorithm, where we want to know all the guts, we want to know all the details and all the whys for how this thing works.

What are symmetries in this context?

David Miller explains mathematical symmetry through upside-down cats. Video Link: https://www.youtube.com/watch?v=7pSzSjzj5Q8
Miller defines Lorentz group symmetry using time dilation and the twin paradox. Video Link: https://www.youtube.com/watch?v=bOqWw32yviw

How or why does building a machine learning architecture on these mathematical principles benefit physics research?

The sprays of particles that emanate from the high-energy proton-proton collisions inside the LHC respect this Lorentz group symmetry. It's only sensible that the algorithms we use to analyze those systems and extract information also understand or respect those mathematical symmetry structures.
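To make the idea concrete, here is a minimal sketch (an illustration in Python, not the team's code) of one kind of Lorentz-respecting input: the pairwise Minkowski dot products of the particles' four-momenta come out the same in every inertial frame, so a model built on them respects the symmetry by construction.

```python
import numpy as np

def minkowski_dots(p):
    """p: (N, 4) array of particle four-momenta (E, px, py, pz).
    Returns the (N, N) matrix of pairwise Minkowski dot products,
    which are unchanged by rotations and boosts (the Lorentz group)."""
    metric = np.diag([1.0, -1.0, -1.0, -1.0])   # (+, -, -, -) signature
    return p @ metric @ p.T

# Toy spray of three particles (values in GeV, illustrative only).
jet = np.array([[50.0,  10.0,  5.0, 48.7],
                [30.0,  -8.0,  3.0, 28.0],
                [20.0,   2.0, -6.0, 18.5]])
invariants = minkowski_dots(jet)   # same matrix in every inertial frame
```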

If you build an algorithm that essentially tracks the rotation of a cat on a two-dimensional image, you can reduce the overall complexity of the architecture. The ability to recognize an infinite number of cats as cats no matter how they are positioned, for instance, makes interpreting the world quite a bit simpler.

But also, we think that there’s a fundamental reason why the physics systems—the particle sprays—that we’re using this algorithm to describe look the way they do. There isn’t a fundamental reason why a cat is upside down in a picture. It just happens to be hanging from its cat tree. But the particles coming out of the LHC, as far as we know, have no choice but to respect this Lorentz group symmetry under all circumstances. Therefore, we think that the interpretation of the answers that the algorithm is providing is more robust and has a stronger basis in fundamental science and mathematics as a result.
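The "permutation equivariant" half of PELICAN's name reflects another symmetry: the particles in a spray have no intrinsic ordering. A simplified, hypothetical illustration of an order-respecting update (a toy, not the published architecture) mixes each pairwise feature with its row, column, and total averages, so relabeling the particles merely relabels the output:

```python
import numpy as np

def equivariant_layer(x, w):
    """x: (N, N) pairwise features; w: four learnable scalars.
    Mixing each entry with its row, column, and total averages is
    permutation equivariant: permuting rows and columns together
    (i.e., relabeling particles) permutes the output the same way."""
    row = x.mean(axis=1, keepdims=True)   # (N, 1) per-particle row aggregate
    col = x.mean(axis=0, keepdims=True)   # (1, N) per-particle column aggregate
    tot = x.mean()                        # scalar, fully order-independent
    return w[0] * x + w[1] * row + w[2] * col + w[3] * tot

# Check the symmetry on a toy example: permute labels before vs. after the layer.
rng = np.random.default_rng(0)
x, w = rng.normal(size=(4, 4)), rng.normal(size=4)
perm = rng.permutation(4)
assert np.allclose(equivariant_layer(x, w)[np.ix_(perm, perm)],
                   equivariant_layer(x[np.ix_(perm, perm)], w))
```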

What exactly are the answers that this algorithm provides?

Let’s say we want to find evidence of the existence of dark matter. At the LHC, one of the many things that we’re doing is trying to make dark matter in the laboratory, and hypothetically it could be related to any number of possible new particles. We want to differentiate new physics from old physics that we know from the Standard Model of particle physics. And so, similar to the cats and dogs, we want to be able to classify different particle species, and that’s what this algorithm helps us accomplish.

For example, we want to differentiate between a top quark and a bottom quark, or up and down quarks, because we think these new particles could preferentially create top quarks as opposed to up and down quarks. By improving our ability to distinguish between various particle species, we may be better able to detect the presence of new physics.

What do you find most exciting or important about the results?

PELICAN demonstrated state-of-the-art performance on multiple particle physics tasks related to pattern recognition and measurement. The most surprising aspect is that PELICAN achieved this field-leading accuracy and precision despite being a dramatically simpler model in terms of the number of parameters and overall complexity.

In the separate task of particle reconstruction (the process of discerning the presence and type, trajectory, and energy of particles following a collision), where we need to measure the momentum of an unobserved particle based on its visible decay products, not only does PELICAN produce similarly accurate and precise measurements, but it does so in an explainable way: the output is composed of physical features that can be directly interpreted by physicists.
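One way such an explainable output can be structured (a sketch of the general idea, not necessarily the exact published construction) is to predict the unobserved particle's four-momentum as a weighted sum of the observed decay products, with the weights supplied by a Lorentz-invariant model; each weight is then a physically meaningful number:

```python
import numpy as np

def reconstruct_parent(p, weight_fn):
    """p: (N, 4) observed four-momenta (E, px, py, pz) of the decay products.
    weight_fn: any function mapping the (N, N) Lorentz-invariant dot products
    to N per-particle weights (a stand-in here for a trained invariant model).
    The prediction is a weighted sum of physical inputs, so the weights can be
    read off and interpreted directly."""
    metric = np.diag([1.0, -1.0, -1.0, -1.0])    # (+, -, -, -) signature
    invariants = p @ metric @ p.T                # frame-independent inputs
    weights = weight_fn(invariants)              # (N,) interpretable weights
    return weights @ p                           # predicted four-momentum

# Toy usage: with equal weights the "prediction" is just the visible sum.
visible = np.array([[45.0,  20.0, 10.0, 38.0],
                    [35.0, -15.0,  5.0, 31.0]])
parent = reconstruct_parent(visible, lambda inv: np.ones(len(inv)))
```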

What’s next?

We are delving more deeply into the relative impact of several different aspects of our novel algorithm design. In particular, we want to definitively demonstrate that it is the mathematical symmetries that we designed it to respect that are responsible for the boost in performance compared to other approaches. The relationship between symmetries and the algorithm performance hasn’t been clearly demonstrated so far, and that will be important for the future direction of the entire field of machine learning in particle physics.

Aside from this, we want to apply PELICAN-like architectures to other important tasks in particle physics, including the interpretation of low-level data directly from the instrumentation, such as measuring charged particle track trajectories, as well as in searches for new particles that might explain even greater outstanding questions in physics. We’re also excited to implement a lightweight version of PELICAN that is designed to operate on dedicated hardware devices in the real-time data processing and filtering systems that decide what data to keep and what to throw away before any human ever sees it.

What are some real-world applications or wider implications of this research?

Miller describes how algorithms like PELICAN might one day help self-driving cars avoid obstacles.  Video Link: https://www.youtube.com/watch?v=Topg9aIAL6U

Physics-inspired machine learning models have a wide variety of possible uses in fundamental physics research. We’ve benchmarked the performance of this algorithm on a very well-understood problem: the classification or differentiation between top quarks and other types of quarks, as mentioned earlier.

Another task could be differentiating Higgs bosons from other types of particles. The ability to distinguish the presence of a Higgs boson in our dataset with greater precision and accuracy could be a huge boon to various scientific pursuits.

This type of model could also benefit applied problems like protein folding (biology/pharmacology) and plasma control design (plasma physics/fusion energy). In particular, our research shows promise in the areas of real-time decision-making processes, and in time-consuming high-resolution data analysis tasks where it has been shown to outperform traditional rule-based methods. Beyond the fundamental science domains, we imagine physics-inspired algorithms could potentially increase the robustness and reliability of any number of machine learning and artificial intelligence applications that are applied in the context of trying to understand the physical world around us.

Citation: “Explainable Equivariant Neural Networks for Particle Physics: PELICAN.” A. Bogatskiy, T. Hoffman, D. W. Miller, J. T. Offermann, X. Liu, Journal of High Energy Physics, March 20, 2024.

Support: University of Chicago Data Science Institute and Department of Physics, National Science Foundation, and the Scientific Computing Core at the Flatiron Institute (a division of the Simons Foundation)
