The physical sciences and artificial intelligence (AI) have been intertwined throughout their respective histories—AI has been routinely used for data analysis in the physical sciences, while principles from the physical sciences have repeatedly driven significant methodological advances in AI. Similar to their impact in fields such as computer vision and natural language processing, recent generative AI approaches hold transformative potential for all aspects of physical science research, from analyzing data and generating new hypotheses to enabling challenging theory calculations and designing new experiments. In return, the physical sciences are fueling significant progress in the development of new AI technologies, spanning from algorithms to hardware. This impact paper presents a vision for the integration of generative AI into the physical sciences, emphasizing the critical role of interdisciplinary collaboration and educational initiatives to fully harness the benefits of this intersection.
Keywords: physical sciences; artificial intelligence; generative AI; scientific discovery; machine learning
Funding: This work is supported by the National Science Foundation (NSF) under Cooperative Agreement PHY-2019786 (The NSF AI Institute for Artificial Intelligence and Fundamental Interactions [IAIFI]; http://iaifi.org/).
Physics specifically, and the physical sciences broadly, are motivated by a need to understand the fundamental principles that govern the Universe, from its subatomic building blocks to its largest structures on cosmic scales. A cornerstone of the discovery process in the physical sciences is the interplay between theoretical models and observations, with each providing insights and constraints on the other as a way to improve our understanding of the natural world. This interplay can happen in different ways: the direct analysis of experimental data and its interpretation in the context of theoretical models or the realization of theoretical models through detailed simulations and first-principles calculations, enabling the prediction of many physically relevant quantities. In each case, powerful statistical and computational techniques are necessary to enable scientific discovery. Both observational data and calculations of theoretical models have increased in complexity over the last decades, and fully capitalizing on this increased complexity necessitates corresponding developments on the frontiers of data analysis and scientific computing.
In this context, it is not surprising that the tools of artificial intelligence (AI) and machine learning (ML) have been employed in physics for decades, with insight from this domain also being fed back to inform key developments in AI. For example, artificial neural networks have been used in particle physics since the late 1980s (Denby 1988; Peterson 1989) to analyze the results of collider experiments, with their use originally limited to small parts of the data analysis pipeline. Conversely, many sampling and optimization techniques ubiquitous in modern AI have their roots in physics applications or methods (Wainwright and Jordan 2008).
In the last decade, the role of AI in physics, and in the sciences more broadly (Wang et al. 2023), has begun to transform. More recently still, the AI landscape has been energized by the emergence of a powerful new class of generative AI methods that have achieved remarkable success in domains such as image and video synthesis, text generation, and general-purpose reasoning. Generative large language models (LLMs) (Brown et al. 2020; Vaswani et al. 2023) and downstream products such as Claude, Gemini, and GPT have showcased the ability of these tools to capture and generate human-like text across a wide range of domains. Similarly, generative image models (Ramesh et al. 2021; Rombach et al. 2022) like DALL-E, Midjourney, and Stable Diffusion have demonstrated the ability to synthesize strikingly realistic and creative images from textual descriptions. Under this umbrella, the paradigm of foundation models (Bommasani et al. 2022) has also emerged; while traditional AI methods have largely focused on improving performance on narrowly scoped tasks, foundation models aim to be deployable for a wide range of tasks simultaneously. Diffusion models like DALL-E are a prime example—although most commonly used for image generation, by having learned a distribution over natural images jointly with natural language, they can easily be adapted to many downstream tasks such as in-painting/out-painting, editing, and style transfer (Song et al. 2023).
As the new domain of generative and foundation models evolves, perspectives from physics continue to play an important role. Many commonly used optimization and sampling algorithms are rooted in either physics applications or physics-motivated reasoning. Generative text-to-image models are a striking example: under the hood, they are based on so-called diffusion models, which use principles from non-equilibrium thermodynamics (Sohl-Dickstein et al. 2015), showcasing how physics perspectives can contribute to algorithmic development in AI.
Even as physics insights have informed the development of generative AI tools, applications of these tools to physics are by some metrics still in the early stages. Far beyond applications to text and images, there is a growing recognition that these techniques could have a transformative impact on the way we do science. That is, generative AI offers a new set of tools for scientific discovery, with potential to transform the entire research workflow from hypothesis generation (Douglas 2022) to experimental design (Li et al. 2021), data analysis (DeZoort et al. 2023; Melchior et al. 2021), and theory calculations (Cranmer et al. 2023). Beyond applications to individual tasks (e.g., faster simulations, better classification performance), custom foundation models can also enable fundamentally new ways of interacting with physics data and models (Subramanian et al. 2023). Applications of generative AI in the physical sciences span the full spectrum of maturity, with some examples in the proof-of-principle stage and others in active deployment for scientific discovery.
Despite the clear promise, realizing the full potential of generative AI in the physical sciences will require overcoming several challenges, both technical and community-oriented. In particular, physics applications often have specific requirements for generative AI methods to be useful (e.g., exactness guarantees, precise control over uncertainties, and interpretability). These requirements are typically different from those present in industrial applications or in other scientific domains, making physics problems a unique and also fertile ground for the development of new and robust generative AI methods.
In this impact paper, we examine the current state of the art and future potential of generative AI for scientific discovery in the physical sciences. We consider applications across domains from particle physics to astronomy, and discuss key challenges and open questions. We also explore how physics perspectives are helping to advance AI capabilities through physics-inspired architectures and theoretical frameworks. Finally, we outline the steps needed from a community perspective to realize the promise of this exciting intersection in full, from educational initiatives to cross-sector collaboration.
At the heart of the interplay between theory and observations in physics is the principle that complex, real-world observations are manifestations of simpler physical laws, and the ultimate goal of physics is to uncover, describe, and understand these laws. A fundamental obstacle in this quest is the challenge of working with and analyzing high-dimensional data. AI tools such as neural networks are especially capable of navigating high-dimensional spaces, making them well-suited for physical analyses when wielded with intention. We outline here several classes of applications of AI, and in particular generative AI models, that have the potential to transform the cycle of discovery in the physical sciences.
Simulations (also called forward models or digital twins) are a workhorse in physics and in other domains in the natural sciences and beyond, providing a versatile way to compare theory with observation. Producing simulations often involves sampling from a series of physically informed distributions (e.g., the distribution of particles produced in collider experiments, the distribution of how they respond to interactions with a detector, etc.) before arriving at a realization of the data (e.g., a collection of detected particles) or directly sampling from an underlying, potentially highly complex, physical probability distribution. Realizing complex high-dimensional simulations and using them for various downstream tasks that involve comparison with theory or observation are both arduous tasks, typically requiring a great deal of domain-informed simplification.
Generative AI can play an important role in overcoming some of the computational challenges of simulations for applications in the physical sciences (Cranmer, Brehmer, and Louppe 2020; Lai et al. 2024). The use of generative AI in these contexts is often directly analogous to its successful use in image generation models or LLMs. In the scientific context, rather than sampling from the distribution of images or text given a prompt, we sample from the possible realizations of physical observations given an underlying theory constraint (which plays a role analogous to a “prompt”). AI-based models, sometimes referred to as surrogate models, can emulate complex, expensive simulations with high fidelity at a fraction of the computational cost of traditional simulators. Such models can be trained using existing simulated data (or potentially even observed experimental data) to learn the underlying distribution of the physical system in question, and can then be used to generate new samples from this distribution efficiently, enabling rapid exploration of different physical scenarios and parameter spaces. There has been recent progress in developing such surrogate models in the domains of particle physics and cosmology, in particular targeting particle collider data (Butter et al. 2023), cosmological large-scale structure observations (Cuesta-Lazaro and Mishra-Sharma 2023; Legin et al. 2023), and various astrophysical objects like galaxies (Lanusse et al. 2021). These models can then be used in a variety of downstream tasks toward better characterization and analysis of the physical systems in question. This is critical as we scale up in the era of big data in these domains, where generating simulated data as large in scale as that collected by the experiments can be prohibitively expensive when using traditional simulation pipelines, and the level of detail required may be too high to be amenable to traditional (e.g., parametric) descriptions.
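To make the surrogate-modeling idea concrete, the sketch below trains a small RealNVP-style normalizing flow on the output of a toy two-dimensional "simulator" and then draws new samples from the learned distribution at negligible cost. The simulator, architecture, and hyperparameters are illustrative assumptions chosen for exposition, not any experiment's actual pipeline.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

def simulator(n):
    # Stand-in for an expensive physics simulator: a curved 2D distribution.
    x1 = torch.randn(n)
    x2 = 0.5 * x1**2 + 0.1 * torch.randn(n)
    return torch.stack([x1, x2], dim=1)

class Coupling(nn.Module):
    # Affine coupling layer: rescales and shifts the second coordinate
    # conditioned on the first, keeping the Jacobian determinant tractable.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 2))

    def data_to_latent(self, x):
        x0, x1 = x[:, :1], x[:, 1:]
        s, t = self.net(x0).chunk(2, dim=1)
        return torch.cat([x0, (x1 - t) * torch.exp(-s)], dim=1), -s.sum(dim=1)

    def latent_to_data(self, z):
        z0, z1 = z[:, :1], z[:, 1:]
        s, t = self.net(z0).chunk(2, dim=1)
        return torch.cat([z0, z1 * torch.exp(s) + t], dim=1)

layers = nn.ModuleList([Coupling() for _ in range(4)])

def log_prob(x):
    # Exact model log-density via the change-of-variables formula.
    log_det = torch.zeros(len(x))
    for layer in layers:
        x, ld = layer.data_to_latent(x)
        x = x.flip(dims=[1])  # swap coordinates so both get transformed
        log_det = log_det + ld
    return -0.5 * (x**2).sum(dim=1) - math.log(2 * math.pi) + log_det

def sample(n):
    z = torch.randn(n, 2)
    for layer in reversed(layers):
        z = z.flip(dims=[1])  # undo the swap, then invert the coupling
        z = layer.latent_to_data(z)
    return z

data = simulator(20_000)  # the expensive step, done once
opt = torch.optim.Adam(layers.parameters(), lr=1e-3)
for step in range(2_000):  # maximum-likelihood training of the surrogate
    batch = data[torch.randint(len(data), (512,))]
    loss = -log_prob(batch).mean()
    opt.zero_grad(); loss.backward(); opt.step()

cheap_events = sample(100_000)  # fast surrogate samples for downstream analysis
```

The same pattern, with architectures adapted to the data at hand (point clouds, images, fields), underlies many of the surrogate models cited above.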
The extent to which generative techniques can replace standard physics-aware simulations is tied to the level of accuracy and control over the model that a specific downstream task requires. In certain applications in particle and astrophysics requiring a high degree of precision, for instance, small discrepancies between simulations and observations, as little as 1 part in 10⁵ in some applications, can lead to sub-par performance and even false signals. This motivates the need to design improved and more scalable algorithms tailored to physics applications with higher fidelity than can be provided with existing methods. With some exceptions, much of the work in this direction in domains like particle physics and astrophysics is still at the proof-of-concept stage, with significant research effort going into making these models useful for current and future data analysis tasks. As an example, efforts to replace expensive event generators in collider physics with generative AI proxies currently have to trade off between speed and fidelity (Buhmann et al. 2023; Buhmann, Kasieczka, and Thaler 2023; Butter et al. 2023), when both are necessary for deployment-ready applications.
Besides emulating expensive simulations, generative AI can also be used to accelerate the realization of fundamental physical theories (e.g., as described by theory equations, without using training data in the traditional sense; Cranmer et al. 2023). By incorporating the inherent symmetries and constraints of a theory directly into the structure of a generative model, such a model can be used, for example, to accelerate the computation of high-dimensional integrals appearing in various forms throughout applications in the physical sciences through more efficient sampling procedures. Such approaches are finding success in applications as diverse as protein design (Bose et al. 2024) and quantum field theory calculations for nuclear physics (Boyda et al. 2020, 2022; Kanwar et al. 2020).
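A minimal illustration of how exactness can be retained even with an imperfect generative model is importance reweighting: expectation values under a known target density p(x) ∝ exp(−S(x)) are estimated from model samples weighted by the density ratio, so model imperfections degrade efficiency but not correctness. The double-well action and the Gaussian stand-in for a trained flow below are toy assumptions; lattice field theory applications use the same mechanism (or an equivalent Metropolis correction) at vastly larger scale.

```python
import torch

def action(x):
    # Toy one-dimensional "theory": a double-well action S(x) = x^4 - 2x^2,
    # defining the target density p(x) proportional to exp(-S(x)).
    return x**4 - 2 * x**2

# An imperfect model q(x): a broad Gaussian standing in for a trained flow.
q = torch.distributions.Normal(0.0, 1.5)

x = q.sample((1_000_000,))
log_w = -action(x) - q.log_prob(x)  # log of the unnormalized ratio p(x)/q(x)
w = torch.exp(log_w - log_w.max())  # numerically stabilized weights

# Self-normalized importance sampling: a consistent estimate of <x^2> under p,
# even though the samples were drawn from the "wrong" distribution q.
estimate = (w * x**2).sum() / w.sum()
print(f"<x^2> under p ~ {estimate:.4f}")
```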
Anomaly detection refers to the problem of detecting patterns in data that do not conform to prespecified theoretical models and/or experimental conditions. This, naturally, is a key mechanism for discovering new and unpredicted physical phenomena. In the context of scientific discovery, this class of tools enables the detection of signals of new fundamental physics that we may not otherwise know to look for. Such tools also play an important role in monitoring and controlling the experimental apparatus, for example, by promptly flagging unexpected failure modes and adapting detector settings to environmental changes (i.e., data-quality monitoring).
A key step of the anomaly detection process is to assess the statistical significance of an anomalous observation or set of observations by computing the expected probability of an occurrence under the assumed theory models and experimental conditions. This is one place where generative AI can play a crucial role, by efficiently learning the distribution of the data under standard expected conditions, even when the space of the data is extremely high-dimensional (Kasieczka et al. 2021; Letizia et al. 2022). With the use of AI, in fact, anomaly detection algorithms have been pushed to the rawest stage of data collection, aiming for the most comprehensive level of data inspection (Pol et al. 2019, 2020). This is in contrast to traditional approaches, which can only operate on simplified high-level data products, potentially overlooking subtle signals that affect the data at a lower level.
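A schematic version of this density-based strategy is sketched below: fit a generative model to data taken under standard conditions, then flag events to which the model assigns anomalously low likelihood. Here a Gaussian mixture stands in for a deep generative model, and the data, dimensionality, and threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# "Background": events recorded under standard, well-understood conditions.
background = rng.normal(0.0, 1.0, size=(50_000, 4))
model = GaussianMixture(n_components=8, random_state=0).fit(background)

# New data: mostly background, plus a small cluster of anomalous events.
signal = rng.normal(4.0, 0.2, size=(50, 4))
events = np.vstack([rng.normal(0.0, 1.0, size=(10_000, 4)), signal])

# Flag events whose log-likelihood falls below the 0.1% quantile of background.
threshold = np.quantile(model.score_samples(background), 1e-3)
flagged = events[model.score_samples(events) < threshold]
print(f"{len(flagged)} events flagged for follow-up")
```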
Similar to the high-fidelity requirements for generative AI–based simulators in physics, requirements for anomaly detection pipelines in the physical sciences are particularly stringent compared with ML-based detection methods routinely applied to other domains of science, industry, and everyday life (Belis, Odagiu, and Aarrestad 2024). This is especially evident in the context of collider experiments, where searching for new physics often involves looking for rare events amidst a vast background of known processes, and discovery can only be claimed if a “5𝝈” level of statistical significance is met—that is, if the probability of claiming a false anomaly is less than roughly 3 parts in 10 million. To meet these challenges, generative AI models used for anomaly detection in physics must not only achieve a high level of fidelity but also provide reliable estimates of their uncertainties. This requires careful validation of models against data and a deep understanding of their limitations. This is a topic of ongoing research, with benchmarking efforts and community challenges providing useful avenues for comparing different methods (Aarrestad et al. 2022; Kasieczka et al. 2021).
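For reference, the discovery threshold itself is a one-line computation: the p-value associated with 5𝝈 is the tail probability of a standard normal distribution beyond five standard deviations.

```python
from scipy.stats import norm

p_value = norm.sf(5.0)  # survival function, 1 - CDF
print(p_value)          # ~2.87e-7: roughly 3 parts in 10 million
```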
Despite ongoing challenges, anomaly detection methods are now being pushed closer to deployment on real data in many subdomains of the physical sciences. For example, both the CMS and ATLAS experiments at the Large Hadron Collider recently released physics analyses deploying state-of-the-art anomaly detection techniques on real collider data to search for new phenomena (Aad et al. 2024; CMS Collaboration 2024), with many of the techniques deployed relying on generative AI. Although no significant anomalies were detected in either case, these searches demonstrated the efficacy of a wide range of AI-assisted anomaly detection techniques for physics discovery and showcased advantages over traditional counterparts. Anomaly detection methods have also been deployed on astrophysical data, for example, to discover novel structures—a large number of stellar stream candidates—using observations of the properties of millions of Milky Way stars (Raikman et al. 2023; Shih, Buckley, and Necib 2023).
Generative AI could potentially aid in the discovery of new scientific theories and hypotheses through integration into the physics research process. By learning the underlying structure of plausible theories and models, generative algorithms could suggest novel theoretical frameworks or highlight promising avenues for experimental investigation. Quoting notable mathematician Terence Tao (Tao 2023), “The 2023-level AI can already generate suggestive hints and promising leads to a working mathematician and participate actively in the decision-making process. When integrated with tools such as formal proof verifiers, internet search, and symbolic math packages, I expect, say, 2026-level AI, when used properly, will be a trustworthy co-author in mathematical research, and in many other fields as well.” It is easy to similarly imagine integration of AI into various parts of the physics research pipeline; one could envision interfacing AI-driven advancements in mathematics with the analysis tools discussed in previous sections, creating an end-to-end workflow for scientific discovery in physics. This pipeline would incorporate not only mathematical insights but also experimental constraints and design considerations, enabling researchers to transform theoretical predictions into testable hypotheses and analyze experimental data, integrating theory, computation, and experiment.
Early work using language models to generate new mathematical conjectures has already demonstrated how this could become a valuable tool for scientific theory discovery. For example, in the domain of theoretical particle physics, recent research has demonstrated the potential of transformers (one of the backbone architectures in the generative AI revolution) to learn and simplify the complex mathematical structure of physical objects named “scattering amplitudes,” which encode the probabilities of particle interactions (Cai et al. 2024; Dersy, Schwartz, and Zhang 2022; Merz et al. 2023). These methods also have the potential to connect related concepts across disparate fields in ways that may not be readily apparent to domain-focused practitioners. However, there remain open questions about the role of interpretability and simplicity in scientific theories, and whether machine-generated theories that are not human-understandable would constitute a genuine advance in scientific understanding (Krenn et al. 2022; Schwartz 2022).
The success of foundation models in domains like computer vision (Ramesh et al. 2021) and natural language processing (Brown et al. 2020) has spurred considerable interest in understanding the potential of the paradigm for scientific discovery and in the development of foundation models for the sciences with various degrees of domain-specificity (Subramanian et al. 2023). Since scientific data typically features very different structures from those of natural text or images, both in situ and when processed by measurement apparatus, generalist pretrained models are unlikely to be immediately useful for interacting with scientific data, which also comes in diverse modalities. At the same time, scientific data across domains is likely to share a great deal of common structure; data across disparate domains is, after all, realized by common underlying physical laws. It is possible, therefore, that a foundation model trained across scientific domains can learn to uncover novel latent structures and regularities that would otherwise be inaccessible when working within a single domain. Recent work offering promise in this direction showed that pretraining a single model on multiple heterogeneous fluids–based physical systems outperforms individually trained models on diverse downstream tasks (McCabe et al. 2023).
There has also been interest in developing foundation models tailored to particular physics domains or use cases, for example, astronomical survey observations (Mishra-Sharma, Song, and Thaler 2024; Parker et al. 2023) and particle collider data (Birk, Hallin, and Kasieczka 2024; Golling et al. 2024; Mikuni and Nachman 2024). By learning semantically meaningful representations of the data, these models can be used for a variety of downstream tasks either immediately or after fine-tuning on a small amount of available labeled data. Data representations of this kind have the potential to retain the information relevant to distinguish data points originating from different physical processes (Harris et al. 2024), enabling the design of downstream tasks that aim to relate or combine multiple aspects of physics. Physics data is often especially well suited for constructing multimodal foundation models. Taking astrophysics as an example, we often have access to multiple different representations of the same underlying object (e.g., an image, a spectrum, a light curve, multiple simulated realizations, etc.). This multimodality can be leveraged to construct powerful physically informative representations of the data (Parker et al. 2023). The general setting of having a large amount of unlabeled data, and a small amount of labeled data to fine-tune toward a particular task, is ubiquitous in many physics domains, which lends itself well to training foundation models.
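As a sketch of how such multimodal representations can be built, the snippet below aligns two modalities of the same object (say, a galaxy image and its spectrum) in a shared embedding space using a CLIP-style symmetric contrastive loss. The linear "encoders" and input dimensions are placeholder assumptions standing in for deep, modality-appropriate networks.

```python
import torch
import torch.nn.functional as F

image_encoder = torch.nn.Linear(1024, 128)     # placeholder image encoder
spectrum_encoder = torch.nn.Linear(4000, 128)  # placeholder spectrum encoder

def contrastive_loss(images, spectra, temperature=0.07):
    zi = F.normalize(image_encoder(images), dim=1)
    zs = F.normalize(spectrum_encoder(spectra), dim=1)
    logits = zi @ zs.T / temperature  # similarity of every image-spectrum pair
    labels = torch.arange(len(zi))    # matched pairs lie on the diagonal
    # Symmetric InfoNCE: each image must pick out its own spectrum, and vice versa.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))

images = torch.randn(32, 1024)   # stand-in batch of image features
spectra = torch.randn(32, 4000)  # ... and their matching spectra
loss = contrastive_loss(images, spectra)  # minimize over paired survey data
```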
Despite these early successes, however, there is a tension that is yet to be fully explored between developing general multipurpose models and specific solutions tailored to particular domains or tasks. In contrast with many industry applications of foundation models, applications in the physical sciences often come with domain-specific knowledge, such as ab initio (first-principles) constraints on physical systems. While general foundation models can potentially uncover novel latent structures across scientific domains, incorporating such constraints may lead to more effective and interpretable models for particular applications (Park, Harris, and Ostdiek 2023). How best to combine the flexibility of general models with the advantages of domain insight continues to be explored.
In addition to the question of the most effective way to use generative AI in the context of the physical sciences, the nature of the field also leads to several caveats to consider. One key challenge is that science is ultimately about (human) understanding rather than merely predicting or generating. In this direction, a key open question is how to extract physical insights from learned foundation models in a way that genuinely advances scientific understanding (Krenn et al. 2022).
Humans have drawn inspiration from natural systems in designing artificial systems since time immemorial. We see striking examples across various fields of engineering, such as construction, aviation, and materials science, and the design of AI is no exception. The dynamics of physical systems as they move toward their minimal energy configurations provide natural examples of efficient solutions to optimization problems, which intelligent artificial systems can learn to emulate. It is not surprising, then, that alongside the growing impact of generative AI on discovery in the physical sciences, there is also great potential in the continued use of insights from the physical sciences to transform generative AI.
Physics-based algorithms have been described as “unreasonably effective” for AI, and physics has provided inspiration in particular for many of the probabilistic algorithms in which generative AI is rooted. As previously noted, diffusion models, which kicked off the generative AI revolution in the space of images and have since proven equally effective for video and voice generation, have their roots in the diffusion process from non-equilibrium thermodynamics in physics (Sohl-Dickstein et al. 2015). Recently, physics-inspired models other than diffusion have been used to develop new, performant classes of generative AI models (Liu et al. 2023; Xu et al. 2022, 2023).
The importance of physics insights for the development of generative AI methods should not be a surprise; historically, the development of sampling algorithms, which are the conceptual backbone of generative AI, has been closely intertwined with methods and applications in physics. This is partly a consequence of the fact that, out of necessity, rigorous statistical methodology is a central part of the analysis of physics data and processes. As examples, Markov chain Monte Carlo and Hamiltonian/Hybrid Monte Carlo (Duane et al. 1987) were both inspired by and initially developed for physics applications. Similarly, variational inference, a powerful technique for performing high-dimensional inference that is ubiquitous in AI, originated from the free energy minimization principle in statistical mechanics (Peterson and Anderson 1987). Physics-based algorithms continue to yield effective frameworks for optimization (De Luca and Silverstein 2022) and sampling (Robnik et al. 2023) today. In general, physics provides a rich set of abstractions and a framework for developing new statistical and computational techniques for AI, and it will be important to continue exploiting this connection toward future advancements and innovations in generative AI.
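The physics lineage is visible in the algorithms themselves. Below is a minimal Hamiltonian Monte Carlo sketch, in which sampling is literally framed as simulating Hamiltonian dynamics with the negative log-density acting as a potential energy; the standard-normal target and integrator settings are toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def U(x):      return 0.5 * np.sum(x**2)  # potential energy: -log p(x) + const
def grad_U(x): return x

def hmc_step(x, step=0.1, n_leapfrog=20):
    p = rng.normal(size=x.shape)           # resample momenta from a Gaussian
    x_new, p_new = x.copy(), p.copy()
    p_new -= 0.5 * step * grad_U(x_new)    # leapfrog integration of the dynamics
    for _ in range(n_leapfrog - 1):
        x_new += step * p_new
        p_new -= step * grad_U(x_new)
    x_new += step * p_new
    p_new -= 0.5 * step * grad_U(x_new)
    # Metropolis accept/reject on the total energy keeps the chain exact.
    dH = (U(x_new) + 0.5 * p_new @ p_new) - (U(x) + 0.5 * p @ p)
    return x_new if np.log(rng.uniform()) < -dH else x

x = np.zeros(10)
samples = []
for _ in range(5_000):
    x = hmc_step(x)
    samples.append(x)
```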
At present, generative AI algorithms are predominantly realized on graphics processing units (GPUs), which, while effective, were not specifically designed for this purpose. There is growing interest in developing new hardware solutions from the ground up that specifically cater to AI training and inference (e.g., TPUs, IPUs, etc.). Just as physics has inspired generative AI algorithms, physics-inspired processing technologies are being explored for their potential to accelerate generative AI workloads and enable new applications. For example, photonic chips (Hamerly et al. 2019; Hamerly, Bandyopadhyay, and Englund 2022), which utilize light for computation, and thermodynamic computing (Coles et al. 2023), which harnesses naturally occurring noise in electrical circuits to realize generative AI algorithms, are being developed with generative AI applications in mind. Analog computation (Camsari et al. 2017; Camsari, Sutton, and Datta 2019; Onen et al. 2022; Ramakrishnan and Hasler 2014), which uses continuous physical quantities to perform calculations, is another area of interest: by directly exploiting the natural dynamics of physical systems, analog computers aim to efficiently solve complex optimization problems relevant to generative AI. These approaches fall under the broader category of neuromorphic computing, which seeks to emulate the structure and function of biological neural networks in hardware. Such novel computing paradigms are particularly relevant as the energy cost of generative AI soars, potentially leading to bottlenecks in scaling up algorithms on current hardware (Crawford 2024).
A specific example is provided by the subdomain of particle physics, in which the need for fast, real-time event filtering at collider experiments has driven the development of specialized hardware and algorithms capable of processing vast amounts of data with extremely low latency. The stringent latency requirements, which are more demanding than in other domains, have spurred innovations in both hardware (Duarte et al. 2018) and software (FastML Team 2023). For example, custom hardware triggers (i.e., data filters) based on field-programmable gate arrays and application-specific integrated circuits are being developed to process collision events within microseconds, while using only a fraction of the available computing resources. In parallel, novel fast ML algorithms, such as those for anomaly detection, are being designed to run efficiently on these hardware platforms (Duarte et al. 2018). The innovations arising from these physics-inspired hardware and algorithmic developments could have far-reaching implications for low-latency and low-power AI applications, including generative AI, in domains beyond particle physics like health monitoring, wireless networking, and edge computing (Deiana et al. 2022).
Principles from physics, particularly statistical mechanics and theoretical physics, have recently been applied to elucidate the theoretical foundations underpinning the training dynamics, robustness, and generalization properties of various AI algorithms, including transformer models (Geshkovski et al. 2024). Similarly, concepts from effective field theory (Demirtas et al. 2023; Halverson, Maiti, and Stoner 2021) and renormalization group theory (Cotler and Rezchikov 2023a, 2023b; Lin, Tegmark, and Rolnick 2017; Mehta and Schwab 2014) have been employed to study the learning dynamics and generalization properties of deep learning models. Moreover, the influential work on scaling laws for LLMs (Kaplan et al. 2020), which showed that the performance of language models scales predictably with the amount of training data, compute, and model size, was directly inspired by approaches commonly employed in physics. The phenomenon of “grokking” in deep learning, where neural networks trained for a very long period can abruptly transition to generalizing well, can be elegantly framed as a phase transition (Liu et al. 2022), drawing parallels to similar phenomena in physical systems. Analogies between classical mechanics and the learning dynamics of neural networks have been used to understand existing optimization and regularization strategies by leveraging symmetries inherent to the learning process (Tanaka and Kunin 2021).
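The empirical content of a scaling law is simple to state: performance follows a power law in quantities such as model size, which appears as a straight line in log-log space. The fit below uses made-up numbers purely to illustrate the functional form L(N) = a · N^(−α).

```python
import numpy as np

N = np.array([1e6, 1e7, 1e8, 1e9])  # parameter counts (illustrative)
L = np.array([4.0, 3.1, 2.4, 1.9])  # validation losses (made-up numbers)

# A power law L(N) = a * N**(-alpha) is linear in log-log coordinates.
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
alpha, a = -slope, np.exp(intercept)
print(f"L(N) ~ {a:.2f} * N^(-{alpha:.3f})")
```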
While these concepts have shown initial promise in improving our understanding and control of AI systems, the field is still in its early stages, and there remain challenges in translating theoretical insights into practical applications. The rapidly growing scale and complexity of AI architectures used in real-world applications often diverge from the ideal systems that can be easily understood through theoretical analysis. Nonetheless, insights from physics could help bridge the gap between low-level mechanisms and high-level emergent behavior of AI systems and provide a unified theoretical framework for understanding the diverse range of architectures and learning paradigms used in modern AI.
Physics applications often have specific requirements for AI algorithms to be effective, which are distinct from those in other natural science domains. For example, when used for discovering drugs with certain desired properties and downstream uses, it may be acceptable for AI algorithms to be imperfect, as long as they speed up the traditional iteration pipeline and tag candidate molecules for manual follow-up. In contrast, deploying an AI algorithm to look for new particles in collider data requires that it be robust and statistically well-calibrated; missing real signals present in the data is not acceptable (Kitouni, Nolte, and Williams 2023). Moreover, in theoretical applications, there are often stringent requirements for exactness guarantees (Gukov, Halverson, and Ruehle 2024), or that various symmetries, invariances, or limits of physical systems be precisely respected by an AI architecture. These demands make physics applications particularly interesting from an algorithmic development perspective, because they necessitate novel advancements rather than out-of-the-box application of existing AI approaches.
As an example, in the field of atomistic systems (e.g., molecules, materials), the inherent symmetries and invariances of the physical systems under study have motivated the development of equivariant (i.e., symmetry-preserving) neural networks (Duval et al. 2024; Geiger and Smidt 2022). These architectures bake in the relevant symmetries (e.g., rotational, translational invariance) directly into the model structure, leading to improved sample efficiency and generalization. In parallel, the design of equivariant ML architectures incorporating complex group symmetries was pioneered through the development of ML-based sampling techniques for lattice quantum field theory (Kanwar et al. 2020). Equivariant models have since also found applicability beyond these systems in domains such as computer vision, graphics (Bronstein et al. 2021), and robotics (Brehmer et al. 2023). While this is just one example, it demonstrates the recurring motif that the specific requirements of applications of AI in the physical sciences demand innovation, which leads to advances in other areas.
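The simplest way to see why symmetry-aware design helps is that some feature sets are invariant by construction. In the toy example below, a model fed sorted pairwise distances rather than raw coordinates is automatically unaffected by rotations and translations of its input; full equivariant architectures generalize this idea by letting internal features transform predictably rather than remain fixed.

```python
import numpy as np

def invariant_features(coords):
    # coords: (n_atoms, 3) array. Sorted pairwise distances are invariant
    # under rotations, translations, and atom relabeling.
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    iu = np.triu_indices(len(coords), k=1)
    return np.sort(dists[iu])

rng = np.random.default_rng(0)
mol = rng.normal(size=(5, 3))  # a toy 5-atom "molecule"

# A random orthogonal matrix (via QR decomposition) plus a random translation.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
mol_moved = mol @ Q.T + rng.normal(size=3)

assert np.allclose(invariant_features(mol), invariant_features(mol_moved))
```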
As AI continues to advance, it is crucial to recognize the value of physics insights in driving the field forward. Given the unique set of challenges in applying generative AI to physics, the successes may appear more modest compared with other domains where these methods have been applied. However, it is these very challenges that make physics valuable for developing new methods. By pushing the boundaries of what is possible and necessitating the creation of novel techniques and architectures, the unique demands of physics problems are driving innovations that can benefit the entire field of generative AI, and it is possible that future breakthroughs will emerge from the intersection of AI and physics. By fostering collaboration between academia and industry, we can harness the power of this intersection to address increasingly complex challenges in both AI and the physical sciences.
While transformative impacts are already fully evident in some domains of science, realizing the full potential of generative AI in the physical sciences will require a concerted effort to foster collaboration, share resources, and build bridges between the AI and physics communities. Here we outline some key challenges and opportunities in this direction.
Cultivating a new generation of talent at the interface of AI and the physical sciences is crucial for the long-term success of the field. Recognizing and rewarding interdisciplinary research, and fostering a culture at universities that values collaboration and cross-pollination across disciplines, will help to attract and retain researchers. It can, however, be challenging at present for curiosity-driven research at the intersection of physics and AI to find an institutional home, because it often falls between traditional subject boundaries. Creating dedicated centers or institutes (e.g., NSF AI Institutes) and independently funded positions (e.g., Schmidt AI Fellowships) can provide support for this kind of interdisciplinary work for researchers who may otherwise fall through departmental cracks.
Although such efforts should be enthusiastically encouraged, they typically support research at the student and postdoctoral levels, rather than the faculty level, without a natural off-ramp into permanent academic positions. In fact, researchers at the intersection of AI and disciplines like physics may find themselves becoming less employable for traditional academic positions over time, as their work becomes less recognizably mainstream in domain departments. This leads faculty aspirants to hedge against diving fully into interdisciplinary research, instead maintaining one foot in their home discipline to ensure a viable career path, which sets back both progress and the concentration of talent between fields.
To fully realize the potential of interdisciplinary research, long-term career paths must be created alongside short-term opportunities. This may involve rethinking traditional departmental structures and tenure processes to allow for more flexibility, creativity, and risk-taking. It may also require developing new kinds of permanent positions (e.g., dedicated interdisciplinary lines) that provide stability for researchers working between fields. When such positions are intended to be jointly appointed across departments (e.g., a domain department and a computational or engineering-focused department), care must be taken to structure them, especially at the junior faculty level, in a way that provides a clear departmental home and unified set of expectations for service and tenure, rather than effectively defaulting to a combination of the standard requirements from each department, which can place an undue burden on early-career researchers.
In addition to nurturing individual talent, fostering the growth of the interdisciplinary community as a whole is vital. Joint venues, such as ML sessions at physics conferences and topical physical sciences workshops at AI conferences (e.g., the “Machine Learning and the Physical Sciences” and “AI for Science” series), can serve as points of convergence for the two communities. Interdisciplinary journals like Machine Learning: Science and Technology are another example of a platform aiming to bridge work between the two communities. Besides providing a platform to submit and showcase work at the intersection of physics and AI, they also encourage cross-pollination of ideas, for example, by exposing domain scientists attending ML conferences to broader developments in AI.
While these joint venues have been helpful in launching the community at the interface of physics and AI, the field is maturing to a point where it would benefit from more dedicated spaces. Presently, research in this area can struggle to find a home and effective venue for broad engagement across subfields. Encouraging and supporting the organization of dedicated conferences, workshops, and journals focused specifically on the intersection of AI and physics, in addition to continuing support for joint sessions at broader venues, can help sustain the growth of this interdisciplinary community. These dedicated spaces can provide a forum for deep discussion of the unique challenges and opportunities at this intersection and help to build a shared identity and purpose for the field.
It is also necessary to continue to develop new models for partnership between academia and industry. Many of the most exciting advances in AI are happening in the private sector, particularly in areas with immediate applications or profit potential such as materials science (Merchant et al. 2023), climate science (Kochkov et al. 2024), and drug design (Ingraham et al. 2023). At the same time, the culture of open, curiosity-driven inquiry that is central to much of physics research should be protected and offers many potential long-term avenues for advances, as outlined in previous sections. Creating opportunities for collaboration and knowledge-sharing, while respecting the different incentives at play, could help to accelerate progress while upholding the values of the scientific enterprise. Particular directions for encouraging academia–industry partnerships include industry-sponsored research positions and data-sharing agreements.
Another opportunity is provided by shared benchmarks and challenge datasets, which can drive progress and enable meaningful comparisons between different AI approaches, while also making problems in physics more accessible to the AI community. The Open Catalyst series of benchmarks and challenges (Tran et al. 2023), a joint initiative between Meta AI and Carnegie Mellon University, is an example of this kind of industry–academia partnership, in this case targeting the design of AI methods for atomistic systems. As another example, organizations like MLCommons aim to bridge domains through standardized benchmarking and shared models and have enabled unbiased exploration of model and processing architecture design. The physical sciences would similarly benefit from a suite of benchmarks spanning disciplines and common tasks. These benchmarks could serve as a focal point for collaboration and help to establish a common language and set of priorities across fields.
Continued progress at the intersection of AI and physics will be driven by the upcoming generation of researchers. The current generation of researchers working at this interface are typically physicists who have learned AI methods “on the job,” or AI researchers with an interest in physical sciences applications. While this has been sufficient to launch the field, as the intersection continues to become more sophisticated and specialized, it is increasingly important to grow a new generation of talent that is truly “multilingual” across disciplines and deeply trained at the intersection of these fields from the outset. Education plays a crucial role in fostering this kind of interdisciplinary expertise.
Traditionally, curricula in physical sciences domain departments (e.g., Physics, Materials Science, etc.) and Computer Science departments have been siloed, with limited overlap and collaboration. While the materials needed to learn the necessary skills for working at the intersection of AI and the physical sciences exist on both sides, there is a growing need for courses that are specifically designed to align with scientific interests and applications. In particular, application of AI to the physical sciences requires a unique skill set that could be taught explicitly, whereby both physical domain knowledge and computational tools are utilized cohesively. This requires cooperation and agreement from both departments on the content and structure of these offerings.
At the undergraduate level, courses that introduce physics students to key concepts and techniques in data science and ML, with a focus on applications in physics, can help to build a foundation for interdisciplinary work. By providing hands-on experience with real-world physics datasets and computational tools, such courses can help students develop the practical skills needed to apply AI methods in their research. Interdisciplinary degree programs at the graduate level, such as the PhD in Physics, Statistics, and Data Science at MIT, are another mechanism for fostering cross-pollination between AI and physics by providing students with deep training across fields.
In many ways, the physical sciences are shielded from some of the most pressing ethical concerns surrounding AI due to the nature of physical data and the lack of direct human consequences in most applications. This can allow for greater freedom in exploring and developing AI techniques within the context of physics research. However, as the scope and impact of AI in physics continue to grow, new ethical questions emerge. One example is the increasing scale of computational resources required for state-of-the-art AI models in physics. The energy consumption and carbon footprint associated with training and running these models raise concerns around sustainability and the responsible allocation of resources. Efforts to develop more efficient AI architectures and algorithms, often inspired by physics itself (as described previously), can help to soften these concerns.
Nevertheless, while the immediate applications of AI to fundamental physics may not have direct human consequences, it is crucial for researchers to consider the potential downstream implications of their work. The techniques and insights developed in the context of physics could eventually be applied to other domains where the ethical stakes are higher. Unlike previous technological developments in physics, AI has the potential to rapidly transform multiple aspects of society in ways that may be difficult to predict or control. By actively participating in the conversation around the ethical implications of their work, physicists can help shape the responsible development and deployment of AI technologies.
In this impact paper, we have discussed the significant potential for bidirectional impacts between generative AI and the physical sciences. We briefly summarize these below.
Toward facilitating scientific discovery in physics, generative AI offers a powerful set of tools for accelerating simulations, enhancing anomaly detection pipelines, and aiding in the development and identification of new theories and concepts. At the same time, physics provides a rich set of abstractions and methods for developing new generative AI methods.
Realizing the full potential of generative AI in physics will require addressing the unique challenges posed by physics applications. The need for robustness, precision, and interpretability in physics necessitates the development of novel AI architectures and algorithms that are specifically tailored to meet these requirements. This presents an opportunity for the physics community to drive innovations in AI that will have implications beyond the physical sciences.
The physical sciences carry unique computational challenges that have historically played a major role in pushing methodological developments in AI forward. Contemporary problems across disciplines in the physical sciences similarly offer significant opportunities to spur methodological developments in generative AI, and novel algorithms can be motivated and inspired by problems in the physical sciences.
The creation of viable long-term career pathways for researchers at the intersection of AI and the physical sciences is crucial for driving the field forward. This may involve rethinking traditional departmental structures and tenure processes to allow for more flexibility, creativity, and risk-taking. Faculty appointments shared across departments or dedicated interdisciplinary lines are one avenue in this direction.
Fostering collaboration across disciplines is crucial for successfully advancing generative AI in the physical sciences. Revisions to departmental structure and dedicated interdisciplinary venues, such as conferences and journals, are needed to provide a platform for researchers from both AI and physics to exchange ideas, share progress, and form new collaborations. These venues play a key role in building a strong community and facilitating the cross-pollination of ideas between these fields.
Education is a critical component in preparing the next generation of researchers to work at the intersection of AI and physics. Interdisciplinary educational programs, such as courses, majors, and graduate programs that combine training in both AI and physics, are essential to equip students with the unique skills and knowledge needed to tackle challenges and opportunities at this interface.
Exploring new models for partnership between academia and industry is important for accelerating progress and ensuring that the advances made in generative AI for physics are translated into real-world impact. These partnerships can facilitate knowledge-sharing and provide access to resources (e.g., compute) and expertise. At the same time, maintaining a balance between profit-driven industry research and curiosity-driven scientific inquiry is crucial, as the latter is the key driver of progress in the physical sciences.
Articulating a clear and compelling vision for the future of AI in physics can help to build broad support and momentum for the field. It is essential to communicate to policymakers, funders, and the public the significant potential for the combination of generative AI with the physical sciences to deepen our understanding of the Universe, drive breakthroughs that benefit society, and push the boundaries of AI. Coming together as a community to craft a roadmap for the field, identifying its key challenges and setting out a plan for tackling them, would be a productive step in this direction.
The intersection of generative AI and the physical sciences presents a significant opportunity to advance both fields and drive scientific discovery. By fostering interdisciplinary collaboration, investing in educational initiatives, and exploring new models for partnership across sectors, we can harness the full potential of this intersection. Doing so will not only accelerate progress in the physical sciences but also contribute to the development of more robust, precise, and interpretable AI systems with far-reaching implications across and beyond scientific domains.
We thank all the participants of the Symposium on the Impact of Generative AI in the Physical Sciences, held at MIT March 14–15, 2024, for their insightful talks and discussions, which helped inform some of the material in this impact paper: Thea Aarrestad (ETH Zurich), Simon Batzner (Google DeepMind), Song Han (MIT), David Hogg (New York University & Flatiron Institute), Daniel Huttenlocher (MIT), Pavel Izmailov (OpenAI), Jared Kaplan (Anthropic), Vijay Reddi (Harvard University), Anna Scaife (University of Manchester), Matt Schwartz (Harvard University and IAIFI), Hidenori Tanaka (Harvard), and Kevin Yang (Microsoft Research). We thank Jesse Thaler (MIT and IAIFI) and Mike Williams (MIT and IAIFI) for helpful feedback.
Aad, G., B. Abbott, K. Abeling, N. J. Abicht, S. H. Abidi, A. Aboulhorma, H. Abramowicz, et al. 2024. “Search for New Phenomena in Two-Body Invariant Mass Distributions Using Unsupervised Machine Learning for Anomaly Detection at √𝑠=13 TeV with the ATLAS Detector.” Physical Review Letters 132, no. 8 (February): 081801. https://doi.org/10.1103/PhysRevLett.132.081801.
Aarrestad, T., M. van Beekveld, M. Bona, A. Boveia, S. Caron, J. Davies, A. De Simone, et al. 2022. “The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider.” SciPost Physics 12, no. 1 (January): 043. https://doi.org/10.21468/SciPostPhys.12.1.043.
Belis, Vasilis, Patrick Odagiu, and Thea Klaeboe Aarrestad. 2024. “Machine Learning for Anomaly Detection in Particle Physics.” Reviews in Physics 12 (December): 100091. https://doi.org/10.1016/j.revip.2024.100091.
Birk, Joschka, Anna Hallin, and Gregor Kasieczka. 2024. “OmniJet-α: The First Cross-Task Foundation Model for Particle Physics.” Preprint, submitted March 8, 2024. https://doi.org/10.48550/arXiv.2403.05618.
Bommasani, Rishi, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, et al. 2022. “On the Opportunities and Risks of Foundation Models.” Preprint, submitted August 16, 2021. https://doi.org/10.48550/arXiv.2108.07258.
Bose, Avishek Joey, Tara Akhound-Sadegh, Guillaume Huguet, Kilian Fatras, Jarrid Rector-Brooks, Cheng-Hao Liu, Andrei Cristian Nica, Maksym Korablyov, Michael Bronstein, and Alexander Tong. 2023. “SE(3)-Stochastic Flow Matching for Protein Backbone Generation.” Preprint, submitted October 2023. https://doi.org/10.48550/arXiv.2310.02391.
Boyda, Denis, Salvatore Calì, Sam Foreman, Lena Funcke, Daniel C. Hackett, Yin Lin, Gert Aarts, et al. 2022. “Applications of Machine Learning to Lattice Quantum Field Theory.” Preprint, submitted February 10, 2022. https://doi.org/10.48550/arXiv.2202.05838.
Boyda, Denis, Gurtej Kanwar, Sébastien Racanière, Danilo Jimenez Rezende, Michael S. Albergo, Kyle Cranmer, Daniel C. Hackett, and Phiala E. Shanahan. 2020. “Sampling Using SU(N) Gauge Equivariant Flows.” Preprint, submitted August 12, 2020. https://doi.org/10.48550/arXiv.2008.05456.
Brehmer, Johann, Joey Bose, Pim de Haan, and Taco S. Cohen. 2023. “EDGI: Equivariant Diffusion for Planning with Embodied Agents.” Proceedings of the 37th International Conference on Neural Information Processing Systems, edited by A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, 63818–34. Red Hook, NY: Curran Associates.
Bronstein, Michael M., Joan Bruna, Taco Cohen, and Petar Veličković. 2021. “Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges.” Preprint, submitted April 27, 2021. https://doi.org/10.48550/arXiv.2104.13478.
Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, et al. 2020. “Language Models Are Few-Shot Learners.” Preprint, submitted May 28, 2020. https://doi.org/10.48550/arXiv.2005.14165.
Buhmann, Erik, Cedric Ewen, Darius A. Faroughy, Tobias Golling, Gregor Kasieczka, Matthew Leigh, Guillaume Quétant, John Andrew Raine, Debajyoti Sengupta, and David Shih. 2023. “EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion.” Preprint, submitted September 29, 2023. https://doi.org/10.48550/arXiv.2310.00049.
Buhmann, Erik, Gregor Kasieczka, and Jesse Thaler. 2023. “EPiC-GAN: Equivariant Point Cloud Generation for Particle Jets.” SciPost Physics 15, no. 4 (October): 130. https://doi.org/10.21468/SciPostPhys.15.4.130.
Butter, Anja, Tilman Plehn, Steffen Schumann, Simon Badger, Sascha Caron, Kyle Cranmer, Francesco Armando Di Bello, et al. 2023. “Machine Learning and LHC Event Generation.” SciPost Physics 14, no. 4 (April): 079. https://doi.org/10.21468/SciPostPhys.14.4.079.
Cai, Tianji, Garrett W. Merz, François Charton, Niklas Nolte, Matthias Wilhelm, Kyle Cranmer, and Lance J. Dixon. 2024. “Transforming the Bootstrap: Using Transformers to Compute Scattering Amplitudes in Planar N = 4 Super Yang-Mills Theory.” Preprint, submitted May 9, 2024. https://doi.org/10.48550/arXiv.2405.06107.
Camsari, Kerem Y., Brian M. Sutton, and Supriyo Datta. 2019. “P-Bits for Probabilistic Spin Logic.” Applied Physics Reviews 6, no. 1 (March): 011305. https://doi.org/10.1063/1.5055860.
Camsari, Kerem Yunus, Rafatul Faria, Brian M. Sutton, and Supriyo Datta. 2017. “Stochastic 𝑝-Bits for Invertible Logic.” Physical Review X 7, no. 3 (July): 031014. https://doi.org/10.1103/PhysRevX.7.031014.
CMS Collaboration. 2024. “Model-Agnostic Search for Dijet Resonances with Anomalous Jet Substructure in Proton-Proton Collisions at √s = 13 TeV.” CMS-PAS-EXO-22-026, CERN, March 20, 2024. https://cds.cern.ch/record/2892677.
Coles, Patrick J., Collin Szczepanski, Denis Melanson, Kaelan Donatella, Antonio J. Martinez, and Faris Sbahi. 2023. “Thermodynamic AI and the Fluctuation Frontier.” Preprint, submitted February 9, 2023. https://doi.org/10.48550/arXiv.2302.06584.
Cotler, Jordan, and Semon Rezchikov. 2023a. “Renormalization Group Flow as Optimal Transport.” Physical Review D 108, no. 2 (July): 025003. https://doi.org/10.1103/PhysRevD.108.025003.
———. 2023b. “Renormalizing Diffusion Models.” Preprint, submitted August 23, 2023. https://doi.org/10.48550/arXiv.2308.12355.
Cranmer, Kyle, Johann Brehmer, and Gilles Louppe. 2020. “The Frontier of Simulation-Based Inference.” Proceedings of the National Academy of Sciences 117, no. 48 (May): 30055–62. https://doi.org/10.1073/pnas.1912789117.
Cranmer, Kyle, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, and Phiala E. Shanahan. 2023. “Advances in Machine-Learning-Based Sampling Motivated by Lattice Quantum Chromodynamics.” Nature Reviews Physics 5, no. 9 (August): 526–35. https://doi.org/10.1038/s42254-023-00616-w.
Crawford, Kate. 2024. “Generative AI’s Environmental Costs Are Soaring — and Mostly Secret.” Nature 626, no. 8000 (February): 693. https://doi.org/10.1038/d41586-024-00478-x.
Cuesta-Lazaro, Carolina, and Siddharth Mishra-Sharma. 2023. “A Point Cloud Approach to Generative Modeling for Galaxy Surveys at the Field Level.” Preprint, submitted November 28, 2023. https://doi.org/10.48550/arXiv.2311.17141.
Deiana, Allison McCarn, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, et al. 2022. “Applications and Techniques for Fast Machine Learning in Science.” Frontiers in Big Data 5 (April): 787421. https://doi.org/10.3389/fdata.2022.787421.
De Luca, G. Bruno, and Eva Silverstein. 2022. “Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization.” Preprint, submitted January 26, 2022. https://doi.org/10.48550/arXiv.2201.11137.
Demirtas, Mehmet, James Halverson, Anindita Maiti, Matthew D. Schwartz, and Keegan Stoner. 2023. “Neural Network Field Theories: Non-Gaussianity, Actions, and Locality.” Preprint, submitted July 6, 2023. https://doi.org/10.48550/arXiv.2307.03223.
Denby, B. 1988. “Neural Networks and Cellular Automata in Experimental High Energy Physics.” Computer Physics Communications 49, no. 3 (June): 429–48. https://doi.org/10.1016/0010-4655(88)90004-5.
Dersy, Aurélien, Matthew D. Schwartz, and Xiaoyuan Zhang. 2022. “Simplifying Polylogarithms with Machine Learning.” Preprint, submitted June 8, 2022. https://doi.org/10.48550/arXiv.2206.04115.
DeZoort, Gage, Peter W. Battaglia, Catherine Biscarat, and Jean-Roch Vlimant. 2023. “Graph Neural Networks at the Large Hadron Collider.” Nature Reviews Physics 5, no. 5 (May): 281–303. https://doi.org/10.1038/s42254-023-00569-0.
Douglas, Michael R. 2022. “Machine Learning as a Tool in Theoretical Science.” Nature Reviews Physics 4, no. 3 (February): 145–46. https://doi.org/10.1038/s42254-022-00431-9.
Duane, Simon, A. D. Kennedy, Brian J. Pendleton, and Duncan Roweth. 1987. “Hybrid Monte Carlo.” Physics Letters B 195, no. 2 (September): 216–22. https://doi.org/10.1016/0370-2693(87)91197-X.
Duarte, Javier, Song Han, Philip Harris, Sergo Jindariani, Edward Kreinar, Benjamin Kreis, Jennifer Ngadiuba, et al. 2018. “Fast Inference of Deep Neural Networks in FPGAs for Particle Physics.” Journal of Instrumentation 13, no. 7 (July): P07027. https://doi.org/10.1088/1748-0221/13/07/P07027.
Duval, Alexandre, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, and Michael Bronstein. 2024. “A Hitchhiker’s Guide to Geometric GNNs for 3D Atomic Systems.” Preprint, submitted December 12, 2023. https://doi.org/10.48550/arXiv.2312.07511.
FastML Team. 2023. “hls4ml.” Zenodo, December 19, 2023. https://doi.org/10.5281/zenodo.1201549.
Geiger, Mario, and Tess Smidt. 2022. “e3nn: Euclidean Neural Networks.” Preprint, submitted July 18, 2022. https://doi.org/10.48550/arXiv.2207.09453.
Geshkovski, Borjan, Cyril Letrouit, Yury Polyanskiy, and Philippe Rigollet. 2024. “A Mathematical Perspective on Transformers.” Preprint, submitted December 17, 2023. https://doi.org/10.48550/arXiv.2312.10794.
Golling, Tobias, Lukas Heinrich, Michael Kagan, Samuel Klein, Matthew Leigh, Margarita Osadchy, and John Andrew Raine. 2024. “Masked Particle Modeling on Sets: Towards Self-Supervised High Energy Physics Foundation Models.” Preprint, submitted January 24, 2024. https://doi.org/10.48550/arXiv.2401.13537.
Gukov, Sergei, James Halverson, and Fabian Ruehle. 2024. “Rigor with Machine Learning from Field Theory to the Poincaré Conjecture.” Nature Reviews Physics 6, no. 5 (May): 310–19. https://doi.org/10.1038/s42254-024-00709-0.
Halverson, James, Anindita Maiti, and Keegan Stoner. 2021. “Neural Networks and Quantum Field Theory.” Machine Learning: Science and Technology 2, no. 3 (April): 035002. https://doi.org/10.1088/2632-2153/abeca3.
Hamerly, Ryan, Saumil Bandyopadhyay, and Dirk Englund. 2022. “Asymptotically Fault-Tolerant Programmable Photonics.” Nature Communications 13, no. 1 (November): 6831. https://doi.org/10.1038/s41467-022-34308-3.
Hamerly, Ryan, Liane Bernstein, Alexander Sludds, Marin Soljačić, and Dirk Englund. 2019. “Large-Scale Optical Neural Networks Based on Photoelectric Multiplication.” Physical Review X 9, no. 2 (May): 021032. https://doi.org/10.1103/PhysRevX.9.021032.
Harris, Philip, Michael Kagan, Jeffrey Krupa, Benedikt Maier, and Nathaniel Woodward. 2024. “Re-Simulation-Based Self-Supervised Learning for Pre-Training Foundation Models.” Preprint, submitted March 11, 2024. https://doi.org/10.48550/arXiv.2403.07066.
Ingraham, John B., Max Baranov, Zak Costello, Karl W. Barber, Wujie Wang, Ahmed Ismail, Vincent Frappier, et al. 2023. “Illuminating Protein Space with a Programmable Generative Model.” Nature 623, no. 7989 (November): 1070–78. https://doi.org/10.1038/s41586-023-06728-8.
Kanwar, Gurtej, Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Sébastien Racanière, Danilo Jimenez Rezende, and Phiala E. Shanahan. 2020. “Equivariant Flow-Based Sampling for Lattice Gauge Theory.” Physical Review Letters 125, no. 12 (September): 121601. https://doi.org/10.1103/PhysRevLett.125.121601.
Kaplan, Jared, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. 2020. “Scaling Laws for Neural Language Models.” Preprint, submitted January 23, 2020. https://doi.org/10.48550/arXiv.2001.08361.
Kasieczka, Gregor, Benjamin Nachman, David Shih, Oz Amram, Anders Andreassen, Kees Benkendorfer, Blaz Bortolato, et al. 2021. “The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics.” Reports on Progress in Physics 84, no. 12 (December): 124201. https://doi.org/10.1088/1361-6633/ac36b9.
Kitouni, Ouail, Niklas Nolte, and Mike Williams. 2023. “Robust and Provably Monotonic Networks.” Machine Learning: Science and Technology 4, no. 3 (August): 035020. https://doi.org/10.1088/2632-2153/aced80.
Kochkov, Dmitrii, Janni Yuval, Ian Langmore, Peter Norgaard, Jamie Smith, Griffin Mooers, Milan Klöwer, et al. 2024. “Neural General Circulation Models for Weather and Climate.” Preprint, submitted November 13, 2023. https://doi.org/10.48550/arXiv.2311.07222.
Krenn, Mario, Robert Pollice, Si Yue Guo, Matteo Aldeghi, Alba Cervera-Lierta, Pascal Friederich, Gabriel dos Passos Gomes, et al. 2022. “On Scientific Understanding with Artificial Intelligence.” Nature Reviews Physics 4, no. 12 (December): 761–69. https://doi.org/10.1038/s42254-022-00518-3.
Lai, Ching-Yao, Pedram Hassanzadeh, Aditi Sheshadri, Maike Sonnewald, Raffaele Ferrari, and Venkatramani Balaji. 2024. “Machine Learning for Climate Physics and Simulations.” Preprint, submitted April 20, 2024. https://doi.org/10.48550/arXiv.2404.13227.
Legin, Ronan, Matthew Ho, Pablo Lemos, Laurence Perreault-Levasseur, Shirley Ho, Yashar Hezaveh, and Benjamin Wandelt. 2023. “Posterior Sampling of the Initial Conditions of the Universe from Non-linear Large Scale Structures Using Score-Based Generative Models.” Preprint, submitted April 7, 2023. https://doi.org/10.48550/arXiv.2304.03788.
Letizia, Marco, Gianvito Losapio, Marco Rando, Gaia Grosso, Andrea Wulzer, Maurizio Pierini, Marco Zanetti, and Lorenzo Rosasco. 2022. “Learning New Physics Efficiently with Nonparametric Methods.” The European Physical Journal C 82, no. 10 (October): 879. https://doi.org/10.1140/epjc/s10052-022-10830-y.
Li, Jizhou, Xiaobiao Huang, Piero Pianetta, and Yijin Liu. 2021. “Machine-and-Data Intelligence for Synchrotron Science.” Nature Reviews Physics 3, no. 12 (December): 766–68. https://doi.org/10.1038/s42254-021-00397-0.
Lin, Henry W., Max Tegmark, and David Rolnick. 2017. “Why Does Deep and Cheap Learning Work So Well?” Journal of Statistical Physics 168, no. 6 (September): 1223–47. https://doi.org/10.1007/s10955-017-1836-5.
Liu, Ziming, Ouail Kitouni, Niklas Nolte, Eric J. Michaud, Max Tegmark, and Mike Williams. 2022. “Towards Understanding Grokking: An Effective Theory of Representation Learning.” Preprint, submitted May 20, 2022. https://doi.org/10.48550/arXiv.2205.10343.
Liu, Ziming, Di Luo, Yilun Xu, Tommi Jaakkola, and Max Tegmark. 2023. “GenPhys: From Physical Processes to Generative Models.” Preprint, submitted April 5, 2023. https://doi.org/10.48550/arXiv.2304.02637.
McCabe, Michael, Bruno Régaldo-Saint Blancard, Liam Holden Parker, Ruben Ohana, Miles Cranmer, Alberto Bietti, Michael Eickenberg, et al. 2023. “Multiple Physics Pretraining for Physical Surrogate Models.” Preprint, submitted October 4, 2023. https://doi.org/10.48550/arXiv.2310.02994.
Mehta, Pankaj, and David J. Schwab. 2014. “An Exact Mapping between the Variational Renormalization Group and Deep Learning.” Preprint, submitted October 14, 2014. https://doi.org/10.48550/arXiv.1410.3831.
Melchior, Peter, Rémy Joseph, Javier Sanchez, Niall MacCrann, and Daniel Gruen. 2021. “The Challenge of Blending in Large Sky Surveys.” Nature Reviews Physics 3, no. 10 (October): 712–18. https://doi.org/10.1038/s42254-021-00353-y.
Merchant, Amil, Simon Batzner, Samuel S. Schoenholz, Muratahan Aykol, Gowoon Cheon, and Ekin Dogus Cubuk. 2023. “Scaling Deep Learning for Materials Discovery.” Nature 624, no. 7990 (December): 80–85. https://doi.org/10.1038/s41586-023-06735-9.
Merz, Garrett William, Tianji Cai, François Charton, Niklas Nolte, Matthias Wilhelm, Kyle Cranmer, and Lance Dixon. 2023. “Transformers for Scattering Amplitudes.” Paper presented at the Machine Learning and the Physical Sciences Workshop at the 37th Conference on Neural Information Processing Systems (NeurIPS), New Orleans, LA, December 15, 2023.
Mikuni, Vinicius, and Benjamin Nachman. 2024. “OmniLearn: A Method to Simultaneously Facilitate All Jet Physics Tasks.” Preprint, submitted April 24, 2024. https://doi.org/10.48550/arXiv.2404.16091.
Mishra-Sharma, Siddharth, Yiding Song, and Jesse Thaler. 2024. “PAPERCLIP: Associating Astronomical Observations and Natural Language with Multi-Modal Models.” Preprint, submitted March 13, 2024. https://doi.org/10.48550/arXiv.2403.08851.
Onen, Murat, Nicolas Emond, Baoming Wang, Difei Zhang, Frances M. Ross, Ju Li, Bilge Yildiz, and Jesús A. del Alamo. 2022. “Nanosecond Protonic Programmable Resistors for Analog Deep Learning.” Science 377, no. 6605 (July): 539–43. https://doi.org/10.1126/science.abp8064.
Park, Sang Eon, Philip Harris, and Bryan Ostdiek. 2023. “Neural Embedding: Learning the Embedding of the Manifold of Physics Data.” Journal of High Energy Physics 2023, no. 7 (July): 108. https://doi.org/10.1007/JHEP07(2023)108.
Parker, Liam, Francois Lanusse, Siavash Golkar, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Geraud Krawezik, et al. 2023. “AstroCLIP: Cross-Modal Foundation Model for Galaxies.” Preprint, submitted October 4, 2023. https://doi.org/10.48550/arXiv.2310.03024.
Peterson, Carsten. 1989. “Track Finding with Neural Networks.” Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 279, no. 3 (July): 537–45. https://doi.org/10.1016/0168-9002(89)91300-4.
Peterson, Carsten, and James R. Anderson. 1987. “A Mean Field Theory Learning Algorithm for Neural Networks.” Complex Systems 1, no. 5: 995–1019.
Pol, Adrian Alan, Virginia Azzolini, Gianluca Cerminara, Federico De Guio, Giovanni Franzoni, Maurizio Pierini, Filip Široký, and Jean-Roch Vlimant. 2019. “Anomaly Detection Using Deep Autoencoders for the Assessment of the Quality of the Data Acquired by the CMS Experiment.” EPJ Web of Conferences 214 (September): 06008. https://doi.org/10.1051/epjconf/201921406008.
Pol, Adrian Alan, Gianluca Cerminara, Cecile Germain, and Maurizio Pierini. 2020. “Data Quality Monitoring Anomaly Detection.” In Artificial Intelligence for High Energy Physics, edited by Paolo Calafiura, David Rousseau, and Kazuhiro Terao, 115–49. Hackensack, NJ: World Scientific. https://doi.org/10.1142/9789811234033_0005.
Raikman, Ryan, Eric A. Moreno, Ekaterina Govorkova, Ethan J. Marx, Alec Gunny, William Benoit, Deep Chatterjee, et al. 2023. “GWAK: Gravitational-Wave Anomalous Knowledge with Recurrent Autoencoders.” Preprint, submitted September 20, 2023. https://doi.org/10.48550/arXiv.2309.11537.
Ramakrishnan, Shubha, and Jennifer Hasler. 2014. “Vector-Matrix Multiply and Winner-Take-All as an Analog Classifier.” IEEE Transactions on Very Large Scale Integration (VLSI) Systems 22, no. 2 (February): 353–61. https://doi.org/10.1109/TVLSI.2013.2245351.
Ramesh, Aditya, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. 2021. “Zero-Shot Text-to-Image Generation.” Preprint, submitted February 24, 2021. https://doi.org/10.48550/arXiv.2102.12092.
Robnik, Jakob, G. Bruno De Luca, Eva Silverstein, and Uroš Seljak. 2023. “Microcanonical Hamiltonian Monte Carlo.” Preprint, submitted December 16, 2022. https://doi.org/10.48550/arXiv.2212.08549.
Rombach, Robin, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. “High-Resolution Image Synthesis with Latent Diffusion Models.” Preprint, submitted December 20, 2021. https://doi.org/10.48550/arXiv.2112.10752.
Schwartz, Matthew D. 2022. “Should Artificial Intelligence Be Interpretable to Humans?” Nature Reviews Physics 4, no. 12 (December): 741–42. https://doi.org/10.1038/s42254-022-00538-z.
Shih, David, Matthew R. Buckley, and Lina Necib. 2023. “Via Machinae 2.0: Full-Sky, Model-Agnostic Search for Stellar Streams in Gaia DR2.” Preprint, submitted March 2, 2023. https://doi.org/10.48550/arXiv.2303.01529.
Sohl-Dickstein, Jascha, Eric A. Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015. “Deep Unsupervised Learning Using Nonequilibrium Thermodynamics.” Preprint, submitted March 12, 2015. https://doi.org/10.48550/arXiv.1503.03585.
Song, Yang, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. 2023. “Consistency Models.” Preprint, submitted March 2, 2023. https://doi.org/10.48550/arXiv.2303.01469.
Subramanian, Shashank, Peter Harrington, Kurt Keutzer, Wahid Bhimji, Dmitriy Morozov, Michael Mahoney, and Amir Gholami. 2023. “Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior.” Preprint, submitted June 1, 2023. https://doi.org/10.48550/arXiv.2306.00258.
Tanaka, Hidenori, and Daniel Kunin. 2021. “Noether’s Learning Dynamics: Role of Symmetry Breaking in Neural Networks.” Preprint, submitted May 6, 2021. https://doi.org/10.48550/arXiv.2105.02716.
Tao, Terence. 2023. “Embracing Change and Resetting Expectations.” In AI Anthology, edited by Eric Horvitz. https://unlocked.microsoft.com/ai-anthology/terence-tao/.
Tran, Richard, Janice Lan, Muhammed Shuaibi, Brandon M. Wood, Siddharth Goyal, Abhishek Das, Javier Heras-Domingo, et al. 2023. “The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts.” ACS Catalysis 13, no. 5 (March): 3066–84. https://doi.org/10.1021/acscatal.2c05426.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2023. “Attention Is All You Need.” Preprint, submitted June 12, 2017. https://doi.org/10.48550/arXiv.1706.03762.
Wainwright, Martin J., and Michael I. Jordan. 2008. “Graphical Models, Exponential Families, and Variational Inference.” Foundations and Trends® in Machine Learning 1, no. 1–2 (November): 1–305. https://doi.org/10.1561/2200000001.
Wang, Hanchen, Tianfan Fu, Yuanqi Du, Wenhao Gao, Kexin Huang, Ziming Liu, Payal Chandak, et al. 2023. “Scientific Discovery in the Age of Artificial Intelligence.” Nature 620, no. 7972 (August): 47–60. https://doi.org/10.1038/s41586-023-06221-2.
Xu, Yilun, Ziming Liu, Max Tegmark, and Tommi Jaakkola. 2022. “Poisson Flow Generative Models.” Preprint, submitted September 22, 2022. https://doi.org/10.48550/arXiv.2209.11178.
Xu, Yilun, Ziming Liu, Yonglong Tian, Shangyuan Tong, Max Tegmark, and Tommi Jaakkola. 2023. “PFGM++: Unlocking the Potential of Physics-Inspired Generative Models.” Preprint, submitted February 8, 2023. https://doi.org/10.48550/arXiv.2302.04265.