Understood! I will gather a comprehensive list of the latest research papers (from the last two years) on AI for scientific discovery across all disciplines, including physics, chemistry, biology, medicine, material science, and more. I will include peer-reviewed journal articles, preprints, and conference papers where applicable.

The results will be categorized by discipline and will include:

Title
Authors
Year of publication
Methods used (e.g., deep learning, reinforcement learning, generative models, etc.)
Key results
Challenges mentioned
Future directions proposed

I will update you as soon as I have compiled the information.

AI-Driven Scientific Discovery Across Disciplines (2023–2025)

AI techniques have increasingly enabled breakthroughs across scientific fields. Below, we summarize recent high-impact papers in physics, chemistry, biology, medicine, and materials science; most are from the last ~2 years, and two entries date from 2022. Each entry includes the paper’s title, authors, year, methods, key results, challenges, and future directions.

Physics

Magnetic control of tokamak plasmas through deep reinforcement learning (2022) – Jonas Degrave et al.

Methods: Deep reinforcement learning (DRL) was used to train an AI controller for magnetic coils in a tokamak plasma simulator. The controller learned policies in simulation and was then deployed on a real fusion device (the TCV tokamak).
Key Results: The DRL-based controller autonomously learned to shape and maintain a high-temperature plasma in various configurations. It successfully kept the plasma stable and even “sculpted” it into different target shapes (including advanced configurations) without plasma-wall collisions. This marked the first time an AI controller managed a real fusion plasma, opening new experimental possibilities.
Challenges: Controlling fusion plasmas is a high-dimensional, fast feedback problem – the system must adjust coil voltages thousands of times per second to confine a 100-million-degree plasma. Previous methods required painstaking manual tuning and could not easily handle the diversity of plasma shapes. Ensuring the learned policy was safe and effective on the physical tokamak was also non-trivial.
Future Directions: This work demonstrates that AI can accelerate fusion research by handling complex control tasks. The authors suggest applying such AI controllers to other plasma scenarios and larger reactors. More broadly, it opens avenues to integrate AI for real-time control in physics experiments, potentially speeding up progress toward viable fusion energy.

Deep symbolic regression for physics guided by units constraints (2023) – Wassim Tenachi et al.

Methods: Introduces “Physical Symbolic Optimization (PhySO),” a symbolic regression approach that combines deep reinforcement learning with physics domain knowledge. The system generates analytical equations to fit data while enforcing dimensional analysis constraints (i.e. correct units), drastically reducing the search space of formulas.
Key Results: PhySO rediscovered dozens of known physics equations (reportedly 74 laws) from raw data, recovering formulas like the harmonic oscillator and other classical laws. It achieved state-of-the-art performance on the noisy Feynman benchmark (a standard test set of physics formulas), outperforming other symbolic regression methods in both accuracy and interpretability. This showcases AI’s ability to produce human-understandable scientific models rather than black-box predictions.
Challenges: Searching for equations that both fit the data and make physical sense is combinatorially hard (“you can’t add potatoes and carrots” when terms have mismatched units). The vast space of possible formulas is a major challenge; by enforcing unit consistency, the algorithm avoids unphysical combinations, making the search tractable. Another challenge is generalization – ensuring the discovered formula is not just a fit to noise but a true law.
Future Directions: This approach paves the way toward automated discovery of new physical laws from experimental data (physo.readthedocs.io). The authors envision applying it to real-world datasets where the underlying theory is unknown, potentially accelerating the discovery of novel physics (physo.readthedocs.io). Future work will focus on scaling to more complex systems and improving the efficiency of the search, as well as integrating other scientific constraints to guide equation discovery.
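To make the units constraint concrete, here is a minimal Python sketch of a dimensional-consistency check on candidate expressions. It is not PhySO's code or API: the variable names, the (length, mass, time) exponent encoding, and the tuple-based expression trees are all assumptions chosen for illustration. The sketch also only checks finished expressions, which is a simplification of guiding the search itself.

```python
# Units are exponent tuples over (length, mass, time); all names are illustrative.
UNITS = {
    "x": (1, 0, 0),   # position, m
    "v": (1, 0, -1),  # velocity, m/s
    "t": (0, 0, 1),   # time, s
    "m": (0, 1, 0),   # mass, kg
}


def units_of(expr):
    """Return the unit tuple of an expression tree, or None if it is inconsistent.

    An expression is either a variable name or a tuple (op, left, right).
    """
    if isinstance(expr, str):
        return UNITS[expr]
    op, left, right = expr
    lu, ru = units_of(left), units_of(right)
    if lu is None or ru is None:
        return None
    if op in ("+", "-"):
        # Addition and subtraction require identical units on both sides.
        return lu if lu == ru else None
    if op == "*":
        return tuple(a + b for a, b in zip(lu, ru))
    if op == "/":
        return tuple(a - b for a, b in zip(lu, ru))
    raise ValueError(f"unknown operator: {op}")


good = ("+", "x", ("*", "v", "t"))     # x + v*t: both terms are lengths
bad = ("+", "x", "v")                  # x + v: length plus velocity
energy = ("*", "m", ("*", "v", "v"))   # m*v*v: kg m^2 / s^2

print(units_of(good))    # (1, 0, 0)
print(units_of(bad))     # None
print(units_of(energy))  # (2, 1, -2)
```

Any candidate for which `units_of` returns `None` can be discarded without fitting it to data, which is how a units rule shrinks the space of formulas worth evaluating.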

Chemistry

Automated synthesis of oxygen-producing catalysts from Martian meteorites by a robotic AI chemist (2023) – Qing Zhu, Jun Jiang et al.

Methods: An autonomous robotic chemist platform was employed to discover and synthesize catalysts for oxygen generation (the oxygen evolution reaction, OER) using Martian meteorite material. The system uses laser spectroscopy to analyze composition, performs multi-step chemical processing, then applies machine learning (neural networks trained on quantum chemistry & molecular dynamics simulation data) to predict optimal mixtures. Finally, it uses Bayesian optimization to select experiments and iteratively improve the catalyst – all with no human intervention.
Key Results: In roughly two months, the AI-driven system identified an OER catalyst composed of five Martian meteorite-derived elements that performs exceptionally well. The catalyst sustained oxygen production for over 550,000 seconds (≈6.4 days) at Mars-like temperature (−37 °C) with minimal degradation. This performance would meet the oxygen needs for human survival with only ~15 hours of sunlight (if scaled appropriately). Notably, achieving this level of optimization would have taken human researchers an estimated 2,000 years of experimentation, highlighting a massive acceleration in discovery.
Challenges: Mars presents unique challenges: any catalyst must be synthesized in situ from local resources (to avoid costly transport from Earth). The search space of possible chemical combinations from meteorite material is huge and was previously impractical to explore manually. Ensuring the robotic system’s reliability across many delicate experimental steps (weighing, mixing, heating, etc.) was also critical. The team had to integrate chemistry domain knowledge (e.g. density functional theory for reaction activity) into the AI to guide it effectively.
Future Directions: This work is a step toward “self-driving labs” for chemistry. The authors suggest that, in the future, human explorers on Mars could deploy AI chemists to set up oxygen factories autonomously. More generally, the approach could be generalized to discover catalysts or materials for other processes (energy storage, materials synthesis) in remote or resource-limited environments. The team is now working to broaden the platform for diverse organic and inorganic syntheses without human input, indicating a rapid development in automated chemical discovery.
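The select-measure-update loop described under Methods can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: it swaps in a small Gaussian-process surrogate with an expected-improvement rule in place of the neural-network models mentioned above, searches a single hypothetical mixing ratio between 0 and 1, and uses `run_experiment` as a made-up stand-in for a robotic measurement.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(A[i][i] - s, 1e-12))
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    k_star = [rbf(x_new, a) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 1e-12)
    return mean, math.sqrt(var)


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def run_experiment(x):
    """Hypothetical measurement: catalyst activity as a function of a mixing ratio."""
    return -(x - 0.63) ** 2


def optimize(n_rounds=8):
    candidates = [i / 100 for i in range(101)]
    xs = [0.0, 0.5, 1.0]                      # initial experiments
    ys = [run_experiment(x) for x in xs]
    for _ in range(n_rounds):
        best = max(ys)
        untested = [c for c in candidates if c not in xs]
        # Pick the untested ratio with the highest expected improvement.
        x_next = max(
            untested,
            key=lambda c: expected_improvement(*gp_posterior(xs, ys, c), best),
        )
        xs.append(x_next)
        ys.append(run_experiment(x_next))
    return xs, ys


xs, ys = optimize()
print("best ratio tried:", xs[ys.index(max(ys))])
```

Each round spends one experiment where the surrogate expects the largest gain, so uncertain regions and promising regions both get sampled. A real platform would search many composition variables at once and would tune the kernel to the scale of the measured activity.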

Biology

De novo design of high-affinity binders of bioactive helical peptides (2024) – Susana Vázquez-Torres et al.

Methods: This study applied generative deep learning to protein design. Researchers used a diffusion-based model (RFdiffusion) to generate novel protein structures that wrap around target molecules, and a sequence-optimization network (ProteinMPNN) to fill in amino acid details. These AI tools were combined with traditional protein engineering to design proteins that bind specific helical peptide hormones and signaling molecules.
Key Results: The team created several novel proteins that bind tightly and specifically to bioactive peptides (such as glucagon and others). One designed binder achieved picomolar affinity – essentially the highest binding affinity ever reported for a computer-designed protein targeting a peptide. The designed proteins showed extreme specificity (they do not bind unrelated molecules) and functioned in laboratory assays, which is a significant improvement over earlier computationally designed binders. This demonstrates that AI-guided protein design can produce binders rivaling or exceeding antibodies in affinity.
Challenges: Designing binders for flexible, loosely structured targets (like peptide hormones) is difficult – these targets lack a rigid structure for a protein to grab onto. Traditional methods often failed or required massive screening. Additionally, antibodies (the usual binders) are expensive and have stability issues. The challenge was to generate a stable protein that could cradle a floppy peptide with high affinity. The AI had to navigate a vast search space of protein folds and sequences to meet this challenge.
Future Directions: The success of these designs suggests AI-designed proteins could become cheap alternatives to antibodies for therapeutics and diagnostics. Future work will explore using such binders in medicine (e.g. as sensors or drugs for hormone-related conditions) and extending the approach to other challenging targets (e.g. intrinsically disordered proteins). The researchers also plan to refine the generative models to improve design success rates and to create proteins with novel functions beyond binding, heralding a new era of AI-driven protein engineering.

Near-complete structure of the human nuclear pore complex solved with AI-assisted methods (2022) – Fontana, Tong et al. (Harvard Medical School & UC Berkeley collaboration)

Methods: Researchers combined AlphaFold (DeepMind’s protein structure AI) predictions with experimental data (cryo-electron microscopy) to map the architecture of the nuclear pore complex (NPC). The NPC is a giant protein assembly in the cell nucleus. AlphaFold was used to predict the 3D structures of many NPC proteins (nucleoporins) that were previously unsolved, and these were fitted into cryo-EM density maps to build an atomic model of the NPC’s scaffold.
Key Results: The team obtained an almost complete atomic model of the human nuclear pore complex, particularly the massive cytoplasmic ring structure, which had eluded scientists for decades. AlphaFold’s predictions provided pieces that were missing in experimental reconstructions, allowing the assembly of the NPC puzzle. This is a significant scientific discovery: the NPC (~1000 protein subunits) is one of the largest complexes in cells, and understanding its structure provides insight into how molecules are transported between the nucleus and cytoplasm. Essentially, AI helped crack a “behemoth” of molecular biology.
Challenges: The NPC’s size and complexity made it incredibly challenging – it contains >30 distinct proteins repeated in eight-fold symmetry, totaling over 1,000 pieces. Traditional structural biology methods (X-ray, cryo-EM) struggled with incomplete data and flexible regions. The AI predictions had to be accurate and trustworthy to integrate into the model. It was also challenging to validate that the AI-predicted structures were correct in the context of the whole assembly, requiring careful cross-checking with experiments.
Future Directions: Solving the NPC structure opens the door to new biology – researchers can now investigate how the NPC works in detail (e.g. how it selectively permits transport, how it assembles, and how it might be targeted by viruses or drugs). This case also exemplifies how AI can be used to tackle other large molecular complexes. Future efforts will apply similar AI-augmented modeling to other unsolved cellular machines. Additionally, the NPC blueprint may inform biomedical research (e.g. understanding diseases caused by NPC defects) and inspire the design of nanopore technologies by mimicking its structure.

Medicine

Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii (2023) – Gary Liu et al. (James J. Collins & Jonathan Stokes groups)

Methods: The researchers trained a deep neural network to identify new antibiotic molecules effective against A. baumannii, a deadly drug-resistant bacterium. They screened ~7,500 compounds experimentally to teach the model which chemical structures inhibit bacterial growth and which do not. The model, once trained, virtually screened a large chemical library (6,680 compounds it hadn’t seen) and predicted a few hundred top candidates in hours. These AI-picked candidates were then tested in the lab to find actual antibiotic activity.
Key Results: The AI approach yielded a novel antibiotic compound, named abaucin, that effectively kills A. baumannii but has almost no effect on other bacteria. This narrow-spectrum activity is desirable because it targets the pathogen without harming beneficial microbes, and it may slow resistance development. Abaucin was shown to treat A. baumannii wound infections in mice and works via a unique mechanism (it interferes with the bacteria’s lipoprotein trafficking by inhibiting a protein called LolE). This discovery is notable as A. baumannii is a World Health Organization “critical” priority pathogen, and new therapies are urgently needed.
Challenges: A. baumannii is notorious for multi-drug resistance, and very few new antibiotics have been developed in recent decades. Traditional drug discovery is slow and often yields broad-spectrum drugs that bacteria quickly resist. A key challenge was the data limitation – the team needed enough training examples of molecules to teach the AI, which they addressed by generating their own experimental data. There’s also the challenge of translating an AI prediction into a real drug: ensuring the compound is effective in animal models and safe in humans remains an ongoing hurdle.
Future Directions: This study underscores that AI can significantly accelerate the search for antibiotics. Having started with A. baumannii as “public enemy number one,” the authors are expanding this approach to other dangerous pathogens. Future work will involve optimizing AI-discovered compounds and possibly advancing them to clinical trials. There is also interest in using AI to design antibiotics with desired properties (like narrow spectrum or novel mechanisms) from scratch. More broadly, this success encourages using AI in pharmaceutical discovery to combat antibiotic resistance and other pressing health threats.

Accurate proteome-wide missense variant effect prediction with AlphaMissense (2023) – Jun Cheng et al. (Google DeepMind)

Methods: AlphaMissense is an AI model built on the AlphaFold protein structure system, fine-tuned to predict the pathogenicity of genetic mutations. It was trained using large databases of human and primate genetic variation, learning from patterns of which amino acid substitutions tend to be tolerated in evolution and which are not. By combining protein structural context (from AlphaFold) with evolutionary conservation data, it assesses whether a given single amino-acid change (missense mutation) is likely benign or disease-causing – all without direct training on clinical disease data.
Key Results: AlphaMissense was applied to all possible missense mutations in humans – 71 million variants – and could confidently classify about 89% of them as either likely benign or likely pathogenic. This is a huge leap, as the vast majority of human genetic variants are “Variants of Uncertain Significance.” The model achieved state-of-the-art accuracy on various benchmarks, outperforming prior tools in identifying known disease mutations. In a striking result, the average AlphaMissense-predicted pathogenicity score for all mutations in a gene correlates with whether that gene is essential for cell survival, suggesting the model’s predictions carry meaningful biological signal. A database of these predictions has been made publicly available.
Challenges: Interpreting genetic variants is a fundamental challenge in genomics – there are far more variants than we have experimental data for. Before this work, clinicians faced millions of variants with unknown significance. A challenge for AlphaMissense was to leverage protein structure effectively: many prior predictors used sequence conservation alone. Ensuring the model didn’t produce too many false pathogenic predictions was crucial (to avoid alarm fatigue in genomics). Additionally, integrating the model’s outputs with existing clinical databases in a usable way is an ongoing challenge.
Future Directions: AlphaMissense’s predictions are already being integrated into resources like Ensembl and UniProt to help researchers and clinicians prioritize mutations of concern. In the future, such AI models could guide genetic screening – for example, focusing attention on the rare mutations most likely to cause disease. The authors suggest extending the approach to other types of mutations (like insertions/deletions) and to other species’ genomes. As more experimental data on variant effects become available, models like AlphaMissense can be further refined. Ultimately, this work is a step toward using AI for predictive genomic medicine, where computational tools fill in gaps in our experimental knowledge and accelerate the understanding of genotype–phenotype relationships.
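As a rough illustration of how per-variant scores become the categories and gene-level averages described above, the Python sketch below bins pathogenicity scores into three classes and averages them per gene. The two cutoffs are assumptions that should be checked against the published AlphaMissense data documentation before reuse, and the gene names and scores are invented.

```python
from statistics import mean

# Assumed cutoffs on a 0-1 pathogenicity score; verify against the released data.
BENIGN_BELOW = 0.34
PATHOGENIC_ABOVE = 0.564


def classify(score):
    """Map a pathogenicity score to one of three categories."""
    if score < BENIGN_BELOW:
        return "likely_benign"
    if score > PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    return "ambiguous"


def summarize(variants):
    """Return (fraction confidently classified, mean score per gene).

    `variants` is a list of (gene, score) pairs.
    """
    labels = [classify(score) for _, score in variants]
    confident = sum(label != "ambiguous" for label in labels) / len(labels)
    by_gene = {}
    for gene, score in variants:
        by_gene.setdefault(gene, []).append(score)
    gene_mean = {gene: mean(scores) for gene, scores in by_gene.items()}
    return confident, gene_mean


# Invented example data: GENE_A tolerates substitutions, GENE_B does not.
variants = [
    ("GENE_A", 0.05), ("GENE_A", 0.12), ("GENE_A", 0.45),
    ("GENE_B", 0.91), ("GENE_B", 0.78),
]

confident, gene_mean = summarize(variants)
print(f"confidently classified: {confident:.0%}")
print(gene_mean)
```

A higher mean score for a gene suggests that substitutions in it are rarely tolerated, which is the kind of gene-level signal the paper relates to cell essentiality.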

Materials Science

Graph Networks for Materials Exploration (GNoME): Millions of new materials discovered with deep learning (2023) – Amil Merchant, Ekin D. Cubuk et al.

Methods: GNoME is a graph neural network model trained on known inorganic crystals from the Materials Project database. Each material’s crystal structure is represented as a graph of atoms, and the network learns to predict formation energy (stability). The team used active learning, iteratively retraining the model and adding new hypothetical structures that the model found promising, to efficiently explore a huge space of possible compositions and arrangements. Calculations were accelerated using Google’s computing infrastructure to evaluate millions of candidates.
Key Results: The AI model predicted 2.2 million new inorganic crystal structures that are stable (i.e. lower in energy than a combination of other compounds). Among these, ~380,000 were identified as especially stable – potential candidates for practical use. This single project effectively added knowledge equivalent to “nearly 800 years’ worth” of traditional materials discovery. Notably, some predicted materials could be useful for superconductors, batteries, or photovoltaics, according to the authors. All 380k high-confidence predictions are being incorporated into the Materials Project open database for experimentalists to investigate.
Challenges: Traditional materials discovery is slow and often serendipitous – researchers might make and test only a handful of new materials in a project. The search space of possible compounds is astronomical, and most candidates are unstable. GNoME addressed the challenge of data scarcity (limited known crystal structures) by leveraging chemical knowledge from a decade of curated data. Another challenge was ensuring the model’s predictions are reliable; the team validated the most promising AI-generated crystals with additional physics calculations and found strong agreement. Scaling the computation to millions of candidates was also non-trivial, requiring optimization of the model and use of high-performance computing.
Future Directions: The discovery of these materials is just the first step. Next is experimental synthesis: the database provides a treasure trove for chemists to explore. The integration of GNoME with robotic labs (like the A-Lab, below) could further accelerate turning predictions into real materials. Future AI efforts may explore even larger chemical spaces (including organic/hybrid materials) and target specific properties (e.g. finding superconductors or materials with specific optical properties). The success here points toward an era of AI-driven materials science, where algorithms significantly cut down the time to find materials for technologies addressing energy and climate challenges.
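The stability criterion used above, "lower in energy than a combination of other compounds," can be illustrated with a convex-hull check. The Python sketch below handles only a toy two-element system with made-up formation energies; it is not the GNoME workflow, which the source describes as relying on physics calculations over a far larger space of compositions.

```python
def lower_hull(points):
    """Lower convex hull of (composition, energy) points, left to right."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the last hull point if it lies on or above the line to p.
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(hull, x):
    """Energy of the hull at composition x, by linear interpolation."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the hull range")


def energy_above_hull(competing, candidate):
    """Candidate energy minus the energy of the best mix of competing phases.

    A negative value means the candidate is lower in energy than any
    combination of the competing phases at the same composition.
    """
    x, energy = candidate
    return energy - hull_energy(lower_hull(competing), x)


# Hypothetical A-B system: x is the fraction of B, energies in eV/atom.
competing = [(0.0, 0.0), (0.5, -0.40), (1.0, 0.0)]

print(energy_above_hull(competing, (0.25, -0.30)))  # about -0.10: predicted stable
print(energy_above_hull(competing, (0.75, -0.10)))  # about +0.10: would decompose
```

In this toy case the first candidate sits 0.10 eV/atom below the line joining its neighbors and would extend the hull, while the second sits 0.10 eV/atom above it and would be expected to decompose into the neighboring phases.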

Autonomous laboratory (A-Lab) discovers novel materials with minimal human input (2023) – Nathan J. Szymanski, Kristin Persson, Gerbrand Ceder et al. (Lawrence Berkeley National Lab)

Methods: A-Lab is an AI-driven robotics lab that can autonomously carry out materials synthesis and characterization. It uses artificial intelligence to plan and execute experiments: reading scientific literature for guidance, mixing chemicals, baking samples, and measuring outcomes. The AI employs active learning: it picks experiments to perform, learns from the results, and adjusts its plans to optimize towards making stable new compounds. Data from the Materials Project (including the new GNoME predictions) were used to inform the initial candidates.
Key Results: In its first demonstration, A-Lab successfully synthesized 41 new inorganic compounds in 17 days. This equates to over two new materials per day – a rate far beyond a human scientist. For comparison, synthesizing a single new material can take months of human work. The AI-guided robot identified recipes and produced materials that were predicted (by models) to be stable, validating many of those predictions experimentally. This achievement was published in Nature alongside the GNoME study, highlighting a complementary leap: AI not only predicts materials but can also physically create them rapidly.
Challenges: Automating a chemistry lab involves numerous challenges: reliable handling of chemicals, avoiding contamination, and interpreting experimental data on the fly. The AI had to decide which among dozens of precursors and what conditions to try, a task traditionally guided by human intuition. Ensuring safety and correctness in an unsupervised setting was paramount. Another challenge was the integration of diverse data sources – A-Lab’s AI had to combine knowledge from materials databases, published literature, and its own experiments to make decisions.
Future Directions: This autonomous lab approach could greatly accelerate materials innovation. Future iterations of A-Lab may handle more complex synthesis (including nanomaterials or organic compounds) and operate continuously to iteratively improve materials (e.g. for better battery electrodes or catalysts). By coupling AI prediction (like GNoME) with AI experimentation, the vision is a closed-loop system where discovery and testing feed into each other at machine speed. In the long run, such self-driving labs might tackle challenges from discovering new drugs to inventing sustainable materials, fundamentally changing how science is done.

References: The content above is summarized from recent publications and reports, including peer-reviewed journal articles and reputable secondary sources.