For the last decade, the field of deep learning and AI has been dominated by applications to images and text. However, in the past two years, the field has seen an upsurge of chemical and biological applications.
The International Conference on Learning Representations (ICLR), the largest academic AI conference in the world with an h5-index of 253, was no exception to this trend. ICLR 2022 included 14 conference papers on small molecules, 5 on proteins, 7 on other biological topics, and an entire workshop devoted to machine learning for drug discovery.
There were also many methods papers for data types commonly encountered in chemistry. These included 4 papers on point clouds (small molecules, ions, and proteins), 15 papers on graph neural networks (small molecules and biochemical interaction networks), and 12 papers treating equivariance (an important property of data with 3D coordinates, including molecular structures).
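To make the invariance/equivariance idea concrete, here is a minimal NumPy sketch. It is an illustration only, not code from any of the papers, and it assumes a molecule is nothing more than an array of 3D atom coordinates. The helper names `random_rotation`, `pairwise_distances`, and `centroid` are hypothetical. Pairwise distances do not change when the molecule is rotated and translated (invariance), while the centroid moves along with the molecule (equivariance).

```python
import numpy as np

def random_rotation(rng):
    """Draw a random proper 3D rotation matrix (det = +1) via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # fix column signs of the orthogonal factor
    if np.linalg.det(q) < 0:      # flip one axis to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q

def pairwise_distances(coords):
    """All atom-atom distances: an example of an invariant feature."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def centroid(coords):
    """Mean position of the atoms: an example of an equivariant feature."""
    return coords.mean(axis=0)

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))   # toy "molecule" with 5 atoms (made-up data)
R = random_rotation(rng)
t = rng.normal(size=3)
moved = coords @ R.T + t           # apply x -> R x + t to every atom

# Invariance: f(Rx + t) == f(x)
invariant_ok = np.allclose(pairwise_distances(coords), pairwise_distances(moved))
# Equivariance: f(Rx + t) == R f(x) + t
equivariant_ok = np.allclose(centroid(moved), R @ centroid(coords) + t)
print(invariant_ok, equivariant_ok)
```

Models built on distances, angles, and torsions inherit the first property by construction; equivariant networks are designed so that their coordinate or vector outputs satisfy the second.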
Here I’ve gathered and summarized all ICLR papers with application to chemistry and biology. Happy reading!
Small Molecules/Drug Discovery
- A diffusion model for predicting 3D molecular conformations from the molecular graph. The authors demonstrate how to incorporate equivariance into the diffusion model.
- Reframes conformation prediction as an unrolled optimization in which the model learns the gradient field of the landscape of optimal conformers. Proposes an SE(3)-equivariant neural network that predicts the gradients of the atom coordinates.
- A model for autoregressive molecule generation. This model places successive atoms in a relative frame described by distance, torsion, and angle, providing E(3) invariance to the probability density.
- A differentiable scaffolding tree that converts discrete molecules into continuous representations, allowing for gradient-based molecule optimization.
- A method for deriving molecular generation rules (a grammar) from very small datasets. This enables training of generative models when very little molecular data is available.
- An algorithm to build molecule synthesis pathways using synthetic trees. Also, a genetic algorithm for synthesizable molecule generation based on synthetic tree updates.
- A GNN for scaffold-based molecular generation.
- An equivariant transformer that predicts molecular potentials. Includes an extensive analysis of what is learned by the attention mechanism.
- A self-supervised learning algorithm for learning molecule representations that incorporate both 2D graph and 3D geometric information.
- A message passing GNN for molecules that incorporates 3D information in the form of distance, torsion, and angle, making the learned features E(3) invariant.
- A model that is invariant to bond torsion angles but capable of learning chirality. This makes it possible to design models that distinguish between enantiomers for stereochemistry-sensitive prediction tasks.
- Main insight: a desirable property of molecule embeddings is that, for any given reaction, the sum of the reactant embeddings equals the sum of the product embeddings. The authors present a minimax contrastive representation learning objective that encourages this property.
- A regularisation method that adds noise to the graph and a noise-correction term to the loss function, improving representations.
- A benchmark for active learning in drug discovery, with results given in the MLDD workshop.
Proteins
- VAE representation learning of biological sequences, incorporating multiple-sequence-alignment information. Validated on an ancestral sequence reconstruction task.
- Contrastive representation learning algorithm for protein sequences, incorporating information from the Gene Ontology knowledge graph.
- A model that takes proteins represented as graphs with coordinates as input. A geometric transformer predicts C-alpha representations that are invariant to rotation and translation of the input, and a convolutional interaction module then uses these representations to predict residue-residue interface contacts.
- A roto-translationally invariant, autoregressive graph generative model for proteins. Generates sequence residues sequentially while updating structure coordinates for the already-generated residues.
- An equivariant graph neural network for keypoint prediction, which can be used for 3D protein-protein docking. The network predicts “keypoints” (interface points) for two proteins, finds the rotation/translation to align the keypoints, and applies that roto-translation to align one protein with the other.
Other Chemical and Biological Topics
- Extends Granger causal inference, a technique for causal modeling, from sequences to directed acyclic graphs (DAGs). It does so with a message-passing graph neural network that predicts Granger causality. Applied to infer gene loci that mediate regulation of specific genes.
- A parallel training method for graph neural networks, enabling large-scale training that models higher-order atomic interactions. Sets new state-of-the-art for several metrics on the Open Catalyst 20 dataset.
- A new conditional sampling approach for solving inverse problems with generative models – in this case, constructing an image X consistent with certain measurements Y.
- Roto-translationally invariant graph variational autoencoder which incorporates inductive biases for material stability and periodicity.
- Introduces the “maximum n-times coverage” problem, which requires selection of k overlays to maximize the summed coverage of weighted elements. Demonstrates how this is a helpful framing of a vaccine design problem.
- A spatial/graph policy network for reinforcement learning-based molecular optimization.
- A deep Bayesian generative model to infer a graph structure that captures molecular interactions across different modalities. Uses a Gromov-Wasserstein optimal transport regularization in the latent space to align latent variables of heterogeneous data.
- A genetics graph convolutional network paired with an imaging network, linking imaging phenotypes of disease with biological pathways.
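The alignment step in the keypoint-based docking summary above (find the rotation/translation that aligns two sets of keypoints, then apply it to the whole protein) can be sketched with the classical SVD-based Kabsch algorithm. This is a minimal illustration under the assumption that matched keypoints are already available as two arrays; in the paper the keypoints are predicted by the network, and its implementation may differ. The function name `kabsch_align` and all data below are hypothetical.

```python
import numpy as np

def kabsch_align(P, Q):
    """Return rotation R and translation t minimising sum_i ||R p_i + t - q_i||^2."""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_mean).T @ (Q - q_mean)        # 3x3 cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against an improper rotation
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t

rng = np.random.default_rng(0)

# Made-up keypoints on protein A, and the same points after a known rigid motion,
# standing in for the matched keypoints on protein B.
keypoints_a = rng.normal(size=(6, 3))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([1.0, -2.0, 0.5])
keypoints_b = keypoints_a @ R_true.T + t_true

# Recover the rigid motion from the keypoints alone ...
R, t = kabsch_align(keypoints_a, keypoints_b)

# ... and apply it to every atom of protein A to place it against protein B.
protein_a = rng.normal(size=(50, 3))
docked_a = protein_a @ R.T + t
print(np.allclose(keypoints_a @ R.T + t, keypoints_b))
```

Because the transform is a single rigid motion computed in closed form, the ligand's internal geometry is left unchanged by the alignment.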
The following papers are not directly focused on chemical or biological applications, but deal with related topics. They can be found by name in the ICLR conference proceedings on OpenReview:
A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion
Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework
Deep Point Cloud Reconstruction
TPU-GAN: Learning temporal coherence from dynamic point cloud sequences
Scattering Networks on the Sphere for Scalable and Rotationally Equivariant Spherical CNNs
Top-N: Equivariant Set and Graph Generation without Exchangeability
Equivariant and Stable Positional Encoding for More Powerful Graph Neural Networks
Equivariant Subgraph Aggregation Networks
Geometric and Physical Quantities improve E(3) Equivariant Message Passing
Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views?
Group equivariant neural posterior estimation
Properties from mechanisms: an equivariance perspective on identifiable representation learning
Equivariant Graph Mechanics Networks with Constraints
Frame Averaging for Invariant and Equivariant Network Design
A Program to Build E(N)-Equivariant Steerable CNNs
DEGREE: Decomposition Based Explanation for Graph Neural Networks
Graph Condensation for Graph Neural Networks
Automated Self-Supervised Learning for Graphs
On Evaluation Metrics for Graph Generative Models
A New Perspective on “How Graph Neural Networks Go Beyond Weisfeiler-Lehman?”
Do We Need Anisotropic Graph Neural Networks?
Large-Scale Representation Learning on Graphs via Bootstrapping
GRAND++: Graph Neural Diffusion with A Source Term
Graph Neural Networks with Learnable Structural and Positional Representations
Graph Auto-Encoder via Neighborhood Wasserstein Reconstruction
How Attentive are Graph Attention Networks?
Information Gain Propagation: a New Way to Graph Active Learning with Soft Labels
Expressiveness and Approximation Properties of Graph Neural Networks
Graph-Guided Network for Irregularly Sampled Multivariate Time Series
Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions