Machine Learning for Drug Discovery at ICLR 2022

For the last decade, the field of deep learning and AI has been dominated by applications to images and text. However, in the past two years, the field has seen an upsurge of chemical and biological applications.

The International Conference on Learning Representations (ICLR), the largest academic AI conference in the world with an h5-index of 253, was no exception to this trend. ICLR 2022 included 14 conference papers on small molecules, 5 on proteins, 7 on other biological topics, and an entire workshop devoted to machine learning for drug discovery.

There were also many methods papers for data types commonly encountered in chemistry. These included 4 papers on point clouds (small molecules, ions, and proteins), 15 papers on graph neural networks (small molecules and biochemical interaction networks), and 12 papers on equivariance (an important property when working with 3D coordinate data, including molecular structures).

Here I’ve gathered and summarized all ICLR papers with application to chemistry and biology. Happy reading!


Small Molecules/Drug Discovery

GeoDiff: A Geometric Diffusion Model for Molecular Conformation Generation

  • A model for predicting 3D molecular conformation from the molecular graph using a diffusion model. They demonstrate how to incorporate equivariance into the diffusion model.

Energy-Inspired Molecular Conformation Optimization

  • Re-frames conformation prediction as an unrolled optimization where the model learns the gradient field of the landscape of optimal conformers. Proposes an SE(3) equivariant neural network that predicts the gradients of the atom coordinates.

An Autoregressive Flow Model for 3D Molecular Geometry Generation from Scratch

  • A model for autoregressive molecule generation. This model places successive atoms in a relative frame described by distance, torsion, and angle, providing E(3) invariance to the probability density.

Differentiable Scaffolding Tree for Molecule Optimization

  • A differentiable scaffolding tree that converts discrete molecules into continuous representations, allowing for gradient-based molecule optimization.

Data-Efficient Graph Grammar Learning for Molecular Generation

  • A method for deriving molecular generation rules (a grammar) from very small datasets, so that generative models can be trained even when few example molecules are available.

Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design

  • An algorithm to build molecule synthesis pathways using synthetic trees. Also, a genetic algorithm for synthesizable molecule generation based on synthetic tree updates.

Learning to Extend Molecular Scaffolds with Structural Motifs

  • A GNN for scaffold-based molecular generation.

Equivariant Transformers for Neural Network based Molecular Potentials

  • An equivariant transformer that predicts molecular potentials. Includes an extensive analysis of what is learned by the attention mechanism.

Pre-training Molecular Graph Representation with 3D Geometry

  • A self-supervised learning algorithm for learning molecule representations that incorporate both 2D graph and 3D geometric information.

Spherical Message Passing for 3D Molecular Graphs

  • A message passing GNN for molecules that incorporates 3D information in the form of distance, torsion, and angle, making the learned features E(3) invariant.
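To make the "distance, torsion, and angle" idea concrete, here is a minimal pure-Python sketch. It is not code from the paper: the helper names (`internal_coordinates`, `rigid_move`) and the four toy atom positions are my own assumptions for illustration. It computes one bond distance, one bond angle, and one signed torsion from raw 3D coordinates, then checks that the three numbers do not change when the whole molecule is rotated and translated.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

def distance(p, q):
    return norm(sub(p, q))

def angle(p, q, r):
    """Angle at q formed by p-q-r, in radians."""
    u, v = sub(p, q), sub(r, q)
    return math.atan2(norm(cross(u, v)), dot(u, v))

def torsion(p0, p1, p2, p3):
    """Signed dihedral angle about the p1-p2 axis, in radians."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)
    b1_unit = [x / norm(b1) for x in b1]
    m = cross(n1, b1_unit)
    return math.atan2(dot(m, n2), dot(n1, n2))

def internal_coordinates(p0, p1, p2, p3):
    """Describe atom p3 relative to the three atoms before it."""
    return (distance(p2, p3), angle(p1, p2, p3), torsion(p0, p1, p2, p3))

def rigid_move(p, theta, shift):
    """Rotate p about the z-axis by theta, then translate by shift."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = p
    return [c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]]

# Hypothetical four-atom chain, purely for illustration.
atoms = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
moved = [rigid_move(p, 0.9, [2.0, -1.0, 0.5]) for p in atoms]

print(internal_coordinates(*atoms))
print(internal_coordinates(*moved))  # same values up to rounding
```

One caveat on this sketch: the sign of a torsion computed this way flips under mirror reflection, so the check above only exercises rotations and translations.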

Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations

  • A model that is invariant to bond torsion angles but capable of learning chirality. This makes it possible to design models that distinguish between enantiomers for stereochemistry-sensitive prediction tasks.

Chemical-Reaction-Aware Molecule Representation Learning

  • Main insight: a desirable property of molecule embeddings is that, for any reaction, the sum of the reactant embeddings equals the sum of the product embeddings. They present a minimax contrastive representation learning objective that encourages this property.
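The reaction property above is easy to state in code. The sketch below is not the paper's training objective; it only measures how far a set of embeddings is from satisfying the property. The function names and the toy 2D embedding vectors are assumptions made up for illustration.

```python
import math

def sum_embeddings(vectors):
    """Element-wise sum of a list of equal-length embedding vectors."""
    return [sum(column) for column in zip(*vectors)]

def reaction_residual(reactants, products):
    """Euclidean distance between summed reactant and summed product embeddings.

    A value of 0 means the embeddings satisfy the reaction property exactly.
    """
    r = sum_embeddings(reactants)
    p = sum_embeddings(products)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r, p)))

# Hypothetical 2D embeddings for a reaction with two reactants and two products.
reactants = [[1.0, 0.5], [0.0, 1.5]]
balanced_products = [[0.5, 1.0], [0.5, 1.0]]
unbalanced_products = [[2.0, 2.0]]

print(reaction_residual(reactants, balanced_products))    # property holds
print(reaction_residual(reactants, unbalanced_products))  # property violated
```

A training objective along these lines would push this residual down for true reactions while keeping it large for mismatched reactant/product pairs.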

Simple GNN Regularisation for 3D Molecular Property Prediction and Beyond

  • A regularisation method that adds noise to the graph and a noise-correction term to the loss function, improving representations.

GeneDisco: A Benchmark for Experimental Design in Drug Discovery

  • A benchmark for active learning in drug discovery, with results given in the MLDD workshop.


Proteins

Ancestral protein sequence reconstruction using a tree-structured Ornstein-Uhlenbeck variational autoencoder

  • VAE representation learning of biological sequences, incorporating multiple-sequence-alignment information. Validated on an ancestral sequence reconstruction task.

OntoProtein: Protein Pretraining With Gene Ontology Embedding

  • Contrastive representation learning algorithm for protein sequences, incorporating information from the Gene Ontology knowledge graph.

Geometric Transformers for Protein Interface Contact Prediction

  • Takes proteins represented as graphs with coordinates as input. A geometric transformer predicts C-alpha representations that are invariant to rotation and translation of the input, and a convolutional interaction module then uses these representations to predict residue-residue interface contacts.

Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design

  • A roto-translationally invariant, autoregressive graph generative model for proteins. Generates sequence residues sequentially while updating structure coordinates for the already-generated residues.

Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking

  • An equivariant graph neural network for keypoint prediction, which can be used for 3D protein-protein docking. The network predicts “keypoints” (interface points) for two proteins, finds the rotation/translation to align the keypoints, and applies that roto-translation to align one protein with the other.
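The "find the rotation/translation to align the keypoints" step can be sketched with the Kabsch algorithm, a standard SVD-based method for rigid alignment of paired points. This is my own NumPy sketch under the assumption that the two keypoint sets are already matched one-to-one; the function name `kabsch` and the toy keypoints are illustrative, not taken from the paper's code.

```python
import numpy as np

def kabsch(P, Q):
    """Return rotation R and translation t such that P @ R.T + t best fits Q.

    P and Q are (N, 3) arrays of paired points (row i of P matches row i of Q).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation, not a reflection.
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t

# Hypothetical keypoints on one protein, and the same points after a known rigid move.
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([1.0, -2.0, 0.5])
keypoints_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
keypoints_b = keypoints_a @ R_true.T + t_true

R, t = kabsch(keypoints_a, keypoints_b)
print(np.allclose(keypoints_a @ R.T + t, keypoints_b))
```

Once `R` and `t` are known, applying them to every atom of the first protein moves it into the docked pose relative to the second.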

Assorted Biology/Chemistry

Granger causal inference on DAGs identifies genomic loci regulating transcription

  • Extends Granger causal inference, a technique for causal modeling, from sequences to directed acyclic graphs (DAGs) by using a message-passing graph neural network to predict Granger causality. Applies this to infer the genomic loci that mediate regulation of specific genes.

Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations

  • A parallel training method for graph neural networks, enabling large-scale training that models higher-order atomic interactions. Sets new state-of-the-art for several metrics on the Open Catalyst 20 dataset.

Solving Inverse Problems in Medical Imaging with Score-Based Generative Models

  • A new conditional sampling approach for solving inverse problems with generative models – in this case, constructing an image X consistent with certain measurements Y.

Crystal Diffusion Variational Autoencoder for Periodic Material Generation

  • Roto-translationally invariant graph variational autoencoder which incorporates inductive biases for material stability and periodicity.

Maximum n-times Coverage for Vaccine Design

  • Introduces the “maximum n-times coverage” problem, which requires selecting k overlays to maximize the summed coverage of weighted elements, where each element must be covered at least n times to count. Demonstrates how this is a helpful framing of a vaccine design problem.
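To pin down the objective, here is a small pure-Python sketch. It assumes an element only counts as covered once at least n of the chosen overlays contain it, and it solves tiny instances by exhaustive search, which is not the method proposed in the paper. The function names and the toy weights are made up for illustration.

```python
from itertools import combinations

def n_times_coverage(chosen, weights, n):
    """Summed weight of the elements contained in at least n chosen overlays."""
    counts = {}
    for overlay in chosen:
        for element in overlay:
            counts[element] = counts.get(element, 0) + 1
    return sum(weights[e] for e, c in counts.items() if c >= n)

def best_overlays(overlays, weights, k, n):
    """Try every k-subset of overlays; only feasible for very small instances."""
    def score(indices):
        return n_times_coverage([overlays[i] for i in indices], weights, n)
    return max(combinations(range(len(overlays)), k), key=score)

# Hypothetical instance: three weighted elements and four candidate overlays.
weights = {"a": 0.5, "b": 0.3, "c": 0.2}
overlays = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}]

picked = best_overlays(overlays, weights, k=3, n=2)
print(picked, n_times_coverage([overlays[i] for i in picked], weights, n=2))
```

With n=1 this reduces to ordinary weighted maximum coverage; raising n is what forces the selection to cover each element redundantly.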

Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery

  • A spatial/graph policy network for reinforcement learning-based molecular optimization.

MoReL: Multi-omics Relational Learning

  • A deep Bayesian generative model to infer a graph structure that captures molecular interactions across different modalities. Uses a Gromov-Wasserstein optimal transport regularization in the latent space to align latent variables of heterogeneous data.

A Biologically Interpretable Graph Convolutional Network to Link Genetic Risk Pathways and Imaging Phenotypes of Disease

  • A genetics graph convolutional network paired with an imaging network, linking imaging phenotypes of disease with biological pathways.

The following papers are not directly focused on chemical or biological applications but deal with related topics. They can be found by name in the ICLR conference proceedings on OpenReview:

Point Clouds

A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion

Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework

Deep Point Cloud Reconstruction

TPU-GAN: Learning temporal coherence from dynamic point cloud sequences


Equivariance

Scattering Networks on the Sphere for Scalable and Rotationally Equivariant Spherical CNNs

Top-N: Equivariant Set and Graph Generation without Exchangeability

Equivariant and Stable Positional Encoding for More Powerful Graph Neural Networks

Equivariant Subgraph Aggregation Networks

Geometric and Physical Quantities improve E(3) Equivariant Message Passing

Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views?

Group equivariant neural posterior estimation

Properties from mechanisms: an equivariance perspective on identifiable representation learning

Equivariant Graph Mechanics Networks with Constraints

Frame Averaging for Invariant and Equivariant Network Design

A Program to Build E(N)-Equivariant Steerable CNNs

Graph NNs

DEGREE: Decomposition Based Explanation for Graph Neural Networks

Graph Condensation for Graph Neural Networks

Automated Self-Supervised Learning for Graphs

On Evaluation Metrics for Graph Generative Models

A New Perspective on “How Graph Neural Networks Go Beyond Weisfeiler-Lehman?”

Do We Need Anisotropic Graph Neural Networks?

Large-Scale Representation Learning on Graphs via Bootstrapping

GRAND++: Graph Neural Diffusion with A Source Term

Graph Neural Networks with Learnable Structural and Positional Representations

Graph Auto-Encoder via Neighborhood Wasserstein Reconstruction

How Attentive are Graph Attention Networks?

Information Gain Propagation: a New Way to Graph Active Learning with Soft Labels

Expressiveness and Approximation Properties of Graph Neural Networks

Graph-Guided Network for Irregularly Sampled Multivariate Time Series

Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions