Many of our recent blog posts have dealt with physical modeling of small molecules and proteins, driven by a flurry of groundbreaking research in equivariant neural networks that has improved structural modeling of chemicals. Today we step back to a broader biochemistry problem. We are reviewing a paper from a few years ago, but one that I believe is worth returning to in the context of recent advances in structural biochemistry.

In 2018, Zitnik et al. published “Modeling polypharmacy side effects with graph convolutional networks”. Drugs are often used in combination, either in therapies for complex diseases or for patients with co-existing conditions. These drug combinations, called “polypharmacy”, can be complementary or can cause adverse side effects. But the space of all possible drug combinations is too large to test experimentally. Thus, the authors designed a neural network to predict polypharmacy side effects.

The Data

The training data is a relational graph, with nodes representing drugs or proteins and typed edges representing the relationships between them: protein-protein interactions, drug-protein interactions, and drug-drug side effects.
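To make the data layout concrete, here is a minimal sketch of a graph that stores edges per relation type. The class name, node names, and the side-effect label are hypothetical, and treating every relation as undirected is an assumption of this sketch rather than something taken from the paper.

```python
from collections import defaultdict


class RelationalGraph:
    """Toy multi-relational graph: edges are stored separately per relation type."""

    def __init__(self):
        self.node_type = {}            # node -> "drug" or "protein"
        self.edges = defaultdict(set)  # relation -> set of (u, v) pairs

    def add_node(self, node, kind):
        self.node_type[node] = kind

    def add_edge(self, u, v, relation):
        # Assumption: edges are undirected, so each pair is stored in sorted order.
        self.edges[relation].add(tuple(sorted((u, v))))

    def neighbors(self, node, relation):
        """Neighbors of `node` under a single relation type."""
        out = set()
        for u, v in self.edges[relation]:
            if u == node:
                out.add(v)
            elif v == node:
                out.add(u)
        return out


# Hypothetical toy data, for illustration only.
graph = RelationalGraph()
for drug in ("drug_A", "drug_B"):
    graph.add_node(drug, "drug")
for protein in ("prot_1", "prot_2"):
    graph.add_node(protein, "protein")

graph.add_edge("prot_1", "prot_2", "protein-protein")
graph.add_edge("drug_A", "prot_1", "drug-protein")
graph.add_edge("drug_B", "prot_2", "drug-protein")
graph.add_edge("drug_A", "drug_B", "side_effect:nausea")
```

Keeping one edge set per relation means each side effect can later get its own parameters, which is what the model section relies on.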


The Task

The model is trained with a cross-entropy classification objective to predict whether drug-drug edges of each type (each type representing a side effect) are present. Because the data contains only positive examples, that is, observed edges, the authors use negative sampling to obtain negative examples. This involves randomly selecting drug pairs with no recorded edge and assuming that the edge is truly absent, a fair assumption given the sparsity of the interaction network.
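The training setup above can be sketched as follows for a single side-effect type. The function names and toy drug identifiers are hypothetical, and the uniform sampling shown here is a simplifying assumption, not necessarily the sampling distribution the authors used.

```python
import math
import random


def sample_negatives(drugs, positive_edges, n_samples, rng):
    """Draw drug pairs with no recorded edge; these are *assumed* to be true negatives."""
    positives = {frozenset(edge) for edge in positive_edges}
    max_pairs = len(drugs) * (len(drugs) - 1) // 2
    # Never ask for more negatives than there are unobserved pairs.
    n_samples = min(n_samples, max_pairs - len(positives))
    negatives = set()
    while len(negatives) < n_samples:
        pair = frozenset(rng.sample(drugs, 2))
        if pair not in positives:
            negatives.add(pair)
    return [tuple(sorted(pair)) for pair in negatives]


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy loss for one predicted edge probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


# Hypothetical toy data for one side-effect type.
drugs = [f"drug_{i}" for i in range(6)]
positive_edges = [("drug_0", "drug_1"), ("drug_2", "drug_3")]
rng = random.Random(0)
negative_edges = sample_negatives(drugs, positive_edges, n_samples=2, rng=rng)

# Positives are labeled 1 and sampled negatives 0; a model would supply `prob`.
example_loss = binary_cross_entropy(prob=0.9, label=1)
```

Because negatives are only assumed, a sampled pair may in fact be an unrecorded interaction; the sparser the network, the less often that happens.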

The Model

The authors use a graph convolutional network where node representations are updated based on their surrounding nodes. In this update, the surrounding nodes are transformed depending on their edge type, with a different learned projection matrix modeling each edge type. The node update is given in equation 1:

[Image: equation 1, the node update rule]
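Written out, an update of this kind takes roughly the following form. This is a generic relational graph convolution in notation of my own choosing, not a transcription of the paper's equation; the normalization constants in particular are an assumption.

```latex
h_i^{(k+1)} = \phi\left( \sum_{r} \sum_{j \in \mathcal{N}_r(i)} c_r^{ij}\, W_r^{(k)} h_j^{(k)} \;+\; c^{i}\, h_i^{(k)} \right),
\qquad
c_r^{ij} = \frac{1}{\sqrt{|\mathcal{N}_r(i)|\,|\mathcal{N}_r(j)|}}
```

Here $h_i^{(k)}$ is the representation of node $i$ at layer $k$, $\mathcal{N}_r(i)$ is the set of neighbors of $i$ under edge type $r$, $W_r^{(k)}$ is the learned projection matrix for edge type $r$, $c_r^{ij}$ and $c^{i}$ are normalization weights, and $\phi$ is a nonlinearity such as ReLU.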

The result is updated representations for each node, which depend on interactions with nearby nodes in the graph.
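A single layer of this relation-aware update can be sketched in plain Python. Everything here is illustrative: the function names, the toy embeddings, and the weight matrices are hypothetical, and the symmetric normalization and unweighted self-connection are assumptions of the sketch rather than the authors' exact choices.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


def relational_gcn_layer(h, neighbors, weights):
    """One update: each neighbor is projected by the matrix of its edge type.

    h:         {node: feature list}
    neighbors: {relation: {node: [neighbor, ...]}}
    weights:   {relation: square matrix}, one learned projection per edge type
    """
    updated = {}
    for i, h_i in h.items():
        total = list(h_i)  # self-connection, left unweighted in this sketch
        for relation, w_r in weights.items():
            nbrs_i = neighbors[relation].get(i, [])
            for j in nbrs_i:
                nbrs_j = neighbors[relation].get(j, [])
                # Assumed symmetric normalization by neighborhood sizes.
                c_ij = 1.0 / math.sqrt(len(nbrs_i) * max(len(nbrs_j), 1))
                message = matvec(w_r, h[j])
                total = [t + c_ij * m for t, m in zip(total, message)]
        updated[i] = relu(total)
    return updated


# Hypothetical toy inputs: two drugs, one protein, two edge types.
h = {"drug_A": [1.0, 0.0], "drug_B": [0.0, 1.0], "prot_1": [1.0, 1.0]}
neighbors = {
    "drug-protein": {"drug_A": ["prot_1"], "prot_1": ["drug_A"]},
    "side_effect": {"drug_A": ["drug_B"], "drug_B": ["drug_A"]},
}
weights = {
    "drug-protein": [[1.0, 0.0], [0.0, 1.0]],  # identity projection
    "side_effect": [[0.0, 1.0], [1.0, 0.0]],   # swaps the two features
}
h_next = relational_gcn_layer(h, neighbors, weights)
```

Stacking several such layers lets information travel over longer paths, for example from a protein to a drug that targets one of its interaction partners.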

Finally, the node representations are combined to predict edges. This is done via tensor factorization: essentially, each relationship is modeled by a tensor that relates the two node embeddings. This matters because it forces the prediction to depend on the combination of the two node embeddings rather than on either node alone.

The equations for drug-drug, drug-protein, and protein-protein edge predictions are given below; each produces a predicted score for a candidate edge.

[Image: edge prediction equations for drug-drug, drug-protein, and protein-protein pairs]
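The scoring step can be sketched as a bilinear form between two node embeddings, followed by a sigmoid. The function names and toy numbers are hypothetical. The factorized matrix for side effects (a diagonal side-effect-specific matrix on either side of a shared matrix) reflects my reading of the approach and should be treated as an assumption.

```python
import math


def bilinear_score(z_i, matrix, z_j):
    """Raw score z_i^T M z_j: every term multiplies a feature of i by a feature of j."""
    return sum(
        z_i[a] * matrix[a][b] * z_j[b]
        for a in range(len(z_i))
        for b in range(len(z_j))
    )


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def side_effect_matrix(diag_r, shared):
    """Build D_r R D_r, with D_r diagonal (assumed parameterization)."""
    n = len(diag_r)
    return [[diag_r[a] * shared[a][b] * diag_r[b] for b in range(n)] for a in range(n)]


# Hypothetical embeddings and parameters.
z_drug_a = [1.0, 2.0]
z_drug_b = [0.5, 1.0]
shared = [[1.0, 0.0], [0.0, 1.0]]  # shared across all side effects
diag_nausea = [2.0, 1.0]           # specific to one side effect

m_nausea = side_effect_matrix(diag_nausea, shared)
score = bilinear_score(z_drug_a, m_nausea, z_drug_b)
probability = sigmoid(score)

# If either embedding is all zeros, the score is zero whatever the other one is.
score_with_zero_partner = bilinear_score(z_drug_a, m_nausea, [0.0, 0.0])
```

Because every term in the sum is a product of one feature from each node, neither embedding can drive the score on its own.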


There are more takeaways from this paper than can be discussed here, but here are two of mine:

  1. The use of tensor factorization to model multi-factor phenomena. If you want to model phenomena that depend on the interaction of two variables, tensor factorization is a good way to make sure that your model doesn’t ignore one variable.
  2. The consideration of broader biochemical effects in the drug discovery pipeline. This paper hints at a relatively unexplored side of AI for structure-based drug discovery: biochemical interaction networks. As structural modeling of small and large molecules improves, we can begin to fill in these sparse interaction networks. Experimental evidence will be replaced by relations discovered in silico via structural drug-protein and protein-protein docking studies.


Using structural approaches to construct interaction networks

The approach in this paper is creative but limited by the sparsity of the data and the lack of mechanistic understanding of disease. All the predicted edges are based on an implicit understanding of drug-drug, drug-protein, and protein-protein interactions. Structural chemistry data can provide a mechanistic explanation of disease and could offer generalization beyond existing drugs, since structure-based models explicitly capture the theorized physical interactions.

Structure-based interaction networks could lead to better structural understanding of disease symptoms, helping to identify new targets. They could shed light on mechanisms for drugs which are effective, but which have unknown mechanisms of action. They could be used to better predict ADMET properties, along with providing structural explanations.

Reducing the cost of drug discovery requires more than hit discovery and lead optimization, which have been the focus of AI for structure-based drug discovery. Using the recently developed structure-based tools to inform human-scale interaction networks is a difficult challenge but could unlock solutions to numerous other drug discovery problems.