Topology-Driven Negative Sampling Enhances Generalizability in Protein-Protein Interaction Prediction

Published in Journal / Oxford (Accepted), 2024

Chatterjee, Ayan, et al. "Topology-Driven Negative Sampling Enhances Generalizability in Protein-Protein Interaction Prediction."

Prediction of protein-protein interactions (PPIs) plays a central role in therapeutic drug development, and graph neural networks are increasingly popular for this task. In this work, we investigate negative sampling strategies for training such models. By systematically decoupling node attributes from the graph structure, we show that models can exploit topological shortcuts, often at the cost of generalization. We propose a topology-driven negative sampling technique that forces the model to learn biologically meaningful patterns instead of structural fallacies, improving generalizability on held-out interaction data.
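To make the idea of topology-aware negative sampling concrete, the sketch below shows one common variant: degree-matched sampling, where the endpoints of each non-interacting pair are drawn in proportion to node degree so that degree alone cannot separate positives from negatives. This is an illustrative assumption, not necessarily the sampling scheme used in the paper; the function name `degree_matched_negatives`, its parameters, and the toy protein labels are all hypothetical.

```python
import random


def degree_matched_negatives(edges, num_negatives, seed=0, max_tries=100_000):
    """Sample non-edges whose endpoints are drawn proportionally to node degree.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    # Store each undirected interaction once, in canonical (sorted) order.
    positives = {tuple(sorted(e)) for e in edges}
    # Every node appears once per incident edge, so a uniform draw from
    # this list is a degree-proportional draw over nodes.
    endpoints = [node for e in sorted(positives) for node in e]

    negatives = set()
    tries = 0
    while len(negatives) < num_negatives and tries < max_tries:
        tries += 1
        u, v = rng.choice(endpoints), rng.choice(endpoints)
        if u == v:
            continue  # skip self-pairs
        pair = tuple(sorted((u, v)))
        if pair in positives:
            continue  # skip known interactions
        negatives.add(pair)
    return sorted(negatives)


# Toy interaction network with made-up protein labels.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
toy_negatives = degree_matched_negatives(toy_edges, num_negatives=3)
print(toy_negatives)
```

Compared with uniform sampling over all non-interacting pairs, this draws high-degree proteins into the negative set about as often as they appear in the positive set, which removes one obvious structural shortcut a model could otherwise learn.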

Dr. Ayan Chatterjee