May the Force be with You: Unified Force-Centric Pre-Training for 3D Molecular Conformations. (arXiv:2308.14759v1 [physics.chem-ph])

Recent works have shown the promise of learning pre-trained models for 3D
molecular representation. However, existing pre-training models focus
predominantly on equilibrium data and largely overlook off-equilibrium
conformations. Extending these methods to off-equilibrium data is difficult
because their training objectives assume that conformations are local energy
minima. We address this gap by proposing a force-centric
pre-training model for 3D molecular conformations covering both equilibrium and
off-equilibrium data. For off-equilibrium data, our model learns directly from
their atomic forces. For equilibrium data, we introduce zero-force
regularization and force-based denoising techniques to approximate
near-equilibrium forces. We obtain a unified model for 3D molecular
representation, pre-trained on over 15 million diverse conformations. Experiments show
that our pre-training objective improves force accuracy by roughly a factor of
3 compared with the Equivariant Transformer model trained without pre-training.
By incorporating regularization on equilibrium data, we resolve the problem of
unstable MD simulations in vanilla Equivariant Transformers, achieving
state-of-the-art simulation performance with inference 2.45 times faster
than NequIP. As a powerful molecular encoder, our pre-trained model achieves
performance on par with the state of the art on property prediction tasks.
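
The three training signals named in the abstract (direct force learning, zero-force regularization, and force-based denoising) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss forms, so the mean-squared-error losses, the harmonic restoring-force target (force taken as minus a stiffness times the noise), and the names `denoising_pair`, `toy_model`, `sigma`, and `stiffness` are all assumptions made for illustration. Positions and forces are flat lists of 3N coordinates.

```python
import random


def mse(a, b):
    """Mean squared error between two equal-length coordinate lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def force_matching_loss(pred_forces, ref_forces):
    # Off-equilibrium data: regress predicted forces onto reference forces.
    return mse(pred_forces, ref_forces)


def zero_force_loss(pred_forces):
    # Equilibrium data: forces vanish at a local energy minimum,
    # so the target is the zero vector.
    return mse(pred_forces, [0.0] * len(pred_forces))


def denoising_pair(eq_positions, sigma, stiffness, rng):
    # ASSUMPTION: near a minimum the energy is roughly harmonic, so a small
    # displacement `noise` is met by a restoring force of -stiffness * noise.
    # The stiffness constant is hypothetical, not a value from the paper.
    noise = [rng.gauss(0.0, sigma) for _ in eq_positions]
    noisy = [p + n for p, n in zip(eq_positions, noise)]
    target = [-stiffness * n for n in noise]
    return noisy, target


def toy_model(positions, center, stiffness):
    # Stand-in for a learned force field: an ideal spring pulling each
    # coordinate back toward `center`. A real model would be a neural network.
    return [-stiffness * (p - c) for p, c in zip(positions, center)]


rng = random.Random(0)
eq = [0.0, 0.0, 0.0, 1.1, 0.0, 0.0]  # two atoms, flattened xyz coordinates
k = 2.0

# Zero-force regularization on the equilibrium conformation.
loss_eq = zero_force_loss(toy_model(eq, eq, k))

# Force-based denoising on a perturbed copy of the same conformation.
noisy, target = denoising_pair(eq, 0.05, k, rng)
loss_dn = force_matching_loss(toy_model(noisy, eq, k), target)
```

Because the toy model is exactly harmonic with the same stiffness used to build the denoising target, both losses are zero up to floating-point rounding. With a learned model they would instead act as training signals alongside the force-matching loss on off-equilibrium data.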



