Attention-aware contrastive learning for predicting T cell receptor-antigen binding specificity. (arXiv:2206.11255v1 [q-bio.QM])

It has been verified that only a small fraction of the neoantigens presented
by MHC class I molecules on the cell surface can elicit T cell responses. This
limitation can be attributed to the binding specificity of the T cell receptor
(TCR) for the peptide-MHC complex (pMHC). Computational prediction of T cell
binding to neoantigens is a challenging and unresolved task. In this paper, we propose an
attentive-mask contrastive learning model, ATMTCR, for inferring TCR-antigen
binding specificity. For each input TCR sequence, we used a Transformer encoder
to transform it into a latent representation, and then masked a proportion of
residues, guided by attention weights, to generate its contrastive view.
After pretraining on large-scale TCR CDR3 sequences, we verified that
contrastive learning significantly improved the prediction of TCR binding to
pMHC. Beyond the detection of important amino acids and
their locations in the TCR sequence, our model can also extract high-order
semantic information underlying TCR-antigen binding specificity. In comparison
experiments conducted on two independent datasets, our method achieved
better performance than other existing algorithms. Moreover, we effectively
identified important amino acids and their positional preferences through
attention weights, which indicated the interpretability of our proposed model.
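
The attention-guided masking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `attention_guided_mask`, the mask symbol `X`, the 20% mask ratio, and the example weights are all hypothetical, and the abstract does not state whether the most-attended or the least-attended residues are masked, so the sketch exposes that choice as a flag.

```python
import math

MASK_TOKEN = "X"  # hypothetical placeholder symbol for a masked residue


def attention_guided_mask(sequence, attention, mask_ratio=0.2, mask_high=True):
    """Return a copy of `sequence` with a proportion of residues masked.

    `attention` holds one weight per residue (for example, averaged encoder
    attention). With mask_high=True the most-attended residues are masked;
    with mask_high=False the least-attended ones are. Which of the two the
    paper uses is an assumption here.
    """
    if len(sequence) != len(attention):
        raise ValueError("need one attention weight per residue")
    # Number of residues to mask, rounded up.
    n_mask = math.ceil(mask_ratio * len(sequence)) if sequence else 0
    # Rank positions by (weight, position), descending when mask_high is set.
    order = sorted(
        range(len(sequence)),
        key=lambda i: (attention[i], i),
        reverse=mask_high,
    )
    chosen = set(order[:n_mask])
    return "".join(
        MASK_TOKEN if i in chosen else residue
        for i, residue in enumerate(sequence)
    )


# Illustrative CDR3 sequence with made-up attention weights.
cdr3 = "CASSLGQAYEQYF"
weights = [0.02, 0.03, 0.05, 0.06, 0.15, 0.20, 0.18,
           0.08, 0.07, 0.05, 0.04, 0.04, 0.03]

view = attention_guided_mask(cdr3, weights)
print(view)  # CASSXXXAYEQYF
```

In a contrastive setup, the original sequence and its masked view would form a positive pair, with the other sequences in the batch serving as negatives.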


