MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease Association Prediction. (arXiv:2108.04820v1 [q-bio.QM])

Growing evidence from recent studies suggests that microRNAs (miRNAs) could
serve as biomarkers in various complex human diseases. Since wet-lab
experiments are expensive and time-consuming, computational techniques for
miRNA-disease association prediction have attracted considerable attention in
recent years. Data scarcity is one of the major challenges in building reliable
machine learning models. Combined with the use of pre-calculated hand-crafted
input features, it has led to problems of overfitting and data leakage.

We overcome the limitations of existing works by proposing a novel
multi-task convolution-based approach, which we refer to as MuCoMiD. MuCoMiD
allows automatic feature extraction while incorporating knowledge from four
heterogeneous biological information sources (miRNA interactions with
protein-coding genes (PCGs), disease interactions with PCGs, miRNA family
information, and disease ontology) in a multi-task setting, a perspective that
has not been studied before. The use of multi-channel convolutions allows us to
extract expressive representations while keeping the model linear and,
therefore, simple. To effectively test the generalization capability of our
model, we conduct large-scale experiments on standard benchmark datasets as
well as on our proposed larger independent test sets and case studies. MuCoMiD
shows an improvement of at least 5% in 5-fold CV evaluation on the HMDDv2.0 and
HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen
miRNAs and diseases over state-of-the-art approaches. We share our code for
reproducibility and future research at
https://git.l3s.uni-hannover.de/dong/cmtt.
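
The idea of an independent test set with unseen miRNAs and diseases can be
illustrated with a minimal sketch. This is not the authors' procedure: the
function name `split_unseen`, the hold-out fraction, and the toy identifiers
are assumptions made purely for illustration. The sketch holds out a subset of
miRNAs and diseases and keeps in the test set only pairs whose two entities
were both held out, so no test entity is seen during training.

```python
import random


def split_unseen(pairs, holdout_frac=0.2, seed=0):
    """Split (mirna, disease) pairs so that test entities never occur in training.

    Hypothetical helper for illustration; not taken from the MuCoMiD code base.
    """
    rng = random.Random(seed)
    mirnas = sorted({m for m, _ in pairs})
    diseases = sorted({d for _, d in pairs})

    # Choose the entities that will be hidden from the training set.
    held_m = set(rng.sample(mirnas, max(1, int(len(mirnas) * holdout_frac))))
    held_d = set(rng.sample(diseases, max(1, int(len(diseases) * holdout_frac))))

    # Training pairs contain no held-out entity at all.
    train = [(m, d) for m, d in pairs if m not in held_m and d not in held_d]
    # Test pairs consist only of held-out entities.
    test = [(m, d) for m, d in pairs if m in held_m and d in held_d]
    # Pairs with exactly one held-out entity are dropped to avoid leakage.
    return train, test


# Toy data: every combination of 10 made-up miRNAs and 10 made-up diseases.
toy_pairs = [(f"mir-{i}", f"disease-{j}") for i in range(10) for j in range(10)]
train_pairs, test_pairs = split_unseen(toy_pairs)
```

Dropping the mixed pairs costs data, but it keeps the test set free of any
entity the model could have memorized, which is the kind of leakage the
abstract warns about.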

Source: https://arxiv.org/abs/2108.04820
