Anticancer Peptides Classification using Kernel Sparse Representation Classifier. (arXiv:2212.10567v1 [q-bio.QM])
Cancer is one of the most challenging diseases because of its complexity,
variability, and diversity of causes. It has been one of the major research
topics over the past decades, yet it is still poorly understood. To this end,
multifaceted therapeutic frameworks are indispensable. Anticancer
peptides (ACPs) are the most promising treatment option, but their large-scale
identification and synthesis require reliable prediction methods, which
remain an open problem. In this paper, we present an intuitive classification strategy
that differs from the traditional black-box methods and is based on the
well-known statistical theory of sparse-representation classification
(SRC). Specifically, we create over-complete dictionary matrices by embedding
the composition of the K-spaced amino acid pairs (CKSAAP). Unlike the
traditional SRC frameworks, we use an efficient matching pursuit solver
instead of the computationally expensive basis pursuit solver in this
strategy. Furthermore, kernel principal component analysis (KPCA) is
employed to handle non-linearity and reduce the dimension of the feature
space, whereas the synthetic minority oversampling technique (SMOTE) is
used to balance the dictionary. The proposed method is evaluated on two
benchmark datasets using well-known statistical measures and is found to
outperform the existing methods. The results show the highest sensitivity with
the most balanced accuracy, which might be beneficial in understanding
structural and chemical aspects and developing new ACPs. The Google Colab
implementation of the proposed method is available at the author’s GitHub page
(https://github.com/ehtisham-Fazal/ACP-Kernel-SRC).
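
To make the pipeline described above more concrete, the following is a minimal sketch of two of its steps: CKSAAP feature extraction and sparse-representation classification with a plain matching pursuit solver. It is not the authors' implementation. The function names (`cksaap`, `matching_pursuit`, `src_predict`), the gap range `k_max`, the iteration count `n_iter`, and the four toy sequences are all assumptions made for illustration, and the KPCA and SMOTE steps are omitted.

```python
from itertools import product

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def cksaap(sequence, k_max=1):
    """Normalised counts of residue pairs separated by k = 0..k_max residues."""
    features = np.zeros((k_max + 1, len(PAIRS)))
    for k in range(k_max + 1):
        n_windows = len(sequence) - k - 1
        if n_windows <= 0:
            continue  # sequence too short for this gap
        for i in range(n_windows):
            pair = sequence[i] + sequence[i + k + 1]
            if pair in PAIR_INDEX:  # skip non-standard residues
                features[k, PAIR_INDEX[pair]] += 1
        features[k] /= n_windows
    return features.ravel()


def matching_pursuit(D, y, n_iter=10):
    """Plain matching pursuit; assumes the columns of D have unit norm."""
    coef = np.zeros(D.shape[1])
    residual = y.astype(float).copy()
    for _ in range(n_iter):
        corr = D.T @ residual
        idx = int(np.argmax(np.abs(corr)))  # best-matching atom
        coef[idx] += corr[idx]
        residual -= corr[idx] * D[:, idx]
    return coef


def src_predict(D, labels, y, n_iter=10):
    """Assign y to the class whose atoms give the smallest reconstruction error."""
    labels = np.asarray(labels)
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0
    Dn = D / norms  # unit-norm dictionary atoms
    coef = matching_pursuit(Dn, y, n_iter)
    classes = np.unique(labels)
    errors = [
        np.linalg.norm(y - Dn[:, labels == c] @ coef[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(errors))]


# Toy dictionary: made-up sequences, NOT taken from the benchmark datasets.
toy_sequences = ["KKLLKKLLKK", "KLAKLAKKLA", "DDEEDDEEDD", "GSGSGSGSGS"]
labels = np.array([1, 1, 0, 0])  # hypothetical labels: 1 = ACP, 0 = non-ACP
D = np.column_stack([cksaap(s) for s in toy_sequences])

predicted = src_predict(D, labels, cksaap("KLLKKLAKKL"))
print("predicted class for the toy query:", predicted)
```

In this sketch each dictionary column is the CKSAAP vector of one labelled peptide, and a query is assigned to the class whose atoms reconstruct it with the smallest residual. A fuller version would presumably project the features with KPCA and oversample the minority class with SMOTE before building the dictionary, as the abstract describes.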
Source: https://arxiv.org/abs/2212.10567