Result: One-class SVM and supervised machine learning models for uncovering associations of non-coding RNA with diseases

Title:
One-class SVM and supervised machine learning models for uncovering associations of non-coding RNA with diseases
Contributors:
Wang, Zenghui
Publication Year:
2022
Collection:
University of South Africa: UNISA Institutional Repository
Document Type:
Dissertation thesis
File Description:
1 online resource (xiii, 104 leaves) : illustrations (chiefly color), color graphs; application/pdf
Language:
English
Accession Number:
edsbas.3E34C3F5
Database:
BASE

Further information

The study of microRNAs (miRNAs), long non-coding RNAs (lncRNAs) and their interactions with genes may be expected to provide new technologies that can serve as valuable biomarkers for personalized treatment of diseases and aid in the prognosis of certain conditions. These molecules act at the genome level by regulating or suppressing protein expression. The primary challenge in studying these non-coding molecules is the need for labeled data indicating positive and negative interactions when predicting interactions with machine-learning or deep-learning techniques. In practice, the available data are usually imbalanced or otherwise unstable for these models. An additional problem is the extraction of features derived from the binding of these non-coding RNAs to genes. In animals, this binding usually occurs either fully or partially, which makes the process considerably complex to study. Therefore, the main objective of the present work is to demonstrate that features extracted from miRNA sequences can be used to study the development of diseases such as breast cancer and breast neoplasms, and to examine whether there is any influence on immune genes related to SARS-CoV-2. We performed experiments focusing on the erb-b2 receptor tyrosine kinase 2 (ERBB2) gene, which is involved in breast cancer. For this purpose, we gathered information on the binding between miRNA and mRNA (messenger RNA). In this part of our research, we applied a One-Class SVM and an Isolation Forest to discriminate between weak interactions, which the one-class model flags as outliers, and strong interactions that could occur between miRNA and mRNA. Additionally, this study aimed to differentiate between breast cancer cases and breast neoplasm conditions. In this part we used the information encoded in lncRNAs.
The additional features used in this part were the frequencies of k-mers, i.e., short nucleotide subsequences of length k, along with the energy released in miRNA folding. ...
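To make the k-mer frequency feature concrete, the following is a minimal sketch of one common way to compute it. It is not taken from the thesis: the function name `kmer_frequencies`, the default `k=3`, the RNA alphabet `ACGU`, and the normalization to relative frequencies are all illustrative assumptions, since the abstract does not state which values or conventions were used.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Relative frequency of every possible k-mer in a sequence.

    Hypothetical helper for illustration only; the choice of k and the
    normalization are assumptions, not the thesis' documented settings.
    """
    # Normalize case and treat DNA-style T as RNA U.
    seq = sequence.upper().replace("T", "U")
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all k-mers so the feature vector has a fixed length.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Windows containing symbols outside the alphabet (e.g. N) are ignored.
    total = sum(counts[km] for km in kmers)
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


# Example: 2-mer frequencies of a short toy sequence.
freqs = kmer_frequencies("AUGAUG", k=2)
```

The fixed-length output (4^k entries) is what makes such a vector usable as input to a one-class or supervised model alongside other features such as folding energy.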