Weighted matrix factorization on multi-relational data for LncRNA-disease association prediction.

Methods (San Diego, Calif.) (2019-06-22)
Yuehui Wang, Guoxian Yu, Jun Wang, Guangyuan Fu, Maozu Guo, Carlotta Domeniconi
ABSTRACT

Accumulating evidence shows that long non-coding RNAs (lncRNAs) play important roles in various critical biological processes, and that they affect the development and progression of various human diseases. It is therefore necessary to identify lncRNA-disease associations precisely. Identification precision can be improved by developing data integrative models. However, current models mainly need to project heterogeneous data onto homologous networks and then merge these networks into a composite one for integrative prediction. We recognize that this projection overrides the individual structure of the heterogeneous data, and that the combination is affected by noisy networks. As a result, performance is compromised. We therefore introduce a weighted matrix factorization model on multi-relational data to predict lncRNA-disease associations (WMFLDA). WMFLDA first uses a heterogeneous network to capture the inter-associations and intra-associations between different types of nodes (including genes, lncRNAs, and Disease Ontology terms). It then presets weights for the inter-association and intra-association matrices of the network, and cooperatively decomposes these matrices into low-rank ones to explore the underlying relationships between nodes. Next, it jointly optimizes the low-rank matrices and the weights. Finally, WMFLDA approximates the lncRNA-disease association matrix using the optimized matrices and weights to make its predictions. WMFLDA performs much better than related data integrative solutions across different experimental settings and evaluation metrics. It not only respects the intrinsic structures of the individual data sources, but also fuses them selectively.
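
The pipeline described in the abstract (weight the relation matrices, factor them jointly into low-rank matrices, optimize factors and weights together, then reconstruct the target matrix) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The objective form (weighted squared reconstruction error plus an L2 penalty `alpha` on the weights, with the weights constrained to the probability simplex), the tri-factorization `G_i S_ij G_j^T`, the plain gradient-descent updates, and all function and parameter names are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def weighted_joint_factorization(relations, sizes, rank=3, alpha=100.0,
                                 lr=0.005, n_iter=500, seed=0):
    """Hypothetical sketch: jointly fit R_ij ~ G_i S_ij G_j^T with weights w.

    relations: dict {(type_i, type_j): matrix of shape (sizes[i], sizes[j])}
    sizes:     dict {type: number of nodes of that type}
    Assumed objective: sum_k w_k * ||R_k - G_i S_k G_j^T||_F^2 + alpha * ||w||^2
    """
    rng = np.random.default_rng(seed)
    G = {t: rng.random((n, rank)) for t, n in sizes.items()}   # one factor per node type
    S = {key: 0.1 * rng.random((rank, rank)) for key in relations}
    keys = list(relations)
    w = np.full(len(keys), 1.0 / len(keys))                    # preset uniform weights
    history = []
    for _ in range(n_iter):
        losses = np.zeros(len(keys))
        grad_G = {t: np.zeros_like(G[t]) for t in G}
        grad_S = {}
        for k, (i, j) in enumerate(keys):
            E = G[i] @ S[(i, j)] @ G[j].T - relations[(i, j)]  # residual
            losses[k] = np.sum(E ** 2)
            grad_G[i] += 2.0 * w[k] * E @ G[j] @ S[(i, j)].T
            grad_G[j] += 2.0 * w[k] * E.T @ G[i] @ S[(i, j)]
            grad_S[(i, j)] = 2.0 * w[k] * G[i].T @ E @ G[j]
        history.append(float(w @ losses + alpha * (w @ w)))
        # Gradient step on the low-rank matrices with the weights held fixed.
        for t in G:
            G[t] -= lr * grad_G[t]
        for key in keys:
            S[key] -= lr * grad_S[key]
        # Exact minimizer over the weights for fixed losses: relations that
        # reconstruct poorly (for example, noisy ones) receive smaller weights.
        w = project_to_simplex(-losses / (2.0 * alpha))
    return G, S, w, history

def predict(G, S, key):
    """Approximate the association matrix for one relation, e.g. lncRNA-disease."""
    i, j = key
    return G[i] @ S[key] @ G[j].T
```

Under these assumptions, `predict(G, S, ("lncRNA", "disease"))` returns a dense score matrix whose largest entries at unobserved positions would be read as candidate associations. Intra-associations can be passed as relations with `type_i == type_j`, in which case both gradient terms accumulate on the same factor. A larger `alpha` keeps the weights closer to uniform, while a smaller one lets the model discard poorly fitting relations.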