A hybrid model for data visualization using linear algebra methods and machine learning algorithm

Mohsin Ali, Jitendra Choudhary, Tanmay Kasbe

Abstract


The t-distributed stochastic neighbor embedding (t-SNE) is a powerful technique for visualizing high-dimensional datasets. By reducing the dimensionality of the data, t-SNE transforms it into a form that can be more easily understood and analyzed. Existing approaches visualize high-dimensional data but do not reveal its structure in depth. This paper proposes a model that enhances visualization and improves accuracy. The proposed model combines the non-linear embedding technique t-SNE, the linear dimensionality reduction method principal component analysis (PCA), and the QR decomposition algorithm for computing eigenvalues and eigenvectors. In addition, we quantitatively compare the proposed model, QRPCA-t-SNE, with PCA-t-SNE using the following criteria: data visualization with different perplexity values and different numbers of principal components, confusion matrix, model score, mean square error (MSE), training and testing accuracy, receiver operating characteristic (ROC) curve, and area under the curve (AUC) score.
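
The combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the eigenvalues and eigenvectors of the covariance matrix are obtained with the basic (unshifted) QR algorithm, that PCA projects the centered data onto the leading eigenvectors, and that scikit-learn's `TSNE` performs the final embedding. The function names `qr_eigen`, `qr_pca`, and `qr_pca_tsne`, and all default parameter values, are hypothetical.

```python
import numpy as np


def qr_eigen(sym, n_iter=500, tol=1e-10):
    """Approximate the eigenpairs of a symmetric matrix with the unshifted QR algorithm."""
    a = np.array(sym, dtype=float)
    vecs = np.eye(a.shape[0])
    for _ in range(n_iter):
        q, r = np.linalg.qr(a)
        a = r @ q        # similar to the previous iterate, so eigenvalues are preserved
        vecs = vecs @ q  # accumulate the orthogonal factors; columns approach eigenvectors
        if np.abs(np.tril(a, -1)).max() < tol:
            break        # iterate is (numerically) diagonal
    vals = np.diag(a)
    order = np.argsort(vals)[::-1]  # sort by decreasing eigenvalue
    return vals[order], vecs[:, order]


def qr_pca(x, n_components):
    """Project x (samples x features) onto its leading principal components."""
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / (x.shape[0] - 1)
    vals, vecs = qr_eigen(cov)
    return centered @ vecs[:, :n_components], vals


def qr_pca_tsne(x, n_components=3, perplexity=30.0, seed=0):
    """Assumed pipeline: QR-based PCA followed by a 2-D t-SNE embedding."""
    from sklearn.manifold import TSNE  # lazy import; this step requires scikit-learn

    reduced, _ = qr_pca(x, n_components)
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
    return tsne.fit_transform(reduced)
```

Calling `qr_pca_tsne(x, n_components=k, perplexity=p)` for several values of `k` and `p` would give the kind of comparison across perplexity and principal components that the abstract mentions. The unshifted iteration is chosen here only for brevity; it converges slowly when eigenvalues are close in magnitude.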

Keywords


Data visualization; Drosophila melanogaster; Principal component analysis; QR decomposition; t-SNE


DOI: http://doi.org/10.11591/ijeecs.v33.i1.pp463-475


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
