Dimensionality reduction is critical for analyzing and interpreting high-dimensional data in domains such as genomics, imaging, and finance. This paper presents a comparative analysis of four dimensionality reduction techniques: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Recursive Feature Elimination (RFE), and Lasso regression. The methods are applied to datasets from genomics, medical imaging, and finance to evaluate how well each reduces dimensionality while preserving relevant information. The results show that PCA and LDA are highly effective for genomics data, reducing gene expression profiles from over 60,000 dimensions to 10-50 components while maintaining precision above 80%. For medical images, PCA and LDA reduce the pixel dimensionality by over 90% without compromising precision. For complex finance data, however, no single technique optimizes both dimensionality reduction and precision. Overall, the analysis provides domain-specific insights and highlights PCA and LDA as the leading techniques for genomics and imaging. The choice of method should be guided by the characteristics of the data. Testing on more diverse, real-world datasets is needed to further establish the validity of these findings. This research aims to inform the selection of appropriate data reduction techniques for critical applications involving high-dimensional data.
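As a rough illustration of the kind of pipeline the abstract describes, the sketch below applies PCA followed by a two-class Fisher discriminant (the two-class form of LDA) to synthetic data. It is not the authors' code or data: the dataset, its sizes, and the helper names `pca_reduce` and `lda_direction` are assumptions made for this example, and RFE and Lasso are left out for brevity. Precision is computed on the training data itself, so the figure it prints is optimistic.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    Xc = X - X.mean(axis=0)
    # Thin SVD: rows of Vt are principal directions, sorted by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)  # variance ratio per component
    return Xc @ Vt[:n_components].T, explained[:n_components]


def lda_direction(X, y):
    """Fisher discriminant direction for two classes labelled 0 and 1."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter, plus a small ridge term to keep it invertible.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)


# Synthetic stand-in for a high-dimensional dataset (assumption, not real data):
# 200 samples, 500 features, class signal confined to the first 5 features.
rng = np.random.default_rng(0)
n_samples, n_features, n_informative = 200, 500, 5
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, n_features))
X[:, :n_informative] += 3.0 * y[:, None]

# Step 1: PCA from 500 features down to 20 components.
Z, ratio = pca_reduce(X, n_components=20)
reduction = 1 - Z.shape[1] / X.shape[1]
print(f"PCA: {X.shape[1]} -> {Z.shape[1]} features ({reduction:.0%} reduction)")
print(f"Variance retained by kept components: {ratio.sum():.1%}")

# Step 2: Fisher discriminant on the PCA scores, which avoids a singular
# scatter matrix when there are more features than samples.
w = lda_direction(Z, y)
scores = Z @ w
threshold = 0.5 * (scores[y == 0].mean() + scores[y == 1].mean())
pred = (scores > threshold).astype(int)

# Precision = true positives / predicted positives (training data only).
precision = np.sum((pred == 1) & (y == 1)) / max(np.sum(pred == 1), 1)
print(f"Training precision after PCA + LDA: {precision:.1%}")
```

Running the discriminant on PCA scores rather than on the raw features is a design choice of this sketch: with more features than samples the within-class scatter matrix is singular, so some prior reduction or regularization is needed. A faithful comparison would also hold out test data before reporting precision.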
Published in: American Journal of Electrical and Computer Engineering (Volume 7, Issue 2)
DOI: 10.11648/j.ajece.20230702.12
Page(s): 27-39
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2023. Published by Science Publishing Group
Keywords: Machine Learning, Principal Component Analysis, Linear Discriminant Analysis, Recursive Feature Elimination, Lasso Regression, Genomics, Medical Imaging
APA Style
Gyamerah, S., Tour Soori, G., Redeemer Korda, D., Kwame Tawiah, J., Ayintareba Akolgo, E., et al. (2023). Comparative Analysis of Feature Extraction of High Dimensional Data Reduction Using Machine Learning Techniques. American Journal of Electrical and Computer Engineering, 7(2), 27-39. https://doi.org/10.11648/j.ajece.20230702.12
ACS Style
Gyamerah, S.; Tour Soori, G.; Redeemer Korda, D.; Kwame Tawiah, J.; Ayintareba Akolgo, E., et al. Comparative Analysis of Feature Extraction of High Dimensional Data Reduction Using Machine Learning Techniques. Am. J. Electr. Comput. Eng. 2023, 7(2), 27-39. doi: 10.11648/j.ajece.20230702.12
AMA Style
Gyamerah S, Tour Soori G, Redeemer Korda D, Kwame Tawiah J, Ayintareba Akolgo E, et al. Comparative Analysis of Feature Extraction of High Dimensional Data Reduction Using Machine Learning Techniques. Am J Electr Comput Eng. 2023;7(2):27-39. doi: 10.11648/j.ajece.20230702.12
@article{10.11648/j.ajece.20230702.12,
  author = {Seth Gyamerah and Godfred Tour Soori and Dennis Redeemer Korda and John Kwame Tawiah and Eric Ayintareba Akolgo and Emmanuel Oteng Dapaah},
  title = {Comparative Analysis of Feature Extraction of High Dimensional Data Reduction Using Machine Learning Techniques},
  journal = {American Journal of Electrical and Computer Engineering},
  volume = {7},
  number = {2},
  pages = {27-39},
  doi = {10.11648/j.ajece.20230702.12},
  url = {https://doi.org/10.11648/j.ajece.20230702.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajece.20230702.12},
  year = {2023}
}
TY - JOUR
T1 - Comparative Analysis of Feature Extraction of High Dimensional Data Reduction Using Machine Learning Techniques
AU - Seth Gyamerah
AU - Godfred Tour Soori
AU - Dennis Redeemer Korda
AU - John Kwame Tawiah
AU - Eric Ayintareba Akolgo
AU - Emmanuel Oteng Dapaah
Y1 - 2023/12/11
PY - 2023
N1 - https://doi.org/10.11648/j.ajece.20230702.12
DO - 10.11648/j.ajece.20230702.12
T2 - American Journal of Electrical and Computer Engineering
JF - American Journal of Electrical and Computer Engineering
JO - American Journal of Electrical and Computer Engineering
SP - 27
EP - 39
PB - Science Publishing Group
SN - 2640-0502
UR - https://doi.org/10.11648/j.ajece.20230702.12
VL - 7
IS - 2
ER -