Peer-Reviewed

Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree

Received: 5 August 2019     Published: 27 September 2019
Abstract

Talent introduction is an important driver of academic development in universities. Predicting the academic capacity of talent is central to talent introduction and is therefore a valuable research problem. However, traditional statistical methods are hard to apply when extracting knowledge from massive, multi-dimensional talent information. Data mining approaches are well suited to analyzing such information: they extract patterns or rules from a large dataset and then make predictions based on the relationships among the extracted information. In this study, a series of data mining approaches is employed to evaluate the academic capacity of talent and to analyze the correlation between features. Principal Component Analysis and Random Forest are used for feature extraction to improve prediction accuracy. A classical classification model, the Gradient Boosting Decision Tree, is used as the primary analytic model for prediction. To validate the effectiveness of the model, five other classification models are used in a comparative experiment based on prediction accuracy and the F-measure. Further, to investigate the contribution of important features, we perform a marginal utility analysis of the features that are highly correlated with academic talent capacity. The experimental results reveal the features that matter most for academic capacity and the factors that positively affect the academic output of talent.
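The comparative experiment described in the abstract ranks models by prediction accuracy and by the F-measure. The following is a minimal sketch of how those two metrics can be computed, not the authors' code. It assumes a binary labelling (1 = high academic capacity, 0 = otherwise), which the abstract does not specify, and the label vectors `y_true`, `pred_a` and `pred_b` are hypothetical illustration data rather than results from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int],
              positive: int = 1, beta: float = 1.0) -> float:
    """F-beta score for the positive class; beta=1 gives the usual F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision or recall is zero, so the score is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels: 4 positive and 6 negative examples (assumed data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]  # hypothetical model A
pred_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # hypothetical model B

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name, round(accuracy(y_true, pred), 3), round(f_measure(y_true, pred), 3))
```

On this toy data the two hypothetical models are close in accuracy (0.8 versus 0.7) but far apart in F-measure (0.75 versus 0.4), which illustrates why reporting both metrics is informative when the classes are imbalanced.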

Published in Applied and Computational Mathematics (Volume 8, Issue 4)
DOI 10.11648/j.acm.20190804.12
Page(s) 75-81
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2019. Published by Science Publishing Group

Keywords

Data Mining, Classification Models, Prediction, Talent Introduction, Academic Talent Capacity

Cite This Article
  • APA Style

    Shunshun Shi, Mingzhou Chen, Rui Feng, Hua Zhang, Shuai Zhang. (2019). Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree. Applied and Computational Mathematics, 8(4), 75-81. https://doi.org/10.11648/j.acm.20190804.12

    ACS Style

    Shunshun Shi; Mingzhou Chen; Rui Feng; Hua Zhang; Shuai Zhang. Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree. Appl. Comput. Math. 2019, 8(4), 75-81. doi: 10.11648/j.acm.20190804.12

    AMA Style

    Shunshun Shi, Mingzhou Chen, Rui Feng, Hua Zhang, Shuai Zhang. Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree. Appl Comput Math. 2019;8(4):75-81. doi: 10.11648/j.acm.20190804.12

  • BibTeX

    @article{10.11648/j.acm.20190804.12,
      author = {Shunshun Shi and Mingzhou Chen and Rui Feng and Hua Zhang and Shuai Zhang},
      title = {Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree},
      journal = {Applied and Computational Mathematics},
      volume = {8},
      number = {4},
      pages = {75-81},
      doi = {10.11648/j.acm.20190804.12},
      url = {https://doi.org/10.11648/j.acm.20190804.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.acm.20190804.12},
     year = {2019}
    }
    

  • RIS

    TY  - JOUR
    T1  - Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree
    AU  - Shunshun Shi
    AU  - Mingzhou Chen
    AU  - Rui Feng
    AU  - Hua Zhang
    AU  - Shuai Zhang
    Y1  - 2019/09/27
    PY  - 2019
    N1  - https://doi.org/10.11648/j.acm.20190804.12
    DO  - 10.11648/j.acm.20190804.12
    T2  - Applied and Computational Mathematics
    JF  - Applied and Computational Mathematics
    JO  - Applied and Computational Mathematics
    SP  - 75
    EP  - 81
    PB  - Science Publishing Group
    SN  - 2328-5613
    UR  - https://doi.org/10.11648/j.acm.20190804.12
    VL  - 8
    IS  - 4
    ER  - 

Author Information
  • Shunshun Shi, Mingzhou Chen, Rui Feng, Hua Zhang, Shuai Zhang: School of Information, Zhejiang University of Finance and Economics, Hangzhou, China
