Currently, about half of the transactions in the US stock market result from high-frequency algorithmic trading, making it difficult for investors with long-term horizons, such as pension funds, to obtain stable returns. Developing a market forecast model that can achieve stable returns over the long term is therefore important not only for pensions but also for central bank policy makers and new private businesses. To obtain stable long-term investment performance from a forecast model, noise must be removed from the sample data in advance so that universal patterns can be extracted. However, it is difficult to distinguish noise from true patterns beforehand. In this study, the sample space was divided into eight sub-spaces using a Two Stage Optimization decision tree, and the versatility of each sub-space was evaluated by a pattern recognition model. Sub-spaces with low versatility were then defined as spaces containing relatively large noise, and a forecast model was constructed by excluding them. The forecast model constructed in this way achieved prediction accuracy higher than that of the conventional method. Moreover, when the model was evaluated by the walk-forward method using financial time-series data, its investment performance stably exceeded the return of benchmark assets over the past 15 years.
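The sub-space filtering described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the synthetic data, the fixed-threshold split standing in for the paper's Two Stage Optimization decision tree, and the sign-agreement proxy for "versatility" are all hypothetical, not the paper's actual method.

```python
import random
import statistics

random.seed(0)

# Synthetic stand-in for financial sample data: three features in [0, 1]
# and a noisy target return that depends weakly on the first feature.
samples = [
    ([random.random() for _ in range(3)],)
    for _ in range(800)
]
samples = [(x[0], (x[0][0] - 0.5) + random.gauss(0, 0.3)) for x in samples]

def leaf_id(features, thresholds=(0.5, 0.5, 0.5)):
    """Assign a sample to one of 2**3 = 8 sub-spaces by thresholding
    each feature, mimicking a depth-3 binary tree split."""
    return sum((features[i] > thresholds[i]) << i for i in range(3))

# Group targets by sub-space.
subspaces = {k: [] for k in range(8)}
for x, y in samples:
    subspaces[leaf_id(x)].append(y)

def versatility(ys):
    """Hypothetical proxy for a sub-space's generalizability:
    do the mean targets of its two halves agree in sign?"""
    half = len(ys) // 2
    a, b = ys[:half], ys[half:]
    if not a or not b:
        return 0.0
    return 1.0 if statistics.mean(a) * statistics.mean(b) > 0 else 0.0

scores = {k: versatility(ys) for k, ys in subspaces.items()}

# Exclude low-versatility (relatively noisy) sub-spaces; a forecast
# model would then be fitted only on the samples that remain.
kept = {k for k, s in scores.items() if s >= 1.0}
print(f"kept {len(kept)} of 8 sub-spaces: {sorted(kept)}")
```

In the paper the split itself is optimized rather than fixed, and versatility is judged by a pattern recognition model rather than a sign check; this sketch only shows the shape of the pipeline: partition, score, exclude, then fit.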
Published in | International Journal on Data Science and Technology (Volume 8, Issue 4) |
DOI | 10.11648/j.ijdst.20220804.13 |
Page(s) | 72-86 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2022. Published by Science Publishing Group |
Keywords | Evolutionary Computing, Noisy Function Optimization, Financial Market, Forecasting, Regression |
APA Style
Junsuke Senoguchi. (2022). Forecasting of Global Stock Market by Two Stage Optimization Model. International Journal on Data Science and Technology, 8(4), 72-86. https://doi.org/10.11648/j.ijdst.20220804.13
ACS Style
Junsuke Senoguchi. Forecasting of Global Stock Market by Two Stage Optimization Model. Int. J. Data Sci. Technol. 2022, 8(4), 72-86. doi: 10.11648/j.ijdst.20220804.13
@article{10.11648/j.ijdst.20220804.13, author = {Junsuke Senoguchi}, title = {Forecasting of Global Stock Market by Two Stage Optimization Model}, journal = {International Journal on Data Science and Technology}, volume = {8}, number = {4}, pages = {72-86}, doi = {10.11648/j.ijdst.20220804.13}, url = {https://doi.org/10.11648/j.ijdst.20220804.13}, eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijdst.20220804.13}, year = {2022} }
TY - JOUR T1 - Forecasting of Global Stock Market by Two Stage Optimization Model AU - Junsuke Senoguchi Y1 - 2022/12/15 PY - 2022 N1 - https://doi.org/10.11648/j.ijdst.20220804.13 DO - 10.11648/j.ijdst.20220804.13 T2 - International Journal on Data Science and Technology JF - International Journal on Data Science and Technology JO - International Journal on Data Science and Technology SP - 72 EP - 86 PB - Science Publishing Group SN - 2472-2235 UR - https://doi.org/10.11648/j.ijdst.20220804.13 VL - 8 IS - 4 ER -