A logical model of multi-tiered persistent storage provides a view of data in which all available storage resources are distributed over a number of levels according to their data transfer parameters and capacities. Efficiently parallelizing data transfers in multi-tiered persistent storage is a significant challenge for a pipelined data processing model. This work examines a category of database applications implemented as sequences of operations that transfer data between the levels of multi-tiered persistent storage. Extended Petri Nets (EPN) are used to represent how such database applications can be processed in parallel. A proposed transformation converts an EPN into sequences of parallel data transfers. A method is also demonstrated for partitioning these sequences, with the goal of reducing the total number of conflicts that arise when data transfers occur between the levels of multi-tiered persistent storage. The paper proposes new rule-based algorithms for scheduling parallel data transfers that minimize total data transfer time. The objectives of the new algorithms are to distribute the workload evenly among the data transfer processes and to reduce their idle time. Several experiments have confirmed the effectiveness of the new algorithms in generating parallel data transfer plans.
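To make the load-balancing objective concrete, the following is a minimal sketch of a generic longest-processing-time-first (LPT) heuristic. It is not the rule-based algorithm proposed in the paper, which the abstract does not specify. The function name `schedule_transfers`, the transfer names, and the transfer times are hypothetical, and conflicts between storage levels are not modelled.

```python
import heapq


def schedule_transfers(transfer_times, num_processes):
    """Assign transfers to processes with a greedy LPT heuristic.

    transfer_times: dict mapping a (hypothetical) transfer name to its
                    estimated duration in arbitrary time units.
    num_processes:  number of parallel data transfer processes.
    Returns (plan, loads, makespan), where plan[p] lists the transfers
    assigned to process p in the order they were assigned.
    """
    # Min-heap of (current load, process index): the least-loaded
    # process is always at the top.
    heap = [(0, p) for p in range(num_processes)]
    heapq.heapify(heap)
    plan = [[] for _ in range(num_processes)]

    # Longest transfers first; ties broken by name for determinism.
    ordered = sorted(transfer_times.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, duration in ordered:
        load, p = heapq.heappop(heap)
        plan[p].append(name)
        heapq.heappush(heap, (load + duration, p))

    loads = [sum(transfer_times[n] for n in names) for names in plan]
    return plan, loads, max(loads)


if __name__ == "__main__":
    # Hypothetical transfer durations, for illustration only.
    transfers = {"t1": 7, "t2": 5, "t3": 4, "t4": 3, "t5": 1}
    plan, loads, makespan = schedule_transfers(transfers, 2)
    print(plan, loads, makespan)
```

With the hypothetical durations above, both processes end up with a load of 10 time units, so neither sits idle while the other finishes. A scheduler for multi-tiered storage would additionally need to account for precedence between transfers and for conflicts on shared storage levels, which this sketch deliberately ignores.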
Published in: American Journal of Information Science and Technology (Volume 8, Issue 3)
DOI: 10.11648/j.ajist.20240803.14
Page(s): 84-97
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2024. Published by Science Publishing Group
Keywords: Multi-tiered Persistent Storage, Scheduling, Parallel Data Processing, Performance Tuning, Database Management Systems
APA Style
Noon, N. N., Getta, J. R., & Xia, T. (2024). Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. American Journal of Information Science and Technology, 8(3), 84-97. https://doi.org/10.11648/j.ajist.20240803.14
ACS Style
Noon, N. N.; Getta, J. R.; Xia, T. Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage. Am. J. Inf. Sci. Technol. 2024, 8(3), 84-97. doi: 10.11648/j.ajist.20240803.14
@article{10.11648/j.ajist.20240803.14,
  author = {Nan Noon Noon and Janusz Roman Getta and Tianbing Xia},
  title = {Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage},
  journal = {American Journal of Information Science and Technology},
  volume = {8},
  number = {3},
  pages = {84-97},
  doi = {10.11648/j.ajist.20240803.14},
  url = {https://doi.org/10.11648/j.ajist.20240803.14},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajist.20240803.14},
  year = {2024}
}
TY  - JOUR
T1  - Optimization of Parallel Data Transfers in Multi-Tiered Persistent Storage
AU  - Nan Noon Noon
AU  - Janusz Roman Getta
AU  - Tianbing Xia
Y1  - 2024/09/29
PY  - 2024
N1  - https://doi.org/10.11648/j.ajist.20240803.14
DO  - 10.11648/j.ajist.20240803.14
T2  - American Journal of Information Science and Technology
JF  - American Journal of Information Science and Technology
JO  - American Journal of Information Science and Technology
SP  - 84
EP  - 97
PB  - Science Publishing Group
SN  - 2640-0588
UR  - https://doi.org/10.11648/j.ajist.20240803.14
VL  - 8
IS  - 3
ER  -