Test Case Prioritization Through Clustering: A Data-Driven Approach

Sheetal Sharma; Swati V. Chande; Dr N.K. Joshi

Authors

Sheetal Sharma, Rajasthan Technical University, Kota 324010, India
Swati V. Chande, International School of Informatics & Management, Jaipur 302020, India
Dr N.K. Joshi, Modi Institute of Management & Technology, Kota 324009, Rajasthan, India

Abstract

Thorough testing is vital to software quality and reliability, but as software systems grow more complex, managing large volumes of test cases becomes difficult [1]. Test case prioritization addresses this problem by ensuring that the most critical tests are executed first [2]. This paper presents an approach to test case prioritization that combines clustering, specifically K-means, with machine learning algorithms. We explore how K-means clustering can group similar test cases to improve prioritization efficiency [3][4]. We also examine the performance of several machine learning models, including Decision Trees (DT), Random Forests (RF), and Neural Networks (NN), and compare their results against traditional methods. The approaches are evaluated on diverse datasets using metrics such as the number of executed test cases, fault detection rate, and execution time [5]. Experimental findings indicate that integrating K-means clustering with machine learning techniques can reduce test execution effort while preserving or even improving fault detection. We also discuss the limitations of the proposed method and suggest directions for future research on optimizing test case prioritization with more advanced machine learning strategies. Overall, the framework contributes toward more effective and reliable software testing processes.
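
The clustering idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two features per test (past failure rate and normalized execution time), the test names, the deterministic seeding, and the round-robin ordering across clusters are all assumptions made for illustration, and no DT, RF, or NN model is involved.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def prioritize(tests, k):
    """Interleave clusters so that similar tests are spread across the schedule."""
    names = list(tests)
    labels = kmeans([tests[n] for n in names], k)
    clusters = {}
    for name, lab in zip(names, labels):
        clusters.setdefault(lab, []).append(name)
    # Within each cluster, rank by the hypothetical failure-rate feature (index 0).
    queues = [sorted(m, key=lambda n: -tests[n][0]) for m in clusters.values()]
    # Visit the cluster with the most failure-prone leading test first.
    queues.sort(key=lambda q: -tests[q[0]][0])
    order = []
    while any(queues):
        for q in queues:
            if q:
                order.append(q.pop(0))
    return order


# Hypothetical features per test: (past failure rate, normalized execution time).
tests = {
    "t1": (0.9, 0.2),
    "t2": (0.8, 0.3),
    "t3": (0.1, 0.9),
    "t4": (0.2, 0.8),
    "t5": (0.5, 0.5),
}
order = prioritize(tests, k=2)
print(order)
```

Taking one test from each cluster in turn means a truncated run still samples every group of similar tests, which is one plausible way clustering could reduce execution effort without losing fault detection.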

References

H. K. Leung and L. White, "Insights into regression testing (software testing)," in Proc. Conf. Softw. Maintenance, 1989, pp. 60–69.

S. Kumar and S. Singh, "Test case prioritization: Various techniques–A review," Int. J. Sci. Eng. Res., vol. 4, no. 4, pp. 1106–1109, 2013.

R. Lachmann et al., "System-level test case prioritization using machine learning," in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016, pp. 201–207.

L. Davis and P. Lee, "Deep learning for automated test case prioritization," in Proc. Int. Conf. on Software Engineering (ICSE), pp. 150–158, 2019.

R. Pan et al., "Test case selection and prioritization using machine learning: A systematic literature review," Empirical Software Engineering, vol. 27, no. 2, 2022.

A. Verma and R. Kumar, "Test case prioritization using a fuzzy logic approach based on multiple factors," Journal of Software: Evolution and Process, vol. 30, no. 2, e1908, 2018.

V. Gupta, P. C. Jha, and K. K. Biswas, "Test case prioritization using fault severity," International Journal of Computer Science and Technology (IJCST), vol. 1, no. 1, pp. 12–18, Mar. 2010.

A. Verma, R. Bajaj, and I. K. Luthra, "A novel density-based K-means clustering for test case prioritization in regression testing," International Journal of Computer Science and Technology, vol. 7, no. 1, pp. 114–116, Mar. 2016.

G. Rothermel, R. H. Untch, C. Chu, and M. J. Harrold, "Test case prioritization: An empirical study," in Proc. IEEE Int. Conf. Softw., Aug. 1999.

D. Leon and A. Podgurski, "A comparison of coverage-based and distribution-based techniques for filtering and prioritizing test cases," in Proc. 14th Int. Symp. Softw. Reliab. Eng. (ISSRE), Nov. 2003, pp. 442–453.

S. Yoo and M. Harman, "Regression testing minimization, selection and prioritization: A survey," Softw. Test. Verif. Reliab., vol. 22, no. 2, pp. 67–120, 2012.

M. Ahmed and M. Akbar, "A fuzzy logic-based approach for test case prioritization using fault severity and execution time," Int. J. Advanced Computer Science and Applications, vol. 9, no. 4, pp. 112–120, 2018.

N. Chauhan and N. Bhatnagar, "Test case prioritization using a hybrid approach combining fuzzy logic and genetic algorithms," Journal of Systems and Software, vol. 158, p. 110425, 2019.

R. Kaur and A. Singhal, "Test case prioritization using fuzzy logic-based criteria weighting," Int. J. Software Engineering and Computer Systems, vol. 6, no. 1, pp. 45–56, 2020.

S. Sharma and R. Saini, "Enhancing fault detection efficiency using fuzzy logic in test case prioritization," Journal of Computer Science and Technology, vol. 33, no. 4, pp. 692–708, 2017.

S. Yadav and A. Raghuwanshi, "Application of fuzzy logic for efficient test case prioritization," Int. J. Recent Technology and Engineering, vol. 8, no. 4, pp. 186–191, 2019.

R. Lachmann et al., "System-level test case prioritization using machine learning," in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016, pp. 201–207.

L. Davis and P. Lee, "Deep learning for automated test case prioritization," in Proc. Int. Conf. on Software Engineering (ICSE), pp. 150–158, 2019.

R. Pan et al., "Test case selection and prioritization using machine learning: A systematic literature review," Empirical Software Engineering, vol. 27, no. 2, 2022.

A. Swain et al., "Automated test case prioritization using machine learning," in International Conference on Metaheuristics in Software Engineering and its Application, Cham: Springer International Publishing, 2022.

S. Sharma and J. Choudhary, "A K-means clustering approach to test case minimization," Journal of Information Systems Engineering & Management, vol. 10, no. 4, pp. 370–380, Dec. 2024.

A. Verma and R. Kumar, "Test case prioritization using a fuzzy logic approach based on multiple factors," Journal of Software: Evolution and Process, vol. 30, no. 2, e1908, 2018.

A. C. Bezerra, T. C. de Souza, and M. A. F. Silva, "Test case prioritization using machine learning techniques," in Proceedings of the 2019 IEEE International Conference on Information Communication and Computing Systems (ICICCS), pp. 120–125, 2019.

C. Hettiarachchi, H. Do, and B. Choi, "Effective regression testing using requirements and risks," in 2014 Eighth Int. Conf. on Software Security and Reliability (SERE), San Francisco, CA, USA, pp. 157–166, June 2014, doi: 10.1109/SERE.2014.33.

V. Gupta, P. C. Jha, and K. K. Biswas, "Test case prioritization using fault severity," International Journal of Computer Science and Technology (IJCST), vol. 1, no. 1, pp. 12–18, Mar. 2010.

D. Di Nardo, N. Alshahwan, L. Briand, and Y. Labiche, "Coverage-based test case prioritisation: An industrial case study," in Proc. IEEE 6th Int. Conf. Softw. Test. Verif. Valid. (ICST), Mar. 2013, pp. 302–311.

R. Arumugam and N. Kumaravel, "Fuzzy logic approach for test case prioritization using multiple factors," Journal of Software Engineering and Applications, vol. 9, no. 9, pp. 435–445, 2016.

A. Sharif, D. Marijan, and M. Liaaen, "DeepOrder: Deep learning for test case prioritization in continuous integration testing," arXiv preprint arXiv:2301.07443, 2023.

R. Patel and L. Zhao, "Optimizing APFD for enhanced test case prioritization in large-scale systems," IEEE Access, vol. 10, pp. 24750–24765, 2022, doi: 10.1109/ACCESS.2022.3145678.

P. Li, X. Chen, and Y. Wang, "Coverage-based test case prioritization for regression testing," IEEE Transactions on Software Engineering, vol. 45, no. 2, pp. 150–165, 2019, doi: 10.1109/TSE.2019.2894517.
