Result: Secure multi-party test case data generation through generative adversarial networks.

Title:

Secure multi-party test case data generation through generative adversarial networks.

Authors:

Wang Z; Institute of Digital Economy, Beijing Academy of Science and Technology, Beijing, 100032, China.
Zhao L; Beijing Computing Center Co., Ltd., Beijing Academy of Science and Technology, Beijing, 100032, China.
Meng F; Beijing Beike Rongzhi Cloud Computing Technology Co., Ltd., Beijing Academy of Science and Technology, Beijing, 100032, China.
Zhu Z; Beijing Computing Center Co., Ltd., Beijing Academy of Science and Technology, Beijing, 100032, China.
Lu Y; Foreign Environment Cooperation Center, Ministry of Ecology and Environment, Beijing, China. lu.yiqing@fecomee.org.cn

Source:

Scientific reports [Sci Rep] 2026 Jan 13. Date of Electronic Publication: 2026 Jan 13.

Publication Model:

Ahead of Print

Publication Type:

Journal Article

Language:

English

Journal Info:

Publisher: Nature Publishing Group
Country of Publication: England
NLM ID: 101563288
Publication Model: Print-Electronic
Cited Medium: Internet
ISSN: 2045-2322 (Electronic)
Linking ISSN: 20452322
NLM ISO Abbreviation: Sci Rep
Subsets: MEDLINE

Imprint Name(s):

References:

Liang, W. & Ji, N. Privacy challenges of iot-based blockchain: a systematic review. Clust. Comput. 25, 2203–2221 (2022).
Wang, X., Sun, Y. & Ding, D. Adaptive dynamic programming for networked control systems under communication constraints: A survey of trends and techniques. Int. J. Netw. Dyn. Intell. 85–98 (2022).
Casti, J. L. On system complexity: Identification, measurement, and management. In Complexity, language, and life: Mathematical approaches, 146–173 (Springer, 1986).
Kumar, S., Aggarwal, A. G. & Gupta, R. Modeling the role of testing coverage in the software reliability assessment. Int. J. Math. Eng. Manag. Sci. 8 (2023).
Aghababaeyan, Z. et al. Black-box testing of deep neural networks through test case diversity. IEEE Trans. Softw. Eng. 49, 3182–3204 (2023).
Rampérez, V., Soriano, J., Lizcano, D. & Lara, J. A. Flas: A combination of proactive and reactive auto-scaling architecture for distributed services. Future Gener. Comput. Syst. 118, 56–72 (2021).
Xie, H. et al. A verifiable federated learning algorithm supporting distributed pseudonym tracking. In International Conference on Database Systems for Advanced Applications, 173–189 (Springer, 2024).
Sai, S., Hassija, V., Chamola, V. & Guizani, M. Federated learning and nft-based privacy-preserving medical-data-sharing scheme for intelligent diagnosis in smart healthcare. IEEE Internet Things J. 11, 5568–5577 (2023).
Chen, Z. & Jiang, L. Promise and peril of collaborative code generation models: Balancing effectiveness and memorization. In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 493–505 (2024).
Tram, H. T. N. Automated vulnerability scanning tools for securing cloud-based e-commerce supply chains. J. Appl. Cybersecurity Anal. Intell. Decis. Syst. 12, 11–21 (2022).
Xie, H. et al. Verifiable federated learning with privacy-preserving data aggregation for consumer electronics. IEEE Trans. Consum. Electron. 70, 2696–2707 (2024).
Yang, Y. et al. Federated learning for software engineering: a case study of code clone detection and defect prediction. IEEE Trans. Softw. Eng. 50, 296–321 (2024).
Preuveneers, D. et al. Chained anomaly detection models for federated learning: An intrusion detection case study. Appl. Sci. 8, 2663 (2018).
Singh, G., Sood, K., Rajalakshmi, P., Nguyen, D. D. N. & Xiang, Y. Evaluating federated learning-based intrusion detection scheme for next generation networks. IEEE Trans. Netw. Serv. Manag. 21, 4816–4829 (2024).
Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).
Ghoshal, A., Kumar, S. & Mookerjee, V. Dilemma of data sharing alliance: When do competing personalizing and non-personalizing firms share data. Prod. Oper. Manag. 29, 1918–1936 (2020).
Li, Z. et al. Data heterogeneity-robust federated learning via group client selection in industrial iot. IEEE Internet Things J. 9, 17844–17857 (2022).
Ji, X., Tian, J., Zhang, H., Wu, D. & Li, T. Joint device selection and bandwidth allocation for cost-efficient federated learning in industrial internet of things. IEEE Internet Things J. 10, 9148–9160 (2023).
Xie, H. et al. Industrial wireless internet zero trust model: Zero trust meets dynamic federated learning with blockchain. IEEE Wirel. Commun. 31, 22–29 (2024).
Zhao, L., Xie, H., Zhong, L. & Wang, Y. Explainable federated learning scheme for secure healthcare data sharing. Health Inf. Sci. Syst. 12, 1–14 (2024).
Miller, B. P., Fredriksen, L. & So, B. An empirical study of the reliability of unix utilities. Commun. ACM 33, 32–44 (1990).
Ghosh, A., Shah, V. & Schmid, M. An approach for analyzing the robustness of windows nt software. In 21st National Information Systems Security Conference, Crystal City, VA, vol. 10 (Citeseer, 1998).
Ghosh, A. K., Schmid, M. & Shah, V. Testing the robustness of windows nt software. In Proceedings Ninth International Symposium on Software Reliability Engineering (Cat. No. 98TB100257), 231–235 (IEEE, 1998).
Eddington, M. Peach fuzzing platform. Peach Fuzzer 34, 32–43 (2011).
Biyani, A. et al. Extension of spike for encrypted protocol fuzzing. In 2011 Third International Conference on Multimedia Information Networking and Security, 343–347 (IEEE, 2011).
Cui, W., Kannan, J. & Wang, H. J. Discoverer: Automatic protocol reverse engineering from network traces. In USENIX Security Symposium, 1–14 (Boston, MA, USA, 2007).
Comparetti, P. M., Wondracek, G., Kruegel, C. & Kirda, E. Prospex: Protocol specification extraction. In 2009 30th IEEE Symposium on Security and Privacy, 110–125 (IEEE, 2009).
Wondracek, G., Comparetti, P. M., Kruegel, C. & Kirda, E. Automatic network protocol analysis. In NDSS, vol. 8, 1–14 (Citeseer, 2008).
Whalen, S., Bishop, M. & Crutchfield, J. P. Hidden markov models for automated protocol learning. In Security and Privacy in Communication Networks: 6th Iternational ICST Conference, SecureComm 2010, Singapore, September 7-9, 2010. Proceedings 6, 415–428 (Springer, 2010).
Godefroid, P., Peleg, H. & Singh, R. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), 50–59 (IEEE, 2017).
Sweeney, L. k-anonymity: A model for protecting privacy. International journal of uncertainty, fuzziness and knowledge-based systems 10, 557–570 (2002).
Machanavajjhala, A., Kifer, D., Gehrke, J. & Venkitasubramaniam, M. l-diversity: Privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data (tkdd) 1, 3–es (2007).
Riyana, S., Sasujit, K. & Homdoung, N. Achieving privacy preservation constraints based on k-anonymity in conjunction with adjacency matrix and weighted graphs. ECTI Trans. Comput. Inf. Technol. (ECTI-CIT) 18, 34–50 (2024).
Riyana, S., Nanthachumphu, S. & Riyana, N. Achieving privacy preservation constraints in missing-value datasets. SN Comput. Sci. 1, 227 (2020).
Riyana, S. Achieving anatomization constraints in dynamic datasets. ECTI Trans. Comput. Inf. Technol. (ECTI-CIT) 17, 27–45 (2023).
Rasouli, M., Sun, T. & Rajagopal, R. Fedgan: Federated generative adversarial networks for distributed data. arXiv preprint arXiv:2006.07228 (2020).
Hardy, C., Le Merrer, E. & Sericola, B. Md-gan: Multi-discriminator generative adversarial networks for distributed datasets. In 2019 IEEE international parallel and distributed processing symposium (IPDPS), 866–877 (IEEE, 2019).
Dong, Y., Liu, Y., Zhang, H., Chen, S. & Qiao, Y. Fd-gan: Generative adversarial networks with fusion-discriminator for single image dehazing. Proc. AAAI Conf. Artif. Intell. 34, 10729–10736 (2020).
Bhardwaj, T. & Sumangali, K. A federated incremental blockchain framework with privacy preserving xai optimization for securing healthcare data. Sci. Rep. 15, 38001 (2025).
Dwork, C. Differential privacy. In International colloquium on automata, languages, and programming, 1–12 (Springer, 2006).
Riyana, S., Sasujit, K. & Homdoung, N. Privacy-enhancing data aggregation for big data analytics. ECTI Trans. Comput. Inf. Technol. (ECTI-CIT) 17, 440–456 (2023).
Riyana, S. ([Formula: see text],..., [Formula: see text])-privacy: privacy preservation models for numerical quasi-identifiers and multiple sensitive attributes. J. Ambient Intell. Humaniz. Comput. 12, 9713–9729 (2021).
Shamsinezhad, E., Banirostam, H., BaniRostam, T., Pedram, M. M. & Rahmani, A. M. Providing and evaluating a model for big data anonymization streams by using in-memory processing. Knowl. Inf. Syst. 1–34 (2025).
Shamsinezhad, E., Banirostam, T., Pedram, M. M. & Rahmani, A. M. Anonymizing big data streams using in-memory processing: A novel model based on one-time clustering. J. Signal Process. Syst. 96, 333–356 (2024).

Grant Information:

2024-dchrcpyzz-9 Excellent Talent Training Funding Project in Dongcheng District; 9874 GEF project: Strengthening coordinated approaches to reduce invasive alien species (IAS) threats to globally significant agrobiodiversity and agroecosystems in China

Contributed Indexing:

Keywords: Autoencoders; Federated Learning; Generative Adversarial Networks; Test Case Generation

Entry Date(s):

Date Created: 20260112
Latest Revision: 20260112

Update Code:

20260113

DOI:

10.1038/s41598-026-35773-2

PMID:

41526446

Database:

MEDLINE

Further Information

Test case data generation remains a persistent challenge in software testing: data quality varies, and data synthesis is inherently difficult. These challenges are exacerbated when data are widely distributed across heterogeneous organizational environments. Privacy regulations and security concerns strictly constrain data sharing and prevent centralized data aggregation, which makes a federated environment the more practical solution. To address the privacy protection and data sharing challenges in federated test case data generation, we propose a Generative Adversarial Network (GAN)-based method designed specifically for federated settings. The approach uses the data generation capabilities of GANs to produce high-quality, diverse test case data while preserving data privacy. Specifically, it combines a protocol grammar-based deep learning framework, an encoder-decoder mechanism for encoding test cases, and a GAN-driven sample character generator to predict and generate variant test case samples. In the federated environment, each participant trains the generator and discriminator locally, and model parameters are securely aggregated to optimize the global model. Experimental results demonstrate that the generated test case data outperform those of traditional methods in coverage and effectiveness, significantly improving the efficiency and quality of software testing. The proposed framework provides a scalable solution for identifying latent vulnerabilities in critical infrastructure while strictly adhering to data sovereignty requirements in cross-organizational environments.
(© 2026. The Author(s).)
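
The abstract states that each participant trains locally and that model parameters are securely aggregated, but it does not specify the aggregation protocol. The following is a minimal Python sketch of one common idea, pairwise additive masking followed by averaging, and is not the authors' method. The function names `pairwise_masks` and `secure_average`, the flat parameter lists, and the single shared seed are all assumptions made for illustration; a real protocol would derive each pairwise mask from a key agreed between the two clients.

```python
import random


def pairwise_masks(num_clients, dim, seed=0):
    """Build per-client masks that sum to zero across all clients.

    Assumption for illustration: one shared seed stands in for the
    pairwise key agreement a real deployment would need.
    """
    rng = random.Random(seed)
    masks = [[0.0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]  # client i adds the pair's mask
                masks[j][k] -= shared[k]  # client j subtracts the same mask
    return masks


def secure_average(client_params, seed=0):
    """Average client parameter vectors from masked uploads only.

    The aggregator sums the masked vectors; the masks cancel in the sum,
    so only the aggregate is recovered, not any single client's vector.
    """
    num_clients = len(client_params)
    dim = len(client_params[0])
    masks = pairwise_masks(num_clients, dim, seed)
    masked = [
        [p + m for p, m in zip(params, mask)]
        for params, mask in zip(client_params, masks)
    ]
    totals = [sum(column) for column in zip(*masked)]
    return [t / num_clients for t in totals]


# Hypothetical flattened parameters from three participants.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
global_params = secure_average(clients)
```

Under these assumptions, `global_params` matches the plain element-wise mean of the three vectors up to floating-point rounding. In a federated GAN the same step could be applied separately to the generator and the discriminator parameters.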

Declarations. Competing interests: The authors declare no competing interests.
