LEVERAGING ISTIO FOR ADVANCED TRAFFIC MANAGEMENT AND SECURITY IN GENERATIVE AI APPLICATIONS ON KUBERNETES CLUSTER

Novman Mohammed; Rajesh Kumar Malviya

Authors

Novman Mohammed, Sr. DevOps Engineer, Hyderabad, India
Rajesh Kumar Malviya, Enterprise Architect, Bengaluru, Karnataka, India

Keywords:

Artificial Intelligence (AI), Failure Analysis (FA), Fault Injection (FI), Traffic Management, Leveraging Istio

Abstract

A growing number of domains have adopted AI as a result of its rapid development; one such domain is AI-Generated Content (AIGC), where Large Language Models (LLMs) have greatly improved capabilities. At the same time, the complexity of AI systems has exposed their vulnerabilities, so building reliable and resilient systems requires strong methods for failure analysis (FA) and fault injection (FI). Despite the importance of these techniques, FA and FI procedures in AI systems have not yet been thoroughly evaluated. As infrastructures move from monolithic to microservices architectures, containers have the potential to standardize and fine-tune resource management. Microservices allow a single machine to be used as if it were many machines, and letting programs adjust the number of machines according to demand reduces resource waste. The best way to route traffic to services hosted in hybrid-cloud, on-premises, or multi-cloud deployments is to employ a service mesh, particularly in microservices environments. The main goal of a service mesh is to improve the performance and security of Kubernetes. Istio is one implementation of the service mesh concept. Although Kubernetes offers Ingress, Istio is still required, and this paper explains why.

