Self-Healing REST Services Using Artificial Intelligence in Multi-Cloud Environments

Authors

  • Ishu Anand Jaiswal, Apple Inc., One Apple Park Way, Cupertino, CA 95014, USA

DOI:

https://doi.org/10.63345/sjaibt.v1.i3.201

Keywords:

Self-Healing Systems, RESTful Services, Artificial Intelligence, Multi-Cloud Computing, Cloud-Native Architecture, Fault Detection, Autonomous Systems, Service Reliability, Machine Learning Monitoring, Distributed Systems.

Abstract

Modern digital applications depend on RESTful services running on distributed cloud infrastructure. Service reliability becomes considerably harder to maintain as organizations move to multi-cloud architectures to improve scalability and reliability and to reduce dependence on a single vendor. Conventional monitoring and incident-response methods often react slowly to failures such as API latency spikes, service outages, container crashes, and configuration errors. These limitations lead to downtime, degraded performance, and higher operational costs.

Self-healing systems have emerged as a promising way to address these challenges. A self-healing architecture allows a software system to detect, diagnose, and recover from failures without human intervention. Combined with Artificial Intelligence (AI), self-healing can also forecast likely failures, optimise system behaviour, and apply corrective measures automatically. AI-powered monitoring systems can process large volumes of system telemetry, logs, and performance metrics to detect anomalies and trigger automated remediation.
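
As an illustration of how anomaly detection over performance metrics might look in practice, the following minimal Python sketch flags latency samples that deviate strongly from a rolling baseline. It is not the paper's method: the class name `LatencyAnomalyDetector`, the window size, the z-score threshold, and the choice of a simple z-score test in place of a trained machine learning model are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, stdev


class LatencyAnomalyDetector:
    """Hypothetical detector: flags samples far from a rolling latency baseline."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        # Only the most recent `window` normal samples form the baseline.
        self.samples = deque(maxlen=window)
        self.threshold = threshold      # z-score above which a sample is anomalous
        self.min_samples = min_samples  # warm-up period before any check is made

    def observe(self, latency_ms):
        """Return True if latency_ms is anomalous relative to the baseline."""
        is_anomaly = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            if sigma > 0 and abs(latency_ms - mu) / sigma > self.threshold:
                is_anomaly = True
        # Anomalous samples are kept out of the baseline so one spike
        # does not inflate the variance and mask the next one.
        if not is_anomaly:
            self.samples.append(latency_ms)
        return is_anomaly


detector = LatencyAnomalyDetector()
normal_flags = [detector.observe(x) for x in [100, 102, 98, 101, 99, 100, 103]]
spike_flag = detector.observe(900)
recovered_flag = detector.observe(101)
```

In this sketch a flagged sample would be the trigger for remediation; a production system would more plausibly combine several signals (error rates, traces, logs) before acting.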

This paper proposes an AI-driven self-healing framework for REST services in multi-cloud environments. The framework combines machine-learning-based outlier detection, automated fault diagnosis, predictive analytics, and intelligent orchestration. Using cloud monitoring data, distributed tracing, and decision engines based on reinforcement learning, the proposed system monitors the performance of REST services and takes automatic recovery actions such as container restarts, service rerouting, auto-scaling, and API gateway reconfiguration.
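
To make the idea of a learning-based decision engine concrete, the sketch below shows one simple way a recovery action could be chosen and refined from feedback. It is a bandit-style simplification (epsilon-greedy selection with an incremental value update), not a full reinforcement learning system and not the framework's actual engine. The action names mirror the recovery measures listed in the abstract; the class `RemediationPolicy`, the fault label `latency_spike`, and the reward values are hypothetical.

```python
import random
from collections import defaultdict

# Recovery actions named in the abstract; the identifiers are illustrative.
ACTIONS = ["restart_container", "reroute_service", "scale_out", "reconfigure_gateway"]


class RemediationPolicy:
    """Hypothetical epsilon-greedy policy mapping a fault type to an action."""

    def __init__(self, epsilon=0.1, learning_rate=0.5, seed=0):
        self.epsilon = epsilon              # probability of exploring a random action
        self.learning_rate = learning_rate  # step size for the value update
        self.rng = random.Random(seed)
        # q[fault][action] is the estimated reward of an action for a fault type.
        self.q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

    def choose(self, fault):
        """Pick an action: explore with probability epsilon, else take the best."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        values = self.q[fault]
        return max(ACTIONS, key=lambda a: values[a])

    def update(self, fault, action, reward):
        """Move the estimate toward the observed reward (e.g. +1 if recovered)."""
        old = self.q[fault][action]
        self.q[fault][action] = old + self.learning_rate * (reward - old)


policy = RemediationPolicy(epsilon=0.0)  # no exploration, for a repeatable demo
policy.update("latency_spike", "scale_out", 1.0)           # scaling out helped
policy.update("latency_spike", "restart_container", -1.0)  # restarting did not
chosen = policy.choose("latency_spike")
```

The reward signal here is assumed to come from post-remediation health checks; how the proposed framework actually defines state and reward is not specified in the abstract.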

 

Published

04-08-2024

Section

Original Research Articles

How to Cite

Self-Healing REST Services Using Artificial Intelligence in Multi-Cloud Environments. (2024). Scientific Journal of Artificial Intelligence and Blockchain Technologies, 1(3), Aug (1-7). https://doi.org/10.63345/sjaibt.v1.i3.201
