Baseline Agentic AI Systems Architecture
by info.odysseyx@gmail.com, August 20, 2024

Agentic AI systems are designed to solve complex problems with limited direct human supervision [1]. These systems consist of multiple conversational agents that converse with each other and can be centrally coordinated or self-organizing in a distributed manner [1, 2]. As enterprises increasingly use multi-agent systems to automate complex processes or solve complex tasks, we want to take a closer look at how these systems might be architected.

These agents have the following capabilities. Planning lets them predict future states and select optimal actions to achieve specific goals. Memory lets an agent recall past interactions, experiences, and knowledge, which is important for maintaining continuity of work and improving strategies. Agents can also be equipped with tools to run code, query databases, and interact with other systems, including APIs and external software [1, 3]. These tools expand an agent's capabilities and allow it to perform a wider range of tasks.

Because agents can write and execute code, they have the potential to run code that is malicious or harmful to the host system or other users [3]. Understanding the architecture of these systems is therefore critical for sandboxing code execution, limiting or denying access to production data and services, and mitigating failures, vulnerabilities, and abuse.

This article provides a baseline architecture for building and deploying agentic AI systems using frameworks such as AutoGen, LangChain, LlamaIndex, or Semantic Kernel. It is based on the Baseline OpenAI End-to-End Chat Reference Architecture [4]. We propose Azure Container Apps or Azure Kubernetes Service as the main platform for deploying agents, orchestrators, APIs, and prompt flows.
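To make the planning, memory, and tool capabilities above concrete, here is a minimal, framework-free sketch of an agent loop. Everything in it (the Agent class, the keyword-based plan step, the toy tools) is a hypothetical illustration; a real system would delegate planning to an LLM via AutoGen, LangChain, or a similar framework.

```python
# Minimal sketch of an agentic loop: plan an action, act via a tool, remember.
# All names here are illustrative, not part of any framework's API.

class Agent:
    def __init__(self, tools):
        self.tools = tools        # name -> callable (code exec, DB query, API call)
        self.memory = []          # past interactions, used to maintain continuity

    def plan(self, goal: str) -> str:
        # A real agent would ask an LLM to choose the next action;
        # a trivial keyword rule stands in for that here.
        return "search" if "find" in goal else "calculate"

    def run(self, goal: str, payload: str):
        action = self.plan(goal)
        result = self.tools[action](payload)
        self.memory.append((goal, action, result))  # recalled in later turns
        return result

agent = Agent(tools={
    "search": lambda q: f"results for {q!r}",
    "calculate": lambda expr: eval(expr),  # untrusted code belongs in a sandbox!
})
print(agent.run("calculate the total", "2 + 3"))  # 5
```

The `eval` tool is exactly the kind of capability that motivates the sandboxed code interpreter sessions discussed later in this architecture.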
Because some models require a machine learning workspace, we have also integrated one into the architecture. To use Azure OpenAI efficiently, we suggest deploying the service behind Azure API Management, which provides dedicated policies for Azure OpenAI services. If a UI is needed, we suggest using App Service. Outside the workload network, an Azure Container Apps environment with serverless code interpreter sessions (preview) [3] executes code generated by the agents.

Architecture Components

Many components of this proposed architecture are similar to the Baseline OpenAI End-to-End Chat Reference Architecture [4], whose main components are Azure OpenAI, Azure AI Search, and Azure AI Services. Here we highlight the main components of this architecture.

Azure AI Studio [5]
A managed cloud service used to train, deploy, automate, and manage machine learning models, including the large language models (LLMs), small language models (SLMs), and multimodal models used by agents. The platform provides a comprehensive set of tools and services that facilitate the end-to-end machine learning lifecycle. Key features of Azure AI Studio include:

Prompt Flow [6]
A development tool designed to simplify the entire development lifecycle of generative AI applications. You can create, test, and deploy prompt flows, which generate responses or actions based on given prompts. These prompt flows can be deployed to a Machine Learning workspace, or containerized and deployed to Azure Container Apps or Azure Kubernetes Service [7]. AI Studio also lets you develop and deploy these prompt flows.

Managed Online Endpoints
Used to invoke prompt flows for real-time inference from agents and backend services. They provide a scalable, reliable, and secure endpoint for deploying machine learning models, enabling real-time decision making and interaction [7].
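As an illustration of calling a prompt flow behind a managed online endpoint, the sketch below builds (but deliberately does not send) a scoring request. The endpoint URL, key, and request body are placeholder assumptions; the actual input schema depends on the inputs your prompt flow declares.

```python
# Hedged sketch: preparing a scoring call to a prompt flow deployed behind a
# managed online endpoint. URL and key are placeholders, not real values.
import json
import urllib.request

ENDPOINT_URL = "https://my-endpoint.eastus.inference.ml.azure.com/score"  # hypothetical
API_KEY = "<endpoint-key>"                                                # placeholder

def build_score_request(question: str, chat_history: list) -> urllib.request.Request:
    # Field names mirror a typical chat prompt flow; adjust to your flow's inputs.
    body = json.dumps({"question": question, "chat_history": chat_history}).encode()
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_score_request("What is RAG?", [])
print(req.full_url)  # https://my-endpoint.eastus.inference.ml.azure.com/score
# To invoke for real: urllib.request.urlopen(req) with a valid endpoint key.
```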
Azure AI Dependencies
Essential Azure services and resources that support the functionality of AI Studio and related projects [8]:

Azure Storage Account
Stores project artifacts such as prompt flows and assessment data. Primarily used to manage data and model assets in AI Studio.

Azure AI Search
A comprehensive cloud search service that supports full-text search, semantic search, vector search, and hybrid search. It provides search capabilities for AI projects and agents and is essential for implementing the Retrieval-Augmented Generation (RAG) pattern. This pattern involves extracting relevant queries from a prompt, querying the AI Search service, and using the results to generate a response with an LLM or SLM.

Azure Key Vault
Used to securely store and manage the secrets, keys, and certificates required by agents, AI projects, and backend services.

Azure Container Registry
Stores and manages container images for agents, backend APIs, orchestrators, and other components. It also stores the images generated when using custom runtimes for prompt flows.

Azure OpenAI Service
Supports natural language processing tasks such as text generation, summarization, and conversation.

Azure AI Services
Provides APIs for vision, speech, language, and decision making, including custom models. Document Intelligence extracts data from documents and processes it intelligently. Azure AI Speech converts speech to text and text to speech, and also offers translation. Azure AI Content Safety helps ensure AI-generated content is ethical and safe, preventing the creation or spread of harmful or biased material.

These components and services provided by Azure AI Studio facilitate the development and operation of agentic AI systems by enabling seamless integration, deployment, and management of sophisticated AI solutions.

Azure Cosmos DB
Well suited to agentic AI systems and AI agents [9]. It can provide "session" memory containing the message history of a conversational agent (e.g.
chat messages in AutoGen [9, 10]). It can also serve as an LLM cache [9, 11] and, finally, as a vector database [9, 12].

Azure Cache for Redis
An in-memory store that can hold short-term memory for agents (such as AutoGen agents) and serve as an LLM cache [11, 13]. It can also be used to improve performance in backend services and as a session store [13].

Azure API Management
A key architectural component for managing access to Azure OpenAI services, especially when they are used by multiple agents. First, you can import the Azure OpenAI API into API Management directly or via its OpenAPI specification [14]. There are several ways to authenticate and authorize access to the Azure OpenAI API using API Management policies [15]. You can also use API Management to monitor and analyze Azure OpenAI service usage [16], set token limit policies [17], and enable semantic caching of responses to Azure OpenAI requests to reduce bandwidth, processing requirements, and latency [18]. For semantic caching, you can use distributed Azure Cache for Redis [18]. Finally, API Management policies let you implement smart load balancing across your Azure OpenAI services [19-21]. For all these reasons, in this architecture the agents do not call the Azure OpenAI service directly, but rather go through Azure API Management. API Management is also used to expose the backend services' APIs to the agents and to the outside world.

Azure Container Apps
A serverless platform that lets you focus on containerized applications rather than infrastructure [22]. It is well suited to agentic AI systems: agents, orchestrators, prompt flows, and backend APIs can all be deployed as container apps, which scale automatically with load. Container Apps also provides Dapr integration, which helps you implement simple, portable, resilient, and secure microservices and agents [23].
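The semantic caching idea mentioned above can be sketched as follows. The bag-of-words "embedding", Jaccard similarity, and threshold are toy stand-ins chosen for illustration; API Management's implementation uses real embeddings and a Redis-backed store.

```python
# Toy semantic cache: if a new prompt is "close enough" to a cached one,
# serve the cached answer instead of calling the model again.

def embed(text: str) -> set:
    # Stand-in "embedding": bag of lowercased words. Real semantic caching
    # uses vector embeddings from an embedding model.
    return set(text.lower().split())

def similarity(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0  # Jaccard overlap

class SemanticCache:
    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold   # illustrative; tuned differently for real embeddings
        self.entries = []            # list of (embedding, cached response)

    def get(self, prompt: str):
        emb = embed(prompt)
        for cached_emb, response in self.entries:
            if similarity(emb, cached_emb) >= self.threshold:
                return response      # cache hit: skip the Azure OpenAI call
        return None                  # cache miss: call the model, then put()

    def put(self, prompt: str, response: str):
        self.entries.append((embed(prompt), response))

cache = SemanticCache()
cache.put("summarize this report", "<summary>")
print(cache.get("summarize this report please"))  # hit: <summary>
print(cache.get("translate to French"))           # miss: None
```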
For asynchronous communication between agents, and between agents and orchestrators, we recommend Azure Service Bus, a fully managed enterprise message broker with message queues and publish-subscribe topics [24]. It provides decoupled communication between agents and between agents and orchestrators. Dapr can be used to communicate with Azure Service Bus [24] and provides resiliency policies for that communication (preview) [25].

For synchronous communication between agents, and between agents and orchestrators, you can use Dapr service-to-service invocation: a simple way to call other services (agents or orchestrators) directly, with automatic mTLS authentication and encryption and service discovery [24]. Dapr also provides resiliency for service calls, although this does not apply to requests made using the Dapr service invocation API [26].

Azure Kubernetes Service (AKS)
AKS can also be used in this architecture. You can deploy Dapr on Azure Kubernetes Service, or use a service mesh for direct communication between agents and between agents and orchestrators. Azure Kubernetes Service likewise provides a robust platform for agents, orchestrators, prompt flows, and backend APIs.

Azure Container Apps Code Interpreter Sessions (Preview)
Completely isolated and designed to run untrusted code [3]. Powered by Azure Container Apps dynamic sessions (preview), they provide fast access to secure sandbox environments with strong isolation [27]. Code interpreter sessions are fully isolated from each other by a Hyper-V boundary, providing enterprise-grade security and isolation [3, 27]. Outbound traffic can also be restricted [3]. By default, the Python code interpreter session includes popular Python packages such as NumPy, pandas, and scikit-learn [3]. You can also create custom container sessions to suit your needs [28].
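As a local analogy for the sandboxing idea above, the sketch below runs agent-generated code in a separate, time-limited, isolated-mode process rather than in the host interpreter. This is only an illustration of the pattern: a subprocess is not a security boundary comparable to the Hyper-V isolation that dynamic sessions provide.

```python
# Run untrusted, agent-generated Python in a separate short-lived process.
# NOT a substitute for real sandboxing; it only illustrates the pattern of
# keeping generated code out of the host interpreter.
import subprocess
import sys

def run_untrusted(code: str, timeout: float = 5.0) -> str:
    # "-I" runs Python in isolated mode (ignores env vars and user site-packages);
    # the timeout bounds how long the generated code may run.
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout.strip() or proc.stderr.strip()

print(run_untrusted("print(sum(range(10)))"))  # 45
```

In this architecture the same role is played by sending the generated code to a code interpreter session's execution API instead of a local subprocess.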
Azure Container Apps code interpreter sessions and custom container sessions are ideal for running agent-generated code in a secure, isolated environment. They are a critical component of this architecture, preventing malicious code execution and protecting the host system and other users.

Conclusion

Agentic AI systems represent a significant advance in artificial intelligence, providing autonomous decision-making and problem-solving capabilities with minimal human intervention. By leveraging conversational agents with planning, memory, and tooling capabilities, these systems can tackle complex enterprise challenges. The proposed architecture, built on a suite of Azure services including Azure OpenAI, AI Studio, Azure API Management, and Container Apps, provides a strong foundation for deploying these intelligent systems. Ensuring their safe, reliable, and ethical operation is critical, especially in managing code execution and data security. As the field evolves, it is essential to continually improve these architectures and practices to maximize the benefits and minimize the risks associated with agentic AI.

References

[1] Shavit Y, Agarwal S, Brundage M, Adler S, O'Keefe C, Campbell R, Lee T, Mishkin P, Eloundou T, Hickey A, Slama K. Practices for governing agentic AI systems. Research paper, OpenAI, December 2023.
[2] Wu Q, Bansal G, Zhang J, Wu Y, Zhang S, Zhu E, Li B, Jiang L, Zhang X, Wang C. AutoGen: Enabling next-gen LLM applications via multi-agent conversation framework. arXiv preprint arXiv:2308.08155, 16 Aug 2023.

Appendix

Acknowledgments
Special thanks to my colleagues for their feedback on the architecture:
Anurag Karuparti, Freddy Ayala, Hitashi Patel, George Varghese, Paulik Garraway, Sam El-Anis, Srikanth Baktan, Zhuhai Ramram