Manage your Generative AI APIs with Azure API Management and Azure OpenAI

This is for you who have started with Generative AI APIs and are looking to take those APIs into production. At a high level, there are things to consider like load balancing, error management, and cost management. We'll cover those in this article and point you to an Azure Sample where you can get started deploying an enterprise-ready solution.

Scenario: you want to take your generative AI to production.

So, you're starting to use Azure OpenAI, you like what you see, and you can see how you could add this AI functionality to several of the apps in your company.

However, you have a big operation with many customers, many different apps, and enterprise-grade requirements on security, and you know you must tackle all of that before you can adopt generative AI in your business.

Problems we must address

You write up a list of problems that you need to address to fully implement generative AI:

- Load balancing and circuit breaker: With many customers, it's key that you can distribute the load across multiple instances. Error management is equally important, to ensure that if one instance fails, the others can take over. A common approach to error management in the cloud is the circuit breaker pattern, which stops sending requests to a failing instance and redirects them to healthy ones.

- Monitoring and metrics: You want to monitor the usage of the AI model: how many requests come in, how many fail, and how many succeed. You also want to track how many tokens are being used and how many are left. On top of that, caching responses can reduce the load on the AI model, save costs, and improve performance.

- Security: You want to secure the AI model; you don't want just anyone to access it. You have perhaps started by using API keys, but for enterprise scenarios you want to use managed identity (see the sketch after this list).
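
To make the managed identity point concrete, here is a minimal TypeScript sketch of calling Azure OpenAI with a Microsoft Entra ID token (obtained via DefaultAzureCredential from @azure/identity) instead of an API key. The endpoint, deployment name, and API version are placeholders, and note that in the sample itself the identity sits between API Management and Azure OpenAI rather than in client code; this only illustrates the authentication pattern.

```typescript
import { DefaultAzureCredential } from "@azure/identity";

// Placeholders - replace with your own resource values.
const endpoint = "https://<your-openai-resource>.openai.azure.com";
const deployment = "<your-deployment-name>";
const apiVersion = "2024-02-01";

async function chat(prompt: string): Promise<string> {
  // Acquire a Microsoft Entra ID token instead of using an API key.
  const credential = new DefaultAzureCredential();
  const token = await credential.getToken("https://cognitiveservices.azure.com/.default");

  const response = await fetch(
    `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token.token}`, // no api-key header needed
      },
      body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
    }
  );

  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}

chat("Hello!").then(console.log).catch(console.error);
```

Run this with Node.js 18 or later (for the built-in fetch) after installing @azure/identity.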

You ask yourself: can a cloud service handle the above problems? It turns out Azure API Management (APIM) has an interesting approach to them. In fact, there's an Azure sample that implements all of the above, so let's dive in to see how:

Resources

Here are some great resources to get you started and to learn more about the features implemented in the Azure Sample.

- Azure sample - APIM + Generative AI

- Azure API Management - Overview and key concepts | Microsoft Learn

- Azure API Management policy reference - azure-openai-token-limit | Microsoft Learn

- Azure API Management policy reference - azure-openai-emit-token-metric | Microsoft Learn

- Azure API Management backends | Microsoft Learn

- Use managed identities in Azure API Management | Microsoft Learn

Introducing: an enterprise-grade sample using APIM + Generative AI

In this sample, we get a chat app (frontend and backend) and a set of cloud resources that can be deployed to Azure using the Azure Developer CLI (azd). Below is the user interface of the app included in the sample:

[Screenshot: the chat app user interface included in the sample]

Architecture view of the sample

OK, so first we get a chat window. That's a good start, but let's learn more about the architecture and how the sample is implemented:

[Diagram: architecture of the sample, showing the chat app, API Management, and the Azure OpenAI backends]

The easiest way to describe how the architecture works is to consider an incoming web request and what happens to it. In our case, we have a POST request with a prompt (a client-side sketch of such a request follows the list below).

  1. Request hits the API, and the API considers what to do with it.
  2. Authentication: first it checks whether you're allowed, by checking the subscriber ID you provided in your request.
  3. Routing: next the API evaluates the policies to determine whether this request is within token limits (and the request is logged); thereafter it's sent to the load balancer, which determines which backend to send it to (each backend has a 1:1 association with an Azure OpenAI endpoint).
    1. There's an alternate scenario here: if a backend responds with a certain type of error within a certain time interval, the request is routed to a healthy resource.
  4. Creating a response: the assigned Azure OpenAI endpoint responds, and the user sees the response rendered in the chat app.
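
To ground the flow above, here is a minimal TypeScript sketch of the kind of request the client sends to the API Management gateway. The gateway URL, route, and payload shape are placeholders rather than the sample's actual values, and the sample's exact authentication mechanism may differ; the Ocp-Apim-Subscription-Key header is simply the standard way to pass an APIM subscription key.

```typescript
// Minimal sketch of the incoming request described above, sent from a client
// to the API Management gateway. URL, route, and payload are placeholders.
const gatewayUrl = "https://<your-apim-instance>.azure-api.net/<your-api-path>";
const subscriptionKey = process.env.APIM_SUBSCRIPTION_KEY ?? "<your-subscription-key>";

async function sendPrompt(prompt: string): Promise<Response> {
  return fetch(gatewayUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Standard API Management header carrying the subscription key, which
      // the gateway uses to identify and authorize the caller (the authentication step above).
      "Ocp-Apim-Subscription-Key": subscriptionKey,
    },
    body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
  });
}
```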

Above is the happy path. If an endpoint throws errors with a certain frequency and/or error code, the circuit breaker logic is triggered and the request is routed to a healthy endpoint. Another reason for not getting a chat response back is that the token limits have been hit, i.e. rate limiting (for example, you've made too many requests in a short time span).
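
On the client side you typically want to handle the rate-limited case gracefully rather than surfacing a raw error. Below is a hedged TypeScript sketch of a retry helper that backs off when the gateway returns 429; whether and how the sample surfaces a Retry-After header is an assumption, so treat the header name and retry counts as illustrative.

```typescript
// Illustrative retry helper for 429 (rate limited) responses from the gateway.
// Header names and retry counts are assumptions, not taken from the sample.
async function fetchWithRetry(
  url: string,
  init: RequestInit,
  maxRetries = 3
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, init);

    if (response.status !== 429 || attempt === maxRetries) {
      return response;
    }

    // Prefer the Retry-After header if present, otherwise back off exponentially.
    const retryAfter = Number(response.headers.get("Retry-After"));
    const delayMs =
      Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : 1000 * 2 ** attempt;

    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  // Unreachable, but keeps TypeScript's control-flow analysis happy.
  throw new Error("Retries exhausted");
}
```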

Also note how a semantic cache could respond instead, if the incoming prompt is similar enough to a prompt whose response is already in the cache.
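
The article only hints at semantic caching, so the sketch below just illustrates the core idea in TypeScript: compare an embedding of the incoming prompt against embeddings of previously answered prompts and return the cached response when the cosine similarity is high enough. getEmbedding is a hypothetical helper (for example, a call to an embeddings deployment) and the 0.9 threshold is arbitrary.

```typescript
// Conceptual sketch of a semantic cache lookup. getEmbedding is a hypothetical
// helper (e.g. a call to an embeddings deployment); the threshold is arbitrary.
type CacheEntry = { embedding: number[]; response: string };
const cache: CacheEntry[] = [];

function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  return dot / (norm(a) * norm(b));
}

async function lookup(
  prompt: string,
  getEmbedding: (text: string) => Promise<number[]>
): Promise<string | null> {
  const embedding = await getEmbedding(prompt);
  const hit = cache.find((entry) => cosineSimilarity(entry.embedding, embedding) > 0.9);
  // Return the cached response on a near-match, otherwise signal a cache miss.
  return hit ? hit.response : null;
}
```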

How to get started

Ensure you have a terminal up and running and that you have the Azure Developer CLI (azd) installed. Then run the following steps:

Clone the repo (or start in codespaces)

git clone https://github.com/Azure-Samples/genai-gateway-apim.git

Log in to Azure

azd auth login

Deploy the app

azd up

Run the app. At this point, you have your cloud resources deployed. To test them out, run the app locally (you need to have Node.js installed). From the repo directory, run the commands below in a terminal:

cd src
npm install
npm start

This will start the app on http://localhost:3000 and the API is available at http://localhost:1337.
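
If you want to poke at the locally running API directly (outside the chat UI), a quick request along these lines works; the route and payload shape are assumptions, so check the API code under src for the real ones.

```typescript
// Quick smoke test against the locally running API. The route and payload
// shape are assumptions - check the API code under src/ for the real ones.
fetch("http://localhost:1337/<your-api-route>", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ messages: [{ role: "user", content: "Hello!" }] }),
})
  .then(async (res) => console.log(res.status, await res.text()))
  .catch(console.error);
```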

What next

Our suggestion is that you go and check out the Azure Sample - APIM + Generative AI. Try deploying it and see how it works.

Let us know if you have any questions or feedback.


FAQs

How to get an API key for Azure OpenAI?

Obtain an API key from the Azure OpenAI resource: in the Azure portal, find a key on the Keys and Endpoint page of the Azure OpenAI resource. To store it for use in API Management, go to your API Management instance and select Named values in the left menu.

How to add APIs to Azure API Management with Azure DevOps?

Create an API

Navigate to your API Management service in the Azure portal and select APIs from the menu. From the left menu, select + Add API. Select HTTP from the list. Enter the backend Web service URL (for example, https://httpbin.org) and other settings for the API.

Which format should you use to send a request to a REST API endpoint for Azure OpenAI?

You can use either API Keys or Microsoft Entra ID. API Key authentication: For this type of authentication, all API requests must include the API Key in the api-key HTTP header. The Quickstart provides guidance for how to make calls with this type of authentication.
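
As a concrete illustration of the API key option, here is a short TypeScript sketch that sends the key in the api-key header. The endpoint, deployment name, and API version are placeholders.

```typescript
// Example chat completions request using API key authentication: the key is
// sent in the api-key header. Endpoint, deployment, and API version are placeholders.
const endpoint = "https://<your-openai-resource>.openai.azure.com";
const deployment = "<your-deployment-name>";

async function chatWithApiKey(prompt: string) {
  const response = await fetch(
    `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-02-01`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "api-key": process.env.AZURE_OPENAI_API_KEY ?? "<your-api-key>",
      },
      body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
    }
  );
  return response.json();
}

chatWithApiKey("Hello!").then(console.log).catch(console.error);
```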

How to import Azure OpenAI?

Download the OpenAPI specification for the Azure OpenAI REST API, such as the 2024-02-01 GA version.
  1. In a text editor, open the specification file that you downloaded.
  2. Make a note of the value of the API version in the specification. You'll need it to test the API. Example: 2024-02-01.

Are OpenAI API keys free?

There is no "free account" for the API. Use of the service costs money based on the amount of data used. There is only the possibility of a free trial credit, which expires three months after you first create your OpenAI account. After that, you'll need to purchase a credit balance in order to make calls.

How do I get the OpenAI API?

Hands-On: Getting Started with OpenAI API
  1. Step 1: Create an OpenAI platform account. Before anything else, you'll need an account in the OpenAI platform. ...
  2. Step 2: Get your API key. ...
  3. Step 3: Install the OpenAI Python library. ...
  4. Step 4: Making your first API call. ...
  5. Step 5: Exploring further.

How to deploy an API in Azure API Management?

Publish an API

Navigate to your API Management service in the Azure portal and create or import your API under APIs. Apply any policies you need, then associate the API with a product. Once the product is published, consumers can subscribe and call the API through the gateway.

Why use Azure API Management?

Azure API Management is a fully managed service that helps developers to securely expose their APIs to external and internal customers. It provides a set of tools and services for creating, publishing, and managing APIs, as well as for enforcing security, scaling, and monitoring API usage.

How do I set up an API in Azure?

Steps to set up Azure API Management
  1. Step 1: Sign in to the Azure Portal. ...
  2. Step 2: Create an API Management Instance. ...
  3. Step 3: Configure API Management Instance. ...
  4. Step 4: Create or Import APIs. ...
  5. Step 5: Configure API Policies. ...
  6. Step 6: Configure API Products. ...
  7. Step 7: Publish APIs. ...
  8. Step 8: Monitor and Manage APIs.

How are ChatGPT, OpenAI, and Azure OpenAI related?

In summary, ChatGPT is developed by OpenAI, and Azure OpenAI is the collaboration between OpenAI and Microsoft to make OpenAI's models accessible through the Azure platform.

How to call API from Azure?

To make a REST API call to Azure, you first need to obtain an access token. Include this access token in the headers of your Azure REST API calls using the "Authorization" header and setting the value to "Bearer {access-token}".

How does Azure OpenAI work?

Azure OpenAI is a suite of AI services that lets you apply natural language models to your data without prior knowledge of math, data science, or machine learning. It can help you make your app more intelligent without writing the natural language processing code yourself.

How to migrate from OpenAI to Azure OpenAI?

For switching between OpenAI and Azure OpenAI Service endpoints, you need to make slight changes to your code: update the authentication, the model keyword argument, and a few other small differences. Use environment variables for API keys and endpoints.
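
To make the differences concrete, here is a hedged TypeScript sketch showing the two calls side by side as raw REST requests; the model name, resource, deployment, and API version are placeholders.

```typescript
// Sketch of the code-level differences when switching from OpenAI to Azure
// OpenAI, shown as raw REST calls. Names and versions are placeholders.

async function callOpenAI(prompt: string): Promise<Response> {
  // OpenAI: the model is a keyword argument in the body, auth is a Bearer token.
  return fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: prompt }],
    }),
  });
}

async function callAzureOpenAI(prompt: string): Promise<Response> {
  // Azure OpenAI: the deployment name is part of the URL, auth uses the api-key
  // header (or a Microsoft Entra ID token), and an api-version is required.
  return fetch(
    "https://<your-resource>.openai.azure.com/openai/deployments/<your-deployment>/chat/completions?api-version=2024-02-01",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "api-key": process.env.AZURE_OPENAI_API_KEY ?? "",
      },
      body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
    }
  );
}
```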

What languages does the OpenAI API support?

Programming languages:

- Go: Package (Go)
- Java: Artifact (Maven)
- JavaScript: Package (npm)
- Python: Package (PyPI)

What is the difference between OpenAI and Azure OpenAI service?

OpenAI uses the model keyword argument to specify what model to use. Azure OpenAI has the concept of unique model deployments.

How do I get an Azure API key?

Sign in to the Azure portal and find your search service. Under Settings, select Keys to view API keys. Under Manage query keys, use the query key already generated for your service, or create new query keys. The default query key isn't named, but other generated query keys can be named for manageability.

How do I get my API key from the OpenAI dashboard?

Once you've created your OpenAI account or logged into an existing one, you'll see your name's initials and profile icon at the top-right corner of the OpenAI dashboard. To generate an OpenAI API key, tap on your name to view the dropdown menu. Click the 'View API keys' option.

How do I get my API key?

Setting up API keys
  1. Go to the API Console.
  2. From the projects list, select a project or create a new one.
  3. If the APIs & services page isn't already open, open the left side menu and select APIs & services.
  4. On the left, choose Credentials.
  5. Click Create credentials and then select API key.

How to get access to Azure OpenAI service?

How to Get Azure OpenAI Service Access
  1. If you see a red box containing an error message, you don't have access to Azure OpenAI Service yet.
  2. Fill in all the required fields and click Submit.
  3. Wait for a confirmation email from Microsoft Azure saying you have been approved for Azure OpenAI Service (7-10 days usually)
