Skip to content

Port to gpt-4o-mini and GlobalStandard #17

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 31, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ so that you can use the OpenAI API SDKs with keyless (Entra) authentication.

* Provisions an Azure OpenAI account with keyless authentication enabled
* Grants the "Cognitive Services OpenAI User" RBAC role to your user account
* Deploys a gpt-3.5 model by default, but you can modify the [Bicep template](infra/main.bicep) to deploy other models
* Deploys a gpt-4o-mini model by default, but you can modify the [Bicep template](infra/main.bicep) to deploy other models
* Example script uses the [openai](https://pypi.org/project/openai/) Python package to make a request to the Azure OpenAI API

### Architecture diagram
Expand Down
17 changes: 11 additions & 6 deletions infra/main.bicep
Original file line number Diff line number Diff line change
Expand Up @@ -7,25 +7,30 @@ param environmentName string

@minLength(1)
@description('Location for the OpenAI resource')
// https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#standard-deployment-model-availability
// https://learn.microsoft.com/azure/ai-services/openai/concepts/models?tabs=python-secure%2Cglobal-standard%2Cstandard-chat-completions#models-by-deployment-type
@allowed([
'australiaeast'
'brazilsouth'
'canadaeast'
'eastus'
'eastus2'
'francecentral'
'germanywestcentral'
'japaneast'
'koreacentral'
'northcentralus'
'norwayeast'
'polandcentral'
'southafricanorth'
'southcentralus'
'southindia'
'spaincentral'
'swedencentral'
'switzerlandnorth'
'uksouth'
'westeurope'
'westus'
'westus3'
])
@metadata({
azd: {
Expand All @@ -35,15 +40,15 @@ param environmentName string
param location string

@description('Name of the GPT model to deploy')
param gptModelName string = 'gpt-35-turbo'
param gptModelName string = 'gpt-4o-mini'

@description('Version of the GPT model to deploy')
// See version availability in this table:
// https://learn.microsoft.com/azure/ai-services/openai/concepts/models#gpt-4-and-gpt-4-turbo-preview-models
param gptModelVersion string = '0125'
// https://learn.microsoft.com/azure/ai-services/openai/concepts/models?tabs=python-secure%2Cglobal-standard%2Cstandard-chat-completions#models-by-deployment-type
param gptModelVersion string = '2024-07-18'

@description('Name of the model deployment (can be different from the model name)')
param gptDeploymentName string = 'gpt-35-turbo'
param gptDeploymentName string = 'gpt-4o-mini'

@description('Capacity of the GPT deployment')
// You can increase this, but capacity is limited per model/region, so you will get errors if you go over
Expand Down Expand Up @@ -93,7 +98,7 @@ module openAi 'br/public:avm/res/cognitive-services/account:0.7.1' = {
version: gptModelVersion
}
sku: {
name: 'Standard'
name: 'GlobalStandard'
capacity: gptDeploymentCapacity
}
}
Expand Down