Commit bd5d1d4 — Updated instructions and code for DALL-E 3 (MicrosoftLearning#55)

1 parent 6266487

3 files changed: +32 −62 lines

## Instructions/Exercises/05-generate-images.md

13 additions & 16 deletions

```diff
@@ -7,30 +7,32 @@ lab:
 
 The Azure OpenAI Service includes an image-generation model named DALL-E. You can use this model to submit natural language prompts that describe a desired image, and the model will generate an original image based on the description you provide.
 
+In this exercise, you'll use a DALL-E version 3 model to generate images based on natural language prompts.
+
 This exercise will take approximately **25** minutes.
 
 ## Provision an Azure OpenAI resource
 
-Before you can use Azure OpenAI models, you must provision an Azure OpenAI resource in your Azure subscription.
+Before you can use Azure OpenAI to generate images, you must provision an Azure OpenAI resource in your Azure subscription. The resource must be in a region where DALL-E models are supported.
 
 1. Sign into the **Azure portal** at `https://portal.azure.com`.
 2. Create an **Azure OpenAI** resource with the following settings:
     - **Subscription**: *Select an Azure subscription that has been approved for access to the Azure OpenAI service, including DALL-E*
     - **Resource group**: *Choose or create a resource group*
-    - **Region**: **East US**\*
+    - **Region**: *Choose either **East US** or **Sweden Central***\*
    - **Name**: *A unique name of your choice*
     - **Pricing tier**: Standard S0
 
-    > \* DALL-E models are only available in Azure OpenAI service resources in the **East US** region.
+    > \* DALL-E 3 models are only available in Azure OpenAI service resources in the **East US** and **Sweden Central** regions.
 
 3. Wait for deployment to complete. Then go to the deployed Azure OpenAI resource in the Azure portal.
 
 ## Explore image-generation in the DALL-E playground
 
 You can use the DALL-E playground in **Azure OpenAI Studio** to experiment with image-generation.
 
-1. In the Azure portal, on the **Overview** page for your Azure OpenAI resource, use the **Explore** button to open Azure OpenAI Studio in a new browser tab. Alternatively, navigate to [Azure OpenAI Studio](https://oai.azure.com) directly.
-2. In the **Playground** section, select the **DALL-E** playground.
+1. In the Azure portal, on the **Overview** page for your Azure OpenAI resource, use the **Explore** button to open Azure OpenAI Studio in a new browser tab. Alternatively, navigate to Azure OpenAI Studio directly at `https://oai.azure.com`.
+2. In the **Playground** section, select the **DALL-E** playground. A deployment of the DALL-E model named *Dalle3* will be created automatically.
 3. In the **Prompt** box, enter a description of an image you'd like to generate. For example, `An elephant on a skateboard`. Then select **Generate** and view the image that is generated.
 
     ![The DALL-E Playground in Azure OpenAI Studio with a generated image.](../media/dall-e-playground.png)
@@ -61,7 +63,7 @@ Now let's explore how you could build a custom app that uses Azure OpenAI servic
 
 ### Configure your application
 
-Applications for both C# and Python have been provided. Both apps feature the same functionality. First, you'll complete some key parts of the application to enable using your Azure OpenAI resource.
+Applications for both C# and Python have been provided. Both apps feature the same functionality. First, you'll add the endpoint and key for your Azure OpenAI resource to the app's configuration file.
 
 1. In Visual Studio Code, in the **Explorer** pane, browse to the **Labfiles/05-image-generation** folder and expand the **CSharp** or **Python** folder depending on your language preference. Each folder contains the language-specific files for an app into which you're going to integrate Azure OpenAI functionality.
 2. In the **Explorer** pane, in the **CSharp** or **Python** folder, open the configuration file for your preferred language.
@@ -82,14 +84,9 @@ Now you're ready to explore the code used to call the REST API and generate an i
     - Python: `generate-image.py`
 
 2. Review the code that the file contains, noting the following key features:
-    - The code makes https requests to the endpoint for your service, including the key for your service in the header. Both of these values are obtained from the configuration file.
-    - The process consists of <u>two</u> REST requests: One to initiate the image-generation request, and another to retrieve the results.
-      The initial request includes the following data:
-        - The user-provided prompt that describes the image to be generated
-        - The number of images to be generated (in this case, 1)
-        - The resolution (size) of the image to be generated.
-    - The response header from the initial request includes an **operation-location** value that is used for the subsequent callback to get the results.
-    - The code polls the callback URL until the status of the image-generation task is *succeeded*, and then extracts and displays a URL for the generated image.
+    - The code makes an https request to the endpoint for your service, including the key for your service in the header. Both of these values are obtained from the configuration file.
+    - The request includes some parameters, including the prompt on which the image should be based, the number of images to generate, and the size of the generated image(s).
+    - The response includes a revised prompt that the DALL-E model extrapolated from the user-provided prompt to make it more descriptive, and the URL for the generated image.
@@ -112,9 +109,9 @@ Now that you've reviewed the code, it's time to run it and generate some images.
 
 4. Wait for the image to be generated - a hyperlink will be displayed in the terminal pane. Then select the hyperlink to open a new browser tab and review the image that was generated.
 
-    > **TIP**: If the app can't find the header, wait a minute and try again. Newly deployed resources can take up to 5 minutes to become available.
+    > **TIP**: If the app doesn't return a response, wait a minute and try again. Newly deployed resources can take up to 5 minutes to become available.
 
-6. Close the browser tab containing the generated image and re-run the app to generate a new image with a different prompt.
+5. Close the browser tab containing the generated image and re-run the app to generate a new image with a different prompt.
 
 ## Clean up
```
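The single-request flow described in the updated instructions can be sketched in Python as follows. This is an illustrative sketch, not the lab's own code: `build_request` and `parse_response` are helper names introduced here, and the sample response payload is hypothetical, constructed only from the fields the exercise mentions (`revised_prompt` and `url`), not from a live API call.

```python
# Sketch of the single-request DALL-E 3 flow the exercise describes.
# The endpoint path and body fields mirror the updated lab code; the
# sample response below is illustrative, not captured from the real API.

def build_request(api_base: str, api_version: str, prompt: str) -> tuple[str, dict]:
    """Return the request URL and JSON body for a DALL-E 3 generation call."""
    url = f"{api_base}openai/deployments/dalle3/images/generations?api-version={api_version}"
    body = {"prompt": prompt, "n": 1, "size": "1024x1024"}
    return url, body

def parse_response(payload: dict) -> tuple[str, str]:
    """Extract the revised prompt and image URL from a response payload."""
    item = payload["data"][0]
    return item["revised_prompt"], item["url"]

url, body = build_request("https://myresource.openai.azure.com/",
                          "2024-02-15-preview",
                          "An elephant on a skateboard")

# Hypothetical response payload shaped like the fields the exercise names
sample = {"data": [{"revised_prompt": "A cartoon elephant balancing on a skateboard",
                    "url": "https://example.com/image.png"}]}
revised, image_url = parse_response(sample)
```

Note that the deployment name (`dalle3`) is embedded in the URL path, which is why the playground's automatically created deployment matters to the app code.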

## Labfiles/05-image-generation/CSharp/Program.cs

10 additions & 29 deletions

```diff
@@ -29,52 +29,33 @@ static async Task Main(string[] args)
     Console.WriteLine("Enter a prompt to request an image:");
     string prompt = Console.ReadLine() ?? "";
 
-    // Make the initial call to start the job
+    // Call the DALL-E model
     using (var client = new HttpClient())
     {
         var contentType = new MediaTypeWithQualityHeaderValue("application/json");
-        var api = "openai/images/generations:submit?api-version=2023-06-01-preview";
+        var api = "openai/deployments/dalle3/images/generations?api-version=2024-02-15-preview";
         client.BaseAddress = new Uri(aoaiEndpoint);
         client.DefaultRequestHeaders.Accept.Add(contentType);
         client.DefaultRequestHeaders.Add("api-key", aoaiKey);
         var data = new
         {
             prompt=prompt,
             n=1,
-            size="512x512"
+            size="1024x1024"
         };
 
         var jsonData = JsonSerializer.Serialize(data);
         var contentData = new StringContent(jsonData, Encoding.UTF8, "application/json");
-        var init_response = await client.PostAsync(api, contentData);
+        var response = await client.PostAsync(api, contentData);
 
-        // Get the operation-location URL for the callback
-        var callback_url = init_response.Headers.GetValues("operation-location").FirstOrDefault();
-
-        // Poll the callback URL until the job has succeeeded (or 100 attempts)
-        var response = await client.GetAsync(callback_url);
+        // Get the revised prompt and image URL from the response
         var stringResponse = await response.Content.ReadAsStringAsync();
-        var status = JsonSerializer.Deserialize<Dictionary<string,object>>(stringResponse)["status"];
-        var tries = 1;
-        while (status.ToString() != "succeeded" && tries < 101)
-        {
-            Thread.Sleep (3000); // wait 3 seconds to avoid rate limit
-            tries ++;
-            response = await client.GetAsync(callback_url);
-            stringResponse = await response.Content.ReadAsStringAsync();
-            status = JsonSerializer.Deserialize<Dictionary<string,object>>(stringResponse)["status"];
-            Console.WriteLine(tries.ToString() + ": " + status);
-        }
-
-        // Get the results
-        stringResponse = await response.Content.ReadAsStringAsync();
         JsonNode contentNode = JsonNode.Parse(stringResponse)!;
-        JsonNode resultNode = contentNode!["result"];
-        JsonNode dataNode = resultNode!["data"];
-        JsonNode urlNode = dataNode[0]!;
-        JsonNode url = urlNode!["url"];
-
-        // Display the URL for the generated image
+        JsonNode dataCollectionNode = contentNode!["data"];
+        JsonNode dataNode = dataCollectionNode[0]!;
+        JsonNode revisedPrompt = dataNode!["revised_prompt"];
+        JsonNode url = dataNode!["url"];
+        Console.WriteLine(revisedPrompt.ToJsonString());
         Console.WriteLine(url.ToJsonString().Replace(@"\u0026", "&"));
 
     }
```
## Labfiles/05-image-generation/Python/generate-image.py

9 additions & 17 deletions

```diff
@@ -10,35 +10,27 @@ def main():
     load_dotenv()
     api_base = os.getenv("AZURE_OAI_ENDPOINT")
     api_key = os.getenv("AZURE_OAI_KEY")
-    api_version = '2023-06-01-preview'
+    api_version = '2024-02-15-preview'
 
     # Get prompt for image to be generated
     prompt = input("\nEnter a prompt to request an image: ")
 
-    # Make the initial call to start the job
-    url = "{}openai/images/generations:submit?api-version={}".format(api_base, api_version)
+    # Call the DALL-E model
+    url = "{}openai/deployments/dalle3/images/generations?api-version={}".format(api_base, api_version)
     headers= { "api-key": api_key, "Content-Type": "application/json" }
     body = {
         "prompt": prompt,
         "n": 1,
-        "size": "512x512"
+        "size": "1024x1024"
     }
-    submission = requests.post(url, headers=headers, json=body)
+    response = requests.post(url, headers=headers, json=body)
 
-    # Get the operation-location URL for the callback
-    operation_location = submission.headers['Operation-Location']
-
-    # Poll the callback URL until the job has succeeeded
-    status = ""
-    while (status != "succeeded"):
-        time.sleep(3) # wait 3 seconds to avoid rate limit
-        response = requests.get(operation_location, headers=headers)
-        status = response.json()['status']
-
-    # Get the results
-    image_url = response.json()['result']['data'][0]['url']
+    # Get the revised prompt and image URL from the response
+    revised_prompt = response.json()['data'][0]['revised_prompt']
+    image_url = response.json()['data'][0]['url']
 
     # Display the URL for the generated image
+    print(revised_prompt)
     print(image_url)
```
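One detail worth noting about the configuration step: the Python script builds the request URL by concatenating `api_base` directly with `openai/deployments/...`, so the endpoint value in the configuration file must end with a trailing slash. A minimal sketch of the environment-variable loading, with placeholder values standing in for the real `.env` contents:

```python
# Sketch of how the Python app reads its configuration, based on the
# os.getenv calls visible in generate-image.py. The values set here are
# placeholders simulating what load_dotenv() would read from the .env file.
import os

os.environ["AZURE_OAI_ENDPOINT"] = "https://myresource.openai.azure.com/"
os.environ["AZURE_OAI_KEY"] = "<your-key>"

api_base = os.getenv("AZURE_OAI_ENDPOINT")
api_key = os.getenv("AZURE_OAI_KEY")

# Because the script does "{}openai/deployments/...".format(api_base, ...),
# a missing trailing slash would produce a malformed URL.
assert api_base.endswith("/"), "endpoint must end with a trailing slash"
```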
