Stability AI announced an early preview of Stable Diffusion 3 in February 2024. The model is still in preview, but in April 2024 the team announced that Stable Diffusion 3 and Stable Diffusion 3 Turbo would be available on the Stability AI Developer Platform API through a partnership with Fireworks AI, which the announcement describes as the fastest and most reliable API platform in the market.
Stable Diffusion 3 is a series of text-to-image generative AI models. According to the team at Stability AI, the model is “equal to or outperforms” other text-to-image generators, such as OpenAI’s DALL-E 3 and Midjourney v6, in “typography and prompt adherence.”
In this tutorial, you will learn practical steps to get started with the API so you can start generating your own images.
Stable Diffusion 3 introduces several advancements and features that set it apart from its predecessors and make it highly competitive in the text-to-image generation space – particularly in terms of improved text generation and prompt-following capabilities.
Let's explore these advancements:
This section walks through the steps to get started with the Stability AI API.
Step 1: Create your account. You'll need to create an account before you can use Stability AI’s API. You can sign up using a username and password, but new users get 25 free credits for signing up using their Google account.
Step 2: Claim your API key. Once you’ve created your account, you’ll need an API key. This can be found on the API Keys page. In the documentation, Stability AI states that “All APIs documented on this site use the same authentication mechanism: passing the API key in via the Authorization header.”
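As a minimal sketch of what passing the key in the Authorization header can look like in Python, the helper below builds the request headers. The `Bearer` prefix and the `Accept: image/*` value mirror the request function used later in this tutorial; the helper name `build_headers` and the placeholder key are assumptions made for illustration.

```python
def build_headers(api_key: str, accept: str = "image/*") -> dict:
    """Return HTTP headers that carry the API key in the Authorization header."""
    key = api_key.strip()  # guard against stray whitespace from copy and paste
    if not key:
        raise ValueError("API key is empty")
    return {
        "Accept": accept,
        "Authorization": f"Bearer {key}",
    }


# Placeholder key for illustration only, not a real credential
headers = build_headers("sk-EXAMPLE")
print(headers["Authorization"])
```

The same dictionary can be passed as the `headers` argument of `requests.post`.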
Step 3: Top up credits. You must have credits to send requests to the API. Credits are the unit of currency consumed when calling the API – the amount consumed varies across models and modalities. After using up all your credits, you can purchase more through your Billing dashboard at $1 USD per 100 credits.
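To make the pricing concrete, here is a small sketch that converts credits to dollars at the quoted rate of $1 USD per 100 credits and estimates how many images a balance covers. The per-image cost of 5 credits is a hypothetical figure chosen for the example, since the real cost depends on the model; both function names are also illustrative.

```python
CREDITS_PER_USD = 100  # $1 USD buys 100 credits, per the rate quoted above


def credits_to_usd(credits: float) -> float:
    """Dollar value of a number of credits at the quoted rate."""
    return credits / CREDITS_PER_USD


def images_affordable(balance: float, credits_per_image: float) -> int:
    """How many whole images a credit balance can pay for."""
    if credits_per_image <= 0:
        raise ValueError("credits_per_image must be positive")
    return int(balance // credits_per_image)


free_credits = 25  # the sign-up bonus mentioned in Step 1
print(credits_to_usd(free_credits))        # dollar value of the free credits
print(images_affordable(free_credits, 5))  # 5 credits per image is hypothetical
```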
In this tutorial, we will use Google Colab and ComfyUI to demonstrate how to generate images using the Stable Diffusion 3 API. In the next section, we will cover the steps to get started using each tool.
To get started with Google Colab, you must create a Google account – click the link and follow the instructions.
If you already have a Google account, open a new notebook and follow the steps below.
Note: The code used in this example is taken from the SD3_API tutorial by Stability AI.
Step 1: Install the requirements.
from io import BytesIO
import IPython
import json
import os
from PIL import Image
import requests
import time
from google.colab import output
Step 2: Connect to the Stability API.
import getpass

# To get your API key, visit https://platform.stability.ai/account/keys
STABILITY_KEY = getpass.getpass('Enter your API Key')
Step 3. Define functions
def send_generation_request(
    host,
    params,
):
    headers = {
        "Accept": "image/*",
        "Authorization": f"Bearer {STABILITY_KEY}"
    }

    # Encode parameters
    files = {}
    image = params.pop("image", None)
    mask = params.pop("mask", None)
    if image is not None and image != '':
        files["image"] = open(image, 'rb')
    if mask is not None and mask != '':
        files["mask"] = open(mask, 'rb')
    if len(files)==0:
        files["none"] = ''

    # Send request
    print(f"Sending REST request to {host}...")
    response = requests.post(
        host,
        headers=headers,
        files=files,
        data=params
    )
    if not response.ok:
        raise Exception(f"HTTP {response.status_code}: {response.text}")

    return response
Step 4. Generate images.
According to the documentation, the Stable Image services include only one offering that’s currently in production:
Let’s test them out.
In this example, we will create an image of a Toucan bird in a lowland tropic area.
# SD3
prompt = "This dreamlike digital art captures a vibrant, Toucan bird in a lowland tropic area" #@param {type:"string"}
negative_prompt = "" #@param {type:"string"}
aspect_ratio = "1:1" #@param ["21:9", "16:9", "3:2", "5:4", "1:1", "4:5", "2:3", "9:16", "9:21"]
seed = 0 #@param {type:"integer"}
output_format = "jpeg" #@param ["jpeg", "png"]

host = f"https://api.stability.ai/v2beta/stable-image/generate/sd3"

params = {
    "prompt" : prompt,
    "negative_prompt" : negative_prompt,
    "aspect_ratio" : aspect_ratio,
    "seed" : seed,
    "output_format" : output_format,
    "model" : "sd3",
    "mode" : "text-to-image"
}

response = send_generation_request(
    host,
    params
)

# Decode response
output_image = response.content
finish_reason = response.headers.get("finish-reason")
seed = response.headers.get("seed")

# Check for NSFW classification
if finish_reason == 'CONTENT_FILTERED':
    raise Warning("Generation failed NSFW classifier")

# Save and display result
generated = f"generated_{seed}.{output_format}"
with open(generated, "wb") as f:
    f.write(output_image)
print(f"Saved image {generated}")

output.no_vertical_scroll()
print("Result image:")
IPython.display.display(Image.open(generated))
Here’s what it created:
Image created by author using Stable Diffusion 3
Now, let’s create an image of a car made out of fruits using SD3 Turbo:
# SD3 Turbo
prompt = "A car made out of fruits." #@param {type:"string"}
aspect_ratio = "1:1" #@param ["21:9", "16:9", "3:2", "5:4", "1:1", "4:5", "2:3", "9:16", "9:21"]
seed = 0 #@param {type:"integer"}
output_format = "jpeg" #@param ["jpeg", "png"]

host = f"https://api.stability.ai/v2beta/stable-image/generate/sd3"

params = {
    "prompt" : prompt,
    "aspect_ratio" : aspect_ratio,
    "seed" : seed,
    "output_format" : output_format,
    "model" : "sd3-turbo"
}

response = send_generation_request(
    host,
    params
)

# Decode response
output_image = response.content
finish_reason = response.headers.get("finish-reason")
seed = response.headers.get("seed")

# Check for NSFW classification
if finish_reason == 'CONTENT_FILTERED':
    raise Warning("Generation failed NSFW classifier")

# Save and display result
generated = f"generated_{seed}.{output_format}"
with open(generated, "wb") as f:
    f.write(output_image)
print(f"Saved image {generated}")

output.no_vertical_scroll()
print("Result image:")
IPython.display.display(Image.open(generated))
Running this code produced the following image:
Image created by author using Stable Diffusion 3 Turbo
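Both generation cells end with the same decode, check, and save steps. As a minimal sketch, those steps could be pulled into one helper. The name `save_generation` is hypothetical and not part of the Stability AI notebook, and raising `RuntimeError` rather than `Warning` is a design choice made here; the `finish-reason` and `seed` header names are the ones read in the cells above.

```python
from pathlib import Path


def save_generation(content: bytes, headers, output_format: str = "jpeg",
                    out_dir: str = ".") -> Path:
    """Write image bytes to generated_<seed>.<format>, refusing filtered results."""
    # The cells above treat this finish-reason value as an NSFW rejection
    if headers.get("finish-reason") == "CONTENT_FILTERED":
        raise RuntimeError("Generation failed NSFW classifier")

    seed = headers.get("seed", "unknown")  # fall back if the header is absent
    path = Path(out_dir) / f"generated_{seed}.{output_format}"
    path.write_bytes(content)
    return path
```

In the notebook it could be called as `save_generation(response.content, response.headers, output_format)`, with the returned path passed to `Image.open` for display.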
ComfyUI is a robust and flexible graphical user interface (GUI) for Stable Diffusion. It features a graph-based, flowchart-style interface that lets users create and run sophisticated Stable Diffusion workflows.
The simplest method for installing ComfyUI on Windows involves utilizing the standalone installer found on the releases page. This installer includes essential dependencies such as PyTorch and Hugging Face Transformers, eliminating the need for separate installations.
It provides a comprehensive package, enabling a swift setup of ComfyUI on Windows without requiring intricate configurations.
Simply download, extract, add models, and launch!
Step 1.1: Download the standalone version of ComfyUI from this GitHub repository – clicking the link will initiate the download.
Step 1.2: Once you've downloaded the most recent comfyui-windows.zip file, extract it using a utility such as 7-Zip or WinRAR.
Step 1.3: A checkpoint model is required to start using ComfyUI. You can download a checkpoint model from Stable Diffusion or Hugging Face. Put the model in the folder:
ComfyUI_windows_portable\ComfyUI\models\checkpoints
Step 1.4: Now, run run_nvidia_gpu.bat (recommended) or run_cpu.bat. This should automatically start ComfyUI in your browser.
The command line will execute and generate a URL http://127.0.0.1:8188/ that you can now open in your browser.
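If the browser tab does not open, a quick way to see whether the local server is up is to test the port. This is a rough sketch that assumes ComfyUI is listening on the default address shown above (127.0.0.1, port 8188); the function name `is_port_open` is illustrative.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8188,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused, timeouts, unreachable host
        return False
```

Calling `is_port_open()` while the batch file is still running should return `True`; `False` suggests the server has not finished starting or is using a different port.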
Within the File Explorer application, locate the directory you just installed. Given you’re using Windows, it should be named “ComfyUI_windows_portable.” From here, navigate to ComfyUI, and then custom_nodes. From this location, type cmd in the address bar and press Enter.
This should open up a command prompt terminal, where you must insert the following command to clone ComfyUI Manager:
git clone https://github.com/ltdrdata/ComfyUI-Manager.git
Once it’s complete, restart ComfyUI. The new “Manager” button should appear on the floating panel.
Select the Manager button and navigate to “Install Custom Nodes.” From here, search “stability API.”
Locate the "Stability API nodes for ComfyUI" node, then click the Install button situated on the right side to initiate the installation process. Following this, a “Restart” button will become visible. Click on “Restart” to reboot ComfyUI.
This step is optional, but it’s recommended. You can set a single Stability AI API key for all nodes in the Stability AI custom node. This prevents the need to input the API key repeatedly in every workflow and reduces the risk of inadvertently sharing your API key when sharing your workflow JSON file.
To do so, navigate to the Stability API custom node's directory inside the custom_nodes folder.
Create a new file named sai_platform_key.txt. Paste your API Key into the file, save the document, and then restart ComfyUI.
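The same idea, keeping the key in a local file instead of in code or workflow files, also works in your own Python scripts. The sketch below is an illustration under that assumption, not a description of how the custom node reads the file; `load_api_key` is a hypothetical helper name.

```python
from pathlib import Path


def load_api_key(path: str) -> str:
    """Read an API key from a text file, trimming surrounding whitespace."""
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key
```

For example, `load_api_key("sai_platform_key.txt")` returns the key without the trailing newline that most editors add when saving.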
Download the Stable Diffusion 3 text-to-image workflow and drop it into ComfyUI.
You’re now good to go!
As with any tool, there’s always a chance you’ll encounter a few issues along the way. Here are the most common challenges and troubleshooting steps for users facing issues with the API or the setup process.
Challenge: Users may face authentication errors when accessing the API due to an incorrect API key or wrong authentication credentials.
Troubleshooting: Double-check the API key and ensure it is copied and pasted correctly. Verify that there are no extra spaces or characters in the key. Confirm that the key is passed in the Authorization header and that it is still listed on your API Keys page.
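The copy-and-paste checks can be automated before any request is sent. The sketch below flags the most common formatting slips; the function name `find_key_problems` and the specific checks are illustrative choices, and a key that passes them can still be rejected by the server.

```python
def find_key_problems(key: str) -> list:
    """Return a list of formatting problems found in an API key string."""
    stripped = key.strip()
    if not stripped:
        return ["key is empty"]

    problems = []
    if key != stripped:
        problems.append("leading or trailing whitespace")
    if any(ch.isspace() for ch in stripped):
        problems.append("whitespace inside the key")
    if stripped[0] in "\"'" or stripped[-1] in "\"'":
        problems.append("wrapped in quotes")
    return problems


# Placeholder value with a trailing newline, as often happens when pasting
print(find_key_problems("sk-EXAMPLE\n"))
```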
Challenge: Users may encounter issues related to credit management, such as insufficient credits or billing errors.
Troubleshooting: Check your credit balance in your Stability AI Billing dashboard to ensure that you have sufficient credits. Verify your billing information and address any billing errors or discrepancies with the support team.
Challenge: Users may experience connectivity issues or network interruptions that prevent them from accessing the API.
Troubleshooting: Ensure that you have a stable internet connection and that there are no network disruptions. To isolate the issue, try accessing the API from a different network or device. Contact your internet service provider if you continue to experience connectivity problems.
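For brief network interruptions, retrying the request after a short pause is often enough. The following is a generic sketch of retry with exponential backoff, not something the Stability AI documentation prescribes; the function name, the default of three attempts, and the delays are assumptions.

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=1.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x ... the base delay
```

With the `requests` library, pass `retry_on=(requests.exceptions.ConnectionError, requests.exceptions.Timeout)`, because those exceptions do not derive from Python's built-in `ConnectionError`. The request itself can be wrapped as `call_with_retries(lambda: send_generation_request(host, dict(params)))`, copying `params` because the request function removes keys from it.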
Challenge: Users may encounter compatibility issues or dependency errors when installing or using the required tools and libraries.
Troubleshooting: Check the compatibility requirements of the Stable Diffusion 3 API and ensure that you are using compatible versions of tools and libraries. Update or reinstall any dependencies that are causing errors. Refer to the documentation and community forums for troubleshooting guidance.
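When chasing a dependency error, it helps to know exactly which versions are installed. This short sketch uses the standard library's `importlib.metadata` to report them; the package names in the example are the distributions behind the imports used in this tutorial, and `installed_versions` is an illustrative name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# "Pillow" is the distribution that provides the PIL import
print(installed_versions(["requests", "Pillow"]))
```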
Challenge: Users may experience slow response times or performance issues when interacting with the API, particularly during peak usage times.
Troubleshooting: Monitor the API's performance and track response times to identify patterns or trends. Consider upgrading to a higher-tier subscription plan for better performance and priority access. Contact the support team if you consistently experience slow response times.
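Tracking response times does not require special tooling. As a minimal sketch, the helpers below time any call and summarize a list of measurements; the names `timed_call` and `summarize` are hypothetical.

```python
import statistics
import time


def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def summarize(durations):
    """Basic statistics over a list of elapsed times in seconds."""
    return {
        "count": len(durations),
        "mean": statistics.mean(durations),
        "max": max(durations),
    }
```

In the notebook, `response, elapsed = timed_call(send_generation_request, host, params)` records one measurement; collecting these in a list over several runs and passing it to `summarize` shows whether slow responses cluster at particular times.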
Challenge: Users may encounter difficulties understanding the API documentation or require assistance troubleshooting specific issues.
Troubleshooting: For guidance on API usage, troubleshooting, and best practices, refer to the Stable Diffusion 3 documentation. If you have any unresolved issues or questions, contact the support team or community forums.
Stable Diffusion 3 is a series of text-to-image generative AI models. This article covered practical steps to start using the API with Google Colab and ComfyUI. Now, you have the skills to create your own images; be sure to apply what you learned as soon as possible so you do not forget.
Thanks for reading!
Best practices for using the Stable Diffusion 3 API include providing clear and specific prompts, experimenting with different parameters to achieve desired results, monitoring credit usage to avoid depletion, and staying updated with the latest documentation and features.
Stable Diffusion comprises a collection of AI models focused on generating images from textual prompts. Users provide descriptions of desired images, and the model generates corresponding visual representations based on these prompts.
Stable Diffusion 3 employs a diffusion transformer architecture akin to Sora, diverging from prior versions that utilized a diffusion model akin to most existing image generation AIs. This innovation merges the transformer architecture commonly used in large language models such as GPT with diffusion models, offering the potential to leverage the strengths of both architectures.