🖌️ Generate Images with Stable Diffusion XL on Mac M1/M2 (January 2026)
Introduction
In this tutorial, we’ll guide you through setting up and using Stable Diffusion XL (SDXL), a powerful text-to-image model from Stability AI, on your Mac M1 or M2 device. By the end of this guide, you’ll have a local setup that can generate stunning images from textual descriptions. This is especially useful for creative professionals, researchers, or anyone interested in exploring what modern text-to-image models can do.
Prerequisites
Before we begin, ensure you have the following prerequisites installed:
Python 3.10+: If you haven’t already, install Python using Homebrew:
brew install python@3.10
PyTorch 2.0+ with the MPS backend (for Mac M1/M2): To leverage the power of your Mac’s GPU. The default macOS wheels already include MPS support on Apple Silicon:
pip install torch
Diffusers: The library that provides the Stable Diffusion XL pipeline (it pulls in transformers for the text encoders):
pip install diffusers transformers accelerate
Gradio: For creating a simple UI to interact with our model.
pip install gradio
Step 1: Project Setup
First, let’s set up a new directory for our project and navigate into it:
mkdir stable_diffusion_xl
cd stable_diffusion_xl
Next, create a requirements.txt file to keep track of the packages we’ll use:
torch>=2.0
diffusers>=0.19
transformers
accelerate
gradio
Now, install the required packages using pip:
pip install -r requirements.txt
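Before moving on, it is worth confirming that PyTorch can actually see the Apple Silicon GPU. A quick sanity check (assuming the torch install above succeeded):

```python
import torch

# Both should print True on an Apple Silicon Mac with a recent PyTorch build
print("MPS built into this PyTorch:", torch.backends.mps.is_built())
print("MPS device available:", torch.backends.mps.is_available())
```

If either line prints False, you can still run the tutorial by targeting `"cpu"` instead of `"mps"` later on, at the cost of much slower generation.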
Step 2: Core Implementation
Create a new Python file named diffusion.py and add the following code:
import torch
from diffusers import StableDiffusionXLPipeline


class ImageGenerator:
    def __init__(self):
        self.model_id = "stabilityai/stable-diffusion-xl-base-1.0"
        # float16 halves memory usage; "mps" runs inference on the Apple Silicon GPU
        self.pipe = StableDiffusionXLPipeline.from_pretrained(
            self.model_id, torch_dtype=torch.float16, variant="fp16"
        ).to("mps")

    def generate_image(self, prompt):
        # SDXL was trained at 1024x1024; sizes far from that tend to degrade quality
        result = self.pipe(
            prompt,
            num_inference_steps=50,
            guidance_scale=7.5,
            width=1024,
            height=1024,
        )
        return result.images[0]
def main():
    generator = ImageGenerator()
    prompt = "Astronaut riding a horse on Mars"
    generated_image = generator.generate_image(prompt)
    generated_image.save("generated_image.png")


if __name__ == "__main__":
    main()
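Diffusion sampling is stochastic, so the same prompt produces a different image on every run. If you want reproducible output, the diffusers pipeline accepts a `generator` argument. Here is a sketch of how generation could be seeded (the `generate_image_seeded` helper is our own name, not part of any library; exact MPS generator behavior can vary by PyTorch version):

```python
import torch


def generate_image_seeded(pipe, prompt, seed=42):
    # A fixed seed makes the sampled noise, and hence the image, reproducible
    generator = torch.Generator(device="mps").manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=50,
        guidance_scale=7.5,
        generator=generator,
    )
    return result.images[0]
```

Running this twice with the same seed and prompt should yield the same image, which is handy when comparing the effect of other parameters.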
This script loads the Stable Diffusion XL pipeline from the diffusers library, moves it to the MPS device, generates an image from the prompt via the generate_image method, and saves the result as “generated_image.png”. Note that the first run downloads the model weights (several gigabytes) from the Hugging Face Hub.
Step 3: Configuration
No additional configuration is required for this project. The default parameters provided in diffusion.py should work well for most use cases.
Step 4: Running the Code
Run the script with:
python diffusion.py
After running, you’ll find a generated image named “generated_image.png” in your project directory. You can verify its correctness by opening it using any image viewer.
If you encounter any errors related to GPU memory, try enabling attention slicing after loading the pipeline, or reduce the width and height parameters passed to the pipeline call:
self.pipe.enable_attention_slicing()
result = self.pipe(prompt, num_inference_steps=50, guidance_scale=7.5, width=768, height=768)
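One constraint to keep in mind when shrinking images: Stable Diffusion’s VAE downsamples by a factor of 8, so width and height must be multiples of 8. A small hypothetical helper (`snap_to_multiple_of_8` is our own name, not a diffusers API) can guard against invalid sizes:

```python
def snap_to_multiple_of_8(size: int) -> int:
    """Round a requested dimension down to the nearest valid multiple of 8."""
    return max(8, (size // 8) * 8)


# Example: a requested 500px edge becomes 496px; 768px is already valid
print(snap_to_multiple_of_8(500))  # -> 496
print(snap_to_multiple_of_8(768))  # -> 768
```

You could call this on any user-supplied dimensions before passing them to the pipeline.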
Step 5: Advanced Tips
Fine-tuning the model: You can fine-tune Stable Diffusion XL on your custom dataset to adapt it to specific artistic styles or domains; the diffusers repository ships training scripts for DreamBooth and LoRA fine-tuning.
Using different models: Explore other open text-to-image models on the Hugging Face Hub, such as Stable Diffusion 2.1 or SDXL Turbo, by replacing the model ID in ImageGenerator’s constructor. (Closed models like DALL-E 3, Imagen, and Midjourney are only available through their own services, not as downloadable weights.)
Results
Upon completion of this tutorial, you’ll have successfully generated an image using Stable Diffusion XL on your Mac M1/M2 device. The generated image will be saved as “generated_image.png” and should depict the scene described in the prompt (“Astronaut riding a horse on Mars”).
Going Further
Here are some next steps to build upon this project:
Create a Gradio UI: Use Gradio to create an easy-to-use interface for your image generation model: Gradio Quick Start.
Experiment with different prompts and parameters: Try varying the prompt, number of inference steps, guidance scale, and image dimensions to see how they affect the generated images.
Fine-tune the model on custom data: Follow the official guide on fine-tuning Stable Diffusion for more advanced use cases.
Conclusion
In this tutorial, we successfully set up and used Stable Diffusion XL to generate images locally on a Mac M1/M2 device. We covered installation, setup, implementation, configuration, running the code, and advanced tips to enhance your experience with text-to-image models. Happy image generating! 📸🚀