Work in progress

Image Generation Example

This example shows how to generate images using the Stable Diffusion template.

Substitute the following values with your own:

  • model_id: The ID of the model you want to use. Example: black-forest-labs/FLUX.2-klein-4B
  • API_KEY: Your API key (if authentication is required)
  • API_URL: The URL of the API. Example: http://kalavai-api.public.kalavai.net:31311/v1/images/generations
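As one option for supplying these values, they can be read from environment variables instead of being hard-coded (the variable names below are illustrative, not required by the API):

```python
import os

# Illustrative variable names; any scheme for supplying these values works.
MODEL_ID = os.environ.get("MODEL_ID", "<your-model-id>")
API_URL = os.environ.get("API_URL", "<your-api-url>")
API_KEY = os.environ.get("API_KEY", "")  # may be empty if authentication is not required
```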

import time
import requests
import base64

t = time.time()
MODEL_ID = "<your-model-id>"
API_URL = "<your-api-url>"
API_KEY = "<your-api-key>"

prompt = "a realistic picture of a sunset."
resp = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}"
    },
    json={
        "prompt": prompt,
        "model": MODEL_ID,
        "n": 1,
        "size": "256x256", # only 256x256, 512x512, 1024x1024 are supported
        "response_format": "b64_json",
        "extra": {"num_inference_steps": 5} # supported parameters https://huggingface.co/docs/diffusers/api/pipelines/flux#diffusers.FluxPipeline
    },
)
print(f"Inference time: {time.time()-t:.2f}s")

# Debug: Print the response structure
print("Response status:", resp.status_code)

resp.raise_for_status()

data = resp.json()["data"]
for i, item in enumerate(data):
    image_data = item.get("b64_json")
    if image_data is None:
        print(f"Warning: No b64_json data found for image {i}")
        print(f"Available keys: {list(item.keys())}")
        continue

    with open(f"output_{i}.png", 'wb') as f:
        f.write(base64.b64decode(image_data))
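A missing or malformed field otherwise only surfaces as a corrupt file on disk, so it can help to sanity-check the decoded bytes before writing them. A minimal sketch, assuming the API returns PNG images (as the .png filenames above suggest):

```python
import base64

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the 8-byte signature that starts every PNG file

def looks_like_png(b64_data: str) -> bool:
    """Return True if the base64 payload decodes to bytes with a PNG header."""
    raw = base64.b64decode(b64_data)
    return raw[:len(PNG_MAGIC)] == PNG_MAGIC
```

Calling looks_like_png(image_data) before writing the file would skip any payloads the server returned in another encoding.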

Batched Inference Example

This example demonstrates batched inference with multiple images and batch-size optimization. Generating several images in a single request, so they are processed together, is more efficient than sending one request per image.

import time
import requests
import base64

t = time.time()
MODEL_ID = "<your-model-id>"
API_URL = "<your-api-url>"
API_KEY = "<your-api-key>"

resp = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}"
    },
    json={
        "prompt": "a majestic mountain landscape with snow peaks.", 
        "model": MODEL_ID,
        "n": 2,  # Generate 2 images per prompt
        "batch_size": 2,  # Process in batches of 2 for efficiency
        "size": "256x256",
        "response_format": "b64_json",
        "extra": {"num_inference_steps": 5}
    },
)
print(f"Batched inference time: {time.time()-t:.2f}s")

# Debug: Print the response structure
print("Response status:", resp.status_code)
resp.raise_for_status()

data = resp.json()["data"]
print(f"Total images generated: {len(data)}")

for i, item in enumerate(data):
    image_data = item.get("b64_json")
    if image_data is None:
        print(f"Warning: No b64_json data found for image {i}")
        print(f"Available keys: {list(item.keys())}")
        continue

    with open(f"batched_output_{i}.png", 'wb') as f:
        f.write(base64.b64decode(image_data))

Performance tips

Increase the batch_size and n parameters to process multiple images in a single request

  • Use batched inference when generating multiple images to improve efficiency

Set the size parameter to "256x256" for faster generation

  • Lower resolution images (256x256) are faster to generate than higher resolution images (512x512 or 1024x1024)

Set the num_inference_steps parameter to a lower value for faster generation

  • Reduce the number of inference steps for faster generation. Quality generally plateaus after 4 or 5 iterations, but you may find acceptable results with even fewer steps.
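The tips above can be bundled into a small helper that assembles the request payload. This is a sketch, not part of any client library; the supported size list and the extra/num_inference_steps parameter names are taken from the examples in this document:

```python
SUPPORTED_SIZES = {"256x256", "512x512", "1024x1024"}

def build_payload(prompt, model_id, n=1, batch_size=None,
                  size="256x256", num_inference_steps=5):
    """Assemble an images/generations request body with the tuning knobs above."""
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"size must be one of {sorted(SUPPORTED_SIZES)}")
    payload = {
        "prompt": prompt,
        "model": model_id,
        "n": n,
        "size": size,
        "response_format": "b64_json",
        "extra": {"num_inference_steps": num_inference_steps},
    }
    if batch_size is not None:
        payload["batch_size"] = batch_size  # process the n images in batches
    return payload
```

For example, build_payload("a sunset", MODEL_ID, n=2, batch_size=2) reproduces the body used in the batched example above.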