Work in progress
Image Generation Example
This example shows how to generate images using the Stable Diffusion template.
Substitute the following values with your own:
- MODEL_ID: The ID of the model you want to use. Example: black-forest-labs/FLUX.2-klein-4B
- API_KEY: Your API key (if authentication is required)
- API_URL: The URL of the API. Example: http://kalavai-api.public.kalavai.net:31311/v1/images/generations
import base64
import time

import requests

t = time.time()
MODEL_ID = "<your-model-id>"
API_URL = "<your-api-url>"
API_KEY = "<your-api-key>"
prompt = "a realistic picture of a sunset."

resp = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}"
    },
    json={
        "prompt": prompt,
        "model": MODEL_ID,
        "n": 1,
        "size": "256x256",  # only 256x256, 512x512, 1024x1024 are supported
        "response_format": "b64_json",
        "extra": {"num_inference_steps": 5}  # supported parameters: https://huggingface.co/docs/diffusers/api/pipelines/flux#diffusers.FluxPipeline
    },
)
print(f"Inference time: {time.time()-t:.2f}s")

# Debug: print the response status and stop if the request failed
print("Response status:", resp.status_code)
resp.raise_for_status()

# Parse the response body once, then save each returned image
data = resp.json()["data"]
for i, item in enumerate(data):
    image_data = item.get("b64_json")
    if image_data is None:
        print(f"Warning: No b64_json data found for image {i}")
        print(f"Available keys: {list(item.keys())}")
        continue
    with open(f"output_{i}.png", "wb") as f:
        f.write(base64.b64decode(image_data))
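The decoding step at the end of the example can be separated from the network call, which makes it easier to check without a running endpoint. The following is a minimal sketch, assuming the response body has the shape used above (a "data" list whose entries may carry a "b64_json" field). The helper name decode_images is hypothetical and not part of any API.

```python
import base64
from typing import Any, Dict, List


def decode_images(response_json: Dict[str, Any]) -> List[bytes]:
    """Return raw image bytes for every entry that carries a b64_json field.

    Hypothetical helper: assumes the {"data": [{"b64_json": ...}, ...]} shape
    used in the example above.
    """
    images = []
    for item in response_json.get("data", []):
        encoded = item.get("b64_json")
        if encoded is None:
            continue  # entry has no base64 payload; skip it
        images.append(base64.b64decode(encoded))
    return images
```

With a real response, this could be called as decode_images(resp.json()) and each returned bytes object written to a .png file.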
Batched Inference Example
This example demonstrates batched inference: a single request asks for several images, and the batch_size parameter sets how many are processed together. This is more efficient than sending one request per image.
import base64
import time

import requests

t = time.time()
MODEL_ID = "<your-model-id>"
API_URL = "<your-api-url>"
API_KEY = "<your-api-key>"

resp = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}"
    },
    json={
        "prompt": "a majestic mountain landscape with snow peaks.",
        "model": MODEL_ID,
        "n": 2,  # Generate 2 images per prompt
        "batch_size": 2,  # Process in batches of 2 for efficiency
        "size": "256x256",
        "response_format": "b64_json",
        "extra": {"num_inference_steps": 5}
    },
)
print(f"Batched inference time: {time.time()-t:.2f}s")

# Debug: print the response status and stop if the request failed
print("Response status:", resp.status_code)
resp.raise_for_status()

# Parse the response body once, then save each returned image
data = resp.json()["data"]
print(f"Total images generated: {len(data)}")
for i, item in enumerate(data):
    image_data = item.get("b64_json")
    if image_data is None:
        print(f"Warning: No b64_json data found for image {i}")
        print(f"Available keys: {list(item.keys())}")
        continue
    with open(f"batched_output_{i}.png", "wb") as f:
        f.write(base64.b64decode(image_data))
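Both examples send nearly the same request body, so it can be built in one place and checked before it goes over the network. The sketch below assumes the field names used in the examples above and the three sizes listed in the first example's comment. The function name build_payload and the client-side checks are hypothetical; the server may apply different rules.

```python
from typing import Any, Dict, Optional

# Sizes listed as supported in the first example's comment
SUPPORTED_SIZES = {"256x256", "512x512", "1024x1024"}


def build_payload(
    prompt: str,
    model_id: str,
    n: int = 1,
    batch_size: Optional[int] = None,
    size: str = "256x256",
    steps: int = 5,
) -> Dict[str, Any]:
    """Build the JSON body for an image generation request (hypothetical helper)."""
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"Unsupported size {size!r}; expected one of {sorted(SUPPORTED_SIZES)}")
    if n < 1:
        raise ValueError("n must be at least 1")
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "model": model_id,
        "n": n,
        "size": size,
        "response_format": "b64_json",
        "extra": {"num_inference_steps": steps},
    }
    # Only include batch_size when the caller asks for batching
    if batch_size is not None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        payload["batch_size"] = batch_size
    return payload
```

The result can be passed directly as the json argument of requests.post, for example json=build_payload("a majestic mountain landscape with snow peaks.", MODEL_ID, n=2, batch_size=2).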
Performance tips
Increase the batch_size and n parameters to process multiple images in a single request
- Use batched inference when generating multiple images to improve efficiency
Set the size parameter to "256x256" for faster generation
- Lower resolution images (256x256) are faster to generate than higher resolution images (512x512 or 1024x1024)
Set the num_inference_steps parameter to a lower value for faster generation
- Reduce the number of inference steps for faster generation. Quality generally plateaus after 4 or 5 steps, but one might find good results with fewer steps.
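One way to see how these settings affect latency on a given deployment is to time the same request at several step counts. The sketch below is a small, hypothetical timing harness: time_steps is not part of any API, and the generate callable is assumed to wrap whatever request you want to measure.

```python
import time
from typing import Callable, Dict, Iterable


def time_steps(generate: Callable[[int], object], step_values: Iterable[int]) -> Dict[int, float]:
    """Call generate(steps) once per value and record the elapsed seconds.

    Hypothetical helper: generate is expected to send one request using the
    given num_inference_steps value.
    """
    timings: Dict[int, float] = {}
    for steps in step_values:
        start = time.perf_counter()
        generate(steps)
        timings[steps] = time.perf_counter() - start
    return timings
```

In practice, generate could be a small function that calls requests.post with "extra": {"num_inference_steps": steps}, and the returned timings can be compared against the visual quality of the saved images.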