AI Integration Nodes

Generate captions, titles, and descriptions for your artwork using your preferred AI model. Isekai integrates with four LLM providers: Claude (Anthropic), OpenAI GPT, Google Gemini, and local Ollama.

| Provider | Best For | Requires | Cost |
| --- | --- | --- | --- |
| Claude | Long-form descriptions, nuanced captions | API key | Pay-per-use |
| OpenAI | General purpose, widely available | API key | Pay-per-use |
| Gemini | Google ecosystem, multimodal | API key | Free tier + paid |
| Ollama | Privacy, offline use, no API costs | Local install | Free |

Claude

Generate high-quality captions using Anthropic’s Claude models.

Location: Isekai/LLMs

Inputs:

  • text_input (STRING): Prompt or context to generate from
  • api_key (STRING, optional): Claude API key (uses ANTHROPIC_API_KEY env var if empty)
  • model (COMBO): claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, claude-3-opus-20240229
  • max_tokens (INT, 1-4096): Maximum response length (default: 100)
  • system_prompt (STRING, optional): Custom instructions for Claude

Outputs:

  • generated_text (STRING): Claude’s response

Recommended Models:

  • claude-3-5-sonnet-20241022: Best quality, most capable
  • claude-3-5-haiku-20241022: Fast and cost-effective
  • claude-3-opus-20240229: Highest intelligence (expensive)

Environment Variable:

```sh
export ANTHROPIC_API_KEY="sk-ant-api03-..."
```

Example:

text_input: "A warrior woman in golden armor standing on a mountain peak"
system_prompt: "Generate a short, catchy title (5-10 words max)"
Output: "Golden Warrior Atop Mountain Peak"
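To make the node's inputs concrete, here is a minimal sketch of the request shape the node presumably builds for Anthropic's Messages API. The function name and exact field handling are illustrative assumptions, not the node's actual code; note that the system prompt is a top-level field in this API, not a message role.

```python
import os

def build_claude_request(text_input, model="claude-3-5-sonnet-20241022",
                         max_tokens=100, system_prompt="", api_key=""):
    """Sketch of a Messages API request (POST /v1/messages).

    Falls back to the ANTHROPIC_API_KEY env var when api_key is empty,
    mirroring the node's documented behavior.
    """
    key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
    headers = {
        "x-api-key": key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": text_input}],
    }
    if system_prompt:
        body["system"] = system_prompt  # top-level field, not a message
    return headers, body
```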

OpenAI

Generate captions using GPT-3.5 or GPT-4 models.

Location: Isekai/LLMs

Inputs:

  • text_input (STRING): Prompt or context to generate from
  • api_key (STRING, optional): OpenAI API key (uses OPENAI_API_KEY env var if empty)
  • model (COMBO): gpt-4-turbo, gpt-4, gpt-3.5-turbo
  • max_tokens (INT, 1-4096): Maximum response length (default: 100)
  • system_prompt (STRING, optional): Custom instructions for GPT

Outputs:

  • generated_text (STRING): GPT’s response

Recommended Models:

  • gpt-4-turbo: Best balance of quality and speed
  • gpt-3.5-turbo: Fast and cost-effective
  • gpt-4: Highest quality (slower, expensive)

Environment Variable:

```sh
export OPENAI_API_KEY="sk-..."
```

Example:

text_input: "portrait of a cyberpunk hacker in neon-lit alley"
system_prompt: "Write a dramatic one-sentence description"
Output: "A lone hacker emerges from shadows, neon reflections dancing across chrome implants."
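The OpenAI node maps onto the Chat Completions API slightly differently: here the system prompt travels as a `system` message rather than a top-level field. A sketch under the same assumptions (illustrative helper, not the node's actual code):

```python
import os

def build_openai_request(text_input, model="gpt-4-turbo",
                         max_tokens=100, system_prompt="", api_key=""):
    """Sketch of a Chat Completions request (POST /v1/chat/completions)."""
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
    messages = []
    if system_prompt:
        # System prompt is a message with role "system" in this API
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": text_input})
    body = {"model": model, "max_tokens": max_tokens, "messages": messages}
    return headers, body
```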

Gemini

Generate captions using Google’s Gemini models.

Location: Isekai/LLMs

Inputs:

  • text_input (STRING): Prompt or context to generate from
  • api_key (STRING, optional): Gemini API key (uses GEMINI_API_KEY env var if empty)
  • model (COMBO): gemini-1.5-pro, gemini-1.5-flash, gemini-1.0-pro
  • max_tokens (INT, 1-8192): Maximum response length (default: 100)
  • system_prompt (STRING, optional): Custom instructions for Gemini

Outputs:

  • generated_text (STRING): Gemini’s response

Recommended Models:

  • gemini-1.5-pro: Most capable, multimodal
  • gemini-1.5-flash: Fast and efficient
  • gemini-1.0-pro: Stable, proven model

Environment Variable:

```sh
export GEMINI_API_KEY="AIza..."
```

Example:

text_input: "fantasy dragon breathing fire over medieval castle"
system_prompt: "Create a short title suitable for an art gallery"
Output: "Dragon's Fury: Castle Siege"
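Gemini's REST shape differs again: the prompt goes into a `contents` array of parts, and the system prompt becomes a `systemInstruction` block. A sketch of the v1beta `generateContent` body the node likely builds (field names from the public REST API; the helper itself is an assumption):

```python
def build_gemini_request(text_input, model="gemini-1.5-flash",
                         max_tokens=100, system_prompt=""):
    """Sketch of a generateContent request body. The endpoint would be:
    https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key=...
    """
    body = {
        "contents": [{"role": "user", "parts": [{"text": text_input}]}],
        "generationConfig": {"maxOutputTokens": max_tokens},
    }
    if system_prompt:
        # System prompt is a separate instruction block, not a content turn
        body["systemInstruction"] = {"parts": [{"text": system_prompt}]}
    return body
```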

Ollama

Generate captions using local open-source models (fully offline).

Location: Isekai/LLMs

Requirements: Ollama running locally (ollama.com)

Inputs:

  • text_input (STRING): Prompt or context to generate from
  • ollama_url (STRING): Ollama server URL (default: "http://localhost:11434")
  • model (COMBO): Dynamically populated from your Ollama installation

Outputs:

  • generated_text (STRING): Model’s response

Special Outputs:

  • "Untitled": Empty input
  • "Connection Failed": Cannot reach Ollama
  • "Error: 404": Model not found
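The special outputs above suggest the node maps failure modes onto fixed strings rather than raising errors. A sketch of that behavior against Ollama's `/api/generate` endpoint (the node's real internals may differ; the function is illustrative):

```python
import json
import urllib.error
import urllib.request

def ollama_generate(text_input, model, ollama_url="http://localhost:11434"):
    """Call /api/generate and map failures to the documented special outputs."""
    if not text_input.strip():
        return "Untitled"  # empty input
    req = urllib.request.Request(
        f"{ollama_url}/api/generate",
        data=json.dumps({"model": model, "prompt": text_input,
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.loads(resp.read())["response"]
    except urllib.error.HTTPError as e:
        return f"Error: {e.code}"  # e.g. "Error: 404" for a missing model
    except (urllib.error.URLError, OSError):
        return "Connection Failed"  # Ollama not reachable
```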

Popular Models:

  • llama3: Meta’s flagship model (best quality)
  • mistral: Fast and capable
  • gemma: Google’s open model
  • phi: Microsoft’s efficient model

Example:

text_input: "A highly detailed digital painting of a fierce warrior"
model: llama3
Output: "Fierce Warrior Portrait"

Setting up Ollama:

  1. Install Ollama:

     ```sh
     curl -fsSL https://ollama.com/install.sh | sh
     ```

  2. Pull a model:

     ```sh
     ollama pull llama3
     ```

  3. Verify it’s running:

     ```sh
     curl http://localhost:11434/api/tags
     ```
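The `/api/tags` response from the last step lists installed models, which is presumably what the node reads to populate its model dropdown. A sketch of that parsing step (helper name assumed):

```python
import json

def installed_models(tags_json):
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]
```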

System Prompt Templates

System prompts control how the AI generates text. Here are templates for common use cases:

  • Short title: Generate a short, catchy title (5-10 words max) for this artwork. Be creative and evocative.
  • Detailed description: Write a vivid, detailed description of this artwork in 1-2 sentences. Focus on mood, composition, and key visual elements.
  • SEO title: Create an SEO-friendly title that describes the artwork clearly while being engaging. Include key visual elements.
  • Poetic caption: Write a poetic, atmospheric caption that captures the essence and mood of this artwork. Be artistic and evocative.
  • Social media: Write an engaging social media caption for this artwork. Be concise, use emojis if appropriate, and make it shareable.
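If you reuse these templates across workflows, keeping them in one place avoids retyping. A hypothetical constant (the node itself just takes a free-form system_prompt string; the keys and dict are not part of Isekai):

```python
# Hypothetical lookup table mirroring the templates above
SYSTEM_PROMPTS = {
    "short_title": "Generate a short, catchy title (5-10 words max) for this artwork. Be creative and evocative.",
    "description": "Write a vivid, detailed description of this artwork in 1-2 sentences. Focus on mood, composition, and key visual elements.",
    "seo_title": "Create an SEO-friendly title that describes the artwork clearly while being engaging. Include key visual elements.",
    "poetic": "Write a poetic, atmospheric caption that captures the essence and mood of this artwork. Be artistic and evocative.",
    "social": "Write an engaging social media caption for this artwork. Be concise, use emojis if appropriate, and make it shareable.",
}
```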

API Key Setup

Set API keys via environment variables for security (recommended over hardcoding in workflows).

Add to ~/.bashrc or ~/.zshrc:

```sh
export ANTHROPIC_API_KEY="sk-ant-api03-..."
export OPENAI_API_KEY="sk-..."
export GEMINI_API_KEY="AIza..."
```

Then restart ComfyUI.
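The lookup order described above (node input first, environment variable as fallback) amounts to a one-liner; a sketch with an assumed helper name:

```python
import os

def resolve_key(node_value, env_var):
    """Prefer the key typed into the node; fall back to the env var."""
    return node_value or os.environ.get(env_var, "")
```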


Example Workflows

1. Auto-titled generation:

```
Dynamic String  →  CLIP Text Encode  →  Sampler  →  VAE Decode
(random prompts)                                        ↓
                                    Claude/OpenAI/Gemini (generate title)
                                                        ↓
                                    Isekai Upload (auto-populated title)
```

2. Character pipeline with a local model:

```
Round Robin  →  Tag Selector  →  Concatenate  →  CLIP Text Encode  →  Sampler
(characters)    (char tags)     (full prompt)                           ↓
                                                                   VAE Decode
                     ┌────────────── (pass prompt) ────────────────────┘
                     ↓
         Ollama (generate title)  →  Isekai Upload (with AI title)
```

3. Comparing providers:

```
Image  →  Color Adjust  →  Vignette  →  [Split to 3 paths]
                   ┌───────────┼───────────┐
                   ↓           ↓           ↓
                Claude      OpenAI      Gemini
                (title)     (title)     (title)
                   ↓           ↓           ↓
            [Compare outputs and choose best]
```

Pricing Comparison

| Provider | Model | Cost per 1M tokens (input) | Cost per 1M tokens (output) |
| --- | --- | --- | --- |
| Claude | Sonnet 3.5 | $3.00 | $15.00 |
| Claude | Haiku 3.5 | $1.00 | $5.00 |
| OpenAI | GPT-4 Turbo | $10.00 | $30.00 |
| OpenAI | GPT-3.5 Turbo | $0.50 | $1.50 |
| Gemini | Pro 1.5 | $1.25 | $5.00 |
| Gemini | Flash 1.5 | $0.075 | $0.30 |
| Ollama | All models | Free | Free |
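To turn the table into real numbers: titles are tiny, so even the priciest models cost fractions of a cent per caption. A small worked example using the table's per-1M-token prices (the helper and the token estimates are illustrative assumptions):

```python
def caption_cost(n_captions, in_tokens, out_tokens, in_price, out_price):
    """Total USD cost, given per-1M-token prices from the table above."""
    return n_captions * (in_tokens * in_price + out_tokens * out_price) / 1_000_000

# Assume ~60 input tokens (prompt + system prompt) and ~15 output tokens
# per title. For 1,000 titles:
#   Claude Haiku 3.5:  1000 * (60*1.00 + 15*5.00) / 1e6  = $0.135
#   GPT-3.5 Turbo:     1000 * (60*0.50 + 15*1.50) / 1e6  = $0.0525
```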

Best Practices

  1. Be specific in prompts: Include key visual elements from your artwork
  2. Use system prompts: Control output length and style
  3. Test different models: Each has different strengths
  4. Keep max_tokens low: For titles, 50-100 tokens is plenty
  5. Use Ollama for experimentation: Free and unlimited testing

Troubleshooting

Connection failed:

Ollama only: Ensure Ollama is running:

```sh
ollama serve
```

Cloud models: Verify your API key is correct and has credits:

```sh
# Test Claude
curl https://api.anthropic.com/v1/messages -H "x-api-key: $ANTHROPIC_API_KEY"
# Test OpenAI
curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"
# Test Gemini
curl "https://generativelanguage.googleapis.com/v1/models?key=$GEMINI_API_KEY"
```

Model not found (Error: 404):

Ollama: Pull the model first:

```sh
ollama pull llama3
```

Cloud models: Check model name spelling (case-sensitive)

Output truncated or too short:

  • Increase the max_tokens parameter
  • Simplify your prompt
  • Try a different model