Issue 01 / Field notes for practical AI
AI Tutorials Hub

Train a Flux LoRA on your own face in 30 minutes (Replicate + ComfyUI)

Step-by-step Flux LoRA training on Replicate, then running the LoRA inside ComfyUI — with gotchas on overfitting, photo count, and running locally vs cloud.

Read time: 7 min read
Difficulty: Advanced
By the AI Tutorials Hub editors

Training a custom image model used to mean renting a GPU, installing PyTorch, debugging CUDA, and losing a weekend. With Replicate's one-click trainer and ComfyUI's node-based inference, you can go from "10 photos of yourself" to "a Flux LoRA that draws you in any scene" in about 30 minutes. This is the workflow I actually use.

What you'll learn

  • What a LoRA actually is (in 90 seconds, no math)
  • How to gather the right training photos
  • The Replicate training step-by-step
  • Running the LoRA inside ComfyUI
  • The overfitting and underfitting gotchas that wreck most first attempts

What a LoRA actually is

A LoRA (Low-Rank Adaptation) is a small add-on to a base image model. Think of it as a "style pack" or "subject pack" that you can plug in at inference time.

The base model stays frozen. The LoRA is a few hundred megabytes of learned weights that steer the base model toward a specific subject (your face), style (watercolor), or concept (a particular product).

For Flux (the open-weights image model from Black Forest Labs), a LoRA is typically 100-400MB and trains in roughly 15-40 minutes on a single A100/H100 GPU, depending on step count. The training cost on Replicate is roughly $1-3 per run as of 2026.
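
For readers who want one concrete number behind "small add-on," here is a minimal sketch of why a LoRA is so much smaller than the layer it steers. It assumes the standard low-rank setup (two thin matrices per adapted layer). The layer width of 3072 is an illustrative assumption, not a claim about every Flux layer.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Weights in one dense layer of the frozen base model (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two low-rank factors a LoRA adds for that layer:
    one (rank x d_in) matrix and one (d_out x rank) matrix."""
    return rank * (d_in + d_out)


# Illustrative layer width only; real models mix several layer shapes.
D = 3072

for rank in (16, 32):
    added = lora_param_count(D, D, rank)
    ratio = full_param_count(D, D) / added
    print(f"rank {rank}: {added:,} added weights, {ratio:.0f}x fewer than the frozen layer")
```

Doubling the rank doubles the added weights, which is why a higher rank means more capacity and more room to overfit.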

Tip
You do not need to understand the math. You do need to understand that "more training" is not better — overfitting is the main failure mode, and the rest of this guide is mostly about avoiding it.

Step 1 — Gather training photos (15 minutes)

The single biggest determinant of LoRA quality is the training set. Bad photos in, bad LoRA out.

How many photos

10-30 is the sweet spot. With fewer than 10, the model tends to overfit to a single expression or angle. With more than 30 at the same step count, each photo is seen fewer times, and the result can drift toward a generic face.
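
One way to see how photo count and step count interact is to work out how often the trainer sees each photo. This is a rough sketch, assuming a batch size of 1 (an assumption; check the setting your trainer actually uses).

```python
def passes_per_image(steps: int, n_images: int, batch_size: int = 1) -> float:
    """Average number of times each training photo is shown to the model."""
    return steps * batch_size / n_images


# Same 1500-step budget, different dataset sizes.
for n_images in (8, 20, 40):
    passes = passes_per_image(1500, n_images)
    print(f"{n_images} photos -> about {passes:.1f} passes per photo")
```

With very few photos, each one is repeated many times, which is where memorizing a single expression comes from. With many photos and the same steps, each one gets much less attention.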

What photos work

A good set has:

  • Varied angles — front, three-quarter, profile, slightly above, slightly below. Not all front-facing.
  • Varied expressions — neutral, smiling, serious, talking. Not all smiling.
  • Varied lighting — indoor warm, outdoor cool, low light, bright sun. Not all studio-lit.
  • Varied clothing — different shirts, jackets, hats. Helps the model learn "face" and not "this exact outfit."
  • Sharp focus — no motion blur, no out-of-focus shots. The model cannot learn from a blurry face.

What does not work

  • All the same angle (e.g., 20 selfies from the same arm's length).
  • One makeup look across the whole set, whether heavy or none (the model will learn that look, not the face).
  • Group photos (you have to crop tightly to one face).
  • Heavy filters (the model will learn the filter, not the face).
  • Low resolution (below 512×512). The model will produce low-res outputs.

A pre-flight checklist

Before you upload, ask: "If a stranger saw these 20 photos, could they pick me out of a crowd of 1000 people?" If yes, your set is good. If the photos all look the same in vibe and angle, gather more variety.
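
The mechanical parts of the checklist (photo count and minimum resolution) can be checked automatically. The sketch below assumes you have already read each photo's dimensions with an image library of your choice. The Photo record and preflight function are hypothetical helpers, and they cannot judge variety, which still needs your eyes.

```python
from dataclasses import dataclass

MIN_SIDE = 512                    # the guide's floor: below 512x512 is too small
MIN_PHOTOS, MAX_PHOTOS = 10, 30   # the guide's sweet spot


@dataclass
class Photo:
    name: str
    width: int
    height: int


def preflight(photos: list[Photo]) -> list[str]:
    """Return human-readable problems; an empty list means the set passes."""
    problems = []
    if not MIN_PHOTOS <= len(photos) <= MAX_PHOTOS:
        problems.append(f"{len(photos)} photos; aim for {MIN_PHOTOS}-{MAX_PHOTOS}")
    for p in photos:
        if min(p.width, p.height) < MIN_SIDE:
            problems.append(f"{p.name}: {p.width}x{p.height} is below {MIN_SIDE}px")
    return problems


# Example: twelve good photos plus one old low-resolution selfie.
photos = [Photo(f"img_{i:02d}.jpg", 1536, 2048) for i in range(12)]
photos.append(Photo("old_selfie.jpg", 480, 640))
print(preflight(photos))
```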

Tip
Take 30-40 photos with a self-timer or a friend's help, then keep the best 10-30. Check framing and lighting before each shot, and change location or outfit between batches. Photo-gathering is the highest-ROI part of the whole process.

Step 2 — Train on Replicate (15-25 minutes)

Go to replicate.com and search for "Flux LoRA training." As of 2026, the trainers you are most likely to find are ostris/flux-dev-lora-trainer and replicate/flux-lora-trainer. Their UIs are similar.

Inputs

  • Input images: zip your 10-30 photos into a single archive. The trainer auto-resizes and auto-captions them.
  • Trigger word: a single word that does not exist in the base model's vocabulary. I use ohwx (random letters). At inference time, you include this word in the prompt to activate the LoRA.
  • Training steps: 1000-2000 for 20 images. More steps = more overfitting. Start with 1500.
  • Learning rate: 4e-4 is the safe default.
  • LoRA rank: 16 or 32. Higher rank = more capacity, more overfitting risk. 16 is fine for faces.
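
The inputs above can be prepared with a few lines of standard-library Python. This is a sketch, not the trainer's official client. zip_photos is a hypothetical helper, and the key names in training_inputs are assumptions that you should match against the trainer's own input form before use.

```python
import zipfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def zip_photos(src_dir: str, out_zip: str) -> int:
    """Zip every image directly inside src_dir; return how many were added."""
    files = sorted(
        p for p in Path(src_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in files:
            zf.write(p, arcname=p.name)  # flat archive, no parent folders
    return len(files)


# Values are the guide's starting points. Key names are assumptions;
# check them against the trainer's input form.
training_inputs = {
    "trigger_word": "ohwx",
    "steps": 1500,
    "learning_rate": 4e-4,
    "lora_rank": 16,
}
```

Calling zip_photos("photos", "photos.zip") would produce the archive to upload as the input images. Non-image files in the folder are skipped.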

Click Run

Cost is roughly $1-3. The run takes 15-25 minutes. Replicate emails you when it is done. The output is a .safetensors file (your LoRA weights) and a tar archive of test images the trainer generated during training.

Evaluate the test images

The trainer generates 4-6 sample images at the end of training, using the trigger word. Look at them:

  • If the face is recognizably you in most of them → the LoRA is good. Save it.
  • If the face is blurry or "AI-generic" → you underfit. Increase steps to 2000, retrain.
  • If the face is pixel-perfect in training samples but breaks at inference → you overfit. Decrease steps to 1000, retrain with a more varied dataset.

Tip
Do not trust the training samples alone. Always test the LoRA in ComfyUI (next step) with 3-4 different prompts. Training samples stay close to the training data, so they tend to flatter the LoRA.

Step 3 — Run the LoRA in ComfyUI (5 minutes)

ComfyUI is a node-based Stable Diffusion / Flux UI. Install it from github.com/comfyanonymous/ComfyUI.

Load a Flux base model checkpoint

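Download the Flux.1 dev model file (or an FP8 quantized version for lower VRAM) and place it in ComfyUI/models/diffusion_models/. Loaded this way, Flux also needs its text encoders and VAE as separate files. All-in-one FP8 checkpoints that bundle them typically go in ComfyUI/models/checkpoints/ instead.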

Add a LoRA loader node

In your workflow, between the model loader and the sampler, add a Load LoRA node. Copy your downloaded .safetensors file into ComfyUI/models/loras/ so it appears in the node's dropdown, select it, and set the strength to 0.8-1.0.
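
If you drive ComfyUI from a script, the same wiring appears in the API-format workflow as two linked nodes. The sketch below reflects ComfyUI's built-in node names (UNETLoader, LoraLoaderModelOnly) as I understand them. The file names are hypothetical, and you should export your own workflow in API format to confirm the exact keys for your install.

```python
import json


def lora_fragment(lora_file: str, strength: float = 0.8) -> dict:
    """Build the model-loader -> LoRA-loader part of an API-format workflow.

    Links are written as [source_node_id, output_index].
    """
    if not 0.0 <= strength <= 2.0:
        raise ValueError("strength outside the range this sketch expects")
    return {
        "1": {
            "class_type": "UNETLoader",
            "inputs": {
                "unet_name": "flux1-dev.safetensors",  # hypothetical file name
                "weight_dtype": "default",
            },
        },
        "2": {
            "class_type": "LoraLoaderModelOnly",
            "inputs": {
                "model": ["1", 0],          # take the model from node 1
                "lora_name": lora_file,     # file inside models/loras/
                "strength_model": strength,
            },
        },
    }


print(json.dumps(lora_fragment("my_face.safetensors"), indent=2))
```

The sampler would then take its model input from node "2" and not from node "1", which is what "between the loader and the sampler" means in practice.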

Prompt with the trigger word

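ohwx, portrait of a person in a sunlit Tokyo alley, soft afternoon light,
35mm photo, f/2.8

The ohwx trigger word activates the LoRA. Without it, the model falls back to the base model's defaults. Set the aspect ratio through the latent image size (for example 896×1152 for 3:4) and not in the prompt: ComfyUI passes Midjourney-style flags such as --ar 3:4 to the model as plain text.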

Generate

Queue the prompt. You should see the LoRA face in the new scene. If the face is too strong, lower the LoRA strength to 0.6. If it is too weak, raise to 1.0.
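
To find the right strength without guessing, it helps to render the same few prompts at several strengths with a fixed seed. Below is a small sketch that only builds the list of runs. The scene prompts, the seed value, and the build_grid helper are illustrative assumptions, and queueing each run in ComfyUI is left to you.

```python
from itertools import product

TRIGGER = "ohwx"
SEED = 1234  # fixed so only prompt and strength change between images

scenes = [
    "portrait of a person in a sunlit Tokyo alley, 35mm photo",
    "a person hiking on a snowy ridge, overcast light",
    "studio headshot of a person, dramatic side lighting",
]
strengths = [0.6, 0.8, 1.0]


def build_grid(scenes, strengths, trigger=TRIGGER, seed=SEED):
    """One run per (scene, strength) pair, each prompt led by the trigger word."""
    return [
        {"prompt": f"{trigger}, {scene}", "strength": strength, "seed": seed}
        for scene, strength in product(scenes, strengths)
    ]


for run in build_grid(scenes, strengths):
    print(run["strength"], "|", run["prompt"])
```

Comparing the rows side by side shows where the face becomes recognizable and where it starts to overpower the scene.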

Gotchas

1. Overfitting

The most common failure. Symptoms: training samples look perfect, but any new prompt produces a face that looks like a distorted version of you — eyes too big, skin too smooth, or features subtly wrong.

Fix: reduce training steps (from 1500 to 1000), gather a more varied dataset, or lower LoRA strength at inference (0.6-0.7 instead of 1.0).

2. Underfitting

The opposite. The face in your outputs is recognizably Flux's "default person," not you.

Fix: increase steps to 2000-2500, add more training photos (5-10 more), or raise LoRA strength to 1.0-1.2.

3. Trigger word collision

If your trigger word happens to be a real English word (or close to one), the base model's prior on that word will fight your LoRA. Use a nonsense word (ohwx, xyzq) to avoid this.
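
A quick way to get a nonsense trigger word is to generate one. This sketch uses consonants only, on the assumption that vowel-free strings are unlikely to be real words. That is a heuristic of mine, not a guarantee that the string is absent from the model's vocabulary. make_trigger is a hypothetical helper.

```python
import random
import string

CONSONANTS = "".join(c for c in string.ascii_lowercase if c not in "aeiou")


def make_trigger(length=4, avoid=(), rng=None):
    """Return a random consonant-only word that is not in `avoid`."""
    rng = rng or random.Random()
    while True:
        word = "".join(rng.choice(CONSONANTS) for _ in range(length))
        if word not in avoid:
            return word


# Skip triggers you have already used for other LoRAs.
print(make_trigger(length=5, avoid={"xyzq"}))
```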

4. Resolution mismatch

Flux works best around one megapixel (1024×1024). If you train at a much higher resolution or use very wide aspect ratios at inference, you may see duplicated features or distortion. Stay close to 1024×1024 for training, and use 1:1, 3:4, or 4:3 for inference.
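
To stay near that pixel budget at other aspect ratios, you can compute the width and height directly. This is a minimal sketch. Snapping to multiples of 64 is a common convention that I am assuming here, not a documented Flux requirement.

```python
import math


def dims_for_aspect(w_ratio: int, h_ratio: int,
                    target_pixels: int = 1024 * 1024,
                    multiple: int = 64) -> tuple[int, int]:
    """Width and height near target_pixels for a w:h ratio, snapped to `multiple`."""
    scale = math.sqrt(target_pixels / (w_ratio * h_ratio))
    width = round(w_ratio * scale / multiple) * multiple
    height = round(h_ratio * scale / multiple) * multiple
    return width, height


for ratio in ((1, 1), (3, 4), (4, 3)):
    print(ratio, "->", dims_for_aspect(*ratio))
```

In ComfyUI these numbers go into the latent image node's width and height fields.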

5. Cost surprise on long runs

Replicate charges by the second. A 30-minute run is fine; a 4-hour run (from leaving default settings and 5000 steps) is a $40 surprise. Always set steps explicitly.
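
Because billing is per second, you can project the cost of a longer run from one you have already paid for. The sketch below assumes the per-second rate stays the same between runs. The dollar figures are the guide's rough ranges, not Replicate's published prices.

```python
def projected_cost(past_cost_usd: float, past_minutes: float, new_minutes: float) -> float:
    """Scale a known run's cost linearly with run time."""
    return past_cost_usd * new_minutes / past_minutes


FOUR_HOURS = 4 * 60

# Cheapest and priciest ends of the guide's "$1-3 for 15-25 minutes" range.
low = projected_cost(1.0, 25, FOUR_HOURS)
high = projected_cost(3.0, 15, FOUR_HOURS)
print(f"4-hour run: roughly ${low:.2f} to ${high:.2f}")
```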

FAQ

How many training photos do I need?

10-30. Fewer than 10 tends to overfit; more than 30 tends to underfit unless you also raise the step count.

Can I train a LoRA on a paid model (like Flux Pro)?

Not directly. Flux Pro is API-only and does not expose weights, so a LoRA you train yourself cannot be loaded into it. Train and run on Flux Dev (open weights) instead, and expect somewhat lower quality than Pro.

Why does my LoRA look like me at age 5?

Usually a dataset problem. If your training photos are all from one age or one hairstyle, the model has no other context. Add variety.

Can I train multiple LoRAs and stack them?

Yes. In ComfyUI, chain multiple LoRA loader nodes. Common stacks: face LoRA + style LoRA + concept LoRA. Strength per LoRA usually needs to be reduced (0.5-0.7 each) so the LoRAs do not overpower the base model or each other.
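
In an API-format workflow, "chaining" means each LoRA loader takes its model input from the previous one. Below is a sketch under the same assumptions as any scripted ComfyUI setup: the LoraLoaderModelOnly node name and its keys reflect ComfyUI's built-in nodes as I understand them. The file names, node ids, and the chain_loras helper are hypothetical.

```python
def chain_loras(model_node_id: str, loras: list[tuple[str, float]], first_id: int = 10):
    """Return (nodes, last_node_id) for LoRA loaders wired in series.

    Each loader reads the model from the node before it; the sampler
    should read from last_node_id.
    """
    nodes = {}
    upstream = model_node_id
    for offset, (lora_file, strength) in enumerate(loras):
        node_id = str(first_id + offset)
        nodes[node_id] = {
            "class_type": "LoraLoaderModelOnly",
            "inputs": {
                "model": [upstream, 0],
                "lora_name": lora_file,
                "strength_model": strength,
            },
        }
        upstream = node_id
    return nodes, upstream


# Face LoRA first, then a style LoRA, both at reduced strength.
nodes, last = chain_loras("1", [("my_face.safetensors", 0.7),
                                ("watercolor.safetensors", 0.5)])
print(last, nodes[last]["inputs"]["model"])
```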

How long should the trigger word be?

One word, ideally 4-6 random characters. Two words works but is more typing per prompt.

Can I sell the outputs commercially?

For LoRAs trained on your own face or on public-domain imagery, yes. For LoRAs trained on other people, IP, or trademarked characters, no — same rules as any generative AI use.

What if I don't have a powerful local GPU?

Skip ComfyUI and use Replicate's hosted inference instead. The hosted Flux + LoRA workflow is roughly $0.01-0.05 per image.

How do I share the LoRA with a friend?

Send the .safetensors file (typically 100-400MB). They drop it into their ComfyUI models/loras/ folder, add a LoRA loader node, and use the same trigger word. Replicate also has a "push to Replicate" feature that creates a hosted version of your LoRA.
