Authors: You Joo Lee, Jueun Jung, Keunil Lee, Heejung Uh
Affiliation: Yonsei University Data Science Lab (DSL)
This project is a generative-AI style transfer pipeline that transforms a given face into an idolized version while preserving its original identity.
By combining LoRA fine-tuning, IP-Adapter, ReActor FaceSwap, and RealESRGAN + CodeFormer, we created a pipeline capable of generating SM, YG, and JYP-style portraits from real facial inputs — keeping the person’s identity intact while adding the aesthetic tone of each entertainment agency.
| Item | Description |
|---|---|
| Title | Idolization — Face-to-Idol Generative Model |
| Team | DSL Modeling: Generative Model Team (Yonsei University) |
| Objective | Generate idol-style portraits from input faces while preserving identity fidelity |
| Core Components | LoRA Fine-Tuning · IP-Adapter FaceID Plus v2 · ReActor FaceSwap · RealESRGAN + CodeFormer |
- Curated 20–25 high-quality frontal images for each entertainment agency (SM, YG, JYP).
- Fine-tuned Stable Diffusion v1.5 to capture each agency’s distinct tone, lighting, and visual aesthetic.
- Produced `.safetensors` weights that encode stylistic priors for each label.
📁 Included Files
- `Dataset_Maker.ipynb`
- `trained_lora_weights/`
  - `sm_woman.safetensors`, `yg_woman.safetensors`, `jyp_woman.safetensors`
  - `sm_man.safetensors`, `yg_man.safetensors`, `jyp_man.safetensors`
LoRA served as the “agency training” phase — learning each label’s signature visual identity (e.g., SM’s ethereal tone, YG’s edgy contrast, JYP’s colorful brightness).
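To make the idea of "weights that encode stylistic priors" concrete, here is a minimal sketch of the low-rank update that LoRA applies to a frozen layer: the merged weight is `W + scale * (alpha / r) * (up @ down)`. The function names, the toy matrices, and the `alpha` and `scale` values are illustrative assumptions, not values taken from this project's training runs.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(w, lora_down, lora_up, alpha, scale=1.0):
    """Return W + scale * (alpha / r) * (up @ down).

    w:         d_out x d_in frozen base weight
    lora_down: r x d_in     trained low-rank factor
    lora_up:   d_out x r    trained low-rank factor
    """
    r = len(lora_down)  # LoRA rank
    delta = matmul(lora_up, lora_down)
    factor = scale * alpha / r
    return [
        [w_ij + factor * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy 2x2 layer with a rank-1 adapter (made-up numbers for illustration).
w = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]   # 1 x 2
up = [[0.5], [0.0]]   # 2 x 1

merged = lora_merge(w, down, up, alpha=1.0)
unchanged = lora_merge(w, down, up, alpha=1.0, scale=0.0)
```

Setting `scale` to 0 recovers the base model, which is why a single base checkpoint can be paired with any of the six per-agency adapters at load time.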
- Injected facial embeddings into the diffusion process for identity preservation.
- Ensured that individual facial features remain consistent after style transfer.
- The FaceID Plus v2 version of IP-Adapter is optimized for identity retention, working robustly alongside LoRA and text-based prompts.
- Implemented using ComfyUI, with separate workflows for each agency and gender-specific prompt configuration.
📁 Included File
- `workflows/`
  - `sm_woman.json`, `yg_woman.json`, `jyp_woman.json`
  - `sm_man.json`, `yg_man.json`, `jyp_man.json`
Conceptually, IP-Adapter acts as a facial anchor, preserving input identity while LoRA applies stylistic conditioning.
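The "facial anchor" behavior can be illustrated with the decoupled cross-attention idea behind IP-Adapter: the text prompt and the image-derived embedding each get their own attention pass, and the image result is added with a tunable weight. The sketch below is a single-query toy in plain Python. It omits the learned projection layers and multi-head structure of the real model, and every name and number in it (`face_scale`, the toy vectors) is an assumption for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]


def decoupled_cross_attention(query, text_kv, face_kv, face_scale):
    """Add a separately computed face-attention term to the text-attention output."""
    text_out = attend(query, *text_kv)
    face_out = attend(query, *face_kv)
    return [t + face_scale * f for t, f in zip(text_out, face_out)]


# Toy inputs: two text tokens and one face token (made-up numbers).
query = [1.0, 0.0]
text_kv = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
face_kv = ([[1.0, 0.0]], [[0.0, 2.0]])

text_only = decoupled_cross_attention(query, text_kv, face_kv, face_scale=0.0)
mixed = decoupled_cross_attention(query, text_kv, face_kv, face_scale=0.5)
```

With `face_scale=0.0` the output reduces to ordinary text cross-attention. Raising it pulls the result toward the face embedding, which is the trade-off between identity and style that the adapter weight controls.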
- Utilized ArcFace-based embeddings (via InsightFace) to seamlessly swap generated faces into real idol photographs.
- ReActor performs fine-grained facial alignment, tone correction, and boundary blending, producing natural and photorealistic composites.
- RealESRGAN enhances texture fidelity and upscales image resolution.
- CodeFormer restores facial details and removes noise artifacts.
- The combined workflow yields high-resolution, cohesive, and realistic results.
📁 Included File
workflows/reactor_faceswap.json
This stage represents the “debut” — merging the stylized AI-generated identity into real idol imagery.
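ArcFace-style embeddings are usually compared by cosine similarity, so one way to sanity-check identity preservation is to compare the embedding of the input face with that of the final composite. The following is a sketch of that check only. The project does not state that it measured identity this way, the `threshold` value is a hypothetical placeholder, and the short vectors stand in for real embeddings that would come from a face recognition model such as the one in InsightFace.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_identity(source_emb, output_emb, threshold=0.5):
    """Return True if the two embeddings are at least `threshold` similar.

    The threshold is a hypothetical value; a real one would need tuning
    against the specific embedding model in use.
    """
    return cosine_similarity(source_emb, output_emb) >= threshold


# Toy embeddings standing in for real face embeddings.
source = [1.0, 0.0]
close_match = [0.9, 0.1]
different = [0.0, 1.0]

print(same_identity(source, close_match))
print(same_identity(source, different))
```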
- Base Model: RealisticVision v5.1 (Stable Diffusion 1.5 derivative)
- Frameworks: PyTorch · Diffusers · ComfyUI
- Training Platforms: Google Colab · Vessl Workspace
- Additional Tools: InsightFace (ArcFace), RealESRGAN, CodeFormer
| SM Woman Style | YG Woman Style | JYP Woman Style |
|---|---|---|
| (image) | (image) | (image) |

| YG Man Style | Team Photo |
|---|---|
| (image) | (image) |
Each image was generated through the final workflow: LoRA (style) + IP-Adapter (identity) + ReActor (face swap) + RealESRGAN/CodeFormer (refinement).
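Because the LoRA weights and ComfyUI workflows follow a consistent `<agency>_<gender>` naming scheme, picking the right assets for a run can be automated. The helper below is a minimal sketch based only on the file names listed in this repository. `resolve_assets` and the `STAGES` tuple are hypothetical names introduced here, and the function only builds paths; it does not run ComfyUI.

```python
from pathlib import PurePosixPath

AGENCIES = ("sm", "yg", "jyp")
GENDERS = ("woman", "man")

# Stage order as described in the text above.
STAGES = (
    "LoRA (style)",
    "IP-Adapter (identity)",
    "ReActor (face swap)",
    "RealESRGAN/CodeFormer (refinement)",
)


def resolve_assets(agency, gender, root="Idolization_Project"):
    """Return the LoRA and workflow paths for one agency/gender combination."""
    agency, gender = agency.lower(), gender.lower()
    if agency not in AGENCIES:
        raise ValueError(f"unknown agency: {agency!r}")
    if gender not in GENDERS:
        raise ValueError(f"unknown gender: {gender!r}")
    name = f"{agency}_{gender}"
    base = PurePosixPath(root)
    return {
        "lora": str(base / "trained_lora_weights" / f"{name}.safetensors"),
        "style_workflow": str(base / "workflows" / f"{name}.json"),
        "faceswap_workflow": str(base / "workflows" / "reactor_faceswap.json"),
    }


assets = resolve_assets("YG", "man")
for stage in STAGES:
    print(stage)
print(assets["lora"])
```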
Idolization_Project
├── Dataset_Maker.ipynb
├── trained_lora_weights/
│   ├── sm_woman.safetensors
│   ├── yg_woman.safetensors
│   ├── jyp_woman.safetensors
│   ├── sm_man.safetensors
│   ├── yg_man.safetensors
│   └── jyp_man.safetensors
├── workflows/
│   ├── sm_woman.json
│   ├── sm_man.json
│   ├── yg_woman.json
│   ├── yg_man.json
│   ├── jyp_woman.json
│   ├── jyp_man.json
│   └── reactor_faceswap.json
└── results/
    ├── sm_output.png
    ├── yg_woman_output.png
    ├── yg_man_output.png
    ├── jyp_output.png
    └── team_output.png

This project was conducted solely for academic and non-commercial purposes under the Yonsei University Data Science Lab (DSL). All celebrity photographs were used strictly as visual references. All copyrights and likeness rights remain with their respective owners. The generated outputs are for educational presentation use only and are not intended for commercial distribution.
- RealisticVision v5.1 — moiu2998/realisticVisionV60B1_v51VAE.safetensors
- Stable Diffusion VAE (Fine-tuned) — stabilityai/sd-vae-ft-mse-original
- IP-Adapter FaceID Plus v2 — h94/IP-Adapter
- CLIP Vision Model — Used internally for image embedding extraction and conditioning alignment.
- ReActor FaceSwap — Implementation reference from ComfyUI ReActor Tutorial (YouTube)
- RealESRGAN — xinntao/Real-ESRGAN
- CodeFormer — sczhou/CodeFormer




