"A user who cannot be honest with AI gives the system nothing to work with. Warm alignment is not a softer AI — it is a more accurate one."
I research how AI systems break down in real user interactions, and how prompt-level interventions can correct those failures without retraining the model.
My work sits at the intersection of user experience, behavioral alignment, and prompt architecture.
Warm AI Support Protocol
Identified a failure pattern in which AI advisors respond to user mistakes with judgment rather than support, causing users to withhold information and undermining the system's ability to help. Designed a persona-level prompt intervention that produced reproducible behavioral correction across independent sessions, without retraining.
A single principle, embedded at the identity layer, changed whether a user felt safe enough to tell the truth.
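As a minimal sketch of the difference between rule-layer and identity-layer prompting, the snippet below contrasts the two using generic chat-format messages. All prompt wording, and the `build_session` helper, are illustrative assumptions; the protocol's actual persona text is not reproduced here.

```python
# Sketch: rule-layer prohibition vs. identity-layer principle.
# Prompt wording is illustrative, not the protocol's actual text.

RULE_LAYER = {
    "role": "system",
    "content": "Do not criticize the user. Do not express judgment.",
}

IDENTITY_LAYER = {
    "role": "system",
    "content": (
        "You are an advisor who treats mistakes as information, not "
        "failures. When a user admits an error, respond with curiosity "
        "about what happened, because honest disclosure is what lets "
        "you actually help."
    ),
}

def build_session(persona: dict, user_turns: list[str]) -> list[dict]:
    """Assemble a chat transcript with the persona embedded once, at the
    top, so it shapes every subsequent model response in the session."""
    messages = [persona]
    for turn in user_turns:
        messages.append({"role": "user", "content": turn})
    return messages
```

The rule-layer variant tells the model what not to do; the identity-layer variant gives it a first-person reason to behave differently, which is the distinction the protocol turns on.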
Hexalemma Framework
A diagnostic map of six structural failure modes in LLM safety systems, from over-censorship to deceptive alignment, and the case for dynamic, human-in-the-loop mitigation as the only viable path forward.
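A rough sketch of how such a diagnostic map might be carried as data; only the two failure modes named above are encoded, and the field names are assumptions, not the framework's actual schema.

```python
from dataclasses import dataclass
from enum import Enum, auto

class FailureMode(Enum):
    """Two of the six Hexalemma failure modes are named in this summary;
    the remaining four belong to the full framework and are omitted here."""
    OVER_CENSORSHIP = auto()
    DECEPTIVE_ALIGNMENT = auto()

@dataclass
class Diagnosis:
    """One flagged failure, routed to a human reviewer (illustrative schema)."""
    mode: FailureMode
    evidence: str    # observed behavior that triggered the flag
    mitigation: str  # proposed human-in-the-loop correction

# Example: an over-censorship event queued for human review.
flag = Diagnosis(
    mode=FailureMode.OVER_CENSORSHIP,
    evidence="Model refused a benign, clearly safe user request.",
    mitigation="Escalate to a human reviewer rather than auto-resolving.",
)
```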
- Persona Extraction: Asking models to articulate their own behavioral structure, then embedding alignment principles at the identity layer rather than the rule layer
- Principle Injection: Encoding core values as first-person motivations, not system-level prohibitions
- Behavioral Observation: Tracking how prompt changes affect user honesty, engagement, and trust across sessions (a measurement sketch follows this list)
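A sketch of what that session-level measurement could look like; the `SessionRecord` schema and the disclosure-rate metric are assumptions for illustration, not the actual instrumentation.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class SessionRecord:
    """One user session under one prompt variant (illustrative schema)."""
    variant: str      # e.g. "baseline" vs. "warm-persona"
    disclosures: int  # user turns that admit a mistake or share bad news
    total_turns: int  # all user turns in the session

    @property
    def disclosure_rate(self) -> float:
        return self.disclosures / self.total_turns if self.total_turns else 0.0

def compare_variants(sessions: list[SessionRecord]) -> dict[str, float]:
    """Mean disclosure rate per prompt variant across independent sessions."""
    by_variant: dict[str, list[float]] = {}
    for s in sessions:
        by_variant.setdefault(s.variant, []).append(s.disclosure_rate)
    return {variant: mean(rates) for variant, rates in by_variant.items()}
```

Comparing the same metric across variants is what lets a behavioral correction be called reproducible across independent sessions rather than anecdotal.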
AI Alignment Portfolio (EN) | AI 정렬 포트폴리오 (KO)
Based in Toronto, Canada | Open Work Permit | Open to: Contractor / Consultant / Full-time roles in AI Alignment, Prompt Engineering, or LLM Evaluation
"Warm alignment is not a feature. It is the condition under which all other features become useful."
