Can AI Explain Your Moisturizer? Ours Can, Almost…

Apr 29

TL;DR:

We're prototyping a science-first skincare interpreter, powered by structured ingredient data and conversational AI. Before training our own model, we simulated behavior using ChatGPT to refine logic, tone, and fallback UX.

Outcome so far: Clearer tone, better guardrails for ambiguity, and early insight into user mental models

Read the previous post here.

This post builds on the original project overview. You can read the full context here. → Original case study

Step 1: Choosing 5 Products from 5 Countries

Before scaling to local models, we tested how GenAI could educate users with ingredient clarity, not fear-based claims.

We focused on three vectors:

Behavioral Prototyping: Used ChatGPT and structured ingredient data to simulate how the final chatbot might respond, tuning tone, logic, and fallbacks.
Cross-Product Testing: Selected 5 regionally diverse products (Korea, Japan, US, EU, Australia) to reflect global formulation philosophies and user trust triggers.
- 🇰🇷 Korea: No. 3 Skin Softening Serum
  → Ferments, humectants, and biotech actives
- 🇯🇵 Japan: Biore UV Aqua Rich Watery Essence
  → Lightweight SPF with skin-soothing agents
- 🇺🇸 United States: CeraVe Resurfacing Retinol Serum
  → Potent actives with marketing-led language
- 🇪🇺 Europe: Bioderma Sensibio H2O Micellar Water
  → EU-regulated, fragrance-free formulation for sensitive skin
- 🇦🇺 Australia: Blue Lizard Face Mineral Sunscreen SPF 30+
  → Reef-safe, mineral-based sunscreen with natural-leaning claims
UX Prompt Stress-Testing: Designed queries around actives, risk, and ingredient roles (e.g., “Is this safe for sensitive skin?”) to pressure-test nuance handling.

Step 2: Ingredient Data from INCIDecoder

We pulled profiles from NCIDecoder, valued for its accessible, science-aligned structure. We tagged ~80 ingredients with:

INCI name + common aliases
Functions (e.g., humectant, preservative)
Risk or controversy notes
Suitability tags (e.g., sensitive skin, acne-prone)

This dataset drives both prompt logic and UX design.

Step 3: Training ChatGPT as a Behavioral Prototype

AI Prompt Simulation Highlights:

“What’s the difference between Retinol and Niacinamide?”
“Is Ethylhexylglycerin okay for sensitive skin?”
“What’s the role of ferments in this formula?”

We tuned for responses that were:

Scientifically accurate but plainspoken
Transparent about uncertainty
Sensitive to user safety concerns

UX + Error Handling

We mapped chatbot responses for edge cases like:

Unsupported queries
Ingredient unknowns
Debated claims (e.g., parabens, silicones)

Fallback logic: Honest, helpful, and non-alarmist (e.g., “We don’t have data on that ingredient yet.”)

What’s Next:

We’re turning these findings into a locally hosted model with tighter guardrails and real-time UI feedback.

If you're building AI UX where trust matters more than novelty, we’d love to swap notes.

Phases:

Phase 1: Testing + Iteration (Current Phase)

Integrate with LLM or retrieval-augmented model
Conduct usability testing on language clarity and risk phrasing
Align outputs with accessibility and enterprise design systems

Phase 2: Testing + Iteration

Begin user testing with real ingredient lists
Test upload and paste workflows
Build a trust-feedback loop to guide future improvements

Future Plans

Layer in user profiles (e.g., acne-prone, sensitive skin, rosacea)
Enable “smart” questions (e.g., “Is this pregnancy safe?”)
Develop a Chrome extension for ingredient popovers while brows

Glow Getter