2025-12-10 – Abigail Adams
People's visual and brand preferences encode a rich signal of identity that goes beyond clicks or text. In this talk, I present a pipeline for modeling a user's "aesthetic identity" from their Instagram activity: liked visuals, followed brands, and the interactions between them. I show how to convert images and brand interactions into embedding spaces, condition a language model (via adapter / LoRA fine-tuning) to emulate that user's responses, and evaluate the fidelity of the resulting "digital twin." You'll leave with a reproducible architecture for persona modeling from multimodal data, along with insights into pitfalls around overfitting, privacy, and drift.
Motivation & Why It Matters
Recommendation systems, personalization, and user modeling often rely heavily on clicks or textual interactions, but much of what distinguishes individual taste is visual: the styles, brands, and image compositions a user gravitates toward. If we could build a digital twin of a user that reacts to new visual or brand stimuli the way the user would, the applications are powerful (personal assistants, targeted branding, persona-based agents, etc.). But this is nontrivial: you must unify image + brand + interaction signals, fine-tune models without overfitting, and evaluate behavior alignment.
This talk describes an end-to-end approach:
Data & embedding modeling – ingest liked visuals, brand follows, and brand–image relationships (see the embedding sketch after this list)
Persona construction & prompt conditioning – encapsulate the user's profile as structured embeddings plus textual persona descriptors (persona sketch below)
Adapter / LoRA fine-tuning – fine-tune a base LLM (frozen backbone + low-rank adapters) to simulate user-like responses (LoRA sketch below)
Inference & memory / retrieval integration – how the twin processes new stimuli, uses context, and responds (retrieval sketch below)
Evaluation & limitations – metrics for alignment, holdout testing, drift, bias, and failure modes (evaluation sketch below)
Demo / case study & future extensions
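To make the outline concrete, here is a minimal sketch of the embedding step, assuming CLIP as the joint image–text encoder; the talk does not prescribe a specific model, and the brand-prompt template is purely illustrative.

```python
# Embedding sketch: encode liked visuals and followed brands into one space.
# CLIP is an assumed stand-in; any joint image-text encoder would work.
import torch
from transformers import CLIPModel, CLIPProcessor
from PIL import Image

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_liked_images(image_paths):
    """Encode a user's liked visuals into a unit-normalized embedding matrix."""
    images = [Image.open(p).convert("RGB") for p in image_paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)  # unit vectors for cosine similarity

def embed_brands(brand_names):
    """Encode followed brands as text so they share the image embedding space."""
    # The prompt template is an illustrative assumption, not a fixed recipe.
    inputs = processor(text=[f"a photo in the style of {b}" for b in brand_names],
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)
```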
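For persona construction, one plausible reading (my assumption, not necessarily the speaker's exact method) is to cluster the liked-image embeddings into taste clusters and render them, together with the brand follows, as a textual persona descriptor for prompt conditioning:

```python
# Persona sketch: k-means taste clusters plus a hand-written descriptor
# template; both the clustering choice and the template are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def build_persona(image_embs: np.ndarray, brand_names: list[str], k: int = 4) -> str:
    """Summarize a user's taste clusters and brand follows as a textual persona."""
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(image_embs)
    # Share of liked images falling in each taste cluster, largest first.
    shares = np.bincount(kmeans.labels_, minlength=k) / len(kmeans.labels_)
    order = np.argsort(shares)[::-1]
    cluster_lines = [f"- taste cluster {i}: {shares[i]:.0%} of liked visuals"
                     for i in order]
    return ("You are a digital twin of a specific user.\n"
            "Visual taste profile:\n" + "\n".join(cluster_lines) + "\n"
            f"Followed brands: {', '.join(brand_names)}.\n"
            "React to new stimuli the way this user would.")
```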
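The adapter step maps naturally onto Hugging Face PEFT. A hedged sketch follows; the base model (gpt2 here), rank, and target modules are placeholders rather than the configuration used in the talk.

```python
# LoRA sketch: frozen backbone + low-rank adapters via PEFT.
# Model name and hyperparameters below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                        # low-rank dimension of the adapters
    lora_alpha=32,               # scaling factor applied to adapter output
    lora_dropout=0.05,
    target_modules=["c_attn"],   # GPT-2's fused attention projection
)
model = get_peft_model(base, lora_cfg)  # backbone weights stay frozen
model.print_trainable_parameters()      # typically well under 1% of the full model
```

Keeping the backbone frozen is what makes per-user twins practical: each user contributes only a small adapter, which also limits how far fine-tuning can overfit to a sparse like history.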
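At inference time, the memory / retrieval step can be as simple as a cosine search over the user's liked-visual embeddings, with the nearest past likes pulled into the twin's context; a vector database would slot in here in production. The function below is an illustrative sketch.

```python
# Retrieval sketch: plain cosine search over past likes (assumes the
# embeddings were unit-normalized, as in the embedding sketch above).
import numpy as np

def retrieve_context(new_emb: np.ndarray, liked_embs: np.ndarray,
                     captions: list[str], top_k: int = 3) -> list[str]:
    """Return captions of the liked visuals closest to a new stimulus."""
    sims = liked_embs @ new_emb              # cosine similarity on unit vectors
    top = np.argsort(sims)[::-1][:top_k]
    return [captions[i] for i in top]
```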
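For evaluation, one natural holdout metric (an assumption about what "alignment" means here) is the agreement rate between the twin's predicted like/skip decisions and the user's actual actions on stimuli held out of training:

```python
# Evaluation sketch: holdout agreement between twin and user decisions.
def alignment_score(twin_decisions: list[bool], user_decisions: list[bool]) -> float:
    """Fraction of held-out stimuli where the twin matched the user's action."""
    assert len(twin_decisions) == len(user_decisions)
    matches = sum(t == u for t, u in zip(twin_decisions, user_decisions))
    return matches / len(user_decisions)
```

Tracking this score over time-sliced holdouts also exposes drift: a twin trained on last year's likes should be expected to degrade as the user's taste moves.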
AI and Machine Learning Engineering @ Vizit.