Complete transparency on how we analyzed 1,250 interviews from the Anthropic Interviewer dataset.
Anthropic/AnthropicInterviewer on HuggingFace (MIT License)
Each interview is a full conversation transcript between an AI interviewer and a human participant about their experiences with AI tools at work. Interviews vary in length from ~500 to ~3000 words.
All 1,250 interviews were analyzed using GPT-4o-mini with structured outputs, extracting consistent fields across 47 dimensions per interview.
How participants think about and describe AI
primary_metaphor, relationship_framing, agency_attribution, anthropomorphization_level

Emotional responses during the interview
primary_emotions[], emotional_intensity, emotional_valence, emotional_triggers[]

What tasks are delegated vs protected from AI
high_ai_tasks[], protected_tasks[], boundary_logic

Social and organizational dynamics
workplace_culture, peer_influence, disclosure_pattern, client_considerations

Trust levels, what builds/destroys trust, verification behaviors
overall_trust, trust_drivers[], distrust_drivers[], verification_habit, trust_trajectory

How AI affects professional identity and sense of meaning
professional_identity_strength, identity_threat_level, meaning_source, skill_anxiety, expertise_relationship, meaning_disruption

The arc of AI adoption over time
experience_level, adoption_trajectory, pivotal_moments[], learning_approach, future_intentions

Ethical reasoning and moral responses to AI use
ethical_concerns_present, primary_ethical_frame, guilt_or_shame, moral_language[]

Internal conflicts about AI (multiple per interview)
tension_name, manifestation, resolution_status

Verbatim quotes with thematic tagging (3-5 per interview)
quote, theme, significance

Open-ended theme discovery
theme_name, evidence

High-level characterization
one_sentence_summary, archetype, research_value, standout_insight

The "struggle score" is a composite index we created to quantify psychological difficulty with AI adoption. It is NOT a validated psychological measure; it is an exploratory metric derived from the structured analysis.
none=0, low=1, moderate=2, high=3
none=0, low=1, moderate=2, high=3
false=0, true=1
false=0, true=1
If any tension has resolution_status="unresolved", score=1
false=0, true=1
0 (no struggle) to 10 (maximum struggle). In practice, scores ranged from 0 to 8.
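The scoring rules above can be sketched as a small function. This is only an illustration under stated assumptions: the page does not say here which extracted fields feed each component, so the parameter names `ordinal_levels` and `boolean_flags` are placeholders, and only the point values and the `resolution_status="unresolved"` rule come from the description above.

```python
from typing import Mapping, Sequence

# Point values for the two ordinal components, as listed above.
LEVEL_POINTS = {"none": 0, "low": 1, "moderate": 2, "high": 3}


def struggle_score(
    ordinal_levels: Sequence[str],
    boolean_flags: Sequence[bool],
    tensions: Sequence[Mapping[str, str]],
) -> int:
    """Sum the six components into a 0-10 composite.

    ordinal_levels: two ratings, each one of none/low/moderate/high (0-3 points each).
    boolean_flags:  three true/false indicators (0-1 point each).
    tensions:       extracted tension records; one point if any is unresolved.
    Which extracted fields map to these inputs is an assumption left to the caller.
    """
    if len(ordinal_levels) != 2 or len(boolean_flags) != 3:
        raise ValueError("expected two ordinal levels and three boolean flags")

    score = sum(LEVEL_POINTS[level] for level in ordinal_levels)
    score += sum(1 for flag in boolean_flags if flag)
    # The tension component is binary no matter how many tensions are unresolved.
    if any(t.get("resolution_status") == "unresolved" for t in tensions):
        score += 1
    return score


# Hypothetical input: every component at its maximum.
max_case = struggle_score(
    ["high", "high"],
    [True, True, True],
    [{"tension_name": "example", "resolution_status": "unresolved"}],
)
```

Because the tension component is capped at one point, an interview with several unresolved tensions scores the same on that component as an interview with one.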
After structured extraction, we ran comprehensive EDA across 1,250 analysis results:
The structured analysis is performed by an LLM (GPT-4o-mini), which introduces potential interpretation biases. Different models or prompts might yield different results.
The scientists group (n=51) is much smaller than workforce (n=1,065) or creatives (n=134). Percentages for scientists carry wider uncertainty and should be interpreted with caution.
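To make the sample-size caveat concrete, here is a minimal sketch that compares the uncertainty around a percentage across the three groups. It uses the Wilson score interval, which is our choice for illustration and not necessarily what the analysis itself used, and the 50% share is a hypothetical value, not a finding from the data. Only the group sizes come from the text above.

```python
import math


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for an observed proportion p_hat out of n."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    z2_over_n = z * z / n
    denom = 1.0 + z2_over_n
    center = (p_hat + z2_over_n / 2.0) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


# Group sizes from the limitations above; the 0.5 share is hypothetical.
GROUP_SIZES = {"scientists": 51, "creatives": 134, "workforce": 1065}
HYPOTHETICAL_SHARE = 0.5

interval_widths = {}
for group, n in GROUP_SIZES.items():
    low, high = wilson_interval(HYPOTHETICAL_SHARE, n)
    interval_widths[group] = high - low
    print(f"{group:<10} n={n:<5} interval: {low:.1%} to {high:.1%}")
```

At a 50% share the interval for scientists spans roughly 13 percentage points either side, against roughly 3 for the workforce group, so a gap of a few points between scientists and another group may not reflect a real difference.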
The "struggle score" is an exploratory composite index we created, not a validated psychological measure. It should be treated as directional, not definitive.
All data comes from self-reported interviews. Participants may not accurately represent their actual behaviors or feelings.
These interviews were conducted at a specific moment in AI development. Attitudes and behaviors may have changed since data collection.
We encourage independent verification of our findings.
The Anthropic Interviewer dataset is publicly available on HuggingFace under MIT license.
Our analysis scripts and Pydantic schema are available on request. Contact: research@playbookatlas.com
The complete JSON output of all 1,250 structured analyses is available on request for independent verification.