Data Analytics interview prep.
Senior data analytics coach for an AI/ML platform firm. A data analyst here lives at the intersection of SQL + experimentation + metric design + AI-product specifics. The bar is never 'can you write a query' but 'can you design a clean A/B with guardrails and power, diagnose a 5% drop in daily...
What interviewers look for
- Can the candidate write production-grade SQL fast - window functions, CTEs, funnel + cohort queries - without hand-waving syntax or muddling joins?
- Do they design A/B tests with rigor - hypothesis, MDE, power, guardrails, novelty + interference flags - not just 'we ran a 50/50 test'?
- Can they diagnose an ambiguous metric move (DAU down 5%, retention up 3%) by decomposing + segmenting + ruling out confounds in 15-20 minutes?
- Do they design AI-product metrics where standard funnel logic breaks - quality, eval, drift, token cost, abuse - not just generic engagement?
- Are they business-aware - tie analytics to product + revenue + cost decisions, frame the 'so what', communicate to non-technical exec?
- Do they show data discipline - question the data source, flag confounds, surface limitations, push back on bad questions?
Behavioural questions to expect
Walk me through your CV.
What it tests: Story coherence + genuine fit for an AI-product analytics seat. Teams want evidence of SQL depth, experimentation ownership, and curiosity about AI-product specifics - not pure BI dashboard work without rigor.
Tell me about the analysis or experiment you're proudest of.
What it tests: Depth + ownership + business impact. Tests whether the candidate frames question -> hypothesis -> method -> result -> decision, not 'I made a dashboard'.
Tell me about a weakness, a failure, or feedback you've received and worked on.
What it tests: Self-awareness + analytical honesty. Fake weaknesses downgrade immediately. Analytics mistakes (over-claimed a result, missed a confound, shipped a bad metric, wrong join logic) carry real decision cost.
Why data analytics at an AI firm vs data science or analytics engineering elsewhere?
What it tests: Authentic fit for the AI-product analytics seat: SQL + experimentation + metric design with the AI-specific layer (eval, quality, token cost) - vs pure DS modeling or pure pipeline engineering.
Which team or analytics area would you want to focus on - product, growth, ML, business analytics?
What it tests: Genuine fit + grasp of how analytics teams differ at an AI firm (product analytics drives feature decisions; growth drives acquisition + retention; ML analytics partners with model team; business analytics drives exec + finance).
Why this firm?
What it tests: Whether the candidate has done the homework. Bar: firm-specific evidence from the product, data stack, experimentation culture, and AI-product specifics - not generic 'cool AI firm'.
How would you describe this firm's data + analytics setup in your own words?
What it tests: Whether the candidate has internalized HOW the firm uses data - product, experimentation, eval - not just that it 'has data'. Tests whether they've read public material on the data stack + analytics practices.
How does analytics actually create value at an AI-product firm?
What it tests: Whether the candidate understands AI-product analytics economics: experimentation drives feature wins; metric design + monitoring catches quality + cost drifts that would otherwise erode retention; analytics partners with product + ML + GTM to drive decisions.
Technical concepts to master
SQL + warehouse fluency
- Window functions
- Functions computing over row group defined by PARTITION BY + ORDER BY; ROW_NUMBER, RANK, LAG, LEAD, SUM/AVG OVER, NTILE.
- CTEs + query structure
- WITH ... AS (...) clauses that break a query into readable named steps; chain into final SELECT.
- Joins + edge cases
- INNER / LEFT / FULL / CROSS joins; row explosion from many-to-many; NULL handling in equality; correlated subquery anti-patterns.
- Warehouse + dbt
- Modern stack: cloud warehouse + transformation layer (dbt) + BI tool; warehouse-specific SQL dialect differences in DATE_TRUNC, ARRAY / STRUCT handling, qualifiers.
Experimentation rigor
- Hypothesis + primary metric
- Pre-registered prediction with magnitude that would justify ship; one primary metric pre-declared; everything else is secondary or guardrail.
- Power + MDE + sample size
- Sample size determined by MDE (smallest effect worth detecting) at target power (80% standard) + alpha (5%); formula uses baseline + variance.
- Guardrails
- Secondary metrics that must NOT degrade; common guardrails: revenue, latency, retention, refusal rate, abuse signal.
- Validity threats - novelty, interference, contamination
- Novelty: lift fades 2-4 weeks. Interference (SUTVA violation): shared model / marketplace / social means treatment leaks into control. Contamination: eval set in training data.
Metric design + diagnosis
- North Star + input metrics
- One headline metric (usually capturing user value); decomposed into input metrics that teams can move (activation, engagement, retention).
- Leading vs lagging metrics
- Leading (activation, weekly active days) move first; lagging (retention, revenue) confirm later. Pair them.
- Metric decomposition
- DAU = new + retained + resurrected. Revenue = users x conversion x ACV. Decompose to find which sub-component moved.
- Confound elimination
- Before concluding a product change caused a metric move, rule out: instrumentation change, holiday / seasonality, competitor launch, infrastructure incident, pricing change.
AI-product analytics specifics
- Eval methodology
- Offline eval: curated task set with known answers + scoring rubric. Online eval: sampled production outputs scored by LLM-as-judge or human review.
- LLM-as-judge
- Use a strong model as automated grader against a rubric; scalable but biased + must be calibrated against human review.
- Quality drift + monitoring
- Quality changes over time from model updates, prompt changes, data shifts, retrieval index changes; monitor on stable eval set.
- Token economics + cost analytics
- Token cost per task / per active user; cost-quality tradeoff; gross margin per task.
Practical drills
- Walk me through the SQL to compute weekly retention cohorts for the last 12 weeks, segmented by signup channel, with a rolling 4-week active rate per cohort.
- Design the A/B test for this firm shipping a new model variant in the assistant. Walk me through hypothesis, metrics, power, validity threats, decision.
- this firm's daily active users dropped 5% week-over-week. Walk me through how you'd diagnose, in 15 minutes.
Smart-question anchors
- Data stack + tooling - warehouse, transformation, BI, experimentation platform
- Experimentation maturity - sample-size policy, guardrail framework, novelty + interference handling
- AI-product metrics + eval - quality measurement, LLM-as-judge, drift monitoring, abuse signals
- Cross-functional partnership - analyst + PM + ML + GTM working model
- Decision culture - data-driven vs intuition, how analytics influences strategy + roadmap
Related roles
Sourced from
Ready to Generate Your Own Prep?
Drop your CV and a job description on the home page. A couple of minutes later you get a report with everything you need to land the job.