A health-tech company focused on longevity and preventive care is looking for a Health Data Scientist with deep statistical expertise to join their analytics team. This is not a standard data science role. You will work at the intersection of clinical evidence, actuarial thinking, and population health — building risk models that directly shape how the company identifies at-risk members, routes clinical interventions, and evaluates whether prevention programs are actually changing health trajectories.
The questions you will work on are meaningful: Which members are most likely to develop a chronic condition in the next 12 months? Which biomarkers from a longitudinal study best predict hospitalization? Are interventions measurably improving health outcomes? If those questions genuinely excite you, this role was designed for you.
About You
You are a statistician at heart who happens to work with health data. You understand that predicting member risk is ultimately about preventing a bad health outcome — not just estimating a cost. You are comfortable with uncertainty, rigorous about methodology, and able to translate complex quantitative findings into clear, actionable insights for clinical and business audiences. You come from a background in statistics, biostatistics, epidemiology, actuarial science, mathematics, quantitative biology, or public health with a strong quantitative component.
What You'll Be Doing
Build and maintain clinical risk models that generate predictive scores for hospitalization, chronic disease onset, and health deterioration, each validated out of sample and monitored over time
Conduct survival analyses of health events to help the clinical team determine when and how to intervene with specific member populations
Design and execute statistical analyses that quantify the impact of prevention programs on health outcomes and claims frequency
Develop risk segmentation frameworks grounded in biomarkers, diagnosis patterns, and longitudinal behavior — going beyond simple business rules
Analyze claims patterns across risk cohorts, documenting frequency and severity differences by age band, condition, and risk profile
Collaborate with clinicians, actuaries, and a Data Scientist to ensure models are grounded in clinical evidence and operationally actionable
Deliver clear quantitative findings to clinical and business stakeholders through tables, charts, and written summaries
Document methodologies so analyses can be reproduced, audited, and built upon over time
What We're Looking For
Someone who is genuinely fascinated by what data reveals about human health — not just someone who can run models
A rigorous statistician who understands distributions, time-to-event outcomes, and how to quantify uncertainty in a clinical context
A strong communicator who can make complex statistical findings accessible to clinical teams and non-technical leadership
A collaborative professional comfortable working across disciplines — clinical, actuarial, and data engineering
Someone with a track record of building models that hold up out of sample and can be explained and defended
Technical Requirements
Must-Haves
Statistical modeling in Python or R: GLMs, survival models, mixed-effects models, and regularization techniques
Survival analysis applied to health or clinical data: Kaplan-Meier, Cox proportional hazards, and accelerated failure time models
Risk stratification or clinical segmentation using statistical or ML approaches
Biostatistics or epidemiological methods: incidence rates, relative risk, hazard ratios, and propensity score matching
Claims or health data analysis: frequency and severity patterns, cost drivers, and cohort differences
SQL for data extraction and preparation from health or operational databases
Ability to communicate quantitative findings clearly to clinical and business stakeholders
Nice-to-Haves
Experience with electronic health records (EHR), lab data, or clinical datasets
Familiarity with actuarial concepts such as loss ratios, pricing basics, and experience studies
Knowledge of causal inference methods: difference-in-differences, instrumental variables, or synthetic control for evaluating intervention effectiveness
Proficiency with R packages such as survival, lme4, or ggplot2, or Python equivalents including lifelines, statsmodels, or scikit-survival
Experience working with public health or insurance datasets in Mexico or Latin America