- We're looking for a Senior Data Engineer to own the data layer for the U.S. product team at a U.S.-based AI-powered workforce intelligence platform. You'll be the first data hire in the U.S. — building and operating the pipelines that power core product features, AI capabilities, and customer analytics, while serving as the in-timezone data responder for enterprise customers.
- You'll work alongside a Head of Product, a Senior Product Designer, and a Senior Full Stack Engineer as part of a small, high-ownership U.S. team, reporting to the Head of Data in Sydney. If you bring data science capability on top of engineering — analytics, model evaluation, experimentation — there is scope to take on that work as well.
- You have 5+ years in data engineering or backend systems with a strong data focus, and you've owned pipelines end-to-end in production environments.
- You've worked with ML or LLM pipelines, embeddings, or RAG systems — the product is AI-powered and you need to have built in this space.
- You're comfortable being the first and only data person in the room, making trade-offs, and handing off to a distributed team with complete context.
- You use AI coding tools (Cursor, Copilot, Claude Code) as a core part of how you build — this is a baseline expectation, not a bonus.
- You communicate clearly with engineers, product partners, and enterprise customers about data issues and trade-offs.
- Build and maintain data pipelines to ingest and process structured datasets — CSVs, spreadsheets, APIs, and event streams — with strong validation and error handling.
- Develop and operate pipelines supporting embeddings, RAG, and LLM-driven data workflows central to how the product functions.
- Serve as the U.S.-hours first responder for customer data issues — investigate, assess severity, and resolve directly or hand off to the Sydney team with full context.
- Build playbooks and observability tooling for data triage to compress response times and reduce cross-timezone back-and-forth.
- Partner with the Head of Product and engineering team to surface what the data can and can't support, and inform product direction.
- Deploy, monitor, and maintain data systems using Terraform and CI/CD pipelines in production.
- Help establish and consolidate AI-assisted development practices across the data stack.
- High ownership and comfort operating in ambiguous, fast-moving environments with a small team and no safety net.
- A proactive approach to customer issues — you investigate, document, and close the loop without waiting to be asked.
- Clear, direct communication across engineers, product, and customers.
- An AI-first mindset applied to how you build, not just what you build.
- Must-have:
- 5+ years in data engineering or backend systems with a strong data focus.
- Strong Python skills for pipelines, data processing, and scripting.
- Strong SQL and experience with relational databases (PostgreSQL, MySQL).
- Experience building ETL/ELT pipelines and event-driven processing (queues, async workers, pub/sub).
- Hands-on experience with ML or LLM pipelines, embeddings, or RAG systems.
- Infrastructure comfort: Terraform, CI/CD, and AWS services (S3, RDS, EKS).
- Nice to have:
- Data science capability: analytics, model evaluation, experimentation, or statistical analysis.
- Experience with structured ontology data or knowledge graphs.
- Prior startup or scale-up experience in a fast-moving environment.
- Customer-facing data work with enterprise clients.
- Experience working across time zones in a distributed team.
Skills
Python
PostgreSQL
Terraform