Machine Learning Engineer
Osmosis · San Francisco, CA, US
Job Description
About Osmosis
At Osmosis, we help companies use cutting-edge reinforcement learning techniques to fine-tune open-source language models that beat foundation models on performance, latency, and cost.
We’ve raised $7M in funding from Y Combinator, institutional investors including CRV and Audacious Ventures, and angel investors including Paul Graham (Y Combinator), Erik Bernhardsson (Modal Labs), Misha Laskin (Reflection AI), and Guillermo Rauch (Vercel).
About the Role
We're looking for a Machine Learning Engineer to contribute to high-performance distributed training infrastructure for RL at scale. You'll work directly with our founding team and design partners to push the boundaries of what's possible with post-training and continual learning systems.
This role requires expertise in RL algorithms, distributed training, and low-level optimization. You'll have exceptional agency to make impactful decisions while working in a fast-paced, customer-driven environment.
Responsibilities
You’ll contribute to work in areas like:
- Distributed Training Infrastructure: implement new RL algorithms and build scalable post-training pipelines
- Resource Management & Optimization: design infrastructure systems for efficient GPU utilization and dynamic resource allocation
- Customer-Facing Work: work directly with customers on production deployments and custom model development
Technology
- Backend: Python (FastAPI), Go
- Frontend: React, TypeScript, Next.js
- Cloud Infrastructure: AWS Fargate, Docker, Kubernetes, AWS SageMaker
- ML Frameworks: Verl / slime / Megatron-LM / SkyRL, PyTorch (FSDP experience is a plus), vLLM / SGLang
- Databases: DynamoDB, S3