Cloud Inference Engineer

Luminal · San Francisco, CA, US

$150k - $350k

On-site

Full-time

Mid

Job Description

Luminal (YC S25) builds an AI compiler and serving stack that makes models 10x faster and production-ready with one line.

Founding role, on-site in downtown SF. Ship low-latency, high-throughput model serving on Luminal Cloud.

Day-to-day responsibilities:

Deploy and tune models with optimizations such as KV caching, paged attention, and sequence packing
Conduct model performance reviews
Improve scheduler, batcher, autoscaling; profile latency, cost, utilization
Occasionally write kernels and, yes, do some tasteful shitposting

Senior Compiler Engineer

Luminal · San Francisco, CA, US · On-site · $200k - $350k

Compiler Engineer

Luminal · San Francisco, CA, US · On-site · $150k - $250k

Staff + Sr. Software Engineer, Cloud Inference Launch Engineering

Anthropic · San Francisco, CA · Hybrid

Staff + Sr. Software Engineer, Cloud Inference

Anthropic · San Francisco, CA · Hybrid

Technical Program Manager, Cloud Inference

Anthropic · San Francisco, CA | New York City, NY · Hybrid
