Distributed Systems Engineer

Dedalus Labs · San Francisco, CA, US

$180k - $280k

On-site

Full-time

Mid

Job Description

Please submit your application at: https://www.dedaluslabs.ai/careers

Only candidates who fill out the form at the link above will be considered.

Mission

Dedalus Labs is an AI research neolab building infrastructure for AI agents.

We’re building the persistent compute layer that powers the next generation of autonomous software. Our platform spans distributed storage, virtualization, orchestration, networking, scheduling, and runtime infrastructure for long-running AI agents.

We’re looking for engineers who enjoy designing systems that continue working long after individual machines fail.

You might be a fit if you

Think distributed systems are one of computer science’s most beautiful subjects.
Care deeply about consistency, fault tolerance, and system correctness.
Enjoy designing systems before writing them.
Think latency, throughput, and reliability are all product features.
Read systems papers because they’re genuinely interesting.
Have strong opinions about storage engines, consensus algorithms, scheduling, or distributed architecture.
Believe simple systems are usually harder to build than complicated ones.
Think every abstraction has a cost.
Measure before optimizing, then optimize relentlessly.
View infrastructure as a product for other engineers.
Are high agency and fiercely independent.
Say how things ought to be built, then build them.
Are a competitive teammate with a heart of gold.
Are hungry to learn, improve, and reflect deeply on feedback.
Go above and beyond in everything you do.

What you’ll build

Distributed infrastructure for large-scale AI agent workloads.
Persistent compute and distributed storage systems.
Scheduling and orchestration platforms.
Virtualization and sandboxing infrastructure.
Reliable multi-tenant cloud systems.
Internal developer platforms and infrastructure tooling.
Production systems operating under real-world scale, latency, and fault-tolerance constraints.

Representative Projects

You might find yourself working on problems like:

Designing distributed storage systems for persistent agent state.
Building scheduling infrastructure that efficiently allocates compute across thousands of concurrent agents.
Improving reliability and fault tolerance across distributed infrastructure.
Designing virtualization and isolation systems for secure multi-tenant execution.
Optimizing bottlenecks across networking, storage, scheduling, and runtime layers.
Building infrastructure that makes operating AI agents dramatically simpler for developers.

What we look for

Strong systems programming ability in Rust (preferred), Go, C/C++, or similar languages.
Deep understanding of distributed systems, operating systems, and concurrent programming.
Experience building distributed systems in industry, research, or open source.
Strong engineering judgment around performance, scalability, reliability, and debugging.
Experience with Kubernetes and modern cloud infrastructure.
Ability to design systems that remain reliable under production workloads.
High agency and excellent engineering judgment.

Nice-to-have

Experience with distributed storage systems.
Familiarity with consistency models, consensus algorithms, or replication protocols.
Experience with virtualization, hypervisors, containers, or Firecracker.
Kernel, operating systems, or low-level runtime experience.
Experience operating infrastructure at production scale.
Published systems research (NSDI, OSDI, SOSP, EuroSys, ATC, etc.).
Contributions to systems-focused open-source projects.
Experience with performance engineering and systems optimization.

Taste

You know the difference between a distributed system that works and one that continues working when everything goes wrong.

You care about elegant architecture, principled engineering tradeoffs, and building infrastructure that engineers trust.

Logistics

In person in San Francisco.
We sponsor visas.
Relocation support available.
Competitive salary and meaningful equity.
Meals and office benefits included.

Tips

The first thing we look at is your GitHub.

Show us distributed systems you’ve built. Open-source infrastructure. Research. Storage engines. Schedulers. Consensus implementations. Infrastructure you’ve operated in production. Technical writing.

We care far more about systems you’ve built than years on your résumé.
