Staff Software Engineer

MongoDB, Inc.


EMPLOYER: MongoDB, Inc. 

 

Job ID: 9670965

 

Salary Range: $270,000/yr. - $351,000/yr. 

 

TITLE: Staff Software Engineer

 

Job Description: Collaborate with AI researchers and engineers from the Search Platform and Voyage AI teams to productionize state-of-the-art embedding models and rerankers, enabling high-scale, low-latency inference for both real-time and batch workloads. Lead key projects focused on performance optimization, GPU utilization, autoscaling, and observability for the inference platform. Design and implement components of a multi-tenant inference platform, deeply integrated with Atlas Vector Search, to power semantic search, hybrid retrieval, and AI-native features for MongoDB customers at global scale. Build core platform capabilities, including model versioning, safe and automated deployment pipelines, latency-aware request routing, and model health monitoring, ensuring continuous delivery and system resilience. Make high-leverage architectural decisions and define long-term technical direction for the inference infrastructure, balancing performance, reliability, and developer ergonomics. Tools include vLLM, ONNX Runtime, and Kubernetes-based orchestration. Collaborate across engineering, infrastructure, ML, and product teams to define shared architectural patterns and operational best practices that support high availability and low-latency performance at scale. Influence strategic direction and planning, contributing to quarterly and annual roadmap development, evaluating trade-offs, and helping leadership balance short-term execution with long-term goals. Must appear in office 3 days per week; WFH permissible 2 days per week.

 

Requirements: Master’s degree or foreign degree equivalent in Computer Science or related field and 5 years of experience in ML inference serving and optimization, in the job offered, or in a related role.

 

Experience and/or education must include:  

 

  1. 5 years of experience designing and developing large-scale distributed systems in production, including microservices architectures supporting tens of thousands to millions of requests per second;

  2. Programming languages including Python, Go, and Java, with emphasis on developing high-performance, reliable, and maintainable systems, backend infrastructure, ML platforms, distributed systems, and systems-level optimization;

  3. 5 years of experience operating and managing Linux-based systems across cloud-native environments (AWS, GCP) including experience with infrastructure-as-code using Terraform, container orchestration with Kubernetes, multi-region deployment strategies, and high-availability service delivery at scale;

  4. 5 years of experience designing and maintaining high-throughput data ingestion pipelines using Kafka or Pub/Sub, capable of reliably processing tens of millions of events per day;

  5. 5 years of experience building and optimizing large-scale data infrastructure using BigQuery, Presto, or Spark, including expertise in distributed schema design, analytical query optimization, and cost-performance tradeoffs in production;

  6. 1 year of experience developing machine learning infrastructure, including distributed feature stores, real-time and batch feature retrieval systems, and scalable model serving platforms; and

  7. 1 year of hands-on experience with large language models (LLMs) using frameworks such as PyTorch and LlamaFactory, including fine-tuning and model deployment.

 

JOB SITE: 499 Hamilton Avenue, Palo Alto, CA 94301; Must appear in office 3 days per week; WFH permissible 2 days per week.

 

CONTACT: Please email resume to Apply-Careers@mongodb.com and reference Job ID 9670965.