
Staff Software Engineer - Distributed Data Systems

Databricks
📍 Bellevue, US 💰 $182K–$247K Staff 8+ yrs
Apache Spark · Delta Lake · Java · Scala · C++ · Hadoop
TL;DR

Staff Software Engineer at Databricks building distributed data storage and processing systems. Focus on Apache Spark, Delta Lake, and next-generation query optimization for the Runtime team.


Job description

P-988

At Databricks, we are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. Founded by engineers — and customer obsessed — we leap at every opportunity to solve technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.

Modern data analysis employs sophisticated methods such as machine learning that go well beyond the roll-up and drill-down capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will build the next generation of distributed data storage and processing systems, which can outperform specialized SQL query engines in relational query performance yet provide the expressiveness and programming abstractions to support diverse workloads ranging from ETL to data science.

Below are some example projects:

Apache Spark™: Develop the de facto open source standard framework for big data.

Data Plane Storage: Provide reliable, high-performance services and client libraries for storing and accessing very large amounts of data on cloud storage backends, e.g., AWS S3 and Azure Blob Store.

Delta Lake: A storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, drastically simplify real-world data engineering architectures.

Delta Pipelines: It's difficult to manage even a single data engineering pipeline. The goal of the Delta Pipelines project is to make it simple to orchestrate and operate tens of thousands of data pipelines. It provides a higher-level abstraction for expressing data pipelines, enables customers to deploy, test, and upgrade pipelines, and eliminates the operational burden of building and managing high-quality data pipelines.

Performance Engineering: Build the next-generation query optimizer and execution engine that's fast, tuning-free, scalable, and robust.

What we look for:

 

Pay Range Transparency

Databricks is committed to fair and equitable compensation practices. The pay range for this role is listed below and represents the expected salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to job-related skills, depth of experience, relevant certifications and training, and specific work location. Based on the factors above, Databricks anticipates utilizing the full width of the range. The total compensation package for this position may also include eligibility for an annual performance bonus, equity, and the benefits listed above. For more information regarding which range your location is in, visit our page here.

 

Local Pay Range
$182,400–$247,000 USD

About Databricks

Databricks is the data and AI company. More than 10,000 organizations worldwide (including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500) rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.

Benefits

At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, click here.

Our Commitment to Diversity and Inclusion

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Compliance

If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.
