
Engineering Manager, Observability

Robinhood ·
53
AI-Agency
B55 U50
📍 Toronto, CA · Manager
Kubernetes, Prometheus, Grafana
TL;DR

Engineering Manager, Observability at Robinhood leading a team building metrics, logging, tracing, and alerting infrastructure. Combines technical leadership with people management to ensure observability is integrated across all services and improves system reliability.

Apply at Robinhood →

Job description

Join us in building the future of finance.

Our mission is to democratize finance for all. An estimated $124 trillion of assets (https://www.cerulli.com/press-releases/cerulli-anticipates-124-trillion-in-wealth-will-transfer-through-2048) will be inherited by younger generations in the next two decades, the largest transfer of wealth in human history. If you’re ready to be at the epicenter of this historic cultural and financial shift, keep reading.

About the team + role

We are building an elite team, applying frontier technologies to the world’s biggest financial problems. We’re looking for bold thinkers. Sharp problem-solvers. Builders who are wired to make an impact. Robinhood isn’t a place for complacency; it’s where ambitious people do the best work of their careers. We’re a high-performing, fast-moving team with ethics at the center of everything we do. Expectations are high, and so are the rewards.

The Observability team at Robinhood is responsible for ensuring engineers across the company can understand, monitor, and improve the health of their systems. This team builds and maintains the foundational platforms that power metrics, logging, tracing, and alerting across all services. By creating reliable and scalable observability systems, the team enables engineers to quickly detect issues, investigate root causes, and maintain system reliability. The work directly supports product stability and developer efficiency across Robinhood’s infrastructure.

As an Engineering Manager, Observability, you will lead a team of engineers responsible for building and evolving these critical systems. You will guide technical direction across observability tooling and infrastructure, including metrics pipelines, logging systems, and tracing frameworks. You will partner closely with engineering teams to ensure observability is seamlessly integrated into every service, while also improving system performance, reliability, and cost efficiency. This role combines technical leadership with people management, helping your team deliver scalable solutions while growing their skills and impact.

This role is based in our Toronto office, with in-person attendance expected at least 4 days per week.

At Robinhood, we believe in the power of in-person work to accelerate progress, spark innovation, and strengthen community. Our office experience is intentional, energizing, and designed to fully support high-performing teams.

What you’ll do

- Lead and support a team of engineers building observability systems, including metrics, logging, tracing, and alerting infrastructure
- Guide the design and improvement of data ingestion pipelines and storage systems for high-volume telemetry data
- Work with product and infrastructure teams to ensure observability is integrated into new and existing services
- Improve system reliability by reducing time to detect and resolve production issues
- Manage team priorities, support career development, and ensure consistent delivery of high-quality engineering work

What you bring

- Experience managing engineering teams, including hiring, mentoring, and performance management
- Background in infrastructure, reliability, or observability systems such as metrics, logging, or distributed tracing
- Experience working with technologies such as Kubernetes, distributed systems, and cloud-based platforms
- Familiarity with tools such as Prometheus, Grafana, or similar monitoring and observability solutions
- Ability to understand complex systems and guide technical decisions that improve system performance and reliability

Leadership expectations

Our ambitious roadmap requires a great culture shaped by exceptional leaders. Here’s what we expect from them:

- Drive high performance by setting clear, focused goals, giving real-time feedback, stretching top talent, and scaling impact through focus, innovation, and tech.
- Hire and retain top talent by setting a high bar, hiring only those who raise it, investing in onboarding, and addressing talent issues quickly and fairly.
- Create community by connecting work to purpose, removing friction while prioritizing safety, building trust and inclusion, and leading from the front with integrity.

What we offer

- Challenging, high-impact work to grow your career
- Performance-driven compensation with multipliers for outsized impact, bonus programs, and equity ownership
- Top-tier benefits to fuel your work, including supplemental health insurance, ancillary insurance, and mental health support programs
- Lifestyle wallet: a highly flexible employer-paid benefits spending account for expenses beyond traditional benefits, such as wellness, childcare, learning, and more
- Time off to recharge, including company holidays, paid time off, sick time, paid volunteer time off, parental leave, and more
- Exceptional office experience with catered meals, events, and comfortable workspaces
- Monthly commuter stipend to help offset in-office commuting costs

Our team is committed to providing an inclusive and welcoming interview experience for all candidates. If you require a specific accommodation during the application or interview process due to a physical or mental condition, please complete this Applicant Accommodation Form (https://robinhood.hracuity.net/webform/index/752875e4-42ba-4dff-9951-fd51e315997e) to notify our team. The form should only be completed if you need a specific accommodation.

AI Usage Disclosure: Robinhood uses artificial intelligence (AI) tools to support parts of our recruiting process. These tools enhance the efficiency and consistency of our hiring process; however, all hiring decisions are made by our hiring teams.

Vacancy Notice: This job posting represents an existing vacancy that we are actively seeking to fill.

In addition to the base pay range listed below, this role is also eligible for bonus opportunities + equity + benefits.

Base pay for the successful applicant will depend on a variety of job-related factors, which may include education, training, experience, location, business needs, or market demands. The expected base pay range for this role is based on the location where the work will be performed.

Base Pay Range:
Toronto, ON: $200,000 to $235,000 CAD

Learn more about our Total Rewards, which vary by region and entity, at https://careers.robinhood.com/benefits.

If our mission energizes you and you’re ready to build the future of finance, we look forward to seeing your application.

Robinhood provides equal opportunity for all applicants, offers reasonable accommodations upon request, and complies with applicable equal employment and privacy laws. Inclusion is built into how we hire and work, welcoming different backgrounds, perspectives, and experiences so everyone can do their best. Please review the Privacy Policy (https://careers.robinhood.com/applicantprivacypolicy) for your country of application.

More open roles at Robinhood

Robinhood ·
Staff Machine Learning Engineer, Agentic
📍 Bellevue, US 🛠 AI tools welcome · Staff
Staff Machine Learning Engineer at Robinhood building agentic AI systems. Defines evaluation frameworks, guides model selection, and ensures agent systems meet standards for correctness, safety, latency, and user satisfaction across the organization.
LLM, agentic systems, evaluation frameworks
83
AI-core
Robinhood ·
Senior Machine Learning Engineer, Agentic
📍 Menlo Park, US 🛠 AI tools welcome · Senior
Senior Machine Learning Engineer at Robinhood building production AI agents for financial products. Focus on evaluation harnesses, optimization pipelines (DPO, PPO), and deploying fine-tuned models with robust monitoring.
Python, LLM, DPO, PPO, agentic systems
83
AI-core
Robinhood ·
Staff Product Manager, Cortex
📍 Menlo Park, US 🛠 AI tools welcome · Staff
Staff Product Manager for Cortex, Robinhood's consumer AI assistant. Lead product strategy from research tool to AI-powered financial planning, owning roadmap, technical direction, and evaluation frameworks across investing, spending, and saving.
LLM, RAG, agentic systems
79
AI-core
Robinhood ·
Senior Security Engineer, AI Vulnerability Management
📍 Toronto, CA 🛠 AI tools welcome · Senior
Senior Security Engineer at Robinhood building AI-driven vulnerability management systems. Focus on architecting AI agents for automated triage, remediation, and risk prioritization at scale.
Go, Python, AWS, Kubernetes, LangChain, Snyk
78
AI-core
Robinhood ·
Senior Security Engineer, AI Vulnerability Management
📍 Menlo Park, US 🛠 AI tools welcome · Senior
Senior Security Engineer at Robinhood building AI-driven vulnerability management systems. Focus on architecting agentic AI for automated security triage, remediation, and risk-based prioritization across Kubernetes and AWS infrastructure.
Go, Python, LangChain, AWS, Kubernetes, Snyk
78
AI-core