Reliability Engineer jobs in Oakland, CA

A Reliability Engineer analyzes and evaluates the reliability of products, equipment, components, and processes using engineering methodologies and tools. Develops the methods and measures used for reliability analysis based on product specifications, tolerances, or operating standards. Uses analysis techniques such as FMEA, fault tree analysis, and root cause analysis to identify problems. Oversees testing activities and reviews results. Creates risk-based failure mitigation plans. Proposes new or revised product designs, manufacturing processes, and testing specifications that apply best practices and increase reliability. Requires a bachelor's degree in engineering. Typically reports to an engineering manager. The Reliability Engineer's work is closely managed. Works on projects/matters of limited complexity in a support role. Typically requires 0-2 years of related experience. (Copyright 2024 Salary.com)

Engineer, Service Reliability
  • LaunchDarkly
  • Oakland, CA FULL_TIME
  • About the Job: 

    Software powers the world, and LaunchDarkly empowers all teams to deliver and control the best software. We serve trillions of feature flags daily to help teams ship better software faster and eliminate risk for companies big and small.

    We're based in downtown Oakland and growing quickly. You'll help us tackle some of the most challenging engineering problems around, like delivering feature flags to hundreds of millions of users worldwide in milliseconds.

    In this role, you'll monitor our core systems and tools, respond to and mitigate incidents quickly, and define and drive opportunities to make our core services more resilient. You will also identify and develop force-multiplying capabilities for our internal engineering teams, helping our engineers become more effective at shipping robust code and thinking about reliable design earlier in the lifecycle.

    Our core daily technologies include AWS, Golang, CockroachDB, ElasticSearch, Redis, Flink, Kinesis, and Terraform.

    Responsibilities:

    • Lead the development and continuous refinement of SRE tools and processes to improve software delivery, observability, reliability and operational efficiency. You will step beyond your team’s boundary when proactively improving our overall service health.

    • Enable our engineers to deliver their services with higher autonomy, reliability, and performance through offerings written in Go and Terraform, or delivered through existing tools.

    • Design and implement disaster recovery and business continuity plans and conduct regular testing and exercises.

    • Act as a primary technical liaison between SRE and neighboring teams.

    • Mentor and support team members in their professional development and in service reliability engineering principles, tools, and processes.

    • Drive the adoption of new technologies, system designs and best practices in code health, testing, and maintainability across teams.

    • Proactively identify and resolve potential performance and scalability bottlenecks in our front-end and back-end systems.

    • Analyze the performance of SQL queries, suggest improvements and build guardrails for teams.

    Qualifications:

    • Experience leading team ceremonies: project ideation, planning, grooming, retrospectives, etc. You will also drive alignment on decisions with cross-team impact, identify areas of misalignment across the team, and bring stakeholders together to realign.

    • Comfort with large-scale, highly available distributed systems

    • Comfort with server-side web development (e.g., in Java / Scala, Ruby, Python, Golang, Node.js) and Infrastructure-as-Code (e.g., Terraform).

    • Experience guiding the architectural direction and scalability considerations for new projects.

    • A strong customer focus and ability to make technical decisions that tie back to business goals.

    • Experience working with a major cloud provider (AWS, Azure, or GCP)

    • Experience with observability tooling (Datadog/Honeycomb/etc.)

    • Experience with RDBMS technologies (CockroachDB/PostgreSQL/etc.)

    • Strong communication skills, a positive attitude, and a high degree of empathy

    • You have a high bar for quality of code and quality of user experience

    Pay:

    Target pay ranges based on Geographic Zones* for Levels P4-P5:

    • Zone 1: San Francisco/Bay Area or New York City Metropolitan Area (if not a Bay Area-specific role): $183,600 - $235,000**

    • Zone 2: Boston, DC, Irvine, LA, Monterey, Santa Barbara, Santa Rosa, Seattle: $165,600 - $212,000**

    • Zone 3: All other US locations: $156,510 - $200,000**

    *Restricted Stock Units (RSUs), health, vision, and dental insurance, and mental health benefits in addition to salary.

    LaunchDarkly operates from a place of high trust and transparency; we are happy to state the pay range for our open roles to best align with your needs. Exact compensation may vary based on skills, experience, degree level, and location.

    About LaunchDarkly:

    Modern software delivery was supposed to be the foundation for a thriving digital business, but reality has proven otherwise. Slow, inefficient development cycles, costly outages, and fragmented customer experiences are preventing developers from building their best software. The LaunchDarkly platform helps developers innovate on new features faster while protecting them with a safety valve to instantly rewind when things go wrong. Developers can target product experiences to any customer segment and maximize the business impact of every feature. And by gradually rolling out new application components, they escape nightmare "big-bang" technology migrations.

    The LaunchDarkly platform was built to guide engineers to the next frontier of DevOps by:

    • Improving the velocity and stability of software releases, without the fear of end customer outages
    • Delivering targeted experiences by easily personalizing features to customer cohorts
    • Maximizing the business impact of every feature through the ability to experiment and optimize
    • Coordinating the release and optimization of software to provide consistent experiences across mobile platforms and device types
    • Improving the effectiveness and productivity of engineering teams, by providing insights into engineering cadence and stability

    At LaunchDarkly, we believe in the power of teams. We're building a team that is humble, open, collaborative, respectful and kind. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, veteran status, or disability status.

    One of our company values is 'Widen the Circle', which means we seek out diversity of perspectives to get better results. We understand everyone has their own unique talents and experiences. We encourage you to apply to this role even if you don’t think you meet 100% of the qualifications outlined above. We can find out together if it's the right match for your skillset.

    Do you need a disability accommodation?

    Fill out this accommodations request form and someone from our People Operations team will contact you for assistance. 

  • Just Posted

Site Reliability Engineer
  • Laiba Technologies LLC
  • San Leandro, CA FULL_TIME
  • Job Details. Position: Site Reliability Engineer. Location: San Leandro, CA (Onsite role). 5-10 years of experience in Production support/SRE teams with continued focus on improving Platform health Experien...
  • 4 Days Ago

Site Reliability Architect/Engineer
  • Ampcus Inc
  • Oakland, CA FULL_TIME
  • Job Details. Title: Principal Data Solutions Architect / Site Reliability / Data Engineer / Devops. Location: Oakland, CA 94612. Hybrid - once in a month onsite. Expected Duration: 05/06/2024 to 05/05/2026 F...
  • 6 Days Ago

Site Reliability Engineer (SRE)
  • Amiseq Inc.
  • Oakland, CA FULL_TIME
  • Job Details. Site Reliability Engineer (SRE). Oakland, CA - Hybrid. Full-time role. This position is hybrid, working from your remote office and your assigned location based on business need. Job Responsibiliti...
  • 6 Days Ago

Site Reliability Engineer
  • NTT DATA Americas, Inc
  • San Leandro, CA FULL_TIME
  • Job Details. Company Overview: Req ID: 278807. NTT DATA Services strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptab...
  • 7 Days Ago

Senior/Staff Reliability Engineer
  • Pyka
  • Oakland, CA FULL_TIME
  • Pyka is looking for an experienced reliability engineer to oversee the testing and validation of the aircraft’s subsystems. You will oversee all aspects of hardware test/validation. On a day-to-day ba...
  • 26 Days Ago


0 Reliability Engineer jobs found in Oakland, CA area

Site Reliability Engineer Manager
  • Illuminate Literacy
  • San Francisco, CA
  • As the Site Reliability Engineer at Illuminate Literacy, you will serve a critical role in our mission to eradicate illi...
  • 4/24/2024

Site Reliability Engineer - Solr
  • Apple Inc.
  • Cupertino, CA
  • The Apple Service Engineering - Solr SRE team is looking for Site Reliability Engineers with experience in developing pr...
  • 4/24/2024

Reliability Engineer
  • Wipro
  • Cupertino, CA
  • Reliability Engineer. Austin, TX or Cupertino, CA/Remote ok for locals. Permanent Role. Job Summary: A hardware reliability...
  • 4/23/2024

Director, Database Reliability Engineering
  • SHEIN Technology LLC
  • San Francisco, CA
  • Job Title: Director, Database Reliability Engineering Reports to: Senior Director, Production Engineering Job Location: ...
  • 4/23/2024

Site Reliability Engineer
  • Zoox
  • San Mateo, CA
  • Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the ...
  • 4/23/2024

Infrastructure and Site Reliability Engineer
  • Observable
  • San Francisco, CA
  • Observable is redefining how businesses create and share data apps by giving developers the tools they need to create th...
  • 4/23/2024

Site Reliability Engineer
  • Compunnel Inc.
  • San Leandro, CA
  • Position: Site Reliability Engineer Location: San Leandro, CA or Charlotte, NC (Hybrid) Duration: Contract Locals to San...
  • 4/22/2024

Site Reliability Engineer 1
  • Juniper Networks
  • Cupertino, CA
  • Job Description Juniper is seeking a full-time SRE to join our talented team and support high quality technology solutio...
  • 4/22/2024

Oakland is in the eastern region of the San Francisco Bay. In 1991 the City Hall tower was at 37°48′19″N 122°16′21″W / 37.805302°N 122.272539°W / 37.805302; -122.272539 (NAD83). (The building still exists, but like the rest of the Bay Area, it has shifted northwest perhaps 0.6 meters in the last twenty years.) The United States Census Bureau says the city's total area is 78.0 square miles (202 km²), including 55.8 square miles (145 km²) of land and 22.2 square miles (57 km²) (28.48 percent) of water. Oakland's highest point is near Grizzly Peak Blvd, east of Berkeley, just over 1,760 feet (...
Source: Wikipedia (as of 04/11/2019). Read more from Wikipedia
Income Estimation for Reliability Engineer jobs
$92,626 to $109,764
Oakland, California area prices were up 4.5% from a year ago

Reliability Engineer in Appleton, WI
Michael Kehoe, staff site reliability engineer for LinkedIn, describes the person best suited for the SRE role as someone who has a little bit of knowledge about everything.
December 29, 2019
Reliability Engineer in Dallas, TX
The main responsibility of an electrical reliability engineer is to test an electrical component's reliability over a period of time and in various conditions.
December 23, 2019
Reliability Engineer in Allentown, PA
So, the same engineers that are building applications are also the ones that are running them, scaling them, and dealing with incidents.
December 04, 2019