Database Reliability Engineer / Senior Database Reliability Engineer, Reliability
Remote, Americas
The GitLab DevOps platform empowers 100,000+ organizations to deliver software faster and more efficiently. We are one of the world’s largest all-remote companies with 2,000+ team members and values that guide a culture where people embrace the belief that everyone can contribute.
GitLab.com is a unique site and it brings unique challenges: it’s the biggest GitLab instance in existence; in fact, it’s one of the largest single-tenancy open-source SaaS sites on the internet. The experience of our team feeds back into other engineering groups within the company, as well as to GitLab customers running self-managed installations.
As a DBRE you will:
- Work on database reliability and performance for GitLab.com from within the SRE team, and ship solutions with the product.
- Analyze solutions and implement best practices for our main PostgreSQL database cluster and its components.
- Work on observability of relevant database metrics and make sure we reach our database objectives.
- Work with peer SREs to roll out changes to our production environment and help mitigate database-related production incidents.
- Provide on-call support in rotation with the team.
- Provide database expertise to engineering teams (for example through reviews of database migrations, queries and performance optimizations).
- Work on automation of database infrastructure and help engineering succeed by providing self-service tools.
- Use the GitLab product to run GitLab.com as a first resort and improve the product as much as possible.
- Plan the growth of GitLab's database infrastructure.
- Design, build and maintain core database infrastructure pieces that allow GitLab to scale to support hundreds of thousands of concurrent users.
- Support and debug database production issues across services and levels of the stack.
- Make monitoring and alerting trigger on symptoms rather than on outages.
- Document every action so your learnings turn into repeatable actions and then into automation.
You may be a fit for this role if you:
- Have at least 5 years of experience running PostgreSQL in large production environments
- Have at least 2 years of experience with infrastructure automation and configuration management (Chef, Ansible, Puppet, Terraform…)
- Have experience with Ruby on Rails, Django, other Ruby and/or Python web frameworks, or Go
- Have solid understanding of SQL and PL/pgSQL
- Have solid understanding of the internals of PostgreSQL
- Have experience working in a distributed production environment
- Share our values, and work in accordance with those values.
- Have excellent written and verbal English communication skills
- Have an urge to collaborate and communicate asynchronously.
- Have an urge to document all the things so you don't need to learn the same thing twice.
- Have a proactive, go-for-it attitude. When you see something broken, you can't help but fix it.
- Have an urge for delivering quickly and iterating fast.
- Know your way around Linux and the Unix Shell.
- Have the ability to orchestrate and automate complex administrative tasks, with knowledge of configuration management systems such as Chef (the one we use).
- Have a passion for stable and secure systems management practices.
- Have strong data modeling and data structure design skills.
Projects you could work on:
- Review, analyze and implement solutions regarding database administration (e.g., backups, performance tuning)
- Work with Terraform, Chef and other tools to build mature automation (for example, automatic setup of new replicas, or testing and monitoring of backups).
- Implement self-service tools for our engineers using GitLab ChatOps.
- Provide technical assistance and support to other teams on database and database-related application design methodologies, system resources, and application tuning.
- Review database-related changes from engineering teams (e.g., database migrations).
- Recommend query and schema changes to optimize the performance of database queries.
- Jump on a production incident to mitigate database-related issues on GitLab.com.
- Participate actively in infrastructure design and scalability discussions, focusing on data storage.
- Make sure we know how to take the next step to scale the database.
- Design and develop specifications for future database requirements including enhancements, upgrades, and capacity planning; evaluate alternatives; and make appropriate recommendations.
Database Reliability Engineers have the following job-family performance indicators:
- GitLab.com Availability
- GitLab.com Performance
- Apdex and Error SLO per Service
- Mean Time to Detection
- Mean Time to Resolution
- Mean Time Between Failure
- Mean Time to Production
- Disaster Recovery Time to Recovery
Please view the compensation range for this role at the bottom of the position description.
The base salary range for this role’s listed level is currently $100,800 - $183,600 for Colorado residents and $100,800 - $205,200 for New York and New Jersey residents only. Grade level and salary ranges are determined through interviews and a review of education, experience, knowledge, skills, abilities of the applicant, equity with other team members, and alignment with market data. See more information on our benefits and equity. Sales roles are also eligible for incentive pay targeted at up to 100% of the offered base salary.
Country Hiring Guidelines: GitLab hires new team members in countries around the world. All of our roles are remote, however some roles may carry specific location-based eligibility requirements. Our Talent Acquisition team can help answer any questions about location after starting the recruiting process.
GitLab is proud to be an equal opportunity workplace and is an affirmative action employer. GitLab’s policies and practices relating to recruitment, employment, career development and advancement, promotion, and retirement are based solely on merit, regardless of race, color, religion, ancestry, sex (including pregnancy, lactation, sexual orientation, gender identity, or gender expression), national origin, age, citizenship, marital status, mental or physical disability, genetic information (including family medical history), discharge status from the military, protected veteran status (which includes disabled veterans, recently separated veterans, active duty wartime or campaign badge veterans, and Armed Forces service medal veterans), or any other basis protected by law. GitLab will not tolerate discrimination or harassment based on any of these characteristics. See also GitLab’s EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know during the recruiting process.
Vacancy page: https://boards.greenhouse.io/gitlab/jobs/4783681002