Senior Data Engineer United States

Company: GitHub

The Data Engineering team at GitHub is looking for a savvy Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure that our data delivery architecture stays consistent across ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives. If you have a passion for data and GitHub, we'd love to talk to you.

Responsibilities:

  • Implement secure and auditable data infrastructure as required.
  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
  • Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Participate in the on-call rotation.

Qualifications:

  • You have advanced working knowledge of SQL, including query authoring, and experience with relational databases, as well as working familiarity with a variety of other databases.
  • You have experience developing on Git and GitHub.
  • You have experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
  • You have experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • You have strong analytic skills related to working with unstructured datasets.
  • You can build processes supporting data transformation, data structures, metadata, dependency and workload management.
  • You have a successful history of manipulating, processing and extracting value from large disconnected datasets.
  • You have a working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • You have strong project management and organizational skills.
  • You have experience supporting and working with cross-functional teams in a dynamic environment.
  • We are looking for a candidate with 8+ years of experience in a Data Engineer role, who is educated in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
    • Experience with big data tools: Hadoop, Spark, Presto, etc.
    • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
    • Experience with data pipeline and workflow management tools: Airflow etc.
    • Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
    • Experience with stream-processing systems: Storm, Spark-Streaming, etc.
    • Experience with object-oriented/functional scripting languages: Python, Java and the JVM.
    • Experience with authentication/authorization protocols like OAuth2, LDAP, Kerberos, etc.

Who We Are:

GitHub is the best place to share code with friends, co-workers, classmates, and complete strangers. Over 27 million people use GitHub to build amazing things together across 79 million repositories. With the collaborative features of GitHub.com and GitHub Business, it has never been easier for individuals and teams to write faster, better code.

What We Value:

Collaboration: We believe the best work is done together. 
Empathy: We believe in putting people first. 
Quality: We believe in setting the standard for excellence. 
Positive Impact: We believe in making the world a better place through our work. 
Shipping: We believe in creating things for the people using them.

Why You Should Join:

At GitHub, we constantly strive to create an environment that allows our employees (Hubbers) to do the best work of their lives. We've designed one of the coolest workspaces in San Francisco (HQ), where over half of our Hubbers work, snack, and create daily. The other half of our Hubbers work remotely in 18 countries across the globe. Here is a complete list of where we can hire!

We are also committed to keeping Hubbers healthy, motivated, focused and creative. We've designed our top-notch benefits program with these goals in mind. In a nutshell, we've built a place where we truly love working, and we think you will too.

GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!

Where We Can Hire

Please note that benefits vary by country. If you have any questions, please don't hesitate to ask your Talent Partner.

 

Vacancy page: https://boards.greenhouse.io/github/jobs/1453811