Upcoming role

This role is not yet open for application. If you would like to learn more, or if you'd like to be notified when the application is open, please join our mailing list.

Centers of Excellence - Site Reliability/DevOps Engineer

Centers of Excellence (CoE) will soon be accepting applications for a GS-15 Site Reliability/DevOps Engineer.

Applications will be open for submission on TBD. Check out Join TTS Hiring Process to learn more about the application process.

Location: Washington, DC

Salary Range: The base salary range for this position is GS-15 Step 1 ($137,849) to GS-15 Step 10 ($166,500).

The base salary range does not include any adjustment for locality. Your locality will be determined by where you live since most of our positions are remote. If the position isn’t remote, then your locality will be determined by the location of the office where the position is based.

You can find more information about this in the compensation and benefits section on our site.

For specific details on locality pay, please visit OPM’s Salaries & Wages page. For a salary calculator, see OPM’s 2019 General Schedule (GS) Salary Calculator.

Please note that the maximum salary available in the GS pay system is $166,500. Note: You may not be eligible for the maximum salary, as it is locality dependent. Please refer to the maximum pay for your locality.

Who May Apply: All United States citizens and nationals (residents of American Samoa and Swains Island). Applicants must not be GSA employees or contractors.

Role Summary:

Site Reliability/DevOps Engineer - GS-15

We are looking for a Site Reliability/DevOps Engineer to join the IT Modernization Centers of Excellence (CoE) to help develop the capabilities and services of our partner agencies.

In collaboration with agency stakeholders, you will be providing engineering and technical leadership for the design, planning, development, and delivery of critical applications on cloud-native microservices architectures that will drive optimizations and improvements in the security, agility, responsiveness, and capital efficiency of their DevSecOps environment.

You will need a thorough knowledge of cloud, CI/CD, containerization, version control systems, configuration management tools, infrastructure as code, and effective monitoring to ensure that the agency’s DevSecOps architecture meets their current and future needs in an efficient, sustainable, agile, and adaptable manner.

Key Objectives

Key objective #1: Operate CoE and agency DevOps environments with high standards of performance and reliability:

  • Define key success metrics for CoE and agency DevOps infrastructure and drive improvement toward those measures
  • Create and improve monitoring systems to collect data about the application, notify on any errors, and improve visibility/observability into application behavior
  • Assist application teams in deploying code to the application regularly and as automatically as possible
  • Lead incident response and mitigate site errors as they occur
  • Lead postmortem discussions and drive continuous improvement to prevent similar outages
  • Participate in on-call shifts, serving as first-line support for incidents. Drive page frequency as low as possible. (We currently page ~1-2 times per month.)

Key objective #2: Build CoE and agency DevOps infrastructure using modern cloud infrastructure techniques:

  • Deploy and maintain CI/CD pipelines across multiple CoE and agency DevOps environments
  • Deploy and maintain critical applications on cloud-native microservices architecture
  • Use infrastructure-as-code (currently, but not limited to, Terraform) and configuration management (currently, but not limited to, Chef) to automate CoE and agency DevOps AWS, Azure, and/or Google Cloud infrastructure
  • Review code and consult with other engineers on new features and their implications for the performance, reliability, and security of CoE and agency DevOps environments
  • Conduct load tests to ensure the application is ready to handle projected user traffic
  • Improve automation and fault tolerance of the deployment process
  • Drive long-term improvement in CoE and agency DevOps system availability by removing single points of failure

Key objective #3: Practice an exceptional level of customer service with all partners, providing a unique, tailored experience.

  • Explain products or services to people who have varying levels of technical knowledge; always meet the agency partner where they’re at
  • Empathetically guide our agency partners through the bureaucracy of the sometimes long and arduous compliance and security processes
  • Skillfully map specific inquiries to product capabilities, identifying the product that best meets the agency partner’s needs
  • Serve as a liaison between the stakeholders and the project teams, delivering feedback to the team and enabling them to make necessary changes to product performance or presentation
  • Support a safe, inclusive workplace and a positive team culture where all team members value diversity and individual differences

Application Evaluation

The information in this section outlines the criteria that your application will be evaluated against to determine if you meet the Qualifications for the position. There are two very important things to note about this step in the process:

  1. Only applicants found “minimally qualified” are shared with the hiring manager, and only those candidates are eligible to be interviewed.
  2. The Minimum Qualification determination can only be made using the information that’s directly within your resume and directly associated with your listed work experience.
     Examples of stuff that can’t be used:
    • Links to portfolios or other external materials (Yes, the links themselves may be “directly” on the resume, but the information is not).
    • Information you include in cover letters, responses to questions, etc., as these are not directly associated with your work experience
    • Lists of tools, technologies, programming languages, etc. that are listed separately from your work experience

The Qualification process is a bureaucratic requirement that we are stuck with. It’s best to think about it as the most intense and rigorous resume review you’ve ever heard of. To get through this process, you need to make sure your resume directly reflects the Qualifications listed below. We also have more guidance on creating a federal-style resume on Join TTS Hiring Process.

Qualifications

All applications will be reviewed by a panel of subject matter experts against a scoring rubric created for this role. So that we can properly evaluate your previous experience, we recommend being as detailed as possible in your resume and following our general guidance on creating a federal-style resume.

To qualify for this role, you must have one year of specialized experience equivalent to the GS-14 level in the Federal service. Specialized experience is:

  1. Experience being a part of a team to deliver digital products or services. This experience must include ALL of the following:
    • Providing technical support or product development for clients
    • Delivering tools or products with high uptime or availability requirements (e.g., SLAs of 99.9%+)
    • Using Site Reliability Engineering or DevOps practices in a production environment

  2. Experience providing technical expertise on projects or initiatives to deliver digital products or services. This experience must include ONE of the following:
    • Conducting technology evaluations
    • Making architectural decisions
    • Developing new software features by writing code
    • Reducing technical debt
    • Leading incident response

  3. Experience deploying, operating, maintaining, or running a cloud infrastructure or platform. This experience must include TWO of the following:
    • Using a cloud computing platform
    • Using cloud computing infrastructure
    • Using continuous integration or continuous deployment tools
    • Using infrastructure automation tooling
    • Using infrastructure monitoring tooling

Qualification determinations cannot be made when resumes do not include the required information, so failure to provide this information may result in disqualification.

For each job on your resume, provide:

  • the exact dates you held each job (from month/year to month/year or “present”)
  • number of hours per week you worked (if part time)

How To Apply

If you would like to learn more, or if you’d like to be notified when the application is open, please join our mailing list.