System Analyst II - Site Reliability Engineer

Company : Duke University
Location : Durham, NC, 27701
Posted Date : 10 October 2025
Job Type : Other
Category : IT Operations & Helpdesk
Occupation : System Analyst
Job Details
At Duke Health, we are driven by a commitment to compassionate care that changes the lives of patients, their loved ones, and the greater community. No matter where your talents lie, join us and discover how we can advance health together.
Occupational Summary: The DHTS Systems Analyst-Site Reliability Engineer (SRE) is responsible for designing, implementing, and maintaining large-scale distributed systems with a focus on reliability, scalability, and performance. The SRE collaborates with development teams to ensure that applications and services are designed and operated to meet reliability targets and scale efficiently. This role involves working with OpenShift for on-premises environments and Azure Kubernetes Service (AKS) for cloud-based solutions.
Essential Tasks/Responsibilities:
- Level 1 (DHTS System Analyst 1): Assist in monitoring and maintaining production systems to ensure high availability and performance, including OpenShift clusters on-premises and AKS in the cloud. Participate in on-call rotations to respond to system alerts and incidents. Assist in troubleshooting and resolving system issues and outages across both on-premises and cloud environments. Help implement and maintain automation scripts for routine tasks and deployments in OpenShift and AKS. Contribute to the creation and maintenance of documentation for systems and processes. Assist in capacity planning and performance tuning of systems in both OpenShift and AKS environments. Participate in post-incident reviews and help implement recommendations. Learn and apply SRE best practices and methodologies specific to container orchestration platforms. Collaborate with development teams to improve system reliability and efficiency across on-premises and cloud infrastructures.
- Level 2 (DHTS System Analyst 2): In addition to the duties described for Level 1, the Level 2 SRE will: independently design and implement monitoring solutions for complex systems in OpenShift and AKS environments. Lead incident response efforts and coordinate with multiple teams during outages, considering the nuances of both on-premises and cloud infrastructures. Develop and implement automation solutions to improve system reliability and efficiency across OpenShift and AKS platforms. Conduct thorough root cause analysis for incidents and propose long-term solutions that align with the organization's hybrid infrastructure strategy. Contribute to the design and implementation of disaster recovery and business continuity plans, leveraging both on-premises and cloud resources. Mentor junior team members and provide technical guidance on OpenShift and AKS best practices. Participate in the evaluation and implementation of new technologies and tools that complement OpenShift and AKS environments. Collaborate with development teams to define and implement SLIs, SLOs, and SLAs across both platforms. Contribute to the development of architectural improvements to enhance system reliability and scalability in a hybrid infrastructure model.
- Level 3 (DHTS System Analyst 3): In addition to the duties described for Level 2, the Level 3 SRE will: function as a technical leader and subject matter expert in reliability engineering, with deep expertise in both OpenShift and AKS. Lead the design and implementation of large-scale, complex distributed systems across on-premises OpenShift and cloud-based AKS environments. Develop and implement strategies for continual improvement of system reliability, performance, and efficiency in a hybrid infrastructure model. Lead cross-functional projects to improve overall system architecture and reliability, considering the strengths and limitations of both OpenShift and AKS. Provide advanced troubleshooting and problem-solving for critical production issues in both on-premises and cloud environments. Develop and maintain relationships with key stakeholders across the organization to align SRE practices with business objectives. Drive the adoption of SRE best practices and methodologies across the organization, tailored to the specific needs of OpenShift and AKS platforms. Contribute to the definition of technical standards and best practices for the SRE team, ensuring consistency across on-premises and cloud environments. Mentor and provide technical leadership to junior and mid-level SREs in both OpenShift and AKS technologies. Participate in strategic planning for infrastructure and reliability improvements, considering the long-term evolution of the hybrid infrastructure model. Represent the SRE team in high-level technical discussions and decision-making processes related to container orchestration and cloud strategy.
Advancement to the next level requires that the employee, at a minimum, demonstrate the following: a proven ability to work at the next level, the potential to serve beyond the next level, a consistently values-based approach to their work, and standing as one of the top performers at their level across the organization.
Required Qualifications at this Level:
Education: Bachelor's degree in a related field is preferred, or equivalent work experience.
Experience:
- Level 1 (DHTS System Analyst 1): 0-4 years of experience in software development and/or IT solutions engineering.
- Level 2 (DHTS System Analyst 2): Minimum 5 years of experience in software development and/or IT solutions engineering.
- Level 3 (DHTS System Analyst 3): Minimum 10 years of experience in software development and/or IT solutions engineering.
Required Skills and Knowledge:
- Level 1 (DHTS System Analyst 1): Basic understanding of Application Development Lifecycle, ideally with DevOps focus. Familiarity with script writing (e.g., Ansible Playbooks, Helm Charts). Basic knowledge of containerization and orchestration technologies (Docker, Kubernetes, OpenShift). Familiarity with CI/CD technologies like GitLab CI or GitHub Actions. Basic understanding of server administration (preferably Linux). Understanding of networking topologies, firewall rules, and certificate management. Ability to analyze customer requirements and translate into effective solutions. Critical thinking and problem-solving skills. Strong customer service orientation. Basic troubleshooting and root cause analysis skills. Familiarity with project management and Agile/SCRUM methodologies. Proficiency in at least one programming language (e.g., Python, Go, Java). Familiarity with version control systems (e.g., Git).
- Level 2 (DHTS System Analyst 2): All Level 1 skills, plus: Strong experience with Application Development Lifecycle, with a DevOps focus. Proficiency in script writing (e.g., Ansible Playbooks, Helm Charts). Extensive experience with containerization and orchestration technologies (Docker, Kubernetes, OpenShift). Strong experience with CI/CD technologies and practices. Advanced knowledge of server administration (preferably Linux). Solid understanding of networking topologies, firewall rules, and certificate management. Proven ability to analyze complex customer requirements and translate into effective solutions. Advanced troubleshooting and root cause analysis skills. Strong project management skills, including Agile/SCRUM experience. Experience with cloud platforms (AWS, Azure, GCP) and services (SaaS, IaaS, PaaS, FaaS). Knowledge of Enterprise Architecture best practices. Familiarity with AI and ML concepts.
- Level 3 (DHTS System Analyst 3): All Level 2 skills, plus: Technical leadership in application development with a DevOps/CI focus. Technical leadership in automation (Ansible, Terraform, Bash). Extensive experience with Continuous Integration / Continuous Delivery. Extensive experience with server administration. Expert knowledge of network and security concepts. Proven ability to lead and mentor teams in adopting and optimizing container orchestration practices. Expert knowledge of cloud platforms (AWS, Azure, GCP) and services (SaaS, IaaS, PaaS, FaaS). Expert knowledge of Enterprise Architecture best practices. Advanced knowledge of AI and ML concepts and their application in SRE practices.
Desired Skills (All Levels): Red Hat OpenShift certifications. CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer) certifications. Experience with multi-cloud environments. Knowledge of FHIR APIs and healthcare-specific technologies. Excellent time management, organizational, and task prioritization skills. Strong presentation skills. Ability to communicate effectively with non-technical staff and members of interdisciplinary teams. Ability to interact well and effectively communicate with all levels of leadership. Experience with data and system flow diagramming. Familiarity with vulnerability management and patching for application containers.
Additional Responsibilities (All Levels): Provide application system support for team apps, including rotating 24x7 support. Develop relationships with vendors to ensure customer needs are met in a timely manner. Author and update system documentation to share all knowledge acquired in the developer guide. Ensure systems conform to Duke Information Security Office policies and procedures. Assist in oral and written presentations to project teams, customers, and management. Coordinate and perform application testing. Follow established Change Management processes. Provide feedback on departmental processes and procedures and suggest improvements. Plan and coordinate system and application upgrades. Identify internal resources to build project teams as required. Perform detailed analysis and documentation of customer workflows. Collaborate with Administrative, Clinical, and Research customers to understand and meet needs. Develop relationships with key customer management representatives.
Duke University is an Affirmative Action/Equal Opportunity Employer committed to providing employment opportunity without regard to an individual's age, color, disability, gender, gender expression, gender identity, genetic information, national origin, race, religion, sex (including pregnancy and pregnancy related conditions), sexual orientation, or veteran status. Duke aspires to create a community built on collaboration, innovation, creativity, and belonging. Our collective success depends on the robust exchange of ideas, an exchange that is best when the rich diversity of our perspectives, backgrounds, and experiences flourishes. To achieve this exchange, it is essential that all members of the community feel secure and welcome, that the contributions of all individuals are respected, and that all voices are heard. All members of our community have a responsibility to uphold these values.