Description
The Site Reliability Engineer will support a premier Navy program team in reviewing, assessing, and improving the reliability, resilience, observability, and operational maintainability of next-generation Navy afloat architecture. The candidate will work with technical staff, Navy stakeholders, cybersecurity teams, infrastructure engineers, software teams, and operational representatives to ensure that SRE principles are considered early in architecture, design, integration, test, deployment, and sustainment planning.
Reviewing proposed Navy afloat architecture designs for reliability, availability, scalability, maintainability, cybersecurity alignment, and operational supportability
Identifying architecture and implementation risks that could affect system uptime, Fleet usability, maintainability, troubleshooting, patching, monitoring, recovery, or sustainment
Defining and recommending SRE‑aligned practices for Navy afloat systems, including service level objectives, operational metrics, monitoring requirements, alerting thresholds, error budget concepts, incident response workflows, and reliability reporting
Assisting engineering teams in translating operational reliability requirements into technical design considerations, implementation standards, and sustainment procedures
Evaluating system designs against real‑world afloat constraints, including limited bandwidth, intermittent connectivity, shipboard infrastructure limits, cybersecurity controls, maintenance windows, disconnected operations, and mission availability requirements
Supporting development of observability strategies, including logging, metrics, tracing, dashboards, alerts, health checks, and performance monitoring
Recommending automation opportunities to reduce manual operational workload, improve repeatability, reduce configuration drift, and improve deployment and sustainment reliability
Supporting root cause analysis for operational issues, test findings, integration failures, or architecture concerns, then converting findings into corrective actions and long‑term reliability improvements
Assisting with reliability‑focused documentation, including architecture review comments, risk assessments, operational concepts, monitoring plans, sustainment recommendations, incident response workflows, and executive‑level technical summaries
Working with cybersecurity stakeholders to ensure reliability recommendations also support DoD cybersecurity requirements, including STIG compliance, vulnerability management, audit logging, privileged access controls, and continuous monitoring
Participating in technical working groups, architecture reviews, design reviews, test planning sessions, and customer briefings
Supporting planning for deployment, installation, test, checkout, transition to operations, and sustainment handoff activities
Helping define operational readiness criteria for new or updated afloat capabilities before Fleet deployment
Providing recommendations that balance modern SRE practices with Navy operational constraints, cybersecurity mandates, lifecycle supportability, and mission execution needs
Communicating clearly with both technical and non‑technical stakeholders, including government sponsors, program managers, engineers, cybersecurity staff, and operational users
FILLING THIS POSITION IS CONTINGENT UPON FUNDING
#LI-JC1
Requirements
Ability to obtain and maintain a DoD Secret clearance
U.S. citizenship required due to DoD contract and clearance requirements
Ability to support a remote-eligible role in coordination with the primary office in Charleston, South Carolina
Ability to obtain CSWF / DoD 8140-aligned IAT Level II qualification within the required contract or program timeline
Current qualifying IAT Level II certification, or the ability to obtain one, typically one of the following:
CompTIA Security+ CE
CompTIA CySA+
GIAC GSEC
ISC2 SSCP
EC-Council CND
Five or more years of experience in one or more of the following areas:
Site reliability engineering
Systems engineering
Platform engineering
DevSecOps
Network or infrastructure operations
Cloud, hybrid cloud, or enterprise hosting environments
Mission-critical IT operations
Practical experience with Linux and Windows server environments, including system hardening, patching, configuration, troubleshooting, logging, and operational sustainment
Working knowledge of networking fundamentals, including TCP/IP, DNS, routing, switching, firewalls, load balancing, VPNs, segmentation, and network troubleshooting
Experience designing, reviewing, or operating highly available systems with attention to uptime, resilience, observability, recoverability, and operational risk
Experience with monitoring, alerting, log aggregation, performance analysis, and incident response
Understanding of SRE principles, including:
Service level indicators
Service level objectives
Error budgets
Toil reduction
Automation-first operations
Blameless post-incident review
Capacity planning
Reliability risk assessment
Experience supporting cybersecurity compliance in regulated environments, preferably DoD or federal environments
Familiarity with vulnerability management, STIGs, security baselines, patch compliance, privileged access, audit logging, and continuous monitoring
Ability to evaluate architecture and design decisions for operational reliability, maintainability, cybersecurity posture, and lifecycle sustainment
Ability to translate technical findings into clear written recommendations for government sponsors, engineering teams, cybersecurity stakeholders, and program leadership
Strong written and verbal communication skills, including the ability to document technical risks, operational impacts, and recommended mitigations
Desired Skills
Prior experience as an SRE in Fortune 100 or similar large scale environments
Active DoD Secret clearance
Experience supporting Navy, NIWC, NAVWAR, Fleet, tactical, afloat, or shipboard systems
Experience with afloat or disconnected operations where bandwidth, latency, hardware constraints, cybersecurity requirements, and operational availability drive architecture decisions
Experience reviewing or contributing to next generation architecture for Navy, DoD, tactical edge, or mission critical platforms
Experience with DoD Risk Management Framework, Authority to Operate support, continuous monitoring, vulnerability remediation, POA&Ms, STIG implementation, and cyber inspection readiness
Experience with containerization and orchestration technologies such as Docker, Kubernetes, OpenShift, Rancher, or similar platforms
Experience with infrastructure as code and configuration management tools such as Ansible, Terraform, Puppet, Chef, PowerShell DSC, or similar technologies
Experience with CI/CD pipelines and secure software delivery using tools such as GitLab, Jenkins, GitHub Actions, Azure DevOps, Nexus, Artifactory, or similar platforms
Experience with observability platforms and tooling such as Prometheus, Grafana, ELK / Elastic Stack, Splunk, OpenTelemetry, Datadog, New Relic, or similar capabilities
Experience with cloud or hybrid environments, including AWS, Azure, Azure Government, GovCloud, private cloud, VMware, or other enterprise hosting platforms
Experience with backup, disaster recovery, failover planning, continuity of operations, and data protection for mission-critical systems
Experience performing root cause analysis and converting incident findings into architectural, operational, or automation improvements
Familiarity with Zero Trust principles, identity and access management, certificate management, privileged access management, endpoint security, and secure remote administration
Familiarity with Navy change control, configuration management, test events, installation readiness reviews, deployment planning, or Fleet Readiness Change Board style processes
Experience working directly with government customers, system owners, cybersecurity teams, network engineers, software teams, and operational users
One or more of the following certifications:
Active Security+ CE or higher DoD 8140 / IAT Level II qualifying certification
CompTIA CySA+
ISC2 SSCP
GIAC GSEC
GIAC GCIH
GIAC GCIA
GIAC GCWN or GCUX
Red Hat Certified System Administrator
Red Hat Certified Engineer
Certified Kubernetes Administrator
AWS Certified SysOps Administrator
AWS Solutions Architect
Microsoft Azure Administrator
VMware Certified Professional
Cisco CCNA or CCNP
ITIL Foundation
Certified ScrumMaster or SAFe certification, where relevant to program execution
Clearance Information
SRC IS A CONTRACTOR FOR THE U.S. GOVERNMENT. THIS POSITION WILL REQUIRE U.S. CITIZENSHIP AND ELIGIBILITY FOR A U.S. GOVERNMENT SECURITY CLEARANCE AT THE SECRET LEVEL.
Travel Requirements
None
About Us
Scientific Research Corporation is an advanced information technology and engineering company that provides innovative products and services to government and private industry, as well as independent institutions. At the core of our capabilities is a seasoned team of highly skilled engineers and scientists with multidisciplinary backgrounds. This team is challenged daily to provide cutting edge technology solutions to our clients.
SRC offers a generous benefits package, including medical, dental, and vision plans, 401(k) with a company match, life insurance, paid time off accruals starting at 10 days of vacation and 5 days of sick leave annually, 11 paid holidays, tuition reimbursement, and more, in a work environment that encourages excellence. For positions requiring a security clearance, selected applicants will be subject to a government security investigation and must meet eligibility requirements for access to classified information.
EEO
Scientific Research Corporation is an equal opportunity employer that does not discriminate in employment.
All qualified applicants will receive consideration for employment without regard to their race, color, religion, sex, age, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other protected characteristic under federal, state or local law.
Scientific Research Corporation endeavors to make www.scires.com accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact [email protected] for assistance. This contact information is for accommodation requests only and cannot be used to inquire about the status of applications.