DevOps Guide

DevOps engineering

"DevOps Engineer" is a loosely defined job title. Purists will tell you the term makes no sense because DevOps is a methodology, not a person. Yet you will find thousands of job listings, each defining the role differently.

In many cases, these positions are simply rebranded operations or sysadmin roles equipped with modern tooling. However, the actual scope of a DevOps Engineer varies widely and typically covers one or more of the following areas:

  • Build: Core Infrastructure & Operations
    • Provisioning: Creating and maintaining resources, whether on-premises or in the cloud.
    • System Administration: Installing, patching, and maintaining OS-level components (Linux/Windows). This includes managing users, permissions, and filesystems.
    • Configuration Management: Automating the setup and maintenance of software configurations across servers.
    • Networking & Storage: Managing software-defined networking (VPCs, subnets) and storage volumes.
    • Operations Management: Handling routine maintenance, backups, and general system health.
    • Database Management: Basic provisioning, replication setup, and ensuring data persistence.
  • Design: Architecture & Design
    • System Design: Architecting solutions based on needs, e.g. choosing between loosely coupled (microservices) or tightly coupled (monoliths) structures.
    • High Availability & Scalability Strategy: Designing systems to withstand traffic spikes (auto-scaling) and regional failures (redundancy).
    • Cloud Architecture: Deciding which managed services (Serverless, Managed SQL, Object Storage) to use versus building from scratch.
  • Automate: Automation & Tooling
    • Automation: Replacing manual UI interactions with reproducible code.
    • Scripting & Middleware Development: Writing scripts to connect tools that don't natively talk to each other.
    • Infrastructure as Code (IaC): Defining the entire environment in configuration files rather than manual setup.
  • Release: Release Engineering & Software Supply Chain
    • Software Supply Chain Management: Managing dependencies, auditing libraries for known vulnerabilities, and generating a Software Bill of Materials (SBOM).
    • Deployment Strategy: Executing releases (e.g., on a weekly cadence) using strategies like "Blue/Green" swaps or "Canary" releases to limit the blast radius of errors.
    • Version Control Management: Enforcing branching strategies (e.g., GitFlow vs. Trunk-Based) to keep code organized.
    • Artifact Management: Securing compiled binaries and container images in private registries.
  • Operate: Reliability & Incident Management (SRE)
    • Monitoring & Observability: Setting up dashboards to track metrics (CPU, latency), logs (errors), and traces (the path of a request through the system).
    • Incident Response: Acting as the first responder during outages to triage and coordinate fixes.
    • Post-Incident Review (Post-Mortems): Writing Root Cause Analysis (RCA) reports after incidents to prevent recurrence.
    • Chaos Engineering: Stress-testing systems by intentionally breaking components to ensure recovery automation works.
  • Help: Developer Experience (DevEx)
    • Developer Environment Building: Creating pre-configured environments (e.g., DevContainers) so new hires can code on Day 1 without setup friction.
    • Internal Developer Platform (IDP): Building self-service portals where developers can provision their own resources without blocking Ops.
    • Documentation & Knowledge Base: Maintaining runbooks and wikis to prevent "brain drain" when engineers leave.
  • Protect: Security & Governance (DevSecOps)
    • Security & Compliance: Ensuring infrastructure meets legal standards (GDPR, HIPAA, PCI-DSS) and internal policies.
    • Identity & Access Management (IAM): Enforcing "Least Privilege" to ensure developers don't have unnecessary "God mode" access to production.
    • Vulnerability Scanning: Automating security checks for both infrastructure (OS patches) and application code (libraries).
  • Collaborate: Culture & People
    • Team Support: Acting as a technical unblocker for development teams.
    • Coaching: Providing DevOps coaching to teams to instill cultural best practices.
    • FinOps: Monitoring cloud costs and guiding teams toward architecting cost-effective solutions.
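
To make one of the items above more concrete, here is a minimal sketch of how a "Canary" release can limit blast radius: only a fixed percentage of users is routed to the new version, and each user always lands on the same side. The function names (routes_to_canary, canary_share) and the hash-bucket approach are assumptions chosen for illustration, not the method of any particular load balancer or deployment tool.

```python
import hashlib


def routes_to_canary(user_id: str, canary_percent: int) -> bool:
    """Return True if this user should be served by the canary release.

    Hypothetical helper: hashes the user ID into one of 100 buckets so the
    same user always gets the same answer for a given percentage.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


def canary_share(user_ids, canary_percent: int) -> float:
    """Fraction of the given users that would be routed to the canary."""
    ids = list(user_ids)
    if not ids:
        return 0.0
    hits = sum(routes_to_canary(u, canary_percent) for u in ids)
    return hits / len(ids)


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    # With a 10% canary, roughly one user in ten sees the new version.
    print(f"canary share: {canary_share(users, 10):.3f}")
```

Because a user's bucket does not depend on the percentage, raising canary_percent from 5 to 50 only adds users to the canary group; nobody already on the new version is moved back. In practice this decision usually lives in a load balancer, service mesh, or feature-flag system rather than in application code.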

Fundamentals

Networking

Storage