DevOps Glossary

A collection of common terms and definitions used in DevOps and cloud computing.

Agile

Do just enough design to start delivering value. Iterative and continuous improvement of software/product.

AI

Artificial Intelligence. The simulation of human intelligence processes by computer systems, including learning, reasoning, and self-correction. AI encompasses a broad range of techniques used to enable machines to perform tasks that typically require human intelligence.

Ansible

An open-source software provisioning, configuration management, and application-deployment tool.

API

Application Programming Interface. A defined interface through which one program exchanges information with another. Third parties and mobile applications use it to request and provide data, rather than visual information.
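As a rough illustration of "data, but not visual information", the Python sketch below returns an order as JSON and leaves presentation to the caller. The `get_order` function and its fields are hypothetical, not part of any real service.

```python
import json


def get_order(order_id: int) -> str:
    """Hypothetical API handler: returns data as JSON, with no layout or styling."""
    order = {"id": order_id, "status": "shipped", "items": 3}
    return json.dumps(order)


# A client (for example a mobile app) parses the data and decides how to display it.
response = json.loads(get_order(42))
```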

AWS

Amazon Web Services, a public cloud provider. Infrastructure as a service.

Azure

Microsoft Azure, a public cloud computing platform offering a wide range of services including computing, analytics, storage, and networking. A major competitor to AWS and Google Cloud.

Backlog

A prioritised list of work items, features, or bugs to be addressed by a development team. In Agile, the backlog is continuously refined and reprioritised based on business value.

Blockers

Issues identified by an Agile team that are halting or slowing down progress.

Blue/Green Deployment

A release strategy that maintains two identical production environments (blue and green). New versions are deployed to the inactive environment and traffic is switched over, enabling zero-downtime releases and easy rollbacks.
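The switch-over can be sketched in a few lines of Python. This is a simplified model only: the `Router` class is a hypothetical stand-in for a load balancer or DNS record, and the version strings are made up.

```python
class Router:
    """Hypothetical stand-in for whatever points users at one environment."""

    def __init__(self, environments, live):
        self.environments = dict(environments)  # e.g. {"blue": "v1", "green": "v1"}
        self.live = live                         # environment currently receiving traffic

    def idle(self):
        # The environment that is not serving users right now.
        return next(name for name in self.environments if name != self.live)

    def deploy(self, version):
        # New versions only ever go to the idle environment.
        target = self.idle()
        self.environments[target] = version
        return target

    def switch(self):
        # Cut traffic over; calling switch() a second time is the rollback.
        self.live = self.idle()
        return self.live


router = Router({"blue": "v1", "green": "v1"}, live="blue")
router.deploy("v2")   # green now runs v2, users are still on blue
router.switch()       # users are now on green
serving = router.environments[router.live]
```

Because the old environment is left untouched, rolling back is just another call to `switch()`.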

Canary Release

A deployment technique where a new version of software is rolled out to a small subset of users before a full release, reducing risk by catching issues early.
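One common way to pick the "small subset of users" is to hash each user into a stable bucket. The Python sketch below assumes that approach; the function names and the 0 to 99 bucket range are illustrative choices, not a standard.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the canary percentage."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < percent


def pick_version(user_id: str, percent: int) -> str:
    # The same user always gets the same answer for a given percentage.
    return "new" if in_canary(user_id, percent) else "stable"
```

Raising `percent` step by step widens the rollout, and setting it back to 0 returns every user to the stable version.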

Capital Expense

CapEx, the money a project or company spends to buy, maintain or improve its fixed assets, such as buildings, vehicles and hardware.

ChatOps

A collaboration model that connects people, tools, and processes through chat platforms (e.g. Slack). Teams can trigger deployments, run scripts, and receive alerts directly within a chat interface.

CI/CD

Continuous Integration and Continuous Delivery/Deployment. A method to frequently deliver apps to customers by introducing automation into the stages of app development.

Cloud

A network of remote servers hosted on the internet to store, manage, and process data, rather than a local server or personal computer. Major providers include AWS, Azure, and Google Cloud.

Computer Process

A program or function that provides results (outputs) based on data (inputs).

Configuration Management

The practice of applying and maintaining consistent configuration across platforms, servers and software, usually with automated tools such as Ansible or Puppet.

Confluence

A team collaboration and documentation tool by Atlassian. Widely used alongside Jira to create, share, and organise project documentation and knowledge bases.

Containerization

A lightweight alternative to full machine virtualization that involves encapsulating an application in a container with its own operating environment.

Containers

A process running on a server in an isolated environment (a jail), started from a predefined disk image or file structure.

Continuous Integration

A pipeline where automated tests check committed code, providing a fast feedback loop to uncover errors.

CPU

Central Processing Unit. This provides the computer with the power to run programs, code and tasks.

Data Centre

A dedicated space that is climate controlled and secure, for housing and operating servers and other infrastructure.

Database

A service that saves, holds and returns data. This can range from a simple spreadsheet to a full database with data analytics, stored functions and reports.

DDoS

Distributed Denial of Service. Where a group of computers maliciously sends traffic to your website with the aim of disrupting service.

Dependencies

Reusable stock or library code that is installed alongside the developer's own code so that the application can work.

Development

Creation and improvement of software running in a Software System.

DevOps

A term used in Agile. 1. A role that uses a mix of Infrastructure and Development Skills, often creating automated workflows and IaC. 2. A term that describes a way of teams working together.

DNS

Domain Name System. Translates human-readable domain names (e.g. example.com) into IP addresses that computers use to communicate over a network.

Docker

A container platform that takes application isolation a step further, where an independent image can run on a server. Often used with microservices.

ECS

Elastic Container Service, the Amazon Web Services managed Docker service. It enables the management of running Docker containers.

Encryption

The process of converting data into a coded format to prevent unauthorised access. Essential for securing data in transit (e.g. HTTPS) and at rest (e.g. encrypted databases).

ESB

Enterprise Service Bus is a common Data Access Layer used to link dispersed IT systems together within an organisation.

Firewall

A network device that controls access between network components.

GCP

Google Cloud Platform. A suite of cloud computing services offered by Google, including compute, storage, machine learning, and networking. A major competitor to AWS and Azure.

Git

A distributed version-control system for tracking changes in source code during software development.

GitHub

A web-based platform for hosting and collaborating on Git repositories. It provides source control, code review, issue tracking, and CI/CD features used by millions of developers worldwide.

GitHub Actions

A CI/CD and automation platform built into GitHub. It allows developers to define workflows triggered by repository events such as pushes or pull requests, automating build, test, and deployment pipelines.

Helm

A package manager for Kubernetes. Helm charts define, install, and upgrade complex Kubernetes applications, simplifying deployment management.

HTTP/HTTPS

HyperText Transfer Protocol (Secure). The foundation of data communication on the web. HTTPS adds encryption via TLS/SSL to secure data between clients and servers.

Hybrid private and public cloud

A mix of private and public cloud, usually seen during migrations and for sensitive data reasons.

Hypervisor

A system that runs on a server to enable virtual machines to run.

Incident Management

The process of identifying, analysing, and resolving service disruptions or outages. Includes on-call rotations, runbooks, post-mortems, and SLA tracking.

Infrastructure

Refers to Servers, Routers, Network Switches, Firewalls and other foundational components of a software system. Can be purchased on a pay-as-you-use basis from Cloud Providers.

Infrastructure as Code (IaC)

Where code is written that can create or destroy infrastructure and computer environments.
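A minimal Python sketch of the idea, assuming a declarative style in which the code states the desired resources and a tool works out what to create or destroy. The `plan` and `apply` functions and the resource names are hypothetical and do not call any real cloud API.

```python
def plan(desired: set, actual: set) -> dict:
    """Compare what the code asks for with what currently exists."""
    return {
        "create": sorted(desired - actual),
        "destroy": sorted(actual - desired),
    }


def apply(desired: set, actual: set) -> set:
    """Return the environment after the planned changes have been made."""
    changes = plan(desired, actual)
    return (set(actual) | set(changes["create"])) - set(changes["destroy"])


desired = {"web-server", "database"}
actual = {"web-server", "cache"}
changes = plan(desired, actual)   # create the database, destroy the cache
after = apply(desired, actual)
```

Running `plan` again on the result produces no changes, which is why the same code can be applied repeatedly without side effects.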

Jenkins

An open-source automation server that enables developers to build, test, and deploy their software.

Jira

A project management and issue tracking tool by Atlassian. Widely used in Agile teams to plan sprints, track bugs, and manage backlogs across software development projects.

Kanban

An inventory and scheduling system. It is used in software development in a similar way to Scrum; however, tasks are not timeboxed but are subject to other measures and limits, such as a cap on work in progress.

KPI

Key Performance Indicators enable decisions to be made through metrics about your business, app and service.

Kubernetes

An open-source platform designed to automate deploying, scaling, and operating application containers.

Lambda

The serverless offering from AWS. Functions are small, short-lived tasks that run in a managed runtime or from a predefined container image.

LLM

Large Language Model. A type of AI model trained on vast amounts of text data, capable of generating, summarising, and understanding human language. Examples include GPT and Claude, which power many modern AI assistants and tools.

Load Balancer

A device or service that distributes incoming network traffic across multiple servers to ensure reliability, availability, and performance.
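Round robin is one of the simplest distribution strategies. The Python sketch below assumes that strategy plus a basic health flag; the `RoundRobinBalancer` class and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per request.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_four = [balancer.next_server() for _ in range(4)]
```

Real load balancers usually add active health checks and weighting on top of this basic rotation.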

Logging

The practice of recording events, errors, and system activity to files or centralised services. Logs are essential for debugging, auditing, and monitoring application behaviour.
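A short example using Python's standard `logging` module. The in-memory stream stands in for a log file or centralised service, and the `checkout` logger name is a made-up service name.

```python
import io
import logging

log_stream = io.StringIO()  # stands in for a log file or centralised log service
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("checkout")  # hypothetical service name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep this example's output out of the root logger

# Record a normal event.
logger.info("order placed id=%s", 42)

# Record an error together with its traceback.
try:
    1 / 0
except ZeroDivisionError:
    logger.exception("payment calculation failed")

output = log_stream.getvalue()
```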

Memory

Where computers temporarily store data.

Memory (Non-Volatile)

Storage that is able to hold data for long periods of time, even without power, like tape or disks.

Memory (RAM)

Random Access Memory. Holds programs and data while the computer is on; the contents are lost when the power is removed.

Memory (ROM)

Read-Only Memory. Holds programs and data that are retained while the computer is off. Usually contains bootstrap code.

Microservices

A small, independently deployable software program that forms part of a Software System.

Monitoring

The continuous observation of a system's health, performance, and availability. Tools like Prometheus, Grafana, and Datadog are commonly used to collect and visualise metrics.

MVP

Minimum Viable Product is a small scale product or service that is used to demonstrate a demand for that product or service.

Network

The connections between servers. This enables communication between software system components as well as the internet.

Network Switch

A physical device (virtual in Cloud Environments) that marshals network traffic and communications between software system components.

Observability

The ability to understand the internal state of a system from its external outputs. Built on three pillars: logs, metrics, and traces. Goes beyond monitoring to enable root cause analysis.

On-Premise

Infrastructure and software that is hosted and managed within an organisation's own data centre, rather than in the cloud.

Open Source

Software whose source code is publicly available for anyone to view, use, modify, and distribute. Many foundational DevOps tools (Linux, Kubernetes, Git) are open source.

Operation Expenses

OpEx, the ongoing costs of running a product, business or system.

Pair Programming

Where two people work together on a single task, which improves efficiency. An everyday analogy is two people building a wardrobe together.

Pipeline

A defined process that links tasks together, usually on a continuous integration server.
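The linking of tasks can be sketched as a list of stages that run in order and stop at the first failure. This Python example is a simplified model; `run_pipeline` and the stage names are hypothetical and not tied to any particular CI server.

```python
def run_pipeline(stages):
    """Run (name, task) pairs in order; stop at the first task that returns False."""
    completed = []
    for name, task in stages:
        if not task():
            return {"status": "failed", "failed_stage": name, "completed": completed}
        completed.append(name)
    return {"status": "passed", "failed_stage": None, "completed": completed}


stages = [
    ("build", lambda: True),
    ("test", lambda: False),   # a failing test stage
    ("deploy", lambda: True),  # never reached in this run
]
result = run_pipeline(stages)
```

Stopping early is the point: a failed test stage prevents the deploy stage from running.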

Platform

Managed IaC to simplify the deployment of Software Systems.

Post-Mortem

A blameless review conducted after an incident or outage to understand what happened, why it happened, and how to prevent it in future. Also called a retrospective or incident review.

Pull Request

A mechanism in source control platforms (e.g. GitHub) for proposing code changes. Team members review, discuss, and approve changes before they are merged into the main branch.

Puppet

An open-source configuration management tool that automates the provisioning and management of infrastructure using a declarative language.

Reliability

The ability of a system to perform its intended function consistently over time. Site Reliability Engineering (SRE) is a discipline focused on building and maintaining reliable systems.

Repository

A storage location for source code and its history, managed by a version control system such as Git. Can be hosted on platforms like GitHub or GitLab.

Router

A network device that links networks together.

S3

Amazon Simple Storage Service. An object storage service from AWS used to store and retrieve any amount of data, commonly used for backups, static websites, and data lakes.

Scaling

Growing or shrinking the servers delivering a service, either in number or in size, to match demand.
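A scaling decision can be as simple as a proportional rule. The Python sketch below assumes CPU usage is the signal; the 60% target and the minimum and maximum limits are illustrative defaults, not recommendations.

```python
import math


def desired_servers(current: int, cpu_percent: float,
                    target_percent: float = 60.0,
                    minimum: int = 1, maximum: int = 10) -> int:
    """Suggest a server count that brings average CPU back towards the target."""
    wanted = math.ceil(current * cpu_percent / target_percent)
    # Never go below the minimum or above the maximum.
    return max(minimum, min(maximum, wanted))


grow = desired_servers(current=4, cpu_percent=90)    # busy: add servers
shrink = desired_servers(current=4, cpu_percent=30)  # quiet: remove servers
```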

Scrum

A framework for organising tasks. Tasks are scheduled into a timeboxed period known as a Sprint.

Security

The practice of protecting systems, networks, and data from digital attacks, unauthorised access, and damage. In DevOps, security is integrated throughout the pipeline (DevSecOps).

Serverless

A paradigm where code is run on servers maintained by the cloud provider. It can be cheap to start with, but costs can escalate for larger, more frequent workloads.

Servers

Compute power of a Software System. Where computers and CPUs are employed to carry out the work.

SLA

Service Level Agreement. A formal commitment between a service provider and a customer defining the expected level of service, including uptime, response times, and support.

Slack

A cloud-based team messaging and collaboration platform widely used in tech organisations for communication, notifications, and ChatOps integrations.

SLO

Service Level Objective. A specific measurable target within an SLA, such as 99.9% uptime per month. Used by SRE teams to balance reliability with development velocity.
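To make a target such as 99.9% concrete, it helps to convert it into allowed downtime. The Python sketch below assumes a 30-day month; the function name is hypothetical.

```python
def allowed_downtime_minutes(slo_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in the period while still meeting the SLO."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


three_nines = allowed_downtime_minutes(99.9)  # about 43.2 minutes in 30 days
two_nines = allowed_downtime_minutes(99.0)    # 432 minutes in 30 days
```

This allowance is often called an error budget: once it is spent, the team prioritises reliability work over new releases.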

Software System

A collection of software, hardware and virtual hardware that makes up a system for running software.

Source Control

Where code is stored, in a way that every change and version is also kept. Useful for auditing and finding bugs due to changes.

Sprint

A fixed-length iteration in Scrum (typically 1–4 weeks) during which a team completes a set of planned work items from the backlog.

SRE

Site Reliability Engineering. A discipline that applies software engineering principles to infrastructure and operations, with a focus on reliability, scalability, and automation.

SSL/TLS

Secure Sockets Layer / Transport Layer Security. Cryptographic protocols that provide secure communication over a network. TLS is the modern successor to SSL, used in HTTPS.

Staging

A pre-production environment that mirrors production as closely as possible, used to test changes before they are released to end users.

Terraform

An infrastructure as code software tool that provides a consistent CLI workflow to manage hundreds of cloud services. Originally open source, it has been released under the Business Source License since 2023.

Test Driven Development

TDD - A programming practice where the tests are written first. A test ensures that given a set of inputs, a program or function produces the correct output.
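A small Python example using the standard `unittest` module. The `slugify` function is a made-up example: the test class is written first to describe the expected outputs, then the function is written with just enough code to pass.

```python
import unittest


class TestSlugify(unittest.TestCase):
    # Written first: each test states the expected output for a given input.
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("DevOps Glossary"), "devops-glossary")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Blue   Green  "), "blue-green")


def slugify(title: str) -> str:
    """Written second, with just enough code to make the tests pass."""
    return "-".join(title.lower().split())


def run_tests() -> bool:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSlugify)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```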

Virtualisation

Enabling the resources of one physical computer to run more than one logical computer in isolation, so that the logical computers are not aware of each other.

VPN

Virtual Private Network. Extends a private network across a public network, allowing users to securely access resources as if they were directly connected to the private network.

This glossary is regularly updated with new terms and definitions.

© 2017-2025 Neil Millard