YAML (YAML Ain't Markup Language) Format Guide

YAML (YAML Ain't Markup Language)

Extension: .yaml (also commonly .yml)

MIME Type: application/yaml (registered in RFC 9512; application/x-yaml is an older, unregistered form still seen in practice)

Overview

YAML is a human-friendly data serialization language designed to be more readable than JSON and less verbose than XML for configuration files, data exchange, and structured document authoring. Its design prioritizes readability: YAML uses indentation to denote structure (similar to Python), supports comments (beginning with #), and offers multi-line strings, anchors for deduplication, and custom tags for type annotation, all without requiring brackets, braces, or quotation marks for most values.

YAML 1.2 is designed as a superset of JSON, so valid JSON documents are, in practice, also valid YAML 1.2 documents; parsers that implement only YAML 1.1 can differ on edge cases. YAML's native syntax goes well beyond JSON's: block-style mappings and sequences use indentation instead of braces and brackets, scalars can be unquoted or use the literal block (|) and folded block (>) indicators for multi-line strings, and anchors (&) and aliases (*) enable DRY (Don't Repeat Yourself) references within a document. These features make YAML a common choice for configuration files that humans read and edit frequently.
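
As a small illustration of the paragraph above, the two documents below are meant to describe the same data. The key names (service, ports, env) are invented for this sketch. The first document is plain JSON, which a YAML 1.2 parser should accept as flow style; the second expresses the same structure in block style.

```yaml
# Document 1: JSON syntax, read by YAML as flow style
{"service": "web", "ports": [80, 443], "env": {"DEBUG": "false"}}
---
# Document 2: the same data in YAML block style
service: web
ports:
  - 80
  - 443
env:
  DEBUG: "false"   # quoted so it stays a string rather than a boolean
```

The block form drops the braces, brackets, and most quotes, and it can carry comments, which JSON cannot.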

YAML has become the dominant configuration language in the DevOps and cloud-native ecosystem. Docker Compose, Kubernetes, Ansible, GitHub Actions, GitLab CI, CircleCI, Azure Pipelines, and Helm charts use YAML as their primary configuration format, and Swagger/OpenAPI specifications are commonly written in it. Its readability advantage over JSON is most apparent in deeply nested structures and lists, where indentation-based syntax removes the visual noise of punctuation.

History

YAML was first proposed in 2001 by Clark Evans, who designed it together with Ingy döt Net and Oren Ben-Kiki. The name originally stood for 'Yet Another Markup Language,' but it was soon changed to the recursive acronym 'YAML Ain't Markup Language' to emphasize that the format is for data serialization, not document markup. YAML 1.0 was released in January 2004, YAML 1.1 followed in January 2005, and YAML 1.2 (the current version) was published in 2009; its most recent revision, 1.2.2, was released in October 2021.

YAML 1.2 made a significant change by aligning the language with JSON: previous versions had subtle incompatibilities with JSON's string quoting and number representation. The specification also tightened type resolution. In the YAML 1.2 core schema only true and false resolve to booleans, whereas YAML 1.1 also treated unquoted values such as yes, no, on, and off as booleans. That older behavior is the source of a notorious class of bugs known as the Norway problem, where the country code NO is interpreted as boolean false. Many widely used parsers still follow YAML 1.1 rules, so the problem persists in practice.
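
The Norway problem described above can be sketched as follows. The key names are hypothetical, and the exact result depends on which schema a given parser applies.

```yaml
# Under YAML 1.1 resolution rules, the unquoted NO below
# may be loaded as boolean false instead of the string "NO".
countries_unquoted:
  - SE
  - NO
  - DK

# Quoting forces a string regardless of the schema in use.
countries_quoted:
  - "SE"
  - "NO"
  - "DK"
```

Quoting any scalar that must remain a string is the usual defensive habit, since it behaves the same under both YAML 1.1 and YAML 1.2 parsers.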

Technical Details

A YAML document may begin with an optional directive line (e.g., %YAML 1.2) followed by a document start marker (---); the marker itself is required only when directives are present or when a file holds several documents. Multiple documents can exist in a single file, separated by --- markers, with ... indicating document end. The data model consists of three node kinds: scalars (strings, numbers, booleans, null, and, in some schemas, timestamps), sequences (ordered lists), and mappings (unordered key-value pairs). Block style uses newlines and indentation; flow style uses JSON-like brackets and braces for compact inline notation.
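
A minimal sketch of the stream structure described above: one file holding two documents, the first in block style and the second in flow style. The keys and values are placeholders chosen for illustration.

```yaml
%YAML 1.2
---
# First document: block style
name: example
tags:
  - alpha
  - beta
...
---
# Second document: the same shape in flow style
{name: example, tags: [alpha, beta]}
...
```

Both documents contain one mapping with a scalar value and a sequence value. Some parsers that target YAML 1.1 may warn about or reject the %YAML 1.2 directive, so it is often omitted in practice.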

YAML's type system relies on tags. Core schema tags include !!str, !!int, !!float, !!bool, !!null, !!seq, and !!map. Without explicit tags, YAML parsers apply implicit type resolution: unquoted 42 becomes an integer, 3.14 a float, true a boolean, and ~ or null a null value. Anchors (&name) mark a node for reuse, and aliases (*name) reference it elsewhere, enabling data deduplication. Merge keys (<<) allow one mapping to inherit the keys of another; they originate in the YAML 1.1 type repository rather than the YAML 1.2 specification, so support varies between parsers. Multi-line scalars use literal style (| preserves line breaks) or folded style (> folds single line breaks into spaces). Indentation is significant for structure and must use spaces; tab characters are not allowed for indentation.
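
The features above can be sketched in one small document. All key names are made up for illustration, and the merge key is shown on the assumption that the parser supports it.

```yaml
# Implicit resolution versus explicit tags
count: 42            # integer
ratio: 3.14          # float
enabled: true        # boolean
missing: ~           # null
zip_code: !!str 42   # explicit tag forces a string
also_string: "42"    # quoting has the same effect

# Anchor (&) marks a reusable node; alias (*) refers back to it
defaults: &defaults
  timeout: 30
  retries: 3

staging:
  <<: *defaults      # merge key: pull in the keys of the anchored mapping
  retries: 5         # a local key overrides the merged value

# Literal style keeps line breaks; folded style joins lines with spaces
literal: |
  line one
  line two
folded: >
  these two lines
  become one line
```

With a parser that honors merge keys, staging ends up with timeout 30 and retries 5. The literal scalar keeps its two lines separate, while the folded scalar is read as a single line of text.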

Pros & Cons

Pros

Highly readable, clean syntax with indentation-based structure and comment support
JSON superset as of YAML 1.2: valid JSON is, in practice, valid YAML
Multi-line string support with literal and folded block indicators
Anchor/alias mechanism enables DRY references and data deduplication
De facto standard for DevOps tooling (Kubernetes, Docker Compose, CI/CD pipelines)

Cons

Indentation sensitivity means invisible whitespace errors cause parsing failures
Implicit type coercion can cause subtle bugs (the 'Norway problem' with boolean values)
More complex specification than JSON, leading to parser implementation inconsistencies
Security risks from arbitrary object deserialization in some language bindings
Typically slower to parse than JSON due to the more complex grammar
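
The deserialization risk listed above can be illustrated with a language-specific tag. This sketch assumes PyYAML as the binding: its unrestricted loaders interpret the tag below as a request to call a Python function, whereas its safe loader (yaml.safe_load) should reject the tag with an error. The function chosen here is deliberately harmless.

```yaml
# A PyYAML-specific tag. With an unrestricted loader this calls
# os.getcwd() while the document is being loaded; the same mechanism
# can reach far more dangerous functions in untrusted input.
cwd: !!python/object/apply:os.getcwd []
```

Loading untrusted YAML with a safe or schema-restricted loader is the usual mitigation.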

Common Use Cases

Defining Kubernetes manifests, Helm charts, and container orchestration configurations
Writing CI/CD pipeline definitions for GitHub Actions, GitLab CI, and Azure Pipelines
Configuring Docker Compose services and multi-container application stacks
Authoring OpenAPI/Swagger API specifications with nested schema definitions
Managing Ansible playbooks and infrastructure automation runbooks
Storing application configuration that developers read and edit frequently
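
As one concrete instance of the use cases above, a minimal Docker Compose file might look like the following. The service name, image tag, port numbers, and environment variable are placeholders, not recommendations.

```yaml
# compose.yaml (illustrative sketch)
services:
  web:
    image: nginx:latest
    ports:
      - "8080:80"        # host:container, quoted so it is always read as a string
    environment:
      APP_ENV: development
```

The nesting of services, ports, and environment shows how indentation alone carries the structure that JSON would express with braces and brackets.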

Related Formats

JSON (.json), XML (.xml), TXT (.txt), CSV (.csv)
