Infrastructure as Code: Structuring Ansible Repositories

Best practices for organizing your Ansible inventory, group_vars, and host_vars to cleanly separate development and production environments.

Case Snapshot

Situation

This case comes from enterprise Linux and virtualization operations across multi-team environments, specifically from work on structuring Ansible repositories.

Issue

Needed a repeatable way to organize the Ansible inventory, group_vars, and host_vars so that development and production environments stay cleanly separated.

Solution

Implemented a practical runbook/automation pattern with clear safety checks, execution steps, and verification points.

Used In

Used in Linux platform engineering, middleware operations, and datacenter modernization projects in regulated environments.

Impact

Improved repeatability, reduced incident risk, and made operational handoffs clearer across teams.

Situation

As your Ansible automation scales from a few scripts to managing hundreds of servers, a flat repository structure quickly becomes unmanageable. Hardcoded variables leak between environments, and it becomes difficult to determine exactly what configuration applies to a specific server.

To solve this, you need a strict directory structure that isolates environments (development vs. production) and layers variable definitions hierarchically.

Task 1 – The Ideal Directory Structure

Here is a mature repository structure designed for clarity and safety:

├── ansible.cfg                 # Main Ansible configuration
├── inventory/                  # Root directory for all inventories
│   ├── production/             # PRODUCTION ENVIRONMENT
│   │   ├── group_vars/         # Variables for prod groups
│   │   ├── host_vars/          # Variables for specific prod hosts
│   │   ├── hosts.yml           # Prod data file (Hosts & attributes)
│   │   └── inventory.config    # Rule file (How groups are created)
│   └── development/            # DEVELOPMENT ENVIRONMENT
│       ├── group_vars/         # Variables for dev groups
│       ├── host_vars/          # Variables for specific dev hosts
│       ├── hosts.yml           # Dev data file (Hosts & attributes)
│       └── inventory.config    # Rule file (How groups are created)
├── playbooks/                  # Operational tasks (patching, reboots)
├── roles/                      # Reusable logic (software installation)
├── files/                      # Static files copied as-is (certs, scripts)
├── templates/                  # Dynamic Jinja2 templates (configs)
└── site.yml                    # Master playbook orchestrating roles
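
As a rough illustration of how the top-level site.yml might tie this layout together, here is a minimal sketch. The role names common and tomcat are hypothetical and are not taken from the original repository; the app_tomcat group is the one used as an example further down in this post.

```yaml
# site.yml (illustrative sketch; role names are assumptions)
---
- name: Apply the baseline configuration to every host
  hosts: all
  roles:
    - common          # hypothetical role under roles/common/

- name: Configure Tomcat application servers
  hosts: app_tomcat   # group derived from the "role" attribute in hosts.yml
  roles:
    - tomcat          # hypothetical role under roles/tomcat/
```

Because the playbook never names an environment, the same file can run against either inventory; the -i argument alone decides which hosts and variables apply.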

Task 2 – The Benefits of Separation

1. Environment Isolation

By completely separating inventory/production/ and inventory/development/, a development variable cannot accidentally bleed into a production deployment.

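If you run ansible-playbook -i inventory/development/ ..., Ansible loads inventory variables only from the development/group_vars and development/host_vars directories. Ansible also reads any group_vars/ or host_vars/ directory located next to the playbook; the layout above has none, so nothing is shared between environments. This is critical for secret distribution: your dev database password is never loaded into memory during a prod run.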
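
To make the isolation concrete, here is a small sketch of the same variable names defined once per environment. The hostnames and the vault_db_password variable are hypothetical; the sketch assumes each environment keeps its own vault-encrypted vars file that defines vault_db_password.

```yaml
# inventory/development/group_vars/all.yml (hypothetical values)
---
db_host: db.dev.example.internal
db_password: "{{ vault_db_password }}"   # assumed to come from the dev vault file

# inventory/production/group_vars/all.yml (hypothetical values)
---
db_host: db.prod.example.internal
db_password: "{{ vault_db_password }}"   # assumed to come from the prod vault file
```

Roles reference db_host and db_password without knowing which environment they run in; only the inventory passed with -i determines which file is read.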

2. The constructed plugin (Dynamic Grouping)

Notice the inventory.config file. Instead of manually maintaining complex group lists, we use Ansible’s native constructed inventory plugin.

The hosts.yml file acts as the single source of truth, simply listing servers and their attributes:

# hosts.yml
all:
  hosts:
    host-example-01:
      env: dev
      role: app_tomcat
      zone: dmz

The inventory.config file contains rules that automatically generate groups based on those attributes. For example, it looks for the key role and creates a group named app_tomcat.
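
A minimal sketch of what such a rule file could look like, assuming the constructed plugin's keyed_groups option is applied to the role, env, and zone attributes. The empty prefix and separator on the role rule are there so the group is named exactly app_tomcat; the env and zone prefixes are illustrative choices, not something the original specifies.

```yaml
# inventory/development/inventory.config (illustrative sketch)
plugin: ansible.builtin.constructed
strict: false            # skip hosts that lack a key instead of failing
keyed_groups:
  - key: role            # role: app_tomcat -> group "app_tomcat"
    prefix: ""
    separator: ""
  - key: env             # env: dev         -> group "env_dev"
    prefix: env
  - key: zone            # zone: dmz        -> group "zone_dmz"
    prefix: zone
```

When an inventory directory is passed with -i, Ansible reads its sources in ASCII filename order, so hosts.yml is loaded before inventory.config and the attributes are available when the rules run. Running ansible-inventory -i inventory/development/ --graph prints the resulting group tree, which is a quick way to check that the rules produce the groups you expect.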

3. Variable Precedence

Ansible resolves variables in a fixed order of precedence: a value defined at a higher level replaces the same variable defined at a lower level. We structure our variables to take advantage of this. From lowest to highest precedence:

  1. roles/<role_name>/defaults/main.yml: The absolute baseline. Sane defaults that apply if nothing else is defined.
  2. inventory/<env>/group_vars/all.yml: Base variables for the entire environment (e.g., the corporate DNS servers).
  3. inventory/<env>/group_vars/<group_name>.yml: Variables specific to a group of servers (e.g., app_tomcat.yml defines the JVM heap size for all Tomcat servers). These override all.yml.
  4. inventory/<env>/host_vars/<hostname>.yml: Variables specific to a single machine. These win against all group variables.
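
The four levels above can be sketched with one variable overridden at each step. The role name tomcat, the variable tomcat_heap_size, and all values are hypothetical; the DNS addresses come from a documentation-only range.

```yaml
# roles/tomcat/defaults/main.yml (lowest precedence; hypothetical role)
---
tomcat_heap_size: 512m

# inventory/development/group_vars/all.yml (environment-wide values)
---
dns_servers:
  - 192.0.2.10
  - 192.0.2.11

# inventory/development/group_vars/app_tomcat.yml (overrides the role default)
---
tomcat_heap_size: 2g

# inventory/development/host_vars/host-example-01.yml (overrides the group value)
---
tomcat_heap_size: 4g
```

With these files, host-example-01 resolves tomcat_heap_size to 4g, while every other host in app_tomcat gets 2g. The inventory-level result for a single host can be inspected with ansible-inventory -i inventory/development/ --host host-example-01; role defaults are not part of that output because they are applied only when the role runs.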

This architecture reduces human error, makes the repository easier to navigate, and ensures that your infrastructure is truly defined by data (hosts.yml) rather than complex manual groupings.

Architecture Diagram

[Diagram: execution flow for Infrastructure as Code: Structuring Ansible Repositories]

This diagram supports Infrastructure as Code: Structuring Ansible Repositories and highlights where controls, validation, and ownership boundaries sit in the workflow.

Post-Specific Engineering Lens

The primary objective for this post is to increase automation reliability and reduce human variance.

Implementation decisions for this case

  • Chose a staged approach centered on Ansible to avoid high-blast-radius rollouts.
  • Used IaC checkpoints to make regressions observable before full rollout.
  • Treated DevOps documentation as part of delivery, not a post-task artifact.

Practical command path

These are representative execution checkpoints relevant to this post:

ansible-playbook -i inventory/development/ site.yml --limit target --check --diff
ansible-playbook -i inventory/development/ site.yml --limit target
ansible -i inventory/development/ all -m ping -o

Validation Matrix

| Validation goal | What to baseline | What confirms success |
| --- | --- | --- |
| Functional stability | service availability, package state, SELinux/firewall posture | systemctl --failed stays empty |
| Operational safety | rollback ownership + change window | journalctl -p err -b has no new regressions |
| Production readiness | monitoring visibility and handoff notes | critical endpoint checks pass from at least two network zones |
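
The functional stability check can itself be automated. The play below is a minimal sketch, not the author's actual verification step; the file name is hypothetical. It relies on systemctl --failed --no-legend printing nothing when no unit has failed.

```yaml
# playbooks/verify.yml (hypothetical file name)
---
- name: Post-change verification
  hosts: all
  gather_facts: false
  tasks:
    - name: List failed systemd units
      ansible.builtin.command: systemctl --failed --no-legend
      register: failed_units
      changed_when: false   # read-only check, never reports a change

    - name: Assert that no unit is in a failed state
      ansible.builtin.assert:
        that:
          - failed_units.stdout | trim | length == 0
        fail_msg: "Failed units on {{ inventory_hostname }}: {{ failed_units.stdout }}"
```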

Failure Modes and Mitigations

| Failure mode | Why it appears in this type of work | Mitigation used in this post pattern |
| --- | --- | --- |
| Inventory scope error | Wrong hosts receive a valid but unintended change | Use explicit host limits and pre-flight host list confirmation |
| Role variable drift | Different environments behave inconsistently | Pin defaults and validate required vars in CI |
| Undocumented manual step | Automation appears successful but remains incomplete | Move manual steps into pre/post tasks with assertions |
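
Two of these mitigations, scope confirmation and required-variable validation, can be expressed as pre_tasks assertions. This is a sketch under assumptions: expected_env, tomcat_heap_size, and the tomcat role are hypothetical names, and expected_env is assumed to be passed on the command line with -e.

```yaml
# Illustrative play; expected_env, tomcat_heap_size, and the tomcat role are assumed names
---
- name: Configure Tomcat with pre-flight guards
  hosts: app_tomcat
  gather_facts: false
  pre_tasks:
    - name: Refuse to run against an environment that was not explicitly requested
      ansible.builtin.assert:
        that:
          - expected_env is defined
          - env is defined
          - env == expected_env
        fail_msg: "{{ inventory_hostname }} is not in the requested environment"

    - name: Validate required variables before any change is made
      ansible.builtin.assert:
        that:
          - tomcat_heap_size is defined
        fail_msg: "tomcat_heap_size must be set in group_vars or host_vars"
  roles:
    - tomcat
```

For the host list confirmation, adding --list-hosts to the ansible-playbook command prints the hosts each play would target without executing any task, and --syntax-check is a cheap first gate in CI.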

Recruiter-Readable Impact Summary

  • Scope: deliver Linux platform changes with controlled blast radius.
  • Execution quality: guarded by staged checks and explicit rollback triggers.
  • Outcome signal: repeatable implementation that can be handed over without hidden steps.