
Overview

Ironcore Backup Solution (IBS) uses a layered architecture optimised for incremental, deduplicated, encrypted backups across virtual machines, system containers, and physical hosts. At its core is Changed Block Tracking (CBT): the hypervisor identifies precisely which disk blocks have changed since the previous backup, so only modified blocks are read, transmitted, and stored. CBT is what makes incremental backups fast and storage-efficient at scale. This page describes each component, the data flow between them, and the design decisions behind the storage model.

Prerequisites
  • Administrator role on the Polystack platform
  • Understanding of TCP networking, storage backends, and TLS

Deployment Model

Ironcore Backup Solution (IBS) splits the backup pipeline between the host that owns the workload and the backup server. The split is fixed: the host always performs change tracking, chunking, compression, and encryption; the backup server always performs deduplication, storage, replication, and verification. The location of the “host side” depends on the workload type:
| Workload Type | Pipeline Runs On | Software in the Guest |
| --- | --- | --- |
| Virtual machine on Ironcore | Ironcore hypervisor host | None (fully agentless) |
| System container on Ironcore | Ironcore hypervisor host | None (fully agentless) |
| Physical Linux host | The host itself | ironcore-backup-client package |
| File-archive from any Linux machine | The machine itself | ironcore-backup-client package |
Agentless for hypervisor-managed workloads — block-level change tracking is performed by the hypervisor against the VM’s virtual disks. The guest OS is not touched, does not need a backup driver, and cannot detect the backup. This applies to all VMs and system containers running on Ironcore.
Agent-based for physical workloads — bare-metal hosts, standalone Linux servers, and machines outside the virtualisation layer require the lightweight ironcore-backup-client package. The client provides the same pipeline (chunk, compress, encrypt) and pushes to the backup datastore.

High-Level Component Diagram

Changed Block Tracking (CBT) is the foundational mechanism of every incremental backup. The hypervisor maintains a per-disk bitmap that records every block written since the previous backup. Only blocks marked dirty are read, hashed, compressed, encrypted, and transmitted — every other block is skipped entirely. CBT is what allows multi-terabyte VMs to back up in seconds when only a small fraction of disk has changed.

Service Inventory

| Service | Process | Purpose |
| --- | --- | --- |
| Backup API | ironcore-backup-api | REST API for all backup operations |
| Scheduler | ironcore-backup-scheduler | Triggers jobs based on the configured schedule |
| Worker | ironcore-backup-worker | Executes backup, restore, and sync tasks |
| Verification Worker | ironcore-backup-verifier | Re-reads chunks and validates SHA-256 |
| Garbage Collector | ironcore-backup-gc | Reclaims orphaned chunks |
| Replication Worker | ironcore-backup-sync | Mirrors snapshots between sites |
| Tape Worker | ironcore-backup-tape | Reads / writes LTO tape libraries |
| Notification Engine | ironcore-backup-notify | Routes alerts to SMTP, webhooks, and metric servers |

Host-Side Pipeline

The host owns the most expensive part of the pipeline: identifying changes, chunking, compressing, and encrypting. As described in the Deployment Model above, the “host” is the hypervisor host for VM and container backups (agentless from the guest’s perspective), or the physical host itself for bare-metal backups via the ironcore-backup-client package.

Changed Block Tracking (CBT)

Changed Block Tracking (CBT) is the mechanism IBS uses to identify exactly which blocks of a disk have been modified since the previous backup. It is the single most important determinant of backup performance and storage efficiency. The hypervisor maintains a dirty bitmap per disk: a compact in-memory data structure where each bit represents a block on the disk. The bitmap is updated every time the guest writes to a block. At the start of each incremental backup:
  1. The hypervisor exposes the current dirty bitmap.
  2. The host-side pipeline iterates the dirty bits — and only the dirty bits.
  3. For each dirty block, the data is read, chunked, compressed, encrypted, and transmitted to the datastore.
  4. After the snapshot is committed, the bitmap is reset so the next incremental captures only post-backup writes.
Clean blocks (those not marked dirty) are never read. This is the core performance contract of CBT.
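The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the IBS implementation: the function names, the least-significant-bit-first bit order, and the use of a seekable file object in place of a virtual disk are all assumptions made for the example.

```python
import io

BLOCK_SIZE = 64 * 1024  # one bit per 64 KiB block, the typical granularity given on this page


def dirty_block_indexes(bitmap: bytes, total_blocks: int):
    """Yield the index of every block whose bit is set (assumed LSB-first bit order)."""
    for index in range(total_blocks):
        byte, bit = divmod(index, 8)
        if bitmap[byte] & (1 << bit):
            yield index


def read_dirty_blocks(disk, bitmap: bytes, total_blocks: int):
    """Read only the dirty blocks from a seekable disk image; clean blocks are never read."""
    for index in dirty_block_indexes(bitmap, total_blocks):
        disk.seek(index * BLOCK_SIZE)
        yield index, disk.read(BLOCK_SIZE)


def reset_bitmap(total_blocks: int) -> bytes:
    """Return an all-clean bitmap, as would be installed after a snapshot is committed."""
    return bytes((total_blocks + 7) // 8)


# Hypothetical four-block disk in which blocks 0 and 2 were written since the last backup.
disk = io.BytesIO(b"".join(bytes([n]) * BLOCK_SIZE for n in range(4)))
bitmap = bytes([0b00000101])
for index, data in read_dirty_blocks(disk, bitmap, total_blocks=4):
    print(index, len(data))
```

In a real pipeline each block yielded here would be handed to the chunking, compression, and encryption stages, and the reset would happen only after the commit succeeds.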
| Property | Behaviour |
| --- | --- |
| Tracking granularity | Per-block (typically 64 KiB per bit) |
| Persistence | Bitmap survives guest reboot |
| Reset trigger | Successful backup commit |
| Bitmap size | ~2 MiB per 1 TiB of disk capacity at 64 KiB granularity (negligible memory cost) |
| Disk coverage | All virtual disks attached to the VM at backup time |
| Failure handling | A corrupted or lost bitmap falls back to a full read; the next backup is implicitly a full backup |

Why CBT Matters

| Scenario | Without CBT | With CBT |
| --- | --- | --- |
| 1 TB VM, 1% daily change | Read 1 TB every day | Read 10 GB |
| 5 TB VM, 0.5% daily change | Read 5 TB every day | Read 25 GB |
| 10 TB database VM, 2% daily change | Read 10 TB every day | Read 200 GB |
For the production reference fleet (200 VMs × 100 GB × 2% daily change), CBT reduces daily backup IO from 20 TB to ~400 GB — a 50x reduction that makes the standard nightly backup window practical.
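The figures in the table and the fleet estimate follow from a single multiplication. The helper below is only a worked example of that arithmetic; the function name and its parameters are hypothetical, and it assumes decimal units (1 TB = 1000 GB).

```python
def daily_read_gb(vm_count: int, vm_size_gb: float, daily_change_percent: float):
    """Return (GB read per day without CBT, GB read per day with CBT)."""
    without_cbt = vm_count * vm_size_gb
    with_cbt = without_cbt * daily_change_percent / 100
    return without_cbt, with_cbt


# Reference fleet from the text: 200 VMs x 100 GB x 2% daily change.
without_cbt, with_cbt = daily_read_gb(vm_count=200, vm_size_gb=100, daily_change_percent=2)
print(without_cbt, with_cbt, without_cbt / with_cbt)
```

For the reference fleet this gives 20,000 GB (20 TB) without CBT and 400 GB with CBT, which is the 50x reduction quoted above.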
The first backup of any new VM has no bitmap history — it reads every block once to establish a baseline (effectively a full backup). All subsequent backups use CBT and read only changed blocks.
CBT applies to VM and container backups where the hypervisor owns the storage. For physical host backups via the ironcore-backup-client package, IBS uses an equivalent file-level change detection approach — comparing inode metadata and content fingerprints against the previous backup’s manifest. The net effect is the same: only changed data is transmitted.
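A rough sketch of file-level change detection is shown below. The manifest layout (a dictionary keyed by path), the field names, and the helper functions are assumptions for illustration; the actual ironcore-backup-client manifest format is not described on this page.

```python
import hashlib
import os


def content_fingerprint(path: str) -> str:
    """SHA-256 of the file content, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def manifest_entry(path: str) -> dict:
    """Record the inode metadata and content fingerprint for one file (hypothetical layout)."""
    stat = os.stat(path)
    return {
        "inode": stat.st_ino,
        "size": stat.st_size,
        "mtime_ns": stat.st_mtime_ns,
        "sha256": content_fingerprint(path),
    }


def changed_files(paths, previous_manifest: dict) -> list:
    """Return the paths whose content differs from the previous backup's manifest."""
    changed = []
    for path in paths:
        old = previous_manifest.get(path)
        if old is not None:
            stat = os.stat(path)
            same_meta = (old["inode"], old["size"], old["mtime_ns"]) == (
                stat.st_ino, stat.st_size, stat.st_mtime_ns)
            if same_meta:
                continue  # cheap path: metadata unchanged, skip without reading the file
            if old["sha256"] == content_fingerprint(path):
                continue  # metadata changed (for example a touch) but content is identical
        changed.append(path)
    return changed
```

Checking metadata first keeps the common case cheap: a file is only re-read and hashed when its inode, size, or modification time no longer matches the manifest.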

Content-Defined Chunking

Backup data is split into variable-sized chunks using a rolling hash. Chunk boundaries are determined by content fingerprints, not fixed offsets — this means inserting a byte in the middle of a file does not shift every subsequent chunk boundary, so deduplication remains effective.
| Chunk parameter | Default |
| --- | --- |
| Minimum size | 512 KiB |
| Average size | 4 MiB |
| Maximum size | 16 MiB |
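The idea can be illustrated with a gear-style rolling hash. This page does not say which rolling hash IBS uses, so the gear table, the boundary test, and the function name below are assumptions; only the three size defaults come from the table. The sketch requires the average size to be a power of two, and because the boundary test only starts after the minimum size, actual chunks average somewhat larger than the nominal value.

```python
import random

# Hypothetical gear table: 256 pseudo-random 32-bit values, seeded so results are repeatable.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def split_chunks(data: bytes,
                 min_size: int = 512 * 1024,
                 avg_size: int = 4 * 1024 * 1024,
                 max_size: int = 16 * 1024 * 1024) -> list:
    """Split data into variable-sized chunks whose boundaries depend on content."""
    if avg_size & (avg_size - 1):
        raise ValueError("avg_size must be a power of two in this sketch")
    mask = avg_size - 1
    chunks = []
    start = 0
    rolling = 0
    for pos, byte in enumerate(data):
        # Gear hash: older bytes are shifted out, so the low bits depend only on recent bytes.
        rolling = ((rolling << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = pos - start + 1
        if length < min_size:
            continue
        if (rolling & mask) == 0 or length >= max_size:
            chunks.append(data[start:pos + 1])
            start = pos + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk may be shorter than min_size
    return chunks
```

Because a boundary decision depends only on the bytes just before it, an insertion disturbs the chunk it lands in and typically a few neighbours, after which the boundaries line up with the old ones again. Fixed-offset chunking would instead shift every later chunk.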

Zstandard Compression

Per-chunk Zstandard compression precedes encryption. Typical compression ratios on mixed workloads are between 2x and 4x. Throughput on modern x86_64 hardware exceeds 1 GiB/s per core.
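The ordering matters because well-encrypted data is statistically close to random and leaves a compressor nothing to work with. The snippet below illustrates this with stand-ins: it uses `zlib` from the Python standard library instead of Zstandard, and `os.urandom` output in place of real ciphertext. Both substitutions are assumptions made to keep the example dependency-free, and the payload is artificial.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


plaintext = b"backup block payload " * 50_000      # repetitive, highly compressible
ciphertext_like = os.urandom(len(plaintext))       # stand-in for encrypted output

print(f"plaintext ratio:       {compression_ratio(plaintext):.2f}")
print(f"ciphertext-like ratio: {compression_ratio(ciphertext_like):.2f}")
```

The repetitive payload compresses far better than real mixed workloads would, so the first number is not comparable to the 2x to 4x range quoted above. The point is the second number, which stays at about 1.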

AES-256-GCM Encryption

Each chunk is encrypted with AES-256 in Galois/Counter Mode using the project encryption key. GCM provides authenticated encryption: any tampering with the ciphertext fails the GCM authentication tag during decryption.
| Property | Value |
| --- | --- |
| Cipher | AES-256 |
| Mode | GCM (authenticated) |
| IV | 96-bit per-chunk random |
| Auth tag | 128-bit |
| Key handling | Client-side only; the server never sees the key |
See Security and Encryption for key lifecycle, master keys, and paperkey recovery.

Server-Side Storage Model

The backup server stores data in a chunk-addressed datastore. Every chunk is named after its post-encryption SHA-256 fingerprint. Identical chunks always produce the same fingerprint, so writing a duplicate is detected and skipped.
| Component | Description |
| --- | --- |
| Manifest | Per-snapshot signed metadata: owner, timestamp, source, key fingerprint |
| Chunk Index | Ordered list of chunk SHA-256s that reconstruct the source |
| Chunk | Encrypted, compressed, content-addressed data blob |
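One way to picture how the three components relate is the in-memory model below. It is a sketch under stated assumptions: the class and field names are hypothetical, manifest signing is omitted, and the blobs are treated as opaque bytes that have already been compressed and encrypted on the host.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    """Per-snapshot metadata (signature omitted in this sketch)."""
    owner: str
    timestamp: str
    source: str
    key_fingerprint: str


@dataclass
class Snapshot:
    manifest: Manifest
    chunk_index: list  # ordered SHA-256 hex digests that reconstruct the source


class ChunkStore:
    """Chunk-addressed store: every blob is filed under the SHA-256 of its stored bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, blob: bytes) -> str:
        digest = hashlib.sha256(blob).hexdigest()
        self._blobs.setdefault(digest, blob)  # an existing blob is never overwritten
        return digest

    def get(self, digest: str) -> bytes:
        blob = self._blobs[digest]
        if hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError(f"chunk {digest} failed verification")
        return blob

    def __len__(self) -> int:
        return len(self._blobs)


def store_snapshot(store: ChunkStore, manifest: Manifest, blobs) -> Snapshot:
    return Snapshot(manifest, [store.put(blob) for blob in blobs])


def read_snapshot(store: ChunkStore, snapshot: Snapshot) -> list:
    """Fetch the blobs in index order; a real restore would then decrypt and decompress."""
    return [store.get(digest) for digest in snapshot.chunk_index]
```

Because the name of a chunk is derived from its stored bytes, the server can verify a chunk at any time without holding the encryption key.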

Deduplication

When the client offers a chunk, the server checks whether the fingerprint already exists in the datastore. If so, the client uploads nothing — the snapshot manifest simply references the existing chunk. This applies across all sources sharing the datastore.
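The offer-then-upload exchange might look like the following sketch. The `DatastoreServer` class, its `missing` and `upload` methods, and the `backup` function are hypothetical names for illustration; the real IBS wire protocol is not documented on this page.

```python
import hashlib


def fingerprint(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()


class DatastoreServer:
    """Toy server that stores chunks by fingerprint."""

    def __init__(self):
        self.chunks = {}

    def missing(self, fingerprints) -> set:
        """Answer an offer: which of these fingerprints does the datastore not hold yet?"""
        return {fp for fp in fingerprints if fp not in self.chunks}

    def upload(self, fp: str, blob: bytes) -> None:
        if fingerprint(blob) != fp:
            raise ValueError("uploaded blob does not match its fingerprint")
        self.chunks.setdefault(fp, blob)


def backup(server: DatastoreServer, blobs):
    """Offer every chunk, upload only the missing ones, return (chunk index, upload count)."""
    index = [fingerprint(blob) for blob in blobs]
    by_fingerprint = dict(zip(index, blobs))
    wanted = server.missing(index)
    for fp in wanted:
        server.upload(fp, by_fingerprint[fp])
    return index, len(wanted)


server = DatastoreServer()
print(backup(server, [b"a", b"b", b"a"])[1])  # first source: two distinct chunks uploaded
print(backup(server, [b"b", b"c"])[1])        # second source: only b"c" is new
```

The second call uploads a single chunk even though it comes from a different source, which is the cross-source behaviour described above.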

Append-Only Semantics

Existing chunks are never modified. New snapshots add new chunks; pruning removes snapshot manifests; garbage collection reclaims chunks no manifest references. This append-only model is critical for ransomware protection — see Security and Encryption.
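The split between pruning and garbage collection can be sketched as a mark-and-sweep pass. The dictionary layout (snapshot id to chunk index, fingerprint to blob) and the function names are assumptions for illustration only.

```python
def prune(snapshots: dict, snapshot_id: str) -> None:
    """Pruning removes only the snapshot's manifest and index; chunk data is untouched."""
    snapshots.pop(snapshot_id, None)


def garbage_collect(chunks: dict, snapshots: dict) -> set:
    """Delete chunks that no remaining snapshot references and return their fingerprints."""
    # Mark: every fingerprint still listed in some snapshot's chunk index.
    referenced = {fp for index in snapshots.values() for fp in index}
    # Sweep: anything in the store that was not marked is orphaned.
    orphaned = set(chunks.keys() - referenced)
    for fp in orphaned:
        del chunks[fp]
    return orphaned


chunks = {"c1": b"...", "c2": b"...", "c3": b"..."}
snapshots = {"monday": ["c1", "c2"], "tuesday": ["c2", "c3"]}

prune(snapshots, "monday")
print(garbage_collect(chunks, snapshots))  # only c1 is orphaned; c2 is still used by tuesday
```

Keeping the two steps separate means that deleting a snapshot never rewrites or removes a chunk another snapshot still depends on.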

Datastore Backends

| Backend | Use Case | Notes |
| --- | --- | --- |
| Local filesystem | Primary DC fast-access datastore | XFS or ZFS on flash or hybrid array |
| Replicated filesystem | Two-node clustered datastore | ZFS replication or shared filesystem |
| S3-compatible object storage | Archive or cloud backend | Native S3 API; reduces hardware footprint |
| Tape library | Long-term compliance archival | LTO-5 and newer; barcoded catalog |
See Datastores for provisioning detail and capacity planning.

Replication Topology

Replication mirrors snapshots from one datastore to another over an encrypted TLS 1.3 channel. The default topology pairs a Primary DC datastore with a Backup site datastore.
| Property | Behaviour |
| --- | --- |
| Transport | TLS 1.3 |
| Direction | Push (default) or pull |
| Frequency | Configurable; typically weekly for archival |
| Bandwidth | Throttled per-job via --bwlimit |
| Chunk transfer | Only chunks not present on the destination |
| Integrity verification | SHA-256 after each sync |
See Replication and Sync.
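The chunk transfer and integrity verification rows can be illustrated with a short push-style sync. This is a sketch only: datastores are modelled as plain dictionaries from SHA-256 hex digest to blob, the `replicate` function is a hypothetical name, and transport, throttling, and snapshot metadata are left out.

```python
import hashlib


def replicate(source: dict, destination: dict) -> list:
    """Copy chunks missing on the destination, then verify each one by SHA-256."""
    missing = sorted(source.keys() - destination.keys())
    for fp in missing:
        destination[fp] = source[fp]
    # Post-sync verification: re-hash what actually landed on the destination.
    corrupt = [fp for fp in missing
               if hashlib.sha256(destination[fp]).hexdigest() != fp]
    if corrupt:
        raise ValueError(f"{len(corrupt)} chunk(s) failed verification after sync")
    return missing
```

Running the same sync twice transfers nothing the second time, because every fingerprint is already present on the destination.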

Compliance Mapping

The following table maps internal architecture features to compliance requirements.
| Requirement | Architectural Mechanism |
| --- | --- |
| Changed Block Tracking (CBT) | Hypervisor dirty bitmaps; per-block tracking with persistent state across guest reboots |
| Full + incremental backup | Snapshot model with chunk references |
| File-level restore | File archive format supports random access |
| VM-level restore + live-restore | Chunk index streams blocks on demand |
| Replication encryption | TLS 1.3 transport with per-session keys |
| At-rest encryption | AES-256-GCM at the chunk level |
| Integrity verification | SHA-256 per chunk with GCM auth tags |
| Deduplication | Content-defined chunking + chunk-addressed store |
| Compression | Zstandard per chunk |
| Ransomware protection | Append-only chunks, role-restricted prune |
| Mock recovery drill | Restore-test job from the Backup site datastore |

Next Steps

  • Datastores: provision local, replicated, and object storage datastores
  • Retention Policies: configure daily, weekly, monthly, and yearly retention windows
  • Security and Encryption: encryption key lifecycle and ransomware protection model
  • Infrastructure Sizing: plan capacity for Primary DC and Backup site