# Verification and Validation

> Schedule SHA-256 verification jobs, run integrity checks on backups, and conduct bi-annual mock recovery drills for Ironcore Backup Solution.

## Overview

A backup that cannot be restored is no backup at all. Ironcore Backup Solution (IBS) operates two complementary safeguards:

* **Verification**: periodic SHA-256 integrity checks on every chunk in the datastore to detect bit rot, media failure, or tampering.
* **Mock Recovery Drills**: bi-annual end-to-end restore exercises from the Backup site to confirm full recoverability of the data and the playbook.

This page covers the schedule, automation, and operational procedure for both.

**Prerequisites**

* Administrator role on the Polystack platform
* At least one datastore with active backups
* For mock drills: dedicated compute and storage resources at the Backup site

***

## Verification

### What Verification Checks

For every snapshot in scope:

1. **Chunk presence**: every chunk referenced by the manifest exists in the datastore.
2. **Chunk SHA-256**: the recomputed hash matches the manifest reference.
3. **GCM auth tag**: the AES-256-GCM tag validates the ciphertext is unmodified.
4. **Manifest signature**: the manifest signature is valid.

Any failure marks the snapshot as **CORRUPT** in the verification report and emits a notification.

### Schedule a Verification Job

Navigate to **Backup Solution** > **Datastores** > *(select datastore)* > **Verification**.

Set:

* **Schedule**: `Sun 06:00` (weekly is the default)
* **Scope**: All snapshots, or filter by namespace / age
* **Skip recently verified**: Skip snapshots verified within the last `keep-verified` window (default: 30 days) to limit IO load

Attach the operational notification group.
Failures dispatch immediately; successful runs dispatch on the configured success policy.

Click **Save**. The verification job runs on the next scheduled tick and produces a report visible under the snapshot Verification tab.

```bash title="Configure datastore verification"
ironcore-backup datastore update \
  --name ibs-primary \
  --verify-schedule "Sun 06:00" \
  --verify-window-days 30 \
  --verify-notification ops-team
```

```bash title="Trigger one-shot verification of a datastore"
ironcore-backup verify start ibs-primary
```

```bash title="Verify a single snapshot"
ironcore-backup verify start \
  ibs-primary:vm/12345/2026-05-21T00:00:00Z
```

### Verification Report

| Field              | Description                            |
| ------------------ | -------------------------------------- |
| **Snapshot**       | The snapshot under verification        |
| **Started**        | When verification began                |
| **Duration**       | Elapsed verification time              |
| **Chunks checked** | Number of chunks read and validated    |
| **Result**         | `OK`, `CORRUPT`, `MISSING`, or `STALE` |
| **First failure**  | First chunk fingerprint and reason     |
| **Verifier**       | Job ID that produced the report        |

A snapshot status flows through:

```mermaid
stateDiagram-v2
    [*] --> Unverified: Snapshot created
    Unverified --> OK: First verification passes
    OK --> Stale: Verification window expired
    Stale --> OK: Re-verified
    Stale --> CORRUPT: Re-verification fails
    OK --> CORRUPT: Verification detects bit rot or tampering
    CORRUPT --> [*]: Operator restores from Backup site
```

### Limit Verification Load

Verification reads every chunk, which is IO-intensive.
Tune for production:

| Setting                      | Default   | Effect                                |
| ---------------------------- | --------- | ------------------------------------- |
| `--verify-window-days`       | 30        | Skip snapshots verified within N days |
| `--max-concurrent-snapshots` | 4         | Parallel verification work            |
| `--bwlimit`                  | unlimited | Cap read bandwidth                    |
| `--io-priority`              | normal    | Lower IO priority during verification |

For very large datastores, configure verification to run hourly with a short window (`keep-last=20`); this gradually verifies every snapshot over a rolling period without a heavy weekly spike.

***

## Mock Recovery Drills

A mock drill is a scheduled, end-to-end recovery exercise using backup data from the Backup site. The drill validates the recovery playbook, IBS data, restored application correctness, and operator readiness, without touching the production environment.

### Why Bi-Annual?

Bi-annual drills satisfy common compliance frameworks (banking, government infrastructure, telecom). Two drills per year balance preparedness against operator burden. Higher-stakes environments may drill quarterly.

### Drill Roles and Resources

| Role                  | Responsibility                                                      |
| --------------------- | ------------------------------------------------------------------- |
| **Drill coordinator** | Plans the drill, sets success criteria, captures findings           |
| **Backup admin**      | Provides Backup site access, encryption keys, restore configuration |
| **Workload owner**    | Validates the restored workload functions correctly                 |
| **Auditor**           | Records timings, deviations, and outcomes                           |

Required resources at the Backup site:

* Compute resources matching the largest workload to be drilled
* Storage capacity for rehydration of archived backup data
* Network connectivity between the Backup site and the restored workload's test network

### Drill Workflow

Schedule the drill at least 4 weeks in advance. Announce to all participating teams and the executive sponsor.
Drills run twice per year on a recurring calendar.

Select a representative sample:

* One Tier-0 production VM
* One Tier-1 production VM
* One container
* One physical host backup
* One database with point-in-time recovery requirements

On the Backup site, provision compute hosts and a target storage backend for the drill workloads. Use a dedicated test network; never connect drill workloads to the production network.

For each selected workload:

1. Identify the most recent weekly archival snapshot in the Backup site datastore.
2. Restore using the Backup Solution Dashboard or CLI.
3. Record start time, time to first guest availability (for live-restore), and time to full restore completion.

```bash title="Drill restore example"
ironcore-backup vm restore \
  --snapshot ibs-archival:production/vm/12345/2026-05-19T03:00:00Z \
  --target-host drill-compute-01 \
  --storage drill-storage \
  --rename drill-vm-12345-2026-05-21 \
  --live
```

The workload owner runs a defined functional test:

* Service starts
* Critical processes are running
* Application responds to its standard health check
* Database connectivity, replication state, and integrity all healthy

Record any deviations from production behaviour.

From the same snapshot, restore an individual file and a directory archive. Confirm content, permissions, and modification times match expectations.

De-provision drill infrastructure. Confirm no drill data remains accessible to production users.

Produce a drill report covering:

* Snapshots restored, with timings
* Workloads validated, with pass / fail status
* Deviations from the recovery playbook
* Issues found in the playbook, infrastructure, or backup data
* Action items with owners and deadlines

File the report in the compliance documentation system.
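
The per-step timings and pass / fail statuses that feed the drill report could be captured with a small wrapper along these lines. This is a minimal sketch, not part of the Ironcore CLI: the `record_step` function, the `DRILL_REPORT` variable, and the CSV layout are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical drill helper (an assumption, not an Ironcore tool):
# runs one drill step, times it, and appends a CSV row to a report file.

# Report location; override by exporting DRILL_REPORT before running.
DRILL_REPORT="${DRILL_REPORT:-./drill-report.csv}"

# record_step LABEL COMMAND [ARGS...]
# Appends "label,start_utc,duration_seconds,status" to $DRILL_REPORT.
# Labels should not contain commas, since no CSV quoting is applied.
record_step() {
  local label="$1"
  shift
  local start_epoch start_utc end_epoch status

  start_epoch=$(date +%s)
  start_utc=$(date -u +%Y-%m-%dT%H:%M:%SZ)

  # Run the wrapped command and translate its exit code into pass / fail.
  if "$@"; then
    status="pass"
  else
    status="fail"
  fi

  end_epoch=$(date +%s)
  printf '%s,%s,%s,%s\n' \
    "$label" "$start_utc" "$((end_epoch - start_epoch))" "$status" \
    >> "$DRILL_REPORT"

  # Return non-zero when the step failed so callers can react.
  [ "$status" = "pass" ]
}

# Example (commented out): wrap the restore command from the drill restore example.
# record_step restore-vm-12345 ironcore-backup vm restore --snapshot <snapshot> --live
```

The wrapper only measures how long the wrapped command takes, in whole seconds. Time to first guest availability for a live-restore would still need to be observed and recorded separately.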
### Drill Success Criteria

| Criterion                    | Threshold                                         |
| ---------------------------- | ------------------------------------------------- |
| Restore RPO                  | Most recent weekly archival snapshot ≤ 7 days old |
| Restore RTO                  | Recovery completes within the published target    |
| Live-restore time to running | ≤ 5 seconds for any tier                          |
| Application validation       | All functional tests pass                         |
| File-level restore           | Single-file restore completes in ≤ 1 minute       |
| Playbook accuracy            | No critical deviations from documented procedure  |

A failed drill is a critical finding. Any defect it exposes in the restoration playbooks, backup data, or infrastructure must be remediated before the next production change window. File a formal incident if the failure indicates data loss.

### Automate the Drill Restore Step

The restore portion of a drill can be automated via a restore-test job that periodically restores a sample workload to a test environment. The output is compared against a known-good baseline.

```bash title="Automated weekly restore-test job"
ironcore-backup restore-test create \
  --name weekly-vm-restore-test \
  --schedule "Mon 09:00" \
  --snapshot-filter "namespace:production,backup-id:vm/12345" \
  --target-host drill-compute-01 \
  --target-storage drill-storage \
  --rename "test-{{snapshot-id}}" \
  --validate-script /etc/ironcore/drill-validate.sh \
  --teardown
```

The restore-test job:

1. Picks the latest matching snapshot
2. Restores to the test target
3. Runs the validation script and records exit code
4. Tears down the test target
5. Emits a notification with the result

***

## Compliance Reporting

The verification and drill systems produce evidence suitable for compliance audits:

| Evidence                          | Source                             |
| --------------------------------- | ---------------------------------- |
| Per-snapshot verification log     | Snapshot detail > Verification tab |
| Periodic verification job history | Tasks panel filtered by `verify`   |
| Mock drill reports                | Filed in compliance documentation  |
| Encryption key rotation history   | Audit log                          |
| Access grant history              | Audit log                          |
| Replication completion log        | Tasks panel filtered by `sync`     |

Export evidence:

Open **Audit Log** > **Export**. Choose the time range and the categories to export. The export is delivered as CSV and JSON.

```bash
ironcore-backup audit export \
  --since 2026-01-01 \
  --until 2026-06-30 \
  --categories verify,sync,prune,access \
  --output /tmp/audit-2026-h1.tar.gz
```

***

## Troubleshooting

* Reduce scope (raise `--verify-window-days` so that more recently verified snapshots are skipped), increase parallelism, or split verification into multiple jobs targeting different namespaces.
* The Backup site typically has fewer or slower storage tiers than Primary. Drill targets should be provisioned to match the expected disaster-recovery capacity; undersized drill infrastructure can give a false negative.
* Verification is not running. Confirm the schedule and check the task history. If the worker is offline, restart `ironcore-backup-verifier`.
* The validation script may be tied to production-specific paths or hostnames. Maintain drill-specific overrides in the validate script.

***

## Next Steps

* Backup site replication required for drills
* Encryption keys required to restore during a drill
* Alert routing for verification and drill failures
* Underlying integrity and append-only design