Secrets Management in 2026: Building Zero-Trust Credential Infrastructure

Every production system has secrets: database passwords, API keys, TLS certificates, signing keys, OAuth credentials. How those secrets are managed — stored, accessed, rotated, and audited — is one of the highest-leverage security controls an engineering organization can get right. It's also one of the most consistently underengineered, because "it works" and "it's secure" are very different standards.

In 2026, the threat model for secrets has evolved. Supply chain attacks, AI-assisted scanning of code for leaked credentials, and adversaries who know that credentials are the fastest path into production systems have all raised the cost of getting this wrong.

Here's the architecture that modern engineering organizations should be building toward, and the practical path from where most teams actually are.

The Problem with How Most Teams Store Secrets Today

The most common secrets management approaches, in roughly ascending order of maturity:

Hardcoded in source code. Still happens, despite being definitively wrong. A database password in a config file committed to a repository creates a permanent record of the credential. Even if rotated immediately after discovery, the credential exists in git history.

Environment variables, set manually. The step up from hardcoded. Variables are injected at runtime, not committed to source. The problems: no audit trail of who changed what, rotation requires redeployment, secrets aren't encrypted at rest in most hosting environments, and "set manually" doesn't scale — someone has to know all the secrets to provision a new environment.

AWS Secrets Manager / GCP Secret Manager / Azure Key Vault. Managed secrets storage with audit logging, rotation support, and IAM-based access control. This is the right approach for cloud-native workloads. The secret lives in the vault, your service fetches it at startup or uses a rotation hook, and access is tied to the service's IAM identity rather than a shared credential.
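As a rough illustration of the fetch-at-startup pattern, here is a minimal sketch assuming AWS Secrets Manager and a secret stored as JSON with `username` and `password` fields. The function name and secret layout are assumptions for illustration. The client is passed in rather than created inside the function; in a real service it would typically be `boto3.client("secretsmanager")`, which authenticates with the workload's IAM identity.

```python
import json
from typing import Any


def load_db_secret(sm_client: Any, secret_id: str) -> dict:
    """Fetch and parse a JSON secret through a Secrets Manager client.

    `sm_client` is expected to behave like boto3.client("secretsmanager").
    The username/password field names are an assumed secret layout.
    """
    response = sm_client.get_secret_value(SecretId=secret_id)
    # Text secrets arrive in SecretString; binary secrets use SecretBinary.
    secret = json.loads(response["SecretString"])
    return {"username": secret["username"], "password": secret["password"]}


# Typical wiring at service startup (needs boto3 and an IAM role that allows
# secretsmanager:GetSecretValue on this one secret):
#
#   import boto3
#   creds = load_db_secret(boto3.client("secretsmanager"),
#                          "production/my-service/db")
```

Passing the client in keeps the credential source out of the function and makes the lookup easy to exercise with a stub.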

HashiCorp Vault. A dedicated secrets management platform that works across clouds and on-premises. More operational overhead than managed cloud secrets services, but provides capabilities they don't: dynamic secrets (just-in-time credentials generated for each workload), certificate issuance through its PKI engine, secrets leasing with TTL enforcement, and a unified policy engine across all secret types.

The Zero-Trust Credential Model

The shift from static to dynamic credentials is the most important trend in secrets management. The conventional model: create a database user, store its password in Secrets Manager, rotate it every 90 days. The zero-trust model: when a service needs database access, it requests a credential from Vault (or an equivalent), receives a short-lived credential valid for 15 minutes, uses it, and the credential expires automatically.

Why dynamic credentials matter:

Rotation is automatic and continuous. A credential that expires in 15 minutes doesn't need to be "rotated" on a schedule; every new issuance is effectively a rotation. This eliminates the category of incidents caused by long-lived credentials that were never rotated.

Blast radius of compromise is bounded. A stolen 15-minute database credential is useless to an attacker after 15 minutes. A stolen 90-day password is usable for 90 days.

Audit trail is per-credential, not per-rotation. Every credential issuance is logged. You can see exactly which workload accessed which database, when, and for how long.

Here's the Vault dynamic credential pattern for database access:

# Service startup: request dynamic database credential
import os

import boto3
import hvac

def get_database_credential() -> dict:
    """Fetch a short-lived database credential from Vault."""
    client = hvac.Client(url=os.environ['VAULT_ADDR'])

    # Authenticate using the service's cloud identity (AWS IAM, GCP SA, etc.)
    # No static Vault token needed. boto3 resolves the workload's own AWS
    # credentials (instance profile, task role, IRSA, or environment variables),
    # so no long-lived access key has to be provisioned for this call.
    aws_creds = boto3.Session().get_credentials().get_frozen_credentials()
    client.auth.aws.iam_login(
        access_key=aws_creds.access_key,
        secret_key=aws_creds.secret_key,
        session_token=aws_creds.token,
        role='my-service-role'
    )

    # Request a dynamic database credential
    # Vault creates a temporary user in the database with a TTL
    db_creds = client.secrets.database.generate_credentials(
        name='my-postgres-role'  # Vault role → database permissions mapping
    )

    return {
        'username': db_creds['data']['username'],
        'password': db_creds['data']['password'],
        'lease_id': db_creds['lease_id'],
        'lease_duration': db_creds['lease_duration']
    }

# The credential is valid for `lease_duration` seconds
# Renew before expiry if the service is long-running
# On shutdown, explicitly revoke the lease to clean up the database user

The database never has a permanent "app user" — Vault creates and destroys users on demand, scoped to the specific permissions the service needs.
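The renew-before-expiry and revoke-on-shutdown steps can be sketched roughly as follows. This assumes an already authenticated `hvac.Client` is passed in and uses its `sys.renew_lease` and `sys.revoke_lease` calls. The helper names and the two-thirds renewal fraction are illustrative choices, not something Vault prescribes.

```python
from typing import Any


def seconds_until_renewal(lease_duration: int, fraction: float = 2 / 3) -> float:
    """How long to wait before renewing: a fixed fraction of the lease TTL."""
    if lease_duration <= 0:
        raise ValueError("lease_duration must be positive")
    return lease_duration * fraction


def renew_lease(vault_client: Any, lease_id: str, increment: int) -> int:
    """Ask Vault to extend the lease and return the granted duration in seconds.

    `vault_client` is expected to behave like an authenticated hvac.Client.
    Vault may grant less than `increment` when the role's maximum TTL is near.
    """
    response = vault_client.sys.renew_lease(lease_id=lease_id, increment=increment)
    return response["lease_duration"]


def revoke_lease(vault_client: Any, lease_id: str) -> None:
    """Revoke on shutdown so the temporary database user is dropped right away."""
    vault_client.sys.revoke_lease(lease_id=lease_id)
```

A long-running service would sleep for `seconds_until_renewal(lease_duration)`, call `renew_lease`, and fall back to requesting a fresh credential once renewal stops extending the lease.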

Secrets in Kubernetes

Kubernetes Secret objects are base64-encoded, not encrypted. By default they are stored unencrypted in etcd unless encryption at rest is configured, and they are readable by anyone with the right RBAC permissions. Running workloads that need secrets in Kubernetes requires additional controls.
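To make the base64 point concrete, here is a small sketch showing that a value from the `data` map of a Secret manifest can be recovered with nothing but the standard library. The sample value is made up for illustration.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode one value from the `data` map of a Kubernetes Secret manifest."""
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical value, encoded the same way it appears in a manifest.
manifest_value = base64.b64encode(b"hunter2").decode("ascii")
recovered = decode_secret_value(manifest_value)
```

No key is involved at any point, which is why read access to a Secret object should be treated as access to the plaintext.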

Sealed Secrets (Bitnami). Encrypt secrets with a cluster-specific key before committing them to git. Sealed Secrets are safe to commit to source control because they can only be decrypted by the cluster that holds the corresponding private key. The workflow: develop locally with real secrets (never committed), seal them for specific clusters before committing.

External Secrets Operator. Bridges Kubernetes with external secrets backends (Vault, AWS Secrets Manager, GCP Secret Manager). You define an ExternalSecret resource that references a secret in your external store; the operator fetches the secret and creates a Kubernetes Secret object. Rotation is handled by the operator polling the external store and updating the Kubernetes Secret when the value changes.

# ExternalSecret definition — references AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1  # newer operator releases use external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 15m  # Poll for updated secret every 15 minutes
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: database-credentials  # Creates this Kubernetes Secret
    creationPolicy: Owner
  data:
  - secretKey: DB_PASSWORD
    remoteRef:
      key: production/my-service/db
      property: password
  - secretKey: DB_USERNAME
    remoteRef:
      key: production/my-service/db
      property: username

Vault Agent Injector. Runs a Vault Agent sidecar alongside your application container. The agent authenticates to Vault, fetches secrets, and writes them to a shared in-memory volume that the application reads from. No Vault SDK required in the application code — the secret appears as a file on the filesystem.
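On the application side, reading an injected secret can be as simple as the sketch below. It assumes the agent renders the secret as `KEY=VALUE` lines, which depends on the template you configure, and that the file sits under the injector's default `/vault/secrets` directory. The file name `db-creds` is hypothetical.

```python
from pathlib import Path


def read_secret_file(path: Path) -> dict:
    """Parse KEY=VALUE lines from a secret file rendered by the agent.

    The KEY=VALUE layout is an assumption about the configured template.
    """
    values = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Typical use inside the application container (hypothetical file name):
#
#   creds = read_secret_file(Path("/vault/secrets/db-creds"))
```

Because the agent rewrites the file when the secret is renewed or replaced, a long-running process should re-read it rather than cache the first value forever.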

Secret Detection in CI/CD Pipelines

Static secrets committed to repositories are a persistent problem despite years of awareness. The tools that help:

Pre-commit hooks. Run secret detection before code is committed. detect-secrets, gitleaks, and TruffleHog are commonly used. These catch obvious patterns (AWS key formats, private key headers, high-entropy strings) before they reach the remote.
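The three kinds of pattern mentioned above can be sketched in a few lines. This is a simplified illustration of the idea, not how any of the named tools is implemented; the regexes, the 20-character minimum, and the 4.5-bit entropy threshold are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Access key IDs start with AKIA (long-lived) or ASIA (temporary).
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
PRIVATE_KEY_HEADER = re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")
# Long runs of base64-like characters are candidates for the entropy check.
CANDIDATE_TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(text: str) -> float:
    """Bits per character, based on the character frequencies in `text`."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_line(line: str, entropy_threshold: float = 4.5) -> list:
    """Return the names of the detectors that fire on one line of text."""
    findings = []
    if AWS_ACCESS_KEY_ID.search(line):
        findings.append("aws-access-key-id")
    if PRIVATE_KEY_HEADER.search(line):
        findings.append("private-key-header")
    for token in CANDIDATE_TOKEN.findall(line):
        if shannon_entropy(token) >= entropy_threshold:
            findings.append("high-entropy-string")
            break
    return findings
```

Format-based patterns are precise but narrow, while the entropy check is broad but noisy, which is one reason real scanners combine both and still produce false positives.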

CI/CD pipeline scanning. Scan every pull request and commit with a secrets detection tool. TruffleHog's git scanning mode traverses the entire commit history, which is useful for detecting secrets committed in the past and later "deleted" (they're still in history).

Real-time repository scanning. GitHub's secret scanning, GitLab's secret detection, and dedicated platforms like GitGuardian scan repositories continuously and alert when new secrets are detected. For private repositories, this catches cases where the pre-commit hook was bypassed or disabled.

Credential validity checking. Not all detected strings are valid credentials. Tools that validate detected credentials against the relevant APIs (does this AWS key actually work?) reduce false positive fatigue. A detected but invalid credential is a lower priority than a detected and valid one.
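For AWS keys, one way to sketch a validity check is a call to STS `GetCallerIdentity`, which succeeds for any live key pair regardless of the permissions attached to it. The function below is an illustration under that assumption; the client factory is injected, and in practice it could be `lambda ak, sk: boto3.client("sts", aws_access_key_id=ak, aws_secret_access_key=sk)`.

```python
from typing import Any, Callable, Optional


def check_aws_key(
    access_key: str,
    secret_key: str,
    sts_client_factory: Callable[[str, str], Any],
) -> Optional[str]:
    """Return the caller ARN if the key pair is live, otherwise None.

    `sts_client_factory` builds an STS client from the detected key pair.
    """
    client = sts_client_factory(access_key, secret_key)
    try:
        identity = client.get_caller_identity()
    except Exception:
        # With boto3 a rejected key surfaces as botocore's ClientError. This
        # sketch treats every failure as "not valid", so network errors are
        # indistinguishable from dead keys here; a real checker should
        # separate the two.
        return None
    return identity.get("Arn")
```

Only run a check like this against credentials found in repositories you are responsible for, and treat a returned ARN as an incident that needs immediate revocation.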

The defense-in-depth approach: pre-commit hooks (developer-side), CI scanning (team-side), and repository scanning (organization-side). Multiple layers because each can be bypassed or miss cases the others catch.

Rotation Infrastructure

Secrets that are stored statically need to be rotated. Rotation needs to be automated. Manual rotation is operationally expensive and consistently delayed — the 90-day rotation policy that's enforced by "someone emails the team and hopes it happens" is not a rotation policy.

The key requirements for rotation automation:

Zero-downtime rotation. Both the old and the new credential must remain valid during the rotation window, and the order of operations matters. If every service instance picks up the new credential before the old one is revoked, there's no gap. If the old credential is revoked first, you have an outage.
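On the client side, tolerating the rotation window can look like the sketch below: try the newest credential first and fall back to the previous one. The `connect` callable and the use of `PermissionError` as the "authentication rejected" signal are stand-ins for whatever your database driver provides. With AWS Secrets Manager, the two values could come from the `AWSCURRENT` and `AWSPREVIOUS` staging labels.

```python
from typing import Any, Callable, Sequence


class AllCredentialsRejected(Exception):
    """Raised when every candidate credential fails authentication."""


def connect_with_fallback(
    connect: Callable[[str, str], Any],
    credentials: Sequence[dict],
) -> Any:
    """Try each credential in order (newest first) and return the first connection.

    `connect(username, password)` is a hypothetical driver call that raises
    PermissionError when authentication is rejected.
    """
    last_error = None
    for cred in credentials:
        try:
            return connect(cred["username"], cred["password"])
        except PermissionError as exc:
            last_error = exc  # keep trying older credentials
    raise AllCredentialsRejected("no credential was accepted") from last_error
```

The fallback only helps while the old credential is still valid, so revocation should wait until no instance depends on it any more.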

Coordinated rotation for shared credentials. If multiple services use the same database credential, rotation requires coordinating the update across all services. Dynamic credentials solve this by eliminating sharing.

Rotation monitoring. Know when credentials are due for rotation, when rotation fails, and when services are using credentials past their intended expiry. AWS Secrets Manager has built-in rotation scheduling; Vault's lease system enforces TTLs automatically.
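A basic "what is overdue" check can be sketched as follows. The input shape mirrors the `Name`, `LastRotatedDate`, and `CreatedDate` fields that AWS Secrets Manager returns when listing secrets, but the function works on plain dictionaries. The 90-day policy in the usage comment is just the figure used earlier in this article.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


def overdue_secrets(
    secrets: Iterable[dict],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the names of secrets whose last rotation is older than `max_age`.

    Secrets that were never rotated are judged by their creation date, and
    secrets with no usable timestamp at all are reported as overdue.
    """
    now = now or datetime.now(timezone.utc)
    overdue = []
    for secret in secrets:
        last = secret.get("LastRotatedDate") or secret.get("CreatedDate")
        if last is None or now - last > max_age:
            overdue.append(secret["Name"])
    return overdue


# Example policy check:
#
#   stale = overdue_secrets(secret_metadata, timedelta(days=90))
```

Feeding the result into an alert or a dashboard turns the rotation policy into something that is checked continuously rather than remembered.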

The AI Angle: AI Systems Need Secrets Too

LLM-powered agents and AI systems access APIs, databases, and services — and those access credentials need the same rigor as any other production credential. A few considerations specific to AI systems:

API keys for model providers are high-value targets. An attacker who obtains your Anthropic or OpenAI API key can run inference at your cost and (more concerning) potentially access data you've submitted to the API. These credentials should be stored in your secrets manager with the same controls as production database credentials, not in environment variables or hardcoded in agent scripts.

Agent credentials should be least-privilege. An AI agent that needs to query CloudWatch logs should have an IAM role that allows CloudWatch read access — not AdministratorAccess. As agents become more capable, the blast radius of a compromised or misbehaving agent grows. Least-privilege credential scoping is a critical safeguard.
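As a rough sketch of what "CloudWatch read access, not AdministratorAccess" can mean in practice, the helper below builds an IAM policy document limited to reading one log group. The function name and the exact action list are assumptions for illustration; adjust both to what the agent actually needs.

```python
import json


def cloudwatch_logs_read_policy(log_group_arn: str) -> dict:
    """Build an IAM policy document that only allows reading one log group."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:DescribeLogStreams",
                    "logs:GetLogEvents",
                    "logs:FilterLogEvents",
                ],
                # The group itself plus the log streams underneath it.
                "Resource": [log_group_arn, f"{log_group_arn}:*"],
            }
        ],
    }


# Hypothetical log group ARN, serialized the way IAM expects it.
policy_json = json.dumps(
    cloudwatch_logs_read_policy(
        "arn:aws:logs:us-east-1:123456789012:log-group:/app/my-service"
    ),
    indent=2,
)
```

Scoping both the actions and the resource means a misbehaving agent can read those logs and nothing else.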

Audit trails for agent actions. When an AI agent uses a credential to take an action, that action should be attributable. IAM session tags that include the agent invocation ID allow you to trace every API call an agent makes back to the specific investigation or task that triggered it. This is essential for incident investigations involving agent behavior.
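One way to attach the invocation ID is to pass it as a session tag when the agent assumes its role. The sketch below assumes an STS client such as `boto3.client("sts")` is passed in; the tag key `AgentInvocationId` and the `agent-` session name prefix are hypothetical names. Session tags also require the role's trust policy to allow `sts:TagSession`.

```python
import re
from typing import Any


def assume_role_for_invocation(
    sts_client: Any,
    role_arn: str,
    invocation_id: str,
    duration_seconds: int = 900,
) -> dict:
    """Assume the agent's role with the invocation ID attached as a session tag.

    `sts_client` is expected to behave like boto3.client("sts").
    """
    # Session names allow a limited character set and at most 64 characters.
    session_name = re.sub(
        r"[^\w+=,.@-]", "-", f"agent-{invocation_id}", flags=re.ASCII
    )[:64]
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        Tags=[{"Key": "AgentInvocationId", "Value": invocation_id}],
        DurationSeconds=duration_seconds,
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

Issuing a fresh session per task, rather than one long-lived session per agent, is what lets each action be traced back to a single invocation.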

*Zak Hassan is a Staff SRE specializing in infrastructure security, zero-trust architecture, and AI-powered operations. Find him at zakhassan.com or on LinkedIn.*

