As businesses move their data to the public cloud, one of the most pressing concerns is how to keep it safe from unauthorized access.

Using a tool like HashiCorp Vault gives you greater control over your sensitive credentials and helps you meet cloud security requirements.

In this blog post, we’ll walk you through setting up HashiCorp Vault in High Availability (HA) mode.

HashiCorp Vault

HashiCorp Vault is an open-source tool that provides a secure, reliable way to store and distribute sensitive information such as API keys, access tokens, and passwords. Vault protects this information with high-level policy management, secret leasing, audit logging, and automatic revocation, all accessible through a UI, CLI, or HTTP API.

High Availability

Vault can run in High Availability (HA) mode to protect against outages by running multiple Vault servers. When running in HA mode, Vault servers take on two additional states: active and standby. Within a Vault cluster, only a single instance is active and handles all requests; the standby instances redirect requests to the active instance.

Integrated Storage (Raft)

The Integrated Storage backend persists Vault’s data. Unlike other storage backends, Integrated Storage does not rely on a single external source of data. Instead, every node in a Vault cluster holds a replicated copy of Vault’s data, kept in sync via the Raft consensus algorithm.

Raft is officially supported by HashiCorp.

Architecture

Prerequisites

This setup requires Vault, sudo access on the machines, and the configuration below to create the cluster.

  • Install Vault v1.6.3+ent or later on all nodes in the Vault cluster 

In this example, we have 3 CentOS VMs provisioned using VMware.

Setup

1. Verify the Vault version on all the nodes using the following command (in this case, we have 3 nodes: node1, node2, node3):

vault --version
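If you have SSH access to all the nodes, a small loop saves hopping between terminals. This is a sketch with assumptions: passwordless SSH and the node names used in this walkthrough; adjust NODES for your environment.

```shell
# Print the Vault version on each node over SSH.
# NODES and passwordless SSH access are assumptions for this example.
NODES="node1 node2 node3"
for node in $NODES; do
  echo "== $node =="
  ssh -o ConnectTimeout=5 "$node" vault --version 2>/dev/null \
    || echo "(ssh to $node failed; run 'vault --version' there manually)"
done
```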

 

2. Configure SSL certificates

Note: Vault should always be used with TLS in production to provide secure communication between clients and the Vault server. It requires a certificate file and key file on each Vault host.

We can generate SSL certs for the Vault cluster on one node and copy them to the other nodes in the cluster.

Refer to: https://developer.hashicorp.com/vault/tutorials/secrets-management/pki-engine#scenario-introduction for generating SSL certs.

  • Copy tls.crt tls.key tls_ca.pem to /etc/vault.d/ssl/ 
  • Change ownership to `vault`
[user@node1 ~]$ cd /etc/vault.d/ssl/           
[user@node1 ssl]$ sudo chown vault. tls*

 

  • Copy tls* from /etc/vault.d/ssl/ to the other nodes in the cluster
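The copy and ownership steps can be scripted from the node where the certificates were generated. This is a sketch under assumptions: passwordless SSH/SCP, the node names from this walkthrough, and the /etc/vault.d/ssl/ path.

```shell
# Copy the TLS bundle to the remaining nodes and fix ownership.
# NODES, the ssl path, and passwordless SSH are assumptions for this example.
NODES="node2 node3"
for node in $NODES; do
  scp /etc/vault.d/ssl/tls.crt /etc/vault.d/ssl/tls.key /etc/vault.d/ssl/tls_ca.pem \
      "$node:/etc/vault.d/ssl/" \
    || echo "copy to $node failed; transfer the files manually"
  ssh "$node" "sudo chown vault: /etc/vault.d/ssl/tls*" \
    || echo "chown on $node failed; fix ownership there manually"
done
```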

3. Configure the enterprise license. Copy the license file to all nodes:

cp /root/vault.hclic /etc/vault.d/vault.hclic
chown root:vault /etc/vault.d/vault.hclic
chmod 0640 /etc/vault.d/vault.hclic

 

4. Create the storage directory for raft storage on all nodes:

sudo mkdir --parents /opt/raft
sudo chown --recursive vault:vault /opt/raft

 

5. Set firewall rules on all nodes:

sudo firewall-cmd --permanent --add-port=8200/tcp
sudo firewall-cmd --permanent --add-port=8201/tcp
sudo firewall-cmd --reload

 

6. Create vault configuration file on all nodes:

### Node 1 ###
[user@node1 vault.d]$ cat vault.hcl
storage "raft" {
    path = "/opt/raft"
    node_id = "node1"
    retry_join {
        leader_api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
        leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
        leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
        leader_client_key_file = "/etc/vault.d/ssl/tls.key"
    }
    retry_join {
        leader_api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
        leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
        leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
        leader_client_key_file = "/etc/vault.d/ssl/tls.key"
    }
}

listener "tcp" {
   address = "0.0.0.0:8200"
   tls_disable = false
   tls_cert_file = "/etc/vault.d/ssl/tls.crt"
   tls_key_file = "/etc/vault.d/ssl/tls.key"
   tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
   tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,TLS_TEST_128_GCM_SHA256,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384"
}
api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
cluster_addr = "https://node1.int.us-west-1-dev.central.example.com:8201"
disable_mlock = true
ui = true
log_level = "trace"
disable_cache = true
cluster_name = "POC"

# Enterprise license_path
# This will be required for enterprise as of v1.8
license_path = "/etc/vault.d/vault.hclic"

 

### Node 2 ###
[user@node2 vault.d]$ cat vault.hcl
storage "raft" {
    path = "/opt/raft"
    node_id = "node2"
    retry_join {
        leader_api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
        leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
        leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
        leader_client_key_file = "/etc/vault.d/ssl/tls.key"
    }
    retry_join {
        leader_api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
        leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
        leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
        leader_client_key_file = "/etc/vault.d/ssl/tls.key"
    } 
}

listener "tcp" {
   address = "0.0.0.0:8200"
   tls_disable = false
   tls_cert_file = "/etc/vault.d/ssl/tls.crt"
   tls_key_file = "/etc/vault.d/ssl/tls.key"
   tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
   tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,TLS_TEST_128_GCM_SHA256,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384"
}
api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
cluster_addr = "https://node2.int.us-west-1-dev.central.example.com:8201"
disable_mlock = true
ui = true
log_level = "trace"
disable_cache = true
cluster_name = "POC"

# Enterprise license_path
# This will be required for enterprise as of v1.8
license_path = "/etc/vault.d/vault.hclic"

 

### Node 3 ###
[user@node3 ~]$ cat /etc/vault.d/vault.hcl
storage "raft" {
    path = "/opt/raft"
    node_id = "node3"
    retry_join {
        leader_api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
        leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
        leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
        leader_client_key_file = "/etc/vault.d/ssl/tls.key"
    }
    retry_join {
        leader_api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
        leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
        leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
        leader_client_key_file = "/etc/vault.d/ssl/tls.key"
    }
}

listener "tcp" {
   address = "0.0.0.0:8200"
   tls_disable = false
   tls_cert_file = "/etc/vault.d/ssl/tls.crt"
   tls_key_file = "/etc/vault.d/ssl/tls.key"
   tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
   tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,TLS_TEST_128_GCM_SHA256,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384"
}
api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
cluster_addr = "https://node3.int.us-west-1-dev.central.example.com:8201"
disable_mlock = true
ui = true
log_level = "trace"
disable_cache = true
cluster_name = "POC"

# Enterprise license_path
# This will be required for enterprise as of v1.8
license_path = "/etc/vault.d/vault.hclic"

 

7. Set environment variables on all nodes:

export VAULT_ADDR=https://$(hostname):8200
export VAULT_CACERT=/etc/vault.d/ssl/tls_ca.pem
export CA_CERT=$(cat /etc/vault.d/ssl/tls_ca.pem)

 

8. Start Vault as a service on all nodes:

You can inspect the systemd unit file with:

cat /etc/systemd/system/vault.service

Then enable and start the service, and check that it is running:

systemctl enable vault.service
systemctl start vault.service
systemctl status vault.service
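If your installation did not ship a unit file, a minimal sketch looks like the following. The binary path (/usr/bin/vault) and config path are assumptions; adjust them to your install. The unit is written to the current directory for review, then installed with `sudo cp vault.service /etc/systemd/system/` followed by `sudo systemctl daemon-reload`.

```shell
# Write a minimal Vault systemd unit to the current directory for review.
cat > vault.service <<'EOF'
[Unit]
Description=HashiCorp Vault
Documentation=https://developer.hashicorp.com/vault/docs
Requires=network-online.target
After=network-online.target

[Service]
User=vault
Group=vault
ExecStart=/usr/bin/vault server -config=/etc/vault.d/vault.hcl
ExecReload=/bin/kill --signal HUP $MAINPID
KillSignal=SIGINT
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
```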

 

9. Check Vault status on all nodes:

vault status

 

10. Initialize Vault with the following command on node 1 only, and store the unseal key securely. A single key share with a threshold of 1 is used here for simplicity; in production, use more shares and a higher threshold (for example, -key-shares=5 -key-threshold=3).

[user@node1 vault.d]$ vault operator init -key-shares=1 -key-threshold=1
Unseal Key 1: HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
Initial Root Token: hvs.j4qTq1IZP9nscILMtN2p9GE0
Vault initialized with 1 key shares and a key threshold of 1.
Please securely distribute the key shares printed above. 
When the Vault is re-sealed, restarted, or stopped, you must supply at least 1 of these keys to unseal it
before it can start servicing requests.
Vault does not store the generated root key. 
Without at least 1 keys to reconstruct the root key, Vault will remain permanently sealed!
It is possible to generate new unseal keys, provided you have a
quorum of existing unseal keys shares. See "vault operator rekey" for more information.

 

11. Set the VAULT_TOKEN environment variable so the vault CLI can authenticate to the server. Use the following command, replacing <initial-root-token> with the value generated in the previous step.

export VAULT_TOKEN=<initial-root-token>
echo "export VAULT_TOKEN=$VAULT_TOKEN" >> /root/.bash_profile
### Repeat this step for the other 2 servers.

 

12. Unseal Vault1 using the unseal key generated in step 10. Notice the Unseal Progress key-value change as you present each key. After meeting the key threshold, the status of the key value for Sealed should change from true to false.

[user@node1 vault.d]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
Key                         Value
---                         -----
Seal Type                   shamir
Initialized                 true
Sealed                      false
Total Shares                1
Threshold                   1
Version                     1.11.0
Build Date                  2022-06-17T15:48:44Z
Storage Type                raft
Cluster Name                POC
Cluster ID                  109658fe-36bd-7d28-bf92-f095c77e860c
HA Enabled                  true
HA Cluster                  https://node1.int.us-west-1-dev.central.example.com:8201
HA Mode                     active
Active Since                2022-06-29T12:50:46.992698336Z
Raft Committed Index        36
Raft Applied Index          36

 

13. Unseal Vault2 (Use the same unseal key generated in step 10 for Vault1):

[user@node2 vault.d]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
Key                Value
---                -----
Seal Type          shamir
Initialized        true
Sealed             true
Total Shares       1
Threshold          1
Unseal Progress    0/1
Unseal Nonce       n/a
Version            1.11.0
Build Date         2022-06-17T15:48:44Z
Storage Type       raft
HA Enabled         true

[user@node2 vault.d]$ vault status
Key                   Value
---                   -----
Seal Type             shamir
Initialized           true
Sealed                true
Total Shares          1
Threshold             1
Version               1.11.0
Build Date            2022-06-17T15:48:44Z
Storage Type          raft
Cluster Name          POC
Cluster ID            109658fe-36bd-7d28-bf92-f095c77e860c
HA Enabled            true
HA Cluster            https://node1.int.us-west-1-dev.central.example.com:8201
HA Mode               standby
Active Node Address   https://node1.int.us-west-1-dev.central.example.com:8200
Raft Committed Index  37
Raft Applied Index    37

 

14. Unseal Vault3 (Use the same unseal key generated in step 10 for Vault1):

[user@node3 ~]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
Key                Value
---                -----
Seal Type          shamir
Initialized        true
Sealed             true
Total Shares       1
Threshold          1
Unseal Progress    0/1
Unseal Nonce       n/a
Version            1.11.0
Build Date         2022-06-17T15:48:44Z
Storage Type       raft
HA Enabled         true

[user@node3 ~]$ vault status
Key                       Value
---                       -----
Seal Type                 shamir
Initialized               true
Sealed                    false
Total Shares              1
Threshold                 1
Version                   1.11.0
Build Date                2022-06-17T15:48:44Z
Storage Type              raft
Cluster Name              POC
Cluster ID                109658fe-36bd-7d28-bf92-f095c77e860c
HA Enabled                true
HA Cluster                https://node1.int.us-west-1-dev.central.example.com:8201
HA Mode                   standby
Active Node Address       https://node1.int.us-west-1-dev.central.example.com:8200
Raft Committed Index      39
Raft Applied Index        39

 

15. Check the cluster’s raft status with the following command:

[user@node3 ~]$ vault operator raft list-peers
Node      Address                                            State       Voter
----      -------                                            -----       -----
node1    node1.int.us-west-1-dev.central.example.com:8201    leader      true
node2    node2.int.us-west-1-dev.central.example.com:8201    follower    true
node3    node3.int.us-west-1-dev.central.example.com:8201    follower    true

 

16. Currently, node1 is the active node. We can experiment to see what happens when node1 steps down from active duty.

In the terminal where VAULT_ADDR is set to https://node1.int.us-west-1-dev.central.example.com:8200, execute the step-down command.

$ vault operator step-down  # temporarily relinquishes active status so another node can take over
Success! Stepped down: https://node1.int.us-west-1-dev.central.example.com:8200

 

Now examine the raft peer set again. Another node has taken over as leader:

[user@node1 ~]$ vault operator raft list-peers
Node      Address                                            State       Voter
----      -------                                            -----       -----
node1    node1.int.us-west-1-dev.central.example.com:8201    follower    true
node2    node2.int.us-west-1-dev.central.example.com:8201    leader      true
node3    node3.int.us-west-1-dev.central.example.com:8201    follower    true

 

Conclusion 

The Vault servers are now operational in High Availability mode. We can test this by writing a secret through either the active or a standby Vault instance and seeing it succeed, which exercises request forwarding. We can also shut down the active Vault instance (sudo systemctl stop vault) to simulate a system failure and watch a standby instance assume leadership.
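The request-forwarding check described above can be wrapped in a small smoke test. This is a sketch with several assumptions: a KV v2 secrets engine enabled at secret/, VAULT_TOKEN exported, and the standby hostname from this walkthrough. It skips itself where the vault CLI is not installed.

```shell
# HA smoke test: write a secret via a standby node; the request should be
# forwarded to the active node and succeed.
# The standby address, CA path, and secret/ mount are assumptions.
if command -v vault >/dev/null 2>&1; then
  export VAULT_ADDR=https://node2.int.us-west-1-dev.central.example.com:8200
  export VAULT_CACERT=/etc/vault.d/ssl/tls_ca.pem
  vault kv put secret/ha-smoke check=ok
  vault kv get -field=check secret/ha-smoke
else
  echo "vault CLI not found; run this from a cluster node"
fi
```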
