HA Clustering Overview

A number of Palo Alto Networks® firewall models now support session state synchronization among firewalls in a high availability (HA) cluster of up to 16 firewalls. The HA cluster peers synchronize sessions to protect against failure of the data center or a large security inspection point with horizontally scaled firewalls. In the case of a network outage or a firewall going down, the sessions fail over to a different firewall in the cluster. Such synchronization is especially helpful in the following use cases.

One use case is when HA peers are spread across multiple data centers so that there is no single point of failure within or between data centers. A second multi-data center use case is when one data center is active and the other is standby.

Diagram showing HA Cluster across two data centers.

HA Clustering: Multi-Data Center Redundancy

A third HA clustering use case is horizontal scaling, in which you add HA cluster members to a single data center to scale security and ensure session survivability.

Diagram showing horizontal scaling with HA Cluster in a single data center.

HA Clustering: Horizontal Scaling in a Single Data Center

HA clusters support a Layer 3 or virtual wire deployment. HA peers in the cluster can be a combination of HA pairs and standalone cluster members. In an HA cluster, all members are considered active; there is no concept of passive firewalls except for HA pairs, which can keep their active/passive relationship after you add them to an HA cluster.

All cluster members share session state. When a new firewall joins an HA cluster, that triggers all firewalls in the cluster to synchronize all existing sessions. HA4 and HA4 backup connections are the dedicated cluster links that synchronize session state among all cluster members having the same cluster ID. The HA4 link between cluster members detects connectivity failures between cluster members. HA1 (control link), HA2 (data link), and HA3 (packet-forwarding link) are not supported between cluster members that aren’t HA pairs.
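The cluster ID rule above can be illustrated with a short Python sketch. It groups firewalls by cluster ID, on the understanding that only members sharing an ID synchronize session state over HA4. The function name `sync_groups` and the (name, cluster ID) tuples are hypothetical and are not part of PAN-OS.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sync_groups(members: Iterable[Tuple[str, int]]) -> Dict[int, List[str]]:
    """Group firewall names by cluster ID (hypothetical helper).

    Members in the same group are the ones that would synchronize
    session state with each other over the HA4 links.
    """
    groups: Dict[int, List[str]] = defaultdict(list)
    for name, cluster_id in members:
        groups[cluster_id].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Illustrative inventory; the names and IDs are made up.
    inventory = [("fw-a", 1), ("fw-b", 1), ("fw-c", 2)]
    print(sync_groups(inventory))
```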

For a normal session that has not failed over, only the firewall that is the session owner creates a traffic log. For a session that failed over, the new session owner (the firewall that receives the failed over traffic) creates the traffic log.
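As a minimal sketch of the logging rule above, the following Python function returns which firewall creates the traffic log. The function and parameter names are hypothetical; the firewall applies this rule internally and exposes no such API.

```python
from typing import Optional


def traffic_log_creator(session_owner: str,
                        failover_receiver: Optional[str] = None) -> str:
    """Return the firewall that creates the traffic log (hypothetical helper).

    session_owner: firewall that owns the session before any failover.
    failover_receiver: firewall that receives the failed-over traffic,
        or None if the session has not failed over.
    """
    if failover_receiver is None:
        # Normal session: only the session owner logs.
        return session_owner
    # Failed-over session: the new session owner logs.
    return failover_receiver
```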

The firewall models that support HA clustering and the maximum number of members supported per cluster are as follows:

Firewall Model        Number of Members Supported Per Cluster
PA-3200 Series        6
PA-5200 Series        16
PA-5450               8
PA-7080*              4
PA-7050*              6
VM-300                6
VM-500                6
VM-700                16

* PA-7000 Series firewalls that have at least one of the following cards: PA-7000-100G-NPC, PA-7000-20GQXM-NPC, PA-7000-20GXM-NPC.
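The per-model limits above can be captured in a lookup table for planning purposes. This Python sketch assumes a single-model cluster and checks a planned member count against the limit; the names `MAX_CLUSTER_MEMBERS` and `cluster_size_ok` are hypothetical. It does not model mixed-model clusters or the PA-7000 Series card requirement.

```python
# Maximum members per cluster, copied from the table above.
MAX_CLUSTER_MEMBERS = {
    "PA-3200 Series": 6,
    "PA-5200 Series": 16,
    "PA-5450": 8,
    "PA-7080": 4,   # requires one of the listed NPC cards
    "PA-7050": 6,   # requires one of the listed NPC cards
    "VM-300": 6,
    "VM-500": 6,
    "VM-700": 16,
}


def cluster_size_ok(model: str, planned_members: int) -> bool:
    """Return True if the planned member count fits the model's limit.

    Raises KeyError for a model that is not in the table.
    """
    limit = MAX_CLUSTER_MEMBERS[model]
    return 1 <= planned_members <= limit


if __name__ == "__main__":
    print(cluster_size_ok("PA-5450", 8))
    print(cluster_size_ok("PA-7080", 6))
```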

HA clustering is not supported in public cloud deployments. Consider the HA Clustering Best Practices and Provisioning before you start to Configure HA Clustering.

HA Clustering Best Practices and Provisioning

These are the provisioning requirements and best practices for HA clustering.

Provisioning Requirements and Best Practices

Session Synchronization Best Practices

Health Check Best Practices

Configure HA Clustering

Learn about HA clustering and follow the HA Clustering Best Practices and Provisioning before you configure HA firewalls as members of a cluster.

  1. Establish an interface as an HA interface (to later assign as the HA4 link).
    1. Select Network > Interfaces > Ethernet and select an interface; for example, ethernet1/1.
    2. Select the Interface Type to be HA.
    3. Click OK.
    4. Repeat this step to configure another interface to use as the HA4 backup link.
  2. Enable HA clustering.
    1. Select Device > High Availability > General and edit the Clustering Settings.
    2. Enable Cluster Participation.
    3. Enter the Cluster ID, a unique numeric ID for an HA cluster in which all members can share session state; range is 1 to 99.
    4. Enter a short, helpful Cluster Description.
    5. (Optional) Change Cluster Synchronization Timeout (min), which is the maximum number of minutes that the local firewall waits before going to Active state when another cluster member (for example, in unknown state) is preventing the cluster from fully synchronizing; range is 0 to 30; default is 0.
    6. (Optional) Change Monitor Fail Hold Down Time (min), which is the number of minutes after which a down link is retested to see if it is back up; range is 1 to 60; default is 1.
    7. Click OK.
  3. Configure the HA4 link.
    1. Select Device > High Availability > HA Communications and in the Clustering Links section, edit the HA4 section.
    2. Select the interface you configured in the first step as an HA interface to be the Port for the HA4 link; for example, ethernet1/1.
    3. Enter the IPv4/IPv6 Address of the local HA4 interface.
    4. Enter the Netmask.
    5. (Optional) Change the HA4 Keep-alive Threshold (ms) to specify the timeframe within which the firewall must receive keepalives from a cluster member to know that the cluster member is functional; range is 5,000 to 60,000; default is 10,000.
    6. Click OK.
  4. Configure the HA4 Backup link.
    1. Edit the HA4 Backup section.
    2. Select the other interface you configured in the first step as an HA interface to be the Port for the HA4 backup link.
    3. Enter the IPv4/IPv6 Address of the local HA4 backup interface.
    4. Enter the Netmask.
    5. Click OK.
  5. Specify all members of the HA cluster, including the local member and both HA peers in any HA pair.
    1. Select Device > High Availability > Cluster Config.
    2. (On a supported firewall) Add a peer member’s Device Serial Number.
    3. (On Panorama) Add and select a Device from the dropdown and enter a Device Name.
    4. Enter the HA4 IP Address of the HA peer in the cluster.
    5. Enter the HA4 Backup IP Address of the HA peer in the cluster.
    6. Enable Session Synchronization with the peer you identified.
    7. (Optional) Enter a helpful Description.
    8. Click OK.
    9. Select the device and Enable it.
  6. Define HA failover conditions with link and path monitoring (refer to the standard HA configuration guides).
  7. Commit.
  8. (Panorama only) Refresh the list of HA firewalls in the HA cluster.
    1. Under Templates, select Device > High Availability > Cluster Config.
    2. Click Refresh at the bottom of the screen.
  9. View HA cluster information in the UI.
    1. Select Dashboard .
    2. View the HA cluster fields.
      • The top section displays cluster state and HA4 connections to provide cluster health at a glance. The HA4 and HA4 Backup indicators will be one of the following:
        • Green indicates the link status of the cluster members is Up.
        • Red indicates the link status of all the cluster members is Down.
        • Yellow indicates the link status of some cluster members is Up while the status of other cluster members is Down.
        • Grey indicates not configured.
      • The center section displays the capacity of the local session table and session cache table so you can monitor how full the tables are and plan for firewall upgrades.
      • The lower section displays communication errors on the HA4 and HA4 backup links, signifying possible problems with synchronizing information between members.
Screenshot of the HA Cluster Dashboard widget in PAN-OS UI.

HA Cluster Widget on the Dashboard
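The indicator colors described for the HA4 and HA4 Backup fields follow a simple rule over the per-member link statuses. The Python sketch below is an illustration only: the function name `ha4_indicator` is hypothetical, and treating an empty status list as "not configured" is an assumption rather than documented behavior.

```python
from typing import Optional, Sequence


def ha4_indicator(link_up: Optional[Sequence[bool]]) -> str:
    """Map per-member link status to the widget color (hypothetical helper).

    link_up: one boolean per cluster member (True means the link is Up),
        or None if the link is not configured.
    """
    if not link_up:
        # None or empty: treated here as "not configured" (assumption).
        return "grey"
    if all(link_up):
        return "green"   # link status of the cluster members is Up
    if not any(link_up):
        return "red"     # link status of all cluster members is Down
    return "yellow"      # some members Up, others Down


if __name__ == "__main__":
    print(ha4_indicator([True, False, True]))
```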

  10. Access the CLI to view HA cluster and HA4 link information and perform other HA clustering tasks.

    Example CLI commands:

    • show high-availability cluster members
    • show high-availability cluster statistics
    • show high-availability cluster session brief
    • show counter global filter aspect ha_cluster

You can view HA cluster flap statistics. The cluster flap count is reset when the HA device moves from suspended to functional and vice versa. The cluster flap count also resets when the non-functional hold time expires.
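The numeric ranges given in the configuration steps (Cluster ID, Cluster Synchronization Timeout, Monitor Fail Hold Down Time, and HA4 Keep-alive Threshold) can be checked before you enter them in the UI. The following Python sketch shows one way to do that. The class, field, and function names are hypothetical and do not correspond to PAN-OS configuration keys; only the ranges and defaults come from the steps above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterSettings:
    """Planned values; defaults mirror the documented defaults."""
    cluster_id: int
    sync_timeout_min: int = 0
    monitor_fail_hold_down_min: int = 1
    ha4_keepalive_threshold_ms: int = 10_000


# (low, high) inclusive ranges taken from the configuration steps.
RANGES = {
    "cluster_id": (1, 99),
    "sync_timeout_min": (0, 30),
    "monitor_fail_hold_down_min": (1, 60),
    "ha4_keepalive_threshold_ms": (5_000, 60_000),
}


def validate(settings: ClusterSettings) -> List[str]:
    """Return a list of range violations; an empty list means all values fit."""
    errors = []
    for field, (low, high) in RANGES.items():
        value = getattr(settings, field)
        if not low <= value <= high:
            errors.append(f"{field}={value} is outside {low}-{high}")
    return errors


if __name__ == "__main__":
    for problem in validate(ClusterSettings(cluster_id=100)):
        print(problem)
```

Returning a list of messages rather than raising on the first problem lets you report every out-of-range value in one pass.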

HA Firewall States

An HA firewall can be in one of the following states:

State: Initial
Occurs in: A/P or A/A
Description: Transient state of a firewall when it joins the HA pair. The firewall remains in this state after boot-up until it discovers a peer and negotiation begins. After a timeout, the firewall becomes active if HA negotiation has not started.

State: Active
Occurs in: A/P
Description: State of the active firewall in an active/passive configuration.

State: Passive
Occurs in: A/P
Description: State of the passive firewall in an active/passive configuration. The passive firewall is ready to become the active firewall with no disruption to the network. Although the passive firewall is not processing other traffic:
  • If passive link state auto is configured, the passive firewall is running routing protocols, monitoring link and path state, and the passive firewall will pre-negotiate LACP and LLDP if LACP and LLDP pre-negotiation are configured, respectively.
  • The passive firewall is synchronizing flow state, runtime objects, and configuration.
  • The passive firewall is monitoring the status of the active firewall using the hello protocol.

State: Active-Primary
Occurs in: A/A
Description: In an active/active configuration, state of the firewall that connects to User-ID agents, runs DHCP server and DHCP relay, and matches NAT and PBF rules with the Device ID of the active-primary firewall. A firewall in this state can own sessions and set up sessions.

State: Active-Secondary
Occurs in: A/A
Description: In an active/active configuration, state of the firewall that connects to User-ID agents, runs DHCP server, and matches NAT and PBF rules with the Device ID of the active-secondary firewall. A firewall in active-secondary state does not support DHCP relay. A firewall in this state can own sessions and set up sessions.

State: Tentative
Occurs in: A/A
Description: State of a firewall (in an active/active configuration) caused by one of the following:
  • Failure of a firewall.
  • Failure of a monitored object (a link or path).
  • The firewall leaves suspended or non-functional state.

A firewall in tentative state synchronizes sessions and configurations from the peer.

  • In a virtual wire deployment, when a firewall enters tentative state due to a path failure and receives a packet to forward, it sends the packet to the peer firewall over the HA3 link for processing. The peer firewall processes the packet and sends it back over the HA3 link to the firewall to be sent out the egress interface. This behavior preserves the forwarding path in a virtual wire deployment.
  • In a Layer 3 deployment, when a firewall in tentative state receives a packet, it sends that packet over the HA3 link for the peer firewall to own or set up the session. Depending on the network topology, this firewall either sends the packet out to the destination or sends it back to the peer in tentative state for forwarding.

After the failed path or link clears or as a failed firewall transitions from tentative state to active-secondary state, the Tentative Hold Time is triggered and routing convergence occurs. The firewall attempts to build routing adjacencies and populate its route table before processing any packets. Without this timer, the recovering firewall would enter active-secondary state immediately and would silently discard packets because it would not have the necessary routes.

When a firewall leaves suspended state, it goes into tentative state for the Tentative Hold Time after links are up and able to process incoming packets.

Tentative Hold Time (sec) can be disabled (set to 0 seconds) or set in the range 10 to 600; default is 60.

State: Non-functional
Occurs in: A/P or A/A
Description: Error state due to a dataplane failure or a configuration mismatch, such as only one firewall configured for packet forwarding, VR sync, or QoS sync.
In active/passive mode, all of the causes listed for Tentative state cause non-functional state.

State: Suspended
Occurs in: A/P or A/A
Description: The device is disabled, so it does not pass data traffic. HA communications still occur, but the device does not participate in the HA election process. It cannot move to an HA functional state without user intervention.
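The state list above can be summarized as a small lookup, together with the Tentative Hold Time rule (0 to disable, otherwise 10 to 600 seconds). This Python sketch is for study purposes only. The names `STATE_MODES`, `states_for_mode`, and `tentative_hold_time_valid` are hypothetical, and the lowercase state strings are an illustrative spelling rather than CLI output.

```python
# HA modes in which each state occurs, as listed above.
STATE_MODES = {
    "initial": {"A/P", "A/A"},
    "active": {"A/P"},
    "passive": {"A/P"},
    "active-primary": {"A/A"},
    "active-secondary": {"A/A"},
    "tentative": {"A/A"},
    "non-functional": {"A/P", "A/A"},
    "suspended": {"A/P", "A/A"},
}


def states_for_mode(mode: str) -> list:
    """Return the states that can occur in the given mode, sorted by name."""
    return sorted(state for state, modes in STATE_MODES.items() if mode in modes)


def tentative_hold_time_valid(seconds: int) -> bool:
    """0 disables the timer; otherwise the value must be 10 to 600 seconds."""
    return seconds == 0 or 10 <= seconds <= 600


if __name__ == "__main__":
    print(states_for_mode("A/P"))
    print(tentative_hold_time_valid(60))
```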

Diagrams

HA Cluster Configuration Flow
HA State Transitions (Simplified)

State diagram showing common transitions between HA firewall states (simplified).


Interactive Quiz

Test your knowledge of Palo Alto HA Clustering.

1. Which dedicated HA link is used specifically for session synchronization between members in an HA Cluster (not just an HA pair)?

2. What is the highly recommended best practice for provisioning and managing configurations across HA Cluster members?

3. What is a critical requirement for firewalls participating in the same HA Cluster?

4. In which environment is HA Clustering explicitly NOT supported according to the documentation?

5. What is a potential issue with asymmetric traffic (client-to-server and server-to-client flows hitting different firewalls) in an HA Cluster?

6. In an Active/Active HA configuration, what state does a firewall typically enter after recovering from a monitored link failure or leaving suspended state, before becoming fully active?

7. Before assigning an interface for HA4 or HA4 Backup use, what "Interface Type" must it be configured as under Network > Interfaces?

8. What is the maximum number of members supported in an HA Cluster for the PA-5200 Series or VM-700?

9. Under normal (non-failover) conditions in an HA Cluster, which firewall generates the traffic log for a session?

10. Which setting must be checked under Device > High Availability > General to make a firewall part of an HA Cluster?

11. Which HA state indicates a critical error, such as a dataplane failure or a significant configuration mismatch between peers?

12. What are the key requirements for the Layer 2 network connecting the HA4 links between cluster members?

13. After changing the default host key type or regenerating HA1 SSH keys via CLI, which command is used to push the new key to the HA peer?

14. In an Active/Active cluster, which firewall state does NOT support DHCP Relay functionality?

15. Setting parameters for `session-rekey` (data, interval, packets) in an SSH service profile for HA1 affects:

16. For sessions to successfully fail over between HA cluster members, what configuration element must be consistent across the firewalls?

17. Which interface deployment modes are supported for HA Clustering?

18. What parameter in the HA4 link configuration determines how long a firewall waits for heartbeats from a peer before considering it potentially down?

19. Using the `force` option with `request high-availability session-reestablish` is necessary when:

20. Which HA state prevents a firewall from participating in HA election or passing data traffic, requiring manual intervention to become functional again?