An introduction to the FGCP

HA override

The HA override CLI keyword is disabled by default. When override is disabled a cluster may not always renegotiate when an event occurs that affects primary unit selection. For example, when override is disabled a cluster will not renegotiate when you change a cluster unit device priority or when you add a new cluster unit to a cluster. This is true even if the unit added to the cluster has a higher device priority than any other unit in the cluster. Also, when override is disabled a cluster does not negotiate if the new unit added to the cluster has a failed or disconnected monitored interface.

For a virtual cluster configuration, override is enabled by default for both virtual clusters when you enable virtual cluster 2. For more information, see Virtual clustering and HA override.

In most cases you should keep override disabled to reduce how often the cluster negotiates. Frequent negotiations may cause frequent traffic interruptions.

However, if you want to make sure that the same cluster unit always operates as the primary unit, and you are less concerned about frequent cluster negotiation, you can set its device priority higher than that of the other cluster units and enable override.

To enable override, connect to each cluster unit CLI (using the execute ha manage command) and use the config system ha CLI command to enable override.

For override to be effective, you must also set the device priority highest on the cluster unit that you want to always be the primary unit. To increase the device priority, from the CLI use the config system ha command and increase the value of the priority keyword to a number higher than the default priority of 128.

You can also increase the device priority from the web-based manager by going to System > HA. Select Edit for the cluster unit (primary or subordinate) whose priority you want to raise and set the Device Priority to a number higher than 128.

The override setting and device priority value are not synchronized to all cluster units. You must enable override and adjust device priority manually and separately for each cluster unit.

With override enabled, the cluster unit with the highest device priority will always become the primary unit. Whenever an event occurs that may affect primary unit selection, the cluster negotiates. For example, when override is enabled a cluster renegotiates when you change the device priority of any cluster unit or when you add a new unit to a cluster.

 

Override and primary unit selection

Enabling override changes the order of primary unit selection. As shown below, if override is enabled, primary unit selection considers device priority before age and serial number. This means that if you set the device priority higher on one cluster unit, with override enabled this cluster unit becomes the primary unit even if its age and serial number are lower than other cluster units.

Similar to when override is disabled, when override is enabled primary unit selection checks for connected monitored interfaces first. So if interface monitoring is enabled, the cluster unit with the most disconnected monitored interfaces cannot become the primary unit, even if the unit has the highest device priority.

If all monitored interfaces are connected (or interface monitoring is not enabled) and the device priority of all cluster units is the same then age and serial number affect primary unit selection.

 

Controlling primary unit selection using device priority and override

To configure one cluster unit to always become the primary unit, set its device priority higher than the device priorities of the other cluster units and enable override on all cluster units.

Using this configuration, when the cluster is operating normally the primary unit is always the unit with the highest device priority. If the primary unit fails the cluster renegotiates to select another cluster unit to be the primary unit. If the failed primary unit recovers, starts up again and rejoins the cluster, because override is enabled, the cluster renegotiates. Because the restarted primary unit has the highest device priority it once again becomes the primary unit.

In the same situation with override disabled, because the age of the failed primary unit is lower than the age of the other cluster units, when the failed primary unit rejoins the cluster it does not become the primary unit. Instead, even though the failed primary unit may have the highest device priority it becomes a subordinate unit because its age is lower than the age of all the other cluster units.
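The two outcomes described above can be sketched as a toy model in Python. This is not FortiOS code: the function name `primary`, the age values, and the assumption that all monitored interfaces are connected are illustrative only, and the serial number tie-breaker is left out.

```python
def primary(units, override):
    """Pick the primary unit from a dict of name -> (device_priority, age_seconds).

    Toy model: assumes every monitored interface is connected and ignores
    the serial number tie-breaker.
    """
    def key(name):
        priority, age = units[name]
        # Override enabled: device priority is compared before age.
        # Override disabled: age is compared before device priority.
        return (priority, age) if override else (age, priority)
    return max(units, key=key)


# FGT-A has just restarted and rejoined, so its age is low (hypothetical values).
rejoined = {"FGT-A": (200, 30), "FGT-B": (100, 86400)}

print(primary(rejoined, override=True))   # FGT-A
print(primary(rejoined, override=False))  # FGT-B
```

With override enabled the restarted unit wins on device priority; with override disabled it loses on age before device priority is ever compared.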

Points to remember about primary unit selection when override is enabled

Some points to remember about primary unit selection when override is enabled:

  • The FGCP compares primary unit selection criteria in the following order: Failed Monitored Interfaces > Device Priority > Age > Serial number. The selection process stops at the first criterion that selects one cluster unit.
  • Negotiation and primary unit selection are triggered whenever an event occurs that may affect primary unit selection. For example, negotiation occurs when you change the device priority, when you add a new unit to a cluster, if a cluster unit fails, or if a monitored interface fails.
  • Device priority is considered before age. Otherwise, age is handled the same way as when override is disabled.
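The comparison order in the list above can be expressed as a sort key. The following is a minimal sketch rather than the FGCP implementation: the `Unit` class, its field names, and the sample serial numbers are hypothetical, and the order used when override is disabled (age before device priority) is inferred from the earlier description. Raw values are compared directly, which a real cluster may not do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    failed_monitored: int  # count of failed or disconnected monitored interfaces
    priority: int          # device priority (default 128)
    age: int               # toy age value in seconds
    serial: str            # hypothetical serial number


def select_primary(units, override):
    """Return the unit that wins selection under the documented order."""
    def key(u):
        # Fewer failed monitored interfaces always wins first.
        if override:
            return (-u.failed_monitored, u.priority, u.age, u.serial)
        return (-u.failed_monitored, u.age, u.priority, u.serial)
    # Tuples compare element by element, so comparison stops at the
    # first criterion that separates two units.
    return max(units, key=key)


a = Unit("FGT-A", 0, 200, 10, "FGT0000000001")
b = Unit("FGT-B", 0, 100, 5000, "FGT0000000002")
c = Unit("FGT-C", 1, 250, 9000, "FGT0000000003")

print(select_primary([a, b, c], override=True).name)   # FGT-A
print(select_primary([a, b, c], override=False).name)  # FGT-B
```

FGT-C never wins in either case despite its priority of 250, because it has a failed monitored interface and that criterion is checked first.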

Configuration changes can be lost if override is enabled

In some cases, when override is enabled and you make configuration changes to an HA cluster these changes can be lost. For example, consider the following sequence:

1. A cluster of two FortiGate units is operating with override enabled.

  • FGT-A: Primary unit with device priority 200 and with override enabled
  • FGT-B: Subordinate unit with device priority 100 and with override disabled
  • If both units are operating, FGT-A always becomes the primary unit because FGT-A has the highest device priority.

2. FGT-A fails and FGT-B becomes the new primary unit.

3. The administrator makes configuration changes to the cluster.

The configuration changes are made to FGT-B because FGT-B is operating as the primary unit. These configuration changes are not synchronized to FGT-A because FGT-A is not operating.

4. FGT-A is restored and starts up again.

5. The cluster renegotiates and FGT-A becomes the new primary unit.

6. The cluster recognizes that the configurations of FGT-A and FGT-B are not the same.

7. The configuration of FGT-A is synchronized to FGT-B.

The configuration is always synchronized from the primary unit to the subordinate units.

8. The cluster is now operating with the same configuration as FGT-A. The configuration changes made to FGT-B have been lost.
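The sequence above can be replayed in a few lines of Python. This is a sketch under stated assumptions: configurations are modeled as plain dictionaries, and `sync_from_primary`, `replay_scenario`, and the `policy_count` setting are hypothetical names chosen for illustration.

```python
def sync_from_primary(configs, primary):
    """Copy the primary unit's configuration to every cluster unit."""
    for name in configs:
        configs[name] = dict(configs[primary])


def replay_scenario():
    # Step 1: both units start with the same configuration.
    configs = {"FGT-A": {"policy_count": 10}, "FGT-B": {"policy_count": 10}}

    # Steps 2-3: FGT-A is down, so the change lands only on FGT-B.
    configs["FGT-B"]["policy_count"] = 11

    # Steps 4-5: FGT-A restarts and, with override enabled and the
    # highest device priority, becomes the primary unit again.
    new_primary = "FGT-A"

    # Steps 6-7: the configurations differ, so the primary unit's
    # configuration is synchronized to the subordinate unit.
    sync_from_primary(configs, new_primary)
    return configs


print(replay_scenario())
# {'FGT-A': {'policy_count': 10}, 'FGT-B': {'policy_count': 10}}
```

The change made in step 3 is overwritten because synchronization always flows from the primary unit outward, regardless of which configuration is newer.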

 

The solution

When override is enabled, you can prevent configuration changes from being lost by doing the following:

  • Verify that all cluster units are operating before making configuration changes (from the web-based manager go to System > HA to view the cluster members list or from the FortiOS CLI enter get system ha status).
  • Make sure the device priority of the primary unit is set higher than the device priorities of all other cluster units before making configuration changes.
  • Disable override either permanently or until all configuration changes have been made and synchronized to all cluster units.
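The first two checks in the list above could be scripted before a change window. The sketch below assumes you have already collected each unit's state and device priority by some means, for example by reading the cluster members list; the `safe_to_change` function and the dictionary layout are hypothetical.

```python
def safe_to_change(units, primary):
    """Check the first two precautions before editing the configuration.

    units: dict of name -> {"up": bool, "priority": int}
    primary: name of the unit currently operating as the primary unit
    """
    # Precaution 1: every cluster unit must be operating.
    all_up = all(u["up"] for u in units.values())

    # Precaution 2: the primary unit's device priority must be strictly
    # higher than that of every other cluster unit.
    others = [u["priority"] for name, u in units.items() if name != primary]
    primary_highest = all(units[primary]["priority"] > p for p in others)

    return all_up and primary_highest


healthy = {"FGT-A": {"up": True, "priority": 200},
           "FGT-B": {"up": True, "priority": 100}}
degraded = {"FGT-A": {"up": False, "priority": 200},
            "FGT-B": {"up": True, "priority": 100}}

print(safe_to_change(healthy, primary="FGT-A"))   # True
print(safe_to_change(degraded, primary="FGT-B"))  # False
```

The degraded case matches the failure sequence described earlier: FGT-B is acting as the primary unit while a higher-priority unit is offline, so changes made now are at risk.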

Override and disconnecting a unit from a cluster

A similar scenario to that described above may occur when override is enabled and you use the Disconnect from Cluster option from the web-based manager or the execute ha disconnect command from the CLI to disconnect a cluster unit from a cluster.

Configuration changes made to the cluster can be lost when you reconnect the disconnected unit to the cluster. You should make sure that the device priority of the disconnected unit is lower than the device priority of the current primary unit. Otherwise, when the disconnected unit joins the cluster, if override is enabled, the cluster renegotiates and the disconnected unit may become the primary unit. If this happens, the configuration of the disconnected unit is synchronized to all other cluster units and any configuration changes made between when the unit was disconnected and reconnected are lost.

This entry was posted in FortiOS 5.4 Handbook.

About Mike

Michael Pruett, CISSP has a wide range of cyber-security and network engineering expertise. The plethora of vendors that resell hardware but have zero engineering knowledge resulting in the wrong hardware or configuration being deployed is a major pet peeve of Michael's. This site was started in an effort to spread information while providing the option of quality consulting services at a much lower price than Fortinet Professional Services. Owns PacketLlama.Com (Fortinet Hardware Sales) and Office Of The CISO, LLC (Cybersecurity consulting firm).

2 thoughts on “An introduction to the FGCP”

  1. Danilo Arias

    Hi, thanks for sharing this information, however I wanted to make a query, that timer is only modified when there is a drop in monitored ports and does not increase over time is fixed? My question is why in his example I see that when the monitored port is reconnected, the teacher’s time is shorter in 136 seconds.

    Thanks and forgive my english but use google translate
