5.4 Failover Rules
The HA function performs health checks on the system status and network conditions in an HA domain. When a health check detects a failure that matches one of the pre-configured failover conditions, the corresponding failover action is taken. Usually, the system selects the unit with the highest priority among the available units and forces the status of the floating IP group enabled on that unit to “Active”. HA provides failover rules to control this switchover of group status.
Failover rules are defined by associating failover conditions with failover actions. Failover conditions describe the monitored status of system hardware or software, such as network interface status and CPU utilization. Failover actions are the operations that the system performs when the associated failover conditions occur. HA provides three failover actions:
- Group_Failover: Switch over the status of the floating IP group. For this action, the system selects a new unit based on health condition and group priority, and changes the status of the floating IP group enabled on that unit to “Active” so that the unit takes over the services.
- Unit_Failover: Switch over the status of all the floating IP groups enabled on a unit.
- Reboot: Switch over the status of all the floating IP groups enabled on a unit, and then restart the unit.
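The relationship between the three actions and the selection of a new unit can be sketched in Python as follows. This is only an illustrative model, not the product's implementation or its CLI. The names `Unit`, `FailoverAction`, `select_new_active`, and `groups_to_switch` are hypothetical, as is the assumption that a larger number means a higher group priority.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class FailoverAction(Enum):
    GROUP_FAILOVER = "Group_Failover"
    UNIT_FAILOVER = "Unit_Failover"
    REBOOT = "Reboot"


@dataclass
class Unit:
    name: str
    healthy: bool
    # Hypothetical layout: floating IP group id -> this unit's priority
    # for that group. Assumption: a larger value means higher priority.
    group_priority: Dict[int, int]


def select_new_active(units: List[Unit], group_id: int,
                      failed_unit: Unit) -> Optional[Unit]:
    """Pick the healthy unit with the highest priority for the group."""
    candidates = [
        u for u in units
        if u is not failed_unit and u.healthy and group_id in u.group_priority
    ]
    if not candidates:
        return None  # no available unit can take over the group
    return max(candidates, key=lambda u: u.group_priority[group_id])


def groups_to_switch(action: FailoverAction, failed_unit: Unit,
                     group_id: int) -> List[int]:
    """Return the group ids whose status is switched over by the action."""
    if action is FailoverAction.GROUP_FAILOVER:
        return [group_id]
    # Unit_Failover and Reboot both switch over every group enabled on
    # the unit; Reboot additionally restarts the unit (not modeled here).
    return sorted(failed_unit.group_priority)
```

In this sketch, a caller would first ask `groups_to_switch` which groups are affected by the action and then call `select_new_active` once per group to find the unit that takes over.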
For the convenience of administrators, HA also provides a built-in network connectivity check that detects network exceptions, such as network interface failure and network interruption among units. When any of these exceptions occurs, the system performs failover actions automatically.
Note: For a bond interface, the “Group_Failover” action is taken for the floating IP group to which the IP addresses of the bond interface belong only when the network connections of all interfaces in the bond interface are down.
In addition to the built-in failover rules, the HA function allows administrators to manually configure multiple failover rules. The following hardware, software, and network health check conditions can be configured as failover conditions:
Hardware:
- CPU overheat health check condition
- SSL card health check condition
- Port health check condition
Software:
- CPU utilization health check condition
- ATCP zone memory utilization health check condition
- System memory health check condition
- Network packet memory health check condition
- Process health check condition
Network condition:
- Gateway health check condition
Some complex application environments require more complicated failover rules. For example, in an environment with a bond interface deployed, the failover action is in principle taken only when the network connections of all interfaces in the bond interface are down. In practice, however, it is often required to take the failover action as soon as the network connection of any one interface goes down. To support such cases, HA introduces the concept of the health check condition group (vcondition). A vcondition comprises multiple health check sub-conditions. A sub-condition can be a real health check condition or another vcondition, which in turn comprises its own sub-conditions. The logical relationship among the sub-conditions can be either “AND” or “OR”. To apply a vcondition to the bond interface case above, administrators first define a health check condition for each network interface in the bond interface, and then combine these conditions into a vcondition whose logical relationship is set to “OR”. Finally, they associate the vcondition with the desired failover actions.
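The nested “AND”/“OR” evaluation of a vcondition can be illustrated with a short Python sketch. It is a conceptual model under stated assumptions rather than the system's actual logic: the classes `Condition` and `VCondition`, the `triggered` method, the port names, and the choice to treat an empty vcondition as not triggered are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Condition:
    """A real health check condition (hypothetical model)."""
    name: str
    check: Callable[[], bool]  # returns True when a failure is detected

    def triggered(self) -> bool:
        return self.check()


@dataclass
class VCondition:
    """A condition group whose sub-conditions are combined by AND or OR."""
    name: str
    relation: str  # "AND" or "OR"
    subs: List[Union["Condition", "VCondition"]] = field(default_factory=list)

    def triggered(self) -> bool:
        if self.relation not in ("AND", "OR"):
            raise ValueError(f"unknown relation: {self.relation}")
        if not self.subs:
            return False  # assumption: an empty group never triggers
        results = (s.triggered() for s in self.subs)
        return all(results) if self.relation == "AND" else any(results)


# Hypothetical link state of the two member interfaces of a bond interface.
link_up: Dict[str, bool] = {"port1": True, "port2": False}


def port_down(port: str) -> Condition:
    """Build a condition that triggers when the given port is down."""
    return Condition(f"{port}_down", lambda: not link_up[port])


# "OR": trigger as soon as any member interface is down.
any_down = VCondition("bond_any_down", "OR", [port_down(p) for p in link_up])
# "AND": trigger only when all member interfaces are down.
all_down = VCondition("bond_all_down", "AND", [port_down(p) for p in link_up])
```

With the link state shown, `any_down.triggered()` evaluates to true because one port is down, while `all_down.triggered()` stays false until both ports are down. Because a `VCondition` accepts other `VCondition` objects as sub-conditions, groups can be nested to any depth.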