Appendix E. Supplementary Software Information

The information in the following sections can assist in the management of the cluster software configuration.

E.1. Cluster Communication Mechanisms

A cluster uses several intra-cluster communication mechanisms to ensure data integrity and correct cluster behavior when a failure occurs. The cluster uses these mechanisms to:

  • Control when a system can become a cluster member

  • Determine the state of the cluster systems

  • Control the behavior of the cluster when a failure occurs

The cluster communication mechanisms are as follows:

  • Shared (quorum) partitions

    Periodically, each cluster system writes a time-stamp and its system status to the primary and shadow shared partitions, which are raw partitions located on shared storage. Each member reads the time-stamp and system status written by the other members and determines whether they are up to date. Members first attempt to read this information from the primary shared partition. If that partition is corrupted, they read the information from the shadow shared partition and simultaneously repair the primary partition. Data consistency is maintained through checksums, and any inconsistencies between the partitions are corrected automatically.

    If a member reboots but cannot write to both shared partitions, the system is not allowed to join the cluster. In addition, if an existing member can no longer write to both partitions, it removes itself from the cluster by shutting down.

    Shared partitions are used as a communication mechanism only in two-member clusters that have the network tie-breaker disabled.

  • Remote power switch monitoring

    Periodically, each member monitors the health of its remote power switch connection, if one is configured. The member uses this information to help determine the status of the other cluster members. The complete failure of the power switch communication mechanism does not automatically result in a failover. If a power switch fails to power-cycle a hung system, no failover is performed, because the cluster infrastructure cannot guarantee the member's present state.

  • Ethernet heartbeats

    The members are connected by point-to-point Ethernet lines. Periodically, each member issues heartbeats (pings) across these lines. The cluster uses this information to help determine the status of the members and to ensure correct cluster operation. The complete failure of the heartbeat communication mechanism does not automatically result in a failover.
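
The primary/shadow read path described for the shared partitions can be illustrated with a minimal sketch. It is not the cluster software's actual implementation: the record layout, the use of CRC32 as the checksum, and the names encode_record, decode_record, and read_with_repair are all assumptions made for illustration, and in-memory byte arrays stand in for the raw partitions.

```python
import struct
import zlib
from typing import Optional, Tuple

# Hypothetical record layout (not the real on-disk format):
# 8-byte time-stamp, 4-byte status code, then a 4-byte CRC32 of both.
_BODY = struct.Struct("<QI")
_CRC = struct.Struct("<I")
RECORD_SIZE = _BODY.size + _CRC.size


def encode_record(timestamp: int, status: int) -> bytes:
    """Pack a time-stamp and status, followed by a checksum of both."""
    body = _BODY.pack(timestamp, status)
    return body + _CRC.pack(zlib.crc32(body))


def decode_record(raw: bytes) -> Optional[Tuple[int, int]]:
    """Return (timestamp, status), or None if the record is corrupted."""
    if len(raw) != RECORD_SIZE:
        return None
    body = raw[:_BODY.size]
    (stored_crc,) = _CRC.unpack(raw[_BODY.size:])
    if zlib.crc32(body) != stored_crc:
        return None
    return _BODY.unpack(body)


def read_with_repair(primary: bytearray,
                     shadow: bytearray) -> Optional[Tuple[int, int]]:
    """Read the primary copy; on corruption, fall back to the shadow
    copy and rewrite the primary from it."""
    record = decode_record(bytes(primary))
    if record is not None:
        return record
    record = decode_record(bytes(shadow))
    if record is not None:
        primary[:] = shadow  # repair the corrupted primary copy
    return record
```

In this sketch, a caller that receives None knows that both copies failed their checksum; how the real cluster software reacts in that case is not covered here.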

If a member determines that a time-stamp from another member is not up to date, it checks the heartbeat status. If heartbeats to the member are still operating, the cluster software takes no action. If a member does not update its time-stamp after some period of time, and does not respond to heartbeat pings, it is considered down.
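
The decision rule in the preceding paragraph can be sketched as a small function. This is an illustrative sketch only: the names PeerView, PeerState, and classify_peer, and the 10-second staleness threshold, are hypothetical, because the text does not state the actual timeout ("some period of time").

```python
from dataclasses import dataclass
from enum import Enum


class PeerState(Enum):
    UP = "up"                # time-stamp is current
    NO_ACTION = "no_action"  # time-stamp is stale, but heartbeats still work
    DOWN = "down"            # time-stamp is stale and heartbeats fail


@dataclass
class PeerView:
    last_timestamp: float  # last time-stamp read for the peer, in seconds
    heartbeat_ok: bool     # whether the peer answers heartbeat pings


# Hypothetical threshold; the real interval is not given in the text.
STALE_AFTER_SECONDS = 10.0


def classify_peer(peer: PeerView, now: float,
                  stale_after: float = STALE_AFTER_SECONDS) -> PeerState:
    """Apply the rule: a stale time-stamp alone is not enough to
    declare a member down; heartbeats must have failed as well."""
    if now - peer.last_timestamp <= stale_after:
        return PeerState.UP
    if peer.heartbeat_ok:
        return PeerState.NO_ACTION
    return PeerState.DOWN
```

For example, classify_peer(PeerView(last_timestamp=0.0, heartbeat_ok=True), now=60.0) yields PeerState.NO_ACTION, which matches the case where the time-stamp is stale but heartbeats are still operating.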

The cluster remains operational as long as one cluster system can write to the shared partitions, even if all other communication mechanisms fail.

Note that the shared partitions are used only as a backup in some two-member configurations. The network membership algorithm is the primary factor in determining which cluster members are active and which are not. In this configuration, a member that is not updating its time-stamp never causes a failover unless clumembd reports that the member is down.