Last week, several folks at my company asked me what the clusterware heartbeats are and how they operate. Here are my answers.
At all times, the clusterware needs to know which nodes are members of the cluster, and it finds out by having the nodes send heartbeats to each other. There are two types of heartbeat:
1) Network heartbeat across the interconnect: Every second, a sending thread in ocssd.bin sends a network TCP heartbeat to itself and to all other nodes, and a receiving thread in ocssd.bin receives the heartbeats. If a network packet is dropped or corrupted, TCP's error correction mechanism retransmits the packet; Oracle itself does not retransmit. In ocssd.log you will see a WARNING message about a missing heartbeat if a node does not receive a heartbeat from another node for 15 seconds (50% of misscount). Another warning is logged if the heartbeat from the same node is still missing at 22 seconds (75% of misscount), and another at 27 seconds (90% of misscount). When the heartbeat has been missing for 100% of misscount, that is 30 seconds, the node is evicted.
2) Disk heartbeat to the voting devices: A thread in ocssd.bin updates the voting disks every second. This is called the disk heartbeat. If a node does not update the voting disks for 200 seconds, it is evicted. However, ocssd.bin on the local node has logic to bring the node down itself if it gets I/O errors on a majority of the voting disks. Also, while a CRS reconfiguration is in progress, the shorter misscount-based limit of 27 seconds applies, and the local node is rebooted if it exceeds it. As a result, you rarely see an eviction caused by voting disk failure on 10.2.0.4 (it was more common on 10.2.0.1), because ocssd.bin aborts the local node before another node can evict it when writing to the voting disk is the problem.
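To make the thresholds in the two points above concrete, here is a minimal Python sketch of the decision logic as I have described it. It is only an illustration, not how ocssd.bin is actually implemented: the function names are made up for this post, the 30-second misscount and 200-second disk timeout are simply the values quoted above, and I am reading "majority" as strictly more than half of the voting disks.

```python
# Illustrative sketch only; names are hypothetical, not part of Oracle Clusterware.
MISSCOUNT_SECONDS = 30        # network heartbeat limit quoted above
DISK_TIMEOUT_SECONDS = 200    # disk heartbeat limit quoted above
WARNING_PERCENTS = (50, 75, 90)  # points at which ocssd.log shows a WARNING


def network_heartbeat_status(seconds_missing, misscount=MISSCOUNT_SECONDS):
    """Classify how long a node's network heartbeat has been missing."""
    if seconds_missing >= misscount:
        return "evict"
    # Integer arithmetic reproduces the 15 / 22 / 27 second marks for misscount=30.
    reached = [p for p in WARNING_PERCENTS
               if seconds_missing >= misscount * p // 100]
    if reached:
        return f"warning at {reached[-1]}% of misscount"
    return "ok"


def disk_heartbeat_expired(seconds_since_update, disk_timeout=DISK_TIMEOUT_SECONDS):
    """True once a node has gone the full disk timeout without updating the voting disks."""
    return seconds_since_update >= disk_timeout


def local_node_should_abort(disks_with_io_error, total_voting_disks):
    """True when I/O errors hit a majority (assumed: more than half) of the voting disks."""
    return disks_with_io_error * 2 > total_voting_disks


if __name__ == "__main__":
    for gap in (5, 15, 22, 27, 30):
        print(gap, network_heartbeat_status(gap))
    print("abort with 2 of 3 disks failing:", local_node_should_abort(2, 3))
```

With three voting disks, a single failing disk leaves the node up, while two failing disks make the local node abort itself, which is why another node rarely gets the chance to evict it for a voting disk problem.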
To enable tracing of the heartbeat:
crsctl debug log css CSSD:5
To disable the trace:
crsctl debug log css CSSD:0