Class FD


  • public class FD
    extends Protocol
    Failure detection based on a simple heartbeat protocol. FD regularly polls members for liveness and multicasts SUSPECT messages when a member is not reachable. The algorithm works as follows: the membership is known and ordered. Each heartbeat (HB) protocol instance periodically sends an 'are-you-alive' message to its *neighbor*. The neighbor is the next member in rank in the membership list, which is recomputed upon a view change. When no response has been received for n milliseconds and m tries (see timeout and max_tries), the corresponding member is suspected (and eventually excluded if faulty).
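The neighbor rule described above can be illustrated with a minimal sketch. It is not the JGroups implementation: the class `NeighborSketch`, the method `nextInRank`, and the use of plain strings for addresses are hypothetical, and wrapping from the last member back to the first is an assumption.

```java
import java.util.List;

/** Hypothetical helper, not part of JGroups: picks the next member in rank order. */
final class NeighborSketch {

    /**
     * Returns the member that follows {@code self} in the ordered membership list.
     * Assumption: the last member wraps around to the first one.
     * Returns null if {@code self} is not a member or has no other member to monitor.
     */
    static String nextInRank(List<String> members, String self) {
        int idx = members.indexOf(self);
        if (idx < 0 || members.size() < 2)
            return null;
        return members.get((idx + 1) % members.size());
    }
}
```

With the view `[A, B, C]`, this sketch makes A monitor B, B monitor C, and C monitor A, so every member is watched by exactly one other member.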

    FD starts when it detects (in a view change notification) that there are at least 2 members in the group. It stops running when the membership drops below 2.

    When a message is received from the monitored neighbor member, it causes the pinger thread to 'skip' sending the next are-you-alive message. Thus, traffic is reduced.
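As a rough model of the timeout, retry, and skip behaviour described above, the following sketch tracks the last sign of life from the neighbor. The class `SuspicionSketch` and its method names are hypothetical, timestamps are passed in explicitly so the logic can be exercised without threads, and the exact rule FD applies may differ in detail.

```java
/** Hypothetical model of the suspicion rule; illustrative only, not the JGroups code. */
final class SuspicionSketch {
    private final long timeoutMs;   // plays the role of 'timeout'
    private final int maxTries;     // plays the role of 'max_tries'
    private long lastAck;           // time of the last message seen from the neighbor
    private int numTries;           // consecutive intervals without any sign of life

    SuspicionSketch(long timeoutMs, int maxTries, long now) {
        this.timeoutMs = timeoutMs;
        this.maxTries = maxTries;
        this.lastAck = now;
    }

    /** Any message from the neighbor (not only a heartbeat response) counts as a sign of life. */
    void onMessageFromNeighbor(long now) {
        lastAck = now;
        numTries = 0;
    }

    /** True if a heartbeat is worth sending; false if recent traffic makes it redundant. */
    boolean shouldSendHeartbeat(long now) {
        return now - lastAck >= timeoutMs;
    }

    /** Called once per timeout interval; returns true when the neighbor should be suspected. */
    boolean onTimerTick(long now) {
        if (now - lastAck < timeoutMs) {
            numTries = 0;
            return false;
        }
        numTries++;
        return numTries >= maxTries;
    }
}
```

In this model a neighbor that keeps sending regular traffic never triggers a heartbeat, which is the traffic reduction the description mentions.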

    Author:
    Bela Ban
    • Field Detail

      • timeout

        protected long timeout
      • max_tries

        protected int max_tries
      • num_heartbeats

        protected int num_heartbeats
      • num_suspect_events

        protected int num_suspect_events
      • suspect_history

        protected final BoundedList<java.lang.String> suspect_history
      • local_addr

        protected Address local_addr
      • last_ack

        protected volatile long last_ack
      • num_tries

        protected final java.util.concurrent.atomic.AtomicInteger num_tries
      • lock

        protected final java.util.concurrent.locks.Lock lock
      • ping_dest

        protected volatile Address ping_dest
      • members

        protected final java.util.List<Address> members
      • pingable_mbrs

        protected final java.util.List<Address> pingable_mbrs
        Members from which we select ping_dest. A copy of members minus the suspected members.
      • timeout_checker_future

        protected java.util.concurrent.Future<?> timeout_checker_future
      • heartbeat_sender_future

        protected java.util.concurrent.Future<?> heartbeat_sender_future
      • bcast_task

        protected final FD.Broadcaster bcast_task
        Transmits SUSPECT messages until a view change or an UNSUSPECT message is received.
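The bookkeeping such a broadcaster needs can be sketched as a set of pending suspects. This is an illustration under stated assumptions, not the `FD.Broadcaster` code: the class `SuspectBroadcastSketch` and its methods are hypothetical, and addresses are modelled as strings.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical model of the set of members that still need to be announced as suspected. */
final class SuspectBroadcastSketch {
    private final Set<String> suspected = new LinkedHashSet<>();

    /** A member became unreachable: start announcing it. */
    void suspect(String mbr) {
        suspected.add(mbr);
    }

    /** The member showed signs of life again: stop announcing it. */
    void unsuspect(String mbr) {
        suspected.remove(mbr);
    }

    /** Members that are absent from the new view no longer need to be announced. */
    void viewChanged(Collection<String> newMembers) {
        suspected.retainAll(newMembers);
    }

    /** Members for the next SUSPECT transmission; an empty set means the task can stop. */
    Set<String> pending() {
        return new LinkedHashSet<>(suspected);
    }
}
```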
    • Constructor Detail

      • FD

        public FD()
    • Method Detail

      • getLocalAddress

        public java.lang.String getLocalAddress()
      • getMembers

        public java.lang.String getMembers()
      • getPingableMembers

        public java.lang.String getPingableMembers()
      • getPingDest

        public java.lang.String getPingDest()
      • getNumberOfHeartbeatsSent

        public int getNumberOfHeartbeatsSent()
      • getNumSuspectEventsGenerated

        public int getNumSuspectEventsGenerated()
      • getTimeout

        public long getTimeout()
      • setTimeout

        public void setTimeout(long timeout)
      • getMaxTries

        public int getMaxTries()
      • setMaxTries

        public void setMaxTries(int max_tries)
      • getCurrentNumTries

        public int getCurrentNumTries()
      • printSuspectHistory

        public java.lang.String printSuspectHistory()
      • start

        public void start()
                   throws java.lang.Exception
        Description copied from class: Protocol
        This method is called on a JChannel.connect(String). Starts work. Protocols are connected and queues are ready to receive events. Will be called from bottom to top. This call will replace the START and START_OK events.
        Overrides:
        start in class Protocol
        Throws:
        java.lang.Exception - Thrown if protocol cannot be started successfully. This will cause the ProtocolStack to fail, so JChannel.connect(String) will throw an exception
      • stop

        public void stop()
        Description copied from class: Protocol
        This method is called on a JChannel.disconnect(). Stops work (e.g. by closing multicast socket). Will be called from top to bottom. This means that at the time of the method invocation the neighbor protocol below is still working. This method will replace the STOP, STOP_OK, CLEANUP and CLEANUP_OK events. The ProtocolStack guarantees that when this method is called all messages in the down queue will have been flushed.
        Overrides:
        stop in class Protocol
      • getPingDest

        protected Address getPingDest(java.util.List<Address> mbrs)
      • stopFailureDetection

        public void stopFailureDetection()
      • startFailureDetection

        public void startFailureDetection()
      • startMonitor

        protected void startMonitor()
        Requires lock to be held by caller
      • stopMonitor

        protected void stopMonitor()
        Requires lock to be held by caller
      • isMonitorRunning

        public boolean isMonitorRunning()
      • up

        public java.lang.Object up(Message msg)
        Description copied from class: Protocol
        A single message was received. Protocols may examine the message and do something (e.g. add a header) with it before passing it up.
        Overrides:
        up in class Protocol
      • up

        public void up(MessageBatch batch)
        Description copied from class: Protocol
        Sends up multiple messages in a MessageBatch. The sender of the batch is always the same, and so is the destination (null == multicast messages). Messages in a batch can be OOB messages, regular messages, or mixed messages, although the transport itself will create initial MessageBatches that contain only either OOB or regular messages.

        The default processing below sends messages up the stack individually, based on a matching criterion (calling Protocol.accept(org.jgroups.Message)), and - if true - calls Protocol.up(org.jgroups.Event) for that message and removes the message. If the batch is not empty, it is passed up, or else it is dropped.

        Subclasses should check if there are any messages destined for them (e.g. using MessageBatch.getMatchingMessages(short,boolean)), then possibly remove and process them and finally pass the batch up to the next protocol. Protocols can also modify messages in place, e.g. ENCRYPT could decrypt all encrypted messages in the batch, not remove them, and pass the batch up when done.

        Overrides:
        up in class Protocol
        Parameters:
        batch - The message batch
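The "remove what is yours, pass the rest up" pattern described for subclasses can be sketched without the JGroups types. Everything here is hypothetical: `BatchFilterSketch`, the `Msg` stand-in, and the idea of tagging each message with a protocol id (0 meaning no protocol header) are assumptions made for illustration and do not mirror the `MessageBatch` API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration of filtering a batch; not based on the real MessageBatch class. */
final class BatchFilterSketch {

    /** Stand-in for a message tagged with the id of the protocol whose header it carries. */
    static final class Msg {
        final short protocolId;
        final String payload;

        Msg(short protocolId, String payload) {
            this.protocolId = protocolId;
            this.payload = payload;
        }
    }

    /**
     * Removes all messages addressed to {@code myId} from {@code batch} and returns them.
     * The caller processes the returned messages and passes {@code batch} up only if it is
     * still non-empty.
     */
    static List<Msg> extractMatching(List<Msg> batch, short myId) {
        List<Msg> mine = new ArrayList<>();
        for (Iterator<Msg> it = batch.iterator(); it.hasNext();) {
            Msg m = it.next();
            if (m.protocolId == myId) {
                mine.add(m);
                it.remove();
            }
        }
        return mine;
    }
}
```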
      • down

        public java.lang.Object down(Event evt)
        Description copied from class: Protocol
        An event is to be sent down the stack. A protocol may want to examine its type and perform some action on it, depending on the event's type. If the event is a message MSG, then the protocol may need to add a header to it (or do nothing at all) before sending it down the stack using down_prot.down().
        Overrides:
        down in class Protocol
      • sendHeartbeatResponse

        protected void sendHeartbeatResponse(Address dest)
      • unsuspect

        protected void unsuspect(Address mbr)
      • updateTimestamp

        protected void updateTimestamp(Address sender)
      • computePingDest

        protected void computePingDest(Address remove)
        Computes pingable_mbrs (based on the current membership and the suspected members) and ping_dest
        Parameters:
        remove - The member to be removed from pingable_mbrs
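One way to picture this computation is the sketch below: drop the suspected members from a copy of the membership, then pick the member that follows the local one. The class `PingDestSketch`, its method names, the string addresses, and the wrap-around at the end of the list are all assumptions made for illustration; this is not the FD source.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical illustration of deriving the pingable members and the ping destination. */
final class PingDestSketch {

    /** Returns a copy of {@code members} without the suspected ones; order is preserved. */
    static List<String> pingable(List<String> members, Collection<String> suspected) {
        List<String> result = new ArrayList<>(members);
        result.removeAll(suspected);
        return result;
    }

    /**
     * Picks the member after {@code self} in {@code pingable}, wrapping at the end (assumed).
     * Returns null when there is nobody else left to ping.
     */
    static String pingDest(List<String> pingable, String self) {
        int idx = pingable.indexOf(self);
        if (idx < 0 || pingable.size() < 2)
            return null;
        return pingable.get((idx + 1) % pingable.size());
    }
}
```

With members `[A, B, C, D]` and B suspected, the sketch yields pingable members `[A, C, D]`, so A skips B and pings C instead.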