Cluster communication problems: "Member has left cluster", "Member has been forcefully evicted from cluster", or "A potential communication problem has been detected"

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product; however, they have not been tested. Support for Server* products ended on February 15th, 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

This article applies to clustered Confluence 5.4 or earlier.

Symptoms

The Confluence cluster is not working as expected. For example, you cannot start a second node without the new member being evicted from the cluster.

The following appears in the atlassian-confluence.log:

2014-05-11 18:24:05,957 WARN [Logger@9233091 3.3.1/389] [Coherence] log 2014-05-11 18:24:05.957 Oracle Coherence GE 3.3.1/389 <Warning> (thread=PacketPublisher, member=2): A potential communication problem has been detected. A packet has failed to be delivered (or acknowledged) after 45 seconds, although other packets were acknowledged by the same cluster member (Member(Id=1, Timestamp=2014-05-11 18:17:48.519, Address=xxx.xxx.xxx.x:8090, MachineId=12345, Location=process:1234@CONFLUENCE01)) to this member (Member(Id=2, Timestamp=2014-05-11 18:23:16.19, Address=xxx.xxx.xxx.x:8090, MachineId=67891, Location=process:1234@CONFLUENCE02)) as recently as 0 seconds ago. It is possible that the packet size greater than 1468 is responsible; for example, some network equipment cannot handle packets larger than 1472 bytes (IPv4) or 1468 bytes (IPv6). Use the 'ping' command with the <size> option to verify successful delivery of specifically sized packets. Other possible causes include network failure, poor thread scheduling (see FAQ if running on Windows), an extremely overloaded server, a server that is attempting to run its processes using swap space, and unreasonably lengthy GC times.

2014-05-11 18:13:49,218 WARN [Logger@9226875 3.3.1/389] [Coherence] log 2014-05-11 18:13:49.218 Oracle Coherence GE 3.3.1/389 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; the member appears to be alive, but exhibits long periods of unresponsiveness; removing Member(Id=1, Timestamp=2014-05-11 18:09:52.641, Address=xxx.xxx.xxx.x:8090, MachineId=41352, Location=process:1234@CONFLUENCE01)

2014-05-11 18:13:49,249 INFO [Cluster:EventDispatcher] [confluence.cluster.coherence.TangosolClusterManager] memberLeft Member has left cluster: Member(Id=1, Timestamp=2014-05-11 18:13:49.218, Address=xxx.xxx.xxx.x:8090, MachineId=12345, Location=process:1234@CONFLUENCE01)

2014-05-11 18:13:49,436 WARN [Logger@9226875 3.3.1/389] [Coherence] log 2014-05-11 18:13:49.436 Oracle Coherence GE 3.3.1/389 <Warning> (thread=Cluster, member=2): The member formerly known as Member(Id=1, Timestamp=2014-05-11 18:13:49.218, Address=xxx.xxx.xxx.x:8090, MachineId=12345, Location=process:1234@CONFLUENCE01) has been forcefully evicted from the cluster, but continues to emit a cluster heartbeat; henceforth, the member will be shunned and its messages will be ignored.
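To confirm that these events are occurring, you can search the Confluence log for the relevant messages. A minimal sketch, assuming the default log location under the Confluence home directory; adjust the path for your installation:

# Search for the cluster warnings shown above in the Confluence application log
grep -E "has left cluster|forcefully evicted|potential communication problem" <confluence-home>/logs/atlassian-confluence.log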

Cause

There are multiple potential causes for this issue:

  1. The packet size is too large for the network configuration to handle

  2. Long garbage collection (GC) pauses

  3. Other environmental issues:

    1. Network failure

    2. A server or VM running its processes from swap space

    3. An otherwise overloaded server

Workaround

Start just one node and allow it to serve your users independently while you investigate the root cause of the issue.

Resolution

Packet Size

Run these commands from one cluster node against the other node to confirm that larger packets are allowed through your network:

ping <other-node>
ping -l 1500 <other-node>
ping -l 3000 <other-node>

Note: the -l option sets the packet size on Windows; on Linux, the equivalent is ping -s <size>.

If any of these fail, ask your network administrators to allow larger packet sizes between the cluster nodes.
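You can also set the "don't fragment" flag to find the largest packet the network path between the nodes will carry without fragmentation. A minimal sketch, where <other-node> stands for the address of the other cluster node:

# Windows: -f sets don't-fragment, -l sets the payload size in bytes
ping -f -l 1472 <other-node>

# Linux: -M do sets don't-fragment, -s sets the payload size in bytes
ping -M do -s 1472 <other-node>

# A 1472-byte payload plus 28 bytes of IP/ICMP headers makes a 1500-byte packet,
# the standard Ethernet MTU. If the ping fails, lower the size until it succeeds
# to find the path's effective MTU.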

Garbage Collection

  1. Enable GC logging, as described in How to Enable Garbage Collection (GC) Logging (a sketch follows this list)

  2. Review the logs using a tool like GCViewer

  3. Raise a Support Request if you'd like Support to help you analyse the logs and determine whether GC pauses are causing the issue

  4. Follow these guidelines to reduce the size of your heap and bring GC pause times down
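As a sketch of step 1: on the Java 7/8 JVMs used by Confluence 5.x, GC logging can be enabled by adding JVM flags to CATALINA_OPTS in Confluence's setenv.sh (setenv.bat on Windows). The file location and any existing options in your installation may differ:

# In <confluence-install>/bin/setenv.sh (assumed location; adjust for your installation)
# -verbose:gc and -XX:+PrintGCDetails log each collection; -XX:+PrintGCDateStamps
# adds wall-clock timestamps; -Xloggc: directs the output to a file
CATALINA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:${CATALINA_BASE}/logs/gc.log ${CATALINA_OPTS}"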

Other environmental issues

Ask your network and infrastructure administrators to investigate the current state of the network and of the server itself.
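As a first check on a node's health, these commands show whether the server is swapping or overloaded. A minimal sketch, assuming a Linux server; your administrators will have their own preferred tooling:

# Memory and swap usage in megabytes; significant swap use on a JVM host is a red flag
free -m

# CPU, memory, and swap activity sampled every 5 seconds; watch the si/so (swap in/out) columns
vmstat 5

# Load averages relative to the number of CPU cores
uptime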
