Last node to start takes over the system and answers all the requests

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

The last node to start always takes over from the node that was already running. Only the most recently started node serves requests.

For example, start node A and everything works. Then start node B: once it has started, node B takes over the cluster and node A is no longer part of it.

The same happens in the reverse order: if node B is started first, node A takes over the system once it has started.

Environment

Confluence Data Center using a clustered setup.

Diagnosis

  1. Start node A

  2. Go to the Cog icon → General Configuration → Clustering. Node A appears there as the only node in the cluster

  3. Start node B

  4. After some time, check the Clustering page again. Only node B is listed, although node A is still running fine at OS level

The logs do not show any errors, but comparing the start-up logs of both nodes reveals the following behaviour:

Start-up example first node

This is the start-up log from node A. The third line shows that both node IP addresses are present in the Hazelcast join configuration, and node A then starts without problems:

2020-08-20 14:25:36,891 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] contextInitialized Starting Confluence XXX [build XXXX based on commit hash ] - synchrony version XXXXXXX
2020-08-20 14:25:40,710 INFO [Catalina-utility-1] [atlassian.confluence.cluster.DefaultClusterConfigurationHelper] lambda$populateExistingClusterSetupConfig$5 Populating setup configuration if running with Cluster mode...
2020-08-20 14:25:41,002 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] configure Configuring Hazelcast with instanceName [confluence], join configuration TCP/IP member addresses: NODE_A_IP|NODE_B_IP, network interfaces [NODE_A_IP] and local port XXXXXX
2020-08-20 14:25:41,002 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Starting the cluster.
2020-08-20 14:27:52,859 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Confluence cluster node identifier is [XXXXXXXXX]
2020-08-20 14:27:52,860 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Confluence cluster node name is [NODE_A_NAME]
2020-08-20 14:27:52,933 INFO [Catalina-utility-1] [springframework.web.context.ContextLoader] initWebApplicationContext Root WebApplicationContext: initialization started
2020-08-20 14:27:56,713 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] <init> Loading EhCache cache manager
2020-08-20 14:28:04,860 INFO [Catalina-utility-1] [cluster.hazelcast.monitoring.HazelcastMembershipListener] init init: cluster ClusterService{address=[NODE_A_IP]:XXXX}
2020-08-20 14:28:04,862 INFO [Catalina-utility-1] [cluster.hazelcast.monitoring.HazelcastMembershipListener] init init: cluster contains Member [NODE_A_IP]:XXXXX - XXXXXXXXXXXXXXXXXXXXXXX this

The last line of the log above shows node A as the only node in the cluster, which is the correct state at this point.

Start-up example second node

Then, once you start node B, its log shows a similar sequence:

2020-08-20 14:32:43,165 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] contextInitialized Starting Confluence XXXX [build XXXX based on commit hash XXXXXX] - synchrony version XXXXXXX
2020-08-20 14:32:46,310 INFO [Catalina-utility-1] [atlassian.confluence.cluster.DefaultClusterConfigurationHelper] lambda$populateExistingClusterSetupConfig$5 Populating setup configuration if running with Cluster mode...
2020-08-20 14:32:46,666 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] configure Configuring Hazelcast with instanceName [confluence], join configuration TCP/IP member addresses: NODE_A_IP|NODE_B_IP, network interfaces [10.31.199.79] and local port 5801
2020-08-20 14:32:46,667 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Starting the cluster.
2020-08-20 14:34:58,581 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Confluence cluster node identifier is [XXXXXXXX]
2020-08-20 14:34:58,581 INFO [Catalina-utility-1] [confluence.cluster.hazelcast.HazelcastClusterManager] startCluster Confluence cluster node name is [NODE_B_NAME]
2020-08-20 14:34:58,673 INFO [Catalina-utility-1] [springframework.web.context.ContextLoader] initWebApplicationContext Root WebApplicationContext: initialization started
2020-08-20 14:35:02,293 INFO [Catalina-utility-1] [com.atlassian.confluence.lifecycle] <init> Loading EhCache cache manager
2020-08-20 14:35:09,931 INFO [Catalina-utility-1] [cluster.hazelcast.monitoring.HazelcastMembershipListener] init init: cluster ClusterService{address=[NODE_B_IP]:XXXX}
2020-08-20 14:35:09,933 INFO [Catalina-utility-1] [cluster.hazelcast.monitoring.HazelcastMembershipListener] init init: cluster contains Member [NODE_B_IP]:XXXXX - XXXXXXXXXXXXXXXXXXXXXXX this

The important point in these logs is the last line. It lists node B as a cluster member, but node A is not listed alongside it, even though node A is still running and reports no problems on its side.
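The comparison described above can be scripted. The following is a minimal sketch, not an Atlassian tool: it assumes the log wording matches the excerpts shown here (other Confluence versions may phrase these lines differently), and the names `summarize` and `missing_members` are hypothetical helpers introduced for illustration. It reads one node's start-up log and reports which configured member addresses never appear as cluster members.

```python
import re

# Patterns are based on the log lines quoted in this article (an assumption:
# other versions may word these messages differently).
CONFIGURED = re.compile(r"TCP/IP member addresses: (\S+?),")
MEMBER = re.compile(r"cluster contains Member \[([^\]]+)\]")


def summarize(log_text):
    """Return (configured_addresses, members_seen) for one node's start-up log."""
    configured = set()
    for match in CONFIGURED.finditer(log_text):
        # The configured addresses are separated by "|" in the log line.
        configured.update(match.group(1).split("|"))
    members = set(MEMBER.findall(log_text))
    return configured, members


def missing_members(log_text):
    """Configured addresses that never show up as cluster members in the log."""
    configured, members = summarize(log_text)
    return configured - members


if __name__ == "__main__":
    # Shortened version of the node B excerpt, with placeholder addresses.
    sample = (
        "configure Configuring Hazelcast with instanceName [confluence], "
        "join configuration TCP/IP member addresses: NODE_A_IP|NODE_B_IP, "
        "network interfaces [NODE_B_IP] and local port 5801\n"
        "init init: cluster contains Member [NODE_B_IP]:5801 - XXXX this\n"
    )
    print(sorted(missing_members(sample)))
```

If a node that is known to be running shows up as missing in the log of a node that started later, the two nodes have not joined the same cluster. A missing address is expected, and harmless, when that node simply had not been started yet.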

Cause

This issue happens when communication between the nodes is not working, which is consistent with each node's log listing only its own address as a cluster member. The most common cause is the Hazelcast communication port being blocked on one or more of the nodes.

Solution

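Ensure that all the hosts in the cluster can communicate with each other, in both directions, on the Hazelcast internal communication port. The default Hazelcast port is 5701; if the "local port" value in a node's start-up log differs (the node B log above reports 5801), open that port instead.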
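A quick way to verify this is to attempt a TCP connection from each node to every other node. The sketch below uses only the Python standard library; the function names `port_reachable` and `check_peers` are hypothetical, the addresses are placeholders, and the port defaults to 5701 as stated above, so pass the port from your own start-up log if it differs.

```python
import socket


def port_reachable(host, port=5701, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


def check_peers(peers, port=5701, timeout=3.0):
    """Map each peer address to whether its port accepted a connection."""
    return {host: port_reachable(host, port, timeout) for host in peers}


if __name__ == "__main__":
    # Placeholder addresses (assumption): replace with the other nodes' real IPs.
    for host, ok in check_peers(["NODE_A_IP", "NODE_B_IP"]).items():
        print(f"{host}: {'reachable' if ok else 'NOT reachable'}")
```

Run the check on every node against all the others, because a firewall rule may block traffic in one direction only. A successful connection only shows that the port accepts TCP connections from that host while Confluence is running on the target node; it does not confirm that the nodes have actually formed one cluster, so check the Clustering page again afterwards.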

Updated on February 25, 2025
