Node fails to join cluster with error: nodes are connected to the same database but different shared homes
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Your Bitbucket node fails to join or form a cluster, whether it is a newly provisioned node, a restarted node, or a node rolled back from DR or another environment, with the following message:
2023-02-14 19:27:45,375 WARN [hz.hazelcast.cached.thread-4] com.hazelcast.nio.tcp.TcpIpAcceptor [0.0.0.0]:5701 [BitbucketCluster] [3.12.9] com.atlassian.stash.internal.cluster.NodeConnectionException: Nodes are connected to the same database but different shared homes
com.atlassian.stash.internal.cluster.NodeConnectionException: Nodes are connected to the same database but different shared homes
at com.atlassian.stash.internal.cluster.DefaultClusterJoinManager.negotiateOutcome(DefaultClusterJoinManager.java:304)
at com.atlassian.stash.internal.cluster.DefaultClusterJoinManager.accept(DefaultClusterJoinManager.java:124)
at com.atlassian.stash.internal.hazelcast.ClusterJoinSocketInterceptor.onAccept(ClusterJoinSocketInterceptor.java:49)
at com.hazelcast.nio.NodeIOService.interceptSocket(NodeIOService.java:300)
at com.hazelcast.nio.tcp.TcpIpAcceptor$AcceptorIOThread.configureAndAssignSocket(TcpIpAcceptor.java:316)
at com.hazelcast.nio.tcp.TcpIpAcceptor$AcceptorIOThread.access$1400(TcpIpAcceptor.java:138)
at com.hazelcast.nio.tcp.TcpIpAcceptor$AcceptorIOThread$1.run(TcpIpAcceptor.java:305)
at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.lang.Thread.run(Thread.java:748)
at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64)
at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80)
... 1 frame trimmed
Another variation of this issue occurs when each node's shared homes point to the same mount volume but use different path names.
2024-06-21 10:00:00,051 INFO [hz.hazelcast.cached.thread-2] com.hazelcast.nio.tcp.TcpIpConnector [10.0.6.30]:5701 [bitbucket-dev] [3.12.13] Could not connect to: /140.8.43.72:5701. Reason: NodeConnectionException[Required property 'sharedHome' should be '/data/atlassian/bitbucket/shared' but is '/var/atlassian/application-data/bitbucket']
2024-06-21 10:00:00,051 INFO [hz.hazelcast.cached.thread-2] c.hazelcast.cluster.impl.TcpIpJoiner [10.0.6.30]:5701 [bitbucket-dev] [3.12.13] [10.0.6.31]:5701 is added to the blacklist.
2024-06-21 10:00:00,078 INFO [hz.hazelcast.cached.thread-3] com.hazelcast.nio.tcp.TcpIpConnector [10.0.6.30]:5701 [bitbucket-dev] [3.12.13] Could not connect to: /140.8.43.73:5701. Reason: NodeConnectionException[Required property 'sharedHome' should be '/data/atlassian/bitbucket/shared' but is '/var/atlassian/application-data/bitbucket']
2024-06-21 10:00:00,078 INFO [hz.hazelcast.cached.thread-3] c.hazelcast.cluster.impl.TcpIpJoiner [10.0.6.30]:5701 [bitbucket-dev] [3.12.13] [10.0.6.32]:5701 is added to the blacklist.
Environment
The solution has been validated in Bitbucket Data Center 7.21 but may be applicable to other versions.
Diagnosis
Here are a few things to check to diagnose that you are encountering this issue:
The node failing to join the cluster may log one of the following messages in atlassian-bitbucket.log:
NodeConnectionException: Nodes are connected to the same database but different shared homes
NodeConnectionException[Required property 'sharedHome' should be '<shared-home-path-2>' but is '<shared-home-path-1>']
If it is a single node starting, then you will be presented with an error page and the message "Nodes are connected to the same database but different shared homes."
Check your ${BITBUCKET_HOME}/shared path to ensure a stale cluster-join.txt file does not exist.
Verify that each node has the same path for ${BITBUCKET_HOME}/shared, including the underlying NFS path.
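To see which two paths the nodes disagree on, the 'sharedHome' message quoted above can be pulled out of atlassian-bitbucket.log. The following is a minimal sketch, not an Atlassian tool: the function name is hypothetical, and the pattern is based only on the message wording shown in this article, so it may need adjusting for other Bitbucket versions.

```python
import re

# Pattern based on the log message quoted in this article (assumption:
# other versions may word the message differently).
SHARED_HOME_MISMATCH = re.compile(
    r"Required property 'sharedHome' should be '(?P<should_be>[^']*)' "
    r"but is '(?P<but_is>[^']*)'"
)


def find_shared_home_mismatches(log_lines):
    """Return the distinct (should_be, but_is) path pairs found in log_lines."""
    pairs = []
    for line in log_lines:
        match = SHARED_HOME_MISMATCH.search(line)
        if match is None:
            continue
        pair = (match.group("should_be"), match.group("but_is"))
        if pair not in pairs:  # report each distinct pair only once
            pairs.append(pair)
    return pairs
```

Passing an open file handle for atlassian-bitbucket.log as log_lines works, because the function only iterates over lines. Each returned pair holds the two shared home paths that need to be reconciled.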
Cause
New node
If a newly provisioned node is joining the cluster, then this node does not have the same shared path as the rest of the cluster. You must verify that the ${BITBUCKET_HOME}/shared path is the same across all nodes. This includes the underlying path for the NFS mount.
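One way to compare the underlying NFS path is to look up which mount backs the shared home on each node. The sketch below assumes a Linux node, where the mount table is readable from /proc/mounts; the function name is hypothetical, and the parsing does not decode escaped characters (such as \040 for spaces) in mount points.

```python
import posixpath


def backing_mount(path, mounts_text):
    """Return (device, mount_point, fs_type) for the mount containing path.

    mounts_text is the content of /proc/mounts. The longest matching mount
    point wins; on a tie, the later entry is preferred.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type = fields[0], fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) >= len(best[1]):
                best = (device, mount_point, fs_type)
    return best
```

On each node, call it with the resolved ${BITBUCKET_HOME}/shared path and the text read from /proc/mounts, then compare the reported device (for an NFS mount, typically of the form server:/export) across nodes. If the devices differ, the nodes are not using the same underlying share even when the local path names match.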
Alternatively, Bitbucket writes a file called cluster-join.txt to the ${BITBUCKET_HOME}/shared path. The other nodes read this information and use it to join the cluster. A stale join file could contain outdated information that the starting node is reading.
Existing node / Restarts / DR failover
In other scenarios, no new node is being provisioned, but other activity has taken place, such as failing over from disaster recovery. It may appear that you are starting the cluster for the first time when you encounter this error; however, another node or instance is still running, connected to the same database, and using a different shared path. The nodes you are starting are therefore attempting to join that existing cluster and are rejected because the shared path differs.
Solution
New node
Verify the underlying path for ${BITBUCKET_HOME}/shared matches the other existing nodes.
Verify that the user running Bitbucket is able to read and write to ${BITBUCKET_HOME}/shared.
Delete any stale cluster-join.txt file from the ${BITBUCKET_HOME}/shared path and restart the affected node.
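The read/write check and the search for a leftover join file can be scripted. This is a minimal sketch with a hypothetical function name; it must be run as the same operating system user that runs Bitbucket, and it only reports what it finds. Deciding whether a cluster-join.txt file is stale, and deleting it, is left to the operator.

```python
import os
import tempfile
import time


def check_shared_home(shared_home):
    """Report read/write access and any cluster-join.txt in shared_home."""
    report = {"writable": False, "join_file": None, "join_file_age_seconds": None}
    try:
        # Create, write, read back, and automatically remove a probe file.
        with tempfile.NamedTemporaryFile(dir=shared_home) as probe:
            probe.write(b"probe")
            probe.flush()
            probe.seek(0)
            report["writable"] = probe.read() == b"probe"
    except OSError:
        # Missing directory or insufficient permissions.
        report["writable"] = False

    join_file = os.path.join(shared_home, "cluster-join.txt")
    if os.path.isfile(join_file):
        report["join_file"] = join_file
        report["join_file_age_seconds"] = time.time() - os.path.getmtime(join_file)
    return report
```

A join file whose age predates the last time any node in the current cluster was started is a reasonable candidate for the stale file described above.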
Existing node / Restarts / DR failover
Ensure that all nodes in the previous environment and configuration are no longer active. If one node in the old configuration is active while the new/original nodes are starting, you will encounter this error.
Delete any stale cluster-join.txt file from the ${BITBUCKET_HOME}/shared path and restart the affected node.
Ensure that all the nodes are seeing the same set of files under ${BITBUCKET_HOME}/shared. Test this by creating a file on one node and checking if it is reflected on other nodes.
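The create-a-file test can be sketched as a pair of helpers: one writes a uniquely named marker on the first node, the other checks for it on every other node. The marker prefix and function names are hypothetical, chosen only for this illustration.

```python
import os
import uuid

MARKER_PREFIX = "shared-home-check-"  # hypothetical file name prefix


def write_marker(shared_home):
    """Create a uniquely named marker file and return its token."""
    token = uuid.uuid4().hex
    marker = os.path.join(shared_home, MARKER_PREFIX + token)
    with open(marker, "w", encoding="utf-8") as handle:
        handle.write(token)
    return token


def marker_visible(shared_home, token):
    """Return True if the marker for token exists here with the right content."""
    marker = os.path.join(shared_home, MARKER_PREFIX + token)
    try:
        with open(marker, encoding="utf-8") as handle:
            return handle.read() == token
    except OSError:
        return False


def remove_marker(shared_home, token):
    """Delete the marker file once every node has been checked."""
    marker = os.path.join(shared_home, MARKER_PREFIX + token)
    if os.path.exists(marker):
        os.remove(marker)
```

Run write_marker on one node, pass the returned token to marker_visible on each of the other nodes, then call remove_marker. Depending on NFS client caching, a new file may take a few seconds to appear on other nodes, so retry before concluding that a node sees a different shared home.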