Cluster cache replication to a node fails with "Retry replication to node <node_id> failed, node still unreachable"

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

Problem

One or more nodes in Jira Data Center (the source) fail to replicate their cache to a given node (the target).

The following appears in atlassian-jira.log:

[c.a.jira.cluster.CuttingOffExecutorImpl] Retry replication to node QYVRH73JIRAP03 failed, node still unreachable. This was 8 attempt. Backing off for 300000 milliseconds.
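
When many nodes are involved, it can help to extract the target node ID, attempt count, and back-off from these messages. The following is a minimal sketch, assuming the message wording matches the sample line above exactly; the function name `parse_retry_line` and the returned dictionary keys are hypothetical and not part of Jira.

```python
import re

# Pattern derived from the sample log line above; other Jira versions
# may word the message differently (assumption).
RETRY_PATTERN = re.compile(
    r"Retry replication to node (?P<node>\S+) failed, node still unreachable\. "
    r"This was (?P<attempt>\d+) attempt\. "
    r"Backing off for (?P<backoff_ms>\d+) milliseconds\."
)


def parse_retry_line(line):
    """Return the node ID, attempt and back-off found in a log line, or None."""
    match = RETRY_PATTERN.search(line)
    if match is None:
        return None
    return {
        "node": match.group("node"),
        "attempt": int(match.group("attempt")),
        "backoff_ms": int(match.group("backoff_ms")),
    }


if __name__ == "__main__":
    import sys

    # Usage: python parse_retries.py /path/to/atlassian-jira.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for log_line in log_file:
            parsed = parse_retry_line(log_line)
            if parsed is not None:
                print(parsed)
```

The node ID reported here is the target node to look up in the clusternode table during diagnosis.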

Diagnosis

  1. Review the content of the clusternode table:

    SELECT * FROM clusternode;

    Jira uses this table as its reference for the cluster's nodes in Data Center operations.

  2. Get the ip column value for the target node. This column may contain an actual IP address or a hostname (DNS).

  3. Get the cache_listener_port value for the target node. By default, this is 40001.

  4. From each source node that fails to replicate its cache:

    1. Look up the IP or hostname of the target node:

      nslookup <target_node_ip_or_hostname>
    2. Test connectivity to the target node on the cache listener port:

      telnet <target_node_ip_or_hostname> <target_node_cache_listener_port>
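
If nslookup or telnet is not installed on the source node, the same two checks can be approximated with a short script. This is a minimal sketch using only the Python standard library; the function names `resolve_host` and `can_connect` are hypothetical, and the host and port are whatever you read from the ip and cache_listener_port columns of the clusternode table.

```python
import socket
import sys


def resolve_host(host):
    """Return the sorted IP addresses that host resolves to, or [] on DNS failure."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Usage: python check_node.py <target_node_ip_or_hostname> <target_node_cache_listener_port>
    target_host = sys.argv[1]
    target_port = int(sys.argv[2])

    addresses = resolve_host(target_host)
    print("Resolved addresses:", addresses if addresses else "none (DNS lookup failed)")
    print("TCP connection on port", target_port, "succeeded:", can_connect(target_host, target_port))
```

Run it from each source node that logs the error. An empty address list points to a DNS problem, while a failed connection with a successful lookup points to a firewall or routing problem, or to the target node not listening on that port.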

Cause

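This issue is generally caused by a network or DNS misconfiguration that prevents the source node from resolving the target node's address or from reaching it on its cache listener port.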

Solution

Resolution

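Work with your network team to allow connectivity between the Data Center nodes. Each source node must be able to resolve the target node's ip value from the clusternode table and reach it on its cache_listener_port.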

Updated on April 11, 2025
