Cannot discover nodes, returning empty list AWS Hazelcast Discovery
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Confluence Data Center fails to discover nodes and logs a "Cannot discover nodes, returning empty list" warning followed by a "connect timed out" stack trace:
2020-03-19 16:11:23,627 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [1] retrying in 1 seconds...
2020-03-19 16:11:35,134 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [2] retrying in 2 seconds...
2020-03-19 16:11:47,396 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [3] retrying in 3 seconds...
2020-03-19 16:12:00,784 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [4] retrying in 5 seconds...
2020-03-19 16:12:15,857 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [5] retrying in 7 seconds...
2020-03-19 16:12:33,458 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [6] retrying in 11 seconds...
2020-03-19 16:12:54,857 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [7] retrying in 17 seconds...
2020-03-19 16:13:21,953 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [8] retrying in 25 seconds...
2020-03-19 16:13:57,589 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [9] retrying in 38 seconds...
2020-03-19 16:14:46,030 WARN [Catalina-utility-1] [hazelcast.aws.utility.RetryUtils] log Couldn't connect to the AWS service, [10] retrying in 57 seconds...
2020-03-19 16:15:53,699 WARN [Catalina-utility-1] [com.hazelcast.aws.AwsDiscoveryStrategy] log Cannot discover nodes, returning empty list
com.hazelcast.core.HazelcastException: java.net.SocketTimeoutException: connect timed out
at com.hazelcast.util.ExceptionUtil$1.create(ExceptionUtil.java:40)
at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:124)
at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:69)
at com.hazelcast.util.ExceptionUtil.rethrow(ExceptionUtil.java:129)
at com.hazelcast.aws.utility.RetryUtils.retry(RetryUtils.java:56)
at com.hazelcast.aws.impl.DescribeInstances.callServiceWithRetries(DescribeInstances.java:272)
at com.hazelcast.aws.impl.DescribeInstances.execute(DescribeInstances.java:262)
at com.hazelcast.aws.AWSClient.getAddresses(AWSClient.java:57)
at com.hazelcast.aws.AwsDiscoveryStrategy.discoverNodes(AwsDiscoveryStrategy.java:146)
at com.hazelcast.spi.discovery.impl.DefaultDiscoveryService.discoverNodes(DefaultDiscoveryService.java:71)
at com.hazelcast.internal.cluster.impl.DiscoveryJoiner.getPossibleAddresses(DiscoveryJoiner.java:70)
at com.hazelcast.internal.cluster.impl.DiscoveryJoiner.getPossibleAddressesForInitialJoin(DiscoveryJoiner.java:59)
at com.hazelcast.cluster.impl.TcpIpJoiner.joinViaPossibleMembers(TcpIpJoiner.java:131)
at com.hazelcast.cluster.impl.TcpIpJoiner.doJoin(TcpIpJoiner.java:90)
at com.hazelcast.internal.cluster.impl.AbstractJoiner.join(AbstractJoiner.java:135)
at com.hazelcast.instance.Node.join(Node.java:767)
at com.hazelcast.instance.Node.start(Node.java:411)
at com.hazelcast.instance.HazelcastInstanceImpl.<init>(HazelcastInstanceImpl.java:131)
at com.hazelcast.instance.HazelcastInstanceFactory.constructHazelcastInstance(HazelcastInstanceFactory.java:202)
at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:181)
at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:131)
at com.hazelcast.core.Hazelcast.newHazelcastInstance(Hazelcast.java:57)
at com.atlassian.confluence.cluster.hazelcast.HazelcastClusterManager.startCluster(HazelcastClusterManager.java:344)
at com.atlassian.confluence.cluster.hazelcast.HazelcastClusterManager.reconfigure(HazelcastClusterManager.java:316)
at com.atlassian.confluence.cluster.DefaultClusterConfigurationHelper.bootstrapCluster(DefaultClusterConfigurationHelper.java:407)
at com.atlassian.confluence.setup.DefaultBootstrapManager.afterConfigurationLoaded(DefaultBootstrapManager.java:831)
at com.atlassian.config.bootstrap.DefaultAtlassianBootstrapManager.init(DefaultAtlassianBootstrapManager.java:75)
at com.atlassian.confluence.setup.DefaultBootstrapManager.init(DefaultBootstrapManager.java:188)
at com.atlassian.config.util.BootstrapUtils.init(BootstrapUtils.java:36)
at com.atlassian.confluence.setup.ConfluenceConfigurationListener.initialiseBootstrapContext(ConfluenceConfigurationListener.java:133)
at com.atlassian.confluence.setup.ConfluenceConfigurationListener.contextInitialized(ConfluenceConfigurationListener.java:64)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4682)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5143)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1384)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1374)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.security.ssl.SSLSocketImpl.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.<init>(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.New(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(Unknown Source)
at com.hazelcast.aws.impl.DescribeInstances.callService(DescribeInstances.java:291)
at com.hazelcast.aws.impl.DescribeInstances$1.call(DescribeInstances.java:276)
at com.hazelcast.aws.impl.DescribeInstances$1.call(DescribeInstances.java:272)
at com.hazelcast.aws.utility.RetryUtils.retry(RetryUtils.java:52)
... 38 more
Environment
Confluence Data Center
AWS node discovery
New node is added to the cluster
Diagnosis
Add the following to <Confluence-Install>\conf\logging.properties, then restart Confluence to see which request is timing out:
sun.net.www.protocol.http.HttpURLConnection.level = FINEST
sun.net.www.protocol.http.HttpURLConnection.handlers = java.util.logging.ConsoleHandler
In the same file, change the logging level of the entry below from FINE to FINEST:
java.util.logging.ConsoleHandler.level = FINEST
Cause 1
Causes vary, but in one case an HTTP null response was returned from GET /latest/meta-data/iam/security-credentials/:
19-Mar-2020 17:13:10.135 FINE [Catalina-utility-1] sun.net.www.protocol.http.HttpURLConnection.writeRequests sun.net.www.MessageHeader@123456 pairs: {GET /latest/meta-data/iam/security-credentials/test-iam HTTP/1.1: null}{User-Agent: Java/1.8.0_171}{Host: 123.456.789.123}{Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2}{Connection: keep-alive}
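With FINEST logging enabled the output can be large, so it may help to reduce it to just the outgoing requests. The following is a minimal sketch, not an Atlassian-provided tool: the function name list_logged_requests is hypothetical, and it assumes the log lines have the same shape as the sample above.

```shell
# Hypothetical helper: reads log text on stdin and prints
# "<host> <request line>" for every request that
# HttpURLConnection.writeRequests logged.
# Assumes the "pairs: {GET ... HTTP/1.1: null}...{Host: ...}" layout
# shown in the sample log line above.
list_logged_requests() {
  grep 'HttpURLConnection.writeRequests' |
    sed -n 's/.*pairs: {\([A-Z]* [^ ]* HTTP\/[0-9.]*\): null}.*{Host: \([^}]*\)}.*/\2 \1/p'
}
```

For example, feeding it the log file that receives the console output (the exact file depends on your installation) lists each host and request, which makes the request that precedes the timeout easier to spot.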
Cause 2
If you see messages like the following, check that the IAM role in use and its permissions are correct, and update <Confluence-local-home>/confluence.cfg.xml if needed:
com.hazelcast.config.InvalidConfigurationException: Unable to retrieve credentials from IAM Role: <IAM-Role-Name>
at com.hazelcast.aws.impl.DescribeInstances.fillKeysFromIamRole(DescribeInstances.java:134)
...
Caused by: com.hazelcast.config.InvalidConfigurationException: Unable to lookup role in URI: http://169.254.169.254/latest/meta-data/iam/security-credentials/<IAM-Role-Name>
at com.hazelcast.aws.utility.MetadataUtil.retrieveMetadataFromURI(MetadataUtil.java:78)
...
Caused by: java.io.FileNotFoundException: http://169.254.169.254/latest/meta-data/iam/security-credentials/<IAM-Role-Name>
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1893)
Resolution
For Cause 1, add proxy connection information to your setenv file on all nodes to allow the connection to complete:
CATALINA_OPTS="-Dhttp.nonProxyHosts=localhost\|169.254.170.2\|169.254.169.254\|127.0.0.1 ${CATALINA_OPTS}"
CATALINA_OPTS="-Dhttps.nonProxyHosts=localhost\|169.254.170.2\|169.254.169.254\|127.0.0.1 ${CATALINA_OPTS}"
CATALINA_OPTS="-Dhttp.proxyHost=<the proxy url> -Dhttp.proxyPort=<the proxy port> ${CATALINA_OPTS}"
CATALINA_OPTS="-Dhttps.proxyHost=<the proxy url> -Dhttps.proxyPort=<the proxy port> ${CATALINA_OPTS}"
Restart Confluence after applying these settings to resolve the issue.
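As a quick sanity check of the settings above, the sketch below tests whether a CATALINA_OPTS string excludes the instance metadata address 169.254.169.254 from the HTTP proxy. The function name metadata_bypasses_proxy is hypothetical, and the check assumes the options are separated by spaces, as in the setenv lines above.

```shell
# Hypothetical helper: succeeds when the -Dhttp.nonProxyHosts option in the
# given options string ($1) contains the metadata address 169.254.169.254.
# Assumes options are space-separated and contain no embedded spaces.
metadata_bypasses_proxy() {
  printf '%s\n' "$1" |
    tr ' ' '\n' |
    grep -- '^-Dhttp\.nonProxyHosts=' |
    grep -Fq '169.254.169.254'
}
```

After sourcing your setenv file in a shell, running metadata_bypasses_proxy "$CATALINA_OPTS" && echo "metadata address bypasses the proxy" gives a simple yes/no answer. It does not verify that the proxy host and port themselves are correct.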
For Cause 2, check if the IAM role name is correct in <Confluence-local-home>/confluence.cfg.xml.
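One way to confirm the role name is to compare it with what the instance metadata service reports for the node. The sketch below makes several assumptions: the function name role_is_attached is hypothetical, the metadata endpoint is the one shown in the Cause 2 stack trace, and the instance answers unauthenticated (IMDSv1-style) requests. Instances that require session tokens need an extra token request that is not shown here.

```shell
# Hypothetical helper: succeeds when the configured role name ($1) appears
# as a whole line in the role list read from stdin.
role_is_attached() {
  grep -Fxq -- "$1"
}

# Usage on a cluster node (not run here; <IAM-Role-Name> is the value
# configured in confluence.cfg.xml). --noproxy keeps curl from sending
# the metadata request through a proxy:
#   curl -s --max-time 5 --noproxy 169.254.169.254 \
#     http://169.254.169.254/latest/meta-data/iam/security-credentials/ \
#     | role_is_attached "<IAM-Role-Name>" && echo "role matches" || echo "role mismatch"
```

A mismatch here would be consistent with the FileNotFoundException in the Cause 2 stack trace, since the lookup URL is built from the configured role name.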