Agents will not reconnect after the server hosting Bamboo ran out of disk space
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Problem
After an out-of-disk-space event on the Bamboo server, all agents (Elastic and/or Remote) have gone offline and do not reconnect, even though the storage situation has been resolved.
The following may appear in the agent logs, bamboo-elastic-agent.out or atlassian-bamboo-agent.log:
INFO | jvm 2 | 2017/06/14 14:12:10 | 2017-06-14 14:12:10,585 INFO [AgentRunnerThread] [AgentRegistrationBean] Registering agent on the server,
INFO | jvm 2 | 2017/06/14 14:17:13 | 2017-06-14 14:17:13,029 WARN [AgentRunnerThread] [RemoteAgent$1] Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'agentRegistrationBean': Invocation of init method failed; nested exception is org.springframework.remoting.RemoteTimeoutException: Receive timeout after 300000 ms for RemoteInvocation: method name 'registerAgent'; parameter types [com.atlassian.bamboo.buildqueue.RemotableRemoteAgentDefinition]
The above is a generic warning indicating that the agent timed out while trying to communicate with the messaging broker on the Bamboo Server. On its own, it does not imply that your Bamboo Server has run out of disk space or that the resolution below needs to be followed.
Errors similar to the below are present on the Bamboo Server within the atlassian-bamboo.log:
2017-06-13 21:48:27,985 ERROR [ConcurrentQueueStoreAndDispatch] [MessageDatabase] KahaDB failed to store to Journal
java.io.IOException: No space left on device
at java.io.RandomAccessFile.writeBytes(Native Method)
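To confirm that the server log really contains the disk-full errors shown above, rather than only the generic agent timeout, the log can be scanned for the two marker strings from the excerpt. The following is a minimal sketch, not an Atlassian tool: the function name `find_disk_full_errors` is hypothetical, and the location of atlassian-bamboo.log depends on your installation.

```python
from typing import List

# Marker strings copied from the atlassian-bamboo.log excerpt above.
DISK_FULL_MARKERS = (
    "KahaDB failed to store to Journal",
    "No space left on device",
)


def find_disk_full_errors(log_text: str) -> List[str]:
    """Return every log line that contains one of the disk-full markers.

    Hypothetical helper for illustration; it does a plain substring match
    and does not parse the log format.
    """
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in DISK_FULL_MARKERS)
    ]
```

A possible usage is to read the log file into a string (for example with `open(path, errors="replace").read()`, where `path` is wherever your atlassian-bamboo.log lives) and print the returned lines. An empty result suggests the agent timeouts may have a different cause.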
Cause
The ActiveMQ JMS Broker used for agent communication never recovered after the server ran out of disk space, despite the storage situation being resolved.
Resolution
Ensure there's enough disk space on the server hosting Bamboo.
Restart Bamboo so that the ActiveMQ JMS Broker also restarts successfully.
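The free-space check in the first step can be scripted so that it is verified before Bamboo is restarted. The sketch below uses Python's standard `shutil.disk_usage`; the helper name `has_enough_space` and the threshold are assumptions for illustration, and the path to check (for example the volume holding the Bamboo home directory) depends on your installation.

```python
import shutil


def has_enough_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the filesystem containing `path` has at least
    `min_free_bytes` bytes free.

    Hypothetical helper: the threshold is chosen by the administrator,
    it is not a value documented by Atlassian.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes
```

For example, `has_enough_space("/path/to/bamboo-home", 5 * 1024**3)` would check for 5 GiB free on that volume. Both the path and the 5 GiB figure are placeholders, not recommendations.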
Generally, after a system has run out of disk space entirely and the storage situation has been resolved, it's a good idea to restart the entire server (not just Bamboo).
It's not uncommon for certain Bamboo XML configuration files to become corrupt when the server runs out of disk space during operation. This can surface after the server is restarted and cause a Bamboo outage. Please be aware of the below two knowledge-base articles, which may be applicable in such an event: