Confluence process dies unexpectedly due to Linux OOM-Killer
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Problem
When Confluence is installed on a Linux host, the entire Confluence process suddenly terminates without warning. That is to say, the process ID (pid) is gone.
The browser shows a generic "cannot connect" or similar error, indicating that it is not able to reach the webpage
Nothing out of the ordinary appears in the Confluence application logs (<confluence_home>/logs/atlassian-confluence.log), since the application was terminated without properly shutting down
Similarly, Tomcat logs (<confluence_install>/logs/catalina.out) also do not show errors
Causes and Diagnosis
Confluence does not terminate its own pid unless shutdown.sh or stop-confluence.sh is run. If either of these commands is issued, the atlassian-confluence.log file contains entries indicating that Confluence is in the process of shutting down. If the Confluence pid is terminated unexpectedly and no such entries exist, an external factor is responsible.
For example:
A kill -9 command
Some sort of a script that issues a kill command
Some other mechanism outside of the application's control that is killing the process, e.g. the Linux OOM-Killer
This KB article focuses on the Linux OOM-Killer, a feature on some Linux installations that sacrifices processes to free up memory when the operating system does not have enough memory for its own operations. Please note that this is different from Confluence running out of memory. In this case, the OS itself is in danger of running out of memory, so it starts terminating processes to avoid that.
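To get a rough sense of how likely the OOM-Killer is to pick the Confluence process, the per-process score that the kernel exposes under /proc can be read. The following is a minimal Python sketch, not an Atlassian-provided tool. The function name read_oom_scores and the proc_root parameter are illustrative assumptions; /proc/<pid>/oom_score and /proc/<pid>/oom_score_adj are standard Linux kernel files.

```python
from pathlib import Path


def read_oom_scores(pid, proc_root="/proc"):
    """Return (oom_score, oom_score_adj) for pid, or None if unavailable.

    proc_root is a hypothetical parameter so the function can be pointed
    at a different directory (for example, in tests).
    """
    base = Path(proc_root) / str(pid)
    try:
        # Higher oom_score means the kernel is more likely to kill the process.
        score = int((base / "oom_score").read_text().strip())
        # oom_score_adj is the administrator-set adjustment (-1000 to 1000).
        adj = int((base / "oom_score_adj").read_text().strip())
    except (OSError, ValueError):
        # The process is gone, the files are unreadable, or this is not Linux.
        return None
    return score, adj
```

For example, read_oom_scores(2753) would report the values for pid 2753 while that process is still running. Once the process has been killed, the function returns None.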
On the host machine, look in the /var/log/ directory for the syslog or messages file, and locate the entries around the approximate time when the pid was terminated. If you see entries similar to the following, the process was killed by the OOM-Killer:
Jan 15 04:20:30 confluence-a kernel: [370087.856050] Out of memory: Kill process 2753 (java) score 320 or sacrifice child
Jan 15 04:20:30 confluence-a kernel: [370087.857773] Killed process 2753 (java) total-vm:2256656kB, anon-rss:400988kB, file-rss:0kB
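The search for these entries can be scripted. The following is a minimal Python sketch that assumes the kernel messages look like the sample above. The function names find_oom_kills and scan_log_files are illustrative, and the message wording can differ between kernel versions, so the pattern may need adjusting.

```python
import re
from pathlib import Path

# Matches kernel lines such as:
#   "... Killed process 2753 (java) total-vm:2256656kB, ..."
KILLED_RE = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def find_oom_kills(lines, process_name="java"):
    """Return (pid, raw_line) for each OOM-Killer kill of process_name."""
    hits = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match and match.group("name") == process_name:
            hits.append((int(match.group("pid")), line.rstrip("\n")))
    return hits


def scan_log_files(paths=("/var/log/syslog", "/var/log/messages")):
    """Scan whichever candidate log files exist and are readable."""
    results = []
    for path in map(Path, paths):
        try:
            with path.open(errors="replace") as handle:
                results.extend(find_oom_kills(handle))
        except OSError:
            # File is missing or unreadable (reading often requires root).
            continue
    return results
```

Calling scan_log_files() returns one (pid, line) pair for each matching "Killed process" entry. Compare the timestamps in the returned lines with the time Confluence went down.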
Resolution
In the case of the OOM-Killer, the possible resolutions would be to:
Increase the amount of memory available on the host machine itself
Decrease the amount of memory allocated to Confluence, or to competing processes on the machine
Disable the OOM-Killer (not recommended)
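When weighing the first two options, it can help to compare the heap size configured for Confluence against the physical memory on the host. The following is a rough Python sketch under stated assumptions. The function names and the reserve_mib default of 1024 MiB are illustrative placeholders, not Atlassian sizing guidance. Note that the JVM also uses memory beyond the heap, and other processes on the host need memory too.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def heap_fits_in_host(meminfo_text, heap_mib, reserve_mib=1024):
    """Return True if heap_mib plus a reserve fits within MemTotal.

    reserve_mib is a hypothetical allowance for the OS, non-heap JVM
    memory, and other processes; choose a value that suits the host.
    """
    total_mib = parse_meminfo(meminfo_text)["MemTotal"] // 1024
    return heap_mib + reserve_mib <= total_mib
```

On a Linux host, the input text can be obtained with open("/proc/meminfo").read(), and heap_mib would be the maximum heap size (-Xmx) configured for Confluence, expressed in MiB.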
ℹ️ For additional information on how the OOM-Killer operates, see: http://prefetch.net/blog/index.php/2009/09/30/how-the-linux-oom-killer-works/