StackOverflowError details are lost due to the stack size limit
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product; however, they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Under specific circumstances, Jira may fail with a StackOverflowError. In most cases this leaves the application in an inconsistent state, and a restart is recommended. When diagnosing the cause of the StackOverflowError, we want to examine the stack trace to identify where the stack originated and which functionality triggered the code loop. However, the JVM limits the number of recorded stack trace frames to 1024 by default, which may not be enough to identify the origin of the problem.
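As a minimal illustration of this truncation (a hypothetical demo class, not Jira code): the real stack at the moment of overflow is usually far deeper than 1024 frames, but under the default JVM setting only the first 1024 are recorded, so the frames that would reveal the origin are discarded.

```java
// Minimal sketch of stack trace truncation (hypothetical demo, not Jira code).
// recurse() overflows the stack; with the JVM default
// -XX:MaxJavaStackTraceDepth=1024, the recorded trace is capped at
// 1024 frames even though the actual stack is much deeper.
public class StackDepthDemo {
    static void recurse() {
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Prints at most 1024 under the default depth limit.
            System.out.println("Frames recorded: " + e.getStackTrace().length);
        }
    }
}
```

Running this with and without a larger -XX:MaxJavaStackTraceDepth value shows the difference in how many frames survive into the trace.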
Environment
Jira Server/Data Center
Diagnosis
Below is an example of a stack trace where the exact same method repeats from start to finish:
2021-02-03 17:05:00,036+0000 Caesium-1-3 ERROR ServiceRunner [c.a.s.caesium.impl.SchedulerQueueWorker] Unhandled exception thrown by job QueuedJob[jobId=com.atlassian.jira.service.JiraService:10000,deadline=1612371900000]
java.lang.StackOverflowError
at java.base/java.util.Collections$UnmodifiableCollection.isEmpty(Collections.java:1033)
at java.base/java.util.Collections$UnmodifiableCollection.isEmpty(Collections.java:1033)
at java.base/java.util.Collections$UnmodifiableCollection.isEmpty(Collections.java:1033)
...
at java.base/java.util.Collections$UnmodifiableCollection.isEmpty(Collections.java:1033)
Solution
Increase the size limit for stack traces with the following JVM argument:
-XX:MaxJavaStackTraceDepth=1000000
The value can be tuned; 1,000,000 frames should be more than enough. Once this argument has been added to the JVM, the application needs to be restarted (a rolling restart is sufficient for Data Center). The full stack trace then reveals the origin of the issue, as exemplified below:
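On a typical Linux installation the argument can be added to the startup script; the exact file and variable below are assumptions based on a default Jira layout, so adjust them to your deployment (e.g. service definitions or container environment variables).

```shell
# Assumed location: <jira-install-dir>/bin/setenv.sh (adjust to your install).
# Append the flag to the JVM arguments Jira is started with:
JVM_SUPPORT_RECOMMENDED_ARGS="${JVM_SUPPORT_RECOMMENDED_ARGS} -XX:MaxJavaStackTraceDepth=1000000"
```

After saving the change, restart each node so the new JVM argument takes effect.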
2021-02-11 21:19:00,028+0000 Caesium-1-2 ERROR ServiceRunner [c.a.s.caesium.impl.SchedulerQueueWorker] Unhandled exception thrown by job QueuedJob[jobId=com.atlassian.jira.service.JiraService:10000,deadline=1613078340000]
java.lang.StackOverflowError
at java.base/java.util.Collections$UnmodifiableCollection$1.<init>(Collections.java:1042)
at java.base/java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1041)
at java.base/java.util.Collections$UnmodifiableCollection$1.<init>(Collections.java:1042)
at java.base/java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1041)
at java.base/java.util.Collections$UnmodifiableCollection$1.<init>(Collections.java:1042)
...
~30,000 lines later
....
at java.base/java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1041)
at java.base/sun.security.ssl.ServerNameExtension$CHServerNameProducer.produce(ServerNameExtension.java:228)
at java.base/sun.security.ssl.SSLExtension.produce(SSLExtension.java:532)
at java.base/sun.security.ssl.SSLExtensions.produce(SSLExtensions.java:249)
at java.base/sun.security.ssl.ClientHello$ClientHelloKickstartProducer.produce(ClientHello.java:648)
at java.base/sun.security.ssl.SSLHandshake.kickstart(SSLHandshake.java:515)
at java.base/sun.security.ssl.ClientHandshakeContext.kickstart(ClientHandshakeContext.java:104)
at java.base/sun.security.ssl.TransportContext.kickstart(TransportContext.java:228)
at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:395)
at com.sun.mail.util.SocketFetcher.configureSSLSocket(SocketFetcher.java:619)