Jira startup failure on Kubernetes due to startup probe killing pod and leaving lock record in the database

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

Jira deployments on a Kubernetes cluster that use the Helm chart have a startup probe configured for the pods. With the default settings in values.yaml, this probe checks the HTTP port every 5 seconds (periodSeconds), up to 120 times (failureThreshold), which gives Jira 10 minutes to start. If the pod does not pass the startup probe within that window, pod shutdown is initiated. After 30 seconds (the default terminationGracePeriodSeconds), Kubernetes kills the JVM. This leaves lock records in the database, which prevent Jira from starting successfully.

Environment

Kubernetes deployments using the Helm chart

Diagnosis

The following log entries can be seen in atlassian-jira.log.

    Caused by: java.lang.IllegalStateException: Too many rows updated in JiraClusterLockQueryDSLDao: 2 for lock name: com.atlassian.jira.cluster.zdu.DefaultClusterUpgradeStateManager.clusterUpgradeState
        ... 2 filtered
        at com.atlassian.beehive.db.DatabaseClusterLock.tryLockRemotely(DatabaseClusterLock.java:264)
        at com.atlassian.beehive.db.DatabaseClusterLock$Attempt.perform(DatabaseClusterLock.java:632)

Cause

The Jira Helm chart configures a startup probe that checks the Jira HTTP port every 5 seconds, up to 120 times. The default settings in values.yaml are as follows.

    # Confirm that Jira is up and running with a StartupProbe
    # https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-startup-probes
    #
    startupProbe:

      # -- Whether to apply the startupProbe check to pod.
      #
      enabled: true

      # -- How often (in seconds) the Jira container startup probe will run
      #
      periodSeconds: 5

      # -- The number of consecutive failures of the Jira container startup probe
      # before the pod fails startup checks.
      #
      failureThreshold: 120
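As a quick check of the arithmetic behind these defaults, the shell sketch below multiplies the two values. The function name startup_window_seconds is hypothetical, and the calculation ignores any initial delay or per-probe timeout, so treat the result as an approximation of the window rather than an exact figure.

```shell
# Approximate time (in seconds) the startup probe allows before the
# container is restarted. Hypothetical helper, not part of the Helm chart.
# Arguments: periodSeconds failureThreshold
startup_window_seconds() {
  echo $(( $1 * $2 ))
}

# Defaults quoted above: periodSeconds=5, failureThreshold=120
startup_window_seconds 5 120   # prints 600, i.e. 10 minutes
```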

When Jira is slow to start and takes more than 10 minutes (5 seconds × 120 attempts), the startup probe fails and Kubernetes restarts the pod. The shutdown grace period is 30 seconds; after that, the kubelet kills the process. Because Jira has not finished starting, it cannot stop gracefully either. Killing the Jira process during startup leaves lock rows behind in the clusterlockstatus table, which causes Jira to log the following error after the restart.

    Caused by: java.lang.IllegalStateException: Too many rows updated in JiraClusterLockQueryDSLDao: 2 for lock name: com.atlassian.jira.cluster.zdu.DefaultClusterUpgradeStateManager.clusterUpgradeState
        ... 2 filtered
        at com.atlassian.beehive.db.DatabaseClusterLock.tryLockRemotely(DatabaseClusterLock.java:264)
        at com.atlassian.beehive.db.DatabaseClusterLock$Attempt.perform(DatabaseClusterLock.java:632)

Solution

To start Jira again, the lock records must be removed from the database.

  • Scale down the Jira StatefulSet to remove all the pods.

    kubectl scale sts jira --replicas=0 -n <namespace>
  • Delete the lock records from the database table.

    DELETE FROM jiraschema.clusterlockstatus WHERE lock_name = 'com.atlassian.jira.cluster.zdu.DefaultClusterUpgradeStateManager.clusterUpgradeState';
  • Scale up the Jira StatefulSet to start the first pod.

    kubectl scale sts jira --replicas=1 -n <namespace>
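
Before deleting the lock records, it can help to confirm that the old pod has actually terminated. The sketch below wraps the scale-down step in a hypothetical helper, scale_down_and_wait, and then waits for the first pod to be deleted. It assumes the StatefulSet is named jira as in the commands above, so its first pod is jira-0. The 120-second timeout is an arbitrary choice, and a deployment with more replicas would need the same check for each pod.

```shell
# Scale the StatefulSet to zero, then block until its first pod is gone.
# Hypothetical helper; the pod name <statefulset>-0 follows standard
# StatefulSet naming, and the timeout is an assumption.
# Arguments: namespace [statefulset name, defaults to "jira"]
scale_down_and_wait() {
  namespace=$1
  sts=${2:-jira}
  kubectl scale sts "$sts" --replicas=0 -n "$namespace" &&
    kubectl wait --for=delete "pod/${sts}-0" -n "$namespace" --timeout=120s
}

# Usage (replace the namespace with your own):
#   scale_down_and_wait my-namespace
```

Once the function returns successfully, the DELETE statement and the scale-up step can be run as described above.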

To prevent the issue from recurring, Jira must either start within the time allowed by the startup probe, or the startup probe must be disabled. The following methods help Jira start within the allowed time:

  • Increase the total duration of the startup probe in the values.yaml file so that Jira has enough time to start. Run helm upgrade to apply the change to the deployment.

    # Confirm that Jira is up and running with a StartupProbe
    # https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-startup-probes
    #
    startupProbe:

      # -- Whether to apply the startupProbe check to pod.
      #
      enabled: true

      # -- How often (in seconds) the Jira container startup probe will run
      #
      periodSeconds: <custom period seconds>

      # -- The number of consecutive failures of the Jira container startup probe
      # before the pod fails startup checks.
      #
      failureThreshold: <custom failureThreshold>
  • Resolve resource-related bottlenecks on the server or database to improve startup time.
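
When choosing a custom failureThreshold, one approach is to work backwards from how long Jira is observed to take to start. The shell sketch below rounds up the division of a target startup time by periodSeconds. The function name min_failure_threshold and the 20-minute target are illustrative assumptions, not recommended values; leave extra headroom above the observed startup time.

```shell
# Smallest failureThreshold whose window (periodSeconds x failureThreshold)
# covers the target startup time. Hypothetical helper for illustration.
# Arguments: target_seconds periodSeconds
min_failure_threshold() {
  target_seconds=$1
  period_seconds=$2
  # Integer ceiling division
  echo $(( (target_seconds + period_seconds - 1) / period_seconds ))
}

# Example: allow 20 minutes (1200 seconds) with periodSeconds=5
min_failure_threshold 1200 5   # prints 240
```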

Updated on March 20, 2025
