Health Check: Automation Queue Processing Time
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
To use the Automation Queue Processing Time health check, upgrade the ATST plugin to at least version 2.6.0 and Jira to at least version 10.4.0.
Solution
About the health check
This health check monitors the health of automation rule executions. Async rules in automation are executed through a queue mechanism. Messages are inserted into the queue in response to Jira events or when an async rule execution encounters branches. The health check reports as unhealthy when the automation queue is experiencing processing delays, that is, when it contains unprocessed and unclaimed messages whose age exceeds the configured threshold.
Understanding the results
Jira Data Center
Icon | Result | What this means |
---|---|---|
✅ Pass | Events are regularly processed in the automation queue, and the age of the items is less than the alert threshold | All events are claimed promptly from the automation queue. |
⚠️ Warning | There are unprocessed and unclaimed messages in the automation queue | Processing of the automation queue is slow: the age of the unprocessed messages is high, which could impact the performance of the automation process. An administrator should troubleshoot the issue. |
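The pass/warning decision in the table comes down to comparing the age of the oldest unprocessed, unclaimed message with the alert threshold. The following is a minimal shell sketch of that comparison, for illustration only. The function name `queue_health` is hypothetical, the 3600-second default mirrors the documented default threshold, and treating an age exactly equal to the threshold as a pass is an assumption.

```shell
# Illustrative only: this is not the health check's actual implementation.
# Usage: queue_health OLDEST_AGE_SECONDS [THRESHOLD_SECONDS]
queue_health() {
  oldest_age_seconds="$1"
  threshold_seconds="${2:-3600}"   # documented default alert threshold

  # Assumption: only an age strictly greater than the threshold warns.
  if [ "$oldest_age_seconds" -gt "$threshold_seconds" ]; then
    echo "Warning"
  else
    echo "Pass"
  fi
}
```

For example, `queue_health 4000` prints `Warning` with the default threshold, while `queue_health 120` prints `Pass`.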
What happens if I ignore the warning?
The performance of the automation process could be affected, which means that some automation rules may be delayed in executing.
Troubleshooting
As a first step, check whether you are receiving this alert with no degradation in system performance or impact on business processes that rely on automation. If that is the case, you could consider increasing the alert threshold.
The alert threshold can be changed by adding the property jira.diagnostics.thresholds.earliest-unprocessed-item-seconds to the jira-config.properties file. The default value of the alert threshold is 3600 seconds.
Edit alert configuration
For the steps to edit the jira-config.properties file, see "Edit the jira-config.properties file in Jira server" in the Atlassian documentation.
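As a rough illustration of the property change, the sketch below adds or updates the threshold in a properties file. The helper name `set_alert_threshold` and the value 7200 are hypothetical examples. It assumes you pass the path to your own jira-config.properties (typically in the Jira home directory) and that Jira is restarted afterwards for the change to take effect.

```shell
# Hypothetical helper: add or update the alert threshold in a properties file.
# Usage: set_alert_threshold /path/to/jira-config.properties 7200
set_alert_threshold() {
  props_file="$1"
  seconds="$2"
  key="jira.diagnostics.thresholds.earliest-unprocessed-item-seconds"

  touch "$props_file"
  tmp_file="$(mktemp)"

  # Drop any existing line for this key, keeping every other property.
  # grep exits non-zero when nothing is left to print, hence "|| true".
  grep -v "^${key}[[:space:]]*=" "$props_file" > "$tmp_file" || true

  # Append the new value in "key = value" form.
  printf '%s = %s\n' "$key" "$seconds" >> "$tmp_file"

  # Write back in place so the original file keeps its owner and permissions.
  cat "$tmp_file" > "$props_file"
  rm -f "$tmp_file"
}
```

After running it with 7200, the file would contain the line `jira.diagnostics.thresholds.earliest-unprocessed-item-seconds = 7200`, doubling the default threshold.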
If you are receiving this alert continuously and notice a lot of delay in rule execution, it could potentially start to impact business processes that rely on automation.
You can follow the steps below to troubleshoot further:
Check whether there is system-wide performance degradation (slow page loads, slow JQL, etc.) in Jira. If so, automation performance is degraded as well. Check the articles below to troubleshoot further
Investigate rule execution: Identify the rules that take the most execution time and consider optimising or disabling them using the articles below
Investigate queue
Run the curl command below to identify the types of events that contribute to queue growth, and assess whether those events are expected during normal system operation. If they are not, trace the root cause of the events (such as an incorrectly configured bot making a lot of REST calls, or events from a third-party plugin) and fix it.
curl -D- \
  -X POST \
  -H "Authorization: Bearer <token>" \
  -H 'Accept: application/json' \
  "$BASEURL/rest/cb-automation/latest/automation-queue/insight/events/by-type"
Optional query params: limit=20
Run the curl command below to identify the top rules that contribute to queue growth. Consider optimising or disabling them using the articles mentioned in step 2.
curl -D- \
  -X POST \
  -H "Authorization: Bearer <token>" \
  -H 'Accept: application/json' \
  "$BASEURL/rest/cb-automation/latest/automation-queue/insight/events/by-rule"
Optional query params: limit=20
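The two curl commands above differ only in the last path segment, so they can be wrapped in a small helper that also appends the optional `limit` query parameter. This is a sketch under stated assumptions: the function names `insight_url` and `fetch_insight` are hypothetical, `limit` is assumed to be passed as a URL query string, and 20 simply mirrors the example value given above.

```shell
# Hypothetical wrapper around the two insight endpoints shown above.
# Usage: insight_url BASEURL KIND [LIMIT]   (KIND is "by-type" or "by-rule")
insight_url() {
  base="${1%/}"          # strip one trailing slash, if present
  kind="$2"
  limit="${3:-20}"       # assumption: 20 mirrors the example value
  case "$kind" in
    by-type|by-rule) ;;
    *) echo "unknown insight kind: $kind" >&2; return 1 ;;
  esac
  printf '%s/rest/cb-automation/latest/automation-queue/insight/events/%s?limit=%s\n' \
    "$base" "$kind" "$limit"
}

# Usage: fetch_insight BASEURL TOKEN KIND [LIMIT]
# Prints the response body returned by the endpoint.
fetch_insight() {
  url="$(insight_url "$1" "$3" "${4:-20}")" || return 1
  curl -sS -X POST \
    -H "Authorization: Bearer $2" \
    -H 'Accept: application/json' \
    "$url"
}
```

A possible invocation, assuming `BASEURL` and `TOKEN` are set in your shell: `fetch_insight "$BASEURL" "$TOKEN" by-type > events-by-type.json`, then the same with `by-rule`. Saving the output to files also makes it easy to attach to a support ticket.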
Providing data to Atlassian Support
If the above steps do not provide any resolution, create a support ticket at support.atlassian.com and attach the requested information to the ticket by following the steps below:
Go to Administration ⚙ > System > Logging and profiling
Select Configure logging level for another package
Use com.codebarrel.automation as the package name and select "DEBUG" for the Logging level
Use com.codebarrel.jira.plugin.automation as the package name and select "DEBUG" for the Logging level
Go to Administration ⚙ > System > Troubleshooting and support tools > Diagnostic settings
Make sure that the 2 settings below are enabled (this allows thread dumps to be generated automatically):
Thread diagnostics
Runtime diagnostics
Wait for about 30 minutes so that we can collect enough logs and thread dumps
Go to Administration ⚙ > System > Troubleshooting and support tools > Create support zip.
Make sure that the option Runtime diagnostics data is ticked
Select Create zip and download the support zip. If you have a Data Center cluster, repeat this step for each node.
Go to Administration ⚙ > System > Automation rules > ... > View performance insights
Take a screenshot of the Performance Insights page
Run the 2 curl commands mentioned in the Troubleshooting section of this page
Attach the following information to the support ticket:
The support zip(s)
The screenshot of the Automation performance insights page
The output from the 2 curl commands