What is System Health?

System Health is available to customers in a beta program.

System Health is a personalized dashboard for Atlassian organization administrators. It allows you to monitor incidents that impact your Cloud apps and subscribe to incident notifications tailored to your preferences.

System Health versus Atlassian public status page

System Health uses site-level monitoring to show whether your organization’s sites are impacted by ongoing incidents. Organization administrators can check for ongoing incidents in Atlassian Administration and subscribe to email notifications. Read about getting started with System Health.

In the beta version, site-level impact information is available for reliability incidents with experiences covered by service-level agreements (SLAs) for Jira and Confluence.

Here’s a quick comparison between System Health and our public status page.

Feature

System Health

status.atlassian.com

Surface incidents impacting my organization

Subscribe only to incident notifications impacting my organization

Site-level impact detection

  • Jira

  • Confluence

  • Jira Service Management

  • Bitbucket

Available to

Organization administrators only

Everyone

Incidents captured in System Health

System Health communications take two forms:

  1. Confirmed impact incidents: These are incidents that are actively affecting one or more of your sites.

  2. Advisories: These describe ongoing incidents where we haven’t detected an impact on your specific sites. While you are unlikely to be affected, we can’t confirm that all sites are operational during an advisory.
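The difference between the two forms can be illustrated with a minimal sketch. It is not how System Health is implemented: the `Incident` shape, the `classify` function, and the site names are all hypothetical, and the only assumption carried over from the text is that a communication is a confirmed impact incident when at least one of your sites is detected as affected, and an advisory otherwise.

```python
from dataclasses import dataclass, field
from enum import Enum


class Communication(Enum):
    CONFIRMED_IMPACT = "confirmed impact incident"
    ADVISORY = "advisory"


@dataclass(frozen=True)
class Incident:
    # Hypothetical shape for illustration; not a documented System Health structure.
    title: str
    impacted_sites: frozenset = field(default_factory=frozenset)


def classify(incident: Incident, org_sites: set) -> Communication:
    """Return the form of communication an organization would see for an incident."""
    if incident.impacted_sites & org_sites:
        # At least one of the organization's sites has detected impact.
        return Communication.CONFIRMED_IMPACT
    # No impact detected on these sites. This is not proof that they are operational.
    return Communication.ADVISORY


# Hypothetical site names.
incident = Incident("Jira work items failing to load", frozenset({"acme.atlassian.net"}))
print(classify(incident, {"acme.atlassian.net", "acme-dev.atlassian.net"}).value)
# confirmed impact incident
print(classify(incident, {"other.atlassian.net"}).value)
# advisory
```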

Site-level impact detection is available for reliability incidents, which:

  • Impact Jira, Confluence, Jira Service Management, or Bitbucket.

  • Impact core experiences for these apps, including SLA-covered experiences and other commonly used experiences.

“Experiences” are logical pieces of underlying functionality, such as viewing work items in Jira or creating pages in Confluence. Every Atlassian app has a number of underlying experiences that make up its full feature set.

Reliability incidents include anything resulting in abnormally high error rates, such as outages or task-related failures.

Incident statuses

System Health uses the following incident statuses:

  • Investigating: We’ve acknowledged the incident and are investigating the root cause.

  • Identified: We’ve identified the root cause and are working on a solution.

  • Monitoring: We’ve implemented a fix and are monitoring whether it worked.

  • Resolved: Business impact has ended, and the problem is resolved.
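If you route incident notifications into your own tooling, the four statuses could be modeled along these lines. This is a sketch under assumptions: the `IncidentStatus` enum and `is_ongoing` helper are hypothetical names, the numeric values only mirror the order of the list above, and treating every status before Resolved as ongoing is an interpretation of the descriptions rather than documented behavior.

```python
from enum import IntEnum


class IncidentStatus(IntEnum):
    # Values follow the order of the list above; they are an assumption of this sketch.
    INVESTIGATING = 1
    IDENTIFIED = 2
    MONITORING = 3
    RESOLVED = 4


def is_ongoing(status: IncidentStatus) -> bool:
    """Treat an incident as ongoing until it reaches Resolved (assumed interpretation)."""
    return status is not IncidentStatus.RESOLVED


print(is_ongoing(IncidentStatus.MONITORING))  # True
print(is_ongoing(IncidentStatus.RESOLVED))    # False
```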

How we detect site-level impact

We detect site-level impact primarily by monitoring error rates. An example of such an error is a Jira work item failing to load for a user. Once the number of these errors reaches a certain threshold over time, we can confirm the impact. Every user is linked to a particular site, and each error corresponds to a specific app and experience. Using this data, we can identify which sites and experiences are impacted.

We also monitor our databases for failures. Because these databases are associated with specific apps and sites, these signals also contribute to our impact detection.
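The error-rate part of this approach can be sketched as a sliding-window counter keyed by site, app, and experience. This is only an illustration of the general idea: the `ImpactDetector` class, the window length, and the threshold are assumptions, since the actual thresholds and detection logic are not published. The sketch also assumes errors arrive in timestamp order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300   # assumed window length; the real value is not published
ERROR_THRESHOLD = 50   # assumed error count that confirms impact


class ImpactDetector:
    """Counts recent errors per (site, app, experience) and flags keys over a threshold."""

    def __init__(self, window=WINDOW_SECONDS, threshold=ERROR_THRESHOLD):
        self.window = window
        self.threshold = threshold
        self._errors = defaultdict(deque)  # (site, app, experience) -> error timestamps

    def record_error(self, site, app, experience, timestamp):
        """Record one error; return True if this key has reached the threshold."""
        events = self._errors[(site, app, experience)]
        events.append(timestamp)
        # Drop errors that have fallen out of the window.
        while events and events[0] <= timestamp - self.window:
            events.popleft()
        return len(events) >= self.threshold


# Hypothetical site and experience names, with a small threshold for readability.
detector = ImpactDetector(window=60, threshold=3)
for t in (0, 10, 20):
    impacted = detector.record_error("acme.atlassian.net", "Jira", "view work item", t)
print(impacted)  # True: three errors within 60 seconds
```

Keying the counter by site, app, and experience mirrors the point in the text that each error can be attributed to a specific site and experience, so impact is confirmed for one site without implying anything about others.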

More information

Get started with System Health

Set up incident notifications
