Synchronization with Crowd directory fails intermittently - SAXParseException
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
The full synchronization between a Crowd directory and another Atlassian application consistently fails after roughly the same amount of time, e.g. Jira synchronization always fails after 30 seconds.
Incremental synchronization works most of the time, but depending on the number of records being synced, it can fail intermittently.
Environment
Atlassian Crowd (Server or Data Center)
Any Atlassian application syncing to Crowd behind a Load Balancer
Diagnosis
The error message varies depending on the application that is trying to sync with Crowd.
For example, when a Jira sync with Crowd fails, the application log shows a parsing exception caused by a premature end of the XML document:
2020-06-12 00:56:01,389 Caesium-1-3 INFO ServiceRunner [c.a.crowd.directory.DbCachingRemoteDirectory] failed synchronisation complete for directory [ 10000 ] in [ 48878ms ]
2020-06-12 00:56:01,461 Caesium-1-3 ERROR ServiceRunner [c.atlassian.scheduler.JobRunnerResponse] Unable to synchronise directory
com.atlassian.crowd.exception.OperationFailedException: javax.xml.bind.UnmarshalException
- with linked exception:
[org.xml.sax.SAXParseException; lineNumber: 3443; columnNumber: 16; Premature end of file.]
Cause
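The load balancer in front of the application that is trying to sync with Crowd has a gateway timeout that is shorter than the time the full sync needs to complete. When the load balancer closes the connection, the XML response is cut off partway through, which the parser reports as a premature end of file.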
The issue is usually triggered by a full synchronization because it takes more time to sync all the records. The failure always happens after the same amount of time - a few seconds after the gateway timeout set at the load balancer.
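To check whether failures really do cluster around one duration, the failed sync times can be pulled out of the application log. The following is a minimal sketch, assuming the log uses the same "failed synchronisation complete" line format shown in the Diagnosis section. The function names and the 5-second tolerance are hypothetical choices made for illustration, not part of any Atlassian tooling.

```python
import re

# Matches the "failed synchronisation complete" line shown in the Diagnosis section.
FAILED_SYNC = re.compile(
    r"failed synchronisation complete for directory "
    r"\[\s*(\d+)\s*\] in \[\s*(\d+)ms\s*\]"
)


def failed_sync_durations(log_lines):
    """Return {directory_id: [duration_in_seconds, ...]} for failed syncs."""
    durations = {}
    for line in log_lines:
        match = FAILED_SYNC.search(line)
        if match:
            directory_id = match.group(1)
            millis = int(match.group(2))
            durations.setdefault(directory_id, []).append(millis / 1000)
    return durations


def looks_like_fixed_timeout(seconds, tolerance=5.0):
    """Hypothetical heuristic: at least two failures whose durations all
    fall within `tolerance` seconds of each other."""
    return len(seconds) >= 2 and max(seconds) - min(seconds) <= tolerance
```

Passing the lines of the application log to failed_sync_durations and then each directory's list to looks_like_fixed_timeout gives a quick indication. If the result is True and the durations sit just above the gateway timeout configured on the load balancer, that is consistent with the cause described above.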
Solution
To overcome this issue you can either:
Increase the Gateway Timeout in the Load Balancer settings to a value longer than the full sync takes; or
Bypass the load balancer when connecting the application to Crowd.