Crowd Data Center nodes being simultaneously started may cause: ERROR: duplicate key value violates unique constraint "cwd_cluster_lock_pkey"

Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.

Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

This article explains the node registration process, the exception that can occur when nodes of a Crowd Data Center instance are started simultaneously, and possible solutions.

Diagnosis

After starting nodes with a small time gap between them, the following error may appear in the atlassian-crowd.log file, and at least one of the nodes will fail to start.

2024-07-24 11:11:58,326 main ERROR [jdbc.batch.internal.BatchingBatch] HHH000315: Exception executing batch [java.sql.BatchUpdateException: Batch entry 0 insert into cwd_cluster_lock (lock_timestamp, node_id, lock_name) values (('1721812318298'::int8), (NULL), ('com.atlassian.crowd.manager.upgrade.UpgradeManagerImpl')) was aborted: ERROR: duplicate key value violates unique constraint "cwd_cluster_lock_pkey"
  Detail: Key (lock_name)=(com.atlassian.crowd.manager.upgrade.UpgradeManagerImpl) already exists.
  Call getNextException to see other errors in the batch.], SQL: insert into cwd_cluster_lock (lock_timestamp, node_id, lock_name) values (?, ?, ?)
2024-07-24 11:11:58,328 main WARN [engine.jdbc.spi.SqlExceptionHelper] SQL Error: 0, SQLState: 23505
2024-07-24 11:11:58,328 main ERROR [engine.jdbc.spi.SqlExceptionHelper] Batch entry 0 insert into cwd_cluster_lock (lock_timestamp, node_id, lock_name) values (('1721812318298'::int8), (NULL), ('com.atlassian.crowd.manager.upgrade.UpgradeManagerImpl')) was aborted: ERROR: duplicate key value violates unique constraint "cwd_cluster_lock_pkey"
  Detail: Key (lock_name)=(com.atlassian.crowd.manager.upgrade.UpgradeManagerImpl) already exists.
2024-07-24 11:11:58,328 main ERROR [engine.jdbc.spi.SqlExceptionHelper] ERROR: duplicate key value violates unique constraint "cwd_cluster_lock_pkey"
  Detail: Key (lock_name)=(com.atlassian.crowd.manager.upgrade.UpgradeManagerImpl) already exists.

Cause

Starting Crowd Data Center nodes with a small time gap between them may cause two nodes to attempt the same batch insert into the cwd_cluster_lock table at the same time. Because lock_name is the table's primary key, the second node's insert violates the cwd_cluster_lock_pkey constraint.
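A minimal sketch of the race, using the statement and values taken verbatim from the error log:

```sql
-- Each starting node attempts the same insert (values from the log):
INSERT INTO cwd_cluster_lock (lock_timestamp, node_id, lock_name)
VALUES (1721812318298, NULL, 'com.atlassian.crowd.manager.upgrade.UpgradeManagerImpl');

-- lock_name is the table's primary key, so whichever node runs this
-- second fails with:
-- ERROR: duplicate key value violates unique constraint "cwd_cluster_lock_pkey"
```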

To understand the cause of this issue, we first need to understand the sequence of events when a Crowd Data Center instance starts up.

  1. During startup, Crowd registers itself by sending a heartbeat request in the form of a database entry in cwd_cluster_heartbeat. The entry mainly contains the generated node ID and a heartbeat timestamp.

  2. The node also collects some information about itself and writes it to the cwd_cluster_info table. After this, a heartbeat is sent every 1 minute.

  3. A node is considered live if it has written a heartbeat within a reasonable tolerance level (e.g., 5 minutes). If a node hasn't written a heartbeat in 5 minutes, its entries are deleted from the above tables and the node is considered evicted from the cluster.
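The registration state described in the steps above can be inspected directly in the database with read-only queries (the exact columns may vary by Crowd version):

```sql
-- Registered nodes and their latest heartbeat entries
SELECT * FROM cwd_cluster_heartbeat;

-- Node details collected at startup
SELECT * FROM cwd_cluster_info;
```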

Cluster-wide jobs such as synchronization and backups are run by a node after it acquires a lock in the cwd_cluster_lock table. We can see the list of cluster-wide jobs in the cwd_cluster_jobs table with:

SELECT * FROM cwd_cluster_jobs;
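Similarly, if a node fails to start with the error above, the locks currently held can be checked (column names taken from the insert statement in the log):

```sql
-- Cluster-wide locks currently held; lock_name is the primary key
SELECT lock_name, node_id, lock_timestamp FROM cwd_cluster_lock;
```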

Solution

If the nodes eventually need to be restarted, it is recommended to start them one by one, with a time interval of at least 3-5 minutes between nodes.

Updated on March 13, 2025
