Bitbucket Server DIY Backup fails - Operations from one or more SCMs did not finish within the allotted timeout
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Symptoms
The Bitbucket Server DIY Backup fails gracefully because the bitbucket_backup_wait function does not finish in time.
The following appears in atlassian-bitbucket-YYYY-MM-DD.log:
2015-10-09 02:50:42,925 WARN [threadpool:thread-6] backup *AA1BB1x165x3972463x2 10.10.10.10 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.LatchAndDrainScmStep The SCMs could not be drained. Aborting...
2015-10-09 02:50:42,962 WARN [threadpool:thread-6] backup *AA1BB1x165x3972463x2 10.10.10.10 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.DefaultMaintenanceTaskMonitor BACKUP maintenance has failed (Cause: BackupException: A backup file could not be created.)
com.atlassian.stash.internal.backup.BackupException: A backup file could not be created.
at com.atlassian.stash.internal.maintenance.backup.BackupPhase.run(BackupPhase.java:78) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask$Step.run(CompositeMaintenanceTask.java:130) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask.run(CompositeMaintenanceTask.java:69) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.MaintenanceModePhase.run(MaintenanceModePhase.java:27) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.backup.AbstractBackupTask.run(AbstractBackupTask.java:85) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.DefaultMaintenanceTaskMonitor.run(DefaultMaintenanceTaskMonitor.java:212) ~[stash-service-impl-3.11.1.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_45]
at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.8.0_45]
at com.atlassian.stash.internal.concurrent.StateTransferringExecutor$StateTransferringRunnable.run(StateTransferringExecutor.java:73) [stash-platform-3.11.1.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_45]
at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.8.0_45]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) [na:1.8.0_45]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [na:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_45]
at java.lang.Thread.run(Unknown Source) [na:1.8.0_45]
... 1 frame trimmed
Caused by: com.atlassian.stash.internal.backup.BackupException: Operations from one or more SCMs did not finish within the allotted timeout. To prevent corruption due to inconsistent state, the backup has been aborted. Please try backup up again when the system is under less load.
at com.atlassian.stash.internal.maintenance.LatchAndDrainScmStep.newDrainFailedException(LatchAndDrainScmStep.java:36) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.AbstractLatchAndDrainTask.run(AbstractLatchAndDrainTask.java:86) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask$Step.run(CompositeMaintenanceTask.java:130) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask.run(CompositeMaintenanceTask.java:69) ~[stash-service-impl-3.11.1.jar:na]
at com.atlassian.stash.internal.maintenance.backup.BackupPhase.run(BackupPhase.java:74) ~[stash-service-impl-3.11.1.jar:na]
... 15 common frames omitted
Diagnosis
Run ps -ef to find the running back-end Git processes, e.g.:
atlstash 2964 6 0.0 00:00:00 0.0 1184 113132 ? S 16:38:31 git http-backend
atlstash 2965 0 0.0 00:00:00 0.0 1232 112188 ? S 16:38:31 git-http-backend <noArgs>
atlstash 2966 7 0.0 00:00:00 0.0 1316 123376 ? S 16:38:31 git receive-pack_--stateless-rpc_.
atlstash 2971 0 0.1 00:01:16 0.0 13024 137468 ? S 16:38:31 git index-pack_-
Run lsof to determine the affected repositories, e.g.:
git 2971 atlstash cwd DIR 0,19 4096 67504227 /apps/stash-data/shared/data/repositories/1
git 2971 atlstash rtd DIR 253,0 4096 2 /
git 2971 atlstash txt REG 253,0 7303493 286128 /usr/local/libexec/git-core/git
git 2971 atlstash mem REG 253,0 91096 139390 /lib64/libz.so.1.2.3
git 2971 atlstash mem REG 253,0 99158576 263410 /usr/lib/locale/locale-archive
git 2971 atlstash mem REG 253,0 19536 135157 /lib64/libdl-2.12.so
git 2971 atlstash mem REG 253,0 1921216 131097 /lib64/libc-2.12.so
git 2971 atlstash mem REG 253,0 142640 131121 /lib64/libpthread-2.12.so
git 2971 atlstash mem REG 253,0 1963296 266130 /usr/lib64/libcrypto.so.1.0.1e
git 2971 atlstash mem REG 253,0 154664 132201 /lib64/ld-2.12.so
git 2971 atlstash mem REG 253,0 26060 527143 /usr/lib64/gconv/gconv-modules.cache
git 2971 atlstash 0r FIFO 0,8 ? 591801098 pipe
git 2971 atlstash 1w FIFO 0,8 ? 591801518 pipe
git 2971 atlstash 2w FIFO 0,8 ? 591801517 pipe
git 2971 atlstash 3u REG 0,19 4186750018 67295571 /apps/stash-data/shared/data/repositories/1/objects/pack/tmp_pack_lCDLna
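The two diagnosis steps above can be combined into one pass. The following is a minimal sketch, not part of the DIY Backup scripts: the function names git_pids, process_cwd and list_git_repos are hypothetical, and it assumes a Linux host where ps accepts -eo pid=,args= and lsof is installed. The working directory (cwd) reported for each Git process normally points at the repository it is operating on.

```shell
#!/bin/sh
# Hypothetical helpers (not from the DIY Backup scripts).

# Read "PID COMMAND ARGS..." lines on stdin and print the PID of every
# line whose command is git or git-<subcommand>, with or without a path.
git_pids() {
  awk '$2 ~ /(^|\/)git(-[a-z-]+)?$/ { print $1 }'
}

# Print the current working directory of the process with the given PID.
# -a ANDs the filters, -p selects the PID, -d cwd keeps only the cwd
# entry, and -Fn emits the name field prefixed with "n".
process_cwd() {
  lsof -a -p "$1" -d cwd -Fn 2>/dev/null | sed -n 's/^n//p'
}

# Print "PID DIRECTORY" for each Git process currently running.
list_git_repos() {
  ps -eo pid=,args= | git_pids | while read -r pid; do
    printf '%s %s\n' "$pid" "$(process_cwd "$pid")"
  done
}
```

Calling list_git_repos as the user that runs Bitbucket (or as root) prints one line per Git process. A process that exits between the ps and lsof calls is printed with an empty directory.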
Cause
Active Git processes did not finish within the default timeout (60 seconds) that the backup allows for SCM operations to drain.
Solution
Resolution
Increase the timeout from the 60-second default (e.g. to 2 minutes) to give the Git operations more time to complete. Update bitbucket.properties with the following parameter and restart the application:
backup.drain.scm.timeout=120
There is no single correct timeout value. Increase it iteratively until the in-flight SCM requests have enough time to complete.
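Because the value may need to be changed several times, it can help to script the edit. The following is a minimal sketch under stated assumptions: set_property is a hypothetical helper, not an Atlassian tool, and the file location shown in the usage comment ($BITBUCKET_HOME/shared/bitbucket.properties) is an assumption that should be checked against your installation. The application still needs to be restarted after each change.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a properties file.
# Replaces every existing "KEY=" line, or appends one if none exists.
set_property() {
  file=$1; key=$2; value=$3
  touch "$file" || return 1          # create the file if it is missing
  tmp=$(mktemp) || return 1
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }  # exact key match at line start
    { print }                                                  # keep all other lines unchanged
    END { if (!found) print k "=" v }                          # append when the key was absent
  ' "$file" > "$tmp" && cat "$tmp" > "$file"  # cat keeps the file owner and mode
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example usage (the path is an assumption, adjust to your installation):
#   set_property "$BITBUCKET_HOME/shared/bitbucket.properties" backup.drain.scm.timeout 120
```

The key is matched literally with index() rather than as a regular expression, so the dots in backup.drain.scm.timeout cannot match unrelated keys.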