Starting Bitbucket Server takes a long time after upgrading to version 4.12 or newer
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Problem
After upgrading Bitbucket Server or Data Center to version 4.12 or newer, startup takes significantly longer. In a Data Center installation, the issue affects every node while it is attached to the cluster.
During startup, you can see the following errors in the logs:
2019-06-11 12:00:08,114 ERROR [spring-startup] c.a.s.i.s.g.u.s.SalGitUpgradeManager IncludeSystemConfigTask failed for repository TEST/document[79998]
java.nio.file.NoSuchFileException: /var/atlassian/application-data/bitbucket/shared/data/repositories/79998/tmp-61d593556c512d39-config.lock
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
at java.nio.channels.FileChannel.open(FileChannel.java:287)
at java.nio.channels.FileChannel.open(FileChannel.java:335)
at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.openLockWithHardLink(DefaultGitRepositoryLayout.java:293)
at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.withLock(DefaultGitRepositoryLayout.java:160)
at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.editConfig(DefaultGitRepositoryLayout.java:83)
at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.upgrade(IncludeSystemConfigTask.java:96)
at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.lambda$parallelUpgrade$1(IncludeSystemConfigTask.java:143)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.lang.Thread.run(Thread.java:748)
... 1 frame trimmed
Diagnosis
Environment
The instance has been upgraded to version 4.12 or newer
The issue started happening only after the upgrade
No significant load on the database or filesystem is observable during the long startup
The issue is still reproducible with UPM Safe Mode enabled (see How to Enable UPM Safe Mode using UPM REST API)
The startup is stuck on Preparing plugin framework
Diagnostic Steps
Enable Debug logging and profiling
Restart the instance
Search the logs for occurrences of:
SalGitUpgradeManager IncludeSystemConfigTask failed for repository
Check atlassian-bitbucket-profiler.log for the time consumed by the task:
git: apply IncludeSystemConfigTask
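The log search in the steps above can be automated. The following is a minimal sketch, not an Atlassian tool: it assumes the failure lines look exactly like the ERROR line quoted in this article (`... failed for repository PROJECT/slug[id]`), and the function name `failed_repositories` is hypothetical.

```python
import re
from typing import Dict, Iterable

# Matches the failure line format quoted in this article, for example:
# "... IncludeSystemConfigTask failed for repository TEST/document[79998]"
FAILED_REPO = re.compile(
    r"IncludeSystemConfigTask failed for repository ([^/\s]+)/([^\[\s]+)\[(\d+)\]"
)


def failed_repositories(lines: Iterable[str]) -> Dict[int, str]:
    """Map repository ID -> "PROJECT/slug" for every failure line found."""
    failed: Dict[int, str] = {}
    for line in lines:
        match = FAILED_REPO.search(line)
        if match:
            project, slug, repo_id = match.groups()
            failed[int(repo_id)] = f"{project}/{slug}"
    return failed
```

To use it, open atlassian-bitbucket.log and pass the file object to `failed_repositories`. The number of entries in the returned dictionary is the number of distinct affected repositories.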
Cause
In version 4.12 we introduced IncludeSystemConfigTask, which rewrites the config file of every repository to add its own settings for a shared config file and to add a repository config file for each repository. We also introduced additional filesystem locks to provide the isolation required to prevent concurrent changes to an individual repository's settings. In other words, while Bitbucket Server has a config.lock file in place, Git rejects any attempt by someone else to edit the same configuration with git config.
In order to implement this locking mechanism in version 4.12, the new upgrade task has been added to perform the following actions:
Query all the repositories from the database (git: apply IncludeSystemConfigTask)
Create the tmp-<some_hash>-config.lock file in each repository as a hard link
If the creation fails, throw an exception logged at ERROR level and schedule a retry for all repositories at the next restart.
Retry with each restart until the task finishes successfully for all repositories.
This logic causes a problem when the list of repositories stored in the database differs from the repositories actually present on the filesystem. In that case, Bitbucket fails to create the lock file because the path does not exist on the filesystem, and the task is marked as failed:
c.a.sal.core.upgrade.PluginUpgrader Upgrade failed: IncludeSystemConfigTask failed for one or more repositories
java.lang.RuntimeException: IncludeSystemConfigTask failed for one or more repositories
The task is then rescheduled for the next restart. Every node restart therefore runs the task again, and whenever it cannot create the lock it is scheduled to run at the following restart. As a result, each node experiences increased startup times, and the delay becomes more pronounced as more repositories are added to the system (more repositories for IncludeSystemConfigTask to check).
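The mismatch described above can be checked directly. The sketch below is illustrative only: it assumes each repository lives in a directory named after its numeric ID under `<shared home>/data/repositories`, which is inferred from the path in the stack trace, and it assumes you have already exported the repository IDs from the database by your own means. The function name `missing_repository_dirs` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def missing_repository_dirs(repo_ids: Iterable[int], repositories_root: Path) -> List[int]:
    """Return the IDs that have no directory under repositories_root.

    Assumption: one directory per repository, named after its numeric ID,
    as suggested by the path in the NoSuchFileException above.
    """
    return sorted(
        repo_id
        for repo_id in repo_ids
        if not (repositories_root / str(repo_id)).is_dir()
    )
```

Every ID returned is a repository that the database knows about but the filesystem does not, which is the condition that makes the lock file creation fail.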
Solution
Resolution
There are two resolutions available:
Because the root cause is the inconsistency between the database and the filesystem, the issue can be resolved with the Bitbucket Integrity checker in Data Center installations.
⚠️ Please note that the integrity check can take a very long time to run on an instance with a significant number of repositories. You should only use the Integrity checker as a resolution if the errors reported in the logs affect more than 50 repositories.
For Bitbucket Server installations, or Bitbucket Data Center installations with fewer than 50 affected repositories, follow these steps:
Recreate (delete and create again) all the impacted repositories via UI
Restart the instance
Verify that no errors from IncludeSystemConfigTask are reported in atlassian-bitbucket.log.
Errors showing that the task failed:
2019-01-11 12:53:08,114 ERROR [spring-startup] c.a.s.i.s.g.u.s.SalGitUpgradeManager IncludeSystemConfigTask failed for repository TEST/document[79998]
java.nio.file.NoSuchFileException: /var/atlassian/application-data/bitbucket/shared/data/repositories/79904/tmp-61d593556c518d39-config.lock
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
at java.nio.channels.FileChannel.open(FileChannel.java:287)
at java.nio.channels.FileChannel.open(FileChannel.java:335)
at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.openLockWithHardLink(DefaultGitRepositoryLayout.java:293)
at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.withLock(DefaultGitRepositoryLayout.java:160)
at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.editConfig(DefaultGitRepositoryLayout.java:83)
at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.upgrade(IncludeSystemConfigTask.java:96)
at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.lambda$parallelUpgrade$1(IncludeSystemConfigTask.java:143)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.lang.Thread.run(Thread.java:748)
... 1 frame trimmed
2019-01-11 12:53:08,126 ERROR [spring-startup] c.a.sal.core.upgrade.PluginUpgrader Upgrade failed: IncludeSystemConfigTask failed for one or more repositories
java.lang.RuntimeException: IncludeSystemConfigTask failed for one or more repositories
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.perform(SalGitUpgradeManager.java:366)
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.perform(SalGitUpgradeManager.java:317)
at com.atlassian.stash.internal.user.DefaultEscalatedSecurityContext.call(DefaultEscalatedSecurityContext.java:58)
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$DelegatingUpgradeTask.apply(SalGitUpgradeManager.java:264)
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.doUpgrade(SalGitUpgradeManager.java:325)
at com.atlassian.sal.core.upgrade.PluginUpgrader.doUpgrade(PluginUpgrader.java:72)
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalPluginUpgrader.apply(SalPluginUpgrader.java:27)
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SynchronousUpgrader.doInTransaction(SalGitUpgradeManager.java:382)
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SynchronousUpgrader.doInTransaction(SalGitUpgradeManager.java:373)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager.start(SalGitUpgradeManager.java:133)
at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:173)
at org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:50)
at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:350)
at org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:149)
at org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:112)
at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:880)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:546)
at javax.servlet.GenericServlet.init(GenericServlet.java:158)
at java.lang.Thread.run(Thread.java:748)
... 8 frames trimmed
Messages for successful task completion:
c.a.s.i.s.g.u.IncludeSystemConfigTask Executor service has shutdown gracefully
c.a.sal.core.upgrade.PluginUpgrader Upgraded plugin com.atlassian.bitbucket.server.bitbucket-git to version 8 - Updates all repositories to include system-config for common configuration
Perform another restart to confirm that the issue is resolved.
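The verification in the steps above can be scripted. This is a rough sketch under stated assumptions: the two marker strings are taken from the log excerpts quoted in this article, the input is expected to contain only the lines written during a single startup, and the function name `upgrade_task_outcome` is hypothetical.

```python
from typing import Iterable

# Substrings taken from the log excerpts quoted in this article.
FAILURE_MARKER = "IncludeSystemConfigTask failed for"
SUCCESS_MARKER = "Upgraded plugin com.atlassian.bitbucket.server.bitbucket-git"


def upgrade_task_outcome(lines: Iterable[str]) -> str:
    """Classify one startup's log lines as "failed", "succeeded", or "not found"."""
    outcome = "not found"
    for line in lines:
        if FAILURE_MARKER in line:
            # Any failure line means the task will be retried at the next restart.
            return "failed"
        if SUCCESS_MARKER in line:
            outcome = "succeeded"
    return outcome
```

A result of "not found" on the confirming restart is consistent with the task having already completed, since the task is only rescheduled after a failure.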
If you do not see any errors but the instance still takes a long time to start up, please contact Atlassian Support and attach the log files.