Bitbucket Data Center Repository Sizing and Health Check Guide
Platform Notice: Data Center Only - This article only applies to Atlassian apps on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
This article provides step-by-step instructions for assessing the size and health of Bitbucket Data Center repositories (version 7.17 and above) before migration or optimization. It outlines key checks for repository eligibility, offers commands and an automation script to evaluate disk usage and garbage collection conditions, and links to further resources for managing large repositories.
Solution
Follow the steps below to assess whether a repository on the server is eligible for garbage collection.
Bitbucket Thresholds for a Repack
Starting with Bitbucket v7.13, a repository must meet at least one of the three conditions below for a git repack operation to be triggered on the server:
More than 27 files in objects/17
objects/17 larger than 1MB
More than 50 pack files in objects/pack
The steps in this article help determine whether these conditions are already met, or close to being met, in which case a GC operation will be triggered automatically on the server.
We recommend evaluating the options in the knowledge base article “Migrating Large Repositories to Bitbucket Cloud: Reducing Repository Size and Best Practices” to reduce the repository size only if none of the conditions is met.
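As background, the 27-file and 50-pack thresholds appear to match the defaults of Git's own `git gc --auto` heuristic; this is an observation about Git, not something this article states about Bitbucket internals. Git estimates the total number of loose objects by counting the files in objects/17 and multiplying by 256 (it is one of 256 fan-out directories), then compares that estimate with gc.auto (default 6700), and compares the pack count with gc.autoPackLimit (default 50). The following is a minimal sketch of that extrapolation; the function name estimate_loose_objects is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extrapolate the total loose-object count of a
# repository from the objects/17 sample directory.
# Loose objects are spread over 256 fan-out directories (00..ff), so a
# single directory holds roughly 1/256 of them.
estimate_loose_objects() {
  local repo_dir="$1"
  local sample_dir="${repo_dir}/objects/17"
  local sample_count=0
  if [ -d "${sample_dir}" ]; then
    # Count only regular files directly inside objects/17.
    sample_count=$(find "${sample_dir}" -maxdepth 1 -type f | wc -l | tr -d '[:space:]')
  fi
  echo $(( sample_count * 256 ))
}

# Example (the path is a placeholder, as elsewhere in this article):
#   estimate_loose_objects "<bitbucket-home>/shared/data/repositories/<repo-id>"
```

For example, 28 files in objects/17 give an estimate of 7168 loose objects, which is above the 6700 default.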
Navigate to the repository directory on the Bitbucket home:
cd <bitbucket-home>/shared/data/repositories/<repo-id>
Ensure that the repository is not corrupted by checking the output of the command below:
git fsck --full
View the repository-specific Git configuration. From the config file, we can confirm whether the repository has forks by checking whether prune is disabled.
git config -l
View the global Git configuration, which may affect repository behavior:
git config --global --list
Check the number and size of loose objects:
git count-objects -v --human-readable
This provides a summary of the loose objects and their disk usage, helping identify uncompressed or unpacked data.
Check total and per-directory disk usage. This allows us to inspect the objects/17 directory for its size and the number of objects it holds, along with the number of pack files in the objects/pack folder.
du -sh ./ && du -sh ./*
du -sh objects/* | sort -rh | head
du -sh objects/17 && ls objects/17 | wc -l
du -sh objects/pack/* | sort -rh | head
Check the timestamps of the last repack and pack-refs operations, along with the GC logs. The GC log shows when the last repack operation was run and whether it included a prune.
ls -ltr <bitbucket-home>/shared/data/repositories/<repo-id>/app-info
cat <bitbucket-home>/shared/data/repositories/<repo-id>/app-info/gc.log
You can contact our Support team to evaluate these details and discuss the next steps.
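When comparing several repositories, it can help to condense the `git count-objects -v` output from the steps above into a single line. The sketch below is one way to do that; the function name summarize_count_objects is hypothetical, and it assumes the standard `count`, `packs` and `garbage` keys printed by `git count-objects -v`. Size fields are skipped because their format changes with --human-readable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `git count-objects -v` output on stdin and
# print the loose-object count, pack count and garbage count on one line.
summarize_count_objects() {
  awk -F': ' '
    $1 == "count"   { loose = $2 }    # number of loose objects
    $1 == "packs"   { packs = $2 }    # number of pack files
    $1 == "garbage" { garbage = $2 }  # files that are neither objects nor packs
    END { printf "loose=%d packs=%d garbage=%d\n", loose, packs, garbage }
  '
}

# Example, run from inside the repository directory:
#   git count-objects -v | summarize_count_objects
```

The `packs` value can then be read against the 50 pack file threshold described earlier.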
Alternatively, you can run the script below to perform the above checks in an automated way as a single health check of the Bitbucket repository. It:
Displays repository and global Git configuration.
Summarises the number and size of loose objects.
Identifies the largest object directories and pack files.
Lists the app-info directory and recent garbage collection logs.
Evaluates the three key conditions, described above, that must be met for a repack operation to be triggered.
The below script content on this page is out of the scope of our Atlassian Support Offerings and is only shared here as a reference to automate these checks for multiple repositories. Atlassian cannot guarantee support for it. Please be aware that this material is provided for your information only, and you may use it at your own risk.
Copy the Script:
Copy the script given below and save it as check_repo.sh on your Bitbucket Data Center node.
Usage: bash check_repo.sh -H <BITBUCKET_HOME> -r <REPO_ID>
check_repo.sh
#!/usr/bin/env bash
# Check Bitbucket DC repo health and GC-related conditions
# Usage: bash check_repo.sh -H <BITBUCKET_HOME> -r <REPO_ID>
set -euo pipefail
BITBUCKET_HOME=""
REPO_ID=""
while getopts ":H:r:" opt; do
  case "$opt" in
    H) BITBUCKET_HOME="$OPTARG" ;;
    r) REPO_ID="$OPTARG" ;;
    *) echo "Usage: $0 -H <bitbucket-home> -r <repo-id>" >&2; exit 1 ;;
  esac
done
if [[ -z "${BITBUCKET_HOME}" || -z "${REPO_ID}" ]]; then
  echo "Usage: $0 -H <bitbucket-home> -r <repo-id>" >&2
  exit 1
fi
REPO_DIR="${BITBUCKET_HOME}/shared/data/repositories/${REPO_ID}"
OBJ_DIR="${REPO_DIR}/objects"
OBJ_17_DIR="${OBJ_DIR}/17"
PACK_DIR="${OBJ_DIR}/pack"
APP_INFO_DIR="${REPO_DIR}/app-info"
if [[ ! -d "${REPO_DIR}" ]]; then
  echo "ERROR: Repository directory not found: ${REPO_DIR}" >&2
  exit 2
fi
echo "== Repo directory =="
echo "${REPO_DIR}"
echo
# Move into the repo
cd "${REPO_DIR}"
echo "== git config (repo-local) =="
git --no-pager config -l || true
echo
echo "== git config (global) =="
git --no-pager config --global --list || true
echo
echo "== git count-objects =="
git count-objects -v --human-readable || true
echo
echo "== Disk usage (repo root + immediate children) =="
du -sh ./ 2>/dev/null || true
du -sh ./* 2>/dev/null || true
echo
echo "== Largest objects subdirs (top 10) =="
if [[ -d "${OBJ_DIR}" ]]; then
  du -sh "${OBJ_DIR}"/* 2>/dev/null | sort -rh | head || true
else
  echo "Objects dir not found: ${OBJ_DIR}"
fi
echo
echo "== objects/17 size and count =="
if [[ -d "${OBJ_17_DIR}" ]]; then
  du -sh "${OBJ_17_DIR}" 2>/dev/null || true
  # number of regular files under objects/17 (recursive)
  find "${OBJ_17_DIR}" -type f 2>/dev/null | wc -l | awk '{print "files in objects/17 (recursive): " $1}'
else
  echo "objects/17 not found"
fi
echo
echo "== Largest packs (top 10) =="
if [[ -d "${PACK_DIR}" ]]; then
  du -sh "${PACK_DIR}"/* 2>/dev/null | sort -rh | head || true
else
  echo "objects/pack not found"
fi
echo
echo "== app-info listing and gc.log preview =="
if [[ -d "${APP_INFO_DIR}" ]]; then
  ls -ltr "${APP_INFO_DIR}" 2>/dev/null || true
  echo
  if [[ -f "${APP_INFO_DIR}/gc.log" ]]; then
    echo "-- app-info/gc.log (last 50 lines) --"
    tail -n 50 "${APP_INFO_DIR}/gc.log" || true
  else
    echo "gc.log not found in app-info"
  fi
else
  echo "app-info dir not found: ${APP_INFO_DIR}"
fi
echo
echo "== Evaluating GC-related conditions =="
cond_files_over_27="NO"
cond_17_over_1mb="NO"
cond_packs_over_50="NO"
# Condition 1: objects/17 has more than 27 files
if [[ -d "${OBJ_17_DIR}" ]]; then
  count_17_files=$(find "${OBJ_17_DIR}" -type f 2>/dev/null | wc -l | awk '{print $1}')
  if [[ "${count_17_files}" -gt 27 ]]; then
    cond_files_over_27="YES"
  fi
else
  count_17_files=0
fi
# Condition 2: objects/17 is more than 1MB in size
# Use du -sk to get size in KB and compare to 1024 (strictly greater, to match "more than 1MB")
if [[ -d "${OBJ_17_DIR}" ]]; then
  size_17_kb=$(du -sk "${OBJ_17_DIR}" 2>/dev/null | awk '{print $1}')
  if [[ "${size_17_kb:-0}" -gt 1024 ]]; then
    cond_17_over_1mb="YES"
  fi
else
  size_17_kb=0
fi
# Condition 3: objects/pack has more than 50 pack files (*.pack)
if [[ -d "${PACK_DIR}" ]]; then
  count_pack_files=$(find "${PACK_DIR}" -maxdepth 1 -type f -name '*.pack' 2>/dev/null | wc -l | awk '{print $1}')
  if [[ "${count_pack_files}" -gt 50 ]]; then
    cond_packs_over_50="YES"
  fi
else
  count_pack_files=0
fi
echo "objects/17 file count : ${count_17_files} (want > 27) => ${cond_files_over_27}"
echo "objects/17 size (KB) : ${size_17_kb} (want > 1024 KB) => ${cond_17_over_1mb}"
echo "objects/pack *.pack files : ${count_pack_files} (want > 50) => ${cond_packs_over_50}"
echo
if [[ "${cond_files_over_27}" == "YES" || "${cond_17_over_1mb}" == "YES" || "${cond_packs_over_50}" == "YES" ]]; then
  echo "RESULT: At least one condition is satisfied (GC attention likely needed)."
else
  echo "RESULT: None of the conditions are satisfied."
fi
Make the Script Executable:
chmod +x check_repo.sh
Run the Script:
Execute the script with the required parameters for your Bitbucket instance:
bash check_repo.sh -H <BITBUCKET_HOME> -r <REPO_ID>
Replace <BITBUCKET_HOME> with the absolute path to your Bitbucket home directory.
Replace <REPO_ID> with the numeric ID of the repository you want to check.
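Because the script is shared as a way to automate these checks for multiple repositories, a small wrapper can run it once per repository ID and keep one report file per repository. The following is a minimal sketch under stated assumptions: the function name check_repos and the variables CHECK_SCRIPT and REPORT_DIR are hypothetical and not part of check_repo.sh, and the repository IDs in the example are made up.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around check_repo.sh for several repositories.
# CHECK_SCRIPT and REPORT_DIR are assumed names chosen for this sketch.
CHECK_SCRIPT="${CHECK_SCRIPT:-./check_repo.sh}"
REPORT_DIR="${REPORT_DIR:-./repo-reports}"

check_repos() {
  local bitbucket_home="$1"
  shift
  mkdir -p "${REPORT_DIR}"
  local repo_id report
  for repo_id in "$@"; do
    report="${REPORT_DIR}/repo-${repo_id}.txt"
    # Keep going if one repository fails, but say so in the summary.
    if bash "${CHECK_SCRIPT}" -H "${bitbucket_home}" -r "${repo_id}" > "${report}" 2>&1; then
      echo "repo ${repo_id}: report written to ${report}"
    else
      echo "repo ${repo_id}: check failed, see ${report}"
    fi
  done
}

# Example (IDs 101, 102 and 103 are placeholders):
#   check_repos "<BITBUCKET_HOME>" 101 102 103
```

Writing each report to its own file keeps the output of one repository from scrolling past the next, and a failed check (for example, a missing repository directory) does not stop the remaining ones.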