Repository size remains the same after deleting large files and running garbage collection (GC) on the remote

Platform Notice: Cloud Only - This article only applies to Atlassian products on the cloud platform.

Summary

The repository size may stay the same even after you delete large files and garbage collection is run on the remote repository.

Diagnosis

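This issue can be caused by large file objects that are still reachable from Git references kept on the remote repository, such as the references associated with pull requests. Garbage collection removes only unreachable objects, so these files stay in the repository for as long as the references exist. These references are preserved to improve the performance of complex diffs.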

Solution

The Bitbucket Cloud Support team can confirm the list of pull requests that need to be deleted to free up the repository's storage space. Before you approve the pull requests for deletion, you can use the Bitbucket Cloud REST API to back up their details, based on the list provided by the support team.

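Below is a sample Python script that exports pull request details (author, state, ID, created date, source branch, destination branch, source and destination commits, and description) into a CSV file. The script also uses an API query (the q parameter) to export only pull requests created after a given date. For example, q=created_on+%%3E+2022-07-01T00%%3A00%%3A00-07%%3A00 is the URL-encoded form of created_on > 2022-07-01T00:00:00-07:00 and returns only the pull requests created after July 1, 2022. The doubled %% appears only because the script builds the URL with Python's % string formatting; the request itself contains single % characters. For more details, please refer to the API querying documentation.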

    import csv

    import requests
    from requests.auth import HTTPBasicAuth

    # Login
    username = '<please-key-in-your-bitbucket-username-here>'
    password = '<please-key-in-your-app-password>'
    repository = '<workspace-id/repo-name>'

    # URL of the first page. "%%" is a literal "%" under Python's % formatting,
    # so the request carries q=created_on+%3E+2022-07-01T00%3A00%3A00-07%3A00
    next_page_url = (
        'https://api.bitbucket.org/2.0/repositories/%s/pullrequests'
        '?fields=next,values.id,values.created_on,values.state,values.author,'
        'values.source.branch.name,values.destination.branch.name,'
        'values.source.commit.hash,values.destination.commit.hash,values.description'
        '&q=created_on+%%3E+2022-07-01T00%%3A00%%3A00-07%%3A00'
        '&pagelen=20' % repository
    )

    # 'w' starts a fresh file on each run so the header row is written only once
    with open('pr_stats.csv', 'w', newline='', encoding='utf-8') as f:
        # csv.writer quotes fields containing commas, quotes, or line breaks,
        # which pull request descriptions often do
        writer = csv.writer(f)
        writer.writerow(['PR Author', 'PR Status', 'PR Number', 'PR created date',
                         'PR Source Branch', 'PR Destination Branch',
                         'PR Source Branch Commit', 'PR Destination Branch Commit',
                         'PR Description'])

        # Keep fetching pages while there's a page to fetch
        while next_page_url is not None:
            response = requests.get(next_page_url, auth=HTTPBasicAuth(username, password))
            response.raise_for_status()  # stop on authentication or permission errors
            page_json = response.json()

            # Parse pull requests from the JSON
            for pr in page_json['values']:
                writer.writerow([
                    pr['author']['display_name'],
                    pr['state'],
                    pr['id'],
                    pr['created_on'],
                    pr['source']['branch']['name'],
                    pr['destination']['branch']['name'],
                    pr['source']['commit']['hash'],
                    pr['destination']['commit']['hash'],
                    pr['description'],
                ])

            # The 'next' key is absent on the last page, which ends the loop
            next_page_url = page_json.get('next')
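The date filter in the script's URL does not have to be encoded by hand. The following is a minimal sketch, assuming the filter expression has the form created_on > <timestamp> as in the example above; the helper name created_after_filter is hypothetical and not part of the Bitbucket API.

```python
from urllib.parse import quote_plus


def created_after_filter(timestamp):
    """Return a URL-encoded q parameter for 'created_on > timestamp'.

    timestamp is an ISO 8601 string with a UTC offset,
    for example '2022-07-01T00:00:00-07:00'.
    """
    # quote_plus turns spaces into '+' and escapes '>' and ':' as %3E and %3A
    return 'q=' + quote_plus('created_on > ' + timestamp)


print(created_after_filter('2022-07-01T00:00:00-07:00'))
# q=created_on+%3E+2022-07-01T00%3A00%3A00-07%3A00
```

Because the result contains literal % characters, it is safer to append it to the URL by string concatenation than to embed it in a string that is later passed through Python's % formatting.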

Sample 'pr_stats.csv' output:

| PR Author | PR Status | PR Number | PR created date | PR Source Branch | PR Destination Branch | PR Source Branch Commit | PR Destination Branch Commit | PR Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| XYX | OPEN | 968 | 2022-10-23T10:03:35.478284+00:00 | test | master | ff5evv8a5i9f | 019auieha046 | updated test changes |
| ABC | OPEN | 967 | 2022-10-22T09:21:52.577095+00:00 | develop | release | e2e3hhade042 | 57009debfca0 | updated the release changes |
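Before requesting deletion, it may be worth checking that the export contains every pull request on the list from the support team. The following is a minimal sketch, assuming the CSV has a "PR Number" column as in the sample output and that the list is available as integer IDs; the helper name missing_pr_ids and the ID 966 are hypothetical, used only for illustration.

```python
import csv
import io


def missing_pr_ids(csv_file, expected_ids):
    """Return the expected pull request IDs that have no row in the export."""
    reader = csv.DictReader(csv_file)
    exported = {row['PR Number'] for row in reader}
    return sorted(i for i in set(expected_ids) if str(i) not in exported)


# In-memory sample with a reduced set of columns, for illustration only
sample = io.StringIO(
    "PR Author,PR Status,PR Number,PR created date\n"
    "XYX,OPEN,968,2022-10-23T10:03:35.478284+00:00\n"
    "ABC,OPEN,967,2022-10-22T09:21:52.577095+00:00\n"
)
print(missing_pr_ids(sample, [966, 967, 968]))  # [966]
```

To check a real export, open it with open('pr_stats.csv', newline='', encoding='utf-8') and pass the file object in place of the in-memory sample. An empty result means every listed pull request has a row in the backup.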

Once you have backed up the details of your pull requests, ask the Bitbucket Cloud Support team to delete them. After the pull requests are deleted, the large files associated with their Git references will be removed from the repository, which should reduce the repository size.

Updated on April 8, 2025
