Finding storage usage of Confluence space and page using REST API

Platform Notice: Cloud Only - This article only applies to Atlassian products on the cloud platform.

Summary

Problem

Storage can be tracked per product, but not by individual space or page.


Reference: https://support.atlassian.com/security-and-access-policies/docs/track-storage-and-move-data-across-products/

Solution

If you are unable to use the Storage usage feature for this purpose, you can use the Confluence Cloud REST API to programmatically list the storage size of each attachment in your Confluence spaces and pages, and save the output to .CSV files.

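The core of the approach is the Confluence v2 attachments endpoint, which reports a fileSize value for every attachment on a page. The following is a minimal sketch of that single call, assuming the placeholder credentials from the full script below and a hypothetical page ID of 123456:

import requests

# Placeholder credentials and site URL - replace with your own values
USER = "your_email_address@example.com"
TOKEN = "XXXXXXXXXXXXXXXXX"
BASE_URL = "https://your_site.atlassian.net"
PAGE_ID = "123456"  # hypothetical page ID, for illustration only

# Fetch the attachments of one page and sum their fileSize values
url = f"{BASE_URL}/wiki/api/v2/pages/{PAGE_ID}/attachments"
response = requests.get(url, headers={"Accept": "application/json"}, auth=(USER, TOKEN))
total_bytes = sum(int(a["fileSize"]) for a in response.json()["results"])
print(f"Page {PAGE_ID}: {total_bytes} bytes of attachments")

This sketch only reads the first batch of results; the full script below follows the _links.next cursor so that pages and spaces with many attachments are counted completely.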

To proceed, you should have:

  • A terminal with Python installed

  • Some programming knowledge

Steps:

  1. Log in as a user with Confluence Administrator permission, and create or use an existing API token for your Atlassian account. The user running the script can only fetch data that they have access to in Confluence; depending on page restrictions and permissions, not all attachments, pages, and/or spaces may be returned.

  2. Copy and paste the sample code below into a new file. Change the values of USER, TOKEN, and BASE_URL as appropriate.

    Note: The script below may stop working due to changes in the REST API specification. Please refer to the Confluence Cloud REST API documentation for up-to-date information.

    Python script

# This sample was updated on 18-Dec-2023
# This code sample uses the 'requests', 'json' and 'csv' libraries
import requests
import json
import csv

# Input your base URL, username and API token
USER = "your_email_address@example.com"
TOKEN = "XXXXXXXXXXXXXXXXX"
BASE_URL = "https://your_site.atlassian.net"

headers = {
    "Accept": "application/json"
}

# Sum the attachment sizes of each page and write one CSV row per page
def process_pages(pages, perPageWriter):
    space_attachment_volume = 0
    for page in pages:
        page_attachment_volume = 0
        print(f"  Page ID: {page['id']}")
        url = f"{BASE_URL}/wiki/api/v2/pages/{page['id']}/attachments"
        # Follow the "_links.next" cursor until every batch of attachments is fetched
        while url:
            response = requests.get(url, headers=headers, auth=(USER, TOKEN))
            data = response.json()
            for attachment in data["results"]:
                attachment_name = attachment["title"]
                attachment_size = attachment["fileSize"]
                print(f"    Attachment Name: {attachment_name}, {attachment_size} bytes")
                page_attachment_volume += int(attachment_size)
            if "next" not in data["_links"]:
                break
            url = f"{BASE_URL}{data['_links']['next']}"
        # Add the page total to the space total once all of its attachments are counted
        space_attachment_volume += page_attachment_volume
        print(f"  --> PAGE TOTAL: {page_attachment_volume}")
        # Write the page attachment volume to CSV
        perPageWriter.writerow([page["id"], str(page_attachment_volume)])
    return space_attachment_volume

# Walk every page of a space and write one CSV row per space
def get_pages(space, perSpaceWriter):
    space_attachment_volume = 0
    url = f"{BASE_URL}/wiki/api/v2/spaces/{space['id']}/pages"
    # Follow the "_links.next" cursor until every batch of pages is fetched
    while url:
        response = requests.get(url, headers=headers, auth=(USER, TOKEN))
        data = response.json()
        space_attachment_volume += process_pages(data["results"], perPageWriter)
        if "next" not in data["_links"]:
            break
        url = f"{BASE_URL}{data['_links']['next']}"
    print(f"\n SPACE TOTAL: {space_attachment_volume} bytes")
    print("----------")
    # Write the space attachment volume to CSV
    perSpaceWriter.writerow([space["name"], space["key"], str(space_attachment_volume)])

with open('per_page.csv', 'w', newline='') as pagecsvfile, open('per_space.csv', 'w', newline='') as spacecsvfile:
    perPageWriter = csv.writer(pagecsvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    perSpaceWriter = csv.writer(spacecsvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    perPageWriter.writerow(['pageid', 'attachment_size(byte)'])
    perSpaceWriter.writerow(['space_name', 'space_key', 'attachment_size(byte)'])

    # Get all spaces, following the "_links.next" cursor between batches of results
    url = f"{BASE_URL}/wiki/api/v2/spaces"
    while url:
        response = requests.get(url, headers=headers, auth=(USER, TOKEN))
        data = response.json()
        for space in data["results"]:
            get_pages(space, perSpaceWriter)
        if "next" not in data["_links"]:
            break
        url = f"{BASE_URL}{data['_links']['next']}"
  3. Execute the file. You may need to install the third-party requests library first (for example, with pip install requests); the json and csv modules are part of the Python standard library.

    Terminal command

    $ python <filename>
  4. In the same directory (the current working directory), files named "per_space.csv" and "per_page.csv" will be generated with the collected data. See the sketch after these steps for one way to read and sort the results.

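Optionally, once the CSV files have been generated, the per-space totals can be sorted to find the spaces that consume the most storage. Below is a minimal sketch, assuming the per_space.csv produced by the script above (which writes with ',' as the delimiter and '|' as the quote character):

import csv

# Read per_space.csv (written with delimiter=',' and quotechar='|') and
# list spaces from largest to smallest total attachment size
with open("per_space.csv", newline="") as f:
    reader = csv.DictReader(f, delimiter=",", quotechar="|")
    rows = sorted(reader, key=lambda r: int(r["attachment_size(byte)"]), reverse=True)

for row in rows:
    size_mib = int(row["attachment_size(byte)"]) / (1024 * 1024)
    print(f"{row['space_key']:<12} {row['space_name']:<40} {size_mib:10.2f} MiB")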
Updated on April 8, 2025
