Get Pages with oldest Historic Versions in Confluence Data Center
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
As described in the Page History and Page Comparison Documentation, Confluence tracks the history of changes to each page by creating a new version of the page each time it's modified, allowing users to view the changes between different versions, and roll back to a previous version upon need. The historic entries are actually pages that remain stored in the database, and sometimes Confluence instances may have several historic pages saved in the backend, which may not be in use and are not automatically purged.
As an administrator, you would like to get a list of how many versions exist for a specific page in different spaces.
If you have Confluence Data Center, remaining versions will not be renumbered when a version is deleted. Older versions may also be deleted automatically, if your administrator has defined retention rules for the site or space.
Environment
Confluence 6.2.1 and later
All below queries are written for Postgres and may require modification to run on other database engines.
Solution
Run the following SQL statement to list the current number of versions of a specific page, and the latest/current version on the page/space:
WITH VERSIONED_PAGES AS(
SELECT c1.contentid as current_contentid ,c1.title as current_title ,c1.version current_version,c1.spaceid current_spaceid,c2.contentid previous_contentid,c2.title previous_title
FROM CONTENT c1
JOIN CONTENT c2 ON c1.contentID = c2.prevver
WHERE c1.CONTENTTYPE = 'PAGE'
)
SELECT MAX(current_version) AS "Latest Page Version", COUNT(current_version) AS "Total Page Versions", current_title AS "Current Page Title", s.spacekey AS "Space Key"
FROM VERSIONED_PAGES vp
JOIN SPACES s ON s.spaceid=vp.current_spaceid
GROUP BY current_title, s.spacekey
ORDER BY 2 DESC
Run the following SQL statement to list the total number of versions in your spaces:
WITH VERSIONED_PAGES AS(
SELECT c1.contentid as current_contentid ,c1.title as current_title ,c1.version current_version,c1.spaceid current_spaceid,c2.contentid previous_contentid,c2.title previous_title
FROM CONTENT c1
JOIN CONTENT c2 ON c1.contentID = c2.prevver
WHERE c1.CONTENTTYPE = 'PAGE'
)
SELECT COUNT(current_contentid) AS "Total Versions in Space", s.spacekey
FROM VERSIONED_PAGES vp
JOIN SPACES s ON s.spaceid=vp.current_spaceid
GROUP BY s.spacekey
ORDER BY 1 DESC
If you want all the pages and their historical version count per space, run the following query:
SELECT content.title, content.version
FROM content
INNER JOIN spaces
on content.spaceid = spaces.spaceid
WHERE content.contenttype = 'PAGE'
and content.prevver is null
and content.content_status = 'current'
and spaces.spacename ='<SPACENAME>'
ORDER BY 2 DESC;
ℹ️ Be sure to replace <SPACENAME>
place holder with the spaces full name you want to query.
If you are looking to list the pages that has historic versions older than X days, run the following query:
select distinct parentid
from "content"
where content_status !='current'
and contenttype ='PAGE'
and (now()::date - creationdate::date) > X ;
ℹ️ Replace 'X' with the actual number in the below query
Additional resources: retrieve oldest Confluence data
You may also wish to fetch the oldest set of data from your Confluence instance.
Invariably, the oldest data will always be a space, since a space needs to be present in the first place for content to be created within it.
This query will give you the space creation date from the oldest to the latest:
WITH Recently_Viewed AS (SELECT
"SPACE_KEY", MAX("LAST_VIEW_DATE") as "last_view_date"
FROM "AO_92296B_AORECENTLY_VIEWED"
GROUP BY 1)
SELECT s.spacename, s.spacekey, s.creationdate as "creation_date", MAX(c.lastmoddate) as "last_updated_date",r."last_view_date"
FROM content c
JOIN spaces s ON s.spaceid=c.spaceid
LEFT JOIN Recently_Viewed r ON r."SPACE_KEY" = s.spacekey
GROUP BY 1,2,3,5; order by s.creationdate
And this query should give you the content such as pages etc, from oldest to newest:
SELECT content.title, content.pageid, content.contenttype, content.lastmoddate, spaces.spacename
FROM content join spaces on content.spaceid=spaces.spaceid
WHERE(content.lastmoddate) > '2000-05-10' order by content.lastmoddate;
Was this helpful?