Language statistics retrieval in Bitbucket Data Center repositories
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Discover how to use third-party tools like Linguist to extract language statistics from Bitbucket Data Center repositories.
Solution
Caveats
This document describes how to retrieve language statistics for repositories, addressing a common customer query to Atlassian Support; however:
Bitbucket DC is primarily an application for hosting Git repositories, with governance surrounding access controls and workflow management.
The information in this document is presented as-is.
Atlassian doesn't guarantee that the data created using this document is correct or fit for any purpose.
Atlassian won't provide any support regarding the information in this document and won't answer support requests raised in relation to it.
Any mention of tools in the section below doesn't constitute an endorsement of these tools, nor does it imply that the mentioned tools are fit for any particular purpose. These are third-party tools not developed or maintained by Atlassian. Always evaluate third-party tools before making a purchasing decision and speak to the tool vendor when in doubt. Atlassian doesn't provide any support for the functionality of third-party tools.
You may want to extract statistics about the programming languages used in your Bitbucket Data Center repositories. Bitbucket Data Center has no built-in way of obtaining this kind of metric, but you may consider the third-party tool Linguist, which appears to be able to return this information for any Git repository.
It is important to run the tool (or any similar tool) on a clone of the repository (for instance, on a workstation) and not on the original repository, which is stored on disk in the shared Bitbucket home directory. This avoids performance problems in case the tool performs heavy Git operations that might slow down other Git operations running in Bitbucket Data Center.
Alternatively, you could choose to perform these actions on a separate (non-production) Data Center instance that holds the same data.
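As an illustration of the approach described above, the following is a minimal Python sketch that clones a repository into a temporary directory on a workstation and runs Linguist against that clone rather than against the Bitbucket home directory. It assumes that git and the github-linguist executable are installed and on the PATH, and that github-linguist analyzes the current working directory and accepts a --json flag, as described in the tool's README. The function names and the clone URL passed on the command line are hypothetical and only serve the example. The raw output is returned unparsed, because its exact structure may differ between Linguist versions.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def clone_command(clone_url: str, dest: Path) -> list[str]:
    """Build the git command that clones the repository into dest."""
    return ["git", "clone", "--quiet", clone_url, str(dest)]


def linguist_command() -> list[str]:
    """Build the Linguist command (assumed CLI: github-linguist --json)."""
    return ["github-linguist", "--json"]


def language_stats(clone_url: str) -> str:
    """Clone the repository to a temporary directory and return Linguist's raw output."""
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = Path(tmp) / "repo"
        # Work on a clone, never on the repository in the Bitbucket home directory.
        subprocess.run(clone_command(clone_url, repo_dir), check=True)
        # Run Linguist from inside the clone so it analyzes that working copy.
        result = subprocess.run(
            linguist_command(),
            cwd=repo_dir,
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout


if __name__ == "__main__":
    # Usage (hypothetical script name): python language_stats.py <clone-url>
    print(language_stats(sys.argv[1]))
```

The temporary directory is deleted once the statistics have been collected, so no copy of the repository is left on the workstation.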