How To Estimate FishEye Repository Index
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
The purpose of this article is to provide you with steps to estimate the length of time it will take to index your FishEye repository.
This page covers estimates for Git repositories.
Solution
Step 1 - Identify Repository Size
The most important factors that determine slurp time are the total number of commits (reachable from both remote heads and tags) and the number of paths changed in each commit.
Run the following commands to retrieve the corresponding values:
total commits reachable
number of remote heads
number of tags
number of paths in the most recent tag
$ git rev-list --all --count
1000000
$ git show-ref --heads --no-abbrev | wc -l
0
$ git show-ref -d --tags | wc -l
6000
$ git ls-tree -r origin/<recent_tag> | wc -l
50000
Step 2 - Identify the Number of Commits Fisheye Can Process Per Day
ℹ️ This information is only available if DEBUG logging is enabled on Fisheye/Crucible.
To enable debug logging:
Go to Administration > Global settings > Server
Under Debug logging, click Turn debug logging ON
Retrieve the FishEye logs and execute the following command:
grep "<- Processing" atlassian-fisheye-YYYY-MM-DD.log | awk 'BEGIN {total = 0}{total += $15}END{printf("%d commits processed in %dms total (%.1fms avg)\n", NR, total, total / NR);}'
1000000 commits processed in 75525706ms total (42.9ms avg)
From those entries, count the number of unique commits that were processed:
grep "<- Processing" atlassian-fisheye-YYYY-MM-DD.log | awk '{print $13}' | sort | uniq | wc -l
50000
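As a sanity check on the awk field positions, the same pipeline can be run over a couple of fabricated log lines. The nine leading fields below are placeholders standing in for Fisheye's log prefix; only the positions of the commit hash and the millisecond value relative to the "<- Processing" marker are assumed from the commands above.

```shell
# Two illustrative log lines; p1..p9 are placeholders for the timestamp/level/thread prefix.
cat > sample.log <<'EOF'
p1 p2 p3 p4 p5 p6 p7 p8 p9 <- Processing commit aaa111 took 40 ms
p1 p2 p3 p4 p5 p6 p7 p8 p9 <- Processing commit bbb222 took 50 ms
EOF

# Same pipeline as above: $15 is the milliseconds value, NR the line count.
grep "<- Processing" sample.log \
  | awk 'BEGIN {total = 0}{total += $15}END{printf("%d commits processed in %dms total (%.1fms avg)\n", NR, total, total / NR)}'
# -> 2 commits processed in 90ms total (45.0ms avg)
```

If your real log prints the expected summary, the field positions match; if not, adjust $13 and $15 to your log format.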
Step 3 - Calculate Estimate
From the figures above, the repository contains 1 million commits, of which Fisheye processed 50,000 (5%) in a single day. At that rate, a full index would take approximately 20 days.
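The arithmetic above can be scripted, rounding up to whole days. The figures are the illustrative ones from Steps 1 and 2; substitute your own.

```shell
# Illustrative figures from the steps above.
total_commits=1000000      # from: git rev-list --all --count
commits_per_day=50000      # unique commits processed in one day's worth of logs

# Integer division, rounded up to whole days.
days=$(( (total_commits + commits_per_day - 1) / commits_per_day ))
echo "Estimated indexing time: $days days"
# -> Estimated indexing time: 20 days
```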
Bonus - Check (Git) Index Progress
Fisheye's UI doesn't provide any indication of Git repository indexing progress, but you can hit a REST endpoint that returns the number of commits processed so far:
<FECRU_URL>/rest-service-fecru/admin/repositories-v1/<REPOSITORY_KEY>
E.g.:
{
"name": "processor-sdk-linux-local",
"displayName": "processor-sdk-linux-local",
"state": "RUNNING",
"enabled": true,
"indexingStatus": {
"linesOfContentIndexingInProgress": false,
"indexingStateCounts": {
"INFILLED": 0,
"METADATA_INDEXED": 0,
"COMPLETE": 0,
"SCANNED": 0,
"UNKNOWN": 33958,
"CONTENT_INDEXED": 0
},
"initialScanningComplete": false,
"fullRepositorySlurpDone": false,
"incrementalIndexingInProgress": false,
"message": "Processing commit dc335d9735220b3a9ece5ec2d95864b1e8ff06a0",
"error": false,
"fullIndexingInProgress": true,
"crossRepositoryRescanInProgress": false
}
}
The number we're interested in is indexingStatus.indexingStateCounts.UNKNOWN, which can be compared to the total number of commits in the repo (git rev-list --all --count) to get a rough estimate of indexing progress.
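Putting that together, here is a small shell sketch. The base URL, credentials, and total commit count are placeholders, and the JSON is trimmed from the sample response above; on a live instance you would fetch the JSON with curl instead.

```shell
# Hypothetical values; substitute your own Fisheye base URL and repository key.
FECRU_URL="https://fisheye.example.com"
REPO_KEY="processor-sdk-linux-local"

# On a live instance, fetch the status like this (requires admin credentials):
#   json=$(curl -s -u admin "$FECRU_URL/rest-service-fecru/admin/repositories-v1/$REPO_KEY")

# For illustration, use a trimmed copy of the sample response above:
json='{"indexingStatus":{"indexingStateCounts":{"UNKNOWN":33958,"COMPLETE":0}}}'

# Commits still to be processed, extracted from indexingStateCounts.UNKNOWN:
remaining=$(printf '%s' "$json" | sed -n 's/.*"UNKNOWN":[[:space:]]*\([0-9]*\).*/\1/p')

# Total commits, as reported by `git rev-list --all --count` (illustrative figure):
total=1000000

indexed=$((total - remaining))
echo "$indexed of $total commits indexed ($((100 * indexed / total))%)"
# -> 966042 of 1000000 commits indexed (96%)
```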