Omit data using an anti-join
Platform Notice: Cloud Only - This article only applies to Atlassian products on the cloud platform.
At times, you may want to exclude rows where two joining queries have matching values, only keeping the rows that didn’t match for a particular query. In this case, we’ll be interested in keeping the rows from the query on the left where they did not match with the query on the right. This is called an anti-join, and the way to carry this out in Atlassian Analytics through Visual SQL steps.
In the following example, we used our Atlassian Data Lake data source to create a custom column that includes four Jira projects used for testing we’d like to exclude from our results. We performed the anti-join by doing the following:
For your first query, pull in the rows you’d like to exclude from your second query’s result set. You need to add the same column twice since you’ll need to use the second column as a means to filter after joining the two queries—if it’s not clear now, you’ll see what we mean a bit later.
For our example, we added our custom column twice by querying Name from Jira Project with a query filter of Nameis not like ‘%test%’.
In the query filter, we used the case insensitive version of the is not like filter operator in case some projects had ‘test’ in their name while others had ‘Test’. Using the case insensitive filter operator will ensure both options are removed from the query results.
Add a second query and include the column you’d like filtered, where you’ll only include the rows that don’t match with the Jira project values from the query in step 1. You also need to add this column twice in the query.
For our example, we want a list of all the Jira projects in our instance, so we will query Name from the Jira Project table.
Click Run query.
Click the "Join" step below Query 2. By default, the Queries are merged with an Outer join on the first columns, which is what we want in this case.
This is where pulling in the same column twice helps us see where we have a match with the other query’s result set. In our output, the second column includes the results you’re looking to exclude.
Add a "Filter" step to exclude any rows with results in the second column.
For our example, we included any rows where Name_2 is not null. Click Save. Notice how our results return four less rows now!
Since they’re no longer needed, hide the duplicate columns.
Here, we hid Name_2 and Name_2:1 so only “Name” remained in our result set.
We now have successfully removed the test projects in our custom column, reducing our 21 rows to 17 rows!
Was this helpful?