Using Apex to call the Gearset Reporting API

If you're considering consuming Gearset Reporting API data in Salesforce to take advantage of its built-in declarative reporting and dashboard capabilities, here are a few considerations based on my own experience of such an implementation.

In this article, I’ll refer to “endpoint pairs” - these are the two endpoints per DORA metric as grouped on our Reporting API page. If I want to reference one in particular, it'll be named specifically.

Data model

I decided to store the entire response data in Salesforce, so as to have greater options to derive my own custom metrics from it. Therefore I created one SObject for each API response format, and matched the fields exactly, names and data types. This certainly reduces the load on any deserialization activities when consuming the data.

Something I didn’t think about ahead of time was the relationships between the results from different endpoints. For example, Deployment frequency will enumerate all deployments across the team between the supplied dates, including Gearset’s DeploymentId for each. Similarly, Change failure rate also returns the DeploymentId for each error-causing deployment. You could make the latter a child of the former - a failed deployment has to be related to the original deployment of course - but I didn’t do this; each response was completely independent.

Make sure you consider all current use cases and design for extensibility when creating the data model to store the reporting data.

Understanding the heuristics

To get all the necessary data, it’s critical to read through the API endpoint descriptions on the specification page and note in particular that the Change failure rate and Time to restore endpoint pairs detail that they involve heuristics created by Gearset. This means that Gearset will be using internal logic to determine when a deployment is deemed to have included a failed change (and therefore how long it took to recover).

Another important point to note is that those same two endpoints, as they are essentially covering corrective activities to earlier deployments, take the original deployment time as the reference point. That’s to say that if a deployment was done on the 1st of a month, and then actions which Gearset would determine to be the handling of a change failure were deployed on the 3rd, then hitting the Change failure rate or Time to restore endpoint pairs with a start date of 2nd would not return the details of the corrective deployment, as it’s the date of the original deployment which is being interrogated. You’d need to query from the 1st again to get the details for the failure and time to restore.

Job arguments

As above, the Change failure rate and Time to restore endpoint pairs will return details of failed deployments only when the original deployment which is being corrected is between the provided start and end dates.

Clearly, there’s no value in an end date later than the current date and time and, for a regular scheduled run, no reason for it to be earlier either, so the end date should simply be the current date and time.

However, the start time needs to be early enough to cover any deployment which might have subsequently been corrected. The question therefore becomes what is the longest possible time we might have a deployment in the target org (almost invariably this will be production) that has a failure before this is recognised and rolled back/reverted/corrected as per the heuristic?

Obviously this is subjective and needs to be determined on a customer by customer basis, and it’s worth adding a margin of error on to whatever the answer to the above question ends up being. However, it’s not solved by just choosing a massive time range - this will mean the API call needs to process huge amounts of data and risks timing out. In my example, I chose a rolling two week window.

The other endpoint pairs, Deployment frequency and Lead time for changes, don’t involve the potential revision of previously received data, so they can simply be called with contiguous dates - the start date of the current run is the end date of the last run, and the end date of the current run should be the current date and time.
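The two windowing rules described in this section can be sketched in a few lines of Apex. This is only an illustration: the class and method names are hypothetical and are not part of any Gearset or Salesforce API.

```apex
// Hypothetical helper; all names here are illustrative only.
public class GsrWindow {
    public Datetime startDate;
    public Datetime endDate;

    public GsrWindow(Datetime startDate, Datetime endDate) {
        this.startDate = startDate;
        this.endDate = endDate;
    }

    // Change failure rate / Time to restore: re-query a rolling window so that
    // original deployments which were corrected later are picked up again.
    public static GsrWindow rolling(Datetime asOf, Integer weeksBack) {
        return new GsrWindow(asOf.addDays(-7 * weeksBack), asOf);
    }

    // Deployment frequency / Lead time for changes: carry on from the point
    // where the previous run ended.
    public static GsrWindow contiguous(Datetime lastRunEnd, Datetime asOf) {
        return new GsrWindow(lastRunEnd, asOf);
    }
}
```

With a two week window, `GsrWindow.rolling(Datetime.now(), 2)` gives the arguments for the heuristic-based endpoint pairs, while `GsrWindow.contiguous(lastRunEnd, Datetime.now())` gives the arguments for the other two.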

For the very first run of these two, the start date can be the earliest date for which you want to report on activity through the Pipelines feature. Nothing will be returned for any period before Pipelines was in use.

The only exception to the above would be if you are looking to import a large chunk of historical data. It’s possible that this might be too much to retrieve in a single call, so you may need to split this up into multiple calls - here, the end date each time would need to be specified rather than being the current date and time.
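As a rough sketch of how a historical import might be split into several calls, the helper below cuts a date range into consecutive windows. The class name, method name and the idea of chunking by a fixed number of days are my own assumptions; the right chunk size depends on how much data each window contains.

```apex
// Hypothetical helper for a one-off historical import.
public class GsrBackfill {
    // Splits the range from rangeStart to rangeEnd into consecutive windows
    // of at most chunkDays days. Each entry is a two-element list:
    // { windowStart, windowEnd }.
    public static List<List<Datetime>> split(Datetime rangeStart, Datetime rangeEnd, Integer chunkDays) {
        if (chunkDays == null || chunkDays < 1) {
            throw new IllegalArgumentException('chunkDays must be at least 1');
        }
        List<List<Datetime>> windows = new List<List<Datetime>>();
        Datetime windowStart = rangeStart;
        while (windowStart < rangeEnd) {
            Datetime windowEnd = windowStart.addDays(chunkDays);
            if (windowEnd > rangeEnd) {
                windowEnd = rangeEnd; // the final window may be shorter
            }
            windows.add(new List<Datetime>{ windowStart, windowEnd });
            windowStart = windowEnd; // the next window starts where this one ended
        }
        return windows;
    }
}
```

Each resulting window would then be passed to its own call, with the window end supplied explicitly as the end date.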

Configuration items

Hardcoding values is never great. Most of the endpoints require environment or pipeline Ids to be provided, and the values to be provided here should be hidden enough so as not to be accessible to everyone, but it should equally not be a painful exercise if these need to be changed for any reason.

The above is typical for most integrations, and Custom Metadata evolved as a great solution for handling this sort of data. In my solution, I created three custom metadata types:

one for containing pipeline Ids
one for containing environment Ids
one for containing general settings relating to the integration

We’ll touch on the practicalities of handling multiple pipelines and/or environments in the next section.

Three different types do feel like quite a lot, especially as each only has one field. If I were designing this for an environment with a single pipeline and single production environment (and assuming that deployment metrics for earlier orgs in the pipeline aren’t needed) then these could all easily be combined into a single custom metadata type, with fields for the pipeline Id, environment Id and for each other setting.

The ‘general settings’ referred to was a value to represent how far back in time to set the start date if:

no existing start date was held
the call was to the two endpoint pairs discussed above where the start date couldn’t just be the end date of the last call

The single field was a numeric value that defined the number of weeks to deduct from the current date and time to create the correct start date for the call.
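A minimal sketch of reading such a setting is shown here. The custom metadata type `GSR_Setting__mdt`, its field `Weeks_To_Look_Back__c` and the record name `Default` are hypothetical names chosen for illustration, and the fallback of two weeks is likewise an assumption.

```apex
// Assumption: a custom metadata type GSR_Setting__mdt with a Number field
// Weeks_To_Look_Back__c and a record whose developer name is 'Default'.
// These names are hypothetical and would need to match your own org.
public class GsrSettings {
    public static Datetime defaultStartDate() {
        GSR_Setting__mdt setting = GSR_Setting__mdt.getInstance('Default');

        // Fall back to two weeks if the record or field is not populated.
        Integer weeks = 2;
        if (setting != null && setting.Weeks_To_Look_Back__c != null) {
            weeks = setting.Weeks_To_Look_Back__c.intValue();
        }
        return Datetime.now().addDays(-7 * weeks);
    }
}
```

Reading the value with `getInstance` avoids a SOQL query, and changing the look-back period becomes a metadata change rather than a code change.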

Multi pipelines/orgs and Salesforce standard charts

There are some decisions to be made here. Although Pipelines will capture all activity across all orgs in the pipeline, how many do you really wish to track? The DORA metrics are only concerned about production, and that makes sense. You might expect to have “change failures” in earlier environments - that’s why you have them. Do you really therefore need to call the Change failure rate and Time to restore endpoint pairs for integration/QA/UAT orgs as well as production?

Deployment frequency will report across the whole team. Are they all working on the same thing? Will you need to add some sort of custom filtering on the data either once in Salesforce or when consuming the API output?

If you have multiple production orgs, and you’re looking to produce charts for dashboards, bear in mind that standard Salesforce functionality doesn’t allow you to create a chart with e.g. a drop down menu that can change the rendered data set. For example, if you ingest the data for two production orgs into the same SObject, then report on that, whilst the report can filter to one or the other, you can’t have both in the report and then in a chart switch between the two. You can of course have both shown on the same chart, perhaps as two lines on a line chart if that’s appropriate, but you might need to create custom components if you’re needing to do particularly advanced things across large numbers of orgs or pipelines.

Data retrieval mechanism

The Reporting API enforces limits to ensure fair access to all users. There’s also an upper limit to the length of time a query can take before it times out. Just as with Salesforce, there’s a balance between not exceeding the number of permitted API calls and not leaving too long between retrievals so that all the data can be retrieved and returned.

Depending on the volume of throughput in the relevant environments, pipeline/s and even team for the Deployment frequency, the optimal interval between calls will be subjective, but the obvious options are once a day and once an hour.

To enable this, I created a Schedulable Apex orchestrator class, with separate Queueable classes, one for each endpoint. This is necessary because you can’t call web services with pending DML operations, so a retrieve-insert-retrieve-insert etc. model wouldn’t be permitted within a single transaction. The schedulable class retrieves the configuration details then enqueues an instance of each of the eight endpoint classes, passing in whatever configuration is relevant to each one. In, for example, a setup with two production orgs, there’d be two separate queueable class instances of each of the Lead time for changes endpoints.
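The overall shape of that arrangement might look something like the following skeleton. It is a sketch under assumptions rather than the code I actually deployed: the class names are hypothetical, only one of the eight endpoint classes is shown, and the environment Id list is a placeholder for values read from custom metadata.

```apex
// Each top-level class below would live in its own Apex class file.

// Orchestrator: gathers configuration, then enqueues one job per endpoint call.
public class GsrOrchestrator implements Schedulable {
    public void execute(SchedulableContext ctx) {
        Datetime runTime = Datetime.now();
        // Placeholder list: in practice these Ids would come from custom metadata.
        List<String> environmentIds = new List<String>{ 'ENVIRONMENT_ID_PLACEHOLDER' };
        for (String environmentId : environmentIds) {
            // Rolling two week window, as used for the Change failure rate endpoints.
            System.enqueueJob(
                new GsrChangeFailureJob(environmentId, runTime.addDays(-14), runTime)
            );
        }
    }
}

// One Queueable per endpoint. Database.AllowsCallouts is required for the job
// to be allowed to make an HTTP request.
public class GsrChangeFailureJob implements Queueable, Database.AllowsCallouts {
    private final String environmentId;
    private final Datetime startDate;
    private final Datetime endDate;

    public GsrChangeFailureJob(String environmentId, Datetime startDate, Datetime endDate) {
        this.environmentId = environmentId;
        this.startDate = startDate;
        this.endDate = endDate;
    }

    public void execute(QueueableContext ctx) {
        // 1. Make the callout first, while no DML is pending in this transaction.
        // 2. Deserialize the response body into the data transfer classes.
        // 3. Insert or upsert the resulting records as the final step.
    }
}
```

An hourly schedule could then be registered once from anonymous Apex with `System.schedule('Gearset reporting hourly', '0 0 * * * ?', new GsrOrchestrator());` (the job name is arbitrary).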

It’s probably unlikely you’ll reach it, but bear in mind that you can enqueue 50 jobs at most in a single transaction.

The orchestration is dependent on the data model. I mentioned above that I had no enforced relationships between the SObjects I’d created to receive the data, so each retrieval operation could be independent and parallel. If, for example, I’d decided to make the Time to restore object the detail side of a master-detail relationship with Deployment frequency as the master, then to avoid DML errors, the Time to restore data would have to be loaded after Deployment frequency, probably with the Apex class for Deployment frequency then enqueuing the dependent operations.

In my implementation, the Start Dates for Deployment frequency and Lead time for changes endpoint pairs were set as the latest value of the fields used to store the Date and MergedAt properties of all data retrieved so far. Whilst this worked within the proof of concept, it’s not particularly scalable given that queries such as those count all rows in order to return a result - so once there were more than 50k rows in the object, that query would fail having breached the relevant governor limit. Therefore, another SObject perhaps to store general summarised state like this might be useful.

That state could also be used for summarizing the statuses of each call in some manner. If one of the eight API calls failed for some reason, as each is its own transaction, I’d have seven endpoints with the latest data and one without. This might be acceptable if the next run is just going to fill in the gaps (although that would mean Next Start Date fields per API call perhaps) but you may want to a) know about such failures and b) manually retrigger the job if needed. Therefore, a way to isolate the arguments, the execution and the results would be necessary.
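One possible shape for that state, sketched under assumptions: a custom object `GSR_Sync_State__c` with a unique External ID text field `Endpoint__c`, a Date/Time field `Next_Start_Date__c` and a text field `Last_Status__c`. None of these names come from my implementation; they are hypothetical.

```apex
// Assumption: GSR_Sync_State__c holds one row per endpoint call. The object
// and all of its field names are hypothetical.
public class GsrSyncState {
    // Reads the stored start date for an endpoint with a single-row query,
    // rather than aggregating over every record retrieved so far.
    public static Datetime nextStartDate(String endpointName, Datetime fallback) {
        List<GSR_Sync_State__c> rows = [
            SELECT Next_Start_Date__c
            FROM GSR_Sync_State__c
            WHERE Endpoint__c = :endpointName
            LIMIT 1
        ];
        if (rows.isEmpty() || rows[0].Next_Start_Date__c == null) {
            return fallback;
        }
        return rows[0].Next_Start_Date__c;
    }

    // Creates or updates the row for an endpoint, matched on the external Id.
    public static void save(String endpointName, Datetime nextStart, String status) {
        GSR_Sync_State__c state = new GSR_Sync_State__c(
            Endpoint__c = endpointName,
            Next_Start_Date__c = nextStart,
            Last_Status__c = status
        );
        Database.upsert(state, GSR_Sync_State__c.Endpoint__c, true);
    }
}
```

On success a job would save its end date as the next start date. On failure it would leave the start date unchanged and store a failure status, so that the next run (or a manual retrigger) covers the gap and the failure is visible in a report.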

Data retrieval specifics

I wanted to keep the deserialization as simple as possible, so as I had one Apex class per endpoint, each of those classes had one or more inner classes set up to exactly match the structure of the response. For example, the class to receive the data from the Change failure rate endpoint has these three inner classes:

// Top-level wrapper matching the response body.
public class Events {
    List<DeploymentDetails> Events { get; set; }

    public Events() {
        Events = new List<DeploymentDetails>();
    }
}

// One failed deployment, with the pull requests it contained.
public class DeploymentDetails {
    public String DeploymentId { get; set; }
    public DateTime Timestamp { get; set; }
    public List<PullRequest> PullRequests;

    public DeploymentDetails() {
        PullRequests = new List<PullRequest>();
    }
}

public class PullRequest {
    public DateTime MergedAt { get; set; }
    public DateTime CreatedAt { get; set; }
    public String AuthorUsername { get; set; }
    public String AuthorName { get; set; }
    public String Description { get; set; }
    public String Title { get; set; }
    public Integer PullRequestNumber { get; set; }
    public String Url { get; set; }

    // Maps this pull request onto the custom object used for storage.
    public GSR_Change_Failed_PR__c convertToSObject(String deploymentId, String queriedEnvId) {
        GSR_Change_Failed_PR__c pr = new GSR_Change_Failed_PR__c();
        pr.Author_Name__c = this.AuthorName;
        pr.Author_User_Name__c = this.AuthorUsername;
        pr.Description__c = this.Description;
        // Link to the parent deployment by its Deployment_Id__c value
        // rather than by Salesforce record Id.
        pr.Gearset_Deployment__r = new Gearset_Deployment__c(Deployment_Id__c = deploymentId);
        pr.GS_Environment_Id__c = queriedEnvId;
        pr.Merged_At__c = this.MergedAt;
        pr.Number__c = this.PullRequestNumber;
        pr.Title__c = this.Title;
        pr.URL__c = this.Url;
        return pr;
    }
}

There are though a couple of implementation points to be aware of, which meant that the deserialization wasn’t always as simple as it could be.

Where everything worked well, that operation looks something like this (using the above example):

data = (Events)JSON.deserialize(response.getBody(), Events.class);

Other response structures though aren’t so easy to deserialize. The Deployment frequency responses include a property named Date. This is of course a reserved keyword in Apex, so you cannot create a property of the data transfer object class called Date. I named mine DeploymentDate but this means that in order for Apex to fully handle the deserialization, I needed to change the name of the property in the returned JSON string. Including the quotes ensured that no other properties including the word ‘Date’ were accidentally changed:

data = (Deployment)JSON.deserialize(response.getBody().replace('"Date"', '"DeploymentDate"'), Deployment.class);

Of course, Apex now supports DataWeave transformations, but this code was written before that was GA, and deployment of DataWeave scripts is not yet supported by Gearset.
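Another option that avoids editing the JSON string is untyped deserialization, where the reserved name only ever appears as a map key. The sketch below is not what I implemented; it assumes the Date value arrives as an ISO 8601 string, and the class and method names are hypothetical.

```apex
// Hypothetical alternative to the string replacement approach.
public class GsrDateParsing {
    // Takes the JSON for a single object and returns the value held under
    // its "Date" key as a Datetime, or null if the key is absent.
    public static Datetime readDate(String jsonObject) {
        Map<String, Object> fields =
            (Map<String, Object>) JSON.deserializeUntyped(jsonObject);
        String raw = (String) fields.get('Date');
        if (raw == null) {
            return null;
        }
        // Re-quote the value so the JSON parser handles the ISO 8601 format.
        return (Datetime) JSON.deserialize('"' + raw + '"', Datetime.class);
    }
}
```

The trade-off is that every property then has to be read from the map and cast by hand, which is exactly the work the typed inner classes were meant to avoid, so this is probably only worthwhile for the one awkward response.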

The other implementation detail is for the aggregate endpoints. My goal with these was to produce a figure for a report and dashboard representing the current rate. For example with the Lead time for changes aggregate endpoint, you receive the Min, Max and Average lead times for items deployed during the requested time window, which might be an hour or a day. However, you are probably going to want to report on a moving average or average over all time.

This could potentially be done in a report, but once again you need to be mindful of execution time, rows queried and so forth. A report is just a database query, and is subject to governor limits. I decided to calculate the average in trigger logic as new records are inserted into the relevant SObject, so that it’s already calculated when the report/dashboard are run.
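The arithmetic behind such a trigger can be kept very small. The helper below is a sketch of an incremental mean, with a hypothetical class name; it treats each retrieved window’s Average as one equally weighted sample, which is an assumption on my part and is not the same as an average weighted by the number of changes in each window.

```apex
// Hypothetical helper that trigger logic could call for each new record.
public class GsrRunningAverage {
    // Returns the mean after adding newValue to a set of previousCount
    // values whose mean was previousMean.
    public static Decimal add(Decimal previousMean, Integer previousCount, Decimal newValue) {
        if (previousMean == null || previousCount == null || previousCount <= 0) {
            return newValue; // first sample: the mean is the value itself
        }
        return previousMean + (newValue - previousMean) / (previousCount + 1);
    }
}
```

Because only the previous mean and count are needed, the trigger never has to query the existing rows, which keeps it clear of the row limits mentioned above.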

The final implementation point was on the Salesforce timeout. The longest that you can configure an Apex callout to last is 120 seconds, and the default value is 10 seconds. There is a timeout at 60 seconds on the Gearset side, so increasing it beyond that value will have no effect on the result. Should you be experiencing timeouts at 60 seconds, you’ll need to narrow the range for which data is being selected.
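Setting the timeout is a single call on the request. In this sketch, `Gearset_Reporting` is a hypothetical Named Credential assumed to hold the base URL and authentication, and the path is supplied by the caller, so no specific Gearset URL is implied.

```apex
// Hypothetical callout helper; the Named Credential name is an assumption.
public class GsrCallout {
    public static HttpResponse fetch(String path) {
        HttpRequest req = new HttpRequest();
        req.setEndpoint('callout:Gearset_Reporting' + path);
        req.setMethod('GET');
        // The timeout is set in milliseconds. 60 seconds matches the limit on
        // the Gearset side, so a larger value would not help.
        req.setTimeout(60000);
        return new Http().send(req);
    }
}
```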
