Each run of your data backup job is either a full run or an incremental run. Which type runs next depends on a handful of factors. In this article we'll look at how each type of run works, and how Gearset decides which type will run next.

The first run

The first run of a new backup job will be a full run; we'll need to fetch and store all objects and records (as specified in your filter). As this new job will have no prior backup history, Gearset can't run a diff against a previous snapshot - we're effectively fetching everything 'as new'. In subsequent backups, Gearset will utilise incremental runs and de-duplication logic, based on the backup history already stored. It's therefore typical that your first run will take longer than any subsequent runs.

Incremental runs

Once your first run has completed, you'll usually see that your next run is an incremental one.

In these runs, Gearset asks Salesforce for all records that were created, changed, or deleted since the last backup run. Most standard and custom objects have a SystemModstamp field, which timestamps activity on a record. We ask Salesforce to return new/changed records whose timestamp falls after the last backup run, and we also make a GetDeleted call to pick up deletions.

In this incremental approach, we don't need to fetch records which haven't changed. Consequently:

  • Incremental runs are typically quicker than full runs
  • API calls are minimised, as we only make the necessary calls
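
To make the timestamp filter concrete, here is a minimal sketch of the kind of SOQL query that asks for records modified after the last backup run. It is an illustration only, not Gearset's actual implementation: the function name build_incremental_query, the field list, and the example timestamp are all assumptions.

```python
from datetime import datetime, timezone


def build_incremental_query(object_name, fields, last_run):
    """Build a SOQL query for records modified after the last backup run.

    Hypothetical helper for illustration; not part of Gearset or Salesforce.
    """
    # SOQL dateTime literals are unquoted ISO 8601 values, here expressed in UTC.
    cutoff = last_run.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    field_list = ", ".join(fields)
    return f"SELECT {field_list} FROM {object_name} WHERE SystemModstamp > {cutoff}"


# Example timestamp of the previous backup run (made up for this sketch).
last_run = datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc)
query = build_incremental_query("Account", ["Id", "Name", "SystemModstamp"], last_run)
print(query)
```

A query like this only returns new and changed records, which is why deletions need the separate GetDeleted call.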

Full runs

Most of the time, Gearset will execute a full run once every seven days, with incremental runs in between. (We'll go over other reasons for kicking off a full run in a moment.)

On an incremental run, we ask Salesforce what is new/changed/deleted. On a full run, we retrieve all records from Salesforce with a query(<all>) call. Occasionally, there is an inconsistency* in what Salesforce returns via the incremental run calls. By asking Salesforce for every record of every object specified in your filter, Gearset double-checks that we have everything.

Once we've retrieved all records, our de-duplication logic kicks in: Gearset runs a diff against everything already backed up, and then stores only the new/changed/deleted records. This increases the speed of the run, as there's no need to back up records which haven't changed.
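
The de-duplication step can be pictured as a diff between two snapshots keyed by record ID. The sketch below is a simplified model under that assumption; Gearset's real storage format and diff engine are not described in this article, and diff_snapshots and the sample records are hypothetical.

```python
def diff_snapshots(stored, fetched):
    """Compare two {record_id: fields} mappings and return only the differences.

    Hypothetical helper for illustration; not Gearset's actual diff engine.
    """
    # Records present in the fresh fetch but not in the stored backup.
    new = {rid: rec for rid, rec in fetched.items() if rid not in stored}
    # Records present in both whose field values differ.
    changed = {
        rid: rec
        for rid, rec in fetched.items()
        if rid in stored and stored[rid] != rec
    }
    # Records in the stored backup that no longer come back from the fetch.
    deleted = sorted(rid for rid in stored if rid not in fetched)
    return {"new": new, "changed": changed, "deleted": deleted}


# Made-up example data: one unchanged, one changed, one deleted, one new record.
stored = {
    "001A": {"Name": "Acme"},
    "001B": {"Name": "Globex"},
    "001C": {"Name": "Initech"},
}
fetched = {
    "001A": {"Name": "Acme"},
    "001B": {"Name": "Globex Corp"},
    "001D": {"Name": "Umbrella"},
}
result = diff_snapshots(stored, fetched)
print(result)
```

In this example the unchanged record 001A appears in none of the three groups, so nothing new would need to be stored for it.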

*The good news is: anything your end users can edit (e.g. standard and custom object records) is always captured by incremental runs. Records created by some Salesforce automated processes are trickier to capture accurately, because the objects involved have limited API call options. Examples of these objects are DeleteEvent, LoginHistory, and LoginGeo. The query(<all>) call captures any of these outliers, which might explain spikes in your backup history on each full run.

If you consider these spikes to be unnecessary noise, you can always remove those objects from your backup filter.

What other factors trigger a full run?

Alongside the weekly full run, there are other reasons Gearset might decide to do a full run:

  • If you edit your backup filter
  • If an incremental run fails due to an error. It can often be easier to 'fetch all', and let Gearset's engine do the hard work
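
Taken together, the rules in this article can be modelled as a small decision function. This is a sketch of the rules as described here, not Gearset's actual scheduling code: choose_run_type, its parameters, and the exact seven-day comparison are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def choose_run_type(
    last_full_run,
    now,
    filter_edited=False,
    last_incremental_failed=False,
    full_interval=timedelta(days=7),
):
    """Return "full" or "incremental" for the next backup run.

    Hypothetical model of the rules in this article, for illustration only.
    """
    # A brand new job has no backup history, so everything must be fetched.
    if last_full_run is None:
        return "full"
    # An edited filter or a failed incremental run triggers a full run.
    if filter_edited or last_incremental_failed:
        return "full"
    # Otherwise run in full once the interval since the last full run has passed.
    if now - last_full_run >= full_interval:
        return "full"
    return "incremental"


now = datetime(2024, 5, 8, 2, 0)
print(choose_run_type(None, now))                                      # first run
print(choose_run_type(datetime(2024, 5, 6, 2, 0), now))                # two days since full run
print(choose_run_type(datetime(2024, 5, 1, 2, 0), now))                # seven days since full run
print(choose_run_type(datetime(2024, 5, 6, 2, 0), now, filter_edited=True))
```

The full_interval parameter is included so the same model can describe a job that runs in full on a different schedule.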

Metadata backups

For every backup run, Gearset first fetches all metadata (as long as it's supported by the metadata API). You'll use this snapshot when running a metadata restore.

High frequency backups

Our high frequency beta also makes use of incremental and full runs. We'll execute a full run once a day, with incremental runs in between.
