Gearset doesn't use the exact same strategy for every run of a backup job. Backing up an org effectively means balancing several factors (such as speed, efficiency, robustness and accuracy), so there's no single "best" strategy. Gearset's backup engine uses different strategies at different times and, for the most part, these differences aren't visible to you: each run captures the whole state of the org, regardless of which strategy was used.
Before Gearset can start combining these various strategies, it needs to establish a "baseline" for the org. This amounts to downloading every single record from every object included by the job's filter. This is a one-off operation and subsequent runs can then use this baseline to back up the org more efficiently and quickly.
Establishing this baseline can involve processing a large amount of data - larger orgs can have terabytes of it - which inevitably takes some time for Gearset to request and download from Salesforce.
Establishing the baseline
During the first runs of your new data backup job, Gearset will back up all the record data storage (such as Accounts, Contacts, Cases etc.) before backing up any file storage (such as Attachments, ContentDocument, Files etc.). The difference between data storage and file storage in a Salesforce org is described in more detail here.
Once Gearset has finished backing up the record data, the run's status will be ‘Record data complete’. At this point, Gearset will have backed up all records classified as "data storage", but no files.
From there, Gearset continues backing up the files in the background without interfering with new runs of the job, whether a run is started manually via the "Back up now" button or automatically at the scheduled start time. Each new run pauses the background file backup, backs up the record data as normal, and is given the ‘Record data complete’ status; the file backup then resumes in the background. This cycle continues until all files have been successfully backed up once. At that point, the baseline has been fully established, and the job can continue as usual.
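The cycle described above can be pictured as a small state model. The following is only an illustrative sketch, not Gearset's actual implementation: the class name, the method names and the idea of counting files in fixed-size batches are all assumptions made for the example. Only the two status strings come from this article.

```python
from dataclasses import dataclass

# Status strings as they appear in this article.
RECORD_DATA_COMPLETE = "Record data complete"
COMPLETED = "Completed"


@dataclass
class BackupJob:
    """Toy model of a data backup job (hypothetical, for illustration only)."""

    files_remaining: int  # files not yet captured in the baseline
    baseline_established: bool = False

    def run(self) -> str:
        """Simulate one run of the job and return the status it ends with."""
        # Record data is always backed up in full first (not modelled here).
        if self.baseline_established:
            # Baseline done: the run backs up records and files completely.
            return COMPLETED
        # Baseline in progress: the run ends once record data is backed up;
        # files carry on in the background afterwards.
        return RECORD_DATA_COMPLETE

    def background_file_backup(self, batch: int) -> None:
        """Simulate the background file backup that happens between runs."""
        self.files_remaining = max(0, self.files_remaining - batch)
        if self.files_remaining == 0:
            # Every file has now been backed up once.
            self.baseline_established = True


# Example: 250 files, with 100 backed up in the background between runs.
job = BackupJob(files_remaining=250)
statuses = []
for _ in range(4):
    statuses.append(job.run())
    job.background_file_backup(batch=100)

for number, status in enumerate(statuses, start=1):
    print(f"Run {number}: {status}")
```

In this sketch, the first three runs end as ‘Record data complete’ because files are still outstanding, and the fourth run ends as 'Completed' once the background backup has caught up.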
"As usual" here means that each run of the job will back up both the record and file storage completely before having the status of 'Completed'.
Notes
For a run with the ‘Record data complete’ status, you'll still be able to perform all the actions that you're able to perform on a completed run (such as replicating or downloading metadata and data). However, if these operations involve file storage objects, then caution is advised as these objects may not have been fully backed up. For example: you may not be able to restore a deleted Attachment record if Gearset has not backed up this record yet.
While the baseline is being established, it's not possible to edit the job's filter. Once the baseline has been established, you'll be able to edit the job's filter at any time.
High Frequency jobs do not use this strategy for establishing a baseline (and therefore will never appear as ‘Record data complete’). The object limit on these jobs means that they don’t need to process as large a volume of data, so there's no need for this strategy.
If you have any questions, feel free to reach out to us via the in-app chat.