Why does Gearset exclude duplicate records during a data deployment?

Duplicate records cause errors with data deployments, so Gearset now detects them in the source and excludes them

Written by Tom Smith
Updated over a week ago

During any data migration where we want to update existing records, we need a way to match records in our source data with the same records in our target data.

In Gearset, we choose an external ID field to match records in the source org with records in the target org. Gearset then tells Salesforce to use this external ID field when performing the upsert.

For example, we can use the Email field to match Contact records.
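
To illustrate the matching idea, here is a minimal in-memory sketch in Python. It is not Gearset's or Salesforce's implementation: the function name `plan_upsert`, the record dictionaries, and the sample IDs are all hypothetical, and the sketch assumes each Email value appears at most once on each side.

```python
def plan_upsert(source_records, target_records, external_id_field):
    """Split source records into updates and inserts by external ID.

    Hypothetical helper for illustration only; assumes the external ID
    value is unique within the source and within the target.
    """
    # Index target records by their external ID value.
    target_by_key = {rec[external_id_field]: rec for rec in target_records}

    updates, inserts = [], []
    for rec in source_records:
        key = rec[external_id_field]
        if key in target_by_key:
            # A matching target record exists, so the upsert updates it.
            updates.append((target_by_key[key]["Id"], rec))
        else:
            # No match in the target, so the upsert creates a new record.
            inserts.append(rec)
    return updates, inserts


source = [
    {"Email": "ada@example.com", "LastName": "Lovelace"},
    {"Email": "alan@example.com", "LastName": "Turing"},
]
target = [{"Id": "003A", "Email": "ada@example.com", "LastName": "Byron"}]

updates, inserts = plan_upsert(source, target, "Email")
```

Here the first source record would update the existing target Contact, and the second would be inserted as a new Contact.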

Every record in the target org falls into one of these situations:

  1. No records in the source org match the record in the target

  2. Exactly one record in the source org matches the record in the target (one-to-one relationship)

  3. Multiple records in the source org match the record in the target (many-to-one relationship)

  4. One record in the source org matches multiple records in the target (one-to-many relationship)
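
The four situations above can be sketched by counting how often each external ID value appears on each side. This is a hypothetical illustration, not Gearset's code: `classify_matches` and the labels it returns are our own names, and when a value is duplicated on both sides the sketch simply reports it as many-to-one.

```python
from collections import Counter


def classify_matches(source_records, target_records, external_id_field):
    """Label each external ID value found in the target with its situation."""
    source_counts = Counter(rec[external_id_field] for rec in source_records)
    target_counts = Counter(rec[external_id_field] for rec in target_records)

    situations = {}
    for key, n_target in target_counts.items():
        n_source = source_counts[key]  # Counter returns 0 for missing keys
        if n_source == 0:
            situations[key] = "no match"      # situation 1
        elif n_source > 1:
            situations[key] = "many-to-one"   # situation 3
        elif n_target > 1:
            situations[key] = "one-to-many"   # situation 4
        else:
            situations[key] = "one-to-one"    # situation 2
    return situations


source = [{"Email": e} for e in ["a@x.com", "b@x.com", "b@x.com", "c@x.com"]]
target = [{"Email": e} for e in ["a@x.com", "b@x.com", "c@x.com", "c@x.com", "d@x.com"]]

result = classify_matches(source, target, "Email")
```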

In situation number 3, a many-to-one relationship, attempting the upsert would cause Salesforce to throw a DUPLICATE_VALUE error. Salesforce wouldn't update the record in the target, because it wouldn't know which source record to use for the upsert.

To avoid these errors, before telling Salesforce to upsert records, Gearset now detects any records in the source that share the same value for the external ID field and excludes those duplicate records, so they will not be deployed. In the data deployment results, you will see an "exclude duplicate references" step.
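
The duplicate detection described above can be sketched as follows. This is only an illustration under stated assumptions, not Gearset's actual implementation: `exclude_duplicates` is a hypothetical name, and we assume that every source record sharing a duplicated external ID value is excluded, rather than one of them being kept.

```python
from collections import Counter


def exclude_duplicates(source_records, external_id_field):
    """Separate source records with a unique external ID from duplicates.

    Assumption: all records that share an external ID value are excluded.
    """
    counts = Counter(rec[external_id_field] for rec in source_records)

    # Records whose external ID value appears exactly once can be upserted.
    kept = [r for r in source_records if counts[r[external_id_field]] == 1]
    # Records whose value appears more than once are left out of the deployment.
    excluded = [r for r in source_records if counts[r[external_id_field]] > 1]
    return kept, excluded


source = [
    {"Email": "a@x.com", "LastName": "Unique"},
    {"Email": "b@x.com", "LastName": "First copy"},
    {"Email": "b@x.com", "LastName": "Second copy"},
]

kept, excluded = exclude_duplicates(source, "Email")
```

Only the record with the unique Email would go on to the upsert; the two records sharing `b@x.com` would be reported as excluded.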

In situation number 4, a one-to-many relationship, a DUPLICATE_EXTERNAL_ID error will occur. You can find more info about this error here.
