🧹

Data Management: Import, Export, and Quality (Master Class)

Mass Import/Export, validation rules, and data cleansing tools.

⏱️ Estimated reading time: 30 minutes

Import Tools: Data Import Wizard vs. Data Loader

This is the most critical comparison for the exam. You must memorize the differences.

Data Import Wizard (Native):
- Limit: Up to 50,000 records.
- Supported Objects: Accounts, Contacts, Leads, Solutions, Campaign Members, and ALL Custom Objects. (Note: Does NOT support Opportunities, Cases, or Products).
- Pros: Visual UI, no installation needed, simple deduplication, and an option to skip firing workflow rules and processes during the import (Apex triggers still run).

Data Loader (Desktop Client):
- Limit: Up to 5 million records.
- Supported Objects: All objects (including Opportunities, Cases, Products, Users).
- Pros: Supports all operations (Insert, Update, Upsert, Delete, Hard Delete, Export). Allows automation via CLI.

🎯 Key Points

  • ✓ If you need to upload Opportunities -> Data Loader (Wizard doesn't support it)
  • ✓ If you need to upload 60,000 records -> Data Loader (50k limit)
  • ✓ Data Loader requires 'Security Token' if IP is not in Trusted IP Ranges
  • ✓ Import Wizard allows mapping Contacts and Leads by Name, Email, or Salesforce ID
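
The decision rules above can be written down as a small lookup. This is only a study-aid sketch: `choose_import_tool` and the object set are hypothetical names that encode the limits quoted in this section, not any Salesforce API.

```python
# Study-aid sketch: encodes only the limits and object lists stated in this section.
WIZARD_LIMIT = 50_000          # Data Import Wizard record limit
LOADER_LIMIT = 5_000_000       # Data Loader record limit
WIZARD_STANDARD_OBJECTS = {"Account", "Contact", "Lead", "Solution", "CampaignMember"}


def choose_import_tool(object_name: str, record_count: int, is_custom: bool = False) -> str:
    """Return the tool this section recommends for a given load (hypothetical helper)."""
    if record_count > LOADER_LIMIT:
        return "Split into smaller loads"
    wizard_supports_object = is_custom or object_name in WIZARD_STANDARD_OBJECTS
    if wizard_supports_object and record_count <= WIZARD_LIMIT:
        return "Data Import Wizard"
    return "Data Loader"


print(choose_import_tool("Opportunity", 1_000))                   # unsupported object
print(choose_import_tool("Lead", 60_000))                         # over the 50k limit
print(choose_import_tool("Invoice__c", 20_000, is_custom=True))   # custom object, small load
```

`Invoice__c` is a made-up custom object used purely for illustration.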

Data Operations: Upsert and Hard Delete

Upsert (Update + Insert): A single operation that updates a record when a match is found and inserts a new record when it is not.
- Requirement: Unique matching field. Can be the Salesforce ID or an External ID.
- Scenario: You have a product list from an ERP with its own IDs (e.g., PROD-123). Use Upsert based on that External ID to avoid duplicates.
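
To make the matching logic concrete, here is a minimal in-memory sketch of what an upsert does conceptually. It is not how Data Loader is implemented; the `upsert` function and the `ERP_Id__c` field name are assumptions made for illustration.

```python
def upsert(table: dict, rows: list, external_id_field: str) -> dict:
    """Update rows whose external ID already exists in `table`, insert the rest."""
    counts = {"inserted": 0, "updated": 0}
    for row in rows:
        key = row[external_id_field]
        if key in table:
            table[key].update(row)      # match found: update in place
            counts["updated"] += 1
        else:
            table[key] = dict(row)      # no match: create a new record
            counts["inserted"] += 1
    return counts


# Existing "org" data, keyed by the hypothetical External ID field ERP_Id__c.
products = {"PROD-123": {"ERP_Id__c": "PROD-123", "Name": "Widget", "Price": 10}}

erp_rows = [
    {"ERP_Id__c": "PROD-123", "Name": "Widget", "Price": 12},   # already exists
    {"ERP_Id__c": "PROD-456", "Name": "Gadget", "Price": 30},   # new
]

result = upsert(products, erp_rows, "ERP_Id__c")
print(result)  # {'inserted': 1, 'updated': 1}
```

Running the same file a second time would update both rows and insert none, which is why upsert on an External ID avoids duplicates.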

Hard Delete: Data Loader only (available when 'Use Bulk API' is enabled in its settings). Deletes records permanently, bypassing the Recycle Bin. Danger: Irreversible. Requires the 'Bulk API Hard Delete' permission.

🎯 Key Points

  • ✓ External ID: Custom field with 'External ID' attribute checked (Max 25 per object)
  • ✓ 15-char IDs (UI, Case-sensitive) vs 18-char IDs (API/Loader, Case-insensitive)
  • ✓ Export All: Extracts active records AND archived/deleted records (in bin)
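
The 18-character form is the 15-character ID plus a 3-character suffix that encodes which letters are uppercase, which is what makes it safe in case-insensitive tools. The sketch below follows the commonly documented suffix scheme; `to_18_char_id` is a hypothetical helper name and the sample IDs are made up.

```python
SUFFIX_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"  # 32 symbols, one per 5-bit value


def to_18_char_id(id15: str) -> str:
    """Append the 3-character case suffix to a 15-character ID."""
    if len(id15) != 15:
        raise ValueError("expected a 15-character ID")
    suffix = ""
    for start in range(0, 15, 5):
        chunk = id15[start:start + 5]
        # Bit i is set when the i-th character of the chunk is an uppercase letter.
        bits = sum(1 << i for i, ch in enumerate(chunk) if "A" <= ch <= "Z")
        suffix += SUFFIX_ALPHABET[bits]
    return id15 + suffix


print(to_18_char_id("001000000000001"))  # no uppercase letters -> suffix AAA
print(to_18_char_id("A00000000000000"))  # uppercase in first position -> suffix BAA
```

When comparing IDs in a spreadsheet (for example with VLOOKUP, which ignores case), use the 18-character form on both sides.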

Duplicate Management (Matching & Duplicate Rules)

Native system to prevent dirty data in real-time.

1. Matching Rules (The Detector): Define *how* to identify a duplicate (e.g., Exact Email + Fuzzy Last Name).
2. Duplicate Rules (The Police): Define *what to do* when the Matching Rule finds something.
- Action on Create/Edit: Block or Allow with Alert.
- Report: If allowed, you can check 'Report' to add it to a 'Duplicate Record Set' for later review.

🎯 Key Points

  • ✓ Cross-Object Matching: Detect if a duplicate Lead already exists as a Contact
  • ✓ To Merge duplicates, you need Delete permission on the object
  • ✓ Duplicate rules also run during API/Data Loader imports (can block bulk loads)
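
The split between detector and police can be sketched in a few lines. This is a conceptual illustration only: Salesforce's fuzzy matching algorithms are not reproduced here, and Python's `difflib` similarity ratio with a 0.8 threshold is a stand-in assumption. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def is_match(a: dict, b: dict, fuzzy_threshold: float = 0.8) -> bool:
    """Matching rule (the detector): exact Email AND fuzzy Last Name."""
    same_email = a["Email"].strip().lower() == b["Email"].strip().lower()
    similarity = SequenceMatcher(None, a["LastName"].lower(), b["LastName"].lower()).ratio()
    return same_email and similarity >= fuzzy_threshold


def apply_duplicate_rule(new_record: dict, existing: list, action: str = "block") -> str:
    """Duplicate rule (the police): decide what happens when the detector fires."""
    duplicates = [r for r in existing if is_match(new_record, r)]
    if not duplicates:
        return "saved"
    return "blocked" if action == "block" else "saved with alert"


contacts = [{"Email": "ana@example.com", "LastName": "Johnson"}]
incoming = {"Email": "ANA@example.com", "LastName": "Jonson"}

print(apply_duplicate_rule(incoming, contacts, action="block"))
print(apply_duplicate_rule(incoming, contacts, action="allow"))
```

The same matching rule can back several duplicate rules, each with a different action.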

Data Quality: Picklists and Validation

Restricted Picklists: If the list is restricted, API/Data Loader will fail if you try to upload a value not in the config. If 'Unrestricted', the value is added to the record (but not to the list config).
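
Because a restricted picklist rejects unknown values at load time, it can help to check the CSV before loading. The following is a minimal pre-check sketch; the `Industry` column, the allowed values, and the function name are assumptions for illustration, and the real list would come from your org's field configuration.

```python
import csv
import io


def find_rejected_values(csv_text: str, column: str, allowed: set, restricted: bool = True) -> list:
    """Return (line number, value) pairs that a restricted picklist would reject."""
    if not restricted:
        return []  # unrestricted picklists accept the value on the record
    rejected = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        value = row[column]
        if value and value not in allowed:
            rejected.append((line_no, value))
    return rejected


sample = "Name,Industry\nAcme,Banking\nGlobex,Bnking\n"
allowed_values = {"Banking", "Retail"}

print(find_rejected_values(sample, "Industry", allowed_values))
print(find_rejected_values(sample, "Industry", allowed_values, restricted=False))
```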

State and Country Picklists: Standardizes addresses. Requires mapping old text data to new ISO values before enabling. Once active, using list values is mandatory.

🎯 Key Points

  • ✓ Global Value Sets: Allow sharing the same list of values across different objects
  • ✓ Field Dependencies: The 'Controlling' field filters the values of the 'Dependent' field. (Standard picklists can be controlling fields but not dependent fields, so a standard picklist can control a custom one, but not vice versa)
  • ✓ Validation Rules run DURING DATA LOAD. If a row fails validation, that row errors out.

Storage Limits and Troubleshooting

Salesforce splits storage into two buckets:

1. Data Storage: Records (Accounts, Contacts, Opportunities). Most count as 2 KB each. (Exception: Person Accounts count as 4 KB because each one is an Account plus a Contact).
2. File Storage: Files, Documents, Attachments, Photos.
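
A quick way to estimate Data Storage usage from the per-record sizes above. This is a rough sketch that assumes 1 MB = 1024 KB and covers only the two sizes mentioned in this section; `data_storage_mb` is a hypothetical helper.

```python
KB_PER_RECORD = 2           # most records, per this section
KB_PER_PERSON_ACCOUNT = 4   # Account + Contact


def data_storage_mb(standard_records: int, person_accounts: int = 0) -> float:
    """Estimate Data Storage in MB (assumes 1 MB = 1024 KB)."""
    total_kb = standard_records * KB_PER_RECORD + person_accounts * KB_PER_PERSON_ACCOUNT
    return total_kb / 1024


# 500,000 ordinary records plus 10,000 Person Accounts
print(data_storage_mb(500_000, 10_000))  # 1015.625
```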

Troubleshooting:
- Import Error: 'Owner ID not found' -> Owner user is inactive or ID is wrong.
- Date Error: CSV format doesn't match the loading user's Locale.
- Validation Rules: If a rule blocks a necessary bulk load, the admin can deactivate it temporarily and reactivate it once the load finishes.

🎯 Key Points

  • ✓ If you hit Data Storage limits, you cannot create new records until you delete records or buy more storage
  • ✓ Mass Delete and Data Loader (Hard Delete) are ways to free up space
  • ✓ System Validation (e.g., required field, data type) happens BEFORE Custom Validation Rules

Backup and Recovery

Salesforce DOES NOT offer free granular restore.

Data Export Service: Native backup (Weekly/Monthly). Generates ZIPs with CSVs. 48-hour download window.
Recycle Bin: Retains deleted records for 15 days.
- When restoring a parent, children are restored (Lookup/Master-Detail).
- Limitation: Bin size is limited to roughly 25 records per MB of storage (25 x your storage capacity in MB). If the bin is full, the oldest records are purged before the 15 days are up.
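
The bin-size rule is simple arithmetic, sketched here with hypothetical helper names. The 2,000 MB figure is just an example org size.

```python
def recycle_bin_capacity(storage_mb: int) -> int:
    """Approximate number of records the Recycle Bin can hold: 25 x storage in MB."""
    return 25 * storage_mb


def records_purged_early(storage_mb: int, records_in_bin: int) -> int:
    """How many of the oldest records would be purged if the bin overflows."""
    return max(0, records_in_bin - recycle_bin_capacity(storage_mb))


print(recycle_bin_capacity(2_000))            # 50000
print(records_purged_early(2_000, 52_500))    # 2500
```

This is why a large mass delete should not be treated as a safety net: anything beyond the capacity is gone before the 15-day window ends.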

🎯 Key Points

  • ✓ Export On Demand: Allows immediate data export (1 time every 7 days for Weekly)
  • ✓ Restoring a record DOES NOT restore its 'Field History' records automatically
  • ✓ A Sandbox Refresh is not a backup; it overwrites the sandbox