How to Handle Deleted Records and Tombstones
Level: intermediate · ~17 min read · Intent: informational
Key takeaways
- Deletion is one of the easiest parts of sync design to ignore and one of the most likely to corrupt downstream systems when left undefined.
- A tombstone is a deletion marker that tells downstream systems a record existed and was removed. It helps prevent deleted records from being silently recreated during later syncs or replays.
- Good deletion handling depends on source-of-truth rules, retention windows, downstream behavior definitions, and a clear choice between soft delete, hard delete, archive, or tombstone propagation.
- If a workflow has no explicit deletion strategy, the system usually invents one through drift: orphaned records, ghost records, or accidental resurrection.
FAQ
What is a tombstone in data sync?
A tombstone is a marker that says a record once existed but has been deleted. It lets downstream systems know the deletion is real and should be handled intentionally rather than treated like missing data.
Why are deleted records hard to handle in integrations?
Because a deleted record may disappear from the source before downstream systems can understand what happened. Without explicit deletion signals, sync workflows often leave stale records behind or recreate them later by mistake.
Should deleted records always be removed everywhere?
Not always. Some systems should archive, anonymize, soft delete, or retain a history marker instead of physically deleting the record. The right behavior depends on business rules, compliance needs, and the system's role.
What happens if a workflow ignores deletions?
It often creates stale or ghost records, bad reporting, inconsistent state across tools, or accidental resurrection when older data gets replayed later.
Deletion looks simple until a workflow spans several systems.
Then a record can disappear in one place and linger in three others.
Or worse:
- it gets recreated by the next sync
- it stays in reports forever
- or a replayed batch brings it back after everyone thought it was gone
That is why deletion handling needs explicit design.
Why this lesson matters
Many teams plan how records get created and updated but never define how records leave the system.
That creates a dangerous gap.
If the workflow does not know what deletion means, each connected system starts making its own guess.
Those guesses usually do not match.
The short answer
Handling deleted records well means deciding:
- how deletion is represented
- which system is allowed to initiate it
- how downstream systems should react
- how long deletion evidence should remain available
Tombstones are one of the main tools for this because they preserve a deletion signal even after the original record is gone.
What a tombstone is
A tombstone is a deletion marker.
Instead of acting like the record never existed, the system keeps a lightweight signal that says:
- this record existed
- this record was deleted
- downstream systems should treat that deletion intentionally
That signal is often much safer than simple disappearance.
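As a rough illustration, a tombstone can be as small as a few fields. This is a minimal sketch in Python; the field names (`record_id`, `source_system`, `deleted_at`, `reason`) are hypothetical, not a standard schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Tombstone:
    """Lightweight deletion marker that survives after the record itself is gone."""
    record_id: str           # identifier of the deleted record
    source_system: str       # system that authorized the deletion
    deleted_at: datetime     # when the deletion happened
    reason: str = "deleted"  # optional context for downstream consumers

# Example: the CRM deletes a contact and emits a tombstone instead of silence.
ts = Tombstone("contact-42", "crm", datetime.now(timezone.utc))
print(ts.record_id, ts.source_system)
```

The point is that the marker carries enough identity and provenance for a downstream system to act on the deletion deliberately, even long after the source record is gone.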
Why simple disappearance is dangerous
If a record just vanishes from the source, downstream systems may not know whether:
- it was intentionally deleted
- it fell out of the query window
- the sync filtered it out
- the source API failed to return it
- or the record never existed there at all
That ambiguity is what makes deletion handling tricky.
Soft delete, hard delete, archive, and tombstone are not the same
These models solve different problems.
Soft delete
The record remains, but is marked inactive or deleted.
Hard delete
The record is physically removed.
Archive
The record stays for history or reporting but is excluded from active operations.
Tombstone
A deletion marker survives long enough to tell other systems what happened.
Good workflows choose among these intentionally instead of mixing them accidentally.
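One way to keep the choice explicit is to name the four models in code and route each deletion through a single function. This is a sketch under simplified assumptions (an in-memory dict as the store); real systems would dispatch to their own storage layer.

```python
from enum import Enum

class DeletionMode(Enum):
    SOFT_DELETE = "soft_delete"  # record kept, flagged inactive
    HARD_DELETE = "hard_delete"  # record physically removed
    ARCHIVE = "archive"          # kept for history, excluded from active operations
    TOMBSTONE = "tombstone"      # record replaced by a deletion marker

def apply_deletion(store: dict, record_id: str, mode: DeletionMode) -> None:
    """Apply one explicitly chosen deletion model instead of an accidental mix."""
    if mode is DeletionMode.HARD_DELETE:
        store.pop(record_id, None)
    elif mode is DeletionMode.SOFT_DELETE:
        store[record_id]["deleted"] = True
    elif mode is DeletionMode.ARCHIVE:
        store[record_id]["archived"] = True
    elif mode is DeletionMode.TOMBSTONE:
        store[record_id] = {"tombstone": True}

store = {"r1": {"name": "Ada"}}
apply_deletion(store, "r1", DeletionMode.SOFT_DELETE)
print(store["r1"])  # record remains, but is flagged as deleted
```

Forcing every deletion through one entry point makes "which model are we using here?" a reviewable decision rather than an accident.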
Start with source-of-truth rules
The first question is:
Which system is allowed to decide that this record is deleted?
If that is unclear, downstream systems may fight with each other.
Examples:
- CRM decides contact lifecycle
- ecommerce platform decides order and refund state
- identity provider decides account deprovisioning
Deletion handling gets much safer when that authority is explicit.
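That authority can be encoded as a simple lookup so the workflow rejects deletions from the wrong system. The mapping below is hypothetical; the record types and system names are placeholders for whatever the real workflow uses.

```python
# Hypothetical mapping: each record type has exactly one system allowed to
# initiate its deletion (the source of truth for that lifecycle).
DELETION_AUTHORITY = {
    "contact": "crm",
    "order": "ecommerce",
    "account": "identity_provider",
}

def may_delete(record_type: str, requesting_system: str) -> bool:
    """Reject deletions that do not come from the designated source of truth."""
    return DELETION_AUTHORITY.get(record_type) == requesting_system

print(may_delete("contact", "crm"))        # True: CRM owns contact lifecycle
print(may_delete("contact", "ecommerce"))  # False: not the authority
```

A guard like this turns "downstream systems fighting with each other" into an explicit, logged rejection.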
Decide the downstream behavior
Deletion in one system does not always mean hard deletion everywhere else.
Possible downstream actions include:
- hard delete
- soft delete
- archive
- anonymize
- mark inactive
- keep history but remove from active sync
The right choice depends on:
- reporting needs
- legal retention
- operational visibility
- audit requirements
Tombstone retention matters
A tombstone only helps while it still exists.
If it disappears too quickly, late replays or delayed syncs may no longer understand that the record was intentionally removed.
That is why the workflow needs a retention window that reflects:
- replay risk
- sync frequency
- outage recovery expectations
- downstream processing delay
This is especially important in asynchronous systems.
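Those four factors can be combined into a back-of-the-envelope retention calculation. The formula below is a sketch under stated assumptions (retention should outlive the slowest path that could still replay the record, padded by a safety factor), not an established standard.

```python
from datetime import timedelta

def tombstone_retention(max_replay: timedelta,
                        sync_interval: timedelta,
                        outage_recovery: timedelta,
                        processing_delay: timedelta,
                        safety_factor: float = 2.0) -> timedelta:
    """Retention must outlive the slowest path that could still replay the record."""
    worst_case = max(max_replay,
                     sync_interval + outage_recovery + processing_delay)
    return worst_case * safety_factor

window = tombstone_retention(
    max_replay=timedelta(days=7),      # oldest batch that might be replayed
    sync_interval=timedelta(hours=1),
    outage_recovery=timedelta(days=3), # longest expected outage backlog
    processing_delay=timedelta(hours=6),
)
print(window)  # 14 days, 0:00:00
```

Whatever formula the team lands on, the key property is the same: the tombstone must still exist when the slowest plausible consumer finally sees the deletion.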
Watch out for accidental resurrection
One of the biggest deletion failures is resurrection.
That happens when a deleted record gets recreated because:
- an old batch replays
- a stale source exports it again
- a downstream system still thinks it is active
- the sync logic interprets absence incorrectly
This is where tombstones and idempotent replay logic help each other.
The system needs enough history to recognize that this record should stay deleted.
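The combination can be sketched as an upsert that consults tombstones before writing, so a replayed or stale write for a deleted record is dropped instead of resurrecting it. This is a minimal in-memory illustration; `store` and `tombstones` stand in for whatever persistence the real workflow uses.

```python
def upsert(store: dict, tombstones: set, record_id: str, payload: dict) -> bool:
    """Idempotent upsert that refuses to resurrect a deleted record.

    Returns True if the write was applied, False if it was dropped
    because a tombstone marks the record as intentionally deleted.
    """
    if record_id in tombstones:
        return False  # deletion wins: drop the stale or replayed write
    store[record_id] = payload
    return True

store = {}
tombstones = {"r1"}  # r1 was deleted earlier; the tombstone is still retained

print(upsert(store, tombstones, "r1", {"name": "Ada"}))    # False: stays deleted
print(upsert(store, tombstones, "r2", {"name": "Grace"}))  # True: normal write
```

Because the check is idempotent, the same old batch can replay any number of times without bringing the record back, as long as the tombstone is still within its retention window.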
Common mistakes
Mistake 1: No deletion strategy at all
This is how ghost records are born.
Mistake 2: Assuming absence always means deletion
Sometimes it only means the query or export changed.
Mistake 3: Hard deleting everywhere by default
That can destroy history or violate downstream business rules.
Mistake 4: Dropping tombstones too early
Late replays and slow syncs then lose the deletion signal.
Mistake 5: No source-of-truth rule
If several systems can independently "restore" the record, deletion becomes unstable.
Final checklist
To handle deletions safely, ask:
- Which system is the source of truth for deletion?
- How is deletion represented: soft delete, hard delete, archive, or tombstone?
- What should downstream systems do when the deletion arrives?
- How long must deletion markers be retained?
- How will the workflow prevent accidental resurrection during replay or backfill?
- What records need historical retention even after deletion?
If those answers are unclear, deletion behavior is probably one of the hidden weak spots in the workflow.
Final thoughts
Deleted records are one of the clearest examples of why data sync is more than copying fields.
The workflow needs memory, intent, and policy around what disappearance really means.
When those rules are explicit, cross-system data stays much cleaner and recovery becomes much less dangerous.
About the author
Elysiate publishes practical guides and privacy-first tools for data workflows, developer tooling, SEO, and product engineering.