Update duplicate handling docs to include pension tables

- Document composite primary key handling for pension tables
- Add code examples for both single and composite key duplicate detection
- List all pension-related tables with duplicate protection
HotSwapp
2025-10-13 09:36:09 -05:00
parent c3bbf927a5
commit 02d439cf8b

@@ -22,16 +22,30 @@ The import system now implements multiple layers of duplicate protection:
### 1. In-Memory Duplicate Tracking
```python
# For single primary key
seen_in_import = set()
# For composite primary key (e.g., file_no + version)
seen_in_import = set() # stores tuples like (file_no, version)
composite_key = (file_no, version)
```
Tracks IDs or composite key combinations encountered during the current import session. If a key is seen twice in the same file, only the first occurrence is imported.
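
As a minimal sketch of the tracking described above (not the importer's actual code), the same set can hold either scalar IDs or tuples. The helper name `dedupe_rows`, the `key_fields` parameter, and the sample rows are hypothetical and exist only for illustration.

```python
def dedupe_rows(rows, key_fields):
    """Keep the first row for each key; count later repeats as skipped."""
    seen_in_import = set()
    kept = []
    skipped = 0
    for row in rows:
        # One key field gives a scalar key; several fields give a tuple.
        if len(key_fields) == 1:
            key = row[key_fields[0]]
        else:
            key = tuple(row[field] for field in key_fields)
        if key in seen_in_import:
            skipped += 1
            continue
        seen_in_import.add(key)
        kept.append(row)
    return kept, skipped


# Hypothetical sample data: the third row repeats the first row's key.
rows = [
    {"file_no": "A1", "version": "1", "plan": "first"},
    {"file_no": "A1", "version": "2", "plan": "second"},
    {"file_no": "A1", "version": "1", "plan": "repeat"},
]
kept, skipped = dedupe_rows(rows, ("file_no", "version"))
```

Tuples are hashable, so a composite key needs no separate data structure; the same membership test works for both cases.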
### 2. Database Existence Check
Before importing each record, the importer checks whether it already exists in the database:
```python
# For single primary key (e.g., rolodex)
if db.query(Rolodex).filter(Rolodex.id == rolodex_id).first():
    result['skipped'] += 1
    continue

# For composite primary key (e.g., pensions with file_no + version)
if db.query(Pensions).filter(
    Pensions.file_no == file_no,
    Pensions.version == version
).first():
    result['skipped'] += 1
    continue
```
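
The fragment above runs inside the import loop. To show the composite-key check end to end without the project's models, here is a self-contained sketch that swaps in the standard-library `sqlite3` module for the ORM. The table schema, the column types, the `pension_exists` helper, and the `imported` counter are assumptions made for illustration.

```python
import sqlite3

# In-memory database with an assumed two-column composite primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pensions ("
    " file_no TEXT NOT NULL,"
    " version TEXT NOT NULL,"
    " PRIMARY KEY (file_no, version))"
)
conn.execute("INSERT INTO pensions VALUES ('A1', '1')")


def pension_exists(conn, file_no, version):
    """Return True if a row with this (file_no, version) pair exists."""
    row = conn.execute(
        "SELECT 1 FROM pensions WHERE file_no = ? AND version = ? LIMIT 1",
        (file_no, version),
    ).fetchone()
    return row is not None


result = {"imported": 0, "skipped": 0}
for file_no, version in [("A1", "1"), ("A1", "2")]:
    # Both key columns must match for the record to count as a duplicate.
    if pension_exists(conn, file_no, version):
        result["skipped"] += 1
        continue
    conn.execute("INSERT INTO pensions VALUES (?, ?)", (file_no, version))
    result["imported"] += 1
```

Here `("A1", "1")` is skipped because it is already stored, while `("A1", "2")` is imported: sharing a `file_no` alone does not make a record a duplicate.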
### 3. Graceful Batch Failure Handling
@@ -74,6 +88,10 @@ This is **not an error** - it means the system is protecting data integrity.
Currently implemented for:
- `rolodex` (primary key: `id`)
- `filetype` (primary key: `file_type`)
- `pensions` (composite primary key: `file_no`, `version`)
- `pension_death` (composite primary key: `file_no`, `version`)
- `pension_separate` (composite primary key: `file_no`, `version`)
- `pension_results` (composite primary key: `file_no`, `version`)
Other tables should get the same protection if they run into similar duplicate-key issues.