Fix PHONE.csv import duplicate constraint error

- Implement upsert logic in import_phone() function
- Check for existing (id, phone) combinations before insert
- Track duplicates within CSV to skip gracefully
- Update existing records instead of failing on duplicates
- Add detailed statistics: inserted, updated, skipped counts
- Align with upsert pattern used in other import functions
- Add documentation in docs/PHONE_IMPORT_FIX.md

Fixes: "UNIQUE constraint failed: phone.id, phone.phone" error
when re-importing or uploading a CSV with duplicate entries.
Author: HotSwapp
Date: 2025-10-12 21:45:30 -05:00
Parent: 22e99d27ed
Commit: 63809d46fb
62 changed files with 500808 additions and 4269 deletions

docs/PHONE_IMPORT_FIX.md (new file, 86 lines)
# Phone Import Unique Constraint Fix
## Issue
When uploading `PHONE.csv`, the import would fail with a SQLite integrity error:
```
UNIQUE constraint failed: phone.id, phone.phone
```
The error first occurred at row 507 and then cascaded to every subsequent row, because the failed insert rolled back the surrounding transaction.
## Root Cause
The `LegacyPhone` model has a **composite primary key** on `(id, phone)` to prevent duplicate phone number entries for the same person/entity. The original `import_phone()` function used bulk inserts without checking for existing records, causing the constraint violation when:
1. Re-importing the same CSV file
2. The CSV contains duplicate `(id, phone)` combinations
3. Partial imports left some data in the database
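For reference, a minimal SQLAlchemy sketch of a composite primary key like the one on `LegacyPhone` (the real definition lives in `/app/models.py`; the column types and the `location` column's exact shape here are assumptions):
```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class LegacyPhone(Base):
    __tablename__ = "phone"

    # Composite primary key: one id may own several phone numbers,
    # but each (id, phone) pair must be unique.
    id = Column(Integer, primary_key=True)
    phone = Column(String, primary_key=True)
    location = Column(String)  # type assumed for illustration
```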
## Solution
Updated `import_phone()` in `/app/import_legacy.py` to implement an **upsert strategy**:
### Changes Made
1. **Check for duplicates within CSV**: Track seen `(id, phone)` combinations to skip duplicates in the same import
2. **Check database for existing records**: Query for existing `(id, phone)` before inserting
3. **Update or Insert**:
- If record exists → update the `location` field
- If record doesn't exist → insert new record
4. **Enhanced error handling**: Rollback only the failed row, not the entire batch
5. **Better logging**: Track `inserted`, `updated`, and `skipped` counts separately
### Code Changes
```python
# Before: bulk insert without checking for existing rows
db.bulk_save_objects(batch)
db.commit()

# After: upsert with duplicate handling
existing = db.query(LegacyPhone).filter(
    LegacyPhone.id == rolodex_id,
    LegacyPhone.phone == phone,
).first()
if existing:
    existing.location = clean_string(row.get('Location'))
    result['updated'] += 1
else:
    record = LegacyPhone(...)
    db.add(record)
    result['inserted'] += 1
```
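Putting the pieces together, the overall loop could look roughly like the sketch below (the CSV column names `Id` and `Phone`, the function signature, and the per-row commit strategy are assumptions; `LegacyPhone` and `clean_string` are the app's own names):
```python
import csv
from typing import IO

# LegacyPhone and clean_string are assumed to be imported from the
# application (see /app/models.py and /app/import_legacy.py).

def import_phone(db, csvfile: IO[str]) -> dict:
    """Upsert PHONE.csv rows: skip in-file duplicates, update existing
    (id, phone) pairs, insert the rest, and report per-category counts."""
    result = {"success": 0, "inserted": 0, "updated": 0,
              "skipped": 0, "errors": [], "total_rows": 0}
    seen = set()  # (id, phone) pairs already handled in this import

    for row_num, row in enumerate(csv.DictReader(csvfile), start=2):
        result["total_rows"] += 1
        try:
            rolodex_id = int(row["Id"])          # column names assumed
            phone = clean_string(row.get("Phone"))
            key = (rolodex_id, phone)
            if key in seen:                      # duplicate inside the CSV
                result["skipped"] += 1
                continue
            seen.add(key)

            existing = db.query(LegacyPhone).filter(
                LegacyPhone.id == rolodex_id,
                LegacyPhone.phone == phone,
            ).first()
            if existing:
                existing.location = clean_string(row.get("Location"))
                result["updated"] += 1
            else:
                db.add(LegacyPhone(id=rolodex_id, phone=phone,
                                   location=clean_string(row.get("Location"))))
                result["inserted"] += 1

            db.commit()  # per-row commit so a bad row rolls back alone
            result["success"] += 1
        except Exception as exc:
            db.rollback()  # discard only the failed row's work
            result["errors"].append(f"row {row_num}: {exc}")

    return result
```
A per-row commit is one simple way to satisfy "rollback only the failed row"; a per-row savepoint (`db.begin_nested()`) would achieve the same isolation while keeping a single outer commit.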
## Result Tracking
The function now returns detailed statistics:
- `success`: Total successfully processed rows
- `inserted`: New records added
- `updated`: Existing records updated
- `skipped`: Duplicate combinations within the CSV
- `errors`: List of error messages for failed rows
- `total_rows`: Total rows in CSV
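For example, re-importing an unchanged 1,250-row file that contains one internal duplicate might return something like this (numbers purely illustrative):
```python
{
    "success": 1249,
    "inserted": 0,
    "updated": 1249,
    "skipped": 1,
    "errors": [],
    "total_rows": 1250,
}
```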
## Testing
After deploying this fix:
1. Uploading `PHONE.csv` for the first time will insert all records
2. Re-uploading the same file will update existing records (no errors)
3. Uploading a CSV with internal duplicates will skip duplicates gracefully
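A quick way to verify points 1 and 2 is an idempotency test along these lines (pytest-style sketch; the `db` and `sample_csv` fixtures are hypothetical):
```python
def test_reimport_is_idempotent(db, sample_csv):
    first = import_phone(db, sample_csv)
    assert first["errors"] == []

    sample_csv.seek(0)  # replay the exact same file
    second = import_phone(db, sample_csv)
    assert second["errors"] == []
    assert second["inserted"] == 0
    assert second["updated"] == first["inserted"]
```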
## Consistency with Other Imports
This fix aligns `import_phone()` with the upsert pattern already used in:
- `import_rolodex()` - handles duplicates by ID
- `import_trnstype()` - upserts by T_Type
- `import_trnslkup()` - upserts by T_Code
- `import_footers()` - upserts by F_Code
- And other reference table imports
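All of these share the same shape; shown abstractly (a hypothetical illustration of the pattern, not a helper that exists in the codebase):
```python
def upsert(db, model, lookup: dict, updates: dict, result: dict) -> None:
    """Generic shape of the upsert pattern the import functions share."""
    existing = db.query(model).filter_by(**lookup).first()
    if existing:
        for field, value in updates.items():
            setattr(existing, field, value)
        result["updated"] += 1
    else:
        db.add(model(**lookup, **updates))
        result["inserted"] += 1
```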
## Related Files
- `/app/import_legacy.py` - Contains the fixed `import_phone()` function
- `/app/models.py` - Defines `LegacyPhone` model with composite PK
- `/app/main.py` - Routes CSV uploads to import functions
## Prevention
To prevent similar issues in future imports:
1. Always use upsert logic for tables with unique constraints
2. Test re-imports of the same CSV file
3. Handle duplicates within the CSV gracefully
4. Provide detailed success/error statistics to users