Migration Performance Optimization Plan¶
Date: 2026-02-26 Status: PLANNED Current: ~30 minutes per migration Target: <10 minutes per migration
Root Cause Analysis¶
The migration tool (SqlMigrationService.cs, 11,888 lines) has 5 systemic performance anti-patterns that compound:
| # | Issue | Impact | Est. Time Wasted |
|---|---|---|---|
| 1 | N+1 queries in MigratePropertyDataAsync |
CRITICAL | 15-20 min |
| 2 | N+1 queries in ALL other Migrate*Async methods |
CRITICAL | 4-6 min |
| 3 | BulkDataProcessor / SqlBulkCopy is DISABLED |
CRITICAL | 3-5 min |
| 4 | No transaction batching (every INSERT auto-commits) | HIGH | 2-3 min |
| 5 | Synchronous file logger with global lock | HIGH | 1-2 min |
| 6 | Excessive logging in hot paths | MEDIUM | 0.5-1 min |
| 7 | O(n^2) list processing in node migration | MEDIUM | 0.5-1 min |
| 8 | .GetAwaiter().GetResult() blocking call |
MEDIUM | Throughput |
| 9 | Single-connection sequential architecture | LOW | 1-2 min |
P0+P1 fixes alone would reduce runtime from ~30 min to ~5-8 min.
Fix 1 (P0): Pre-load Lookups + Batch Operations in MigratePropertyDataAsync¶
Location: SqlMigrationService.cs:5558-5768
Savings: 15-20 minutes
Problem¶
For EVERY property data row (thousands), the code executes 3+ individual SQL roundtrips:
1. SELECT COUNT(*) FROM umbracoContentVersion WHERE id = @versionId — check version exists
2. DELETE FROM umbracoPropertyData WHERE versionId=@versionId AND propertyTypeId=@propertyTypeId — remove existing
3. INSERT INTO umbracoPropertyData (...) — insert new
For Grid/BlockList properties, additional lookups per block:
4. GetElementTypeGuidAsync — up to 3 queries per element type
5. GetDocumentTypeGuidByAliasAsync — 1 query per alias
6. LoadElementTypePropertiesFromSourceAsync — 1 query per element type
7. TryResolveMediaGuidByPathAsync — 1 LIKE query per media path (slow)
8. CreateBlockGridContentItemsAsync — 5-8 queries per block item
A single Grid property with 5 blocks = ~50+ SQL roundtrips.
Solution¶
A) Pre-load all lookup data before the property loop:
// Before the foreach loop — ONE query each:
var existingVersionIds = new HashSet<int>(
await connection.QueryAsync<int>("SELECT id FROM umbracoContentVersion", target));
var elementTypeGuids = new Dictionary<string, Guid>(
await connection.QueryAsync<(string, Guid)>(
"SELECT alias, uniqueId FROM cmsContentType WHERE isElement = 1", target));
var contentTypeGuids = new Dictionary<string, Guid>(
await connection.QueryAsync<(string, Guid)>(
"SELECT alias, uniqueId FROM cmsContentType", source));
var mediaPathToGuid = new Dictionary<string, Guid>(
await connection.QueryAsync<(string, Guid)>(
"SELECT umbracoMediaPath, uniqueId FROM umbracoNode WHERE nodeObjectType = '...'", source));
Then in the loop: replace per-row SQL calls with in-memory Dictionary.TryGetValue() / HashSet.Contains().
B) Batch DELETE + INSERT using temp table:
// 1. Collect all transformed property data in-memory
var transformedRows = new List<PropertyDataRow>();
foreach (var prop in propertyData) {
// Transform value (no SQL needed — lookups are in-memory now)
transformedRows.Add(transformed);
}
// 2. Bulk insert into temp table
await SqlBulkCopy(transformedRows, "#PropertyDataStaging");
// 3. Single DELETE-JOIN
await target.ExecuteAsync(@"
DELETE pd FROM umbracoPropertyData pd
INNER JOIN #PropertyDataStaging s
ON pd.versionId = s.versionId AND pd.propertyTypeId = s.propertyTypeId");
// 4. Single INSERT-SELECT
await target.ExecuteAsync(@"
INSERT INTO umbracoPropertyData (versionId, propertyTypeId, ...)
SELECT versionId, propertyTypeId, ...
FROM #PropertyDataStaging");
This turns N×3 roundtrips into 3 total roundtrips.
Fix 2 (P0): Replace N+1 in All Other Migration Methods¶
Savings: 4-6 minutes
Every Migrate*Async method follows the same anti-pattern:
Affected methods (per-row queries):
- MigrateRelationsAsync (line 8485): 6 queries/row — worst offender
- MigrateContentAsync (line 5160): 3 queries/row
- MigrateContentVersionsAsync (line 5231): 3-4 queries/row
- MigrateDocumentVersionsAsync (line 5337): 3-4 queries/row
- MigrateAuditAsync (line 9053): 2 queries/row
- MigrateLogAsync (line 9131): 2 queries/row
Solution¶
A) Pre-load existence checks into HashSets:
// One query instead of N queries
var existingNodeIds = new HashSet<int>(
await connection.QueryAsync<int>("SELECT id FROM umbracoNode", target));
var existingRelationTypeIds = new HashSet<int>(
await connection.QueryAsync<int>("SELECT id FROM umbracoRelationType", target));
B) Use INSERT...WHERE NOT EXISTS or MERGE:
INSERT INTO umbracoContent (nodeId, contentTypeId)
SELECT @nodeId, @contentTypeId
WHERE NOT EXISTS (SELECT 1 FROM umbracoContent WHERE nodeId = @nodeId)
C) For MigrateRelationsAsync specifically: Pre-load node IDs, relation type IDs, and existing relation composite keys. Do ALL 6 checks in-memory. Single INSERT.
Fix 3 (P0): Re-enable and Integrate BulkDataProcessor¶
Location: Program.cs:328-334 (commented out)
Savings: 3-5 minutes
The BulkDataProcessor class exists and implements SqlBulkCopy (100-1000x faster than individual INSERTs), but it is:
1. Commented out in DI registration
2. Never referenced from SqlMigrationService
Solution¶
- Uncomment the DI registration in
Program.cs - Inject
IBulkDataProcessorintoSqlMigrationService - Use for high-volume tables:
umbracoPropertyData,umbracoLog,umbracoAudit,umbracoRelation,umbracoContentVersion - Collect rows into DataTable, call
BulkInsertAsync
Fix 4 (P1): Add Transaction Batching¶
Savings: 2-3 minutes
Currently ZERO transactions in the entire 11,888-line file. Every INSERT auto-commits individually, forcing SQL Server to flush the transaction log for each row.
Solution¶
Wrap each migration phase in a transaction, committing every 500-1000 rows:
using var transaction = targetConnection.BeginTransaction();
int batchCount = 0;
foreach (var row in rows) {
// ... insert ...
if (++batchCount % 1000 == 0) {
await transaction.CommitAsync();
transaction = targetConnection.BeginTransaction();
}
}
await transaction.CommitAsync();
Fix 5 (P1): Replace Synchronous File Logger¶
Location: Program.cs:372-405
Savings: 1-2 minutes
The custom logger:
- Uses a static lock object (ALL log calls serialize globally)
- Calls File.AppendAllText() synchronously per log entry (opens/writes/closes file each time)
- With verbose mode + 444+ log calls in SqlMigrationService, this is thousands of sync file I/O ops
Solution¶
Replace with async buffered logger:
private static readonly ConcurrentQueue<string> _logQueue = new();
public void Log<TState>(...) {
_logQueue.Enqueue(FormatLogEntry(logLevel, state, exception));
}
// Background flush every 500ms
private static void FlushLogs() {
var sb = new StringBuilder();
while (_logQueue.TryDequeue(out var entry)) sb.AppendLine(entry);
if (sb.Length > 0) File.AppendAllText(_filePath, sb.ToString());
}
Or simply use Serilog with async file sink.
Fix 6 (P1): Reduce Logging in Hot Paths¶
Savings: 0.5-1 minute
Quick Wins¶
- Remove
Console.WriteLineon line 1275 (debug statement with typo "cofig") - Demote per-row logging in
MigratePropertyDataAsyncfromLogInformationtoLogDebug: - Lines 5659, 5664, 5704-5706, 5737-5741, 5745-5746
- Demote
GetElementTypeGuidAsyncline 10529 fromLogInformationtoLogDebug - Add aggregate logging: log every 100/1000 rows instead of per-row
Fix 7 (P2): Fix O(n^2) Node List Processing¶
Location: SqlMigrationService.cs:855-992
Savings: 0.5-1 minute
remainingNodes.Remove(nodeToRemove) is O(n) per call inside a while loop → O(n^2) total.
Solution¶
Replace with HashSet<int> processedNodeIds and filter with Where(!processedNodeIds.Contains(id)).
Fix 8 (P2): Fix Blocking Async Call¶
Location: SqlMigrationService.cs:6741
// Current (blocks thread):
var result = migrationTask.GetAwaiter().GetResult();
// Fix:
var result = await _gridLayoutMigrator.MigratePropertyAsync(context, cancellationToken);
Fix 9 (P3): Parallelize Independent Migration Phases¶
Savings: 1-2 minutes
Independent phases that can run in parallel: - Group A: Audit + Log + Consent (no dependencies) - Group B: Relations + Redirect URLs (no dependencies)
await Task.WhenAll(
MigrateAuditAsync(source, target, ct),
MigrateLogAsync(source, target, ct),
MigrateConsentAsync(source, target, ct)
);
Requires opening separate connections per parallel task.
Implementation Order¶
Phase A (Quick Wins — 1 day): Fixes 5, 6, 7, 8
→ Remove Console.WriteLine, demote logging, fix O(n^2), fix blocking call
→ Expected: ~30 min → ~27 min
Phase B (Core Optimization — 3-4 days): Fixes 1, 2
→ Pre-load all lookups, batch operations, MERGE patterns
→ Expected: ~27 min → ~8-10 min
Phase C (Bulk Operations — 2 days): Fixes 3, 4
→ Re-enable BulkDataProcessor, add transaction batching
→ Expected: ~8-10 min → ~5-7 min
Phase D (Optional — 1 day): Fix 9
→ Parallelize independent phases
→ Expected: ~5-7 min → ~4-5 min
Total effort: ~7-8 days to reach <10 minute target (Phases A+B) Full optimization: ~10-11 days to reach ~5 minutes
Verification¶
After each phase:
1. Run migration with --verbose logging
2. Compare total time and per-step times
3. Verify migrated data integrity (row counts match)
4. Verify property values are correctly transformed (spot-check Grid/BlockList content)
5. Run full Umbraco site and verify rendering