Skip to content

Migration Performance Optimization Plan

Date: 2026-02-26 Status: PLANNED Current: ~30 minutes per migration Target: <10 minutes per migration


Root Cause Analysis

The migration tool (SqlMigrationService.cs, 11,888 lines) has 5 systemic performance anti-patterns that compound:

# Issue Impact Est. Time Wasted
1 N+1 queries in MigratePropertyDataAsync CRITICAL 15-20 min
2 N+1 queries in ALL other Migrate*Async methods CRITICAL 4-6 min
3 BulkDataProcessor / SqlBulkCopy is DISABLED CRITICAL 3-5 min
4 No transaction batching (every INSERT auto-commits) HIGH 2-3 min
5 Synchronous file logger with global lock HIGH 1-2 min
6 Excessive logging in hot paths MEDIUM 0.5-1 min
7 O(n^2) list processing in node migration MEDIUM 0.5-1 min
8 .GetAwaiter().GetResult() blocking call MEDIUM Throughput
9 Single-connection sequential architecture LOW 1-2 min

P0+P1 fixes alone would reduce runtime from ~30 min to ~5-8 min.


Fix 1 (P0): Pre-load Lookups + Batch Operations in MigratePropertyDataAsync

Location: SqlMigrationService.cs:5558-5768 Savings: 15-20 minutes

Problem

For EVERY property data row (thousands), the code executes 3+ individual SQL roundtrips: 1. SELECT COUNT(*) FROM umbracoContentVersion WHERE id = @versionId — check version exists 2. DELETE FROM umbracoPropertyData WHERE versionId=@versionId AND propertyTypeId=@propertyTypeId — remove existing 3. INSERT INTO umbracoPropertyData (...) — insert new

For Grid/BlockList properties, additional lookups per block: 4. GetElementTypeGuidAsync — up to 3 queries per element type 5. GetDocumentTypeGuidByAliasAsync — 1 query per alias 6. LoadElementTypePropertiesFromSourceAsync — 1 query per element type 7. TryResolveMediaGuidByPathAsync — 1 LIKE query per media path (slow) 8. CreateBlockGridContentItemsAsync — 5-8 queries per block item

A single Grid property with 5 blocks = ~50+ SQL roundtrips.

Solution

A) Pre-load all lookup data before the property loop:

// Before the foreach loop — ONE query each:
var existingVersionIds = new HashSet<int>(
    await connection.QueryAsync<int>("SELECT id FROM umbracoContentVersion", target));

var elementTypeGuids = new Dictionary<string, Guid>(
    await connection.QueryAsync<(string, Guid)>(
        "SELECT alias, uniqueId FROM cmsContentType WHERE isElement = 1", target));

var contentTypeGuids = new Dictionary<string, Guid>(
    await connection.QueryAsync<(string, Guid)>(
        "SELECT alias, uniqueId FROM cmsContentType", source));

var mediaPathToGuid = new Dictionary<string, Guid>(
    await connection.QueryAsync<(string, Guid)>(
        "SELECT umbracoMediaPath, uniqueId FROM umbracoNode WHERE nodeObjectType = '...'", source));

Then in the loop: replace per-row SQL calls with in-memory Dictionary.TryGetValue() / HashSet.Contains().

B) Batch DELETE + INSERT using temp table:

// 1. Collect all transformed property data in-memory
var transformedRows = new List<PropertyDataRow>();
foreach (var prop in propertyData) {
    // Transform value (no SQL needed — lookups are in-memory now)
    transformedRows.Add(transformed);
}

// 2. Bulk insert into temp table
await SqlBulkCopy(transformedRows, "#PropertyDataStaging");

// 3. Single DELETE-JOIN
await target.ExecuteAsync(@"
    DELETE pd FROM umbracoPropertyData pd
    INNER JOIN #PropertyDataStaging s
    ON pd.versionId = s.versionId AND pd.propertyTypeId = s.propertyTypeId");

// 4. Single INSERT-SELECT
await target.ExecuteAsync(@"
    INSERT INTO umbracoPropertyData (versionId, propertyTypeId, ...)
    SELECT versionId, propertyTypeId, ...
    FROM #PropertyDataStaging");

This turns N×3 roundtrips into 3 total roundtrips.


Fix 2 (P0): Replace N+1 in All Other Migration Methods

Savings: 4-6 minutes

Every Migrate*Async method follows the same anti-pattern:

Read all rows → foreach row → check exists → check FK → insert

Affected methods (per-row queries): - MigrateRelationsAsync (line 8485): 6 queries/row — worst offender - MigrateContentAsync (line 5160): 3 queries/row - MigrateContentVersionsAsync (line 5231): 3-4 queries/row - MigrateDocumentVersionsAsync (line 5337): 3-4 queries/row - MigrateAuditAsync (line 9053): 2 queries/row - MigrateLogAsync (line 9131): 2 queries/row

Solution

A) Pre-load existence checks into HashSets:

// One query instead of N queries
var existingNodeIds = new HashSet<int>(
    await connection.QueryAsync<int>("SELECT id FROM umbracoNode", target));
var existingRelationTypeIds = new HashSet<int>(
    await connection.QueryAsync<int>("SELECT id FROM umbracoRelationType", target));

B) Use INSERT...WHERE NOT EXISTS or MERGE:

INSERT INTO umbracoContent (nodeId, contentTypeId)
SELECT @nodeId, @contentTypeId
WHERE NOT EXISTS (SELECT 1 FROM umbracoContent WHERE nodeId = @nodeId)

C) For MigrateRelationsAsync specifically: Pre-load node IDs, relation type IDs, and existing relation composite keys. Do ALL 6 checks in-memory. Single INSERT.


Fix 3 (P0): Re-enable and Integrate BulkDataProcessor

Location: Program.cs:328-334 (commented out) Savings: 3-5 minutes

The BulkDataProcessor class exists and implements SqlBulkCopy (100-1000x faster than individual INSERTs), but it is: 1. Commented out in DI registration 2. Never referenced from SqlMigrationService

Solution

  1. Uncomment the DI registration in Program.cs
  2. Inject IBulkDataProcessor into SqlMigrationService
  3. Use for high-volume tables: umbracoPropertyData, umbracoLog, umbracoAudit, umbracoRelation, umbracoContentVersion
  4. Collect rows into DataTable, call BulkInsertAsync

Fix 4 (P1): Add Transaction Batching

Savings: 2-3 minutes

Currently ZERO transactions in the entire 11,888-line file. Every INSERT auto-commits individually, forcing SQL Server to flush the transaction log for each row.

Solution

Wrap each migration phase in a transaction, committing every 500-1000 rows:

using var transaction = targetConnection.BeginTransaction();
int batchCount = 0;
foreach (var row in rows) {
    // ... insert ...
    if (++batchCount % 1000 == 0) {
        await transaction.CommitAsync();
        transaction = targetConnection.BeginTransaction();
    }
}
await transaction.CommitAsync();

Fix 5 (P1): Replace Synchronous File Logger

Location: Program.cs:372-405 Savings: 1-2 minutes

The custom logger: - Uses a static lock object (ALL log calls serialize globally) - Calls File.AppendAllText() synchronously per log entry (opens/writes/closes file each time) - With verbose mode + 444+ log calls in SqlMigrationService, this is thousands of sync file I/O ops

Solution

Replace with async buffered logger:

private static readonly ConcurrentQueue<string> _logQueue = new();

public void Log<TState>(...) {
    _logQueue.Enqueue(FormatLogEntry(logLevel, state, exception));
}

// Background flush every 500ms
private static void FlushLogs() {
    var sb = new StringBuilder();
    while (_logQueue.TryDequeue(out var entry)) sb.AppendLine(entry);
    if (sb.Length > 0) File.AppendAllText(_filePath, sb.ToString());
}

Or simply use Serilog with async file sink.


Fix 6 (P1): Reduce Logging in Hot Paths

Savings: 0.5-1 minute

Quick Wins

  1. Remove Console.WriteLine on line 1275 (debug statement with typo "cofig")
  2. Demote per-row logging in MigratePropertyDataAsync from LogInformation to LogDebug:
  3. Lines 5659, 5664, 5704-5706, 5737-5741, 5745-5746
  4. Demote GetElementTypeGuidAsync line 10529 from LogInformation to LogDebug
  5. Add aggregate logging: log every 100/1000 rows instead of per-row

Fix 7 (P2): Fix O(n^2) Node List Processing

Location: SqlMigrationService.cs:855-992 Savings: 0.5-1 minute

remainingNodes.Remove(nodeToRemove) is O(n) per call inside a while loop → O(n^2) total.

Solution

Replace with HashSet<int> processedNodeIds and filter with Where(!processedNodeIds.Contains(id)).


Fix 8 (P2): Fix Blocking Async Call

Location: SqlMigrationService.cs:6741

// Current (blocks thread):
var result = migrationTask.GetAwaiter().GetResult();

// Fix:
var result = await _gridLayoutMigrator.MigratePropertyAsync(context, cancellationToken);

Fix 9 (P3): Parallelize Independent Migration Phases

Savings: 1-2 minutes

Independent phases that can run in parallel: - Group A: Audit + Log + Consent (no dependencies) - Group B: Relations + Redirect URLs (no dependencies)

await Task.WhenAll(
    MigrateAuditAsync(source, target, ct),
    MigrateLogAsync(source, target, ct),
    MigrateConsentAsync(source, target, ct)
);

Requires opening separate connections per parallel task.


Implementation Order

Phase A (Quick Wins — 1 day): Fixes 5, 6, 7, 8
  → Remove Console.WriteLine, demote logging, fix O(n^2), fix blocking call
  → Expected: ~30 min → ~27 min

Phase B (Core Optimization — 3-4 days): Fixes 1, 2
  → Pre-load all lookups, batch operations, MERGE patterns
  → Expected: ~27 min → ~8-10 min

Phase C (Bulk Operations — 2 days): Fixes 3, 4
  → Re-enable BulkDataProcessor, add transaction batching
  → Expected: ~8-10 min → ~5-7 min

Phase D (Optional — 1 day): Fix 9
  → Parallelize independent phases
  → Expected: ~5-7 min → ~4-5 min

Total effort: ~7-8 days to reach <10 minute target (Phases A+B) Full optimization: ~10-11 days to reach ~5 minutes


Verification

After each phase: 1. Run migration with --verbose logging 2. Compare total time and per-step times 3. Verify migrated data integrity (row counts match) 4. Verify property values are correctly transformed (spot-check Grid/BlockList content) 5. Run full Umbraco site and verify rendering

Migration documentation by Double for Progress Credit Union