SSIS to Azure Migration: Strategies and Tools
Many organizations have significant investments in SQL Server Integration Services (SSIS) packages. Let’s explore the options for running these packages in Azure and strategies for modernization.
Migration Options
Option 1: Azure-SSIS Integration Runtime
Lift and shift existing SSIS packages with minimal changes:
{
"name": "AzureSsisIR",
"type": "Managed",
"typeProperties": {
"computeProperties": {
"location": "East US",
"nodeSize": "Standard_D8_v3",
"numberOfNodes": 2,
"maxParallelExecutionsPerNode": 8
},
"ssisProperties": {
"catalogInfo": {
"catalogServerEndpoint": "your-server.database.windows.net",
"catalogAdminUserName": "admin",
"catalogAdminPassword": {
"type": "SecureString",
"value": "your-password"
},
"catalogPricingTier": "S1"
},
"customSetupScriptProperties": {
"blobContainerUri": "https://storage.blob.core.windows.net/customsetup",
"sasToken": {
"type": "SecureString",
"value": "your-sas-token"
}
},
"edition": "Standard",
"licenseType": "LicenseIncluded"
}
}
}
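The JSON definition above can be provisioned and started with the Az.DataFactory cmdlets. A minimal sketch, assuming the resource group rg-data and factory adf-production used later in this post; Get-Credential supplies the catalog admin login interactively:

```powershell
# Create (or update) the Azure-SSIS IR from the settings in the JSON definition
Set-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "rg-data" `
    -DataFactoryName "adf-production" -Name "AzureSsisIR" `
    -Type Managed -Location "East US" `
    -NodeSize "Standard_D8_v3" -NodeCount 2 -MaxParallelExecutionsPerNode 8 `
    -CatalogServerEndpoint "your-server.database.windows.net" `
    -CatalogAdminCredential (Get-Credential) -CatalogPricingTier "S1" `
    -Edition "Standard" -LicenseType "LicenseIncluded"

# Starting the IR typically takes 20-30 minutes, and it bills while running
Start-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "rg-data" `
    -DataFactoryName "adf-production" -Name "AzureSsisIR" -Force
```

Stop the IR when no packages are scheduled; compute is billed per node-hour while it runs.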
Deploying to Azure-SSIS IR
# Connect to Azure
Connect-AzAccount
# Verify the Data Factory and Integration Runtime exist before deploying
$dataFactory = Get-AzDataFactoryV2 -ResourceGroupName "rg-data" -Name "adf-production"
$ir = Get-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "rg-data" `
    -DataFactoryName "adf-production" -Name "AzureSsisIR"
# Deploy SSIS project
$projectPath = "C:\SSISProjects\ETL.ispac"
# Deploy the .ispac via the SSISDB catalog.deploy_project stored procedure
$connection = New-Object System.Data.SqlClient.SqlConnection
$connection.ConnectionString = "Server=your-server.database.windows.net;Database=SSISDB;User Id=admin;Password=your-password;"
$connection.Open()
$command = $connection.CreateCommand()
$command.CommandType = [System.Data.CommandType]::StoredProcedure
$command.CommandText = "catalog.deploy_project"
$command.Parameters.Add((New-Object System.Data.SqlClient.SqlParameter("folder_name", "ETL"))) | Out-Null
$command.Parameters.Add((New-Object System.Data.SqlClient.SqlParameter("project_name", "ETL"))) | Out-Null
$projectStream = [System.IO.File]::ReadAllBytes($projectPath)
$streamParam = New-Object System.Data.SqlClient.SqlParameter("project_stream", [System.Data.SqlDbType]::VarBinary, -1)
$streamParam.Value = $projectStream
$command.Parameters.Add($streamParam) | Out-Null
# deploy_project returns the catalog operation id through an output parameter
$opIdParam = New-Object System.Data.SqlClient.SqlParameter("operation_id", [System.Data.SqlDbType]::BigInt)
$opIdParam.Direction = [System.Data.ParameterDirection]::Output
$command.Parameters.Add($opIdParam) | Out-Null
$command.ExecuteNonQuery()
$connection.Close()
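Once deployed, the project can be executed through the SSISDB catalog stored procedures. A sketch assuming the package inside the project is named Package.dtsx (adjust to your actual package name):

```sql
-- Create an execution for the deployed package, then start it
DECLARE @execution_id BIGINT;
EXEC catalog.create_execution
    @folder_name = N'ETL',
    @project_name = N'ETL',
    @package_name = N'Package.dtsx',
    @use32bitruntime = 0,
    @execution_id = @execution_id OUTPUT;

-- Raise logging verbosity for the first runs (object_type 50 = system parameter)
EXEC catalog.set_execution_parameter_value
    @execution_id, @object_type = 50,
    @parameter_name = N'LOGGING_LEVEL', @parameter_value = 1;

EXEC catalog.start_execution @execution_id;
```

Execution status and messages then land in catalog.executions and catalog.operation_messages, the same views SSMS reads on-premises.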
Option 2: Data Factory Mapping Data Flows
Convert SSIS logic to native Azure Data Factory:
{
"name": "CustomerETLFlow",
"properties": {
"type": "MappingDataFlow",
"typeProperties": {
"sources": [
{
"dataset": {
"referenceName": "SourceCustomers",
"type": "DatasetReference"
},
"name": "SourceData"
}
],
"transformations": [
{
"name": "CleanseData",
"description": "Equivalent to SSIS Data Conversion + Derived Column",
"dataset": null,
"linkedService": null,
"flowlet": null,
"transformation": "derivedColumn"
},
{
"name": "LookupCountry",
"description": "Equivalent to SSIS Lookup transform",
"transformation": "lookup"
},
{
"name": "SplitValidInvalid",
"description": "Equivalent to SSIS Conditional Split",
"transformation": "conditionalSplit"
},
{
"name": "AggregateByRegion",
"description": "Equivalent to SSIS Aggregate transform",
"transformation": "aggregate"
}
],
"sinks": [
{
"dataset": {
"referenceName": "DestCustomers",
"type": "DatasetReference"
},
"name": "SinkData"
},
{
"dataset": {
"referenceName": "ErrorLog",
"type": "DatasetReference"
},
"name": "ErrorSink"
}
],
"script": "source(output(\n CustomerId as string,\n CustomerName as string,\n Email as string,\n CountryCode as string,\n CreatedDate as string\n ),\n allowSchemaDrift: true) ~> SourceData\nSourceData derive(\n CustomerId = toInteger(CustomerId),\n CustomerName = trim(upper(CustomerName)),\n Email = lower(trim(Email)),\n CreatedDate = toDate(CreatedDate, 'yyyy-MM-dd')\n ) ~> CleanseData\nCleanseData, CountryRef lookup(\n CountryCode == CountryRef@Code,\n multiple: false,\n pickup: 'first',\n asc(Code, true)\n ) ~> LookupCountry\nLookupCountry split(\n isNull(CountryRef@Code),\n disjoint: false\n ) ~> SplitValidInvalid@(Invalid, Valid)\nSplitValidInvalid@Valid aggregate(\n groupBy(Region),\n CustomerCount = count(),\n TotalValue = sum(LTV)\n ) ~> AggregateByRegion\nAggregateByRegion sink(\n allowSchemaDrift: true,\n validateSchema: false\n ) ~> SinkData\nSplitValidInvalid@Invalid sink(\n allowSchemaDrift: true\n ) ~> ErrorSink"
}
}
}
SSIS to Data Flow Mapping
| SSIS Component | Data Flow Equivalent |
|---|---|
| OLE DB Source | Source transformation |
| Flat File Source | Delimited text dataset |
| Derived Column | Derived column transformation |
| Data Conversion | Derived column with cast() |
| Lookup | Lookup transformation |
| Conditional Split | Conditional split transformation |
| Union All | Union transformation |
| Aggregate | Aggregate transformation |
| Sort | Sort transformation |
| Merge Join | Join transformation |
| Multicast | New branch (one stream to many) |
| Row Count | Aggregate with count() |
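The Data Conversion row deserves a concrete example: in a data flow there is no separate conversion step, just a derived column calling the type-conversion functions. A sketch with hypothetical column names, where toDecimal's precision and scale arguments mirror DT_NUMERIC(18,2):

```
InputData derive(
    OrderId = toInteger(OrderId),
    UnitPrice = toDecimal(UnitPrice, 18, 2),
    OrderDate = toDate(OrderDate, 'yyyy-MM-dd')
) ~> ConvertedTypes
```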
Converting Common Patterns
SSIS Lookup to Data Flow
# SSIS Lookup equivalent in Data Flow script
source(output(
OrderId as integer,
CustomerId as integer,
OrderAmount as decimal(18,2)
)) ~> Orders
source(output(
CustomerId as integer,
CustomerName as string,
Region as string
)) ~> Customers
Orders, Customers lookup(
Orders@CustomerId == Customers@CustomerId,
multiple: false,
pickup: 'first'
) ~> JoinCustomer
SSIS Script Component to Data Flow
# Complex transformations that were in Script Component
# can use derived columns with expressions
source(...) ~> InputData
InputData derive(
# Email validation (was in Script Component)
IsValidEmail = regexMatch(Email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$'),
# Phone formatting (was in Script Component)
FormattedPhone = concat(
'(',
substring(replace(Phone, '-', ''), 1, 3),
') ',
substring(replace(Phone, '-', ''), 4, 3),
'-',
substring(replace(Phone, '-', ''), 7, 4)
),
# Complex date logic (was in Script Component)
FiscalYear = case(
month(TransactionDate) >= 7, year(TransactionDate) + 1,
year(TransactionDate)
)
) ~> TransformedData
SSIS For Each Loop to Data Factory
{
"name": "ProcessMultipleFiles",
"properties": {
"activities": [
{
"name": "GetFileList",
"type": "GetMetadata",
"typeProperties": {
"dataset": {
"referenceName": "SourceFolder",
"type": "DatasetReference"
},
"fieldList": ["childItems"]
}
},
{
"name": "ForEachFile",
"type": "ForEach",
"dependsOn": [
{
"activity": "GetFileList",
"dependencyConditions": ["Succeeded"]
}
],
"typeProperties": {
"items": {
"value": "@activity('GetFileList').output.childItems",
"type": "Expression"
},
"isSequential": false,
"batchCount": 20,
"activities": [
{
"name": "ProcessFile",
"type": "ExecuteDataFlow",
"typeProperties": {
"dataFlow": {
"referenceName": "FileProcessingFlow",
"type": "DataFlowReference"
},
"compute": {
"coreCount": 8,
"computeType": "General"
},
"traceLevel": "Fine"
}
}
]
}
}
]
}
}
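As written, the ProcessFile activity never tells the data flow which file from the loop to read. Assuming the data flow declares a string parameter named fileName (a hypothetical name), the current item can be passed through the DataFlowReference; the pipeline expression is wrapped in single quotes so it arrives as a data flow string literal:

```json
"dataFlow": {
    "referenceName": "FileProcessingFlow",
    "type": "DataFlowReference",
    "parameters": {
        "fileName": {
            "value": "'@{item().name}'",
            "type": "Expression"
        }
    }
}
```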
Hybrid Approach
Sometimes the best approach is a hybrid: keep complex legacy packages on the Azure-SSIS IR while running converted logic as Data Flows, all orchestrated from a single pipeline:
{
"name": "HybridETLPipeline",
"properties": {
"activities": [
{
"name": "RunLegacySSIS",
"type": "ExecuteSSISPackage",
"typeProperties": {
"packageLocation": {
"type": "SSISDB",
"packagePath": "ETL/LegacyPackage/Package.dtsx"
},
"connectVia": {
"referenceName": "AzureSsisIR",
"type": "IntegrationRuntimeReference"
}
}
},
{
"name": "RunModernDataFlow",
"type": "ExecuteDataFlow",
"dependsOn": [
{
"activity": "RunLegacySSIS",
"dependencyConditions": ["Succeeded"]
}
],
"typeProperties": {
"dataFlow": {
"referenceName": "ModernTransformations",
"type": "DataFlowReference"
}
}
}
]
}
}
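The ExecuteSSISPackage activity accepts more than the package location. A hedged sketch of commonly set options, where the ETL/Prod environment path is an assumption about your SSISDB folder layout:

```json
"typeProperties": {
    "packageLocation": {
        "type": "SSISDB",
        "packagePath": "ETL/LegacyPackage/Package.dtsx"
    },
    "connectVia": {
        "referenceName": "AzureSsisIR",
        "type": "IntegrationRuntimeReference"
    },
    "runtime": "x64",
    "loggingLevel": "Basic",
    "environmentPath": "ETL/Prod"
}
```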
Migration Assessment
# Script to assess SSIS packages for migration
function Assess-SSISPackage {
param (
[string]$PackagePath
)
$xml = [xml](Get-Content $PackagePath)
$ns = New-Object System.Xml.XmlNamespaceManager($xml.NameTable)
$ns.AddNamespace("DTS", "www.microsoft.com/SqlServer/Dts")
    $nameNode = $xml.SelectSingleNode("//DTS:Property[@DTS:Name='PackageName']", $ns)
    # Use a PSCustomObject so Export-Csv emits these fields as columns
    $assessment = [PSCustomObject]@{
        # Package name storage varies across SSIS versions; fall back to the file name
        PackageName = if ($nameNode) { $nameNode.InnerText } else { [System.IO.Path]::GetFileNameWithoutExtension($PackagePath) }
        Components = @()
        Complexity = "Low"
        MigrationPath = "Data Flow"
    }
# Check for complex components
$scriptTasks = $xml.SelectNodes("//DTS:Executable[@DTS:ExecutableType='Microsoft.SqlServer.Dts.Tasks.ScriptTask.ScriptTask']", $ns)
if ($scriptTasks.Count -gt 3) {
$assessment.Complexity = "High"
$assessment.MigrationPath = "Azure-SSIS IR"
}
# Check for custom components
$customComponents = $xml.SelectNodes("//component[@componentClassID!='']", $ns)
foreach ($comp in $customComponents) {
if ($comp.componentClassID -notlike "Microsoft*") {
$assessment.Components += "Custom: $($comp.componentClassID)"
$assessment.Complexity = "High"
$assessment.MigrationPath = "Azure-SSIS IR with custom setup"
}
}
return $assessment
}
# Assess all packages
$packages = Get-ChildItem -Path "C:\SSISProjects" -Filter "*.dtsx" -Recurse
$assessments = $packages | ForEach-Object { Assess-SSISPackage -PackagePath $_.FullName }
$assessments | Export-Csv -Path "migration-assessment.csv" -NoTypeInformation
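Before anyone opens the CSV, the results can be summarized in the same session; a small sketch that continues from the $assessments variable above:

```powershell
# Count packages per recommended migration path
$assessments |
    Group-Object -Property { $_.MigrationPath } |
    Sort-Object Count -Descending |
    ForEach-Object { "{0}: {1} package(s)" -f $_.Name, $_.Count }
```

A high "Azure-SSIS IR with custom setup" count is a signal to budget time for custom setup scripts before committing to a migration date.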
Conclusion
SSIS migration to Azure offers flexibility. Azure-SSIS IR provides the fastest path for lift-and-shift, while Data Factory Data Flows offer a modern, serverless alternative. Most organizations benefit from a phased approach - running critical packages on Azure-SSIS IR while gradually converting simpler packages to native Data Flows.