Migrating from Amazon QLDB to ledger tables in Azure SQL Database: A Comprehensive Guide

October 21, 2024

Amazon Web Services (AWS) has announced the discontinuation of Amazon Quantum Ledger Database (QLDB). In a previous blog post, Move from Amazon Quantum Ledger Database (QLDB) to a ledger in Azure SQL, we discussed how Microsoft offers a strong alternative for hosting data with cryptographic immutability, allowing you to maintain strict data integrity standards. Ledger in Azure SQL integrates directly into the Azure SQL environment, making it a compelling replacement. This post outlines how to migrate an Amazon QLDB ledger to Azure SQL Database, using the US Department of Motor Vehicles (DMV) sample ledger from the tutorial in the Amazon QLDB Developer Guide as a reference. You can adapt this solution to fit your own schema and migration strategy.

Navigating Database Migration: Key Considerations for a Smooth Transition

When you begin a database migration, one of the first decisions is the scope of the data to migrate. Depending on the requirements of your application, you can transfer the entire database, including all historical data, or migrate only the most recent data and archive older records for future reference.

Another important consideration is how to model the data in your new Azure SQL Database. This means converting the ledger data to fit the model you select for the migration. There are two main options:

Normalization: Convert the document model to a relational model. Although this can take more effort, it is often the best way to keep your data in a format that a relational database can work with efficiently after migration.

JSON storage: Migrate each JSON document as a whole into a single column of the ledger table. This simplifies the migration, but it may not be the most efficient way to access the data in a relational database.

Each option has its own pros and cons, and the right choice depends on your specific use case and requirements. This blog post follows the normalization approach.

The solution is built with Azure Data Factory and is based in part on the blog post Dynamically map JSON to SQL in Azure Data Factory by MVP Koen Verbeeck.

Prerequisites

Prepare source files

To begin the data migration, you must first prepare your source files. There are two basic approaches to consider:

Convert and import CSV files: Convert the Amazon QLDB JSON documents to CSV files and import them into ledger tables in Azure SQL Database. This involves exporting your Amazon QLDB ledger and converting it to CSV files, as described in the "Exporting Data" and "Extracting and Converting" sections of the AWS blog post on this topic. You can then import the CSV files into Azure SQL Database using bulk copy or the Copy activity in Azure Data Factory.

Save and import JSON files: Store the QLDB data as JSON files in Azure Storage and import them as relational data using Azure Data Factory (ADF).

In this post, I focus on the second approach and provide a detailed walkthrough.
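Before moving on, it may help to see what the two modeling options above mean in T-SQL terms. The sketch below shreds one Person document from the DMV sample into relational columns (normalization) and, for contrast, keeps it as a single JSON value (JSON storage). The variable name and column sizes are illustrative only.

DECLARE @person NVARCHAR(MAX) = N'{
    "FirstName": "Raul",
    "LastName": "Lewis",
    "DOB": "1963-08-19",
    "GovId": "LEWISR261LL",
    "GovIdType": "Driver License",
    "Address": "1719 University Street, Seattle, WA, 98109"
}';

-- Normalization: shred the document into relational columns.
SELECT FirstName, LastName, DOB, GovId, GovIdType, Address
FROM OPENJSON(@person)
WITH (
    FirstName NVARCHAR(50)  '$.FirstName',
    LastName  NVARCHAR(50)  '$.LastName',
    DOB       DATE          '$.DOB',
    GovId     NVARCHAR(50)  '$.GovId',
    GovIdType NVARCHAR(50)  '$.GovIdType',
    Address   NVARCHAR(255) '$.Address'
);

-- JSON storage: keep the whole document in a single NVARCHAR(MAX) column instead.
SELECT @person AS PersonDocument;

The Azure Data Factory pipeline described later performs the same kind of shredding, but driven by metadata rather than hand-written OPENJSON clauses.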
To create the source files, follow these steps:

Step 1: Open the section "Step 2: Create tables, indexes, and sample data in a ledger" in the Amazon Quantum Ledger Database (Amazon QLDB) Developer Guide.

Step 2: Go to the "Manual Options" section and copy the JSON sample data for the Person, DriversLicense, VehicleRegistration, and Vehicle tables into separate JSON files. Make sure you use valid JSON syntax, as shown in the example below, and name each file after its table.

Step 3: Upload these JSON files to an Azure Storage container.

[
  {
    "FirstName" : "Raul",
    "LastName" : "Lewis",
    "DOB" : "1963-08-19",
    "GovId" : "LEWISR261LL",
    "GovIdType" : "Driver License",
    "Address" : "1719 University Street, Seattle, WA, 98109"
  },
  {
    "FirstName" : "Brent",
    "LastName" : "Logan",
    "DOB" : "1967-07-03",
    "GovId" : "LOGANB486CG",
    "GovIdType" : "Driver License",
    "Address" : "43 Stockert Hollow Road, Everett, WA, 98203"
  },
  {
    "FirstName" : "Alexis",
    "LastName" : "Pena",
    "DOB" : "1974-02-10",
    "GovId" : "744 849 301",
    "GovIdType" : "SSN",
    "Address" : "4058 Melrose Street, Spokane Valley, WA, 99206"
  },
  {
    "FirstName" : "Melvin",
    "LastName" : "Parker",
    "DOB" : "1976-05-22",
    "GovId" : "P626-168-229-765",
    "GovIdType" : "Passport",
    "Address" : "4362 Ryder Avenue, Seattle, WA, 98101"
  },
  {
    "FirstName" : "Salvatore",
    "LastName" : "Spencer",
    "DOB" : "1997-11-15",
    "GovId" : "S152-780-97-415-0",
    "GovIdType" : "Passport",
    "Address" : "4450 Honeysuckle Lane, Seattle, WA, 98101"
  }
]

After these steps, four JSON source files are stored in your Azure Storage account, ready to be used for the data migration.

Prepare the target database and tables

In this example, Azure SQL Database is the target. Note that ledger functionality is also available in Azure SQL Managed Instance and SQL Server 2022. Set up an updatable ledger table for each JSON file created in the previous section. To create the database and updatable ledger tables, follow these steps:

Create a single database: Start by creating a single database in Azure SQL Database. You can find detailed instructions in the Azure SQL Database documentation.

Create updatable ledger tables: Next, run the script below to create the updatable ledger tables. Adjust the script to your specific requirements.

CREATE TABLE dbo.Person (
    FirstName NVARCHAR(50),
    LastName NVARCHAR(50),
    DOB DATE,
    GovId NVARCHAR(50),
    GovIdType NVARCHAR(50),
    Address NVARCHAR(255)
)
WITH (
    SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[PersonHistory]),
    LEDGER = ON
);
GO

CREATE TABLE dbo.DriversLicense (
    LicensePlateNumber NVARCHAR(50),
    LicenseType NVARCHAR(50),
    ValidFromDate DATE,
    ValidToDate DATE,
    PersonId NVARCHAR(50)
)
WITH (
    SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[DriversLicenseHistory]),
    LEDGER = ON
);
GO

CREATE TABLE dbo.VehicleRegistration (
    VIN NVARCHAR(50),
    LicensePlateNumber NVARCHAR(50),
    State NVARCHAR(50),
    City NVARCHAR(50),
    PendingPenaltyTicketAmount DECIMAL(10, 2),
    ValidFromDate DATE,
    ValidToDate DATE,
    PrimaryOwner NVARCHAR(100),
    SecondaryOwner NVARCHAR(100)
)
WITH (
    SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[VehicleRegistrationHistory]),
    LEDGER = ON
);
GO

CREATE TABLE dbo.Vehicle (
    VIN NVARCHAR(50),
    Type NVARCHAR(50),
    Year INT,
    Make NVARCHAR(50),
    Model NVARCHAR(50),
    Color NVARCHAR(50)
)
WITH (
    SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[VehicleHistory]),
    LEDGER = ON
);
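Once the tables exist, you can spot-check the ledger configuration. The queries below are a minimal sketch: they assume the default, automatically generated ledger view name (dbo.Person_Ledger) and its default column names, which apply because the CREATE TABLE statements above do not specify a custom ledger view.

-- Row-level history for the Person table through its auto-generated ledger view
-- (run after data has been loaded).
SELECT TOP (10)
    FirstName,
    LastName,
    ledger_transaction_id,
    ledger_sequence_number,
    ledger_operation_type_desc
FROM dbo.Person_Ledger
ORDER BY ledger_transaction_id, ledger_sequence_number;

-- Transaction-level history for the whole database.
SELECT TOP (10) *
FROM sys.database_ledger_transactions
ORDER BY commit_time DESC;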
Configuring database users: Finally, create a database user for the managed identity of your Azure Data Factory instance. Add this user to the db_datawriter role so that the pipeline has the permissions it needs to write to the database, and grant EXECUTE so that it can run the mapping stored procedure created below.

CREATE USER [ledgeradf] FROM EXTERNAL PROVIDER;
ALTER ROLE [db_datawriter] ADD MEMBER [ledgeradf];
GRANT EXECUTE TO [ledgeradf];

With these steps completed, the Azure SQL Database has the appropriate ledger tables and is ready to receive the data from your JSON files.

The next step is to create two additional mapping tables. The first table maps SQL Server data types to the corresponding data types expected by Azure Data Factory. The second table establishes a mapping between table names and their collection references. Below is a script to create and populate these tables.

CREATE TABLE ADF_DataTypeMapping (
    [ADFTypeMappingID] int,
    [ADFTypeDataType] varchar(20),
    [SQLServerDataType] varchar(20)
);

INSERT INTO ADF_DataTypeMapping ([ADFTypeMappingID], [ADFTypeDataType], [SQLServerDataType])
VALUES
    ( 1, 'Int64', 'BIGINT' ),
    ( 2, 'Byte array', 'BINARY' ),
    ( 3, 'Boolean', 'BIT' ),
    ( 4, 'String', 'CHAR' ),
    ( 5, 'DateTime', 'DATE' ),
    ( 6, 'DateTime', 'DATETIME' ),
    ( 7, 'DateTime', 'DATETIME2' ),
    ( 8, 'DateTimeOffset', 'DATETIMEOFFSET' ),
    ( 9, 'Decimal', 'DECIMAL' ),
    ( 10, 'Double', 'FLOAT' ),
    ( 11, 'Byte array', 'IMAGE' ),
    ( 12, 'Int32', 'INT' ),
    ( 13, 'Decimal', 'MONEY' ),
    ( 14, 'String', 'NCHAR' ),
    ( 15, 'String', 'NTEXT' ),
    ( 16, 'Decimal', 'NUMERIC' ),
    ( 17, 'String', 'NVARCHAR' ),
    ( 18, 'Single', 'REAL' ),
    ( 19, 'Byte array', 'ROWVERSION' ),
    ( 20, 'DateTime', 'SMALLDATETIME' ),
    ( 21, 'Int16', 'SMALLINT' ),
    ( 22, 'Decimal', 'SMALLMONEY' ),
    ( 23, 'Byte array', 'SQL_VARIANT' ),
    ( 24, 'String', 'TEXT' ),
    ( 25, 'DateTime', 'TIME' ),
    ( 26, 'String', 'TIMESTAMP' ),
    ( 27, 'Int16', 'TINYINT' ),
    ( 28, 'GUID', 'UNIQUEIDENTIFIER' ),
    ( 29, 'Byte array', 'VARBINARY' ),
    ( 30, 'String', 'VARCHAR' ),
    ( 31, 'String', 'XML' ),
    ( 32, 'String', 'JSON' );
GO

CREATE TABLE [dbo].[TableCollectionReference](
    [TableName] [nvarchar](255) NULL,
    [collectionreference] [nvarchar](255) NULL
) ON [PRIMARY];

INSERT INTO [dbo].[TableCollectionReference] ([TableName], [collectionreference])
VALUES ('VehicleRegistration', 'Owners');

As a final step, create the stored procedure the pipeline uses to map each JSON file to its database table. Inspired by Koen Verbeeck's function, this stored procedure reads the metadata of a table and converts it into the JSON mapping structure that Azure Data Factory expects. The code for the procedure is below.

CREATE PROCEDURE [dbo].[usp_Get_JSONTableMapping]
    @TableName VARCHAR(250)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @CollectionReference VARCHAR(250);

    SELECT @CollectionReference = collectionreference
    FROM TableCollectionReference
    WHERE TableName = @TableName;

    SELECT jsonmapping = '{"type": "TabularTranslator", "mappings": '
        + (
            SELECT
                'source.path' = '$[''' + c.[name] + ''']'
                --, 'source.type' = m.ADFTypeDataType
                , 'sink.name' = c.[name]
                , 'sink.type' = m.ADFTypeDataType
            FROM sys.tables t
            JOIN sys.schemas s ON s.schema_id = t.schema_id
            JOIN sys.all_columns c ON c.object_id = t.object_id
            JOIN sys.types y ON c.system_type_id = y.system_type_id
                AND c.user_type_id = y.user_type_id
            JOIN dbo.ADF_DataTypeMapping m ON y.[name] = m.SQLServerDataType
            WHERE 1 = 1
                AND t.[name] = @TableName
                AND c.[name] NOT LIKE 'ledger%'
            ORDER BY c.column_id
            FOR JSON PATH
          )
        + ',"collectionreference": "' + ISNULL(@CollectionReference, '')
        + '","mapComplexValuesToString": true}';
END
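To sanity-check the procedure, run it for one of the tables. The output shown in the comment is an abbreviated illustration of the TabularTranslator mapping it returns; the exact content depends on your table definitions.

EXEC dbo.usp_Get_JSONTableMapping @TableName = 'Person';

-- Illustrative (abbreviated) result:
-- {"type": "TabularTranslator", "mappings": [
--     {"source":{"path":"$['FirstName']"},"sink":{"name":"FirstName","type":"String"}},
--     {"source":{"path":"$['DOB']"},"sink":{"name":"DOB","type":"DateTime"}},
--     ...
--   ],"collectionreference": "","mapComplexValuesToString": true}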
Building an Azure Data Factory Pipeline

Configure dynamic datasets in Azure Data Factory

The source dataset imports the JSON files dynamically. You only need to specify the path to the storage account container where the JSON files are stored; the file name is supplied through a parameter. This configuration allows Azure Data Factory to process incoming files automatically, without manual pipeline updates.

The sink, the database, is also configured dynamically. The table names in the database match the names of the imported JSON files. To achieve this, the sink dataset uses two parameters: one for the table schema and one for the table name.

Pipeline Overview

The Azure Data Factory pipeline works in the following steps:

Get file names: The pipeline begins by retrieving the names of all files stored in the specified storage account container.

Extract JSON metadata: For each file found, the pipeline retrieves the JSON mapping metadata. This metadata describes the structure of the JSON file and drives the subsequent data transfer.

Copy data to the database: The pipeline uses the JSON mapping to copy data from each JSON file into the corresponding table in the database. Table names are derived dynamically from the file names, ensuring the data is transferred accurately and efficiently.

Let's take a closer look at each component.

Get Metadata: file names

When working with Azure Data Factory, the Get Metadata activity is a powerful tool for managing data stored in Blob Storage or data lake folders. This activity can retrieve an array of child items listing all files and folders within a specified directory. Use the Get Metadata activity to get the names of all files stored in the storage account container: define the container as the dataset and add Child items to the field list.

ForEach file

A ForEach activity then iterates over the retrieved array of child items, processing one file per iteration. Its Items property is set to the child items output of the Get Metadata activity.

Get JSON metadata

This activity maps the JSON structure to the table. Use a Lookup activity to run the stored procedure usp_Get_JSONTableMapping created earlier. The dataset is the database itself, and the stored procedure parameter is the table name derived from the file name without the .json extension.

Copy data to the ledger table

The final step copies the data from the JSON file into the ledger table. The source of the Copy data activity is the Azure Storage account path combined with the previously retrieved file name. The sink is the database, with the table name again derived dynamically from the file name without the .json extension. Make sure the write behavior is set to "Insert" and the table option is set to "Use existing". The mapping is based on the output of the stored procedure executed in the previous step: activate the advanced editor and add it as dynamic content, as shown in the example below.
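A typical way to wire this up is to convert the Lookup output to an object with the json() function. The expression below is a sketch: 'Get JSON Mapping' is an assumed name for the Lookup activity and should be replaced with the actual activity name in your pipeline; firstRow is available because the Lookup returns a single row, and jsonmapping is the column name produced by usp_Get_JSONTableMapping.

@json(activity('Get JSON Mapping').output.firstRow.jsonmapping)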
Conclusion

Migrating from Amazon QLDB to Azure SQL Database ledger tables provides a powerful way to maintain data integrity while taking advantage of the broader Azure ecosystem. This guide outlined the key considerations and steps involved in the migration, including data modeling, preparing the source files, configuring the target database, and building the Azure Data Factory pipeline. By following this guidance, your organization can ensure a smooth transition and take full advantage of the capabilities of Azure SQL Database.

Useful Links

To learn more and get started with ledger in Azure SQL, see the ledger documentation for Azure SQL Database.