
Migrating from Amazon QLDB to ledger tables in Azure SQL Database: A Comprehensive Guide



Amazon Web Services (AWS) has announced the discontinuation of Amazon Quantum Ledger Database (QLDB). In a previous blog post, Move from Amazon Quantum Ledger Database (QLDB) to a ledger in Azure SQL, we discussed how Microsoft offers a strong alternative for users who want to host their data with cryptographic immutability and maintain strict data integrity standards. Ledger in Azure SQL integrates seamlessly into the Azure SQL environment, making it a powerful alternative.

This post outlines how to migrate your Amazon QLDB ledger to Azure SQL Database, using the Department of Motor Vehicles (DMV) sample ledger from the tutorial in the Amazon QLDB Developer Guide as a reference. You can adapt this solution to fit your own schema and migration strategy.

Navigating Database Migration: Key Considerations for a Smooth Transition

When you begin your database migration journey, one of the first decisions you face is determining the scope of data to migrate. Depending on the requirements of your application, you can choose to transfer the entire database, including all historical data, or migrate only the most recent data and retain older records for future reference.

Another important aspect to consider is how you model your data in your new Azure SQL Database. This involves converting ledger data to fit the model selected during the migration process. There are several options here:

  1. Normalization: This approach involves converting the document model to a relational model. Although this can be challenging, it is often the best way to ensure that your data ends up in a format that a relational database can use effectively after migration.
  2. JSON storage: Alternatively, you can migrate each entire JSON document to a single column in your ledger table. This simplifies the migration process, but may not be the most efficient way to access the data in a relational database; a minimal sketch of this option follows this list.
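
To make the contrast concrete, here is a minimal sketch of what the JSON storage option could look like for the Person documents. The table and column names are illustrative only and are not used in the rest of this guide.

-- Option 2 sketch: store each QLDB document as-is in a single JSON column of an updatable ledger table.
CREATE TABLE dbo.PersonJson
(
    PersonDocument NVARCHAR(MAX) NOT NULL CHECK (ISJSON(PersonDocument) = 1)
)
WITH
(
 SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[PersonJsonHistory]),
 LEDGER = ON
);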

Each option has its own pros and cons, and the right choice depends on your specific use case and requirements. In this blog post, we use the normalization approach.

Solution

The solution is built in Azure Data Factory and is based in part on the blog post Dynamically map JSON to SQL in Azure Data Factory | Business Intelligence (sqlkov…) by MVP Koen Verbeeck.

Prerequisites

Prepare source files

To begin the data migration process, you must first prepare your source files. There are two basic approaches to consider:

  1. Convert and import CSV files: Convert the Amazon QLDB JSON documents to CSV files and import them into ledger tables in Azure SQL Database. This involves exporting and converting your Amazon QLDB ledger to CSV files, as described in the “Exporting Data” and “Extracting and Converting” sections of this blog post. You can then import the CSV files into Azure SQL Database using one of the following methods (a minimal bulk copy sketch follows this list):
  • Bulk copy
  • Copy activity in Azure Data Factory: this activity migrates data from the source files to Azure SQL Database.
  2. Save and import JSON files: Store the QLDB data as JSON files in Azure Storage and import them as relational data using Azure Data Factory (ADF).
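
If you choose the CSV route, the bulk copy option could look roughly like the sketch below. The external data source name, storage URL, and file name are placeholders, and a database scoped credential is required if the container is not publicly accessible.

-- Placeholder external data source pointing at the container that holds the converted CSV files.
CREATE EXTERNAL DATA SOURCE LedgerSourceFiles
WITH (TYPE = BLOB_STORAGE, LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>');

-- Bulk-load one converted CSV file into the Person ledger table (FIRSTROW = 2 assumes a header row).
BULK INSERT dbo.Person
FROM 'Person.csv'
WITH (DATA_SOURCE = 'LedgerSourceFiles', FORMAT = 'CSV', FIRSTROW = 2);
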
In this post, I will focus on the second approach and provide a detailed walkthrough. To create the source files, follow these steps:

• Step 1: Open the section “Step 2: Create tables, indexes, and sample data” of the Amazon Quantum Ledger Database (Amazon QLDB) documentation.
• Step 2: Go to the “Manual Options” section and copy the JSON sample data for the Person, DriversLicense, VehicleRegistration, and Vehicle tables into separate JSON files. Make sure you use correct JSON syntax, as shown in the example below, and name each file after its table.
• Step 3: Upload these JSON files to an Azure Storage container.
    [
        {
            "FirstName" : "Raul",
            "LastName" : "Lewis",
            "DOB" : "1963-08-19",
            "GovId" : "LEWISR261LL",
            "GovIdType" : "Driver License",
            "Address" : "1719 University Street, Seattle, WA, 98109"
        },
        {
            "FirstName" : "Brent",
            "LastName" : "Logan",
            "DOB" : "1967-07-03",
            "GovId" : "LOGANB486CG",
            "GovIdType" : "Driver License",
            "Address" : "43 Stockert Hollow Road, Everett, WA, 98203"
        },
        {
            "FirstName" : "Alexis",
            "LastName" : "Pena",
            "DOB" : "1974-02-10",
            "GovId" : "744 849 301",
            "GovIdType" : "SSN",
            "Address" : "4058 Melrose Street, Spokane Valley, WA, 99206"
        },
        {
            "FirstName" : "Melvin",
            "LastName" : "Parker",
            "DOB" : "1976-05-22",
            "GovId" : "P626-168-229-765",
            "GovIdType" : "Passport",
            "Address" : "4362 Ryder Avenue, Seattle, WA, 98101"
        },
        {
            "FirstName" : "Salvatore",
            "LastName" : "Spencer",
            "DOB" : "1997-11-15",
            "GovId" : "S152-780-97-415-0",
            "GovIdType" : "Passport",
            "Address" : "4450 Honeysuckle Lane, Seattle, WA, 98101"
        }
    ]

After these steps, four JSON source files are saved in your Azure Storage account, ready to be used for the data migration.

[Screenshot: PieterVanhove_1-1729500688961.png]

    Prepare the target database and tables

In this example, Azure SQL Database is our target. It is important to note that ledger functionality is also available in Azure SQL Managed Instance and SQL Server 2022. Set up an updatable ledger table for each JSON file created in the previous section.

    To create a database and updatable ledger table, follow these steps:

    • Create a single database: We start by creating a single database in Azure SQL Database. You can find detailed instructions in the Azure SQL Database documentation.
    • Create updatable ledger tables: Next, run the script provided below to create an updatable ledger table. Adjust the script according to your specific requirements.
    CREATE TABLE dbo.Person (
        FirstName NVARCHAR(50),
        LastName NVARCHAR(50),
        DOB DATE,
        GovId NVARCHAR(50),
        GovIdType NVARCHAR(50),
        Address NVARCHAR(255)
    )
    WITH 
    (
     SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[PersonHistory]),
     LEDGER = ON
    );
    
    GO
    
    
    CREATE TABLE dbo.DriversLicense (
        LicensePlateNumber NVARCHAR(50),
        LicenseType NVARCHAR(50),
        ValidFromDate DATE,
        ValidToDate DATE,
        PersonId NVARCHAR(50)
    )
    WITH 
    (
     SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[DriversLicenseHistory]),
     LEDGER = ON
    );
    
    GO
    
    CREATE TABLE dbo.VehicleRegistration (
        VIN NVARCHAR(50),
        LicensePlateNumber NVARCHAR(50),
        State NVARCHAR(50),
        City NVARCHAR(50),
        PendingPenaltyTicketAmount DECIMAL(10, 2),
        ValidFromDate DATE,
        ValidToDate DATE,
        PrimaryOwner NVARCHAR(100),
    	SecondaryOwner NVARCHAR(100)
    )
    WITH 
    (
     SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[VehicleRegistrationHistory]),
     LEDGER = ON
    );
    
    GO
    
    CREATE TABLE dbo.Vehicle (
        VIN NVARCHAR(50),
        Type NVARCHAR(50),
        Year INT,
        Make NVARCHAR(50),
        Model NVARCHAR(50),
        Color NVARCHAR(50)
    )
    WITH 
    (
     SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[VehicleHistory]),
     LEDGER = ON
    );
    

• Configuring database users: Finally, create a database user corresponding to the managed identity of Azure Data Factory. To ensure that your Azure Data Factory pipeline has the necessary permissions, add this new user to the db_datawriter role and grant it EXECUTE permission so that it can run the mapping stored procedure created later in this guide.
-- The user name must match the name of your Azure Data Factory instance.
CREATE USER [ledgeradf] FROM EXTERNAL PROVIDER;
ALTER ROLE [db_datawriter] ADD MEMBER [ledgeradf];
GRANT EXECUTE TO [ledgeradf];

By following these steps, your Azure SQL Database is set up with the appropriate ledger tables and ready to receive the data from your JSON files.

    The next step requires creating two additional mapping tables. The first table maps SQL Server data types to the corresponding data types expected by Azure Data Factory. The second table establishes a mapping between table names and their collection references. Below is a script to create and populate these tables.

    CREATE TABLE ADF_DataTypeMapping ( [ADFTypeMappingID] int, [ADFTypeDataType] varchar(20), [SQLServerDataType] varchar(20) )
    INSERT INTO ADF_DataTypeMapping ([ADFTypeMappingID], [ADFTypeDataType], [SQLServerDataType])
    VALUES
    ( 1, 'Int64', 'BIGINT' ), 
    ( 2, 'Byte array', 'BINARY' ), 
    ( 3, 'Boolean', 'BIT' ), 
    ( 4, 'String', 'CHAR' ), 
    ( 5, 'DateTime', 'DATE' ), 
    ( 6, 'DateTime', 'DATETIME' ), 
    ( 7, 'DateTime', 'DATETIME2' ), 
    ( 8, 'DateTimeOffset', 'DATETIMEOFFSET' ), 
    ( 9, 'Decimal', 'DECIMAL' ), 
    ( 10, 'Double', 'FLOAT' ), 
    ( 11, 'Byte array', 'IMAGE' ), 
    ( 12, 'Int32', 'INT' ), 
    ( 13, 'Decimal', 'MONEY' ), 
    ( 14, 'String', 'NCHAR' ), 
    ( 15, 'String', 'NTEXT' ), 
    ( 16, 'Decimal', 'NUMERIC' ), 
    ( 17, 'String', 'NVARCHAR' ), 
    ( 18, 'Single', 'REAL' ), 
    ( 19, 'Byte array', 'ROWVERSION' ), 
    ( 20, 'DateTime', 'SMALLDATETIME' ), 
    ( 21, 'Int16', 'SMALLINT' ), 
    ( 22, 'Decimal', 'SMALLMONEY' ), 
    ( 23, 'Byte array', 'SQL_VARIANT' ), 
    ( 24, 'String', 'TEXT' ), 
    ( 25, 'DateTime', 'TIME' ), 
    ( 26, 'String', 'TIMESTAMP' ), 
    ( 27, 'Int16', 'TINYINT' ), 
    ( 28, 'GUID', 'UNIQUEIDENTIFIER' ), 
    ( 29, 'Byte array', 'VARBINARY' ), 
    ( 30, 'String', 'VARCHAR' ), 
    ( 31, 'String', 'XML' ), 
    ( 32, 'String', 'JSON' );
    GO
    CREATE TABLE [dbo].[TableCollectionReference](
    	[TableName] [nvarchar](255) NULL,
    	[collectionreference] [nvarchar](255) NULL
    ) ON [PRIMARY]
    INSERT INTO [dbo].[TableCollectionReference]
               ([TableName]
               ,[collectionreference])
         VALUES
               ('VehicleRegistration','Owners')
    

    As a final step, we need to create a stored procedure for the pipeline that maps the JSON file to the database table. Inspired by Koen Verbeeck’s function, this stored procedure reads the metadata of a table and converts it to the required JSON structure. You can find the code for this procedure below.

    CREATE PROCEDURE [dbo].[usp_Get_JSONTableMapping]
            @TableName VARCHAR(250)
           
        AS
        BEGIN
            SET NOCOUNT ON;
    		DECLARE  @CollectionReference VARCHAR(250)
    		SELECT @CollectionReference=collectionreference FROM TableCollectionReference where TableName = @TableName
            SELECT jsonmapping = '{"type": "TabularTranslator", "mappings": ' + 
            (
                SELECT
                     'source.path'  = '$[''' + c.[name] + ''']'
                  --, 'source.type'  = m.ADFTypeDataType
                    , 'sink.name'    = c.[name]
                    , 'sink.type'    = m.ADFTypeDataType
                FROM sys.tables                 t
                JOIN sys.schemas                s ON s.schema_id        = t.schema_id
                JOIN sys.all_columns            c ON c.object_id        = t.object_id
                JOIN sys.types                  y ON c.system_type_id   = y.system_type_id
                                                    AND c.user_type_id  = y.user_type_id
                JOIN dbo.ADF_DataTypeMapping    m ON y.[name]           = m.SQLServerDataType
                WHERE   1 = 1
                    AND t.[name] = @TableName
                    AND c.[name] not like 'ledger%'
                ORDER BY c.column_id
                FOR JSON PATH
            ) + ',"collectionreference": "' + ISNULL(@CollectionReference,'') + '","mapComplexValuesToString": true}'; 
        END
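
As a quick sanity check, you can run the procedure manually. For the Person table it returns a single row whose jsonmapping column should look roughly like the truncated output shown in the comment below.

-- Returns the TabularTranslator mapping that the Copy activity will consume.
EXEC dbo.usp_Get_JSONTableMapping @TableName = 'Person';

-- Expected shape (truncated):
-- {"type": "TabularTranslator", "mappings": [
--   {"source":{"path":"$['FirstName']"},"sink":{"name":"FirstName","type":"String"}},
--   {"source":{"path":"$['DOB']"},"sink":{"name":"DOB","type":"DateTime"}}, ...
-- ],"collectionreference": "","mapComplexValuesToString": true}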
    

    Building an Azure Data Factory Pipeline

    Configure dynamic datasets in Azure Data Factory

The source is a JSON file that is imported dynamically. To achieve this, you only need to specify the file path of the container in the storage account where the JSON files are stored. This configuration allows Azure Data Factory to automatically process incoming files without manual pipeline updates. See the screenshot below as an example.

[Screenshot: PieterVanhove_0-1729501122012.png]
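
For reference, a parameterized JSON source dataset along these lines should work; the dataset, linked service, container, and parameter names below are assumptions.

{
    "name": "JsonSourceFile",
    "properties": {
        "linkedServiceName": { "referenceName": "AzureBlobStorageLS", "type": "LinkedServiceReference" },
        "parameters": { "FileName": { "type": "string" } },
        "type": "Json",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "fileName": { "value": "@dataset().FileName", "type": "Expression" },
                "container": "qldb-source"
            }
        }
    }
}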

The sink, which is the database, is also configured dynamically. The table names in the database match the names of the imported JSON files. To achieve this, we use two parameters: one for the table schema and one for the table name. See the screenshots below as an example.

[Screenshot: PieterVanhove_1-1729501142462.png]

[Screenshot: PieterVanhove_2-1729501155655.png]
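
A sketch of the corresponding parameterized sink dataset could look like this; the names are again assumptions.

{
    "name": "LedgerTableSink",
    "properties": {
        "linkedServiceName": { "referenceName": "AzureSqlDatabaseLS", "type": "LinkedServiceReference" },
        "parameters": {
            "SchemaName": { "type": "string", "defaultValue": "dbo" },
            "TableName": { "type": "string" }
        },
        "type": "AzureSqlTable",
        "typeProperties": {
            "schema": { "value": "@dataset().SchemaName", "type": "Expression" },
            "table": { "value": "@dataset().TableName", "type": "Expression" }
        }
    }
}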

    Pipeline Overview

    The Azure Data Factory pipeline works in the following steps:

1. Get file names: The pipeline begins by retrieving the names of all files stored in the specified storage account container.
    2. Extract JSON metadata: For each identified file, the pipeline gets JSON metadata. This metadata provides essential information about the structure of the JSON file, which is important for subsequent data transfer.
    3. Copy data to database: The pipeline uses the JSON structure to copy data from each JSON file to the corresponding table in the database. Table names in the database dynamically match file names, ensuring data is transferred accurately and efficiently.

[Screenshot: PieterVanhove_3-1729501193646.png]

    Let’s take a closer look at each component.

    Get metadata file name

When working with Azure Data Factory, the Get Metadata activity is a powerful tool for managing data stored in Blob Storage or Data Lake folders. This activity can retrieve an array of child items listing all files and folders within a specified directory. Use the Get Metadata activity to get the names of all files stored in the storage account container: specify the storage account container as the dataset and select Child items in the field list.

[Screenshot: PieterVanhove_4-1729501242788.png]

For each file

The retrieved array of child items is used to iterate over each file. The following expression is used for this purpose:

[Screenshot: PieterVanhove_5-1729501294128.png]
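
Assuming the Get Metadata activity is named Get Metadata1 (the actual name depends on your pipeline), the Items property of the ForEach activity typically looks like this:

@activity('Get Metadata1').output.childItems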

    Get JSON metadata

In this activity, we map the JSON to a table. We use the Lookup activity to run the stored procedure usp_Get_JSONTableMapping created earlier. The dataset is the database itself, and the stored procedure parameter is the table name, derived from the file name without the .json extension.

[Screenshot: PieterVanhove_0-1729511690405.png]
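
Within the ForEach loop, the table name can be derived from the current file name with an expression along these lines, which simply strips the .json extension:

@replace(item().name, '.json', '')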

    Copy data to ledger table

The final step involves copying the data from the JSON files into the ledger tables. The source of the Copy data activity is the Azure Storage account path, using the previously retrieved file name.

[Screenshot: PieterVanhove_7-1729501365086.png]

The target (sink) is the database table, whose name is dynamically derived from the file name without the .json extension. Make sure the write behavior is set to “Insert” and the table option is set to “Use existing”.

[Screenshot: PieterVanhove_8-1729501385655.png]

    The mapping is based on the output of the stored procedure executed in the previous step. Activate the advanced editor and add dynamic content as shown in the example below.
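
Assuming the Lookup activity from the previous step is named Get JSON metadata (the name is an assumption), the dynamic content would look something like the expression below; it converts the jsonmapping string returned by the stored procedure into the mapping object the Copy activity expects.

@json(activity('Get JSON metadata').output.firstRow.jsonmapping)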

[Screenshot: PieterVanhove_9-1729501401527.png]
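
Once the pipeline has completed, a few simple checks on the database side can confirm that the data landed in the ledger tables and that the ledger is tracking the inserts:

-- Row counts should match the number of documents in the corresponding JSON files.
SELECT COUNT(*) AS PersonRows FROM dbo.Person;

-- Every transaction against the ledger tables is recorded here.
SELECT * FROM sys.database_ledger_transactions;

-- Generate a database digest that you can store outside the database for later verification.
EXECUTE sys.sp_generate_database_ledger_digest;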

Conclusion

    Migrating from Amazon QLDB to Azure SQL Database ledger tables provides a powerful solution for maintaining data integrity and leveraging advanced features of the Azure ecosystem. This guide outlined key considerations and steps involved in the migration process, including data modeling, preparing source files, and configuring the target database. By following the detailed guidance provided, your organization can ensure a smooth transition and take full advantage of the capabilities of Azure SQL Database.

    Useful Links

    To learn more and get started with ledgers in Azure SQL, see:





