
Updated Fabric GitHub Repo for 250M rows of CMS Healthcare data

by info.odysseyx@gmail.com


Last year, I worked with my colleague Inder Rana to build and launch a GitHub repository for using CMS Medicare Part D data within Microsoft Fabric. This repository is intended to provide examples of Fabric’s end-to-end analytics solutions that can be easily deployed by anyone using a Fabric environment. We’ve updated our analytics solution with several important improvements.

  • The Extract, Load, and Transform (ELT) process now runs end to end, from the CMS servers to the Gold layer in the Lakehouse, in less than 20 minutes and with enhanced automation.
  • The repository now contains logic to fetch the new 2022 data, so your solution includes 10 years of data (2013-2022) and almost 250 million rows.
  • There are two simple options for moving data from the CMS servers to the Gold layer in less than 20 minutes:
    1. Spark notebooks orchestrated by pipelines.
    2. Spark notebooks and SQL stored procedures. This option deploys the Gold layer in Fabric Warehouse for users working with SQL and Python.
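The ELT ordering described above (land the raw data first, transform it afterwards on the way to the Gold layer) can be sketched in plain Python. This is only an illustration of the pattern, not code from the repository, which uses Spark notebooks, pipelines, and optionally SQL stored procedures. The sample rows, column names, the intermediate "cleaned" stage, and every function name below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for raw CMS Part D records (not real data).
RAW_ROWS = [
    {"year": "2021", "drug": "DrugA ", "total_cost": "100.50"},
    {"year": "2022", "drug": "druga", "total_cost": "200.25"},
    {"year": "2022", "drug": "DrugB", "total_cost": "not available"},
    {"year": "2022", "drug": "DrugB", "total_cost": "50.00"},
]


def extract_load(rows):
    """Extract + Load: land the raw rows unchanged (the 'EL' in ELT)."""
    return [dict(r) for r in rows]


def clean(landed):
    """Transform, step 1: cast types, normalize names, drop unparseable rows."""
    cleaned = []
    for r in landed:
        try:
            cost = float(r["total_cost"])
        except ValueError:
            continue  # skip rows whose cost is not numeric
        cleaned.append(
            {
                "year": int(r["year"]),
                "drug": r["drug"].strip().lower(),
                "total_cost": cost,
            }
        )
    return cleaned


def to_gold(cleaned):
    """Transform, step 2: aggregate into a reporting-ready table per (year, drug)."""
    totals = defaultdict(float)
    for r in cleaned:
        totals[(r["year"], r["drug"])] += r["total_cost"]
    return dict(totals)


def run_pipeline(rows):
    """Run the stages in order, as an orchestrating pipeline would."""
    return to_gold(clean(extract_load(rows)))


gold = run_pipeline(RAW_ROWS)
for key in sorted(gold):
    print(key, gold[key])
```

In either of the two options above, the same ordering applies; only the engine that performs the transform steps differs.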

The updated GitHub repository can be found at this link. If you find it useful, please leave a “star”! Link: main fabric-samples-healthcare/analytics-bi-directlake-starschema · isinghrana/fabric-samples-hea…

The first option, which uses three Spark notebooks with a single pipeline, is reviewed in the video below. A video reviewing the SQL stored procedure version will be released soon.

Below is a diagram reviewing the new and updated processes.

Logical_Diagram_Star_new.png




