Restoring Soft-Deleted Blobs with Multithreading in Azure Storage Using C#

by info.odysseyx@gmail.com, September 3, 2024

Soft deletion of blobs is an essential feature for protecting against accidental deletion or overwriting. By retaining deleted data for a specified period of time, it keeps data intact and recoverable even in the case of human error. However, restoring data from the soft-deleted state can be labor-intensive, because the Undelete API has to be called for each individual deleted blob; there is currently no option to undelete all blobs in bulk.

This blog provides sample C# code that helps you restore soft-deleted data efficiently. The code is especially effective when you have a large number of blobs to restore, because it runs many undelete operations concurrently to speed up the restore process. The program can also be configured to undelete blobs within a specific container or directory instead of scanning the entire storage account.

To run the program, follow these steps:

1. Install the .NET SDK: make sure the .NET SDK is installed on your computer.

2. Connect to your Azure account:

   Connect-AzAccount

3. Register the nuget.org package source if it is not already configured:

   dotnet nuget add source https://api.nuget.org/v3/index.json -n nuget.org

4. Create a new console application:

   dotnet new console --force

5. Add the following code to Program.cs:

using Azure.Core;
using Azure.Identity;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

var StorageAccountName = "xxxx";   // replace with your storage account name
var ContainerName = "xxxx";        // replace with your container name, or leave empty to scan all containers
var DirectoryPath = "";            // optional directory prefix; leave empty to scan the whole container
var Concurrency = 500;             // maximum number of undelete operations in flight at once
var BatchSize = 500;               // how many operations to queue before waiting for the batch to finish

static DataLakeServiceClient GetDatalakeClient(string accountName)
{
    DataLakeClientOptions clientOptions = new DataLakeClientOptions()
    {
        Retry =
        {
            Delay = TimeSpan.FromMilliseconds(500),
            MaxRetries = 5,
            Mode = RetryMode.Fixed,
            MaxDelay = TimeSpan.FromSeconds(5),
            NetworkTimeout = TimeSpan.FromSeconds(30)
        },
    };
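    // DefaultAzureCredential (used below) tries a chain of credential sources in
    // turn (environment variables, managed identity, Visual Studio, Azure CLI,
    // Azure PowerShell, and so on), so the Connect-AzAccount sign-in from the
    // setup steps is one way to satisfy it. Whichever identity it picks up needs
    // data-plane write access to the storage account (for example, the Storage
    // Blob Data Contributor role) for the undelete calls to succeed.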
    // The ".blob.core.windows.net" endpoint suffix only works for the production
    // (public) Azure cloud.
    DataLakeServiceClient client = new(
        new Uri($"https://{accountName}.blob.core.windows.net"),
        new DefaultAzureCredential(),
        clientOptions);
    return client;
}

Console.WriteLine("Starting the program");
var client = GetDatalakeClient(StorageAccountName);

// Limit how many undelete operations run at the same time.
var throttler = new SemaphoreSlim(initialCount: Concurrency);
List<Task> tasks = new List<Task>();
List<string> containerNames = new List<string>();

// If no container is specified, scan every container in the account.
if (string.IsNullOrEmpty(ContainerName))
{
    var containers = client.GetFileSystems();
    foreach (var container in containers)
    {
        containerNames.Add(container.Name);
    }
}
else
{
    containerNames.Add(ContainerName);
}

var totalSuccessCount = 0;
var totalFailedCount = 0;

foreach (var container in containerNames)
{
    Console.WriteLine($"Recovering for container {container}");
    var fileSystem = client.GetFileSystemClient(container);
    var deletedItems = fileSystem.GetDeletedPaths(pathPrefix: DirectoryPath);
    var count = 0;
    var totalSuccessCountForContainer = 0;
    var totalFailedCountForContainer = 0;

    foreach (PathDeletedItem item in deletedItems)
    {
        await throttler.WaitAsync();
        count++;
        try
        {
            // Start the undelete and release the semaphore slot when it completes.
            var task = fileSystem.UndeletePathAsync(item.Path, item.DeletionId);
            var continuedTask = task.ContinueWith(t =>
            {
                throttler.Release();
                if (t.IsFaulted)
                {
                    Interlocked.Increment(ref totalFailedCount);
                    Interlocked.Increment(ref totalFailedCountForContainer);
                    Console.WriteLine($"Failed count for container {totalFailedCountForContainer}, total failed count {totalFailedCount}, path {DirectoryPath + item.Path} due to {t.Exception.Message}");
                }
                else
                {
                    Interlocked.Increment(ref totalSuccessCount);
                    Interlocked.Increment(ref totalSuccessCountForContainer);
                    Console.WriteLine($"Success count for container {totalSuccessCountForContainer}, total success count {totalSuccessCount}");
                }
            });
            tasks.Add(continuedTask);
        }
        catch (Exception ex)
        {
            Console.WriteLine("Failed to create task: " + ex.ToString());
        }
        finally
        {
            // Wait for the current batch to finish before queuing more work.
            if (count == Math.Max(Concurrency, BatchSize))
            {
                count = 0;
                await Task.WhenAll(tasks);
                tasks.Clear();
            }
        }
    }

    await Task.WhenAll(tasks);
    Console.WriteLine($"Recover finished for container {container}");
}

Replace xxxx with your storage account name and container name. If you only need to restore a specific directory, set DirectoryPath to that directory name; otherwise leave it blank to scan the entire container. The program runs up to 500 concurrent undelete operations by default (the Concurrency and BatchSize variables); adjust these numbers as needed.

Add the required packages, then build and run the application:

dotnet add package Azure.Identity
dotnet add package Azure.Storage.Files.DataLake
dotnet build --configuration Release
dotnet run

While the application runs, you can monitor the console window to track its progress and spot any problems or errors.
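If you want to sanity-check the setup before kicking off a large restore, a minimal, read-only variant of the same code can help. The sketch below uses the same DataLake SDK calls and the same xxxx placeholders as the main program (replace them with your own values); it only lists the soft-deleted paths in one container, without restoring anything, so you can confirm permissions, the container name, and the DirectoryPath prefix, and get a rough idea of how many items a full run would restore.

using Azure.Identity;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

var storageAccountName = "xxxx";   // placeholder: your storage account name
var containerName = "xxxx";        // placeholder: the container to inspect
var directoryPath = "";            // optional prefix; leave empty to scan the whole container

var serviceClient = new DataLakeServiceClient(
    new Uri($"https://{storageAccountName}.blob.core.windows.net"),
    new DefaultAzureCredential());

var fileSystem = serviceClient.GetFileSystemClient(containerName);

// Enumerate soft-deleted paths without restoring them.
var found = 0;
foreach (PathDeletedItem item in fileSystem.GetDeletedPaths(pathPrefix: directoryPath))
{
    found++;
    Console.WriteLine($"{item.Path} (deletion id: {item.DeletionId})");
}

Console.WriteLine($"Found {found} soft-deleted path(s) in {containerName}.");

If this lists the paths you expect, the main program should be able to restore them using the same identity and configuration.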