Now Available on Genomics Data Lake on Azure

by info.odysseyx@gmail.com


Introducing the Open Targets dataset: Now available in the Genomics Data Lake on Azure for advanced biomedical research

Introduction

Biomedical research is accelerating at an unprecedented rate, driven by the vast amounts of data generated by genetics research, drug development, and disease research. Today, the Open Targets dataset is available in the Genomics Data Lake on Azure. The data can be integrated directly into research workflows, providing a rich resource for exploring gene-disease associations, drug targets, and biomedical mechanisms.

Azure’s cloud-based solutions not only make these datasets more accessible, but also let you combine them with machine learning, analytics, and AI-based tools to gain deeper insights and drive innovation in areas such as drug discovery, personalized medicine, healthcare, and genomics.

Dataset overview

The Open Targets Consortium is a public-private collaborative research partnership that aims to systematically identify and prioritize drug targets. Its flagship informatics platform integrates genetic and molecular evidence related to targets and diseases, and includes extensive data on the genetic basis of diseases, on drugs, and on the identification of potential therapeutic targets.

Open Targets provides critical data integrating genetics, genomics, and drug information, enabling researchers to identify and prioritize drug targets for complex diseases. It supports drug discovery and development by providing insights into gene-disease associations, molecular interactions, and drug mechanisms. These datasets improve our understanding of the genetic basis of diseases and of drug effects, fostering better treatment strategies and precision medicine. They are widely used in target validation, drug repositioning, and the study of drug side effects.

Key datasets

This release provides comprehensive access to 25 datasets in JSON and other file formats that can be integrated directly into analysis workflows. The datasets fall into the following categories:

  • Drug data: Mechanisms of action, indications, pharmacovigilance, and pharmacogenetics
  • Target-disease associations: Curated data linking specific genes to diseases, which helps researchers better understand disease pathways and mechanisms
  • Target, disease, and drug annotations: Key annotations for molecular targets, diseases, and drugs
  • Molecular interactions: Target interactions and supporting evidence
  • Expression and phenotypes: Baseline expression, animal model phenotypes, and gene ontology
  • Pathways and essentiality: Reactome pathways and DepMap essentiality for targets

Importance for research and drug development

This rich data set opens up numerous opportunities for researchers in a variety of fields. Here are some use cases:

  1. Identifying drug targets for Alzheimer’s disease

Publication: “Genome-wide association studies identify novel loci and functional pathways influencing Alzheimer’s disease risk” (Kunkle, BW et al.) (2019) Nature Genetics.

Summary: Using the Open Targets dataset, researchers integrated genetic linkage data with functional genomics, which helped prioritize genes and pathways associated with Alzheimer’s disease. This approach has identified potential therapeutic targets that can be further investigated to develop treatments for Alzheimer’s disease.

  2. Understanding the genetic basis of inflammatory bowel disease (IBD)

Publication: “Genetic Risk Factors for Inflammatory Bowel Disease” (De Lange, KM et al.) (2017) Nature Genetics.

Summary: This study used the Open Targets dataset to identify and prioritize genetic variants associated with IBD. By linking genetic associations to specific genes and pathways, researchers gained valuable insight into the underlying mechanisms of the disease, which may help develop new treatment strategies.

  3. Repurposing drugs for COVID-19

Publication: “Drug Repurposing for COVID-19: A Systematic Review” (Zhou, Y. et al.) (2020) Nature Reviews Drug Discovery.

Summary: Researchers used the Open Targets dataset to analyze and prioritize drug targets for COVID-19. This analysis helped identify existing drugs that could be repurposed to treat COVID-19 and provided a list of potential candidates for clinical trials.

How to access datasets in Azure

Please note:

We are enabling public access to all Genomics Data Lake containers. Existing “signed URLs” (shared access signatures) will be retired on 2024-11-04T00:00:00Z. After that time, URLs without query strings will continue to work, but signed URLs will stop working and will return a 403 HTTP status code. Plan to switch to the public URLs without query strings by removing the ‘?’ and everything after it.
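Converting a signed URL to its public form only requires dropping the query string. A minimal sketch using Python's standard library follows; the function name `strip_sas_token` and the placeholder token values are illustrative assumptions, not part of the original post.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_sas_token(signed_url: str) -> str:
    """Return the URL with its query string (the SAS token) and fragment removed."""
    parts = urlsplit(signed_url)
    # Rebuild the URL from scheme, host, and path only.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical signed URL: the token values are placeholders, not real credentials.
signed = (
    "https://datasetopentargets.blob.core.windows.net/dataset/17.02/"
    "17.02_association_data.json.gz?sv=PLACEHOLDER&sig=PLACEHOLDER"
)
print(strip_sas_token(signed))
```

A URL that has no query string passes through unchanged, so the helper can be applied to a mixed list of old and new links.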

Accessing this dataset in Azure is simple and can be integrated into a variety of Azure services for analysis and visualization. Here’s how to get started:

  1. Use AzCopy

Prerequisites: AzCopy installed on your machine.

Steps:

  1. Get the SAS URL of the blob container or file you want to download. You can find the URL here
  2. Open a command line (such as Command Prompt, Terminal, or PowerShell).
  3. Run the following command to download data from Azure Blob storage.
azcopy copy "https://datasetopentargets.blob.core.windows.net/dataset/17.02/17.02_association_data.json.gz" "C:\Users\YourUser\Downloads\"

This will copy the blob “17.02_association_data.json.gz” to the download directory.
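Where AzCopy is not installed, a public blob URL without a query string can also be fetched with a plain HTTPS request. The sketch below uses only Python's standard library and streams the response to disk in chunks; the helper name `download_public_blob` and the destination path are assumptions made for illustration, and the approach relies on the container allowing anonymous read access as described in the note above.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def download_public_blob(url: str, dest: str, chunk_size: int = 1024 * 1024) -> int:
    """Stream the resource at `url` to `dest` and return the number of bytes written."""
    dest_path = Path(dest)
    # Create the destination folder if it does not exist yet.
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response, open(dest_path, "wb") as out_file:
        # Copy in chunks so a large file is never held fully in memory.
        shutil.copyfileobj(response, out_file, chunk_size)
    return dest_path.stat().st_size
```

For example, calling `download_public_blob("https://datasetopentargets.blob.core.windows.net/dataset/17.02/17.02_association_data.json.gz", "downloads/17.02_association_data.json.gz")` would save the same blob as the AzCopy command above. AzCopy remains the better fit for whole folders or very large transfers.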

  2. Use the Python SDK
# Install the Azure Storage Blob library for Python (shell command)
pip install azure-storage-blob

# Import the necessary library
from azure.storage.blob import BlobClient

# Download the blob by specifying its URL and the local file path.
# Once signed URLs are retired, use this URL without the "?" and everything after it.
sas_url = "https://datasetopentargets.blob.core.windows.net/dataset/17.02/17.02_association_data.json.gz?sv=2023-01-03&st=2024-10-24T21%3A20%3A22Z&se=2026-10-25T21%3A20%3A00Z&sr=c&sp=rl&sig=9EI4PbUvTkT%2F0jUCg5aNLP5CBlu1bUDsyK6TDFzZacw%3D"

local_path = "path/to/save/file"

# Create a BlobClient from the blob URL
blob_client = BlobClient.from_blob_url(sas_url)

# Download the blob content to a local file
with open(local_path, "wb") as download_file:
    download_stream = blob_client.download_blob()
    download_file.write(download_stream.readall())

Run the script. The blob will be downloaded and saved to the specified location.
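Once the file is on disk, a quick look at a few records helps confirm the download before wiring it into a larger workflow. The sketch below assumes the `.json.gz` file is gzip-compressed, newline-delimited JSON (one object per line); that layout is an assumption and is not stated in this post, so adjust the parsing if the file turns out to be a single JSON document. The names `iter_records` and `peek` are hypothetical helpers.

```python
import gzip
import json
from itertools import islice
from typing import Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a gzip-compressed file."""
    # Assumption: the file is newline-delimited JSON, one record per line.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def peek(path: str, n: int = 3) -> list[dict]:
    """Return the first `n` records without reading the rest of the file."""
    return list(islice(iter_records(path), n))
```

For instance, `peek("path/to/save/file")` would return the first three records of the file saved by the script above. Because records are read lazily, this works on files too large to load into memory at once.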

  3. Use Azure Storage Explorer

Prerequisites: Azure Storage Explorer installed on your machine.

Steps:

  1. Open Azure Storage Explorer.
  2. Click “Add Account” to connect to your Azure account, or use “Connect to Azure Storage Container” and choose one of the following:
    • Select “Use Shared Access Signature (SAS) URI” and paste the SAS URL for your blob container, or
    • Select “Anonymous (my blob container allows public access)” (after 11/19/2024, once public access is enabled on all datasets).
  3. Navigate to the Blob container on the left panel where your data is stored.
  4. Right-click the blob or folder you want to download and select “Download”.
  5. Select a destination folder on your local computer.
  6. Blob data is downloaded to the specified location.

We encourage researchers to explore the Open Targets dataset to accelerate breakthroughs in target prioritization!

Acknowledgments:

We would like to thank Annalisa Buniello, Manuel Bernal Llinares, Roberto LLeras, and Matt Mcloughlin for making the data available in Azure, and Helena Cornu for help writing the blog.




