Nextflow using GA4GH TES

by info.odysseyx@gmail.com, September 23, 2024

We are pleased to announce that Nextflow, a powerful workflow management system for data-driven computational pipelines, now fully supports the Global Alliance for Genomics and Health (GA4GH) Task Execution Service (TES). This integration provides seamless scalability, improved efficiency, and robust performance for data processing tasks across cloud and local computing.

What is TES?

The Task Execution Service (TES) API is a standardized schema and API for describing and executing batch execution tasks. It provides a common way to submit and manage tasks across a variety of computing environments, including on-premises high-performance computing and high-throughput computing (HPC/HTC) systems, cloud computing platforms, and hybrid environments. The TES API is designed to be flexible and extensible, so it can be applied to a wide range of use cases, such as "bringing compute to the data" solutions for federated and distributed data analytics, or load balancing across multi-cloud infrastructures.

Why Nextflow and TES?

Nextflow with TES is an ideal choice for managing computational workflows because it abstracts and simplifies the composition of complex data processing tasks. The standardized TES API provides a unified approach to task execution, ensuring compatibility across a variety of computational environments. This integration not only improves portability and scalability, but also significantly reduces the amount of configuration required to set up and manage workflows in a variety of cloud and local computing environments.
This streamlined approach allows researchers and developers to focus more on their core scientific goals rather than the complexities of infrastructure management.

How does it work?

Previously, to run Nextflow pipelines on Azure, you needed to create a configuration file that specifies the compute resources to use. By default, Nextflow uses a single compute configuration for all jobs. An example that defines a separate pool for each VM size is shown below.

    process {
        executor = "azurebatch"
        queue = "Standard_E2d_v4"
        withLabel:process_low { queue = "Standard_E2d_v4" }
        withLabel:process_medium { queue = "Standard_E8d_v4" }
        withLabel:process_high { queue = "Standard_E16d_v4" }
        withLabel:process_high_memory { queue = "Standard_E32d_v4" }
    }

    azure {
        storage {
            accountName = ""
            sasToken = ""
        }
        batch {
            location = ""
            accountName = ""
            accountKey = ""
            autoPoolMode = false
            allowPoolCreation = true
            pools {
                Standard_E2d_v4 {
                    autoScale = true
                    vmType = "Standard_E2d_v4"
                    vmCount = 2
                    maxVmCount = 20
                }
                Standard_E8d_v4 {
                    autoScale = true
                    vmType = "Standard_E8d_v4"
                    vmCount = 2
                    maxVmCount = 20
                }
                Standard_E16d_v4 {
                    autoScale = true
                    vmType = "Standard_E16d_v4"
                    vmCount = 2
                    maxVmCount = 20
                }
                Standard_E32d_v4 {
                    autoScale = true
                    vmType = "Standard_E32d_v4"
                    vmCount = 2
                    maxVmCount = 10
                }
            }
        }
    }

Integrating Nextflow with TES simplifies this configuration. You state the minimum machine requirements for each process through basic Nextflow compute directives (e.g. cpus, memory, disk). TES looks at the available Azure Batch quota and the minimum compute requirements, then selects the lowest-cost available compute that meets those requirements for each process.

    plugins {
        id 'nf-ga4gh'
    }

    process {
        executor = "tes"
    }

    azure {
        storage {
            accountName = ""
            accountKey = ""
        }
    }

    tes.endpoint = ""
    tes.basicUsername = ""
    tes.basicPassword = ""

How to get started

To help you get started quickly, we introduce the nf-hello-gatk project, an example Nextflow pipeline designed to showcase the capabilities of Nextflow.
This pipeline demonstrates how to use Nextflow to analyze genomic data using the Genome Analysis Toolkit (GATK) and how to efficiently scale compute resources by leveraging Azure Batch.

1. Deploy TES on Azure: Follow the TES on Azure deployment guide.

2. Install Nextflow: Follow the Nextflow installation guide to set up Nextflow on your local machine or in a cloud environment.

3. Create the TES configuration: Create the following configuration with your TES and Azure credentials and save it as tes.config.

    process {
        executor = "tes"
    }

    azure {
        storage {
            accountName = ""
            accountKey = ""
        }
    }

    tes.endpoint = ""
    tes.basicUsername = ""
    tes.basicPassword = ""

4. Run the pipeline:

    ./nextflow run seqeralabs/nf-hello-gatk -c tes.config -w 'az://work' --outdir 'az://outputs' -r main

After completion, all results can be found in the blob container prefix specified by --outdir.

Improved workflow management

This integration makes it easier to manage and run Nextflow workflows on Azure Batch. This includes:

- Auto scaling: Dynamically scale computing resources based on the needs of your workflow.
- Cost effectiveness: Optimize your cloud spend with Azure Batch's cost-effective pricing model.
- Seamless integration: Easily interact with Azure Batch using the TES API.

We believe this integration will significantly enhance your data processing capabilities, making it easier to handle large-scale workflows with greater efficiency and cost-effectiveness. Expect more updates and community contributions as we continue to enhance support for Nextflow on Azure Batch using TES.

Acknowledgements

We would like to acknowledge the contributions of Liam Beckman of Oregon Health & Science University Computational Biology and Ben Sherman, Software Engineer at Seqera, who contributed the basic support for Nextflow and TES.
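
Example: expressing per-label requirements with compute directives

The "How does it work?" section explains that, with TES, each process states its minimum requirements through Nextflow compute directives instead of being mapped to a named pool. As a minimal sketch of what that might look like, the fragment below reuses the label names from the Azure Batch example and replaces each queue assignment with cpus, memory, and disk settings. The numeric values are illustrative assumptions only, not recommendations or values from the original configuration, and which VM is actually selected depends on the TES deployment and the available quota.

```groovy
// Sketch only: all resource values are illustrative assumptions.
plugins {
    id 'nf-ga4gh'
}

process {
    executor = 'tes'

    // Defaults for processes that carry no label (assumed values)
    cpus = 2
    memory = '16 GB'
    disk = '50 GB'

    // Minimum requirements per label; TES picks compute that satisfies them
    withLabel: process_low {
        cpus = 2
        memory = '16 GB'
    }
    withLabel: process_medium {
        cpus = 8
        memory = '64 GB'
    }
    withLabel: process_high {
        cpus = 16
        memory = '128 GB'
    }
    withLabel: process_high_memory {
        cpus = 32
        memory = '256 GB'
    }
}
```

These settings could be merged into tes.config alongside the azure and tes blocks shown earlier. Pipelines that already declare resources in their own process definitions may not need the overrides at all.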