Java, CRaC, Azure, optimization, start time

by info.odysseyx@gmail.com, October 16, 2024

Overview

Java applications often experience startup delays due to runtime initialization and class loading. In the cloud-native era, this problem becomes more pronounced because applications start and stop more frequently as they scale to accommodate dynamic traffic demands. Coordinated Restore at Checkpoint (CRaC) addresses this by letting you checkpoint a running application and restore it later, which avoids the lengthy startup work after the first initialization.

In an experiment on the Spring PetClinic project, we saw a 7x improvement in startup speed after enabling CRaC on Azure Kubernetes Service. In the final section, we will discuss the limitations of CRaC and its potential for future development. We welcome your feedback to help us continue to improve and optimize Java on Azure. Please feel free to share your thoughts in the comments section at the end of this article.

Next, let’s look at how to:

1. Package and containerize Java applications locally.
2. Deploy to Azure Kubernetes Service (AKS).
3. Create checkpoints using CRaC.
4. Create a new application to restore from a checkpoint.
5. Compare startup performance between the original and restored applications.

Java application packaging

Before deploying a Java application to AKS, you need to package it and create a container image. To clone and package your application, follow these steps:

1. Clone the repository and change into it:

    git clone -b crac-poc https://github.com/leonard520/spring-petclinic.git
    cd spring-petclinic

This repository is a fork of the official Spring PetClinic project. The only modification I made was adding the Spring CRaC dependency.

2. Create a Dockerfile: The Dockerfile defines how to containerize your application.
The Azul Zulu JVM is used here because it provides good support for CRaC. The Java startup parameters also set the location where checkpoint images will be stored.

    FROM azul/zulu-openjdk:17-jdk-crac-latest AS builder
    WORKDIR /home/app
    ADD . /home/app/spring-petclinic
    RUN cd spring-petclinic && ./mvnw -Dmaven.test.skip=true clean package

    FROM azul/zulu-openjdk:17-jdk-crac-latest
    WORKDIR /home/app
    EXPOSE 8080
    COPY --from=builder /home/app/spring-petclinic/target/*.jar petclinic.jar
    ENTRYPOINT ["java", "-XX:CRaCCheckpointTo=/test", "-jar", "petclinic.jar"]

3. Build the Docker image: Build the image using Docker.

    docker build -t spring-petclinic:crac .

Create a deployment in Azure Kubernetes Service

Now that your application is containerized, you can deploy it to AKS. Please follow these steps:

1. Create an AKS cluster: If you don’t have an AKS cluster, create one using the Azure CLI.

    az aks create --resource-group myResourceGroup --name myAKSCluster --node-count 1 --enable-addons monitoring --generate-ssh-keys

2. Push the Docker image to Azure Container Registry (ACR): If you use Azure Container Registry, tag your image and push it to ACR. Replace the angle-bracket placeholders in the commands below with your own values.

    docker tag spring-petclinic:crac <acr-name>.azurecr.io/spring-petclinic:crac
    docker push <acr-name>.azurecr.io/spring-petclinic:crac

3. Create an image pull secret for ACR:

    kubectl create secret docker-registry regcred --docker-server=<acr-name>.azurecr.io --docker-username=<username> --docker-password=<password>

4. Create an Azure file share to mount into your deployment: Checkpoint restore speed is closely tied to disk performance, so we recommend using Azure Storage in the same region.
    az storage account create --name mystorageaccount --resource-group myResourceGroup --location eastus --kind FileStorage --sku Premium_LRS
    az storage share-rm create --resource-group myResourceGroup --storage-account mystorageaccount --name myfileshare
    az storage account keys list --resource-group myResourceGroup --account-name mystorageaccount
    kubectl create secret generic azure-secret --from-literal=azurestorageaccountname=mystorageaccount --from-literal=azurestorageaccountkey=<storage-account-key>

5. Create a Kubernetes deployment: Create a deployment YAML file (`deployment.yaml`) for your application:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
            - name: myapp
              image: <acr-name>.azurecr.io/spring-petclinic:crac
              ports:
                - containerPort: 8080
              securityContext:
                allowPrivilegeEscalation: false
                capabilities:
                  add:
                    # These two capabilities are required to checkpoint
                    - SYS_PTRACE
                    - CHECKPOINT_RESTORE
                privileged: false
              volumeMounts:
                - name: crac-storage
                  mountPath: /test
          volumes:
            - name: crac-storage
              csi:
                driver: file.csi.azure.com
                volumeAttributes:
                  secretName: azure-secret
                  shareName: myfileshare
                  mountOptions: 'dir_mode=0777,file_mode=0777,cache=strict,actimeo=30,nosharesock,nobrl'
          imagePullSecrets:
            - name: regcred

6. Deploy to AKS: Apply the deployment to your AKS cluster.

    kubectl apply -f deployment.yaml

7.
Check startup log and duration:

    kubectl logs -l app=myapp

    [Spring PetClinic ASCII-art banner]
    :: Built with Spring Boot :: 3.3.0

    2024-09-26T14:59:41.464Z INFO 129 --- [ main] o.s.s.petclinic.PetClinicApplication : Starting PetClinicApplication v3.3.0-SNAPSHOT using Java 17.0.12 with PID 129 (/home/app/petclinic.jar started by root in /home/app)
    2024-09-26T14:59:41.470Z INFO 129 --- [ main] o.s.s.petclinic.PetClinicApplication : No active profile set, falling back to 1 default profile: "default"
    2024-09-26T14:59:42.994Z INFO 129 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
    2024-09-26T14:59:43.071Z INFO 129 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 66 ms. Found 2 JPA repository interfaces.
    2024-09-26T14:59:44.125Z INFO 129 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port 8080 (http)
    2024-09-26T14:59:44.134Z INFO 129 --- [ main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
    2024-09-26T14:59:44.135Z INFO 129 --- [ main] o.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/10.1.24]
    2024-09-26T14:59:44.176Z INFO 129 --- [ main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
    2024-09-26T14:59:44.178Z INFO 129 --- [ main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 2595 ms
    2024-09-26T14:59:44.560Z INFO 129 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
    2024-09-26T14:59:44.779Z INFO 129 --- [ main] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Added connection conn0: url=jdbc:h2:mem:131e3017-7e28-4a31-b704-5d3840cd46d6 user=SA
    2024-09-26T14:59:44.781Z INFO 129 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
    2024-09-26T14:59:45.011Z INFO 129 --- [ main] o.hibernate.jpa.internal.util.LogHelper : HHH000204: Processing PersistenceUnitInfo [name: default]
    2024-09-26T14:59:45.073Z INFO 129 --- [ main] org.hibernate.Version : HHH000412: Hibernate ORM core version 6.5.2.Final
    2024-09-26T14:59:45.113Z INFO 129 --- [ main] o.h.c.internal.RegionFactoryInitiator : HHH000026: Second-level cache disabled
    2024-09-26T14:59:45.451Z INFO 129 --- [ main] o.s.o.j.p.SpringPersistenceUnitInfo : No LoadTimeWeaver setup: ignoring JPA class transformer
    2024-09-26T14:59:46.466Z INFO 129 --- [ main] o.h.e.t.j.p.i.JtaPlatformInitiator : HHH000489: No JTA platform available (set 'hibernate.transaction.jta.platform' to enable JTA platform integration)
    2024-09-26T14:59:46.468Z INFO 129 --- [ main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
    2024-09-26T14:59:46.826Z INFO 129 --- [ main] o.s.d.j.r.query.QueryEnhancerFactory : Hibernate is in classpath; If applicable, HQL parser will be used.
    2024-09-26T14:59:48.666Z INFO 129 --- [ main] o.s.b.a.e.web.EndpointLinksResolver : Exposing 14 endpoints beneath base path '/actuator'
    2024-09-26T14:59:48.778Z INFO 129 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path '/'
    2024-09-26T14:59:48.810Z INFO 129 --- [ main] o.s.s.petclinic.PetClinicApplication : Started PetClinicApplication in 8.171 seconds (process running for 8.862)

As you can see, startup takes just over 8 seconds.

Create checkpoints using CRaC

Once the application is running, the next step is to create checkpoints using CRaC.

1. Create a checkpoint: When the application reaches the desired state (for example, after it is fully initialized), issue a checkpoint command. CRaC captures the state of the application so that it can be restored later for a fast startup.
The checkpoint image is stored on an external volume, in the Azure Storage file share created earlier. Replace the pod name placeholder with the name of your running pod.

    kubectl exec -it <pod-name> -- jcmd petclinic JDK.checkpoint

Restore from checkpoint

Now that you’ve created a checkpoint, you can deploy the application so that it restores from this saved state for a quick start.

1. Update your deployment to restore from the checkpoint image: Modify your deployment YAML to use the restore command when starting the container.

    containers:
      - command:
          - java
          - -XX:CRaCRestoreFrom=/test

Apply your changes.

    kubectl apply -f deployment.yaml

2. Check start time:

    kubectl logs -l app=myapp

    2024-09-26T15:01:42.400Z INFO 129 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Restarting Spring-managed lifecycle beans after JVM restore
    2024-09-26T15:01:42.396Z WARN 129 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=4m9s910ms846µs988ns).
    2024-09-26T15:01:42.473Z INFO 129 --- [Attach Listener] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port 8080 (http) with context path '/'
    2024-09-26T15:01:42.474Z INFO 129 --- [Attach Listener] o.s.c.support.DefaultLifecycleProcessor : Spring-managed lifecycle restart completed (restored JVM running for 1009 ms)

This time it started in just over a second!

Performance comparison

The final step is to compare the startup times of the original and restored versions of the application.

1. Measure start time: Measure the time from container startup to application readiness for both the original and restored applications. Compared to the original startup time of over 8 seconds, restoring from a checkpoint reduced the startup time to just over 1 second, a 7x improvement. Moreover, this significant improvement requires only the addition of the CRaC dependency, without any additional code modifications.

2.
Compare results: CRaC-enabled applications should exhibit much faster startup times because they restore from pre-initialized checkpoints. You can achieve this by giving your Java application enough time to finish initializing and then creating the checkpoint.

Conclusion

In this post, we looked at how to use CRaC to accelerate the startup of Java applications running on Azure Kubernetes Service. By checkpointing fully initialized applications and restoring them later, you can significantly reduce startup times and improve both cold and warm start performance in containerized environments. CRaC is a promising technology, especially in environments where fast application startup is important, such as serverless platforms or microservices architectures.

In comparison, Spring Native is another way to improve startup performance. Spring Native allows developers to compile Spring applications to native binaries using GraalVM, which provides very fast startup and low memory usage, making it ideal for short-lived stateless services. CRaC maintains full JVM functionality, while Spring Native may require code tweaks and has longer build times.

However, as a relatively new technology, CRaC has its own limitations. For example, many third-party libraries do not yet support CRaC. Spring Boot, Quarkus, and Micronaut all currently support CRaC, but many other frameworks and libraries still need to be adjusted for CRaC compatibility. Additionally, your application must close any open file handles before capturing the checkpoint. For more information, refer to https://github.com/CRaC/docs/blob/master/fd-policies.md. CRaC also requires that the environment at the time of checkpoint creation closely matches the environment during restoration.

We will continue to monitor these restrictions closely and work with the community to improve broader applicability. We’d also like to hear your thoughts on this technology. Your feedback helps us improve the way Java runs on Azure.
Please feel free to share your thoughts in the comments section at the end of this article.
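As a footnote to the file-handle limitation mentioned in the conclusion, here is a minimal sketch of the close-before-checkpoint, reopen-after-restore pattern. It is an illustration under stated assumptions, not the PetClinic code: `CheckpointAware`, `AuditLog`, and `CheckpointDemo` are hypothetical names, and `CheckpointAware` is a local stand-in for the two callbacks of the real `org.crac.Resource` interface so that the example compiles without the org.crac dependency. In a real application you would implement `org.crac.Resource` (whose callbacks take a `Context` argument) and register the object with `Core.getGlobalContext().register(...)`. The `run()` method only simulates the cycle by calling the hooks directly; no checkpoint is taken.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical stand-in for the two callbacks of org.crac.Resource,
// declared locally so the sketch has no external dependencies.
interface CheckpointAware {
    void beforeCheckpoint() throws IOException;
    void afterRestore() throws IOException;
}

// Hypothetical component that keeps a file open while the application runs.
class AuditLog implements CheckpointAware {
    private final Path path;
    private BufferedWriter writer;

    AuditLog(Path path) throws IOException {
        this.path = path;
        open();
    }

    private void open() throws IOException {
        // APPEND keeps entries that were written before the checkpoint.
        writer = Files.newBufferedWriter(path,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    void write(String line) throws IOException {
        writer.write(line);
        writer.newLine();
        writer.flush();
    }

    @Override
    public void beforeCheckpoint() throws IOException {
        // Release the file handle so that it is not open at checkpoint time.
        writer.close();
        writer = null;
    }

    @Override
    public void afterRestore() throws IOException {
        // Reopen the file once the process has been restored.
        open();
    }
}

class CheckpointDemo {
    // Simulates the cycle by invoking the hooks directly.
    static List<String> run() {
        try {
            Path file = Files.createTempFile("audit", ".log");
            AuditLog log = new AuditLog(file);
            log.write("before checkpoint");
            log.beforeCheckpoint(); // handle closed; a checkpoint would happen here
            log.afterRestore();     // handle reopened after the simulated restore
            log.write("after restore");
            log.beforeCheckpoint(); // reuse the hook to close the file before reading
            return Files.readAllLines(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Opening the file in append mode is the design choice that matters here: the component can drop its handle at any time and pick up where it left off, which is the property a checkpoint needs.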