A guide to Weaviate vector database benchmarks on Azure Kubernetes Service and Azure NetApp Files
by info.odysseyx@gmail.com, September 10, 2024

Contents: Introduction, Prerequisites, Install Weaviate, Approximate Nearest Neighbor (ANN) Benchmarks, ANN Benchmarks Setup, ANN Benchmarks – GloVe 100 Angular, ANN Benchmarks – Sift 128 Euclidean, ANN Benchmarks Analysis, Results, Conclusion, Additional Information

Introduction

In this era of generative AI, the ability to process and analyze large datasets with precision and speed is not just advantageous; it’s essential. Vector databases, such as Weaviate, play a pivotal role in the infrastructure that powers generative AI applications, from natural language processing to image generation. These databases efficiently handle the similarity search operations at the core of generative models, enabling them to parse vast datasets and identify patterns that drive the creation of new, synthetic content. Deploying Weaviate on Azure Kubernetes Service (AKS), with high-performance Azure NetApp Files (ANF) as the back-end storage, creates a scalable foundation that meets the demanding requirements of generative AI models.

This blog post guides you through setting up Weaviate on AKS, backed by Azure NetApp Files. We then benchmark our setup with ANN-Benchmarks, the established framework for testing approximate nearest neighbor search algorithms with vector databases, to quantitatively measure Weaviate’s performance in a controlled environment.

Follow along as we walk through the deployment and benchmarking steps, providing a clear view of Weaviate’s performance in a cloud environment. By the end, you’ll understand how to deploy a scalable vector search solution and what to expect from its performance on Azure’s infrastructure.
Co-authors: Michael Haigh, Senior Technical Marketing Engineer, and Kyle Radder, Technical Marketing Engineer (NetApp)

Prerequisites

If you’ll be following along step by step, be sure to have the resources that the steps below rely on at your disposal: an AKS cluster (ours uses a Standard_D64_v4 node), Azure NetApp Files Ultra storage that the cluster can provision volumes from (the azure-netapp-files-ultra storage class in our output), and a Linux workstation VM deployed in the same virtual network as the AKS cluster, with kubectl and Helm available.

Install Weaviate

We use the Weaviate Helm chart to install Weaviate on the AKS cluster. First, SSH to the Linux VM that’s deployed in the same virtual network as your AKS cluster, then add the Weaviate repository:

    helm repo add weaviate https://weaviate.github.io/weaviate-helm
    helm repo update

To view the possible configuration values for the Weaviate Helm chart, run the following command:

    helm show values weaviate/weaviate

Depending on your generative AI application, you may want to configure additional Weaviate replica pods or enable local machine learning (ML) models. For our performance benchmarking, we leave all the defaults except for the following settings:

    cat <<EOF > values.yaml
    storage:
      size: 30Ti
    service:
      annotations:
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    grpcService:
      annotations:
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    EOF

As mentioned in the prerequisites, a 30TiB Azure NetApp Files Ultra volume provides 30Gbps of throughput, which is roughly equivalent to the 30,000Mbps of bandwidth provided by the Standard_D64_v4 AKS node. If you’re using a smaller AKS node, you can reduce your volume size to match its bandwidth (each TiB of an Ultra volume provides 1Gbps of throughput). The other two Helm settings assign internal IP addresses to the HTTP and gRPC Weaviate services, so network traffic stays confined to our internal virtual network.
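The sizing rule above is simple arithmetic, and a small helper can make it explicit. The sketch below assumes the figure quoted in this post (about 1Gbps of throughput per TiB of Ultra volume); the function name and the 12,500Mbps example node are hypothetical, so check the current service-level documentation and your node's actual bandwidth before relying on the numbers.

```python
import math

# Assumption taken from the text above: an Ultra volume provides roughly
# 1 Gbps of throughput for each provisioned TiB.
ULTRA_GBPS_PER_TIB = 1.0


def volume_size_tib_for_node(node_bandwidth_mbps: float,
                             gbps_per_tib: float = ULTRA_GBPS_PER_TIB) -> int:
    """Return the smallest whole-TiB volume whose throughput covers the node bandwidth."""
    if node_bandwidth_mbps <= 0 or gbps_per_tib <= 0:
        raise ValueError("bandwidth and per-TiB throughput must be positive")
    node_bandwidth_gbps = node_bandwidth_mbps / 1000.0
    # Round up so the volume never becomes the bottleneck.
    return math.ceil(node_bandwidth_gbps / gbps_per_tib)


if __name__ == "__main__":
    # Standard_D64_v4 bandwidth quoted in this post: 30,000 Mbps.
    print(volume_size_tib_for_node(30_000))
    # Hypothetical smaller node with 12,500 Mbps of bandwidth.
    print(volume_size_tib_for_node(12_500))
```

The result would go into the storage.size value in values.yaml (for example, 13Ti for the hypothetical smaller node).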
To deploy Weaviate with these values, run the following command:

    helm install weaviate -n weaviate --create-namespace weaviate/weaviate -f values.yaml

To check on the status of the deployment, run the following command:

    kubectl -n weaviate get all,pvc

It takes less than a minute to get the external IPs populated, and about 5 to 10 minutes for the volume to go into a Bound state:

    $ kubectl -n weaviate get all,pvc
    NAME             READY   STATUS    RESTARTS   AGE
    pod/weaviate-0   1/1     Running   0          8m21s

    NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
    service/weaviate            LoadBalancer   172.16.213.188   10.20.0.8     80:31961/TCP      8m21s
    service/weaviate-grpc       LoadBalancer   172.16.23.238    10.20.0.9     50051:30943/TCP   8m21s
    service/weaviate-headless   ClusterIP      None             <none>        80/TCP            8m21s

    NAME                        READY   AGE
    statefulset.apps/weaviate   1/1     8m21s

    NAME                                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS               AGE
    persistentvolumeclaim/weaviate-data-weaviate-0   Bound    pvc-4c354c0d-29fe-4af8-a611-35e993c9ecab   30Ti       RWO            azure-netapp-files-ultra   8m21s

Depending on your virtual network settings, your external IPs will probably be different, but verify that they’re RFC 1918 internal IP addresses to ensure that network traffic stays on the internal virtual network. Take note of these IPs for use in the next section. Once the volume is bound and the weaviate-0 pod is in a Running state, we’re ready to start performance testing.

Approximate Nearest Neighbor (ANN) Benchmarks

ANN-Benchmarks is a benchmarking environment for approximate nearest neighbor (ANN) algorithms. ANN algorithms are used to find the nearest neighbors to a point in a dataset, where approximate means that the algorithm is allowed to return points that are close to the nearest neighbors, rather than the exact ones. This trade-off enables significantly faster processing times, which is especially useful when dealing with very large datasets.
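To make the trade-off concrete, here is a minimal sketch of the exact search that ANN algorithms approximate. It is not how Weaviate searches; it is the brute-force baseline, which compares the query against every stored vector and therefore slows down linearly as the dataset grows. The function names and the toy data are illustrative only.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_nearest_neighbors(dataset: Sequence[Sequence[float]],
                            query: Sequence[float],
                            k: int) -> List[Tuple[int, float]]:
    """Return (index, distance) for the k vectors closest to the query.

    Every vector is examined, so the cost grows with the dataset size.
    ANN indexes avoid this full scan and accept a small loss of accuracy.
    """
    scored = ((euclidean(vector, query), index)
              for index, vector in enumerate(dataset))
    return [(index, distance) for distance, index in heapq.nsmallest(k, scored)]


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (1.0, 0.0), (10.0, 10.0)]
    for index, distance in exact_nearest_neighbors(points, (0.9, 0.2), k=2):
        print(index, round(distance, 3))
```

The ground-truth neighbors that a benchmark compares against are exact results of this kind, which is what allows recall to be measured later.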
ANN Benchmarks Setup

ANN algorithms are an effective tool for testing vector databases due to their efficiency with these large datasets, which are typical in real-world applications such as recommendation systems and natural language processing. By simulating practical use cases, ANN benchmarks allow the evaluation of a vector database’s ability to balance accuracy and speed, a critical aspect of user experience. These tests also offer insights into the scalability and resource efficiency of the databases, revealing how performance evolves with growing data volumes and complexity. ANN testing can also show how the underlying infrastructure affects the database’s performance, which is vital for optimizing deployments.

The Weaviate test module in ANN-Benchmarks uses the v3 Weaviate client and an embedded Weaviate instance. Because the v3 client is deprecated and no longer recommended, and because we’re using an external Weaviate instance (running on AKS), the test module must be modified. A fork of the ANN-Benchmarks repository has been created with these modifications. If you’re curious about the specific changes, see this diff.

From your workstation VM, run the following commands to clone the forked ANN-Benchmarks repository and change into the created directory:

    git clone https://github.com/MichaelHaigh/ann-benchmarks.git
    cd ann-benchmarks

We now install Python 3.10, which is the validated Python version for ANN-Benchmarks:

    sudo apt install -y software-properties-common
    sudo add-apt-repository -y ppa:deadsnakes/ppa
    sudo apt update
    sudo apt install -y python3.10 python3.10-distutils python3.10-venv

(This example is for Ubuntu; if you’re running a different flavor of Linux, the commands will be different.)
Next, we create our Python virtual environment and install the necessary packages:

    python3.10 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    pip install weaviate-client

Finally, open the Weaviate module.py file with your favorite text editor:

    vim ann_benchmarks/algorithms/weaviate/module.py

Take note of lines 14-21, and especially lines 15 and 18:

    14        self.client = weaviate.connect_to_custom(
    15            http_host="10.20.0.8",
    16            http_port="80",
    17            http_secure=False,
    18            grpc_host="10.20.0.9",
    19            grpc_port="50051",
    20            grpc_secure=False,
    21        )

Lines 15 and 18 must be updated with the external IPs of the weaviate and weaviate-grpc services, respectively, from the previous section. When complete, save the file and exit the text editor.

ANN Benchmarks – GloVe 100 Angular

We’re now ready to start our performance benchmarking with the following command:

    python run.py --algorithm weaviate --local

Note: This task will take 1 to 2 days to complete, depending on your setup; it took 30 hours with the configuration just described.

The --algorithm argument instructs ANN-Benchmarks to run the Weaviate tests, and the --local argument instructs it to run the tests “locally” rather than with the default Docker method. Because we’ve modified the test module to connect to our external Weaviate instance running on AKS, it’s not truly a “local” test.

The first action of the benchmark is to download the GloVe 100 Angular dataset. (This can be changed with the --dataset argument, as shown in the next section.)
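Because a run takes more than a day, it can be worth confirming the two addresses entered in module.py before starting. The following is a small pre-flight sketch using only the Python standard library: it checks that each address falls in an RFC 1918 range and that the port accepts a TCP connection. The helper names are hypothetical, and the addresses are the example values from this post, so substitute your own.

```python
import ipaddress
import socket

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """True if the address lies in one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Example values from this post; replace with your own service IPs.
    endpoints = {
        "weaviate (HTTP)": ("10.20.0.8", 80),
        "weaviate-grpc": ("10.20.0.9", 50051),
    }
    for name, (host, port) in endpoints.items():
        print(name, "private:", is_rfc1918(host),
              "reachable:", port_is_open(host, port))
```

A successful TCP connection only shows that the load balancer is listening; it does not confirm that Weaviate itself is healthy.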
The benchmark then prints a long list showing the order of the tests that will be run:

    $ python run.py --algorithm weaviate --local
    downloading https://ann-benchmarks.com/glove-100-angular.hdf5 -> data/glove-100-angular.hdf5...
    2024-08-19 19:52:11,777 - annb - INFO - running only weaviate
    2024-08-19 19:52:12,622 - annb - INFO - Order: [Definition(algorithm='weaviate', constructor="Weaviate", module="ann_benchmarks.algorithms.weaviate", docker_tag='ann-benchmarks-weaviate', arguments=['angular', 64, 128], query_argument_groups=[[16], [32], [48], [64], [96], [128], [256], [512], [768]], disabled=False), Definition(algorithm='weaviate', constructor="Weaviate",...

In this example (order varies because the tests are randomized), the first test is:

GloVe. Global vectors for word representation, where the vector representations of words are learned in such a way that the geometric relationships between the vectors capture semantic meaning.
100. The number of dimensions of the vectors in the dataset (100-dimensional).
Angular. The distance metric used to measure the similarity between vectors when searching for nearest neighbors. When the angular distance is used, the focus is on the orientation of the vectors, not their length, which is particularly useful for comparing word embeddings where the direction of the vector is more meaningful than its magnitude.
Arguments (64 and 128). The index-build parameters. Weaviate’s vector index is an HNSW graph, and in the upstream ANN-Benchmarks Weaviate module these two values are passed as maxConnections (the maximum number of links per graph node) and efConstruction (the size of the candidate list used while building the graph). Higher values build a denser, more thoroughly connected graph, which could lead to a longer preprocessing stage.
Query arguments (16, 32, 48, 64, 96, 128, 256, 512, and 768). The ef value, which sets how many candidates are considered when searching for the nearest neighbors to a query vector. A higher value means that the algorithm examines more candidates, which can increase the computational effort during the query phase but may also improve the likelihood of finding the true nearest neighbors, thus increasing recall.

These tests are controlled by the config.yml file located in the algorithm directory, so feel free to modify that file to reduce the number of tests, if desired.

After the tests have been running for a few hours, you can open the volume’s overview page in the Azure portal to view its metrics. Make sure that the “throughput limit reached” chart stays at 0; otherwise your volume is too small relative to the bandwidth of your selected node.

After 1 to 2 days, the GloVe 100 Angular benchmarking will be complete, and we can move on to our next dataset.

ANN Benchmarks – Sift 128 Euclidean

Because the GloVe 100 Angular dataset is geared toward word vectors, we’ll next use the Sift 128 Euclidean dataset, which is geared toward image vectors:

Sift. Scale Invariant Feature Transform vectors capture local features of images and are widely used in computer vision tasks. Each vector in the dataset represents a distinct feature extracted from an image.
128. The number of dimensions of the vectors in the dataset (128-dimensional).
Euclidean. The distance metric used to measure the similarity between vectors in the dataset; the smaller the distance, the more similar the vectors. The Euclidean distance, also known as the L2 norm or L2 distance, is the “ordinary” straight-line distance between two points in Euclidean space.

This time when we execute the benchmark, we’ll use the --dataset argument to specify this dataset:

    python run.py --algorithm weaviate --local --dataset sift-128-euclidean

Again, this command can take 1 to 2 days to complete; in our testing it took roughly 24 hours. When complete, you can continue to run additional benchmarks with more datasets, if desired. However, we’ll now move on to analysis.
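The two distance metrics used by these datasets can be sketched in a few lines. The example below uses cosine distance (1 minus the cosine of the angle between the vectors) as one common formulation of angular distance; treat that choice, and the function names, as assumptions for illustration rather than the benchmark's exact code.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance; sensitive to vector length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angular_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("angular distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    a = (1.0, 2.0, 3.0)
    b = (2.0, 4.0, 6.0)  # same direction as a, twice the length
    # Same orientation, so the angular distance is (numerically) zero...
    print(angular_distance(a, b))
    # ...while the Euclidean distance still reflects the length difference.
    print(euclidean_distance(a, b))
```

This is why angular distance suits word embeddings, where direction carries the meaning, while Euclidean distance suits feature vectors whose magnitudes are also informative.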
ANN Benchmarks Analysis

There are a handful of ways to analyze the results of our benchmark testing:

1. Run python plot.py, which creates a single image of the vector database(s) and dataset(s). This image can be heavily customized to specify the X and Y axis units and scales, in addition to several other options. (Run python plot.py --help to view all available configuration options.)
2. Run python create_website.py, which creates an HTML page with about a dozen graphs.
3. Run python data_export.py --out res.csv, which exports all results to a CSV file, which can be useful when additional post-processing is needed.

We’ll go with option 2 here, because a single command yields many interesting images. However, feel free to experiment with options 1 and 3 in your environment. On your workstation, run the following command:

    python create_website.py

If your workstation has a desktop environment, open the weaviate.html file that was generated. Otherwise, run the following command to copy the file to your physical machine, supplying your user name and the workstation’s address where the command leaves them blank:

    scp @:/home//ann-benchmarks/weaviate.html weaviate.html

Once opened, scroll through the page to view the results. The entire page of results is included in the results section of this blog, but let’s dig into just two of the images.

The axes of the first chart, which plots recall against queries per second, represent:

Recall. The fraction of true nearest neighbors that are returned by the approximate nearest neighbor search. For example, if the ANN search is supposed to return the 10 nearest neighbors to a query point, but only 7 of those are among the true 10 nearest neighbors, the recall would be 0.7.
Queries per second. The number of queries that can be processed per second, indicating the performance or efficiency of the vector database.

In general, spending more time on each query should result in a higher recall value, at the cost of fewer queries per second. Values up and to the right are better, meaning that Weaviate performed better with the Sift 128 Euclidean dataset than with the GloVe 100 Angular dataset.
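The two chart axes described above can be sketched as simple calculations. This is a simplified illustration based on overlapping result IDs, not the metric code that ANN-Benchmarks itself runs; the function names are hypothetical.

```python
from typing import Iterable


def recall_at_k(returned_ids: Iterable[int], true_ids: Iterable[int]) -> float:
    """Fraction of the true nearest neighbors that the search returned."""
    true_set = set(true_ids)
    if not true_set:
        raise ValueError("true_ids must not be empty")
    hits = len(true_set.intersection(returned_ids))
    return hits / len(true_set)


def queries_per_second(num_queries: int, elapsed_seconds: float) -> float:
    """Throughput for a batch of queries that took elapsed_seconds in total."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_queries / elapsed_seconds


if __name__ == "__main__":
    true_neighbors = list(range(10))                    # the true 10 nearest
    returned = [0, 1, 2, 3, 4, 5, 6, 20, 21, 22]        # 7 of them found
    print(recall_at_k(returned, true_neighbors))
    print(queries_per_second(10_000, 4.0))
```

Each point on the chart corresponds to one combination of build and query arguments, with these two values as its coordinates.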
Weaviate’s stronger showing on the Sift 128 Euclidean dataset suggests that, in this configuration, it is better suited to computer vision tasks than to natural language processing. However, we recommend testing against additional datasets and vector databases to find the best match for your specific application.

Let’s investigate one more chart. While the previous chart focused purely on the query phase, this chart focuses on the trade-off between the quality of the search results and the memory footprint of the index. The X axis (Recall) is the same; however, the Y axis represents the amount of memory used by the vector database to store the data structure that facilitates the neighbor search.

As we can see, for certain levels of recall, Weaviate has a lower memory footprint for the GloVe 100 Angular dataset, but for other levels of recall the Sift 128 Euclidean dataset’s memory footprint is lower. Depending on your generative AI application, you may value memory footprint over query speed; a computer vision application in embedded systems is one example. Other applications, like a chatbot or coding assistant, may value query speed over memory footprint. Performing benchmarks against potential vector databases with relevant datasets can help determine the ideal configuration for your generative AI applications.

Results

The remaining results of the ANN-Benchmarks testing are shown here.

Conclusion

The deployment and benchmarking of Weaviate on Azure Kubernetes Service with Azure NetApp Files demonstrates the platform’s capabilities in handling generative AI workloads. The detailed walk-through in this blog simplifies the setup process, and it also equips users with the insights needed to make informed decisions about their vector database deployments. The results from ANN-Benchmarks provide the recall, query throughput, and memory footprint figures needed to tune AI applications.
Weaviate’s impressive handling of the Sift 128 Euclidean dataset suggests a strong suit in computer vision tasks, and its performance with the GloVe 100 Angular dataset opens avenues for natural language processing applications. However, users must consider the specific requirements of their applications, because trade-offs can significantly impact the user experience and operational costs. By leveraging Azure’s scalable infrastructure and Weaviate’s vector search capabilities, developers and organizations can confidently scale their AI solutions, knowing that they have a reliable and efficient system in place. The benchmarks are a testament to the potential of Weaviate on AKS and Azure NetApp Files, providing a solid foundation for future generative AI endeavors. Whether your focus is on maximizing recall, query throughput, or maintaining a minimal memory footprint, this setup means that you can achieve your goals with efficiency and precision.