📝 Guest Post: Zilliz Unveiled Milvus 2.4 at GTC 24, Transforming Vector Databases with GPU Acceleration*
New Post has been published on https://thedigitalinsider.com/guest-post-zilliz-unveiled-milvus-2-4-at-gtc-24-transforming-vector-databases-with-gpu-acceleration/
Collaboration with NVIDIA boosts Milvus search performance by up to 50x
Last week, Zilliz and NVIDIA collaborated to unveil Milvus 2.4 – the world’s first vector database accelerated by powerful GPU indexing and search capabilities. This breakthrough release harnesses NVIDIA GPUs’ massively parallel computing power and the new CUDA-Accelerated Graph Index for Vector Retrieval (CAGRA) from the RAPIDS cuVS library.
The performance gains from GPU acceleration in Milvus 2.4 are substantial: benchmarks show up to 50x faster vector search than industry-standard CPU-based indexes such as HNSW.
While the open-source Milvus 2.4 is available now, enterprises looking for a fully managed vector database service can look forward to GPU acceleration coming to Zilliz Cloud later this year. Zilliz Cloud provides a seamless experience for deploying and scaling Milvus on major cloud providers like AWS, GCP, and Azure without operational overhead.
We asked Charles Xie, the founder and CEO of Zilliz, to tell us more about it.
What is Milvus
Milvus is an open-source vector database system built for large-scale vector similarity search and AI workloads. Initially created by Zilliz, an innovator in the realm of unstructured data management and vector database technology, Milvus made its debut in 2019. To encourage widespread community engagement and adoption, it has been hosted by the Linux Foundation since 2020.
Since its inception, Milvus has gained considerable traction within the open-source ecosystem. With over 26,000 stars and over 260 contributors on GitHub and a staggering 20 million+ downloads and installations worldwide, it has become one of the most widely adopted vector databases globally. Milvus is trusted by over 5,000 enterprises across diverse industries, including AIGC, e-commerce, media, finance, telecom, and healthcare, to power their mission-critical vector search and AI applications at scale.
Why GPU Acceleration
In today’s data-driven world, quickly and accurately searching through vast amounts of unstructured data is crucial for powering cutting-edge AI applications. From generative AI and similarity search to recommendation engines and virtual drug discovery, vector databases have emerged as the backbone technology enabling these advanced capabilities. However, the insatiable demand for real-time indexing and high throughput has continued to push the boundaries of what’s possible with traditional CPU-based solutions.
Real-time indexing
Vector databases often need to ingest and index new vector data continuously and at a high velocity. Real-time indexing capabilities are essential to keep the database up-to-date with the latest data without creating bottlenecks or backlogs.
High throughput
Many applications that leverage vector databases, such as recommendation systems, semantic search engines, and anomaly detection, require real-time or near-real-time query processing. High throughput ensures that vector databases can handle a large volume of incoming queries concurrently, delivering low-latency responses to end-users or services.
At the heart of vector databases lies a core set of vector operations, such as similarity calculations and matrix operations, which are highly parallelizable and computationally intensive. With their massively parallel architecture comprising thousands of cores capable of executing numerous threads simultaneously, GPUs are an ideal computational engine for accelerating these operations.
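To make the parallelism concrete, here is a minimal sketch of brute-force similarity search in plain Python. It is an illustration only, not how Milvus or CAGRA is implemented: the function names (`inner_product`, `brute_force_top_k`) and the toy vectors are assumptions made for this example. Each score depends on just one stored vector, so all scores can be computed independently, which is the property that maps well onto thousands of GPU cores.

```python
import heapq
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def inner_product(a: Vector, b: Vector) -> float:
    """Similarity score between two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def brute_force_top_k(query: Vector, vectors: List[Vector], k: int) -> List[Tuple[float, int]]:
    """Return the k best (score, index) pairs, highest score first.

    Every score is independent of the others, so this loop is the part
    that a GPU could evaluate for many vectors at the same time.
    """
    scores = ((inner_product(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scores)


if __name__ == "__main__":
    # Toy data, made up for illustration.
    stored = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
    print(brute_force_top_k([1.0, 0.0], stored, k=2))
```

On a GPU, the loop over stored vectors would typically be replaced by a single batched matrix operation rather than scoring one vector at a time.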
The Architecture
To address these challenges, NVIDIA developed CAGRA, a GPU-accelerated graph index that uses the parallelism of GPUs to deliver high throughput for vector database workloads. Next, let’s look at how CAGRA is integrated into Milvus.
Milvus is designed for cloud-native environments and follows a modular design philosophy. It separates the system into various components and layers involved in handling client requests, processing data, and managing the storage and retrieval of vector data. Thanks to this modular design, Milvus can update or upgrade the implementation of specific modules without changing their interfaces. This modularity makes it relatively easy to incorporate GPU acceleration support into Milvus.
The Milvus 2.4 architecture
The modular architecture of Milvus comprises components such as the Coordinator, Access Layer, Message Queue, Worker Node, and Storage layers. The Worker Node itself is further subdivided into Data Nodes, Query Nodes, and Index Nodes. The Index Nodes are responsible for building indexes, while the Query Nodes handle query execution.
To leverage the benefits of GPU acceleration, CAGRA is integrated into Milvus’ Index and Query Nodes. This integration enables offloading computationally intensive tasks, such as index building and query processing, to GPUs, taking advantage of their parallel processing capabilities.
Within the Index Nodes, CAGRA support has been incorporated into the index building algorithms, allowing for efficient construction and management of high-dimensional vector indexes on GPU hardware. This acceleration significantly reduces the time and resources required for indexing large-scale vector datasets.
Similarly, in the Query Nodes, CAGRA is utilized to accelerate the execution of complex vector similarity searches. By leveraging GPU processing power, Milvus can perform high-dimensional distance calculations and similarity searches at unprecedented speeds, resulting in faster query response times and improved overall throughput.
Performance Evaluation
For this evaluation, we utilized three publicly available instance types on AWS:
m6id.2xlarge: This instance type is powered by the Intel Xeon 8375C CPU.
g4dn.2xlarge: This GPU-accelerated instance is equipped with an NVIDIA T4 GPU.
g5.2xlarge: This instance type features the NVIDIA A10G GPU.
By leveraging these diverse instance types, we aimed to evaluate the performance and efficiency of Milvus with CAGRA integration across different hardware configurations. The m6id.2xlarge instance served as a baseline for CPU-based performance, while the g4dn.2xlarge and g5.2xlarge instances allowed us to assess the benefits of GPU acceleration using the NVIDIA T4 and A10G GPUs, respectively.
Evaluation environments, AWS
We used two publicly available vector datasets from VectorDBBench:
OpenAI-500K-1536-dim: This dataset consists of 500,000 vectors, each with 1,536 dimensions, generated with an OpenAI embedding model.
Cohere-1M-768-dim: This dataset contains 1 million vectors, each with 768 dimensions, generated with a Cohere embedding model.
These datasets were specifically chosen to evaluate the performance and scalability of Milvus with CAGRA integration under different data volumes and vector dimensionalities. The OpenAI-500K-1536-dim dataset allows for assessing the system’s performance with a moderately large dataset of extremely high-dimensional vectors. In contrast, the Cohere-1M-768-dim dataset tests the system’s ability to handle larger volumes of moderately high-dimensional vectors.
Index Building Time
We compare the index-building time between Milvus with the CAGRA GPU acceleration framework and the standard Milvus implementation using the HNSW index on CPUs.
Evaluating the index-building times
For the Cohere-1M-768-dim dataset, the index building times are:
CPU (HNSW): 454 seconds
T4 GPU (CAGRA): 66 seconds
A10G GPU (CAGRA): 42 seconds
For the OpenAI-500K-1536-dim dataset, the index building times are:
CPU (HNSW): 359 seconds
T4 GPU (CAGRA): 45 seconds
A10G GPU (CAGRA): 22 seconds
The results show that CAGRA significantly outperforms CPU-based HNSW index building, with the A10G GPU the fastest on both datasets. Based on the times above, GPU acceleration cuts index building time by roughly 7x to 16x relative to the CPU implementation, demonstrating the benefit of GPU parallelism for computationally intensive vector operations like index construction.
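The speedups implied by these timings can be checked with a few lines of Python. This is a sketch that only divides the CPU build time by each GPU build time; the dictionary layout and the `speedups` helper are illustrative, and the numbers are the ones listed above.

```python
from typing import Dict

# Index building times in seconds, copied from the lists above.
BUILD_SECONDS: Dict[str, Dict[str, float]] = {
    "Cohere-1M-768-dim": {
        "CPU (HNSW)": 454,
        "T4 GPU (CAGRA)": 66,
        "A10G GPU (CAGRA)": 42,
    },
    "OpenAI-500K-1536-dim": {
        "CPU (HNSW)": 359,
        "T4 GPU (CAGRA)": 45,
        "A10G GPU (CAGRA)": 22,
    },
}


def speedups(times: Dict[str, float], baseline: str = "CPU (HNSW)") -> Dict[str, float]:
    """Speedup of each configuration relative to the baseline build time."""
    base = times[baseline]
    return {name: base / t for name, t in times.items() if name != baseline}


if __name__ == "__main__":
    for dataset, times in BUILD_SECONDS.items():
        for name, factor in speedups(times).items():
            print(f"{dataset}: {name} is {factor:.1f}x faster than the CPU baseline")
```

With these figures, the T4 comes out roughly 7x to 8x faster than the CPU baseline and the A10G roughly 11x to 16x faster.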
Throughput
We present a performance comparison between Milvus with the CAGRA GPU acceleration framework and the standard Milvus implementation using the HNSW index on CPUs. The metric being evaluated is Queries Per Second (QPS), which measures the throughput of query execution.
We varied the batch size during the evaluation process, representing the number of queries processed concurrently, from 1 to 100. This comprehensive range of batch sizes allowed us to conduct a realistic and thorough evaluation, assessing the performance under different query workload scenarios.
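As a rough sketch of how QPS at a given batch size can be measured: split the query set into batches, time the whole run, and divide the number of answered queries by the elapsed time. The `measure_qps` helper and the `search_fn` callback below are hypothetical stand-ins, not the benchmark harness used for this evaluation; in a real test, `search_fn` would send each batch to the database.

```python
import time
from typing import Callable, List, Sequence

Vector = Sequence[float]


def measure_qps(
    search_fn: Callable[[List[Vector]], list],
    queries: List[Vector],
    batch_size: int,
) -> float:
    """Run all queries in batches of `batch_size` and return queries per second."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    answered = 0
    start = time.perf_counter()
    for i in range(0, len(queries), batch_size):
        batch = queries[i:i + batch_size]
        results = search_fn(batch)  # one result per query in the batch
        answered += len(results)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from the timer on very fast runs.
    return answered / elapsed if elapsed > 0 else float("inf")


def fake_search(batch: List[Vector]) -> list:
    """Placeholder search that returns one empty result per query."""
    return [[] for _ in batch]


if __name__ == "__main__":
    dummy_queries = [[0.0, 0.0]] * 1000
    for size in (1, 10, 100):
        print(size, measure_qps(fake_search, dummy_queries, size))
```

Larger batches let the backend process more queries per call, which is why throughput is reported separately for each batch size.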
Evaluating throughput
Looking at the charts, we can see that:
For a batch size of 1, the T4 is 6.4x to 6.7x faster than the CPU, and the A10G is 8.3x to 9x faster.
When the batch size increases to 10, the performance improvement is more significant: T4 is 16.8x to 18.7x faster, and A10G is 25.8x to 29.9x faster.
With a batch size of 100, the performance gain continues to grow: T4 is 21.9x to 23.3x faster, and A10G is 48.9x to 49.2x faster.
The results demonstrate the substantial performance gains achieved by leveraging GPU acceleration for vector database queries, particularly for larger batch sizes and higher-dimensional data. Milvus with CAGRA unlocks the parallel processing capabilities of GPUs, enabling significant throughput improvements and making it well-suited for demanding vector database workloads.
Blazing New Trails
The integration of NVIDIA’s CAGRA GPU acceleration framework into Milvus 2.4 represents a groundbreaking achievement in vector databases. By harnessing GPUs’ massively parallel computing power, Milvus has unlocked unprecedented levels of performance for vector indexing and search operations, ushering in a new era of real-time, high-throughput vector data processing.
The unveiling of Milvus 2.4, a collaboration between Zilliz and NVIDIA, exemplifies the power of open innovation and community-driven development by bringing GPU acceleration to vector databases. This milestone marks the beginning of a transformative era, where vector databases are poised to experience exponential performance leaps akin to NVIDIA’s remarkable achievement of increasing GPU computing power by 1000x over the past eight years. In the coming decade, we will witness a similar surge in vector database performance, catalyzing a paradigm shift in how we process and harness the immense potential of unstructured data.
*This post was written by Charles Xie, founder and CEO at Zilliz, specially for TheSequence. We thank Zilliz for their insights and ongoing support of TheSequence.
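YOSHITOMO NARA Hokkaido Photo Exhibition
It has been a very long time since our last blog update. During the exhibition, our shop will keep its usual schedule and be closed on Tuesdays and Wednesdays. The COVID-19 pandemic makes long trips difficult, but we look forward to seeing you if you can stop by without overextending yourself.
https://atyhs.net
Yoshitomo Nara's first photo exhibition in Hokkaido
will be held simultaneously at 13 small shops in Sapporo and Otaru.
Yoshitomo Nara selected photographs to suit each shop.
Every venue will show different works.
Dates: April 29 (Thu, holiday) to May 9 (Sun), 2021
Venues: Ach so ne 石田珈琲店 cagra カスタネット Cafuné 喫茶とギャラリー なみなみ
サロン チロル 台湾料理ごとう 庭ビル vivre sa vie+mi-yyu
フラワーショップ四季&やかん by みちみち種や 北欧雑貨piccolina MACRO
Admission: free (at cafes and restaurants, one order is required)
Organizer: YOSHITOMO NARA Hokkaido Photo Exhibition Executive Committee
・・・・・・・・・・・・・・・・・・・・・・・・・・・・・
2020 was a year spent fighting an invisible enemy.
While protecting what we should cherish now, and hoping that the future will be bright and something to look forward to, we want each and every day to become a precious part of everyday life. With that in mind, in 2021 we will hold a photo exhibition by contemporary artist Yoshitomo Nara, "Though no one may notice, the world knows all that you have seen.", at 13 small shops in Sapporo and Otaru.
"Maybe no one notices, but I truly know everything you have seen all along."
A photograph reflects, like a mirror, what the person felt when releasing the shutter. Nara's photographs convey gentleness, warmth, the town, and the moment itself. His photographs had never been shown in Hokkaido before, but we fell in love with their appeal and planned this exhibition so that many people could see them.
General stores, coffee shops, a shoe workshop, a hair salon, and a flower shop in Sapporo and Otaru. Part of the fun is seeing what changes when Nara's photographs join shops that add color to everyday life. And knowing that his photographs are waiting there, you may find the courage to visit that shop you have been curious about.
Come with the feeling of strolling through a new town, a little excited. Let's look together at the world Nara saw through his viewfinder.
We look forward to seeing you all.
YOSHITOMO NARA Hokkaido Photo Exhibition Executive Committee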
Best PMS Products for Wealth Creation
Equities are one of the best ways to create wealth over long periods of time, and Portfolio Management Services (PMS) are one of the most effective ways to invest in equities.
In the last decade (2010-2020), the equity market delivered very average returns of less than 5% CAGR, as the SENSEX went up from 20,000 to 40,000. The decade before that (2000-2010) was one of the best for equities, as the SENSEX went up from around 3,000 to around 21,000. If the pattern is extrapolated, the next decade (2020-2030) could be another stellar decade for wealth creation. There are many fundamental reasons for that. Let’s understand them.
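For readers who want to check growth rates like these, CAGR is the constant yearly rate that takes a starting value to an ending value over a given number of years. The sketch below applies the standard formula to the rounded SENSEX levels quoted above, treating each decade as exactly ten years (an assumption); the `cagr` function name is illustrative.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.07 means 7% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Rounded index levels from the text; each period is assumed to be 10 years.
    print(f"2010-2020: {cagr(20000, 40000, 10):.1%}")
    print(f"2000-2010: {cagr(3000, 21000, 10):.1%}")
```

On these rounded levels, a doubling over ten years works out to about 7.2% a year and a sevenfold rise to about 21.5% a year, so the sub-5% figure quoted for 2010-2020 presumably rests on different start or end points than the round numbers given.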
With the historic fall in interest rates, the cost of capital has gone down significantly. This presents the kind of business case that many entrepreneurs in India have chased for years. The fall in interest rates, from highs of 9% to almost 5% currently, also changes how valuations are viewed: some highly valued companies could command even higher valuations, as they would be judged on their potential to generate a higher return on capital employed over the next decade.
The Indian economy, as measured by Gross Domestic Product (GDP), has grown from US$1.6 tn in 2010 to US$2.8 tn in 2020, an absolute increase of 75%. That is significantly higher than the performance of the equity market, as Sensex growth was only 50% during the last 10 years. This has led to a multi-year gap between equity market cap and GDP. Currently, the market cap-to-GDP ratio stands at ~60%. It fell swiftly from ~80% in FY19 to ~60% in FY20. Today it is well below the long-term average of 75% and close to levels last seen in FY09. The ratio was quite stable over FY15-19, in the 70-80% band. The lowest level in the last two decades was ~45%, in FY04, and the ratio peaked at 149% in December 2007, during the 2003-08 bull run. This gap ought to be filled, and the market cap-to-GDP ratio could surpass the 100% mark over the next decade.
India is an agrarian economy, with nearly 50% of Indians' livelihoods dependent on agriculture and allied sectors. This year has seen a good Rabi crop, and the India Meteorological Department (IMD) has announced that it expects monsoon rainfall to be normal this year. The rollout of long-pending agricultural reforms, such as scrapping the Essential Commodities Act and allowing farmers to sell their produce anywhere in the country, should pave the way for corporatization of the agriculture sector and lead to its growth in the medium to long term.
FII holdings today stand at their lowest level since 2013, at ~20%. Government holding is also at a record low of ~6.6%, DII holding is at 14%, and retail holding is also at 14%. At the same time, Indian promoters have increased their holdings and are on a buying spree where they see their franchises as undervalued.
China, the largest manufacturing hub for many multinational companies, is facing a wave of distress, and most of these companies are now looking for an alternative. The CII and the Indian Government have made representations to around 1,500 global companies about moving production to India. Besides, the last 10 years have paved the way for many policy moves as well as reforms at the fiscal, monetary, and tax levels. The current scenario is becoming highly conducive for India to attract FDI over the next decade.
The fall in global crude oil prices is a huge positive for India. Remember that we import 85% of our oil, and every 1-dollar fall in its price leads to 1 bn dollars of savings on our import bill. This has a cascading effect: lower inflation, a lower current account deficit, and an accommodative monetary stance.
So, investing in equity is going to see some interesting times ahead, and PMS managed by experienced fund managers are going to be one of the best routes to wealth creation in the coming decade.
The 5 best PMSes based on recent performance (2018 and 2019) are:
1) Marcellus Consistent Compounders,
2) Stallion Asset Core Fund,
3) IIFL Multi Cap PMS,
4) Ambit Coffee Can,
5) ASK India Entrepreneurial Portfolio.