Navigating the Financial Data Stream: How Kaito Leveraged RisingWave Streaming Database
A Kaito data engineer constructed more than 1,000 user-visible analysis dashboards (materialized views) and internal product data tracking dashboards on a single RisingWave cluster in just two weeks, providing essential support for the company's data analytics and business operations.
Kaito is a fintech company headquartered in Seattle, Washington, USA. Backed by investors such as Dragonfly, Sequoia, and Jane Street, the company's mission is to create the industry's first financial search engine based on a Large Language Model (LLM).
It aims to meet the cryptocurrency community's demand for indexing data scattered across various private information sources and blockchains, which are invisible to traditional search engines like Google.
By combining its proprietary financial search engine's real-time data with advanced LLM capabilities, Kaito aims to provide a revolutionary information access experience for 300 million users, enabling them to obtain information related to the blockchain and its surrounding ecosystem more efficiently.
Kaito provides analytical capabilities for timely insights to investors, day traders, and quantitative trading companies.
This requires the company to merge off-chain and on-chain data in real time, perform complex analysis, and present results to users with sub-millisecond latency. Given the vast amount of data, the company must build scalable infrastructure to meet its data analysis needs.
Kaito's platform developers have two main demands for the infrastructure:
- Ease of Use: As a rapidly growing company, Kaito places great emphasis on delivering new features quickly. In the past, Kaito relied on AWS Glue for backfilling and re-indexing off-chain data from various sources. However, this experience proved suboptimal because it required engineers to write Java or Python code against complex APIs. Kaito therefore sought a solution that offers SQL as the primary interface. This choice not only reduces the learning curve and speeds up the development process but also makes it easier for Kaito to find candidates with SQL skills.
- Cost-Effectiveness: Kaito continuously collects vast amounts of data from various off-chain and on-chain sources, maintaining a 24/7 data stream. The demand for processing and storing data is considerable, so cost-effectiveness is crucial. Because real-world data is unpredictable and bursty, the solution must also be elastic, meaning it can seamlessly scale up and down based on the current workload. This ensures Kaito avoids both resource bottlenecks and unnecessary waste.
When Kaito's tech team began looking for suitable solutions, they researched various stream processing systems on the market, including Apache Flink. However, after evaluation, the following issues emerged:
- Steep Learning Curve: Although the team has a strong technical background, a fast-growing company like Kaito cannot afford to invest a lot of time and effort in a tedious system learning process. The team needed a solution it could adopt quickly to meet current needs.
- Complex Data Stack: Flink only offers computational capabilities. If Kaito wanted to use Flink to build dashboards, they would need to purchase additional databases to support query services, leading to significant cost increases.
- No Instant Elastic Scaling: Flink, a system born a decade ago, uses a coupled storage-compute architecture. During scaling, data needs to be rearranged and loaded into the local RocksDB. With Kaito's massive data volumes, this process takes an exceptionally long time, so Flink could not assure Kaito's customers of 24/7 access to real-time information streams.
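The scaling concern in the last point can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: it uses simple modulo partitioning (not Flink's actual key-group assignment), and the names `owner` and `keys_to_relocate` are hypothetical. It counts how many keys change owner when a cluster grows from 4 to 5 workers. In a coupled design, the state for each of those keys would have to be copied to another node's local disk; in a decoupled design, the state stays in shared storage and only the assignment changes.

```python
def owner(key: int, workers: int) -> int:
    # Toy partitioning by modulo; an assumption for illustration only,
    # not how Flink or RisingWave actually assign keys to workers.
    return key % workers


def keys_to_relocate(keys, old_workers: int, new_workers: int):
    """Return the keys whose owning worker changes after a resize."""
    return [k for k in keys if owner(k, old_workers) != owner(k, new_workers)]


keys = range(20)
moved = keys_to_relocate(keys, old_workers=4, new_workers=5)

# Coupled storage-compute: state for every moved key must be copied
# between local disks before processing can resume.
coupled_state_moves = len(moved)

# Decoupled storage-compute (in this simplified model): state remains in
# shared storage, so no bulk copy is needed, only a new key assignment.
decoupled_state_moves = 0

print(f"{coupled_state_moves} of {len(keys)} keys change owner")
```

In this toy setup most keys change owner after adding a single worker, which is why the cost of a rescale in a coupled design grows with the amount of state rather than with the size of the change.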
After in-depth market research, Kaito ultimately chose RisingWave as its solution for the following reasons:
- Familiar and Flexible: RisingWave is compatible with PostgreSQL, offering developers the flexibility to build applications using one of the most widely used SQL variants. This compatibility also allows them to plug into the existing ecosystem, achieving seamless integration with third-party business intelligence tools and client libraries.
- Strong Elasticity: RisingWave adopts a decoupled storage-compute architecture, delivering exceptional elasticity and low-latency performance. In contrast, traditional solutions use a coupled storage-compute architecture, requiring data on local disks to be rearranged during scaling. This kind of operation typically takes tens of minutes or even hours, which is unacceptable for Kaito.
- Integrated Querying: RisingWave can maintain materialized views and serve queries directly from them. This advantage emerged during Kaito's extensive solution research. Many other stream processing systems can only send results to downstream systems like Redis or Cassandra, which then serve user queries. This approach requires deploying a second system, increasing operational and maintenance overhead, compromising data consistency, and leading to a fragmented developer experience. After verifying RisingWave's materialized view functionality in production, Kaito stopped considering other systems.
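To make the idea of serving queries directly from a materialized view concrete, here is a minimal in-memory sketch. It is not RisingWave's API (in RisingWave such a view would be declared in SQL with `CREATE MATERIALIZED VIEW`), and the class name `TradeVolumeView`, the `token` and `amount` fields, and the sample events are all hypothetical. The sketch only shows the general pattern: each incoming event incrementally updates a stored aggregate, and reads are answered from that stored result without rescanning the raw stream or consulting a second system.

```python
from collections import defaultdict


class TradeVolumeView:
    """Toy stand-in for a view computing SUM(amount) grouped by token."""

    def __init__(self):
        self._totals = defaultdict(float)

    def on_event(self, token: str, amount: float) -> None:
        # Incremental maintenance: only the affected group is updated.
        self._totals[token] += amount

    def query(self, token: str) -> float:
        # Reads are served from the maintained result, not the raw events.
        return self._totals.get(token, 0.0)


view = TradeVolumeView()
for token, amount in [("BTC", 2.0), ("ETH", 5.0), ("BTC", 1.5)]:
    view.on_event(token, amount)

print(view.query("BTC"))
```

Because the aggregate and the query path live in the same component, there is no separate serving store to deploy or keep consistent, which is the operational point the comparison above makes.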
RisingWave in Production
After successfully deploying RisingWave to their GKE cluster, a Kaito data engineer built over 1,000 user-visible analysis dashboards (materialized views) and dashboards for internal product data tracking on a single RisingWave cluster in just two weeks. These dashboards are now in production, providing vital support for the company's data analytics and business operations.
Today, Kaito is actively preparing to launch new features, including real-time message alerts, enabling users to customize and receive important message notifications via platforms like Telegram, Slack, and email. This will offer users more practical features and more flexible information communication methods.
Kaito anticipates that, based on its existing customer base, the new features will require creating over 1,000 more stream processing jobs to meet the growing demand, which will further test RisingWave's scalability and performance. This success story showcases Kaito's continuous development and innovation in data processing and analytics.
The data stack inside Kaito is shown in the following diagram:
“At Kaito, timely and accurate information holds immense value for our business. We cater to a discerning audience of sophisticated traders, investors, and quantitative trading firms, all of whom have come to rely on the real-time analytics and low-latency alerts provided by RisingWave as integral components of their decision-making processes. The remarkable speed at which RisingWave enables us to ship new features is so impressive that our data engineers simply can't do without it.”
For fast-growing companies like Kaito, flexibility and quick adaptability are more than just buzzwords; they are the cornerstone of growth. By choosing RisingWave, Kaito not only addressed its current challenges but also prepared for future scalability and feature development.
The success story of Kaito and RisingWave demonstrates how the new generation of stream processing infrastructure is transformative.