The Preview of Stream Processing Performance Report: Apache Flink and RisingWave Comparison
Discover the Future of Stream Processing! Get a sneak peek at the highly anticipated Stream Processing Performance Report: Apache Flink vs. RisingWave.
RisingWave is a cloud-native stream processing database that has emerged in the era of cloud computing. Apache Flink is a widely adopted distributed stream processing framework in the big data field. Despite the evolving landscape, performance remains a crucial aspect.
Developed in Rust and specifically tailored for cloud environments, RisingWave claims to provide a 10X improvement in performance compared to Flink. To achieve optimal performance, RisingWave underwent extensive performance testing for over six months. Recently, the RisingWave team shared a private preview of the performance report with more than 50 community members, including seasoned engineers, Flink contributors, and Flink PMC members.
This week has been phenomenal. We are pleased to release a preview version of the performance report publicly. This will allow everyone to gain insights into the performance comparison between RisingWave and Flink.
Before we begin
Although Apache Flink and RisingWave are both open-source stream processing systems, their design philosophies differ significantly, making a direct comparison between the two somewhat unfair. However, we can still outline the similarities and differences between the two projects to provide a reference for comparison. In this discussion, we will primarily focus on comparing Flink SQL and RisingWave.
Flink is an open-source project that originated and developed during the big data era. It is a unified stream and batch processing platform designed for general-purpose use. While Flink is well-known for its efficient stream processing capabilities, it has expanded to encompass other domains as well. Flink is evolving into a unified system that covers features such as batch processing, machine learning (Flink ML), data lakes (Paimon, also known as Flink Table Store), and stateful functions (StateFun). Flink provides a Java API similar to MapReduce, and it offers higher-level Python and SQL interfaces built on top of the Java API for ease of use.
RisingWave is a cloud-native streaming database designed to improve the performance and efficiency of cloud-based stream processing. RisingWave is wire-compatible with PostgreSQL, meaning users can connect to it using any PostgreSQL-compatible tools such as psql and JDBC.
Unlike Flink, RisingWave does not provide a Java API. Instead, it offers Python User-Defined Functions (UDFs) to enhance the expressiveness of SQL queries. This approach provides users with a more flexible and intuitive way to work with the data and enables them to use the programming language they are most familiar with.
To better understand the similarities and differences between Flink and RisingWave, our focus will be on comparing their respective SQL capabilities.
Benchmarks and Environment
To compare Flink and RisingWave, we used Flink version 1.16 and RisingWave version 0.19.
In the realm of stream processing, the Nexmark benchmark is widely recognized. The Flink community uses the Nexmark benchmark with source code available. The RisingWave team implemented Nexmark using Kafka as the streaming data source. Check out the implementation here.
However, as the Nexmark benchmark does not cover certain commonly used SQL operators, we included an additional set of 5 query statements.
To simulate real-world scenarios, we generated continuous data using Kafka and connected it to both Flink and RisingWave. Our primary focus was on throughput rather than latency. It is worth noting that both Flink and RisingWave were configured to guarantee exactly-once semantics.
For our tests, we utilized AWS EC2 c5.2xLarge instances, each of which features eight vCPUs and 16GB of memory.
RisingWave consists of four components:
- Frontend Node
- Meta Node
- Compute Node
- Compactor Node
The Compute Node and Compactor Node share the same machine, while the Frontend Node and Meta Node have minimal impact on performance and share another machine.
In contrast, Flink consists of two components:
The Task Manager occupies one machine (compaction threads in RocksDB correspond to the Compactor Node), while the Job Manager occupies another machine based on their corresponding relationship.
Test Dataset Configuration
To prepare for the test, we allocated a separate machine for Kafka and generated a 2 billion data points dataset using the data generation tool. Once the data generation was completed, we initiated the system (either RisingWave or Flink) to execute the queries.
After each query finished running, we stopped the system, deleted all checkpoint data on S3, restarted the system, and proceeded to the next query. Each query was constrained to a maximum execution time of 60 minutes, and if it exceeded this time limit, it was promptly terminated.
The data results presented below are based on the individual runtime of each query.
Performance Test Results
The performance result is shown in the table below.
Performance comparison between RisingWave and Flink. 100% represents the same performance for RisingWave and Flink. Greater than 100% indicates that RisingWave is faster, while less than 100% indicates that Flink is faster. Performance comparison between RisingWave and Flink. 100% represents the same performance for RisingWave and Flink. Greater than 100% indicates that RisingWave is faster, while less than 100% indicates that Flink is faster.
Performance comparison between RisingWave and Flink. 100% represents the same performance for RisingWave and Flink. Greater than 100% indicates that RisingWave is faster, while less than 100% indicates that Flink is faster.
Please note that in this table, we compared two versions of Flink: Flink and Flink (better storage). We used EBS storage for RocksDB state, and since internal state management can be a performance bottleneck in stream processing, we increased the performance parameters to 12000 IOPS and 500 MB/s on top of using the default EBS gp3 to enhance Flink's performance.
As shown in the table, RisingWave achieved performance advantages in 22 out of the 27 queries listed. Among them, 12 queries demonstrated performance improvements of at least 50% compared to Flink (i.e., values above 150% in the chart). ten queries surpassed Flink's performance by a factor of two (i.e., values above 200% in the chart).
Notably, q102 achieved a performance improvement of over 520 times, and q104 achieved a performance improvement of 660 times compared to Flink.
Why are some query results missing?
Please note that we did not include the results for queries q6, q11, q13, and q14 in the comparison table. Here is the reason for each:
- q6: This query uses a window function, which is not currently supported in the current version of RisingWave. However, it will be supported soon (by the end of this month).
- q11: This query uses session windows, which we consider to be a lower priority, and therefore, it is not yet implemented.
- q13: This query requires generating a side input, which we omitted for the purpose of this comparison.
- q14: This query requires UDF (User-Defined Function) support. Although RisingWave supports UDF, it requires deploying a separate UDF service. Since our focus was primarily on testing the performance of RisingWave and Flink, we omitted this query.
Why does the performance of stateless computations appear to be similar?
In queries q0-q3 and q21-q22, RisingWave's performance improvement over Flink is not significant. This may seem counterintuitive since RisingWave, being a project developed in Rust, is theoretically expected to achieve several times the performance of Flink, developed in Java.
However, the main reason why we did not observe such performance improvement in these tests is that the network I/O introduced by the information transfer between Flink/RisingWave and Kafka became the primary bottleneck. In other words, a significant amount of time was spent on network communication rather than CPU computation.
Although we can indeed construct complex stateless computations (such as parsing nested JSON) to showcase the performance advantages of Rust projects, Nexmark does not cover such operations, and to ensure fairness, we excluded such tests.
How to explain the two queries (q5 and q16) where RisingWave is significantly slower than Flink?
For Flink's testing, we used the same source definition as the official Nexmark source code. The only difference between Flink and RisingWave is that Flink defines a watermark on the Nexmark data source, which provides additional optimization opportunities for Flink.
For instance, a window aggregation function can determine when a time window can be closed and output the final result once and only once when the watermark arrives instead of continuously outputting the latest intermediate result after each row update. We refer to the former semantics as Emit On Window Close (EOWC). However, we noticed that Flink doesn't support window aggregation functions with non-EOWC semantics, so we couldn't force Flink to use non-EOWC semantics by removing the watermark definition.
In the production environments of RisingWave users, they have expressed a need for both semantics.
RisingWave will soon support EOWC, but it was not yet supported in the tests. In Flink's q5, this semantics is utilized, while RisingWave still outputs a large number of real-time intermediate results. This difference causes the downstream join operator in RisingWave to be triggered extensively, resulting in lower performance compared to Flink. We will present the results with EOWC support in the next round of testing.
Regarding q16, we are still investigating. At present, RisingWave does not provide the optimization of Split Distinct Aggregation, as described here. Once we support this optimization, we will provide performance data with the same execution plan in the next version of the performance report.
In which types of queries is RisingWave significantly better than Flink?
RisingWave achieved significant performance improvements, sometimes even doubling or increasing by hundreds of times, in queries that have complex internal states and require a large amount of space. Generally, the more complex a query is, the more complex the internal state it needs to maintain and the larger the state space it occupies.
For instance, in our tests, queries like q4, q7, q9, q18, and q20 had internal states approaching or exceeding 20GB, while q102 and q105 had internal states close to 10 GB. In such cases, accessing and manipulating the internal state often becomes the bottleneck of the computation process, requiring optimizations such as caching and indexing by the stream processing engine.
From the test results, it is evident that RisingWave is better equipped to handle complex stream processing tasks compared to Flink.
Why can RisingWave achieve such high performance?
RisingWave's performance is closely linked to its design and implementation. In summary, there are three key factors that contribute to RisingWave's performance advantage over Flink:
- RisingWave is developed from scratch in Rust and relies on very few third-party JVM components. This gives it a significant advantage over Flink, which is developed in Java, in terms of language choice. However, this advantage primarily exists at the computation layer, as Flink's underlying RocksDB storage is developed in C++.
- As a big data project, Flink has a Java API layer similar to MapReduce, and Flink SQL is essentially a wrapper over the Flink Java API. Computer science theory suggests that the more abstraction layers we have, the worse the performance. In contrast, RisingWave doesn't have an intermediate layer, allowing for highly customized optimizations directly targeting SQL queries, thereby achieving significant performance improvements.
- Flink directly uses RocksDB to store intermediate computation states, and RocksDB is unaware of the computation itself. In contrast, RisingWave has its own storage implementation that is aware of the computation, and it uses remote storage (such as S3, HDFS, etc.) to greatly reduce storage costs, thus achieving improved computational costeffectiveness.
In addition to these three factors, there are other aspects that could contribute to RisingWave's performance advantage over Flink. For instance, Flink's current direction is to become a unified platform, and the introduction of features like batch processing, machine learning, and data lakes can easily increase system complexity, resulting in unnecessary performance overhead.
Complex Stateful Computations
As mentioned earlier, it is theoretically expected that a system developed in Rust would outperform a system built on JVM languages, especially when handling complex stateful computations. Given that many modern applications require complex operations such as JSON parsing and string processing within stream processing systems, we plan to add tests for such queries in the future.
Multi-stream joins are a common scenario in stream processing. When users have multiple data sources, such as multiple MySQL instances or Kafka topics, they often turn to stream processing systems to perform joins for analysis. Handling multi-stream joins often involves dealing with large internal states.
Based on the results of the Nexmark benchmark test, we have already preliminarily verified RisingWave's excellent performance in scenarios with large internal states, and it is evident that multi-stream joins are an area where RisingWave excels.
We have conducted experiments on multi-stream joins, and based on initial results, RisingWave can easily handle joins of ten or more data streams, while Flink often encounters state management issues and crashes. We will gradually release the experimental results of multi-stream joins to the public.
UDFs, Watermarks, and Advanced Features
The effectiveness of a stream processing system is not limited to handling basic operators such as aggregation, join, and windows. In practice, users often require advanced features, such as User-Defined Functions (UDFs) or watermarks, to extend the expressive power and ensure the system's correctness. However, due to the limitations of the Nexmark benchmark, we did not test these features. We plan to cover these features in future tests and evaluate their impact on RisingWave's performance.
How to optimize performance in Flink?
Optimizing performance in Flink is not only a technical task but also an experiential one. The RisingWave team has accumulated nearly a year of experience in optimizing Flink's performance. In summary, there are three aspects to consider when optimizing performance in Flink:
- Deployment optimization: Flink supports Kubernetes deployment, but deploying Flink via Kubernetes does not necessarily result in optimal performance due to its heavy dependencies on the JVM ecosystem. Specific deployment considerations, such as how to deploy ZooKeeper nodes, need to be taken into account to achieve optimal performance.
- SQL optimization: Flink uses Calcite for SQL parsing and planning, but it lacks awareness of the data, which limits its ability to optimize SQL effectively. Therefore, in some queries, it may be necessary to rewrite the SQL to achieve higher performance manually.
- RocksDB optimization: Flink uses RocksDB for internal state management. However, since RocksDB is not aware of the computation and acts only as a storage layer, users often need to tune RocksDB parameters for optimal performance manually. It's worth noting that RocksDB has hundreds of tunable parameters, so achieving optimal RocksDB performance requires a deep understanding of its internal structure.
How to degrade performance in RisingWave?
The most likely way to degrade performance in RisingWave is by reducing the memory of the compute nodes and introducing irregular access patterns that result in a high cache miss rate. The underlying principle is that RisingWave uses remote storage to maintain the internal state and caches the most frequently accessed state on compute nodes. When the compute node capacity is limited, and access patterns are irregular, it can disrupt the caching strategy (RisingWave uses an LRU-based algorithm) and cause frequent access to remote storage. Accessing remote storage is often costly, and frequent access inevitably leads to a decrease in overall performance.
This article presents a performance comparison between Flink and RisingWave based on the Nexmark benchmark test. While a single benchmark cannot cover all aspects of a stream processing system, it provides a general understanding of the performance differences and underlying reasons between Flink and RisingWave in common scenarios. Apart from stream processing capabilities, Flink offers other features worth exploring, including batch processing, machine learning, and StateFun. In contrast, RisingWave will focus more on optimizing the performance and efficiency of stream processing.
Performance is not the only criterion for evaluating the superiority or inferiority of a system, but we strive to achieve optimal performance while balancing other important factors such as scalability, fault tolerance, and ease of use.