When building a stream processing pipeline, two key components are required: a tool for processing and transforming data, and a tool for visualizing it. It is crucial that these tools work well together. RisingWave and Redash make an excellent pairing for this part of the pipeline.
RisingWave is a distributed SQL streaming database that supports real-time data processing. It offers incrementally updated materialized views, enabling instant joining, transformation, cleaning, and analysis of streaming data using SQL. RisingWave is compatible with PostgreSQL and seamlessly integrates with various upstream and downstream connectors. This makes it effortless to incorporate RisingWave into any existing stream processing technology stack.
Redash is a user-friendly, open-source tool for data visualization and business intelligence (BI). It offers a powerful user interface that excels in exploring, analyzing, sharing, and presenting data. With Redash, you can configure visualizations and dashboards to refresh automatically at regular intervals or on demand, ensuring access to up-to-date data. The tool supports a wide range of data sources, making it a popular choice for data visualization across various platforms.
In this blog, we will demonstrate how to easily configure Redash to read data from a RisingWave database to start setting up a stream processing pipeline.
Set up RisingWave
First, we will deploy a RisingWave instance and create a data source.
Start a RisingWave instance
To learn how to start a RisingWave instance locally, see the Quick start guide. For this blog, we will deploy RisingWave locally. If you would like to use the fully managed version, see RisingWave Cloud and its corresponding Quick start guide.
Create a table
Let’s create a table with some data in RisingWave to query from later in Redash. As RisingWave offers a wide array of connectors, you can ingest data from popular streaming services and databases. For this demo, we use a data generator that sends mock data to a Kafka topic. The data we use has the following format.
{
  "year": year,
  "month": month,
  "day": day,
  "temperature": temperature in Celsius,
  "humidity": humidity in %,
  "rainfall": rainfall in millimeters
}
Use the CREATE SOURCE command to establish a connection with the Kafka broker. The following query creates the source weather and defines the schema of the data. In RisingWave, creating a source does not by itself start ingesting data from the upstream system; ingestion begins once we create a materialized view on top of the source. Alternatively, if the data needs to be persisted in RisingWave, CREATE TABLE can be used instead.
CREATE SOURCE weather (
    year INTEGER,
    month INTEGER,
    day INTEGER,
    temperature DOUBLE PRECISION,
    humidity DOUBLE PRECISION,
    rainfall DOUBLE PRECISION
) WITH (
    connector = 'kafka',
    topic = 'weather_observations',
    properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;
Next, we will create a materialized view that finds the average temperature, humidity, and rainfall for each month of each year.
CREATE MATERIALIZED VIEW monthly_avg AS
SELECT
    year,
    month,
    avg(temperature) AS avg_temp,
    avg(humidity) AS avg_humidity,
    avg(rainfall) AS avg_rainfall_mm
FROM weather
GROUP BY year, month;
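The grouping logic this view maintains can be mimicked in plain Python as a quick sanity check. This is only an illustration: RisingWave maintains the result incrementally as new Kafka messages arrive, with no recomputation. The sample observations below are made up.

```python
from collections import defaultdict

# A few made-up observations in the same shape as the `weather` source.
observations = [
    {"year": 2023, "month": 6, "temperature": 20.0, "humidity": 60.0, "rainfall": 2.0},
    {"year": 2023, "month": 6, "temperature": 24.0, "humidity": 50.0, "rainfall": 0.0},
    {"year": 2023, "month": 7, "temperature": 28.0, "humidity": 40.0, "rainfall": 1.0},
]

# Group by (year, month), then average each metric, as the view does.
groups = defaultdict(list)
for obs in observations:
    groups[(obs["year"], obs["month"])].append(obs)

monthly_avg = {
    key: {
        "avg_temp": sum(o["temperature"] for o in rows) / len(rows),
        "avg_humidity": sum(o["humidity"] for o in rows) / len(rows),
        "avg_rainfall_mm": sum(o["rainfall"] for o in rows) / len(rows),
    }
    for key, rows in groups.items()
}

print(monthly_avg[(2023, 6)])
# {'avg_temp': 22.0, 'avg_humidity': 55.0, 'avg_rainfall_mm': 1.0}
```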
Now we have a materialized view to query data from.
Connect to RisingWave in Redash
Once RisingWave is set up, let’s start a Redash instance and connect to our RisingWave database.
Set up Redash
To learn how to start a Redash instance, see Setting up a Redash instance, which covers the different deployment methods. For demonstration purposes, we use a Docker Compose file to deploy Redash and all of its necessary components.
Log in to or create a Redash account after starting a Redash instance.
Connect to RisingWave
Start by connecting to a new data source. From the list of available options, find and select PostgreSQL. Since RisingWave is compatible with PostgreSQL, we can connect to RisingWave from Redash by setting up a new PostgreSQL data source.
Fill in the configuration according to your RisingWave database. Here, we are using the default port, database, and user for RisingWave. Since we deployed Redash with Docker and RisingWave locally, the corresponding host is host.docker.internal; the host address will differ based on your setup. Click Create, then test the connection to ensure Redash can read data from RisingWave.
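For reference, a local setup like ours typically maps to the following PostgreSQL data source settings in Redash. The values below reflect RisingWave's defaults and our Docker-based Redash deployment; adjust them to match your own environment.

```
Host:     host.docker.internal   # Redash runs in Docker; RisingWave runs on the host
Port:     4566                   # RisingWave's default SQL port
User:     root                   # default user; no password is set by default
Password: (leave empty)
Database: dev                    # default database
```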
Once the data source is set up, we can start writing queries and creating visualizations. The New Query window lists the available RisingWave tables and materialized views; in the side panel, we can see the monthly_avg materialized view we created earlier. We can query the materialized view, apply any necessary transformations, and create visualizations and dashboards.
With just a few simple steps, we have successfully integrated RisingWave with Redash. Redash offers a wide range of features for creating advanced visualizations and dashboards, while RisingWave provides robust capabilities for data processing, transformation, and analysis. Furthermore, RisingWave's extensive list of integrations allows you to connect it to a messaging system, enabling the construction of a comprehensive stream processing pipeline. While selecting and implementing each system in the pipeline can be challenging, RisingWave's flexibility and user-friendly nature make the process manageable. By leveraging the strengths of both RisingWave and Redash, you can build sophisticated data pipelines and unlock powerful insights from your streaming data.

If you would like to learn more about the features that RisingWave offers, see the official documentation site. You can also sign up for RisingWave Cloud for free to test it out. If you need additional help with setting up this integration, join our active Slack community.