The importance of accessing historical data
Accessing historical data alongside real-time information is crucial for many organizations. For trading firms, having snapshots of data at specific points in time is essential for compliance and auditing purposes. Business-critical systems also rely on historical data to recover from failures and restore their systems. Analyzing past events and their impact on current conditions is vital for accurate forecasting and strategic planning. This ability to look back and understand how past events have shaped the present is also key to data governance and integrity, allowing organizations to track changes and maintain a clear audit trail.
Introducing time travel queries in RisingWave
We're excited to introduce time travel queries in RisingWave, a feature designed to meet the needs of customers who require access to historical data. With time travel queries, you can retrieve data from a previous point in time within the designated time travel window. You can also save the data as a separate table or send it to a downstream system for further analysis or processing.
Availability and licensing
Time travel queries are available as part of RisingWave Premium Edition. To use this feature, you'll need to purchase a license. For more information about RisingWave Premium, including pricing, features, and benefits, check out our detailed guide: Everything You Want to Know about RisingWave Premium.
How to use time travel queries in RisingWave
Ensure you have a license key for Premium Edition
Read this article to learn more about Premium Edition and contact our sales team at sales@risingwave-labs.com to purchase a license.
Ensure that your RisingWave instance has a valid license key, by running the following SQL query. If you license key properly set up and valid, it will return t
.
SELECT rw_test_paid_tier();
Set up the environment
In order to run time travel queries in RisingWave, your environment must be set up accordingly:
- Ensure your meta store type is SQL-compatible and at least 50 GB of disk space is reserved. By default, RisingWave uses PostgreSQL as the meta store.
- The system parameter
time_travel_retention_ms
is set to a value greater than0
. This value determines how long historical should be retained. Data older than this value will be deleted. For instance, the following SQL command sets the system parameter so historical data is retained for a day.
ALTER SYSTEM SET time_travel_retention_ms = 86400000;
Syntax for time travel queries
Use the FOR SYSTEM_TIME AS OF
clause in your SELECT
query to access data at the specified point in time. The clause should include a time value.
The historic time can be specified in the following formats. The following queries query historic data from the table historic_table
.
- Unix timestamp in seconds
SELECT * FROM historic_table
FOR SYSTEM_TIME AS OF 1728025608;
- Datetime string
SELECT * FROM historic_table
FOR SYSTEM_TIME AS OF '2024-10-02T12:13:14-08:30';
- An interval relative to the current time
SELECT * FROM historic_table
FOR SYSTEM_TIME AS OF NOW() - '1' HOUR;
If you specify a time that is outside the time travel period, the query will error.
Persisting historical data
If you intend to persist the data from a particular point of time, you can create either a table or a sink. This allows you to access the data even after it falls outside of the time travel retention window. For example, to preserve the state of a table from an hour ago, you can create a new table based on the historical data as below.
CREATE TABLE table_at_202409202300 AS
SELECT * FROM historic_table
FOR SYSTEM_TIME AS OF NOW() - '1' HOUR;
Automatic space reclamation
Stale time travel data in both the meta and object stores is automatically removed in the background to free up storage space. The default configuration is usually sufficient for most situations, but you can customize them if needed.
- The interval to reclaim the meta store is 30 seconds by default. You can change this interval by adjusting the
vacuum_interval_sec
setting. You’ll need to restart the meta node for it to take effect. - The interval to reclaim the object store is 1 day by default. You can customize this schedule by adjusting two settings:
full_gc_interval_sec
andmin_sst_retention_time_sec
. Similarly, you’ll need to restart the meta node for the new setting to take effect.
>
With the introduction of time travel queries in RisingWave, we're excited to see the innovative use cases that our customers will develop. From enhancing fraud detection and audit trails in financial services to optimizing predictive maintenance, the possibilities are vast. We invite you to try out time travel queries and share your feedback with us. > >
>
If you would like to stay up to date on what RisingWave is up to, sign up for our monthly newsletter. Follow us on Twitter and LinkedIn, and join our Slack community to talk to our engineers and hundreds of streaming enthusiasts. > >