Time Travel in the context of a Data Lakehouse refers to the ability to query data as it existed at a specific point in the past, selected either by timestamp or by snapshot ID. This feature is primarily enabled by open table formats like Apache Iceberg, Apache Hudi, and Delta Lake, which keep versioned table metadata alongside the data files.
Instead of only seeing the current state of a table, time travel allows users to "go back in time" to examine historical states, reproduce experiments, audit changes, or recover from errors.
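Open table formats achieve time travel by recording each committed change as a new version of the table (a snapshot) rather than overwriting the previous state. When a query targets an earlier snapshot, the table format's metadata tells the query engine which data files to read for that particular snapshot.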
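The lookup described above can be illustrated with a small, self-contained Python sketch. It is a toy model, not the metadata layout or API of Iceberg, Hudi, or Delta Lake: the names `Snapshot`, `SnapshotLog`, `files_for_snapshot`, and `files_as_of`, the integer timestamps, and the file names are all hypothetical and exist only to show how a snapshot ID or a timestamp might resolve to a fixed set of data files.

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_at: int            # commit time (assumed here: an integer timestamp)
    data_files: tuple[str, ...]  # every data file that makes up the table at this version


class SnapshotLog:
    """Toy append-only snapshot log (hypothetical; not a real table format)."""

    def __init__(self) -> None:
        self._snapshots: list[Snapshot] = []

    def commit(self, committed_at: int, data_files: list[str]) -> int:
        """Record a new table version and return its snapshot ID."""
        if self._snapshots and committed_at < self._snapshots[-1].committed_at:
            raise ValueError("commits must arrive in non-decreasing time order")
        snapshot_id = len(self._snapshots) + 1
        self._snapshots.append(Snapshot(snapshot_id, committed_at, tuple(data_files)))
        return snapshot_id

    def files_for_snapshot(self, snapshot_id: int) -> tuple[str, ...]:
        """Time travel by snapshot ID: return the files of that exact version."""
        for snap in self._snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap.data_files
        raise KeyError(f"unknown snapshot id: {snapshot_id}")

    def files_as_of(self, timestamp: int) -> tuple[str, ...]:
        """Time travel by timestamp: use the latest snapshot committed at or before it."""
        commit_times = [s.committed_at for s in self._snapshots]
        idx = bisect_right(commit_times, timestamp)
        if idx == 0:
            raise LookupError("no snapshot exists at or before that time")
        return self._snapshots[idx - 1].data_files


# Usage: three commits; the third replaces a.parquet with c.parquet.
log = SnapshotLog()
s1 = log.commit(100, ["a.parquet"])
s2 = log.commit(200, ["a.parquet", "b.parquet"])
s3 = log.commit(300, ["b.parquet", "c.parquet"])

print(log.files_for_snapshot(s1))  # files of the first version only
print(log.files_as_of(250))        # files of the version current at time 250
```

Old snapshots in this sketch stay readable because a commit only appends a new entry and never mutates earlier ones. Real formats rely on the same idea, which is also why time travel only reaches back as far as old snapshots and their data files are retained.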
Time travel enhances the reliability, manageability, and analytical capabilities of data lakehouses, bringing them closer to the functionality offered by traditional data warehouses. While RisingWave itself focuses on processing current and incoming streaming data to produce fresh results, its output can be sunk into lakehouse tables (e.g., Iceberg tables) that offer time travel capabilities for the data at rest.