RisingWave's standard TIMESTAMP data type (also known as TIMESTAMP WITHOUT TIME ZONE) now offers full, automatic support for high-precision timestamps, up to nanosecond resolution (9 decimal places for seconds). This enhancement ensures high fidelity for time-sensitive data both during internal processing and when interacting with external systems. This guide covers how this works, how to use it, and important considerations regarding precision.
Note that this feature is currently in public preview, meaning it is nearing the final product but may not yet be fully stable. If you encounter any issues or have feedback, please reach out to us via our Slack channel. Your input is valuable in helping us improve this feature.
Key Improvement: Seamless High-Precision Handling
RisingWave excels in handling nanosecond precision in two primary ways:
Data Ingestion & Sinking (Import/Export):
RisingWave now automatically preserves full precision (up to nanoseconds) when reading TIMESTAMP data from external sources (like Parquet and Iceberg) or writing to them.
Benefit: No precision is lost during these I/O operations, where truncation to microseconds could previously occur. This happens transparently, without changes to your table definitions or connector configurations.
Internal Storage & Same-Type Operations:
The TIMESTAMP type accurately stores values at the precision provided in the input, whether that is nanoseconds, microseconds, milliseconds, or coarser. Operations that take TIMESTAMP input and produce TIMESTAMP output (such as comparisons, EXTRACT, TO_CHAR, DATE_TRUNC, MAKE_TIMESTAMP, GENERATE_SERIES, and assignments) correctly maintain and operate on the stored precision.
How to Use High-Precision Timestamps
1. Defining Tables
Use the standard TIMESTAMP type. No special syntax is required.
CREATE TABLE high_freq_events (
event_id BIGINT,
event_time TIMESTAMP, -- Automatically handles precision based on input, up to nanoseconds
payload JSONB
);
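To confirm the behavior, you can insert a nanosecond-precision literal directly and read it back (a quick check with illustrative values):
INSERT INTO high_freq_events VALUES
    (101, '2024-03-15 10:30:45.123456789', '{"sensor": "a1"}'::jsonb);
FLUSH; -- Make the insert immediately visible to the following read
SELECT event_time FROM high_freq_events WHERE event_id = 101;
-- Expected: 2024-03-15 10:30:45.123456789 (all nine fractional digits retained)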
2. Ingesting Data from an Iceberg Source (or Inserting to Tables Directly)
Define a SOURCE in RisingWave to connect to your existing Iceberg table. RisingWave's Iceberg connector automatically reads the timestamp data, preserving nanosecond precision where it is present in the source Iceberg table (often stored within the underlying Parquet files).
-- Prerequisite: An Iceberg table exists, e.g., 'my_catalog.my_db.source_events',
-- containing columns 'event_id', 'event_time' (with potential nanoseconds), and 'payload'.
-- Column names in the source definition must match the Iceberg table's column names.
CREATE SOURCE iceberg_source (
event_id BIGINT,
event_time TIMESTAMP, -- Maps to Iceberg timestamp, preserving source precision
payload JSONB
)
WITH (
connector = 'iceberg',
catalog.type = 'hive', -- Or 'rest', 'glue', 'hadoop' etc.
warehouse.path = 's3a://my-bucket/my-warehouse/', -- Or local path like 'file:///path/to/warehouse'
uri = 'thrift://my-hive-metastore:9083', -- Needed for Hive catalog type
-- database.name = 'my_db', -- Optional: Specify DB if needed
table.name = 'my_catalog.my_db.source_events' -- Fully qualified Iceberg table name
-- s3.endpoint = '...', -- Optional: S3 endpoint if needed
-- s3.access.key = '...', -- Optional: S3 credentials
-- s3.secret.key = '...'
);
-- Explanation:
-- When RisingWave reads from 'my_catalog.my_db.source_events',
-- if the 'event_time' column contains a value like
-- '2024-03-15 10:30:45.123456789', that full nanosecond precision
-- is automatically read and stored in the 'event_time' column within
-- RisingWave (and in any materialized view built on the source, such as
-- high_freq_events_mv below).
-- No special handling is required in the WITH clause for precision.
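For instance, a materialized view over the source (a minimal sketch) carries the nanosecond values through unchanged:
CREATE MATERIALIZED VIEW high_freq_events_mv AS
SELECT event_id, event_time, payload
FROM iceberg_source;
-- 'event_time' in the view retains whatever precision the Iceberg data had.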
3. Querying and Processing (Precision-Preserving Operations)
Most common operations preserve the nanosecond precision:
-- Comparisons and ordering respect full precision
SELECT * FROM high_freq_events
WHERE event_time >= '2024-03-15 10:30:45.123456000' -- Comparing against microseconds
ORDER BY event_time DESC;
-- DISTINCT and GROUP BY differentiate based on exact timestamp values
SELECT DISTINCT event_time FROM high_freq_events;
SELECT event_time, count(*) FROM high_freq_events GROUP BY event_time;
-- Aggregates like MAX/MIN operate on full precision
SELECT MAX(event_time) FROM high_freq_events;
-- EXTRACT allows access to fractional parts at various granularities
SELECT event_id,
       EXTRACT(second FROM event_time),     -- e.g., 45.123456789
       EXTRACT(nanosecond FROM event_time)  -- e.g., 123456789
FROM high_freq_events WHERE event_id = 101;
-- TO_CHAR formats output; '.NS' shows nanoseconds (padding if needed)
SELECT event_id, TO_CHAR(event_time, 'YYYY-MM-DD HH24:MI:SS.NS')
FROM high_freq_events;
-- Example output might show: ... 10:30:45.123456789 or ... 10:31:00.987654000
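DATE_TRUNC also returns a TIMESTAMP, so only digits below the named granularity are zeroed (a quick illustration, assuming PostgreSQL-style field names):
SELECT DATE_TRUNC('milliseconds', '2024-03-15 10:30:45.123456789'::timestamp);
-- Result: 2024-03-15 10:30:45.123 (everything below the millisecond is truncated)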
Functions like MAKE_TIMESTAMP and GENERATE_SERIES also support high-precision inputs and outputs.
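For example (a sketch with illustrative values):
-- Construct a timestamp with fractional seconds down to nanoseconds
SELECT MAKE_TIMESTAMP(2024, 3, 15, 10, 30, 45.123456789);
-- Expected: 2024-03-15 10:30:45.123456789

-- Generate a series from a nanosecond-precision start; the nanosecond
-- offset of the start value is carried into every generated row
SELECT * FROM GENERATE_SERIES(
    '2024-03-15 10:30:45.123456789'::timestamp,
    '2024-03-15 10:30:45.123458789'::timestamp,
    '0.000001 second'::interval
);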
Key Considerations: Precision Loss During Type Conversion
Crucially, precision can be lost (truncated to microseconds) when converting TIMESTAMP values to other time-related types.
General Rule: Any function or operation where the input is TIMESTAMP and the output is TIMESTAMPTZ, TIME, or INTERVAL will likely result in the fractional seconds being truncated to microseconds (6 digits).
Specific Examples:
- Timestamp Subtraction: Subtracting two TIMESTAMP values results in an INTERVAL. The fractional seconds of the resulting interval have microsecond precision, even if the input timestamps had nanoseconds.
SELECT '2022-03-13 03:00:00.123456789'::timestamp - '2022-03-13 01:00:00.000000000'::timestamp;
-- Result: 02:00:00.123456 (Interval with microsecond precision, nanoseconds truncated)
- Casting to TIMESTAMPTZ: Converting a TIMESTAMP to TIMESTAMP WITH TIME ZONE truncates to microseconds.
SELECT CAST('2024-03-15 10:30:45.123456789'::timestamp AS TIMESTAMPTZ);
-- Result likely: 2024-03-15 10:30:45.123456+00:00 (or local TZ equivalent)
- Casting to TIME: Converting a TIMESTAMP to TIME also truncates to microseconds.
SELECT CAST('2024-03-15 10:30:45.123456789'::timestamp AS TIME);
-- Result likely: 10:30:45.123456
- Casting to DATE: This inherently removes the time part, so precision loss is expected and not specific to the microsecond truncation rule.
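For comparison:
SELECT CAST('2024-03-15 10:30:45.123456789'::timestamp AS DATE);
-- Result: 2024-03-15 (the time-of-day component is dropped entirely)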
Other Considerations:
- NOW() Precision: The NOW() function's precision depends on the system clock and may not reach nanoseconds.
- Client Display: Use TO_CHAR with '.NS' for explicit nanosecond formatting, as clients might display fewer digits by default.
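A quick check (illustrative; the trailing digits depend on your system clock's resolution):
SELECT TO_CHAR(NOW()::timestamp, 'YYYY-MM-DD HH24:MI:SS.NS');
-- Digits beyond your clock's resolution appear as trailing zeros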
Value Proposition & Use Cases
Despite the conversion limitations, the core nanosecond support offers significant value:
High Data Fidelity: Accurate storage and I/O for high-resolution time data.
Precise Event Ordering & Analysis: Critical for finance, logging, IoT, tracing.
Simplified I/O: Transparent precision preservation during ingestion/sinking.
Use Cases Benefiting Most:
Storing & Querying High-Precision Logs: Ingesting logs with nanosecond timestamps from sources like Fluentd/Vector (via Kafka) and querying them accurately.
Financial Data Analysis: Storing tick data and performing comparisons, aggregations, and window functions directly on nanosecond timestamps.
IoT Sensor Streams: Retaining the exact time of rapid sensor readings.
Data Archival/Lakehousing: Sinking processed streams to Iceberg/Parquet while preserving original nanosecond event times.
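For the archival case, sinking is symmetric to the source example above. A minimal sketch follows; the sink name and connection parameters are assumptions that must match your own catalog setup:
CREATE SINK events_archive FROM high_freq_events_mv
WITH (
    connector = 'iceberg',
    type = 'append-only',  -- Assumed sink mode for this illustration
    catalog.type = 'hive',
    warehouse.path = 's3a://my-bucket/my-warehouse/',
    uri = 'thrift://my-hive-metastore:9083',
    database.name = 'my_db',
    table.name = 'archived_events'
);
-- Nanosecond values in 'event_time' are written out without truncation.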
Conclusion
RisingWave now automatically handles nanosecond precision for its TIMESTAMP type, improving accuracy for time-sensitive data, especially during data import/export. While most operations preserve this precision, remember that converting to TIMESTAMPTZ, TIME, or INTERVAL truncates to microseconds. Keep this nuance in mind as you leverage RisingWave's enhanced temporal capabilities.