How to Detect DNS Tunneling with Streaming SQL
DNS tunneling hides data inside DNS query names to bypass firewalls and exfiltrate information. RisingWave, a PostgreSQL-compatible streaming database, can analyze a continuous stream of DNS events from Kafka and maintain materialized views that flag statistical anomalies such as high query rates, long subdomain labels, and unusual record types in real time.
What Is DNS Tunneling and How Does It Work?
DNS tunneling is a technique that encodes arbitrary data inside the subdomain portion of a DNS query. Because virtually every corporate network allows outbound DNS traffic, an attacker who has compromised a host can communicate with a remote command-and-control server by sending carefully crafted DNS queries rather than direct TCP connections.
The mechanics are straightforward. A tunneling client such as iodine or dnscat2 takes a raw payload, base32- or base64-encodes it, and prepends the result as a subdomain label to a domain the attacker controls. A query might look like:
v2hpY2hlbmNvZGVkdGV4dA.attacker-c2.com
The authoritative DNS server for attacker-c2.com is controlled by the attacker, who decodes the subdomain and reads the embedded data. Responses also carry encoded payloads in TXT or NULL records, completing a two-way channel entirely inside DNS traffic.
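The encoding step is simple to sketch. The following Python is a simplified illustration of the technique, not the actual iodine or dnscat2 implementation; `encode_payload` is a hypothetical helper introduced here:

```python
import base64

def encode_payload(payload: bytes, parent_domain: str, max_label: int = 63) -> list[str]:
    """Split a payload into DNS query names the way a tunneling client would.

    Base32 output uses only [A-Z2-7], which is safe inside DNS labels
    (DNS names are case-insensitive). RFC 1035 caps each label at 63
    octets, so longer payloads span multiple queries.
    """
    encoded = base64.b32encode(payload).decode().rstrip("=").lower()
    chunks = [encoded[i:i + max_label] for i in range(0, len(encoded), max_label)]
    return [f"{chunk}.{parent_domain}" for chunk in chunks]

# One short payload fits in a single query; larger payloads fan out.
for query in encode_payload(b"stolen credentials: admin:hunter2", "attacker-c2.com"):
    print(query)
```

Each emitted query name is what the resolver forwards, eventually reaching the attacker's authoritative server.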
Statistical Signatures of DNS Tunneling
DNS tunneling leaves measurable statistical fingerprints that differ from legitimate DNS traffic:
High query rate to a single domain. Normal clients resolve a given hostname a handful of times per hour. A tunneling client may send hundreds or thousands of queries per minute to the same parent domain, because each query carries only a few hundred bytes of payload.
Unusually long subdomain labels. RFC 1035 limits each DNS label to 63 octets. Legitimate subdomains are short and human-readable (e.g., mail, api, cdn). Base32-encoded payloads fill the label to its maximum, so tunneled subdomains are consistently 30 to 63 characters long.
High entropy in subdomain strings. Human-readable names use a small, uneven character set. Base32 and base64 output uses a larger character set with near-uniform distribution, producing high Shannon entropy. A subdomain like xn4r9q2k8mzpvtf1a7y has very different entropy from api or docs.
Preference for TXT, NULL, and CNAME record types. Standard web traffic generates mostly A and AAAA queries. TXT and NULL record responses can carry hundreds of bytes of arbitrary data, making them the preferred return channel for tunneling tools. A host that primarily issues TXT queries is suspicious.
Low TTL values. Tunneling tools frequently set TTL to zero or very low values to prevent caching, ensuring every request travels to the authoritative server where the attacker can read it.
How to Stream DNS Events into RisingWave
Assume DNS query logs are forwarded to a Kafka topic called dns_queries. Each message is a JSON object with the following fields:
| Field | Type | Description |
|---|---|---|
| query_id | VARCHAR | Unique query identifier |
| src_ip | VARCHAR | Client IP address |
| domain | VARCHAR | Parent domain (e.g., attacker-c2.com) |
| subdomain | VARCHAR | Subdomain label (e.g., v2hpY2g) |
| record_type | VARCHAR | DNS record type (A, AAAA, TXT, NULL, CNAME) |
| query_len | INT | Total query name length in characters |
| query_time | TIMESTAMPTZ | Timestamp when the query was observed |
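For concreteness, a single Kafka message conforming to this schema might look like the following; all field values here are illustrative:

```python
import json

# Illustrative event matching the dns_queries schema above.
# query_len = len(subdomain) + 1 (the dot) + len(domain).
event = {
    "query_id": "q-000123",
    "src_ip": "10.0.4.17",
    "domain": "attacker-c2.com",
    "subdomain": "v2hpY2hlbmNvZGVkdGV4dA",
    "record_type": "TXT",
    "query_len": 38,
    "query_time": "2024-05-01T12:00:00Z",
}
print(json.dumps(event))
```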
Create a Kafka source in RisingWave:
CREATE SOURCE dns_queries (
query_id VARCHAR,
src_ip VARCHAR,
domain VARCHAR,
subdomain VARCHAR,
record_type VARCHAR,
query_len INT,
query_time TIMESTAMPTZ
)
WITH (
connector = 'kafka',
topic = 'dns_queries',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
RisingWave begins consuming events immediately. From this source you can define materialized views that compute rolling statistics continuously as new events arrive.
How to Detect High Query Rates with a Tumbling Window
The first signal is query volume. A legitimate host should resolve a given domain rarely. A tunneling client sends many queries per minute because it is transferring data one query at a time.
Use a 5-minute tumbling window to count queries per source IP per domain:
CREATE MATERIALIZED VIEW dns_query_rate AS
SELECT
src_ip,
domain,
window_start,
window_end,
COUNT(*) AS query_count
FROM TUMBLE(
dns_queries,
query_time,
INTERVAL '5 minutes'
)
GROUP BY
src_ip,
domain,
window_start,
window_end;
RisingWave maintains this view incrementally. Every new event updates exactly the affected window bucket rather than reprocessing all historical data. A query_count exceeding a threshold such as 200 per 5-minute window for a single (src_ip, domain) pair is a strong indicator of tunneling activity.
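Conceptually, TUMBLE assigns each event to exactly one fixed-size, non-overlapping bucket by flooring its timestamp to the window boundary. A plain-Python sketch of that assignment (an illustration of the semantics, not RisingWave's implementation):

```python
from datetime import datetime, timedelta, timezone

def tumble_window_start(ts: datetime, size: timedelta) -> datetime:
    """Floor a timestamp to the start of its tumbling window."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return ts - ((ts - epoch) % size)

ts = datetime(2024, 5, 1, 12, 3, 27, tzinfo=timezone.utc)
print(tumble_window_start(ts, timedelta(minutes=5)))  # 2024-05-01 12:00:00+00:00
```

Because every event lands in exactly one bucket, an arriving event touches a single (src_ip, domain, window) counter, which is what makes the incremental maintenance cheap.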
How to Detect Long Subdomain Labels
Tunneled payloads are base32- or base64-encoded, which produces consistently long subdomain strings. Computing the average subdomain length per source and domain over the same window exposes this pattern:
CREATE MATERIALIZED VIEW dns_subdomain_length AS
SELECT
src_ip,
domain,
window_start,
window_end,
AVG(LENGTH(subdomain)) AS avg_subdomain_len,
MAX(LENGTH(subdomain)) AS max_subdomain_len,
COUNT(*) AS query_count
FROM TUMBLE(
dns_queries,
query_time,
INTERVAL '5 minutes'
)
GROUP BY
src_ip,
domain,
window_start,
window_end;
Legitimate subdomains average well under 15 characters. Tunneled subdomains average 30 to 55 characters. Any avg_subdomain_len > 30 combined with a non-trivial query count should trigger further investigation.
How to Detect Suspicious Record Type Concentration
Tunneling tools favor TXT and NULL record types because the responses can carry large payloads. Measuring what fraction of a source IP's queries for a given domain use these record types exposes tunneling behavior:
CREATE MATERIALIZED VIEW dns_record_type_concentration AS
SELECT
src_ip,
domain,
window_start,
window_end,
COUNT(*) AS total_queries,
COUNT(*) FILTER (WHERE record_type IN ('TXT','NULL','CNAME')) AS tunnel_type_queries,
ROUND(
100.0 * COUNT(*) FILTER (WHERE record_type IN ('TXT','NULL','CNAME'))
/ NULLIF(COUNT(*), 0),
2
) AS tunnel_type_pct
FROM TUMBLE(
dns_queries,
query_time,
INTERVAL '5 minutes'
)
GROUP BY
src_ip,
domain,
window_start,
window_end;
A tunnel_type_pct above 60 percent for a domain that also shows elevated query rates is highly suspicious. Normal browsing traffic for any domain should be predominantly A and AAAA queries.
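For one window's worth of events, the FILTER aggregation reduces to ordinary counting. The sample record types below are illustrative:

```python
# One window of record types for a single (src_ip, domain) pair.
record_types = ["TXT", "TXT", "NULL", "A", "TXT", "TXT", "AAAA", "TXT", "TXT", "NULL"]

tunnel_types = {"TXT", "NULL", "CNAME"}
total = len(record_types)
tunnel_count = sum(1 for rt in record_types if rt in tunnel_types)
# NULLIF(COUNT(*), 0) in the SQL guards against division by zero.
tunnel_type_pct = round(100.0 * tunnel_count / total, 2) if total else 0.0
print(tunnel_type_pct)  # 80.0
```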
How to Combine Signals into a Unified Alert View
Individual signals can produce false positives. A content delivery network might legitimately generate many queries. A developer laptop might resolve a domain frequently during testing. Combining all three signals in a single view reduces noise:
CREATE MATERIALIZED VIEW dns_tunnel_alerts AS
SELECT
r.src_ip,
r.domain,
r.window_start,
r.window_end,
r.query_count,
l.avg_subdomain_len,
l.max_subdomain_len,
t.tunnel_type_pct,
CASE
WHEN r.query_count > 200
AND l.avg_subdomain_len > 30
AND t.tunnel_type_pct > 50
THEN 'HIGH'
WHEN r.query_count > 100
AND (l.avg_subdomain_len > 30 OR t.tunnel_type_pct > 40)
THEN 'MEDIUM'
ELSE 'LOW'
END AS severity
FROM dns_query_rate r
JOIN dns_subdomain_length l
ON r.src_ip = l.src_ip
AND r.domain = l.domain
AND r.window_start = l.window_start
JOIN dns_record_type_concentration t
ON r.src_ip = t.src_ip
AND r.domain = t.domain
AND r.window_start = t.window_start
WHERE
r.query_count > 100
OR l.avg_subdomain_len > 30
OR t.tunnel_type_pct > 40;
You can query this view at any time with a standard SQL client:
SELECT * FROM dns_tunnel_alerts
WHERE severity = 'HIGH'
ORDER BY window_start DESC
LIMIT 50;
Because dns_tunnel_alerts is a materialized view, RisingWave keeps it up to date automatically. There is no polling loop, no batch job schedule, and no separate alerting daemon required. Any downstream system, such as a SIEM or a PagerDuty integration, can subscribe to changes or poll on a short interval.
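The severity logic in the CASE expression can also be replicated in application code, for example to back-test candidate thresholds against historical query logs before altering the view. This is an illustrative helper, not a RisingWave API:

```python
def classify_severity(query_count: int, avg_subdomain_len: float, tunnel_type_pct: float) -> str:
    """Mirror of the CASE expression in dns_tunnel_alerts."""
    if query_count > 200 and avg_subdomain_len > 30 and tunnel_type_pct > 50:
        return "HIGH"
    if query_count > 100 and (avg_subdomain_len > 30 or tunnel_type_pct > 40):
        return "MEDIUM"
    return "LOW"

print(classify_severity(350, 48.2, 92.5))  # HIGH
print(classify_severity(150, 12.0, 45.0))  # MEDIUM
print(classify_severity(40, 10.0, 5.0))    # LOW
```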
How Does This Compare to Apache Flink?
The same detection logic can be implemented in Apache Flink, but the two approaches differ substantially in complexity, maintenance burden, and operational overhead.
RisingWave: SQL Materialized Views
With RisingWave you write plain SQL and the engine handles all state management. The entire detection pipeline above is fewer than 80 lines of SQL. RisingWave stores incremental state on S3, so there is no RocksDB tuning, no checkpoint interval configuration, and no separate state backend to operate. The PostgreSQL-compatible interface means any tool that speaks the PostgreSQL wire protocol can query results directly.
Apache Flink: Java DataStream API
Implementing equivalent logic in Flink requires Java or Scala code using the DataStream API. A tumbling window aggregation on query count looks roughly like:
DataStream<DnsEvent> dnsStream = env
.addSource(new FlinkKafkaConsumer<>("dns_queries", schema, properties));
dnsStream
.keyBy(e -> Tuple2.of(e.srcIp, e.domain))
.window(TumblingEventTimeWindows.of(Time.minutes(5)))
.aggregate(new QueryCountAggregator())
.addSink(new AlertSink());
Each aggregation requires a separate aggregator class. Joining the three signals requires additional KeyedCoProcessFunction implementations and careful coordination of watermarks across streams. The state backend must be configured explicitly:
env.setStateBackend(new EmbeddedRocksDBStateBackend());
env.getCheckpointConfig().setCheckpointInterval(60_000);
env.getCheckpointConfig().setCheckpointStorage("s3://checkpoints/flink");
Deploying this application requires packaging a fat JAR, submitting it to a Flink cluster, and managing the cluster separately. Schema changes require code changes, recompilation, and redeployment. Flink SQL exists but it is less mature for complex multi-stream joins and does not expose a PostgreSQL-compatible query interface.
Key Differences
| Dimension | RisingWave | Apache Flink |
|---|---|---|
| Query language | PostgreSQL SQL | Java/Scala or Flink SQL |
| State backend | S3 (managed) | RocksDB (self-managed) |
| Wire protocol | PostgreSQL | REST / Flink SQL Gateway |
| Schema change | ALTER VIEW | Recompile and redeploy |
| Operational overhead | Low | Medium to high |
| Multi-stream join | SQL JOIN | CoProcessFunction code |
For security engineering teams who already know SQL and want to ship detection logic quickly, RisingWave removes the Java build toolchain and cluster management layer entirely.
Querying Alerts and Integrating with Downstream Systems
Because RisingWave speaks PostgreSQL, you can integrate alerts into existing workflows without custom connectors:
From psql:
psql -h localhost -p 4566 -d dev -U root \
-c "SELECT src_ip, domain, severity, query_count FROM dns_tunnel_alerts WHERE severity = 'HIGH';"
From Python:
import psycopg2
conn = psycopg2.connect(
host="localhost", port=4566, dbname="dev", user="root"
)
cur = conn.cursor()
cur.execute("""
SELECT src_ip, domain, severity, query_count, tunnel_type_pct
FROM dns_tunnel_alerts
WHERE severity IN ('HIGH', 'MEDIUM')
ORDER BY window_start DESC
LIMIT 100
""")
for row in cur.fetchall():
print(row)
You can also sink alert data to a Kafka topic for downstream SIEM ingestion:
CREATE SINK dns_tunnel_alert_sink
FROM dns_tunnel_alerts
WITH (
connector = 'kafka',
topic = 'dns-tunnel-alerts',
properties.bootstrap.server = 'kafka:9092'
)
FORMAT PLAIN ENCODE JSON;
FAQ
What is DNS tunneling and why is it hard to detect?
DNS tunneling abuses the DNS protocol to encode arbitrary data inside query names and responses, allowing attackers to exfiltrate data or maintain covert command-and-control channels through firewalls that permit DNS traffic. It is hard to detect because DNS queries look legitimate at the packet level. Detection requires statistical analysis of query patterns, subdomain length, entropy, and record types across many requests over time.
What statistical signatures indicate DNS tunneling?
Key signatures include: unusually long subdomain labels (typically over 30 characters due to base32 or base64 encoding), high entropy in subdomain strings, an abnormally high query rate from a single source IP to the same parent domain, and heavy use of TXT, NULL, or CNAME record types, which carry larger payloads than standard A or AAAA queries.
How does RisingWave detect DNS tunneling differently from Apache Flink?
RisingWave is a PostgreSQL-compatible streaming database where you write plain SQL materialized views. The views are maintained continuously as new DNS events arrive from Kafka. Apache Flink requires Java or Scala DataStream code with explicit checkpoint configuration and RocksDB state management. RisingWave stores state on S3 and uses PostgreSQL-compatible syntax, making it significantly easier to write, read, and maintain threat detection logic.
Can RisingWave handle the volume of DNS traffic in a production network?
Yes. RisingWave is built in Rust and separates compute from storage, persisting state on S3. It ingests from Kafka at high throughput and maintains materialized views incrementally rather than recomputing from scratch. Enterprise networks generating millions of DNS queries per minute can feed events through Kafka, and RisingWave's incremental computation keeps detection latency in the seconds range without requiring a dedicated stateful stream processor cluster.
DNS tunneling is one of the most persistent and underdetected exfiltration techniques in enterprise environments. By ingesting DNS query streams from Kafka and maintaining lightweight SQL materialized views, RisingWave gives security teams a continuously updated, queryable view of tunneling signals without the operational overhead of a Java-based stream processing platform.
For a broader look at how streaming SQL applies to network threat detection, see RisingWave for Network Anomaly Detection.