Getting Started With Event-Driven Computation
This article discusses how event-driven computation works and the various tools that can be used to implement different stages of the event-driven computation process.
Everything that happens in the world is an event that reflects the current state of a system. When you recall memories, look through images in your gallery, or read history books in a library, you are going through records of past events. In these cases, the records of those events are stored in your brain, your gallery, or the library.
Similarly, every interaction you have on the internet can be regarded as an event. As a developer, besides initiating these events, you are also responsible for designing systems to respond to them in the applications you build. Event-driven computation helps you do this effectively.
Event-driven computation is a programming paradigm that focuses on processing events as they happen by delegating control to a system that watches for events to occur. An event is a significant change in the state of an application, such as a user clicking a button on a website, a sensor detecting a temperature change in the environment, or a fraud detection system flagging a transaction. Many applications rely on this model, including graphical user interfaces, fraud detection systems, and real-time analytics dashboards.
Batch processing is an alternative to processing events as they occur: it waits for data to accumulate before processing it. While this is useful in some situations, many applications and their users require near-real-time responses to events. Event-driven computation achieves this by significantly reducing latency. It also helps applications use resources more efficiently by breaking operations down into smaller units that can be completed faster than batch operations, which in turn reduces overall system complexity, something that matters when dealing with systems that involve intensive computation.
This article will discuss event-driven computation and its advantages. You will learn how event-driven computation works and take a look at the various tools that can be used to implement different stages of the event-driven computation process.
Event-driven computation is the key component that enables real-time systems to function, and many of the online applications we rely on today are built on this model. The following are just a few of its benefits.
The responsiveness of a system can be described as its ability to complete a specific task in the shortest amount of time. For instance, when you click a button in a mobile application, you want it to perform its intended function quickly and correctly. With event-driven computation, the latency of your system will be significantly reduced since each designated event is processed as it occurs and not in batches.
Complexity arises in data applications when different services must directly communicate with one another to process ingested data. With event-driven computation, you can decouple your system into individual units that don’t need to communicate directly with one another. As a result, the data flow is separated from the application’s core logic. This makes your system easy to design, maintain, and update.
For instance, consider a music processing pipeline that includes services for ingesting tracks and extracting both low-level and high-level information from them. In a traditional system, the ingestion service would need to coordinate directly with the other services to ensure the track is processed. In an event-driven system, however, the ingestion service simply publishes an event when music is ingested. The other services listen to their respective events and perform their tasks. This makes your system easier to manage and less complex.
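To make the decoupling concrete, here is a minimal in-memory sketch of the pattern in Python. The service names, the track_ingested event, and the in-process subscriber registry are purely illustrative; a production system would publish and subscribe through a message broker instead.

from collections import defaultdict

# Illustrative in-process event registry standing in for a message broker.
subscribers = defaultdict(list)

def subscribe(event_type, handler):
    subscribers[event_type].append(handler)

def publish(event_type, payload):
    # The publisher does not know or care which services react to the event.
    for handler in subscribers[event_type]:
        handler(payload)

def extract_low_level_features(track):
    print(f"Extracting low-level features from {track['title']}")

def extract_high_level_features(track):
    print(f"Extracting high-level features from {track['title']}")

subscribe("track_ingested", extract_low_level_features)
subscribe("track_ingested", extract_high_level_features)

# The ingestion service simply publishes the event; the other services react.
publish("track_ingested", {"title": "example.mp3"})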
Cutting your operating costs is a direct way to increase your company’s profitability. In a batch processing system, a significant amount of memory and computational power has to be set aside for data storage and processing. In an event-driven system, however, your application processes events as they occur, in smaller volumes. Far less memory needs to be allocated, and allocation can be done on demand. This helps you and your organization save resources.
To successfully implement event-driven computation, you must first understand the key steps involved: generating events, streaming data, creating a materialized view, querying the view, and persisting data. The flow of data from an IoT temperature sensor installed in a supermarket’s cold storage room will be used as an example to help explain these steps. This storage room is assumed to hold perishable food items such as fruits and vegetables that must be kept within 1 to 2 degrees Fahrenheit of their optimal temperature to extend their shelf life.
The first stage in event-driven computation is the generation of an event in an application. This indicates that there has been a state change that requires attention. Users, other applications, or the system itself can trigger an event.
In the temperature sensor example, temperature values are recorded at distinct timestamps, and each reading is treated as an event to be streamed. Without this critical step, the entire system would lie dormant because there would be no initial trigger; the temperature sensor would be nothing more than a thermometer.
Events can be generated using programming languages like Java and Python, as well as frameworks like Apache Kafka and Amazon Kinesis. A sample temperature reading event in JSON format would look something like this:
{
    "temperature": 1,
    "timestamp": "2023-03-23 14:11:04"
}
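As an illustration, the following Python sketch shows how a sensor process might produce such an event. The read_temperature helper is hypothetical and simply simulates a reading; a real deployment would read from the sensor’s driver or SDK.

import json
import random
from datetime import datetime, timezone

def read_temperature() -> float:
    # Hypothetical stand-in for reading the actual sensor hardware.
    return round(random.uniform(0.5, 2.5), 1)

def generate_event() -> str:
    # Build the event payload in the same shape as the JSON sample above.
    event = {
        "temperature": read_temperature(),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S"),
    }
    return json.dumps(event)

print(generate_event())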
After an event is generated, it is streamed to the appropriate application for processing. Streaming is the process of continuously transmitting data in real time, rather than in batch mode. It is vital for event-driven computation because it enables the application to respond to the action initiated by the event generation step in real time.
After recording a new temperature value, the sensor must send the data to an application or control system that monitors the readings and makes decisions as needed.
Streaming is carried out using messaging systems or platforms like Apache Kafka, Amazon Kinesis, and Google Cloud Pub/Sub. These systems are designed to provide fault tolerance, reliable delivery, scalability, and security for streamed data.
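For example, the events generated above can be streamed to Kafka with a few lines of Python. The sketch below uses the kafka-python package; the broker address and topic name mirror the RisingWave source defined later in this article and are assumptions about your environment.

import json
from kafka import KafkaProducer  # pip install kafka-python

# Serialize each event dictionary to JSON before sending it to the topic.
producer = KafkaProducer(
    bootstrap_servers="message_queue:29092",
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
)

def publish_reading(temperature: float, timestamp: str) -> None:
    producer.send("temperature_reading", {"temperature": temperature, "timestamp": timestamp})

publish_reading(1.0, "2023-03-23 14:11:04")
producer.flush()  # block until the event has been delivered to the broker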
A materialized view is a data snapshot that represents the system’s current state based on all events processed up to a given point. This is especially useful in streaming scenarios because computations can be performed incrementally as events are streamed, reducing the computational burden when the processed data is needed later on.
Materialized views can be stored in memory or a database for later access by other services requiring current information. You can use tools like RisingWave, Apache Spark, or Amazon Redshift to create a materialized view.
Let’s consider how you’d use RisingWave to create a materialized view of temperature readings. RisingWave requires that you first create a source stream that feeds data to the materialized view. Setting up a temperature data stream in Kafka allows you to connect to it in RisingWave with the following SQL statement:
CREATE SOURCE temperature_reading (
    timestamp TIMESTAMP,
    temperature FLOAT
) WITH (
    connector = 'kafka',
    topic = 'temperature_reading',
    properties.bootstrap.server = 'message_queue:29092',
    scan.startup.mode = 'earliest'
) ROW FORMAT JSON;
This code block uses the CREATE SOURCE command to define a source called temperature_reading that consumes JSON messages from a Kafka data stream. Each reading is expected to include a temperature value and a timestamp.
Next, you’d create the materialized view using the command below:
CREATE MATERIALIZED VIEW avg_temp_1min AS
SELECT
    AVG(temperature) AS avg_temp,
    window_end AS reported_timestamp
FROM
    TUMBLE(
        temperature_reading,
        timestamp,
        INTERVAL '1' MINUTE
    )
GROUP BY
    window_end;
The code above uses the CREATE MATERIALIZED VIEW command to create a materialized view named avg_temp_1min, which calculates the average temperature reading from the sensor every minute. The temperature events are divided into time windows, and the average temperature per window is aggregated in this view. The TUMBLE function maps each event into a one-minute window, after which the average temperature is calculated within each window. The end of the time window is used as the reported timestamp.
The next step is to query the view. Queries, such as SQL queries and GraphQL queries, are used to retrieve relevant data from the materialized view for use in other applications.
Depending on the application’s requirements, these queries could be simple or complex. The query results could be used to gain insights into the system’s state, perform actions based on business logic, or trigger additional events. Materialized views can be queried using RisingWave, Kafka Streams, Spark SQL, and other tools.
The avg_temp_1min materialized view you created earlier can be queried as follows:
SELECT * FROM avg_temp_1min
ORDER BY reported_timestamp;
The command in this code block selects all time-windowed average temperature readings and sorts them according to the reported timestamp. The average temperature can also be reported as a metric to a tool like Grafana, which can visualize it and send alerts when temperature readings exceed a certain threshold.
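As a sketch of how an application might consume this query, the Python snippet below polls the view and flags windows whose average drifts outside the expected range. RisingWave speaks the PostgreSQL wire protocol, so a standard Postgres client such as psycopg2 can run the query; the host, port, database, and user shown here are assumptions based on a default local setup.

import psycopg2  # pip install psycopg2-binary

# Assumed connection defaults for a local RisingWave instance.
conn = psycopg2.connect(host="localhost", port=4566, dbname="dev", user="root")

with conn.cursor() as cur:
    cur.execute(
        "SELECT avg_temp, reported_timestamp FROM avg_temp_1min ORDER BY reported_timestamp;"
    )
    for avg_temp, reported_timestamp in cur.fetchall():
        # Flag windows whose average falls outside the 1-2 degree target range.
        if not (1.0 <= avg_temp <= 2.0):
            print(f"ALERT: average temperature {avg_temp} at {reported_timestamp}")

conn.close()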
After the data has been queried, the final step is to save it, which entails permanently storing it for later retrieval. Persistence can be achieved by writing data to a file, storing it in a database, or storing it in a data warehouse. Persisting data is crucial because it lets you store historical data, which allows you to perform long-term analysis later.
Some tools for persisting data include databases like MySQL, PostgreSQL, and Apache Cassandra, as well as data warehouses like Amazon Redshift and Google BigQuery.
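As a minimal sketch, the windowed averages queried above could be written to a PostgreSQL table for long-term analysis. The connection settings and the temperature_history table below are assumptions made for illustration.

import psycopg2  # pip install psycopg2-binary

# Rows as returned by the avg_temp_1min query: (avg_temp, reported_timestamp).
rows = [(1.4, "2023-03-23 14:12:00"), (1.6, "2023-03-23 14:13:00")]

conn = psycopg2.connect(
    host="localhost", port=5432, dbname="warehouse", user="postgres", password="postgres"
)
with conn, conn.cursor() as cur:
    # Create the history table on first run, then append the new averages.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS temperature_history ("
        "  avg_temp DOUBLE PRECISION,"
        "  reported_timestamp TIMESTAMP"
        ");"
    )
    cur.executemany(
        "INSERT INTO temperature_history (avg_temp, reported_timestamp) VALUES (%s, %s);",
        rows,
    )
conn.close()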
Conclusion
This article explained event-driven computation and its benefits, such as responsiveness, reduced complexity, and resource efficiency. You also learned about the stages of the event-driven computation process and some of the typical tools used to complete each step, such as RisingWave and Amazon Kinesis.
Using event-driven computation, you can create applications that process data more effectively and quickly respond to events. This is important for applications such as financial trading systems, sensor networks, and social media platforms.
Fortune Adekogbe
Community Contributor