What is an aggregated event?

An aggregated event combines multiple related events into a single event to simplify event processing and analysis. Aggregated events carry data and metadata from the original raw events and provide a summary view of what happened.

Aggregating events is useful when analyzing large volumes of granular data where looking at individual events does not provide enough context. By grouping related events together, aggregated events reveal patterns, trends and insights that would be difficult to discern from raw events alone.

Some key benefits of aggregated events include:

  • Reduced data volume – Aggregating millions of raw events into thousands of aggregated events reduces data size and complexity.
  • Faster processing and analysis – Querying and analyzing aggregated events is much faster than raw events.
  • Clearer insights – Trends and issues are more visible in aggregated event data.
  • Simplified data models – Aggregated events fit into simpler data structures compared to massive raw event data.

Aggregated events are commonly used in application performance monitoring, security information and event management (SIEM), business intelligence and analytics. They provide a bird's-eye view of system or business behavior.

When to Use Aggregated Events

Aggregated events should be used when:

  • There is a high volume of repetitive, low value events which can be consolidated.
  • Correlating related events provides more context and insight.
  • Querying and analyzing raw events is inefficient or infeasible.
  • Only summarized data is needed for reporting and monitoring.
  • Data needs to be integrated across multiple systems.

For example, an ecommerce site may log millions of page views, item views and transactions per day. Analyzing each event individually would be impractical and uninformative. But aggregating these events into visitor sessions, with per-session summaries of page views, items viewed and transactions, yields useful insight into visitor behavior.

On the other hand, aggregated events may not be useful when:

  • Granular, individual event details are needed.
  • Latency must be minimized, since aggregation adds processing time.
  • It’s unclear what events should be correlated or how they should be aggregated.
  • Data volumes are low enough to analyze raw events directly.

So when low latency or access to individual event details is required, aggregated events may not be useful. The tradeoff between raw and aggregated events depends on the use case.

How Aggregated Events are Constructed

Aggregated events are constructed by collecting related raw events, extracting relevant data and combining it into a new summarized event. Here are the main steps:

  1. Identify related events to aggregate – Events with correlated data like user sessions, transactions, etc.
  2. Determine time window – The span of time to collect events for aggregation such as 60 seconds.
  3. Apply filters – Filter raw events to the subset needed for aggregation.
  4. Extract fields – Pull relevant data from raw events into the aggregated event record.
  5. Apply functions – Calculate summaries like count, sum, average on the extracted fields.
  6. Construct new event – Populate the aggregated event record with extracted data.
  7. Send aggregated event – Publish the aggregated event to various destinations like databases, message queues, etc.

For example, to aggregate page view events into visitor sessions:

  1. Collect all page view events for a user’s visit.
  2. Use a 30 minute time window to group events into sessions.
  3. Apply filter to only include web events.
  4. Extract fields like user ID, page URL, timestamp into aggregated record.
  5. Calculate count of page views and total duration of session.
  6. Construct new session event with extracted data and summaries.
  7. Send to database for analysis.
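
To make these steps concrete, here is a minimal Python sketch of the sessionization above. It assumes raw page view events are dictionaries with hypothetical user_id, event_type, page_url and timestamp (ISO 8601) fields:

from collections import defaultdict
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)

def parse_ts(ts):
    # fromisoformat() before Python 3.11 does not accept a trailing "Z"
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def build_session_event(user_id, session):
    # Steps 4-6: extract fields, compute summaries, construct the new event
    duration = parse_ts(session[-1]["timestamp"]) - parse_ts(session[0]["timestamp"])
    return {
        "event_type": "website_session_summary",
        "key": user_id,
        "start_time": session[0]["timestamp"],
        "end_time": session[-1]["timestamp"],
        "page_views": len(session),
        "duration_seconds": int(duration.total_seconds()),
        "landing_page": session[0]["page_url"],
    }

def aggregate_sessions(raw_events):
    # Step 3: keep only web page view events
    page_views = [e for e in raw_events if e["event_type"] == "page_view"]

    # Step 1: collect each visitor's events together
    by_user = defaultdict(list)
    for event in page_views:
        by_user[event["user_id"]].append(event)

    session_events = []
    for user_id, events in by_user.items():
        events.sort(key=lambda e: parse_ts(e["timestamp"]))

        # Step 2: split the visitor's events into sessions on a 30-minute gap
        session = [events[0]]
        for prev, curr in zip(events, events[1:]):
            if parse_ts(curr["timestamp"]) - parse_ts(prev["timestamp"]) > SESSION_GAP:
                session_events.append(build_session_event(user_id, session))
                session = []
            session.append(curr)
        session_events.append(build_session_event(user_id, session))

    # Step 7: these summary events would then be published to a database or queue
    return session_events

In practice the same logic would usually run incrementally inside a stream processor rather than over a complete in-memory list, but the steps are the same.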

The aggregation logic can be implemented in different places like applications, middleware, streaming platforms or databases. The complexity can range from simple scripts to software frameworks designed specifically for aggregating event streams.

Aggregation Techniques

There are different techniques that can be used to aggregate events:

Time Window Aggregation

Time window aggregation collects events over a fixed time interval like 60 seconds. At the end of each time window, the events are aggregated and a new summary event is constructed. This provides aggregations at regular time intervals like per minute, hour or day. Time windows can be:

  • Fixed/Tumbling – Non-overlapping windows of fixed length.
  • Hopping – Overlapping windows with fixed hop interval.
  • Sliding – Overlapping windows of fixed length sliding by time or number of events.
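
For illustration, the sketch below (assuming events carry an epoch-second timestamp field, a simplification for this example) shows how a single timestamp maps to one tumbling window versus the several overlapping windows of a hopping scheme:

from collections import Counter

def tumbling_window(ts, size=60):
    # The single non-overlapping window of `size` seconds containing ts.
    start = (ts // size) * size
    return (start, start + size)

def hopping_windows(ts, size=60, hop=15):
    # Every overlapping window of length `size`, starting every `hop` seconds,
    # that contains ts. With size=60 and hop=15 each event lands in 4 windows.
    windows = []
    first = ((ts - size) // hop + 1) * hop  # earliest hop whose window still covers ts
    for start in range(max(first, 0), ts + 1, hop):
        if start <= ts < start + size:
            windows.append((start, start + size))
    return windows

def count_per_minute(events):
    # Count events per 60-second tumbling window.
    counts = Counter(tumbling_window(e["timestamp"]) for e in events)
    return [{"window_start": start, "window_end": end, "count": n}
            for (start, end), n in sorted(counts.items())]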

Session Window Aggregation

Session windows aggregate events belonging to a user session. A session could expire after a period of inactivity (like 30 minutes) or at a logical end point like logout. Session windows are typically used to aggregate events from a single user into a single summarized event covering the session duration.

Key-based Aggregation

Events can be aggregated based on key attributes like user ID, product ID, etc. All events with the same key are combined to create aggregated events per key. This is useful for aggregating all events related to a user or product.
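
A small sketch of key-based aggregation in Python, assuming events are dictionaries with hypothetical product_id and amount fields chosen for illustration:

from collections import defaultdict

def aggregate_by_key(events, key_field="product_id"):
    # Combine all events sharing the same key into one aggregated record.
    aggregates = defaultdict(lambda: {"event_count": 0, "total_amount": 0.0})
    for event in events:
        summary = aggregates[event[key_field]]
        summary["event_count"] += 1
        summary["total_amount"] += event.get("amount", 0.0)
    return [{"key": key, **summary} for key, summary in aggregates.items()]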

Micro-batch Aggregation

Events are aggregated in small, fixed time batches like 5 seconds. Micro-batch aggregation provides frequent aggregations and low latency compared to longer time windows. Tools like Apache Spark Streaming use micro-batch aggregation.
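
As a sketch only, a micro-batch aggregation in PySpark Structured Streaming might look like the following, assuming a hypothetical directory of JSON event files with user_id and event_time columns:

from pyspark.sql import SparkSession
from pyspark.sql.functions import window, col, count

spark = SparkSession.builder.appName("micro_batch_aggregation").getOrCreate()

# Hypothetical streaming source of JSON event files.
events = (
    spark.readStream
    .format("json")
    .schema("user_id STRING, event_time TIMESTAMP")
    .load("/data/events/")
)

# Count events per user over 5-second windows, emitted with each micro-batch.
counts = (
    events
    .groupBy(window(col("event_time"), "5 seconds"), col("user_id"))
    .agg(count("*").alias("event_count"))
)

query = (
    counts.writeStream
    .outputMode("update")
    .format("console")
    .trigger(processingTime="5 seconds")
    .start()
)
query.awaitTermination()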

Pre-Aggregation

Pre-aggregation summarizes raw data into aggregated data structures before events are analyzed. Columnar databases like Apache Druid use pre-aggregated data such as rollup tables to provide fast aggregation queries. Pre-aggregation trades increased storage for faster queries.
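
A simplified sketch of the idea in Python: raw events are rolled up into hourly counts keyed by (hour, page), and later queries scan the much smaller rollup instead of the raw events. Field names are assumed for illustration:

from collections import defaultdict
from datetime import datetime

def build_hourly_rollup(raw_events):
    # Pre-aggregate raw events into counts keyed by (hour, page_url).
    rollup = defaultdict(int)
    for event in raw_events:
        ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
        hour = ts.replace(minute=0, second=0, microsecond=0)
        rollup[(hour, event["page_url"])] += 1
    return rollup

def page_views_in_hour(rollup, hour):
    # A query answered from the small rollup instead of scanning raw events.
    return sum(count for (h, _page), count in rollup.items() if h == hour)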

The choice of aggregation technique depends on the use case – session windows for user behavior, time windows for period comparisons or micro-batches for frequent aggregations. Multiple techniques can be combined like pre-aggregating data into windows.

Aggregated Event Structure

Aggregated events typically contain:

  • Key attributes – ID or grouping key like user ID, product ID used to aggregate events.
  • Time range – Start and end time for the event aggregation window.
  • Summaries – Aggregated metrics like count, sum, avg, max/min computed over window.
  • Original event data – Select original event attributes included in the aggregate.

For example, a website visit aggregated event could contain:

  • Visitor ID
  • Session start and end timestamp
  • Page views count
  • Average session duration
  • Landing page URL

The aggregated event schema depends on the analysis needs. Including original event attributes provides details, while excluding them decreases data size.

Example Aggregated Event Format

Here is an example format for an aggregated event in JSON:

{
  "event_type": "website_session_summary",
  "key": "12345",                       // visitor ID
  "start_time": "2023-01-01T08:00:00Z",
  "end_time": "2023-01-01T08:30:00Z",
  "page_views": 10,
  "avg_duration": 200,
  "exit_page": "/exitpage.html",
  "user_agent": "Chrome",               // original event attribute
  "client_ip": "1.2.3.4"                // original event attribute
}

The event_type, key and time attributes identify the aggregation. The summary attributes provide metrics like page_views computed over the time window. Original event attributes provide context like user agent details.

Challenges with Aggregated Events

Some key challenges with aggregated events include:

Data Loss

Aggregating data leads to loss of granular event details. This can limit troubleshooting and auditing capabilities. Strategies like selective aggregation and including original event attributes help reduce data loss.

Complex Processing

Aggregating event streams in real-time requires distributed stream processing. This adds complexity for partitioning, time synchronization, ordering and fault tolerance.

Replayability

It can be difficult to recreate aggregated events due to loss of raw events or ordering issues. Strategies like deterministic aggregation algorithms help. But full replayability usually requires persisting raw events.

Duplicate Data

Aggregating data from multiple systems can result in duplicate aggregated records. Deduplication mechanisms are needed to avoid double counting.
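
One simple deduplication approach is to treat the aggregation key plus the time window as the record's identity and keep only the first copy seen. A sketch, assuming aggregated records carry key, start_time and end_time fields:

def deduplicate(aggregates):
    # Drop duplicate aggregated records arriving from multiple systems;
    # the first record seen for a given (key, window) wins.
    seen = set()
    unique = []
    for agg in aggregates:
        identity = (agg["key"], agg["start_time"], agg["end_time"])
        if identity not in seen:
            seen.add(identity)
            unique.append(agg)
    return unique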

Data Drift

Aggregated data can start to drift and become inaccurate over time as raw events are continually aggregated. Periodic rebuilding or reprocessing of aggregate tables/windows is required.

Testing

Testing aggregated event pipelines requires generating realistic test data across time windows. Production data clones or synthetic data generation is needed.

Common Aggregation Scenarios

Some common scenarios where aggregating events is useful:

User Session Analysis

Aggregate web/app events like page views and clicks into user sessions to analyze visitor behavior over time.

Sales Funnel Analysis

Aggregate customer engagement events like email opens and landing page views into conversion funnels to understand drop-off.

Application Performance Monitoring

Aggregate fine-grained transaction events into hourly/daily metrics for application health monitoring.

Fraud Detection

Aggregate account activities over time and across accounts to identify anomalous patterns indicating fraud.

Network Monitoring

Aggregate network events into connectivity or congestion summaries for identifying issues.

Server Monitoring

Aggregate server metrics like CPU, memory and disk into hourly rollups for infrastructure monitoring.

Tools for Aggregating Events

Some popular tools and platforms for processing and aggregating event data:

  • Apache Kafka Streams – Stream processing library for stateful aggregations on Kafka.
  • Apache Flink – Stream processing engine with built-in windowing and time handling.
  • Apache Spark – Micro-batch stream processing for aggregating in small intervals.
  • Apache Druid – Analytical database designed for fast aggregates on time series.
  • ClickHouse – Column-oriented database well suited to aggregating large data volumes.
  • Datadog – Monitoring platform that aggregates metrics and events.
  • Elastic Stack – Aggregates logs and metrics with components like Logstash and Beats.

The choice depends on the data architecture, infrastructure and aggregation needs – for example micro-batch vs. streaming, or time-series-specific storage.

Best Practices

Some best practices for working with aggregated events:

  • Clearly define aggregation requirements – what events to aggregate, over what time windows, how to handle late events.
  • Minimize data loss – Be selective in aggregation, include original event attributes where possible.
  • Handle out-of-order events – Events may arrive out of sequence; use event timestamps for proper ordering (see the sketch after this list).
  • Uniquely identify aggregates – Have a primary key like user ID to uniquely identify an aggregate record.
  • Recompute aggregates – Periodically rebuild aggregates to handle data drift over time.
  • Monitor aggregates – Check for duplicates, gaps in data, unexpected counts, etc.
  • Document aggregation logic – Document how events are mapped to aggregates for troubleshooting.
  • Test thoroughly – Have automated tests across aggregation scenarios and time windows.
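
For the out-of-order point above, a minimal sketch of the idea: buffer incoming events, order them by event timestamp rather than arrival time, and only close a window once it is older than an allowed-lateness threshold (a simple watermark). Field names and thresholds are illustrative:

from datetime import datetime, timedelta

ALLOWED_LATENESS = timedelta(minutes=5)
WINDOW_SIZE = timedelta(minutes=1)

def parse_ts(ts):
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def close_ready_windows(buffer):
    # Aggregate per-minute counts, but only for windows older than the
    # allowed-lateness threshold, so out-of-order events still land in the
    # correct window. Returns (summaries, events still being buffered).
    if not buffer:
        return [], buffer

    # Order by event time, not arrival time.
    events = sorted(buffer, key=lambda e: parse_ts(e["timestamp"]))
    watermark = parse_ts(events[-1]["timestamp"]) - ALLOWED_LATENESS

    counts, still_open = {}, []
    for event in events:
        window_start = parse_ts(event["timestamp"]).replace(second=0, microsecond=0)
        if window_start + WINDOW_SIZE <= watermark:
            counts[window_start] = counts.get(window_start, 0) + 1
        else:
            still_open.append(event)  # too recent: a late event could still arrive

    summaries = [{"window_start": ws.isoformat(), "count": n}
                 for ws, n in sorted(counts.items())]
    return summaries, still_open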

Proper design is important as rebuilding aggregates on large data volumes can be expensive.

Key Takeaways

Some key takeaways on aggregated events:

  • Aggregated events provide a summarized view of related events over a time window.
  • Benefits include faster analysis, clearer insights and reduced data volume.
  • Common aggregation techniques include time windows, micro-batches and sessionization.
  • Aggregated event structure includes a key, time window, summary metrics and original attributes.
  • Challenges include data loss, duplicate records, replayability issues.
  • Aggregation useful in scenarios like monitoring, fraud detection, analytics.
  • Tools like Kafka, Flink, Spark and Druid can aggregate event data.

By consolidating large volumes of granular data into meaningful aggregated views, aggregated events enable analytics and insights that are hard to achieve on raw data alone. With a carefully designed aggregation strategy, they provide the right balance of summary and detail for monitoring systems and understanding behavior.

Conclusion

Aggregated events are a critical construct for analyzing and monitoring event data at scale. They complement raw events by providing summarized views. The keys are to aggregate along relevant dimensions like sessions or time windows, minimize data loss and process events consistently. With the right aggregation approach and architecture, organizations can tap into the value of massive volumes of event data that would otherwise be buried in low level detail.