Skip to main content
Version: Next

Window

Window is one of the most important tools in stream processing. It splits unbounded stream data into bounded ports. Set operations such as aggregation, sort, and join are only applied to bounded data using windows.

We will explain about windows and related operations in the future but currently you can refer to the following materials to get understand them:

Window in SpringQL

Window types

Currently, SpringQL supports the follwing window types:

  • Event-time-based:
    • fixed window (also called tumbling window)
    • sliding window

Count-based windows, global windows, and session windows are not supported.

Window pane

You can create 0 or 1 window in a pump. WINDOW clause creates a window.

10-sec fixed window (event-time-based)
CREATE PUMP p AS
INSERT INTO s2 ...
SELECT ... FROM s1 ...
FIXED WINDOW DURATION_SECS(10);

Window and panes

A window has panes inside. A pane works as a bound of rows from unbounded rows.

You do not explicitly write about panes in SpringQL but SpringQL users may want to know about panes to understand windows' behavior.

When a pane start at for time-based windows?

Users cannot specify the start time of panes. For example, a pane in a 10-sec window can start from 12:00:00 but cannot from 12:00:05.

For 1-sec, 2-sec, 3-sec, 5-sec, ... and 60-sec (factors of 60) time-based windows, you may intuitively list start times. For 7-sec window, when are the start times? In the current implementation, 1970-01-01 00:00:00 (origin of unixtime) is used as an origin point. So the start times of 7-sec window panes are: 1970-01-01 00:00:00, 1970-01-01 00:00:07, ...

Triggers

SpringQL currently provides very limited functionality for triggers.

Downstreams of a window pump observe only the result of the window operation. They cannot see any intermediate values before panes (splits of a window) get closed.

Handling late data in SpringQL

WINDOW clauses in SpringQL take "allowed latency" argument.

1 sec allowed latency
CREATE PUMP p AS
INSERT INTO s2 ...
SELECT ... FROM s1 ...
FIXED WINDOW DURATION_SECS(10), DURATION_SECS(1);

In the above statement, you create 10-sec fixed window allowing 1-sec late data. Late data are allowed to get in a window pane only if their latencies are less than 1 second.

Late data (1)

^ At first, no pane is created inside the window.

Late data (2)

^ The window gets 12:00:00 row. It updates the watermark and [12:00:00, 12:00:10) pane is created from the new watermark.

Late data (3)

^ 12:00:10 row updates the watermark and creates [12:00:10, 12:00:20) pane. The [12:00:00, 12:00:10) pane is not closed by the new watermark because the window allows 1-sec latency.

Late data (4)

^ 12:00:05 comes after 12:00:10 but it is in-time due to the 1-sec allowed latency. The row does not update the watermark because the timestamp is less than the watermark.

Late data (5)

12:00:11 row updates the watermark and the [12:00:00, 12:00:10) pane is closed by the new watermark because the row exceeds the allowed latency of the pane. The new watermark creates [12:00:11, 12:00:21) pane.