Friday, August 2, 2013

CEP Performance: Processing 100k to Millions of events per second using WSO2 Complex Event Processing (CEP) Server

With WSO2 CEP, you can use SQL-style queries to detect interesting patterns across many data streams. The standalone version of the CEP engine is called Siddhi, and that is what you should use if you need to embed a CEP engine within a Java program. WSO2 CEP, on the other hand, runs CEP queries as a server, and you can send events to it using Thrift, Web Services, REST calls, JMS, and email.

WSO2 CEP can handle a few hundred thousand events per second over the network and a few million events per second within the JVM. We have published those numbers before. In this post, I will try to put them all together and give context to the different numbers.

In the following, each event includes multiple properties, and the queries match those events against the given conditions.

Same JVM Events performance

Setup: We used an Intel(R) Xeon(R) X3440 @ 2.53GHz with 4 cores, 8M cache, and 8GB RAM, running Debian with the 2.6.32-5-amd64 kernel. We generated events from the same JVM.

Case 1: Simple filter (from StockTick[prize >6] return symbol, prize)
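
To make the semantics of Case 1 concrete, here is a minimal Python sketch of what such a filter does to each event. It is not Siddhi code and says nothing about how the engine evaluates the query internally; the dictionary-based event layout and the function name filter_stock_ticks are assumptions made for illustration.

```python
def filter_stock_ticks(events, threshold=6.0):
    """Yield (symbol, prize) for every event whose prize exceeds the threshold."""
    for event in events:
        if event["prize"] > threshold:
            yield event["symbol"], event["prize"]


# Hypothetical sample events; field names follow the query above.
ticks = [
    {"symbol": "IBM", "prize": 7.5},
    {"symbol": "WSO2", "prize": 5.0},
    {"symbol": "ORCL", "prize": 6.0},  # not strictly greater than 6, so dropped
]
matched = list(filter_stock_ticks(ticks))
```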

Case 2: Window (from StockTick[symbol='IBM']#win.time(0.005) return symbol, avg(prize))
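
The idea behind Case 2, a filter followed by an average over a sliding time window, can be sketched in plain Python as follows. This is an illustration under assumptions, not the engine's implementation: the timestamp field, the TimeWindowAverage class, and the choice to express the window length in the same unit as the timestamps are all hypothetical.

```python
from collections import deque


class TimeWindowAverage:
    """Running average over events no older than window_length."""

    def __init__(self, window_length):
        self.window_length = window_length
        self.events = deque()  # (timestamp, prize) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, prize):
        """Add one event, expire old ones, and return the current average."""
        self.events.append((timestamp, prize))
        self.total += prize
        # Drop events that have fallen out of the window.
        while self.events[0][0] <= timestamp - self.window_length:
            _, old_prize = self.events.popleft()
            self.total -= old_prize
        return self.total / len(self.events)


def ibm_window_averages(ticks, window_length):
    """Filter IBM ticks and emit (symbol, windowed average) per event."""
    window = TimeWindowAverage(window_length)
    return [
        (t["symbol"], window.add(t["timestamp"], t["prize"]))
        for t in ticks
        if t["symbol"] == "IBM"
    ]


# Hypothetical sample data with time-ordered timestamps.
sample_ticks = [
    {"symbol": "IBM", "timestamp": 0.0, "prize": 10.0},
    {"symbol": "WSO2", "timestamp": 1.0, "prize": 99.0},
    {"symbol": "IBM", "timestamp": 2.0, "prize": 20.0},
    {"symbol": "IBM", "timestamp": 6.0, "prize": 30.0},
    {"symbol": "IBM", "timestamp": 7.0, "prize": 40.0},
]
averages = ibm_window_averages(sample_ticks, window_length=5.0)
```

Keeping a running total alongside the queue means each event costs a constant amount of work on average, which matters at the event rates discussed here.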

Case 3: Event patterns A->B (A followed by B).

From f=FraudWarningEvent ->
return accountNumber;
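
As a rough illustration of the "A followed by B" idea, the sketch below runs a tiny two-state matcher over a stream. It is a simplified reading of the pattern, not Siddhi's actual matching semantics: the predicates is_a and is_b, the "type" field, and the restart-after-match behaviour are assumptions chosen for the example.

```python
def match_followed_by(events, is_a, is_b):
    """Return (a, b) pairs where an A event is followed later by a B event.

    The matcher waits for an A, then for the next B, emits the pair,
    and starts over. Events of other kinds in between are ignored.
    """
    matches = []
    pending_a = None
    for event in events:
        if pending_a is None:
            if is_a(event):
                pending_a = event
        elif is_b(event):
            matches.append((pending_a, event))
            pending_a = None
    return matches


# Hypothetical stream: the "type" field and its values are illustrative.
stream = [
    {"id": 0, "type": "B"},
    {"id": 1, "type": "A"},
    {"id": 2, "type": "C"},
    {"id": 3, "type": "B"},
    {"id": 4, "type": "A"},
    {"id": 5, "type": "B"},
]
pairs = match_followed_by(
    stream,
    is_a=lambda e: e["type"] == "A",
    is_b=lambda e: e["type"] == "B",
)
matched_ids = [(a["id"], b["id"]) for a, b in pairs]
```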

Performance Over the Network

Setup: To run CEP, we used an Intel® Core™ i7-2630QM CPU @ 2.00GHz with 8 cores and 8GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel. For the three client nodes, we used Intel® Core™ i3-2350M CPUs @ 2.30GHz with 4 cores and 4GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel.

The following results are for a simple filter; we sent events over the network using Thrift.


Performance For a Complex Scenario

Finally, the following is the performance for the DEBS Grand Challenge. The Grand Challenge detects the following scenarios from the events generated from a real football game.

Usecase 1: Running analysis

The first usecase measures each player's running speed and calculates how long he spent in different speed ranges. For example, the results will tell that the player "Martin" was running fast from 27 minutes and 1 second into the game to 27 minutes and 35 seconds into the game.
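
The running analysis can be pictured as classifying each speed sample into a range and merging consecutive samples of the same range into intervals. The Python sketch below shows that idea only; the class names, the speed boundaries, and the (timestamp, speed) sample layout are hypothetical and not taken from the challenge or from our queries.

```python
from bisect import bisect_right

# Hypothetical speed classes and boundaries, for illustration only.
BOUNDS = [5.0, 15.0]
NAMES = ["slow", "medium", "fast"]


def classify(speed):
    """Map a speed value to the name of its range."""
    return NAMES[bisect_right(BOUNDS, speed)]


def speed_intervals(samples):
    """Turn time-ordered (timestamp, speed) samples into (class, start, end)."""
    intervals = []
    current, start = None, None
    for timestamp, speed in samples:
        label = classify(speed)
        if label != current:
            if current is not None:
                intervals.append((current, start, timestamp))
            current, start = label, timestamp
    if current is not None:
        intervals.append((current, start, samples[-1][0]))
    return intervals


def time_per_class(intervals):
    """Sum the duration spent in each speed class."""
    totals = {}
    for label, start, end in intervals:
        totals[label] = totals.get(label, 0) + (end - start)
    return totals


# Timestamps are seconds into the game: 1621 is 27:01 and 1655 is 27:35.
samples = [(1600, 3.0), (1621, 20.0), (1640, 22.0), (1655, 8.0), (1660, 9.0)]
intervals = speed_intervals(samples)
totals = time_per_class(intervals)
```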

Usecase 2 & 4: Ball Possession and Shots on Goal

For the second usecase, we need to calculate the time each player controlled the ball (ball possession). A player controls the ball from the time he hits the ball until someone hits the ball again, the ball goes out of the ground, or the game is stopped. We identify a hit when the ball is within one meter of a player and its acceleration increases by more than 55 m/s².
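
The hit rule and the possession bookkeeping can be sketched in Python as below. This is one reading of the rule, treating a hit as "ball acceleration above 55 m/s² while the nearest player is within one meter"; the event layout, the field names, and the function names are assumptions for illustration, not the queries we ran.

```python
import math

HIT_DISTANCE_M = 1.0      # ball must be within one meter of a player
HIT_ACCELERATION = 55.0   # m/s^2 threshold from the hit rule


def detect_hit(ball, players):
    """Return the name of the player who hit the ball, or None."""
    if ball["acceleration"] <= HIT_ACCELERATION:
        return None
    nearest = min(
        players, key=lambda p: math.dist(p["pos"], ball["pos"]), default=None
    )
    if nearest is not None and math.dist(nearest["pos"], ball["pos"]) <= HIT_DISTANCE_M:
        return nearest["name"]
    return None


def possession_times(events):
    """Sum possession per player from time-ordered hit/out/stop events.

    Any event ends the current possession; only a hit starts a new one.
    """
    totals = {}
    holder, since = None, None
    for event in events:
        if holder is not None:
            totals[holder] = totals.get(holder, 0.0) + (event["time"] - since)
            holder = None
        if event["kind"] == "hit":
            holder, since = event["player"], event["time"]
    return totals


# Hypothetical data for illustration.
players = [
    {"name": "Martin", "pos": (10.5, 10.0)},
    {"name": "Alex", "pos": (20.0, 20.0)},
]
hitter = detect_hit({"pos": (10.0, 10.0), "acceleration": 60.0}, players)

game_events = [
    {"time": 0.0, "kind": "hit", "player": "Martin"},
    {"time": 4.0, "kind": "hit", "player": "Alex"},
    {"time": 10.0, "kind": "out"},
    {"time": 12.0, "kind": "hit", "player": "Martin"},
    {"time": 15.0, "kind": "stop"},
]
possession = possession_times(game_events)
```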

Usecase four is to detect hits and emit events if the ball is heading toward the goal.

Usecase 3: Heatmap of Activity

Usecase three divides the ground into a grid and calculates the time a player spends in each cell. However, this usecase needs updates once every second. We can solve the first part just like the first usecase, but to make sure we get an event once a second, we had to couple it with a timer.
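
A minimal Python sketch of the heatmap idea follows: map each position to a grid cell, accumulate the time spent per cell, and emit a snapshot at every one-second boundary. It derives the ticks from event timestamps rather than from a real timer, and the field dimensions, grid size, and function names are all assumptions made for illustration.

```python
def cell_of(x, y, field_width, field_height, cols, rows):
    """Map a position (0..width, 0..height) to a (row, col) grid cell."""
    col = min(int(x / field_width * cols), cols - 1)
    row = min(int(y / field_height * rows), rows - 1)
    return row, col


def heatmap_snapshots(samples, cell_fn, period=1.0):
    """Return [(tick_time, {cell: seconds})], one snapshot per period.

    samples is a time-ordered list of (timestamp, x, y) for one player.
    The time between two samples is credited to the cell of the earlier one.
    """
    snapshots = []
    if not samples:
        return snapshots
    totals = {}
    prev_t, prev_x, prev_y = samples[0]
    next_tick = prev_t + period
    for t, x, y in samples[1:]:
        cell = cell_fn(prev_x, prev_y)
        # Emit a snapshot at every period boundary crossed before this sample.
        while next_tick <= t:
            totals[cell] = totals.get(cell, 0.0) + (next_tick - prev_t)
            snapshots.append((next_tick, dict(totals)))
            prev_t = next_tick
            next_tick += period
        totals[cell] = totals.get(cell, 0.0) + (t - prev_t)
        prev_t, prev_x, prev_y = t, x, y
    return snapshots


# Hypothetical 100 x 50 field split into two columns and one row.
def two_cell_grid(x, y):
    return cell_of(x, y, field_width=100.0, field_height=50.0, cols=2, rows=1)


positions = [(0.0, 1.0, 1.0), (0.5, 1.0, 1.0), (1.5, 60.0, 1.0), (3.0, 60.0, 1.0)]
snapshots = heatmap_snapshots(positions, two_cell_grid)
```

Splitting the elapsed time at each tick is what keeps the snapshots accurate even when no position event arrives exactly on the second.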

You can find more information from , and you can find the queries from this blog post

Setup: A VM with 4 cores (@2.8 GHz), 4 GB RAM, SSD storage, and 1GB Ethernet. We replayed events from the same JVM.

