Real-time analytics as a service at King
© King.com Ltd 2016 – Commercially confidential
Gyula Fóra
Data Warehouse Engineer
Apache Flink PMC
About King
We make awesome mobile games
463 million monthly active users
30 billion events per day
And a lot of data…
From a streaming perspective…
[Diagram labels: 30 billion events / day; Analytics/Processing applications; Terabytes of state; DB]
This is awesome, but…
End-users are often not Java/Scala developers
Writing streaming applications is pretty hard
Large state and windowing don’t help either
Seems to work in my IDE, what next?
We need a “turnkey” solution
The RBea platform
Powered by Apache Flink
Scripting on the live streams
Window aggregates
Stateful computations
Scalable + fault tolerant
RBea architecture
[Architecture diagram labels: Events; Output; REST API; RBEA web frontend; Libraries; Data Scientists]
RBea backend implementation
One stateful Flink job per game
Stream events and scripts
Events are partitioned by user id
Scripts are broadcast
Output/Aggregation happens downstream
[Diagram: scripts S1 to S5 arrive on an "Add/Remove scripts" input next to the event stream; a CoFlatMap loops over the deployed scripts and processes each event; output is based on API calls]
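The two-input operator described above can be sketched in plain Java. This is only an illustration of the idea, not King's code and not Flink's CoFlatMapFunction interface: the class ScriptRunner, its method names, and the use of String events are all assumptions made for the example. It shows one operator instance; in the real job events are partitioned by user id and every instance receives every script.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Plain-Java stand-in for the two-input operator (hypothetical, not a Flink class). */
class ScriptRunner {
    // Deployed scripts by id; a script receives an event and an output collector.
    private final Map<String, BiConsumer<String, List<String>>> scripts = new LinkedHashMap<>();

    /** Control input: deploy (or replace) a script. */
    void addScript(String id, BiConsumer<String, List<String>> script) {
        scripts.put(id, script);
    }

    /** Control input: remove a deployed script. */
    void removeScript(String id) {
        scripts.remove(id);
    }

    /** Data input: loop over the deployed scripts and let each one process the event. */
    List<String> onEvent(String event) {
        List<String> out = new ArrayList<>();
        for (BiConsumer<String, List<String>> script : scripts.values()) {
            script.accept(event, out);
        }
        return out;
    }

    public static void main(String[] args) {
        ScriptRunner runner = new ScriptRunner();
        runner.addScript("S1", (e, out) -> out.add("S1 saw " + e));
        runner.addScript("S2", (e, out) -> {
            if (e.startsWith("purchase")) {
                out.add("S2 counted " + e);
            }
        });
        System.out.println(runner.onEvent("purchase:42"));
        runner.removeScript("S1");
        System.out.println(runner.onEvent("game_start:7"));
    }
}
```

Scripts only append to the collector; writing to external systems is left to whatever consumes the returned list, which matches the "Output/Aggregation happens downstream" point.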
Dissecting the DSL
@ProcessEvent(semanticClass=SCPurchase.class)
def process(SCPurchase purchase,
            Output out,
            Aggregators agg) {
    long amount = purchase.getAmount()
    String curr = purchase.getCurrency()
    // Write one tab-separated record per purchase to the "purchases" topic
    out.writeToKafka("purchases", curr + "\t" + amount)
    // Count purchases in 10-minute windows
    Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10)
    numPurchases.increment()
}
Processing methods by annotation
Event filter conditions
Flexible argument list
Code-generate Java classes
=> void processEvent(Event e, Context ctx);
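To make the annotation, filter, and flexible-argument ideas concrete, here is a small sketch that adapts an annotated method to a single processEvent entry point. Note the swap: the slides say RBea code-generates Java classes, whereas this sketch uses runtime reflection because it fits in a few lines. Every type below (ProcessEvent, Output, Context, Purchase, GameStart, MyScript, AnnotationDispatcher) is a hypothetical stand-in defined here, not the platform's real class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the platform types named on the slide.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ProcessEvent {
    Class<?> semanticClass();
}

class Output {
    final List<String> lines = new ArrayList<>();
    void write(String line) { lines.add(line); }
}

class Context {
    final Output output = new Output();
}

class Purchase {
    final long amount;
    Purchase(long amount) { this.amount = amount; }
}

class GameStart { }

class MyScript {
    @ProcessEvent(semanticClass = Purchase.class)
    public void process(Purchase purchase, Output out) {
        out.write("purchase " + purchase.amount);
    }
}

class AnnotationDispatcher {
    /** Uniform entry point: filter by semantic class, then bind arguments by type. */
    static void processEvent(Object script, Object event, Context ctx) {
        for (Method m : script.getClass().getDeclaredMethods()) {
            ProcessEvent ann = m.getAnnotation(ProcessEvent.class);
            if (ann == null || !ann.semanticClass().isInstance(event)) {
                continue; // not a processing method, or the event filter does not match
            }
            Class<?>[] types = m.getParameterTypes();
            Object[] args = new Object[types.length];
            for (int i = 0; i < types.length; i++) {
                // Flexible argument list: each parameter is supplied according to its type.
                if (types[i].isInstance(event)) {
                    args[i] = event;
                } else if (types[i] == Output.class) {
                    args[i] = ctx.output;
                } else {
                    throw new IllegalArgumentException("Unsupported parameter: " + types[i]);
                }
            }
            try {
                m.setAccessible(true);
                m.invoke(script, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        processEvent(new MyScript(), new Purchase(5), ctx);
        processEvent(new MyScript(), new GameStart(), ctx); // filtered out by semanticClass
        System.out.println(ctx.output.lines);
    }
}
```

Generating a class per script, as the slide describes, avoids paying the reflection cost on every event; the dispatch logic it has to encode is the same.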
Output calls create Output events
Output(KAFKA, “purchases”, “…” )
These events are filtered downstream and sent to a Kafka sink
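A rough sketch of that pattern: the script-facing call does no I/O and only records an output event tagged with its destination, and a later step selects the events for one sink. The names OutputRouting, SinkType, OutputEvent, and filterFor are invented for this example; only writeToKafka and the KAFKA and MYSQL tags come from the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class OutputRouting {
    enum SinkType { KAFKA, MYSQL }

    /** Hypothetical record mirroring the Output(KAFKA, topic, payload) shape on the slide. */
    static final class OutputEvent {
        final SinkType sink;
        final String target;
        final String payload;

        OutputEvent(SinkType sink, String target, String payload) {
            this.sink = sink;
            this.target = target;
            this.payload = payload;
        }
    }

    /** Script-facing API: a call only records what should be written. */
    static final class Output {
        final List<OutputEvent> emitted = new ArrayList<>();

        void writeToKafka(String topic, String message) {
            emitted.add(new OutputEvent(SinkType.KAFKA, topic, message));
        }
    }

    /** Downstream step: keep only the events meant for one sink type. */
    static List<OutputEvent> filterFor(SinkType sink, List<OutputEvent> events) {
        return events.stream()
                .filter(e -> e.sink == sink)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Output out = new Output();
        out.writeToKafka("purchases", "USD\t5");
        for (OutputEvent e : filterFor(SinkType.KAFKA, out.emitted)) {
            System.out.println(e.target + " <- " + e.payload);
        }
    }
}
```

Keeping scripts free of side effects means the sinks can be configured and scaled in one place, independent of how many scripts are deployed.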
Aggregator calls create Aggregate events
Aggr(MYSQL, 600000, “PurchaseCount”, 1)
Flink window operators do the aggregation
Aggregators
long size = aggregate.getWindowSize();
long start = timestamp - (timestamp % size);
long end = start + size;
TimeWindow tw = new TimeWindow(start, end);
Event time windows
One window size per aggregator
[Diagram: Script1 and Script2 feed the aggregators NumGames, Revenue and MyAggregator across Window 1 and Window 2; values shown, in slide order: W1: 8999, W2: 9001; W1: 200, W2: 300; W1: 10, W2: 5]
Dynamic window assignment
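The window arithmetic above can be tried out on its own. The sketch below reuses the same start = timestamp - (timestamp % size) rule and sums values per aggregator and window; the DynamicWindows class, its string keys, and the sample numbers are made up for illustration, and the rule as written assumes non-negative timestamps in milliseconds.

```java
import java.util.Map;
import java.util.TreeMap;

class DynamicWindows {
    /** Start of the tumbling window containing the timestamp (same arithmetic as the slide). */
    static long windowStart(long timestamp, long size) {
        return timestamp - (timestamp % size);
    }

    // Running sums keyed by "aggregatorName@windowStart".
    private final Map<String, Long> sums = new TreeMap<>();

    /** Each record carries its own window size, so windows can differ per aggregator. */
    void add(String name, long windowSizeMs, long timestamp, long value) {
        String key = name + "@" + windowStart(timestamp, windowSizeMs);
        sums.merge(key, value, Long::sum);
    }

    Map<String, Long> sums() {
        return sums;
    }

    public static void main(String[] args) {
        long tenMinutes = 600_000L;
        long oneHour = 3_600_000L;
        DynamicWindows w = new DynamicWindows();
        w.add("PurchaseCount", tenMinutes, 1_000L, 1);
        w.add("PurchaseCount", tenMinutes, 599_999L, 1); // still in the first window
        w.add("PurchaseCount", tenMinutes, 600_000L, 1); // first record of the second window
        w.add("Revenue", oneHour, 600_000L, 250);
        // prints {PurchaseCount@0=2, PurchaseCount@600000=1, Revenue@0=250}
        System.out.println(w.sums());
    }
}
```

Because the window is derived from the record itself, a single operator can serve aggregators with different window sizes without being reconfigured when a new script is deployed.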
RBea physical plan
How do we run Flink
Standalone => YARN
Few heavy streaming jobs => more and more
RocksDB state backend
Custom deployment/monitoring tools
Monitoring our jobs
King Streaming SDK (sneak preview)
Goal: Bridge the gap between RBea and Flink
Build data pipelines from RBea processors
Strict event format, limited set of operations
Easy “stream joins” and pattern matching
Thin wrapper around Flink
Last<GameStart> lastGS = Last.semanticClass(GameStart.class);

readFromKafka("event.myevents.log", "gyula")
    .keyByCoreUserID()
    .join(lastGS)
    .process((event, context) -> {
        context.getJoined(lastGS).ifPresent(
            lastGameStart -> {
                context.getAggregators()
                    .getCounter("Purchases", MINUTES_10)
                    .setDimensions(lastGameStart.getLevel())
                    .increment();
            });
    });
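One way to read the "join with the last GameStart" step is as per-user state that remembers the most recent GameStart and is looked up when a later event arrives. The sketch below shows that reading in plain Java; it is not the SDK's implementation. LastJoin and its methods are hypothetical, the state lives in ordinary maps rather than Flink keyed state, and the 10-minute windowing of the counter is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class LastJoin {
    /** Hypothetical event type carrying only the field the example needs. */
    static final class GameStart {
        final int level;
        GameStart(int level) { this.level = level; }
    }

    // Per-user state: the most recent GameStart seen for each user id.
    private final Map<Long, GameStart> lastGameStart = new HashMap<>();
    // Purchase counts with the level of the joined GameStart as the dimension.
    private final Map<Integer, Long> purchasesByLevel = new HashMap<>();

    void onGameStart(long userId, GameStart gameStart) {
        lastGameStart.put(userId, gameStart);
    }

    /** Counts the purchase only if this user already has a GameStart to join with. */
    void onPurchase(long userId) {
        Optional.ofNullable(lastGameStart.get(userId))
                .ifPresent(gs -> purchasesByLevel.merge(gs.level, 1L, Long::sum));
    }

    long purchasesAtLevel(int level) {
        return purchasesByLevel.getOrDefault(level, 0L);
    }

    public static void main(String[] args) {
        LastJoin join = new LastJoin();
        join.onPurchase(1L);                      // no GameStart yet, so nothing is counted
        join.onGameStart(1L, new GameStart(12));
        join.onPurchase(1L);
        join.onGameStart(1L, new GameStart(13));  // replaces the remembered GameStart
        join.onPurchase(1L);
        System.out.println(join.purchasesAtLevel(12) + " " + join.purchasesAtLevel(13));
    }
}
```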
Closing
RBea makes streaming accessible to every data scientist at King
We leverage Flink’s stateful and windowed processing capabilities
People love it because it’s simple and powerful
Thank you!
