Going Reactive: Scalable, Highly Concurrent & Fault-Tolerant Systems

The skills of building Scalable, Highly Concurrent, Event-driven and Resilient Systems are becoming increasingly important in our new world of Cloud Computing, multi-core processors, Big Data and Real-Time Web. Unfortunately, many people are still doing it wrong: using the wrong tools, techniques, habits and ideas. In this talk we will look at what it means to 'Go Reactive' and discuss some of the most common (and some not so common but superior) practices: what works, what doesn't work, and why.

  • Going Reactive: Event-Driven, Scalable & Resilient Systems • Jonas Bonér, CTO Typesafe • Twitter: @jboner
  • I will never use distributed objects again • Lessons Learned through... Agony and Pain • lots of Pain
  • New tools for a new era • The demands and expectations for applications have changed dramatically in recent years • We need to build systems that: • react to events - Event-Driven • react to load - Scalable • react to failure - Resilient • react through a rich and engaging UI - Interactive • Reactive Applications
  • The four traits of Reactive: Interactive, Scalable, Resilient, Event-Driven
  • So how can we get there? • It’s All Trade-offs • Go Event-Driven • Go Resilient • Go Scalable • Go Interactive
  • Performance vs Scalability
  • Latency vs Throughput
  • Availability vs Consistency
  • Go Event-Driven
  • Shared mutable state • Together with threads... • ...leads to code that is totally INDETERMINISTIC • ...and the root of all EVIL • Please, avoid it at all cost • Use IMMUTABLE state, stupid
  • The problem with locks • Locks do not compose • Locks break encapsulation • Taking too few locks • Taking too many locks • Taking the wrong locks • Taking locks in the wrong order • Error recovery is hard
  • 1. Never block • ...unless you really have to • Blocking kills scalability (& performance) • Never sit on resources you don’t use • Use non-blocking IO • Use lock-free concurrency
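As one possible reading of "use lock-free concurrency", here is a minimal sketch of a compare-and-set retry loop over an immutable snapshot. The `Stats` type and the `LockFreeStats` object are hypothetical names chosen for illustration; only `AtomicReference` from the JDK is assumed.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

// Hypothetical example type: an immutable snapshot of two counters.
final case class Stats(requests: Long, errors: Long)

object LockFreeStats {
  private val state = new AtomicReference[Stats](Stats(0L, 0L))

  // Read the current snapshot, build a new one, and publish it only if
  // no other thread replaced the reference in the meantime; otherwise retry.
  @tailrec
  def update(f: Stats => Stats): Stats = {
    val current = state.get()
    val next = f(current)
    if (state.compareAndSet(current, next)) next else update(f)
  }

  // Reads never wait: they just return the latest published snapshot.
  def snapshot: Stats = state.get()
}
```

A caller would write something like `LockFreeStats.update(s => s.copy(requests = s.requests + 1))`. No thread ever holds a lock, so a slow or descheduled thread cannot stall the others; the price is that the update function may run more than once and must therefore be free of side effects.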
  • 2. Go Async • Design for reactive event-driven systems • Use asynchronous event/message passing • Think in workflow, how the events flow in the system • Gives you 1. lower latency 2. better throughput 3. a more loosely coupled architecture, easier to extend, evolve & maintain
  • Needs to be Event-Driven all the way down
  • Traditional vs Non-blocking • Traditional: def getTweets = Action { Ok(WS.get("http://twitter.com/")) } (Client -> blocking -> Server -> blocking -> Service) • Non-blocking: def getTweets = Action { Async { Ok(WS.get("http://twitter.com/")) }} (Client -> non-blocking -> Server -> non-blocking -> Service)
  • You deserve better (and more fun) tools • Actors • Agents • Futures • FRP/RX
  • Actors • Share NOTHING • Isolated lightweight event-based processes • Each actor has a mailbox (message queue) • Communicates through asynchronous & non-blocking message passing • Location transparent (distributable) • Examples: Akka & Erlang
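The actor traits listed above can be sketched without any framework. This is not Akka's API: `CounterActor`, `Add` and `Report` are hypothetical names, and a single-threaded JDK executor stands in for the mailbox. It only illustrates private state, a message queue, and asynchronous sends.

```scala
import java.util.concurrent.Executors

// Messages are immutable values; nothing else is shared with the actor.
sealed trait CounterMsg
final case class Add(n: Int) extends CounterMsg
final case class Report(reply: Int => Unit) extends CounterMsg

final class CounterActor {
  // The executor's work queue plays the role of the mailbox, and its single
  // thread guarantees that messages are processed one at a time, in order.
  private val mailbox = Executors.newSingleThreadExecutor()

  // Touched only by the mailbox thread, so it needs no lock.
  private var total = 0

  // Asynchronous send: enqueue the message and return immediately.
  def !(msg: CounterMsg): Unit = mailbox.execute(() => receive(msg))

  private def receive(msg: CounterMsg): Unit = msg match {
    case Add(n)        => total += n
    case Report(reply) => reply(total)
  }

  def stop(): Unit = mailbox.shutdown()
}
```

Usage would look like `actor ! Add(5)` followed by `actor ! Report(println)`. A real actor runtime multiplexes many actors over few threads and adds location transparency; one thread per actor, as here, is only acceptable for a demonstration.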
  • Agents • Reactive memory cells • Send an update function to the Agent, which 1. adds it to an (ordered) queue, to be 2. applied to the Agent async & non-blocking • Reads are “free”, just dereference the Ref • Composes nicely • Examples: Clojure & Akka
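A minimal sketch of the agent idea, assuming nothing beyond the Scala and Java standard libraries. The `Agent` class below is hypothetical and is not the Clojure or Akka implementation: update functions are queued on a single thread and applied in order, while reads simply dereference the current value.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.{ExecutionContext, Future}

final class Agent[A](initial: A) {
  private val ref = new AtomicReference[A](initial)

  // One thread means one ordered queue of pending update functions.
  private val queue =
    ExecutionContext.fromExecutorService(Executors.newSingleThreadExecutor())

  // Enqueue an update and return immediately; the Future completes with
  // the value produced once this particular update has been applied.
  def send(f: A => A): Future[A] = Future {
    val next = f(ref.get())
    ref.set(next)
    next
  }(queue)

  // Reads never queue and never block.
  def get: A = ref.get()

  def close(): Unit = queue.shutdown()
}
```

For example, `agent.send(_ + 1)` returns at once, and `agent.get` may still show the old value until the queued update has run.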
  • Futures • Allows you to spawn concurrent computations and work with the not yet computed results • Write-once, Read-many • Freely sharable • Allows non-blocking composition • Monadic (composes in for-comprehensions) • Built-in model for managing failure
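To make the composition and failure points concrete, here is a small sketch using `scala.concurrent.Future` from the standard library. `fetchUser` and `fetchScore` are hypothetical stand-ins for remote calls, and the failure rule inside `fetchScore` is invented purely to have something to recover from.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureSketch {
  // Hypothetical stand-ins for remote calls.
  def fetchUser(id: Int): Future[String] = Future { s"user-$id" }

  def fetchScore(user: String): Future[Int] = Future {
    if (user.endsWith("-0")) throw new IllegalStateException("no score")
    else user.length
  }

  // Composition in a for-comprehension: no thread waits for a result.
  // A failure in either step skips the yield and is handled by recover.
  def report(id: Int): Future[String] = {
    val result = for {
      user  <- fetchUser(id)
      score <- fetchScore(user)
    } yield s"$user scored $score"
    result.recover { case e: IllegalStateException => s"unavailable: ${e.getMessage}" }
  }
}
```

Blocking with `Await.result(FutureSketch.report(7), 5.seconds)` is reasonable in a test or at the very edge of a program; inside a request handler the Future would be returned as-is.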
  • Functional Reactive Programming (FRP) • Extend Futures with the concept of a stream • Functional variation of the observer pattern • A signal attached to a stream of events • The signal is reevaluated for each event • Model events on a linear timeline - deterministic • Composes nicely • Examples: Rx, Reactive.js, RxJava, Scala.Rx, Knockout.js
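The stream and signal bullets can be illustrated with a deliberately tiny, synchronous and single-threaded sketch. `EventStream` and `Signal` are hypothetical classes written for this example; they are not the API of Rx, Scala.Rx or any other library named on the slide.

```scala
// A signal holds a value that is re-evaluated for each incoming event.
final class Signal[B](var value: B)

// A stream of events: a functional variation of the observer pattern.
final class EventStream[A] {
  private var subscribers = Vector.empty[A => Unit]

  def subscribe(f: A => Unit): Unit = subscribers :+= f

  // Push one event to every subscriber, in subscription order.
  def emit(event: A): Unit = subscribers.foreach(_(event))

  def map[B](f: A => B): EventStream[B] = {
    val out = new EventStream[B]
    subscribe(a => out.emit(f(a)))
    out
  }

  def filter(p: A => Boolean): EventStream[A] = {
    val out = new EventStream[A]
    subscribe(a => if (p(a)) out.emit(a))
    out
  }

  // Attach a signal to this stream; it is updated once per event.
  def fold[B](zero: B)(f: (B, A) => B): Signal[B] = {
    val signal = new Signal(zero)
    subscribe(a => signal.value = f(signal.value, a))
    signal
  }
}
```

With `val total = clicks.filter(_ % 2 == 0).map(_ * 10).fold(0)(_ + _)`, every call to `clicks.emit` flows through the whole pipeline in order, which is what makes the result deterministic.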
  • Work with layers in complexity 1. Start with a Deterministic, Declarative & Immutable core • Logic or Functional Programming • Futures, FRP or Dataflow 2. Add Indeterminism selectively - only where needed • Actor or Agent-based Programming 3. Add Shared Mutability selectively - only where needed • Protected by Transactions (STM) 4. Finally - only if really needed • Add Monitors (Locks) and explicit Threads
  • Go Resilient
  • Failure Recovery in Java/C/C# etc. • You are given a SINGLE thread of control • If this thread blows up you are screwed • So you need to do all explicit error handling WITHIN this single thread • To make things worse - errors do not propagate between threads so there is NO WAY OF EVEN FINDING OUT that something has failed • This leads to DEFENSIVE programming with: • Error handling TANGLED with business logic • SCATTERED all over the code base • We can do better!!!
  • The Right Way • Isolate the failure • Compartmentalize • Manage failure locally • Avoid cascading failures • Use Bulkheads
  • ...together with supervision 1. Use Isolated lightweight processes (compartments) 2. Supervise these processes 1. Each process has a supervising parent process 2. Errors are reified and sent as (async) events to the supervisor 3. Supervisor manages the failure - can kill, restart, suspend/resume • Same semantics local as remote • Full decoupling between business logic & error handling • Built into the Actor model
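The separation between business logic and error handling might look roughly like the sketch below. It is a simplification under stated assumptions: everything runs synchronously on the caller's thread, whereas the slide describes errors sent as asynchronous events, and `Worker`, `Supervisor`, `Restart` and `Stop` are hypothetical names rather than Akka's supervision API.

```scala
import scala.util.{Failure, Success, Try}

// What the supervisor may decide to do about a failure.
sealed trait Directive
case object Restart extends Directive
case object Stop extends Directive

// The worker contains business logic only and simply throws on failure.
final class Worker {
  def handle(job: Int): Int = {
    if (job < 0) throw new IllegalArgumentException(s"bad job: $job")
    job * 2
  }
}

// The supervisor owns the worker and all failure handling.
final class Supervisor(decide: Throwable => Directive) {
  private var worker: Option[Worker] = Some(new Worker)
  var restarts = 0

  // The failure is reified as a value (Failure) and handled here, so the
  // caller and the worker stay free of try/catch blocks.
  def submit(job: Int): Option[Int] = worker.flatMap { w =>
    Try(w.handle(job)) match {
      case Success(result) => Some(result)
      case Failure(error) =>
        decide(error) match {
          case Restart =>
            worker = Some(new Worker) // fresh instance, old state discarded
            restarts += 1
          case Stop =>
            worker = None
        }
        None
    }
  }
}
```

The policy is passed in as a function, for example `new Supervisor({ case _: IllegalArgumentException => Restart; case _ => Stop })`, so changing how failures are handled never touches `Worker`.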
  • Go Scalable
  • Performance vs Scalability
  • How do I know if I have a performance problem? • If your system is slow for a single user
  • How do I know if I have a scalability problem? • If your system is fast for a single user but slow under heavy load
  • Fallacy 1: Transparent Distributed Computing • Distributed Shared Mutable State • N EVIL (where N is number of nodes) • Distributed Objects • “Sucks like an inverted hurricane” - Martin Fowler • Distributed Transactions • Good reading: • A Note On Distributed Computing - Waldo et al. • Six Misconceptions about Reliable Distributed Computing - Werner Vogels
  • Fallacy 2: RPC • Emulating synchronous method dispatch across the network is a BAD THING • Ignores: • Latency • Partial failures • General scalability and distributed computing concerns • Good reading: • Convenience over Correctness - Steve Vinoski
  • Instead • Embrace the Network • Use Asynchronous Message Passing • and be done with it
  • Guaranteed Delivery • Delivery Semantics: • No guarantees • At most once • At least once • Once and only once
  • It’s all lies. • The network is inherently unreliable and there is no such thing as 100% guaranteed delivery
  • Guaranteed Delivery • The question is what to guarantee. That the message: 1. is sent out on the network? 2. is received by the receiver host’s NIC? 3. is put on the receiver’s queue? 4. is applied to the receiver? 5. is starting to be processed by the receiver? 6. has completed processing by the receiver?
  • Ok, then what to do? 1. Start with 0 guarantees (0 additional cost) 2. Add the guarantees you need - one by one • Different USE-CASES, Different GUARANTEES, Different COSTS • For each additional guarantee you add you will either: • decrease performance, throughput or scalability • increase latency
  • Just Use ACKing and be done with it
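One way to read "just use ACKing" is at-least-once delivery: resend until an acknowledgement arrives, and make the receiver ignore duplicates. The sketch below assumes exactly that; `Message`, `Receiver` and `Sender` are hypothetical names, and the network is modelled as a plain function that returns `None` when the ACK is lost.

```scala
import scala.collection.mutable

final case class Message(id: Long, payload: String)

// The receiver deduplicates by id, so a redelivered message is harmless.
final class Receiver {
  private val seen = mutable.Set.empty[Long]
  val applied = mutable.Buffer.empty[String]

  // Applies the message at most once and always answers with the id (the ACK).
  def receive(m: Message): Long = {
    if (seen.add(m.id)) applied += m.payload
    m.id
  }
}

// `deliver` models the unreliable link: Some(ack) on success, None when the
// message or its ACK was lost in transit.
final class Sender(deliver: Message => Option[Long], maxAttempts: Int) {
  // Resend until the matching ACK comes back or the attempts run out.
  def send(m: Message): Boolean =
    (1 to maxAttempts).exists(_ => deliver(m).contains(m.id))
}
```

Note which guarantee this buys: the sender learns that the message was applied, at the cost of an extra round trip per message and possible duplicates on the wire. A use case that can tolerate loss would skip the ACK and pay nothing.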
  • Use Batching • http://www.aosabook.org/en/zeromq.html
  • Use Batching • In a JVM-based application it is more like: 1. Application 2. NIO 3. JVM 4. User/Kernel space boundary 5. TCP 6. IP 7. Ethernet layer 8. NIC
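Since every message has to cross all of the layers listed above, sending several messages per crossing amortizes the fixed cost. A minimal size-based batcher might look like this; `Batcher` and its `flush` callback are hypothetical, and the sketch is single-threaded with no timer, which a real implementation would need to bound latency.

```scala
// Collects items and hands them to `flush` in groups of at most `maxSize`.
final class Batcher[A](maxSize: Int, flush: Vector[A] => Unit) {
  private var pending = Vector.empty[A]

  // Buffer the item; one trip down the stack is paid per full batch.
  def add(item: A): Unit = {
    pending :+= item
    if (pending.size >= maxSize) flushNow()
  }

  // Send whatever is buffered, for example on shutdown or when a timer fires.
  def flushNow(): Unit = if (pending.nonEmpty) {
    val batch = pending
    pending = Vector.empty
    flush(batch)
  }
}
```

Larger batches raise throughput but make the first item in each batch wait longer, which is the latency versus throughput trade-off the next slides return to.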
  • Latency vs Throughput
  • You should strive for maximal throughput with acceptable latency
  • Go Reactive: Interactive, Scalable, Resilient, Event-Driven
  • Thank You • Contact me • Email: jonas@typesafe.com • Web: typesafe.com • Twitter: @jboner

Comments

Nicholas Sterling 2 years ago
Great presentation! Clear, no-holds-barred advice from a real expert.
Nicholas Sterling 2 years ago
The book Jonas mentioned: aosabook.org/en/zeromq.html
