Presentation Keynote - Scala with Style

Scala gives you awesome expressive power, but how do you make the best use of it? In my talk I will discuss the question of what makes good Scala style. We will start with syntax and continue with how to name things, how to mix objects and functions, where (and where not) to use mutable state, and when to use which design pattern. As with most questions of style, the discussion will be quite subjective, and some of it might be controversial. I am looking forward to discussing these topics with the conference attendees.

Slides

Scala With Style

Martin Odersky

This Talk

I’ll argue that we are at a transition period between two programming paradigms: Imperative/OOP ➔ Functional.

In the end, we will see a fusion of these.

In light of this development, many questions of programming techniques and style have to be revisited.

How OOP got started

First popular OO language: Simula 67
Second popular OO language: Smalltalk

Why Did OOP become popular?

Candidate explanations:
● Encapsulation?
● Code re-use?
● Dynamic binding?
● Dependency inversion?
● Liskov substitution principle?
● Open-closed principle?

Answers given on the slide, in order: No. Don’t think so. Not directly. Came much later. You gotta be kidding!

It was because of the things you could do with OOP!

How OOP got started

Traditional approach: Linked List

Two cases: Empty, NonEmpty

Unbounded number of operations:
● map
● reverse
● print
● get elem
● insert
● ...

How OOP got started

New challenge: Simulation

Fixed number of operations: nextStep, toString, aggregate

Unbounded number of implementations: Car, Road, Molecule, Cell, Person, Building, City, ...

How OOP got started

New challenge: GUI widgets

Fixed number of operations: redraw, boundingRectangle, move

Unbounded number of implementations: Window, Menu, Letter, Shape, Curve, Image, Video, ...

What Simulation and GUIs have in common

Both need a way to execute a fixed API with an unknown implementation.

It’s possible to do this in a procedural language such as C, but it’s too cumbersome, so people wanted object-oriented languages instead.

What does this have to do with FP?

Just like OOP did then, FP has lots of methodological advantages:
● Fewer errors
● Better modularity
● Higher-level abstractions
● Shorter code
● Increased developer productivity

But these alone are not enough for mainstream adoption (after all, FP has been around for 50 years).

Need a catalyzer, something that sparks initial adoption until the other advantages become clear to everyone.

A Catalyzer

● Two forces driving software complexity:
    ● Multicore (= parallel programming)
    ● Cloud computing (= distributed programming)
● Current languages and frameworks have trouble keeping up (locks/threads don’t scale)
● Need better tools with the right level of abstraction

Triple Challenge

Parallel: how to make use of multicore, GPUs, clusters?
Async: how to deal with asynchronous events?
Distributed: how to deal with delays and failures?

Mutable state is a liability for each one of these!
● Cache coherence
● Races
● Versioning

The Essence of Functional Programming

Concentrate on transformations of immutable values instead of stepwise modifications of mutable state.

What about Objects?

So, should we forget OO and all program in functional programming languages?

No! What the industry learned about OO decomposition in analysis and design stays valid.

Central question: What goes where?

In the end, we need to put things somewhere, or risk an unmanageable global namespace.

New Objects

Previously: “Objects are characterized by state, identity, and behavior.” (Booch)

Now:
● Eliminate or reduce mutable state.
● Structural equality instead of reference identity.
● Concentrate on functionality (behavior).

Can FP and OOP be combined?

[Figure: FP and OOP. Captions: “How many FP people see OOP” and “How many OOP people see FP”. Annotation: “And that’s where we are ☺”]

Would prefer it to be like this:

[Figure]

But need to get rid of some baggage first.

Bridging the Gap

● Scala acts as a bridge between the two paradigms.
● To do this, it tries to be:
    ● orthogonal
    ● expressive
    ● un-opinionated
● It naturally adapts to many different programming styles.

A Question of Style

Because Scala admits such a broad range of styles, the question is: which one to pick?

Certainly, Scala is:
● Not a better Java
● Not a Haskell on the JVM

But what is it then? I expect that a new fusion of functional and object-oriented will emerge.

Some Guidelines


#1 Keep it Simple


#2 Don’t pack too much in one expression

● I sometimes see stuff like this:

    jp.getRawClasspath.filter(
      _.getEntryKind == IClasspathEntry.CPE_SOURCE).
      iterator.flatMap(entry =>
        flatten(ResourcesPlugin.getWorkspace.
          getRoot.findMember(entry.getPath)))

● It’s amazing what you can get done in a single statement.
● But that does not mean you have to do it.

Find meaningful names

● There’s a lot of value in meaningful names.
● Easy to add them using inline vals and defs.

    val sources = jp.getRawClasspath.filter(
      _.getEntryKind == IClasspathEntry.CPE_SOURCE)

    def workspaceRoot =
      ResourcesPlugin.getWorkspace.getRoot

    def filesOfEntry(entry: IClasspathEntry) =
      flatten(workspaceRoot.findMember(entry.getPath))

    sources.iterator flatMap filesOfEntry

#3 Prefer Functional


Prefer Functional ...

● By default:
    ● use vals, not vars.
    ● use recursions or combinators, not loops.
    ● use immutable collections
    ● concentrate on transformations, not CRUD.
● When to deviate from the default:
    ● Sometimes, mutable gives better performance.
    ● Sometimes (but not that often!) it adds convenience.

#4 ... But don’t diabolize local state

Why does mutable state lead to complexity?

It interacts with different program parts in ways that are hard to track.

=> Local state is less harmful than global state.

“Var” Shortcuts

Example:

    var interfaces = parseClassHeader()....
    if (isAnnotation) interfaces += ClassFileAnnotation

Could have written:

    val parsedIfaces = parseClassHeader()
    val interfaces =
      if (isAnnotation) parsedIfaces + ClassFileAnnotation
      else parsedIfaces

But this is not necessarily clearer!

A Question of Naming

As so often, it comes down to naming.

Do the additional intermediate results of the functional solution help understanding?

    val parsedIfaces = parseClassHeader()
    val interfaces =
      if (isAnnotation) parsedIfaces + ClassFileAnnotation
      else parsedIfaces

More local state examples

● Here’s another example where local state is useful.
● Say you have a sequence of items with price and discount attributes.
● Compute the sums of all prices and all discounts. Easy:

    val totalPrice = items.map(_.price).sum
    val totalDiscount = items.map(_.discount).sum

● Now, do the same with just one sequence traversal (maybe because items is an iterator).

More local state examples

The canonical functional solution uses a foldLeft:

    val (totalPrice, totalDiscount) =
      items.foldLeft((0.0, 0.0)) {
        case ((tprice, tdiscount), item) =>
          (tprice + item.price, tdiscount + item.discount)
      }

Whereas the canonical imperative solution looks like this:

    var totalPrice, totalDiscount = 0.0
    for (item <- items) {
      totalPrice += item.price
      totalDiscount += item.discount
    }
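To make the comparison concrete, here is a minimal self-contained sketch of both single-traversal variants. The `Item` case class, the method names, and the sample values are assumptions for illustration; the slide only says that items have a price and a discount.

```scala
object TotalsDemo {
  // Hypothetical item type; only `price` and `discount` come from the slide.
  final case class Item(price: Double, discount: Double)

  // Functional variant: one pass, threading a pair of running totals.
  def totalsWithFold(items: Iterator[Item]): (Double, Double) =
    items.foldLeft((0.0, 0.0)) {
      case ((tprice, tdiscount), item) =>
        (tprice + item.price, tdiscount + item.discount)
    }

  // Imperative variant: one pass, two local vars that never leave the method.
  def totalsWithLoop(items: Iterator[Item]): (Double, Double) = {
    var totalPrice, totalDiscount = 0.0
    for (item <- items) {
      totalPrice += item.price
      totalDiscount += item.discount
    }
    (totalPrice, totalDiscount)
  }

  // Made-up sample data. A def, so that every call yields a fresh iterator
  // (an iterator can be traversed only once).
  def sampleItems: Iterator[Item] =
    Iterator(Item(10.0, 1.0), Item(20.0, 2.5), Item(5.0, 0.5))
}
```

Both methods are pure from the caller's point of view: the vars in `totalsWithLoop` are local state in the sense of guideline #4 and cannot be observed from outside.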

#5 Careful with mutable objects

Mutable objects tend to encapsulate global state.

“Encapsulate” sounds good, but it does not make the global state go away!

=> Still a lot of potential for complex entanglements.

When is an Object Mutable?

But then, what about this?

    import scala.collection.mutable

    class Memo[T, U](fn: T => U) {
      val memo = new mutable.WeakHashMap[T, U]
      def apply(x: T) = memo.getOrElseUpdate(x, fn(x))
    }

An object is mutable if its (functional) behavior depends on its history.

    new Memo((i: Int) => i + 1)                // immutable

    var ctr = 0
    new Memo((i: Int) => { ctr += i; ctr })    // mutable
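The distinction can be tried out with a small sketch. The wrapper object `MemoDemo` and the names `pureMemo` and `impureMemo` are assumptions for illustration; `Memo` follows the slide, with the function passed as an explicit constructor argument.

```scala
import scala.collection.mutable

object MemoDemo {
  // Memo as on the slide: caches results of fn in a private weak hash map.
  class Memo[T, U](fn: T => U) {
    private val memo = new mutable.WeakHashMap[T, U]
    def apply(x: T): U = memo.getOrElseUpdate(x, fn(x))
  }

  // Wraps a pure function: the cache is an invisible optimization,
  // so this object behaves as if it were immutable.
  val pureMemo = new Memo((i: Int) => i + 1)

  // Wraps a function that updates a counter: the answer for a new argument
  // depends on which arguments were asked for before, so it is mutable.
  var ctr = 0
  val impureMemo = new Memo((i: Int) => { ctr += i; ctr })
}
```

With `pureMemo`, the order of calls is irrelevant. With `impureMemo`, asking for 1 and then 2 yields a different answer for 2 than asking for 2 first, which is exactly the history dependence the slide describes.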


#6 Don’t stop improving too early

● You often can shrink code by a factor of 10, and make it more legible at the same time.
● But the cleanest and most elegant solutions do not always come to mind at once.
● That’s OK. It’s great to experience the pleasure of finding a better solution multiple times.

So, keep going ...

Choices


Choice #1: Infix vs “.”

Scala unifies operators and method calls.

Every operator is a method.

Every method with at least one parameter can be used as an infix operator.

How do you choose?

    items + 10           or   items.+(10)          ?
    xs map f             or   xs.map(f)            ?
    xs flatMap fun       or   xs.flatMap(fun)      ?
       filterNot isZero         .filterNot(isZero)
       groupBy keyFn            .groupBy(keyFn)

Infix vs “.”

● If the method name is symbolic, always use infix.
● For alphanumeric method names, one can use infix if there is only one alphanumeric operator in the expression

    mapping add filter

● But prefer “.method(...)” for chained operators:

    mapping.add(filter).map(second).flatMap(third)
    xs.map(f).mkString

Choice #2: Alphabetic vs Symbolic

Scala unifies operators and methods.

Every operator is a method.

Identifiers can be alphanumeric or symbolic.

How do you choose?

    xs map f               or   xs *|> f         ?
    vector + mx            or   vector add mx    ?
    (xs.foldLeft(z))(op)   or   (z /: xs)(op)    ?
    UNDEFINED              or   ???              ?

Alphabetic vs Symbolic

Recommendation: Use symbolic only if
1. Meaning is understood by your target audience
2. Operator is standard in application domain, or
3. You would like to draw attention to the operator (symbolic usually sticks out more than alphabetic).

(Reason 3 is risky)

Choice #3: Loop, recursion or combinators?

Often, for the same functionality you can use a loop:

    var i = 0
    while (i < limit && !qualifies(i)) i += 1
    i

Or you could use recursion:

    def recur(i: Int): Int =
      if (i >= limit || qualifies(i)) i else recur(i + 1)
    recur(0)

Or you could use predefined combinators:

    (0 until limit).find(qualifies).getOrElse(limit)

How do you choose?
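The three variants can be put side by side in a small runnable sketch. The concrete `limit` and `qualifies` below are assumptions chosen only so that there is something to run; the slide leaves both open.

```scala
import scala.annotation.tailrec

object FirstQualifying {
  // Hypothetical bound and predicate, for demonstration only.
  val limit = 100
  def qualifies(i: Int): Boolean = i * i > 50

  // Loop: a local var advances until the predicate holds or the bound is hit.
  def withLoop: Int = {
    var i = 0
    while (i < limit && !qualifies(i)) i += 1
    i
  }

  // Recursion: same control flow; @tailrec asks the compiler to verify
  // that the call is compiled into a loop.
  def withRecursion: Int = {
    @tailrec
    def recur(i: Int): Int =
      if (i >= limit || qualifies(i)) i else recur(i + 1)
    recur(0)
  }

  // Combinators: find the first qualifying index, defaulting to the bound.
  def withCombinators: Int =
    (0 until limit).find(qualifies).getOrElse(limit)
}
```

All three return the first index below `limit` for which `qualifies` holds, or `limit` if there is none.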

How About This One:

    xs map f

?

    var buf = new ListBuffer[String]
    for (x <- xs) buf += f(x)
    buf.toList

?

    def recur(xs: List[String]): List[String] = xs match {
      case x :: xs1 => f(x) :: recur(xs1)
      case Nil => Nil
    }

?

And How About This One:

    xs.grouped(2).toList.map {
      case List(x, y) => x * y
    }

Great if you know the combinators... and everybody else does as well.

    var buf = new ListBuffer[Int]
    var ys = xs
    while (ys.nonEmpty) {
      buf += ys(0) * ys(1)
      ys = ys.tail.tail
    }

Clearly the worst.

    def recur(xs: List[Int]): List[Int] = xs match {
      case x1 :: x2 :: xs1 => (x1 * x2) :: recur(xs1)
      case Nil => Nil
    }
    recur(xs)

Easier to grasp for newcomers, and they tend to be more efficient as well.
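For readers who want to run the three pairwise-product variants, here is a sketch that wraps each in a method. The object and method names are assumptions; like the slide code, all three assume a list of even length and fail on an odd one.

```scala
import scala.collection.mutable.ListBuffer

object PairProducts {
  // Combinators: split into chunks of two, multiply each chunk.
  def withCombinators(xs: List[Int]): List[Int] =
    xs.grouped(2).toList.map {
      case List(x, y) => x * y
    }

  // Loop: walk the list two elements at a time, collecting into a buffer.
  def withLoop(xs: List[Int]): List[Int] = {
    val buf = new ListBuffer[Int]
    var ys = xs
    while (ys.nonEmpty) {
      buf += ys(0) * ys(1)
      ys = ys.tail.tail
    }
    buf.toList
  }

  // Recursion with pattern matching: peel off two elements per step.
  // Note this is plain (not tail) recursion, so very long lists use deep stacks.
  def withRecursion(xs: List[Int]): List[Int] = xs match {
    case x1 :: x2 :: rest => (x1 * x2) :: withRecursion(rest)
    case Nil              => Nil
  }
}
```

The pattern-matching versions make the "two at a time" shape of the data visible in the code, whereas the loop hides it in index arithmetic.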

Why does Scala have all three?

● Combinators: Easiest to use, are done in the library.
● Recursive functions: bedrock of FP
● Pattern matching: Much clearer and safer than tests and selections.
● Loops: Because they are familiar, and sometimes are the simplest solution.

Recommendation

1. Consider using combinators first.
2. If this becomes too tedious, or efficiency is a big concern, fall back on tail-recursive functions.
3. Loops can be used in simple cases, or when the computation inherently modifies state.

Choice #4: Procedures or “=”

Scala has special syntax for Unit-returning procedures.

    def swap[T](arr: Array[T], i: Int, j: Int) {
      val t = arr(i); arr(i) = arr(j); arr(j) = t
    }

or:

    def swap[T](arr: Array[T], i: Int, j: Int): Unit = {
      val t = arr(i); arr(i) = arr(j); arr(j) = t
    }

Which do you choose?

Why Scala has both

● Historical accident.
● I was concerned about having to explain Unit too early to Java programmers.
● Problem: This trades simplicity for ease of use/familiarity.
● Also, opens up a bad trap:

    def swap(arr: Array[T], i: Int, j: Int): Unit {
      val t = arr(i); arr(i) = arr(j); arr(j) = t
    }

Recommendation

Don’t use procedure syntax.

Choice #5: Private vs nested

Say you have an outer method that uses a helper method for some of its functionality:

    def outer(owner: Symbol) = {
      def isJava(sym: Symbol): Boolean = sym hasFlag JAVA
      if (symbols exists isJava) ...
      ...
    }

Do you write isJava inline, or as a separate private method?

Private as an alternative

    private def isJava(sym: Symbol): Boolean =
      sym hasFlag JAVA

    def outer() = {
      if (symbols exists isJava) ...
      ...
    }

When to Use Nesting

You definitely want to nest if that way you avoid passing parameters by capturing a value in the environment.

    def outer(owner: Symbol) = {
      def isJava(sym: Symbol): Boolean =
        (sym hasFlag JAVA) || owner.isJava
      if (symbols exists isJava) ...
      ...
    }

But if you do not capture anything, it becomes harder to see this if you have to scan lots of nested code.
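The trade-off can be sketched without the compiler's own types. Everything named here (`Sym`, `javaFlag`, `anyJavaNested`, `anyJavaPrivate`, `isJavaWith`) is a hypothetical stand-in for illustration, not the API used on the slide.

```scala
object NestingDemo {
  // Hypothetical stand-in for the Symbol type on the slide.
  final case class Sym(name: String, javaFlag: Boolean)

  // Nested helper: captures `owner` from the enclosing method,
  // so it can be passed to `exists` as a one-argument function.
  def anyJavaNested(owner: Sym, symbols: List[Sym]): Boolean = {
    def isJava(sym: Sym): Boolean = sym.javaFlag || owner.javaFlag
    symbols.exists(isJava)
  }

  // Private alternative: nothing can be captured, so `owner`
  // has to be threaded through as an extra parameter.
  private def isJavaWith(owner: Sym)(sym: Sym): Boolean =
    sym.javaFlag || owner.javaFlag

  def anyJavaPrivate(owner: Sym, symbols: List[Sym]): Boolean =
    symbols.exists(isJavaWith(owner))
}
```

The nested version saves a parameter; the private version makes it obvious at a glance which values the helper depends on.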

Recommendation

1. Prefer nesting if you can save on parameters.
2. Prefer nesting for small functions, even if nothing is captured.
3. Don’t nest many levels deep.

Choice #6: Pattern matching vs dynamic dispatch

Say you have a class hierarchy of shapes.

    class Shape
    case class Circle(center: Point, radius: Double) extends Shape
    case class Rectangle(ll: Point, ur: Point) extends Shape
    case class Point(x: Double, y: Double) extends Shape

and you want to write a method to compute the area of a shape.

Alternative 1: Pattern matching

You could write a single method area and match over all possible shapes:

    def area(s: Shape): Double = s match {
      case Circle(_, r) =>
        math.Pi * r * r
      case Rectangle(ll, ur) =>
        (ur.x - ll.x) * (ur.y - ll.y)
      case Point(_, _) =>
        0
    }

Alternative 2: Dynamic Dispatch

Or, you could add an area method to each Shape class.

    abstract class Shape {
      def area: Double
    }
    case class Circle(center: Point, radius: Double) extends Shape {
      def area = math.Pi * radius * radius
    }
    case class Rectangle(ll: Point, ur: Point) extends Shape {
      def area = (ur.x - ll.x) * (ur.y - ll.y)
    }

Which do you choose?
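Both alternatives can be compared in one compilable sketch. To keep it short it departs from the slides in two labeled ways: `Point` is a plain case class rather than a `Shape`, and the pattern-matching hierarchy is declared `sealed` (an assumption) so the compiler can check that the match covers every case.

```scala
object ShapesDemo {
  final case class Point(x: Double, y: Double)

  // Alternative 1: closed hierarchy, one function matching on all cases.
  object Matching {
    sealed trait Shape
    final case class Circle(center: Point, radius: Double) extends Shape
    final case class Rectangle(ll: Point, ur: Point) extends Shape

    def area(s: Shape): Double = s match {
      case Circle(_, r)      => math.Pi * r * r
      case Rectangle(ll, ur) => (ur.x - ll.x) * (ur.y - ll.y)
    }
  }

  // Alternative 2: open hierarchy, every class implements `area` itself.
  object Dispatch {
    abstract class Shape { def area: Double }
    final case class Circle(center: Point, radius: Double) extends Shape {
      def area: Double = math.Pi * radius * radius
    }
    final case class Rectangle(ll: Point, ur: Point) extends Shape {
      def area: Double = (ur.x - ll.x) * (ur.y - ll.y)
    }
  }
}
```

In `Matching`, adding a second operation such as a perimeter is one new function, while adding a shape means revisiting every match. In `Dispatch` it is the other way round.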

Why Scala has Both

Pattern matching: functional, convenient
Dynamic dispatch: core mechanism for extensible systems.

Recommendation

● It depends whether your system should be extensible or not.
● If you foresee extensions with new data alternatives, choose dynamic dispatch.
● If you foresee adding new methods later, choose pattern matching.
● If the system is complex and closed, also choose pattern matching.
● What if you want to extend in both dimensions?

Extensibility in both dimensions

Here’s one way to do it:

    class ShapeHandler {
      def area(s: Shape) = s match { ... }
    }

    class ExtraShapeHandler extends ShapeHandler {
      override def area(s: Shape) = s match {
        case Triangle(...) => ...
        case _ => super.area(s)
      }
    }

“The expression problem”.
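A filled-in version of this handler pattern might look as follows. It is only a sketch under assumptions: the fields of `Triangle`, the fallback case that throws, and the wrapper object are all invented here to make the elided parts runnable.

```scala
object ExpressionProblemDemo {
  final case class Point(x: Double, y: Double)

  // Open (non-sealed) hierarchy, so new shapes can be added later.
  trait Shape
  final case class Circle(center: Point, radius: Double) extends Shape
  final case class Rectangle(ll: Point, ur: Point) extends Shape

  class ShapeHandler {
    def area(s: Shape): Double = s match {
      case Circle(_, r)      => math.Pi * r * r
      case Rectangle(ll, ur) => (ur.x - ll.x) * (ur.y - ll.y)
      // Assumed fallback: shapes this handler does not know are rejected.
      case other =>
        throw new IllegalArgumentException(s"unknown shape: $other")
    }
  }

  // New data alternative, added without touching the code above.
  // Hypothetical triangle described by a base and the matching height.
  final case class Triangle(base: Double, height: Double) extends Shape

  // Handles the new case and delegates everything else to the parent.
  class ExtraShapeHandler extends ShapeHandler {
    override def area(s: Shape): Double = s match {
      case Triangle(b, h) => 0.5 * b * h
      case _              => super.area(s)
    }
  }
}
```

New shapes come in through handler subclasses like `ExtraShapeHandler`; new operations come in as additional methods on a handler. The price is that an unknown shape is only detected at run time, not by the compiler.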

Conclusions

Lots of puzzling choices. This is natural; we are breaking new ground here.

My main advice: Keep things simple. Think of good names. Have fun!

Thank You
