Rewriting Java in Scala & Making Code Lovely 4 – Example: Integration

It’s time for a full example that shows off what we know so far.

For the example, we’re going to implement something that does basic integration. This program will approximate the value of an integral using the rectangle method – a good explanation of this can be found here.

Background

Skip this if you already know what integration and the rectangle method are.

Put simply, integration is finding the area under the graph for a particular function. The article here has more details on this.

The rectangle method is a way of approximating this. The graph for the function is divided into sections. Each section then has a rectangle put in it. The height of the rectangle is the same as the value of the graph at the middle of the section:


Thick rectangles under a graph

This isn’t perfect. All the places where the rectangles are too small or too big are errors in the approximation. These errors can be reduced by dividing the graph in to more sections:

Thin rectangles under a graph

For the purposes of this posting, that’s all there is to it.

The Problem

Make a class that when given a function and a range will approximate the area under the graph. The class can be given the number of rectangles to use.

Java Version

We’ll start off with a Java version of this and then rewrite it to Scala.

public class RectangleMethod {
  public double integrate(double start, double end, int steps) {
    final double stepSize = (end - start) / steps;
    double value = 0; // running total for the area
 
    for (int i = 0; i < steps; i++) {
      // Evaluate the function at the mid-point of the current section
      // The code below could be shorter - but it has been kept purposefully simple
      final double startX = start + stepSize * i;
      final double endX = start + stepSize * (i + 1);
      final double midX = (startX + endX) / 2;
      final double y = function(midX);
 
      // Add on the area of the rectangle for the current section: width * height
      value += stepSize * y;
    }
 
    return value;
  }
 
  private double function(double x) {
    return 20 - Math.pow(x - 4, 2);
  }
}

The Java example above contains a hard-coded function: 20 – (x – 4) * (x – 4).

That’s not quite the solution to the problem. We must be able to provide a function to this method so that we can do integrations for any function. We can do this by making the integrate method take an interface and then the caller must implement this interface for any function it wants to integrate:

public class RectangleMethod {
  public double integrate(Function function, double start, double end, int steps) {
    final double stepSize = (end - start) / steps;
    double value = 0;
 
    for (int i = 0; i < steps; i++) {
      final double startX = start + stepSize * i;
      final double endX = start + stepSize * (i + 1);
      final double midX = (startX + endX) / 2;
      final double y = function.y(midX); // Call the function passed in
 
      value += stepSize * y;
    }
 
    return value;
  }
 
  // New interface for arbitrary functions
  public interface Function {
    public double y(double x);
  }
}

I’ve commented the parts that have changed and removed the old comments.

This now lets us pass in any function we want:

double value = new RectangleMethod().integrate(new Function() {
  public double y(double x) {
    return 20 - Math.pow(x - 4, 2);
  }
}, 0, 1, 200);

This will approximate the integral of 20 – (x – 4) * (x – 4) between 0 and 1 with 200 sections. This requires a lot of work to define a function every time you want to integrate it.

Scala – Simple Version

Now lets take our Scala knowledge so far and convert the above code. We’ll start with the simple Java version:

class RectangleMethod {
  def integrate(start : Double, end : Double, steps : Int) = {
    val stepSize = (end - start) / steps // width of each step
    var value = 0.0 // running total for the area
 
    for (i <- 0 until steps) {
      // Evaluate the function at the mid-point of the current section
      // The code below could be shorter - but it has been kept purposefully simple
      val startX = start + stepSize * i
      val endX = start + stepSize * (i + 1)
      val midX = (startX + endX) / 2
      val y = function(midX)
 
      // Add on the area of the rectangle for the current section: width * height
      value = value + stepSize * y
    }
 
    value // Scala doesn't need the return keyword here
  }
 
  def function(x : Double) = {
    20 - Math.pow(x - 4, 2)
  }
}

For loops

I’ve slipped something new in to the example above, the for loop. In Scala there are many ways to write a for loop, this is one of them. The for loop above will iterate i over the values 0 to steps and exclude steps. E.g.

for (i <- 0 until 10) print(i)

Will print 0123456789. To make it include the final value change until to “to”:

for (i <- 0 to 10) print(i)

Will print 012345678910.

There are more ways to define for loops and I’ll come back to them in later posts.

Scala – Better Version

Now we use our knowledge of using functions as arguments to pass a function to the integrate method:

class RectangleMethod {
  def integrate(
      function : Double => Double, // Add the function argument on
      start : Double, end : Double, steps : Int) = {
    val stepSize = (end - start) / steps
    var value = 0.0
 
    for (i <- 0 until steps) {
      val startX = start + stepSize * i
      val endX = start + stepSize * (i + 1)
      val midX = (startX + endX) / 2
      val y = function(midX) // No need to change this!
 
      value = value + stepSize * y
    }
 
    value
  }
}

We’ve added a single argument to the integrate function. The argument is a function that takes a Double as a parameter and returns a Double. To use this new method:

new RectangleMethod().integrate(
  x => 20 - Math.pow(x - 4, 2), 
  0, 1, 200)

I hope that you’ll agree that using this new version is much neater than the Java example.

Scala – Tiny Version

We can make the integrate method really compact and neat.

The first step is to merge the mid-point calculation:

val startX = start + stepSize * i
val endX = start + stepSize * (i + 1)
val midX = (startX + endX) / 2
// Becomes (by substitution and simplification)...
val midX = start + stepSize * i + stepSize * 0.5)

Then simply the calculation of y

val midX = start + stepSize * i + stepSize * 0.5)
val y = function(midX)
// Becomes...
val y = function(start + stepSize * i + stepSize * 0.5))

And finally, move y into the running total calculation:

value = value + stepSize * function(start + stepSize * i + stepSize * 0.5))

This leaves us with:

class RectangleMethod {
  def integrate(
      function : Double => Double, 
      start : Double, end : Double, steps : Int) = {
    val stepSize = (end - start) / steps
    var value = 0.0
 
    for (i <- 0 until steps) {
      value = value + stepSize * function(start + stepSize * i + stepSize * 0.5))
    }
 
    value
  }
}

Whenever you see this pattern in Scala you should be thinking about whether it can be turned in to a call to foldLeft. The things to look for are:

  • A collection to fold over – in this case (0 until steps) is actually a collection (I’ll cover this more in later posts)
  • A running total – in this case value is a running total

Now we use foldLeft to make this more compact:

class RectangleMethod {
  // Add the function argument on
  def integrate(
      function : Double => Double, 
      start : Double, end : Double, steps : Int) = {
    val stepSize = (end - start) / steps
    (0 until steps).foldLeft(0.0)((runningTotal, i) =>
      runningTotal + stepSize * function(start + stepSize * i + stepSize * 0.5))
  }
}

And, that’s it.

There are many other ways to rewrite this, and I would love for you to come up with your own and leave it here as a comment :)

Rewriting Java in Scala & Making Code Lovely 3 – Variable Declarations

I thought I’d take a step back and look at how variables are declared in Scala and how it compares with Java.

Local Variables

In Java you declare a local variable like this:

Map<String, Integer> userIds = new HashMap<String, Integer>();

In Scala, it is a lot shorter:

var userIds = new HashMap[String, Integer]

The type of the variable is missing. This is because Scala can calculate the type. In this case it will calculate that userIds is of type HashMap[String, Integer]. The Scala site has more information on how it calculates the types (this is called type inference).

Another minor detail here is that you use square brackets [ ] instead of angle brackets < > to specify generics.

Constants

In the above example we declared userIds with the keyword ‘var’. This makes it a variable – we can change userIds to point at a different HashMap if we want.

In Java we can say that a variable is constant using the final keyword:

final int x = 4;
x = 5; // Compile error: cannot change x

Scala doesn’t have the final keyword when declaring variables. Instead, you use the ‘val’ keyword:

val x = 4
x = 5 // Compile error: cannot change x

This has a benefit that isn’t obvious.

Every time you declare a variable you specify whether you can change it or not.

There are many good reasons to make something constant (or, immutable as many people like to say). However, in Java it is easy to forget to make something immutable, it is also a burden as you have to write more every time you want to keep something constant. In Scala, you make an active choice every time you declare a variable. It is also easy for constants to be the default that you use.

Specifying Types

Sometimes, you may want to specify the type of a variable. This is easy:

val x : Int = 4
val name : String = "Bob"

Normally, you won’t need to do this. Only when you have a specific type requirement or Scala has failed to infer the types – a rare event!

Summary

You’ve now been introduced to variable declarations. Scala improves this basic area of programming in a number of ways:

  • Reduces repetitiveness by not writing types all the time
  • Makes the choice of constant/not constant an active choice
  • Makes it easier to alter types of variables

The next part of this series will follow on from this and take a look at declaring fields in classes.

Rewriting Java in Scala & Making Code Lovely 2 – Functions as values

Ever wanted to filter a collection in Java and thought “this sucks”? Typically, you end up with code like:

List evens = new LinkedList();
for (Integer i : list) {
  if (i % 2 == 0) evens.add(i);
}

Or, if you’re going for re-use:

Predicate evenNumbers = new Predicate() {
  @Override
  public boolean evaluate(Object o) {
    Integer n = (Integer)o; // Assume list contains only integers
    return n % 2 == 0;
  }
};
CollectionUtils.filter(list, evenNumbers);

Which is actually longer.

In Scala, functions can be values and this makes the above examples become very short:

val evens = list.filter(_ % 2 == 0)

You’re not supposed to understand this yet – it’s the bait to get you to read further ;-)

Defining Functions in Scala

Before we focus on this I’m going to quickly show you how functions are defined in Scala:

def getHelloString(username : String) : String = {
  "Hello " + username
}

This defines a function called getHelloString. It takes a single argument of type String called username and returns a String. Unlike Java, the type of an argument goes after the name. Likewise, the return type of a function goes after the argument list. It’s also worth noting that there isn’t a return statement here, the value of the last expression in the function is the return value.

That’s all I’m going to say on functions for now – I’ll come back to them in more detail in a later post.

Functions as Arguments

Let’s take a look at the signature for the method filter:

def filter(p : (A) => Boolean) : Iterable[A]

This method takes an argument p and returns an Iterable[A] (the [A] is how you specify generics in Scala but we can ignore this for now).

The argument p is the interesting bit. The type of p is a function that takes an A (generics again) and returns a Boolean. To help this make sense here are a few more examples:

def a(p : String => Boolean) : Boolean
// The type of p is a function that takes a String and returns a Boolean
 
def b(p : String => List[String]) : Boolean
// The type of p is a function that takes a String and returns a list of strings
 
def c(p : (String, Boolean) => Int) : Boolean
// The type of p is a function that takes a String and a boolean and returns a list of strings

This is very different to Java. In Java you would define an interface and use that as the argument type (just like in the CollectionUtils example above).

Coming back to filter, we now know that it takes a single argument. That argument must be a function which takes a single argument of type A and returns a boolean. What is A? A is the same type as the type of the list we are filtering over. So:

List("A", "B").filter(...)
// requires a function that takes a String and returns a Boolean
 
List(1, 2).filter(...)
// requires a function that takes an Int and returns a Boolean

With the knowledge we have so far we could comfortably write:

def isEven(n : Int) : Boolean = {
  n % 2 == 0
}
 
List(1, 2, 3, 4, 5, 6).filter(isEven)

And we’ll get a list containing only the even numbers.

Function Literals

This is all very well and good, but it’s still a lot to write – especially if we only ever use the isEven function in one place.

Fortunately, Scala offers a short-hand that allows us to write functions in-line:

List(1, 2, 3, 4, 5, 6).filter(n => n % 2 == 0)

The way this is written above is very similar to the argument types above. The argument passed to filter here is known as a function literal.

Placeholder Syntax

Scala gives us one further refinement to this:

List(1, 2, 3, 4, 5, 6).filter(_ % 2 == 0)

On first reading this can be cryptic. It’s called placeholder syntax. The _ is a placeholder for the 1st argument to the function. When Scala sees this it replaces the _ with the 1st argument to the function. Each subsequent _ is a placeholder for the next argument. If we wrote:

dostuff(_ + _)

This would be the same as:

dostuff((a, b) => a + b)

It’s important to note that when writing function literals with the arguments named parentheses must be used to group the arguments. This is to prevent ambiguity:

dostuff(a, b => a + b)

Could be read as:

dostuff(
  a,
  b => a + b)

Which isn’t what we intended.

I learn best by example, so here are a few more examples of function literals.

Further Examples

Map

Map is one of the most useful functions in Scala’s collections:

Map creates a new collection based on the original. Each element in the new collection is the result of applying the function given to each element in the original collection.

List(1, 2, 3, 4).map(_ * 2)
// Returns: List(2, 4, 6, 8)
 
List(1, 2, 3, 4).map(_.toString)
// Returns: List("1", "2", "3", "4")

findIndexOf

Finding the index of an element in a list is also very easy:

List(1, 2, 3, 4).findIndexOf(_ % 2 == 0)
// Finds the index of the first even number, or -1 if no such number exists

Count

How about counting the number of even numbers in a list?

List(1, 2, 3, 4).count(_ % 2 == 0)
// Returns 2
 
List("Hi", "Bye", "Hey", "Goodbye").count(_.length == 2)
// Returns 1

What’s this given us?

More concise, more readable

Goodbye anonymous inner classes. If you’ve ever written something that fires events you’ll know how horrible these can be.

Once I got used to passing functions around as arguments I soon realised how often I can re-use the same basic operations. Map and filter are two of the most used constructs in Java – I love that I can now write them in one line.

Easier to debug

The great thing about using functions as arguments is that you can write an algorithm once, and never again. I don’t want to think about how many times I’ve written a basic loop in Java and forgotten to deal with the last case, off-by-one errors, etc.

Easier to test

Now that I can extract out algorithms more effectively I can test them to hell and back. After that, I only have to test that I’m calling them correctly – I don’t have to retest all the corner cases all over again.

Summary

You should now understand:

  • Functions can be passed as arguments to other functions
  • Functions can be written in-line as literal functions
  • Literal functions can be extremely short when using placeholder syntax

If you don’t, don’t worry! If you’ve not come across functional programming before it can be quite a brain melting experience. My advice is to grab the Scala REPL and put in some of the examples above… then experiment and have fun!

Rewriting Java in Scala & Making Code Lovely 1 – Case Classes

This is the first in a series of posts for people who know Java and are learning Scala. Apart from teaching Scala I hope these posts show how Scala makes your code lovely…

What does more lovely mean?

Lovely covers several things:

  • More concise
  • More readable
  • Easier to debug
  • Easier to test
  • Easier to maintain

I’m not saying Scala is a silver bullet – I’m saying it gives you more time and support to get the hard parts right.

Anyway, on with the show. First up:

Case Classes

Case classes in Scala look something like:

case class Employee(name : String)

They are a short-hand for defining simple classes. They provide you with default implementations of equals, toString and hashCode. They also provide you with some handy tricks when doing pattern matching (which I’ll talk about in a future post).

The great benefit of case classes is that you can take a Java class such as:

class Point {
  private int x;
  private int y;
  public Point(int x, int y) {
    this.x = x;
    this.y = y;
  }
  public String toString() {
    return "Point(x: " + x + ", y: " + y + ")";
  }
  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof Point)) {
      return false;
    }
    Point other = (Point)obj;
    return other.x == x && other.y == y;
  }
  @Override
  public int hashCode() {
    return x * 17 + y;
  }
}

and rewrite as

case class Point(x : Int, y : Int)

That’s it.

No, really, that’s it.

You can provide different implementations of toString, equals and hashCode if you really want to. But the majority of the time the default implementations are spot on.

What’s this given us?

More concise, more readable

It’s a one liner. But, it’s not a Perl one liner ;-)

Easier to debug

There isn’t any functionality here to debug – but the standard toString makes debugging code that uses this class far easier. The default toString will result in something like:

Point(3, 4) 
// The order here is the same as the order in the case class definition

Isn’t this nicer than “Point@0×02345678″ as a default?

Easier to test

There’s nothing to test. It’s reasonable to assume that Scala gets the default toString and equality methods correct – so why test them again?

Easier to maintain

Future maintainers don’t have to look at the toString, equals and hashCode methods when they familiarise themselves with the class. As they don’t exist, they can’t introduce bugs there either!

Summary

Case classes are a great way of defining simple and useful classes with very little code. Goodbye huge files for simple POJOs.