Scala Functional Programming Patterns

上QQ阅读APP看书，第一时间看更新

Null Objects – singletons in another garb

One popular form in which singletons are used is the Null Object pattern. Let's see what a Null Object is.

Java gives us the null reference to indicate a missing value. We are not supposed to call a method on it as there is no object. If, erroneously, we do, we are greeted with a Null Pointer Exception (NPE). When we design methods to return nulls, the client code that calls the method needs to check assiduously. This is what a typical null check-based Java code looks like:

       Point p = makeAPoint(); // a method that returns null in some cases
       if (p != null) { // the dreaded null check – onus is on us...
            // We are on sure grounds
       }

The problem is that the onus term is on us to check for a reference being null. Every call to makeAPoint() needs to be checked, as shown earlier. This soon becomes tedious.

Note

The inventor of the null keyword called it his billion-dollar mistake!!! Please see http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare.

What is the way out? How do we return missing values? Instead of returning null, return a Null Object. Why? The point being is that it is a kind of point and not a null reference. So no NPEs are ever possible.

Linked lists and trees are restricted forms of a graph. Trees have parent/child (and at times sibling) relationships and they don't contain cycles. Hence, trees are Directed Acyclic Graphs (DAG). A tree needs to end somewhere, the nodes without any children. These are leaf nodes and they use null pointers to indicate termination.

However, instead of a null pointer, we can use a sentinel node. A sentinel node is like a train terminus, a special node in a linked list or a tree, indicating termination. It is used as an alternative over nulls. Sentinel nodes do not hold any meaningful data. The traversal algorithms are slightly modified, so when a sentinel node is hit, we know we are done.

For example, let's look at a BST, designed with a sentinel node instead of nulls. BSTs are a well-known data structure, the nodes arranged such that the parent value is greater than the left child and lesser than the right child. A BST can be used as a set.

A set is a collection of unique values. If we try inserting duplicate values, the values are discarded. For example:

scala> val s = Set(1, 1, 2, 3, 4, 4, 2, 3)
s: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)
scala> println(s)
Set(1, 2, 3, 4)
scala> s + 4
res9: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)
scala> s + 5
res10: scala.collection.immutable.Set[Int] = Set(5, 1, 2, 3, 4)
We have got a new set. The old set is still as before.
scala> s
res11: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)

We consider how to implement a set of numbers using BSTs. Note the terminating sentinel node and both its left and right arms are pointing at itself. Here it is depicted pictorially:

Figure 2.1: A Binary tree

First, an illustrative unit test; using those Hamcrest beauty matchers—the contains helper method is statically imported—makes sure our INORDER expectation is not violated.

The following test case adds some numbers—the insert call creates a tree. We performed an INORDER traversal on the BST. An INORDER traversal gets us the values in a sorted order:

package com.packt.chapter02;

import static org.hamcrest.Matchers.contains;
import static org.junit.Assert.assertThat;

import java.util.List;

import org.junit.Test;

import com.google.common.collect.Lists;

public class NodeTest {

 @Test
 public void testTraversal() {
  Node tree = Node.createTreeWithRoot(72);
    tree.insert(50);
    tree.insert(25);
    tree.insert(65);
    tree.insert(95);
    tree.insert(88);
    tree.insert(112);

  List<Integer> list = Lists.newArrayList();

  tree.traverseInOrder(list);

  assertThat(list, contains(25, 50, 65, 72, 88, 95, 112));
 }

}

The implementation of the tree algorithms implements the tree data structure. The tree is terminated with a sentinel node. A Node is a binary search tree node. Our tree represents a set of numbers. Each tree node contains a number. The tree is formed by linking up child nodes. All values in the left sub-tree are less than the node value. Values in the right sub-tree are greater. We don't allow for duplicates as essentially our tree represents a set of numbers.

The sentinel node is represented by the nullNode. Note that it is never exposed to the outside world.

The Node class encapsulates the NullNode, which is a Null Object:

import java.util.List;

public class Node {

    private static final Node nullNode = new NullNode();
    private final int val;
    protected Node left;
    protected Node right;

    public Node(final int val) {
        this.val = val;
        this.left = nullNode;
        this.right = nullNode;
    }

    public void insert(final int i) {
        final Node node = new Node(i);
        insert(node);
    }

    ;

    public void insert(final Node n) {
        if (n.val < val) {
            left.insertLeft(n, this);
        } else if (n.val > val) {
            right.insertRight(n, this);
        }
        // else we found the value – do nothing, we
        // already have it
    }

    protected void insertLeft(final Node n, final Node
            parent) {
        insert(n);
    }

    protected void insertRight(final Node n, final Node
            parent) {
        insert(n);
    }

    public void traverseInOrder(final List<Integer> list) {
        left.traverseInOrder(list);
        list.add(val);
        right.traverseInOrder(list);
    }

    private static class NullNode extends Node {

        public NullNode() {
            super(Integer.MAX_VALUE); //
            this.left = this.right = this;
        }

        @Override
        public void traverseInOrder(final List<Integer>
                                                    list) {
            // do nothing
        }

        @Override
        protected void insertLeft(final Node n, final
        Node parent) {
            parent.left = n;  //  Attach new node as
            // left child of parent
        }

        @Override
        protected void insertRight(final Node n, final
        Node parent) {
            parent.right = n; // Attach new node as right
            // child of parent
        }
    }

    // factory method to create the root node – and get
    // the ball rolling
    public static Node createTreeWithRoot(final int val) {
        return new Node(val);
    }
}

The big idea is to homogenize—a missing value is still a value, just a special value accessible only to the Node class, just like a singleton.

The NullNode is a Null Object. Look at the traverseInOrder() method of NullNode, it does nothing and so it terminates the recursion. We also override the insertLeft(...) and insertRight(...) methods—these just link up the child with its parent and thereby help grow the tree. Here it is depicted pictorially:

Figure 2.2: Binary Search Tree terminated with a sentinel node

How many NullNodes do you see? Note that our NullNode has no state of its own, it is a do nothing object—a place-holder! How many would we ever need? Just one would suffice! By design we only ever want one instance—we should prohibit creating more than one instance. Our null node is a singleton.

What value should the null node hold? Anything would do, as we don't use it. However, we must initialize it as it is final in the super class. We use the maximum number as a reminder to serve us that this is really a sentinel node. Its value does not have any business importance though.

Null Objects – the Scala way

Instead of using null to indicate missing values, Scala provides an Option class to represent optional values. Option can hold either Some value or None value. None indicates missing values. Please see http://www.scala-lang.org/api/current/index.html#scala.Option.

So instead of nulls, we return None. To see why, try the following code snippets in REPL:

scala> val m = List(Some("on"), None, Some("us"), None)

m: List[Option[String]] = List(Some(on), None, Some(us), None)
scala> for {

 | Some(p) <- m

 | } println(p) 

on

us

scala> val m = List(Some("on"), None, Some("us"), None)

m: List[Option[String]] = List(Some(on), None, Some(us), None)

scala> val p = m.flatten

p: List[String] = List(on, us)

See all the None getting filtered out? Let's continue further:

scala> val p = Some("one")

p: Some[String] = Some(one)

scala> val q = None

q: None.type = None

scala> val r = Option(null)

r: Option[Null] = None

scala> p.foreach(println(_)) // The _ is bound to the current element 

one

scala> q.foreach(println(_)) 

scala> r.foreach(println(_))

How come the flatten and foreach calls are used with Options? These make sense for a container. For example, a List:

scala> val p = List(List(1,2), List(3,4))

p: List[List[Int]] = List(List(1, 2), List(3, 4))

scala> p.flatten

res0: List[Int] = List(1, 2, 3, 4)

Aha! These work for Options, too—as Options are containers holding zero or one object.

The map and flatMap operations are defined on Options. For Options holding one value that is. For a Some, these invoke the associated function. For a None, they do nothing. Let's check the following command lines:

scala> def m(n : Option[Int]) = n.map(_ + 1) // Options provide the map operation
m: (n: Option[Int])Option[Int]

scala> val c1 = Some(10)
c1: Some[Int] = Some(10)

scala> val c2 = None
c2: None.type = None

scala> m(c1)
res19: Option[Int] = Some(11)

scala> m(c2)
res20: Option[Int] = None

So the map operation, when invoked over a None, simply did nothing:

scala> def m(n : Int) = if (n >= 0 && n < 10) Some(n) else None
m: (n: Int)Option[Int]
scala> m(10) foreach( x => println (x+1 ))
scala> m(5) foreach( x => println (x+1) )

The foreach increments and prints the number, if there is a Some. For a None, it does nothing.

Options are container

Let's put all this together to write the Scala version of the sentinel-based tree node. Let's first look at the picture:

Figure 2.3: Option as Null Object

Here is the Scala code. (Note that our TreeNode is now a case class, with left and right fields as Option[TreeNode]. We also use ListBuffer instead of List):

object Tree extends App {

  case class TreeNode(v: Int, left: Option[TreeNode] = None,
                      right: Option[TreeNode] = None) {
    def add(av: Int) = {
      insert(av).get
    }

    def insert(av: Int): Option[TreeNode] = {
      def insertLeft(av: Int) =
        left.flatMap(_.insert(av)) orElse
          Option(TreeNode(av)) // 1

      def insertRight(av: Int) =
        right.flatMap(_.insert(av)) orElse
          Option(TreeNode(av)) // 2

      av.compare(v) match {
        case 0 => Option(this) // 3
        case -1 => Option(TreeNode(v, insertLeft(av),
          right)) // 4
        case 1 => Option(TreeNode(v, left, insertRight
          (av))) // 5
      }
    }
    def traverseItInOrder(): Option[List[TreeNode]] = {
      val leftList = left.flatMap(x => x
        .traverseItInOrder()).getOrElse(Nil)
      val rightList = right.flatMap(x => x
        .traverseItInOrder()). getOrElse(Nil)
      val result = (leftList :+ this) ++ rightList
      Option(result)
    }

    def traverseInOrder() = {
      val result: Option[List[TreeNode]] = traverseItInOrder()
      result.getOrElse(Nil)
    }
  }

  val p = TreeNode(4).add(3).add(0).add(1).add(99).
    add(1).add(4)

  for {
    q <- p.traverseInOrder() // 5
  } println(q.v)

}

Our TreeNode is a case class. The left and right child references are of type Option[TreeNode]. There is only one constructor that sets the left and right children to a None.

There is an insert method that we can use to add a value to the tree. As earlier, we check the argument and if it is less than this .val we try inserting it into the left arm otherwise in the right arm.

Note

The left can be either an Some[TreeNode] or None.

Let's try to understand the flatMap call:

left.flatMap(_.insert(av)) orElse Option(TreeNode(av))

This form uses infix notation for method invocation. What does it mean?

Scala provides a special invocation syntax when we have a method with one argument. The following method call:

                 o.m(arg)

can be written as:

                 o m arg

scala> val list = List.fill(5)(1) // we create a list of repeated 1
list: List[Int] = List(1, 1, 1, 1, 1)

scala> list.mkString("-") // invocation
res30: String = 1-1-1-1-1

scala> list mkString "-" // infix notation
res31: String = 1-1-1-1-1

Applying the same to our example:

scala> None.orElse(Some(10)) // invoking the orElse method on a None 
res32: Option[Int] = Some(10)

scala> None orElse Some(10) // the infix notation
res33: Option[Int] = Some(10)

scala> Some(10) orElse Some(20)
res34: Option[Int] = Some(10)

Please see http://docs.scala-lang.org/style/method-invocation.html.

Let's try the following example:

scala> val left = None

left: None.type = None 

scala> val right = Option("Hi")

right: Option[String] = Some(Hi)
scala> left orElse right
res7: Option[String] = Some(Hi)

In the left.flatMap(_.insert(av)) expression, if left is Some(TreeNode) it reaches into the Some, which pulls out the TreeNode reference and invokes insert on it.

On the other hand, if left is None, the entire expression results in a None or else kicks into action—a new TreeNode is created and put into an Option and returned.

The salient points for the preceding code are as follows:

Is symmetric—instead of left, it works on the right child reference.
Works on the same principle—now for traversal, so if left is a None, nothing happens else we recursively traverse the left sub-tree.
Same as case 2, if right is a None, nothing happens else we recursively traverse the right sub-tree.