Null Objects – singletons in another garb
One popular form in which singletons are used is the Null Object pattern. Let's see what a Null Object is.
Java gives us the null reference to indicate a missing value. We are not supposed to call a method on it as there is no object. If, erroneously, we do, we are greeted with a Null Pointer Exception (NPE). When we design methods to return nulls, the client code that calls the method needs to check assiduously. This is what a typical null check-based Java code looks like:
Point p = makeAPoint(); // a method that returns null in some cases if (p != null) { // the dreaded null check – onus is on us... // We are on sure grounds }
The problem is that the onus
term is on us
to check for a reference being null. Every call to makeAPoint()
needs to be checked, as shown earlier. This soon becomes tedious.
Note
The inventor of the null keyword called it his billion-dollar mistake!!! Please see http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare.
What is the way out? How do we return missing values? Instead of returning null, return a Null Object. Why? The point being is that it is a kind of point and not a null reference. So no NPEs are ever possible.
Linked lists and trees are restricted forms of a graph. Trees have parent/child (and at times sibling) relationships and they don't contain cycles. Hence, trees are Directed Acyclic Graphs (DAG). A tree needs to end somewhere, the nodes without any children. These are leaf nodes and they use null pointers to indicate termination.
However, instead of a null pointer, we can use a sentinel node. A sentinel node is like a train terminus, a special node in a linked list or a tree, indicating termination. It is used as an alternative over nulls. Sentinel nodes do not hold any meaningful data. The traversal algorithms are slightly modified, so when a sentinel node is hit, we know we are done.
For example, let's look at a BST, designed with a sentinel node instead of nulls. BSTs are a well-known data structure, the nodes arranged such that the parent value is greater than the left child and lesser than the right child. A BST can be used as a set.
A set is a collection of unique values. If we try inserting duplicate values, the values are discarded. For example:
scala> val s = Set(1, 1, 2, 3, 4, 4, 2, 3) s: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4) scala> println(s) Set(1, 2, 3, 4) scala> s + 4 res9: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4) scala> s + 5 res10: scala.collection.immutable.Set[Int] = Set(5, 1, 2, 3, 4) We have got a new set. The old set is still as before. scala> s res11: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)
We consider how to implement a set of numbers using BSTs. Note the terminating sentinel node and both its left and right arms are pointing at itself. Here it is depicted pictorially:
Figure 2.1: A Binary tree
First, an illustrative unit test; using those Hamcrest beauty matchers—the contains helper method is statically imported—makes sure our INORDER expectation is not violated.
The following test case adds some numbers—the insert call creates a tree. We performed an INORDER traversal on the BST. An INORDER traversal gets us the values in a sorted order:
package com.packt.chapter02; import static org.hamcrest.Matchers.contains; import static org.junit.Assert.assertThat; import java.util.List; import org.junit.Test; import com.google.common.collect.Lists; public class NodeTest { @Test public void testTraversal() { Node tree = Node.createTreeWithRoot(72); tree.insert(50); tree.insert(25); tree.insert(65); tree.insert(95); tree.insert(88); tree.insert(112); List<Integer> list = Lists.newArrayList(); tree.traverseInOrder(list); assertThat(list, contains(25, 50, 65, 72, 88, 95, 112)); } }
The implementation of the tree algorithms implements the tree data structure. The tree is terminated with a sentinel node. A Node is a binary search tree node. Our tree represents a set of numbers. Each tree node contains a number. The tree is formed by linking up child nodes. All values in the left sub-tree are less than the node value. Values in the right sub-tree are greater. We don't allow for duplicates as essentially our tree represents a set of numbers.
The sentinel node is represented by the nullNode
. Note that it is never exposed to the outside world.
The Node
class encapsulates the NullNode
, which is a Null Object:
import java.util.List; public class Node { private static final Node nullNode = new NullNode(); private final int val; protected Node left; protected Node right; public Node(final int val) { this.val = val; this.left = nullNode; this.right = nullNode; } public void insert(final int i) { final Node node = new Node(i); insert(node); } ; public void insert(final Node n) { if (n.val < val) { left.insertLeft(n, this); } else if (n.val > val) { right.insertRight(n, this); } // else we found the value – do nothing, we // already have it } protected void insertLeft(final Node n, final Node parent) { insert(n); } protected void insertRight(final Node n, final Node parent) { insert(n); } public void traverseInOrder(final List<Integer> list) { left.traverseInOrder(list); list.add(val); right.traverseInOrder(list); } private static class NullNode extends Node { public NullNode() { super(Integer.MAX_VALUE); // this.left = this.right = this; } @Override public void traverseInOrder(final List<Integer> list) { // do nothing } @Override protected void insertLeft(final Node n, final Node parent) { parent.left = n; // Attach new node as // left child of parent } @Override protected void insertRight(final Node n, final Node parent) { parent.right = n; // Attach new node as right // child of parent } } // factory method to create the root node – and get // the ball rolling public static Node createTreeWithRoot(final int val) { return new Node(val); } }
The big idea is to homogenize—a missing value is still a value, just a special value accessible only to the Node
class, just like a singleton.
The NullNode
is a Null Object. Look at the traverseInOrder()
method of NullNode
, it does nothing and so it terminates the recursion. We also override the insertLeft(...)
and insertRight(...)
methods—these just link up the child with its parent and thereby help grow the tree. Here it is depicted pictorially:
Figure 2.2: Binary Search Tree terminated with a sentinel node
How many NullNodes
do you see? Note that our NullNode
has no state of its own, it is a do nothing object—a place-holder! How many would we ever need? Just one would suffice! By design we only ever want one instance—we should prohibit creating more than one instance. Our null node is a singleton.
What value should the null node hold? Anything would do, as we don't use it. However, we must initialize it as it is final in the super class. We use the maximum number as a reminder to serve us that this is really a sentinel node. Its value does not have any business importance though.
Null Objects – the Scala way
Instead of using null to indicate missing values, Scala provides an Option
class to represent optional values. Option
can hold either Some
value or None
value. None
indicates missing values. Please see http://www.scala-lang.org/api/current/index.html#scala.Option.
So instead of nulls, we return None
. To see why, try the following code snippets in REPL:
scala> val m = List(Some("on"), None, Some("us"), None) m: List[Option[String]] = List(Some(on), None, Some(us), None) scala> for { | Some(p) <- m | } println(p) on us scala> val m = List(Some("on"), None, Some("us"), None) m: List[Option[String]] = List(Some(on), None, Some(us), None) scala> val p = m.flatten p: List[String] = List(on, us)
See all the None
getting filtered out? Let's continue further:
scala> val p = Some("one") p: Some[String] = Some(one) scala> val q = None q: None.type = None scala> val r = Option(null) r: Option[Null] = None scala> p.foreach(println(_)) // The _ is bound to the current element one scala> q.foreach(println(_)) scala> r.foreach(println(_))
How come the flatten
and foreach
calls are used with Options? These make sense for a container. For example, a List
:
scala> val p = List(List(1,2), List(3,4)) p: List[List[Int]] = List(List(1, 2), List(3, 4)) scala> p.flatten res0: List[Int] = List(1, 2, 3, 4)
Aha! These work for Options, too—as Options are containers holding zero or one object.
The map and flatMap operations are defined on Options. For Options holding one value that is. For a Some
, these invoke the associated function. For a None
, they do nothing. Let's check the following command lines:
scala> def m(n : Option[Int]) = n.map(_ + 1) // Options provide the map operation m: (n: Option[Int])Option[Int] scala> val c1 = Some(10) c1: Some[Int] = Some(10) scala> val c2 = None c2: None.type = None scala> m(c1) res19: Option[Int] = Some(11) scala> m(c2) res20: Option[Int] = None
So the map operation, when invoked over a None
, simply did nothing:
scala> def m(n : Int) = if (n >= 0 && n < 10) Some(n) else None m: (n: Int)Option[Int] scala> m(10) foreach( x => println (x+1 )) scala> m(5) foreach( x => println (x+1) )
The foreach
increments and prints the number, if there is a Some. For a None
, it does nothing.
Options are container
Let's put all this together to write the Scala version of the sentinel-based tree node. Let's first look at the picture:
Figure 2.3: Option as Null Object
Here is the Scala code. (Note that our TreeNode
is now a case class, with left and right fields as Option[TreeNode]
. We also use ListBuffer
instead of List
):
object Tree extends App { case class TreeNode(v: Int, left: Option[TreeNode] = None, right: Option[TreeNode] = None) { def add(av: Int) = { insert(av).get } def insert(av: Int): Option[TreeNode] = { def insertLeft(av: Int) = left.flatMap(_.insert(av)) orElse Option(TreeNode(av)) // 1 def insertRight(av: Int) = right.flatMap(_.insert(av)) orElse Option(TreeNode(av)) // 2 av.compare(v) match { case 0 => Option(this) // 3 case -1 => Option(TreeNode(v, insertLeft(av), right)) // 4 case 1 => Option(TreeNode(v, left, insertRight (av))) // 5 } } def traverseItInOrder(): Option[List[TreeNode]] = { val leftList = left.flatMap(x => x .traverseItInOrder()).getOrElse(Nil) val rightList = right.flatMap(x => x .traverseItInOrder()). getOrElse(Nil) val result = (leftList :+ this) ++ rightList Option(result) } def traverseInOrder() = { val result: Option[List[TreeNode]] = traverseItInOrder() result.getOrElse(Nil) } } val p = TreeNode(4).add(3).add(0).add(1).add(99). add(1).add(4) for { q <- p.traverseInOrder() // 5 } println(q.v) }
Our TreeNode
is a case class. The left and right child references are of type Option[TreeNode]
. There is only one constructor that sets the left and right children to a None
.
There is an insert
method that we can use to add a value to the tree. As earlier, we check the argument and if it is less than this .val
we try inserting it into the left arm otherwise in the right arm.
Note
The left
can be either an Some[TreeNode]
or None
.
Let's try to understand the flatMap
call:
left.flatMap(_.insert(av)) orElse Option(TreeNode(av))
This form uses infix notation for method invocation. What does it mean?
Scala provides a special invocation syntax when we have a method with one argument. The following method call:
o.m(arg)
can be written as:
o m arg
scala> val list = List.fill(5)(1) // we create a list of repeated 1 list: List[Int] = List(1, 1, 1, 1, 1) scala> list.mkString("-") // invocation res30: String = 1-1-1-1-1 scala> list mkString "-" // infix notation res31: String = 1-1-1-1-1
Applying the same to our example:
scala> None.orElse(Some(10)) // invoking the orElse method on a None res32: Option[Int] = Some(10) scala> None orElse Some(10) // the infix notation res33: Option[Int] = Some(10) scala> Some(10) orElse Some(20) res34: Option[Int] = Some(10)
Please see http://docs.scala-lang.org/style/method-invocation.html.
Let's try the following example:
scala> val left = None left: None.type = None scala> val right = Option("Hi") right: Option[String] = Some(Hi) scala> left orElse right res7: Option[String] = Some(Hi)
In the left.flatMap(_.insert(av))
expression, if left
is Some(TreeNode)
it reaches into the Some,
which pulls out the TreeNode
reference and invokes insert on it.
On the other hand, if left
is None
, the entire expression results in a None
or else kicks into action—a new TreeNode
is created and put into an Option
and returned.
The salient points for the preceding code are as follows:
- Is symmetric—instead of left, it works on the right child reference.
- Works on the same principle—now for traversal, so if
left
is aNone
, nothing happens else we recursively traverse the left sub-tree. - Same as case 2, if
right
is aNone
, nothing happens else we recursively traverse the right sub-tree.