Thursday, May 27, 2010

zipWithIndex

A common desire is to have access to the index of an element when using collection methods like foreach, filter, foldLeft/Right, etc... Fortunately there is a simple way.

List('a','b','c','d').zipWithIndex.

But wait!

Does that not trigger an extra iteration through the collection?. Indeed it does and that is where Views help.

List('a','b','c','d').view.zipWithIndex

When using a view the collection is only traversed when required so there is no performance loss.

Here are some examples of zipWithIndex:
  1. scala> val list = List('a','b','c','d')
  2. list: List[Char] = List(a, b, c, d)
  3. /*
  4. I like to use functions constructed with case statements
  5. in order to clearly label the index.  The alternative is 
  6. to use x._2 for the index and x._1 for the value
  7. */
  8. scala> list.view.zipWithIndex foreach {case (value,index) => println(value,index)}
  9. (a,0)
  10. (b,1)
  11. (c,2)
  12. (d,3)
  13. // alternative syntax without case statement
  14. scala> list.view.zipWithIndex foreach {e => println(e._1,e._2)}
  15. (a,0)
  16. (b,1)
  17. (c,2)
  18. (d,3)
  19. /*
  20. Fold left and right functions have 2 parameters (accumulator, nextValue) 
  21. using a case statement allows you to expand that but watch the brackets!
  22. */
  23. scala> (list.view.zipWithIndex foldLeft 0) {case (acc,(value,index)) => acc + value.toInt + index} 
  24. res14: Int = 400
  25. // alternative syntax without case statement
  26. scala> (list.view.zipWithIndex foldLeft 0) {(acc,e) => acc + e._1.toInt + e._2} 
  27. res23: Int = 400
  28. /*
  29. alternative foldLeft operator.  The thing I like about this
  30. syntax is that it has the initial accumulator value on the left 
  31. in the same position as the accumulator parameter in the function.
  32. The other thing I like about it is that visually you can see that it starts with
  33. "" and the folds left
  34. */
  35. scala> ("" /: list.view.zipWithIndex) {                          
  36.      | case (acc, (value, index)) if index % 2 == 0 => acc + value
  37.      | case (acc, _) => acc                                       
  38.      | }
  39. res15: java.lang.String = ac
  40. /*
  41. This example filters based on the index then uses map to remove the index
  42. force simply forces the view to be processed.  (I love these collections!)
  43. */
  44. scala> list.view.zipWithIndex.filter { _._2 % 2 == 0 }.map { _._1}.force
  45. res29: Seq[Char] = List(a, c)

Wednesday, May 26, 2010

Return value of a block

A common misunderstanding is that a code block (without parameters) is a function. That is not the case. A code block is a sequence of statements that are executed and result the last statement is returned. That sounds like a Function0, however, if the block is passed to a method/function only the last statement will be returned to the function/method. If that method/function expects a function as the parameter the last statement maybe returned as a function not a value, this means that the block itself is not a function.
  1. scala> var count = 0                                                                                                                         
  2. count: Int = 0
  3. // the last statement is returned as a function so count
  4. // is incremented only one during the creation of the function
  5. scala> List(1,2,3,4).map{count += 1;_ + 1}
  6. res9: List[Int] = List(2, 3, 4, 5)
  7. scala> count
  8. res10: Int = 1
  9. // now the count increment is within the function
  10. scala> List(1,2,3,4).map{i => count += 1;i + 1}
  11. res11: List[Int] = List(2, 3, 4, 5)
  12. scala> count
  13. res12: Int = 5

The previous example demonstrates a Gotcha if I ever saw one. Map expects a function so the block essentially constructs a function. The last statement being the function. The first line count += 1 executed only once because it is part of creating the function not part of the resulting function. This is equivalent to:
  1. scala> val x = {count += 1 ; i:Int => i +1}
  2. x: (Int) => Int = < function1>
  3. scala> List(1,2,3,4).map(x)
  4. res15: List[Int] = List(2, 3, 4, 5)

Beginning a block with the parameter list signals that the entire block is a function.

Rule of thumb: Functions with placeholder parameters should be a single statement.

Thursday, May 20, 2010

Type Inference with Abstract Types

A second "gotcha" that one might get tripped up when dealing with abstract types is the signature of the concrete class contains type information about the abstract type. So if you are not explicit when assigning a variable or defining a function you can get unexpected compiler errors.
  1. scala> trait S {
  2.      |   type x
  3.      |   def get : x
  4.      | }
  5. defined trait S
  6. scala> var sample = new S{ 
  7.      |   type x = Int
  8.      |   def get = 3
  9.      | }
  10. sample: java.lang.Object with S{type x = Int} = $anon$1@397af435
  11. scala> sample = new S {
  12.      |   type x = Double
  13.      |   def get = 3.0
  14.      | }
  15. < console>:7: error: type mismatch;
  16.  found   : java.lang.Object with S
  17.  required: java.lang.Object with S{type x = Int}
  18.        sample = new S {

In this example sample uses type inference so the actual type is S with underlying type Int. The consequence is that sample can only be assigned with instances of S with type x = Int. The fix is to explicitly declare the variable type:
  1. scala> var sample2 : S = new S{ 
  2.      |   type x = Int
  3.      |   def get = 3
  4.      | }
  5. sample2: S = $anon$1@31602bbc
  6. scala> sample2 = new S {
  7.      |   type x = Double
  8.      |   def get = 3.0
  9.      | }
  10. sample2: S = $anon$1@4de5ed7b

The same thing happens when declaring functions and allows type inference for function definition
  1. scala> class Fac {
  2.      |   def newS = new S {
  3.      |     type x = Int
  4.      |     def get = 3
  5.      |   }
  6.      | }
  7. defined class Fac
  8. scala> class SubFac extends Fac{
  9.      |   override def newS = new S {
  10.      |     type x = Double
  11.      |     def get = 3.0
  12.      |   }
  13.      | }
  14. < console>:8: error: type mismatch;
  15.  found   : java.lang.Object with S
  16.  required: java.lang.Object with S{type x = Int}
  17.          override def newS = new S {

The fix for this example is to be explicit in the definition of the function in the superclass

Monday, May 17, 2010

Instance Type (Abstract type gotcha 1)

In a previous post about abstract types I showed one of the benefits of using abstract types over parameterized types. Abstract Types vs Parameter. The next several posts will feature potential problems you may encounter when using Abstract Types.

I should point out that abstract types are not inherently difficult to understand but they are rather different from anything you will see when you come from the Java world so if you are new to them I would use them with caution at first.

In the abstract types example you will notice that the abstract type 'I' in Foreach is not within the trait Source rather it is outside in the Foreach trait. At first one might consider putting the type in Source rather than Foreach. The naive change can get you in trouble (but there is a couple easy fixes)
  1. trait Foreach[A] {
  2.   trait Source {
  3.     type I <: java.io.Closeable  // moved this line into Source
  4.     def in : I
  5.     def next(in : I) : Option[A]
  6.   }
  7.   def source : Source
  8.   
  9.   def foreach[U](f : A => U) : Unit = {
  10.     val s = source.in
  11.     try {
  12.       def processNext : Unit = source.next(s) match {
  13.         case None => 
  14.           ()
  15.         case Some(value) => 
  16.           f(value)
  17.           processNext
  18.       }
  19.       
  20.       processNext
  21.     } finally {
  22.       // correctly handle exceptions
  23.       s.close
  24.     }
  25.   }
  26. }

Compiling the class results in a compilation error:

jeichar: tmp$ scalac XX.scala
XX.scala:12: error: type mismatch;
found : s.type (with underlying type Foreach.this.Source#I)
required: _2.I where val _2: Foreach.this.Source
def processNext : Unit = source.next(s) match {
^
XX.scala:16: error: type mismatch;
found : value.type (with underlying type Any)
required: A
f(value)
^
two errors found

So what is the problem? The problem is simple but subtle. Notice that source is defined as a def. So calling source 2 times may return 2 different instances of Source. A simple change can fix this. Either change def source : Source to val source : Source. Or change the method foreach to assign the result from source to a val.
  1. trait Foreach {
  2.   trait Source {
  3.     type I <: java.io.Closeable  // moved this line into Source
  4.     def in : I
  5.     def next(in : I) : Option[Int]
  6.   }
  7.   def source : Source
  8.   
  9.   def foreach[U](f : Int => U) : Unit = {
  10.     // this assignment allows this example to compile
  11.     val sameSource = source
  12.     val s = sameSource.in
  13.     try {
  14.       def processNext : Unit = sameSource.next(s) match {
  15.         case None => 
  16.           ()
  17.         case Some(value) => 
  18.           f(value)
  19.           processNext
  20.       }
  21.       
  22.       processNext
  23.     } finally {
  24.       // correctly handle exceptions
  25.       s.close
  26.     }
  27.   }
  28. }

Monday, May 3, 2010

Abstract Types vs Parameter

This topic (and the next) are intended to discuss abstract types. A class/trait with an abstract type is quite similar to a class/trait type parameter. For example:

  1. trait C[A] {
  2.   def get : A
  3.   def doit(a:A):A
  4. }
  5. trait C2 {
  6.   type A
  7.   def get : A
  8.   def doit(a:A):A
  9. }

Both implementations have similar properties. However they are NOT the same. At first I thought that I could used them inter-changeably. However, consider the following examples:

  1. //compiles
  2. def p(c:C[Int]) = c.doit(c.get)
  3. // doesn't compile
  4. def p2(c:C2) = c.doit(c.get)

So why doesn't p2 compile? Because it returns A. From the signature of p2 it is impossible to know what p2 returns. There are several ways to fix this problem. One make the method return Unit:

  1. // compiles because the internals of C2 does not leak out
  2. def p(c:C2):Unit = c.doit(c.get)

Another fix would be to change doit to return Unit or an explicit return value like Int

  1. trait C2 {
  2.   type A
  3.   def get : A
  4.   def doit(a:A):Int
  5. }
  6. // compiles correctly
  7. def p(c:C2) = c.doit(c.get)

A second difference between parameterized types and types with abstract type values is illustrated below:

  1. trait C2 {
  2.    type A
  3.    def get : A
  4. }
  5. scala> var c : C2 = new C2 { 
  6.      | type A = Int
  7.      | def get = 3
  8.      | }
  9. c: C2 = $anon$1@11a40fff
  10. // what is the type of result if at compile time the
  11. // value of c is not known
  12. scala> var result = c.get
  13. result: C2#A = 3
  14. scala> c = new C2 {      
  15.      |    type A = String
  16.      |    def get = "hi"
  17.      | }
  18. c: C2 = $anon$1@5f154718
  19. // crazy eh :) the variable can be anything but does not
  20. // have type Any so you cannot assign arbitrary values
  21. scala> result = c.get
  22. result: C2#A = hi
  23. scala> result.isInstanceOf[String]
  24. res0: Boolean = true
  25. // while the dynamic type of result is a string the
  26. // static type is not so you cannot assign a string to result
  27. scala> result = "4"
  28. < console> :8: error: type mismatch;
  29.  found   : java.lang.String("4")
  30.  required: C2#A
  31.        result = "4"
  32.                 ^

The obvious question is what use are abstract types. I don't claim to know them all but the main point is that they do not expose the internal implementation details to the world. The famous cake pattern is one such example usage of abstract types.

I read the following as well (wish I could remember where):

Abstract types are good when extending and there will be concrete subclasses. Param type good for when a type is useful without extension but can handle several types.

A simpler example is examined here. It is loosely based on a real world usecase.
The example below is contrived so that it is smaller than the actual usecase, so consider the design and not the fact that the example could be easier done with other examples. In the real scenario this design reduced the lines of duplicated code from around 500 to 10.

The example below shows how a Traversable like object can be created from InputStreams and Readers. The important aspect is that the type signature of Foreach does not leak information about the implementation. Users of a Foreach object don't care whether it is backed onto an InputStream or Reader. They just care about the type of object contained.

I am leaving this already long post here. The next post will investigate different ways you can get in trouble trying to implement using abstract types.



  1. import java.io.{InputStream, Reader, ByteArrayInputStream, StringReader}
  2. import java.net.URL
  3. object Foreach {
  4.   def fromStream(s: => InputStream) = new Foreach[Int] {
  5.     type I = InputStream
  6.     def source = new Source {
  7.       def in = s
  8.       def next(_in : InputStream) = _in.read match {
  9.         case -1 => None
  10.         case i => Some(i)
  11.       }
  12.     }
  13.   }
  14.   
  15.   def fromReader(s: => Reader) = new Foreach[Char] {
  16.     type I = Reader
  17.     def source = new Source {
  18.       def in = s
  19.       def next(_in : Reader) = _in.read match {
  20.         case -1 => None
  21.         case i => Some(i.toChar)
  22.       }
  23.     }
  24.   }
  25.   
  26.   
  27.   def fromInputAndFunction[A](s: => InputStream, f: Int => A) = new Foreach[A] {
  28.     type I = InputStream
  29.     def source = new Source {
  30.       def in = s
  31.       def next(_in : InputStream) = _in.read match {
  32.         case -1 => None
  33.         case i => Some(f(i))
  34.       }
  35.     }
  36.   }
  37.   
  38.   
  39. }
  40. trait Foreach[A] {
  41.   type I <: java.io.Closeable
  42.   trait Source {
  43.     def in : I
  44.     def next(in : I) : Option[A]
  45.   }
  46.   def source : Source
  47.   
  48.   def foreach[U](f : A => U) : Unit = {
  49.     val s = source.in
  50.     try {
  51.       def processNext : Unit = source.next(s) match {
  52.         case None => 
  53.           ()
  54.         case Some(value) => 
  55.           f(value)
  56.           processNext
  57.       }
  58.       
  59.       processNext
  60.     } finally {
  61.       // correctly handle exceptions
  62.       s.close
  63.     }
  64.   }
  65. }
  66. object Test {
  67.   def main(args : Array[String]) = {
  68.     val data = "Hello World"
  69.     val bytes = data.toArray.map { _.toByte }
  70.     import Foreach._
  71.     fromStream(new ByteArrayInputStream(bytes)).foreach {a => print(a.toChar)}
  72.     
  73.     println
  74.     fromReader(new StringReader(data)) foreach print
  75.     
  76.     println
  77.     
  78.     fromInputAndFunction(new ByteArrayInputStream(bytes), i => i.toChar) foreach print
  79.     
  80.     println
  81.   }
  82. }