web/collections.textile (299 lines of code) (raw):
---
prev: basics2.textile
next: pattern-matching-and-functional-composition.textile
title: Collections
layout: post
---
This lesson covers:
* Basic Data Structures
** "Arrays":#Arrays
** "Lists":#Lists
** "Sets":#Sets
** "Tuple":#Tuple
** "Maps":#Maps
** "Option":#Option
* Functional Combinators
** "map":#map
** "foreach":#foreach
** "filter":#filter
** "zip":#zip
** "partition":#partition
** "find":#find
** "drop and dropWhile":#drop
** "foldRight and foldLeft":#fold
** "flatten":#flatten
** "flatMap":#flatMap
** "Generalized functional combinators":#generalized
** "Map?":#vsMap
h1. Basic Data Structures
Scala provides some nice collections.
*See Also* Effective Scala has opinions about how to use <a href="https://twitter.github.com/effectivescala/#Collections">collections</a>.
h2(#Arrays). Arrays
Arrays preserve order, can contain duplicates, and are mutable.
<pre>
scala> val numbers = Array(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
numbers: Array[Int] = Array(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
scala> numbers(3) = 10
</pre>
h2(#Lists). Lists
Lists preserve order, can contain duplicates, and are immutable.
<pre>
scala> val numbers = List(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
numbers: List[Int] = List(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
scala> numbers(3) = 10
<console>:9: error: value update is not a member of List[Int]
numbers(3) = 10
</pre>
h2(#Sets). Sets
Sets do not preserve order and have no duplicates
<pre>
scala> val numbers = Set(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
numbers: scala.collection.immutable.Set[Int] = Set(5, 1, 2, 3, 4)
</pre>
h2(#Tuple). Tuple
A tuple groups together simple logical collections of items without using a class.
<pre>
scala> val hostPort = ("localhost", 80)
hostPort: (String, Int) = (localhost, 80)
</pre>
Unlike case classes, they don't have named accessors, instead they have accessors that are named by their position and is 1-based rather than 0-based.
<pre>
scala> hostPort._1
res0: String = localhost
scala> hostPort._2
res1: Int = 80
</pre>
Tuples fit with pattern matching nicely.
<pre>
hostPort match {
case ("localhost", port) => ...
case (host, port) => ...
}
</pre>
Tuple has some special sauce for simply making Tuples of 2 values: <code>-></code>
<pre>
scala> 1 -> 2
res0: (Int, Int) = (1,2)
</pre>
*See Also* Effective Scala has opinions about <a href="https://twitter.github.com/effectivescala/#Functional programming-Destructuring bindings">destructuring bindings</a> ("unpacking" a tuple).
h2(#Maps). Maps
It can hold basic datatypes.
<pre>
Map(1 -> 2)
Map("foo" -> "bar")
</pre>
This looks like special syntax but remember back to our discussion of Tuple that <code>-></code> can be used to create Tuples.
Map() also uses that variable argument syntax we learned back in Lesson #1: <code>Map(1 -> "one", 2 -> "two")</code> which expands into <code>Map((1, "one"), (2, "two"))</code> with the first element being the key and the second being the value of the Map.
Maps can themselves contain Maps or even functions as values.
<pre>
Map(1 -> Map("foo" -> "bar"))
</pre>
<pre>
Map("timesTwo" -> { timesTwo(_) })
</pre>
h2(#Option). Option
<code>Option</code> is a container that may or may not hold something.
The basic interface for Option looks like:
<pre>
trait Option[T] {
def isDefined: Boolean
def get: T
def getOrElse(t: T): T
}
</pre>
Option itself is generic and has two subclasses: <code>Some[T]</code> or <code>None</code>
Let's look at an example of how Option is used:
<code>Map.get</code> uses <code>Option</code> for its return type. Option tells you that the method might not return what you're asking for.
<pre>
scala> val numbers = Map("one" -> 1, "two" -> 2)
numbers: scala.collection.immutable.Map[java.lang.String,Int] = Map(one -> 1, two -> 2)
scala> numbers.get("two")
res0: Option[Int] = Some(2)
scala> numbers.get("three")
res1: Option[Int] = None
</pre>
Now our data appears trapped in this <code>Option</code>. How do we work with it?
A first instinct might be to do something conditionally based on the <code>isDefined</code> method.
<pre>
// We want to multiply the number by two, otherwise return 0.
val result = if (res1.isDefined) {
res1.get * 2
} else {
0
}
</pre>
We would suggest that you use either <code>getOrElse</code> or pattern matching to work with this result.
<code>getOrElse</code> lets you easily define a default value.
<pre>
val result = res1.getOrElse(0) * 2
</pre>
Pattern matching fits naturally with <code>Option</code>.
<pre>
val result = res1 match {
case Some(n) => n * 2
case None => 0
}
</pre>
*See Also* Effective Scala has opinions about <a href="https://twitter.github.com/effectivescala/#Functional programming-Options">Options</a>.
h1(#combinators). Functional Combinators
<code>List(1, 2, 3) map squared</code> applies the function <code>squared</code> to the elements of the list, returning a new list, perhaps <code>List(1, 4, 9)</code>. We call operations like <code>map</code> <em>combinators</em>. (If you'd like a better definition, you might like <a href="https://stackoverflow.com/questions/7533837/explanation-of-combinators-for-the-working-man">Explanation of combinators</a> on Stackoverflow.) Their most common use is on the standard data structures.
h2(#map). map
Evaluates a function over each element in the list, returning a list with the same number of elements.
<pre>
scala> val numbers = List(1, 2, 3, 4)
numbers: List[Int] = List(1, 2, 3, 4)
scala> numbers.map((i: Int) => i * 2)
res0: List[Int] = List(2, 4, 6, 8)
</pre>
or pass in a function (the Scala compiler automatically converts our method to a function)
<pre>
scala> def timesTwo(i: Int): Int = i * 2
timesTwo: (i: Int)Int
scala> numbers.map(timesTwo)
res0: List[Int] = List(2, 4, 6, 8)
</pre>
h2(#foreach). foreach
foreach is like map but returns nothing. foreach is intended for side-effects only.
<pre>
scala> numbers.foreach((i: Int) => i * 2)
</pre>
returns nothing.
You can try to store the return in a value but it'll be of type Unit (i.e. void)
<pre>
scala> val doubled = numbers.foreach((i: Int) => i * 2)
doubled: Unit = ()
</pre>
h2(#filter). filter
removes any elements where the function you pass in evaluates to false. Functions that return a Boolean are often called predicate functions.
<pre>
scala> numbers.filter((i: Int) => i % 2 == 0)
res0: List[Int] = List(2, 4)
</pre>
<pre>
scala> def isEven(i: Int): Boolean = i % 2 == 0
isEven: (i: Int)Boolean
scala> numbers.filter(isEven)
res2: List[Int] = List(2, 4)
</pre>
h2(#zip). zip
zip aggregates the contents of two lists into a single list of pairs.
<pre>
scala> List(1, 2, 3).zip(List("a", "b", "c"))
res0: List[(Int, String)] = List((1,a), (2,b), (3,c))
</pre>
h2(#partition). partition
<code>partition</code> splits a list based on where it falls with respect to a predicate function.
<pre>
scala> val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> numbers.partition(_ % 2 == 0)
res0: (List[Int], List[Int]) = (List(2, 4, 6, 8, 10),List(1, 3, 5, 7, 9))
</pre>
h2(#find). find
find returns the first element of a collection that matches a predicate function.
<pre>
scala> numbers.find((i: Int) => i > 5)
res0: Option[Int] = Some(6)
</pre>
h2(#drop). drop & dropWhile
<code>drop</code> drops the first i elements
<pre>
scala> numbers.drop(5)
res0: List[Int] = List(6, 7, 8, 9, 10)
</pre>
<code>dropWhile</code> removes the first element that match a predicate function. For example, if we <code>dropWhile</code> odd numbers from our list of numbers, <code>1</code> gets dropped (but not <code>3</code> which is "shielded" by <code>2</code>).
<pre>
scala> numbers.dropWhile(_ % 2 != 0)
res0: List[Int] = List(2, 3, 4, 5, 6, 7, 8, 9, 10)
</pre>
h2(#fold). foldLeft
<pre>
scala> numbers.foldLeft(0)((m: Int, n: Int) => m + n)
res0: Int = 55
</pre>
0 is the starting value (Remember that numbers is a List[Int]), and m
acts as an accumulator.
Seen visually:
<pre>
scala> numbers.foldLeft(0) { (m: Int, n: Int) => println("m: " + m + " n: " + n); m + n }
m: 0 n: 1
m: 1 n: 2
m: 3 n: 3
m: 6 n: 4
m: 10 n: 5
m: 15 n: 6
m: 21 n: 7
m: 28 n: 8
m: 36 n: 9
m: 45 n: 10
res0: Int = 55
</pre>
h3. foldRight
Is the same as foldLeft except it runs in the opposite direction.
<pre>
scala> numbers.foldRight(0) { (m: Int, n: Int) => println("m: " + m + " n: " + n); m + n }
m: 10 n: 0
m: 9 n: 10
m: 8 n: 19
m: 7 n: 27
m: 6 n: 34
m: 5 n: 40
m: 4 n: 45
m: 3 n: 49
m: 2 n: 52
m: 1 n: 54
res0: Int = 55
</pre>
h2(#flatten). flatten
flatten collapses one level of nested structure.
<pre>
scala> List(List(1, 2), List(3, 4)).flatten
res0: List[Int] = List(1, 2, 3, 4)
</pre>
h2(#flatMap). flatMap
flatMap is a frequently used combinator that combines mapping and flattening. flatMap takes a function that works on the nested lists and then concatenates the results back together.
<pre>
scala> val nestedNumbers = List(List(1, 2), List(3, 4))
nestedNumbers: List[List[Int]] = List(List(1, 2), List(3, 4))
scala> nestedNumbers.flatMap(x => x.map(_ * 2))
res0: List[Int] = List(2, 4, 6, 8)
</pre>
Think of it as short-hand for mapping and then flattening:
<pre>
scala> nestedNumbers.map((x: List[Int]) => x.map(_ * 2)).flatten
res1: List[Int] = List(2, 4, 6, 8)
</pre>
that example calling map and then flatten is an example of the "combinator"-like nature of these functions.
*See Also* Effective Scala has opinions about <a href="https://twitter.github.com/effectivescala/#Functional programming-`flatMap`">flatMap</a>.
h2(#generalized). Generalized functional combinators
Now we've learned a grab-bag of functions for working with collections.
What we'd like is to be able to write our own functional combinators.
Interestingly, every functional combinator shown above can be written on top of fold. Let's see some examples.
<pre>
def ourMap(numbers: List[Int], fn: Int => Int): List[Int] = {
numbers.foldRight(List[Int]()) { (x: Int, xs: List[Int]) =>
fn(x) :: xs
}
}
scala> ourMap(numbers, timesTwo(_))
res0: List[Int] = List(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
</pre>
Why <tt>List[Int]()</tt>? Scala wasn't smart enough to realize that you wanted an empty list of Ints to accumulate into.
h2(#vsMap). Map?
All of the functional combinators shown work on Maps, too. Maps can be thought of as a list of pairs so the functions you write work on a pair of the keys and values in the Map.
<pre>
scala> val extensions = Map("steve" -> 100, "bob" -> 101, "joe" -> 201)
extensions: scala.collection.immutable.Map[String,Int] = Map((steve,100), (bob,101), (joe,201))
</pre>
Now filter out every entry whose phone extension is lower than 200.
<pre>
scala> extensions.filter((namePhone: (String, Int)) => namePhone._2 < 200)
res0: scala.collection.immutable.Map[String,Int] = Map((steve,100), (bob,101))
</pre>
Because it gives you a tuple, you have to pull out the keys and values with their positional accessors. Yuck!
Lucky us, we can actually use a pattern match to extract the key and value nicely.
<pre>
scala> extensions.filter({case (name, extension) => extension < 200})
res0: scala.collection.immutable.Map[String,Int] = Map((steve,100), (bob,101))
</pre>
Why does this work? Why can you pass in a partial pattern match?
Stay tuned for next week!