An Introduction to Map, Filter, Reduce

Higher order functions are a powerful tool for functional programmers. The idea of higher order functions/functions as data has given rise to three especially useful functions. Enter map, filter, and reduce.

Map: Transform Data

Input: Colletion and unary function
Output: A collection, equal in size to the original collection
Uses: One-to-one transformation of a collection

In Clojure, map is used to transform each element in a collection. Map takes a unary (one argument) function and a collection as input.

Given a unary function, f, and a collection C = [v0, v1, ..., vN], (map f C) will return [(f v0), (f v1), ..., (f vN)].

For instance, lets pretend we have to apply a curve to exam grades,

(def grades [72 68 52 70 81 63])

(defn curve
  [grades extra-points]
  (map #(+ extra-points %) grades))

(curve grades 10)
;;=> [82 78 62 80 91 73]

; Awful color scheme

Filter: Filter Data

Input: Collection and predicate function
Output: A subset of the original collection
Uses: To remove unwanted elements of a collection

Filtering a collection is a common operation in any kind of programming. Clojure (any many other languages) provide to you a function, filter, to do just this. As inputs, you provide a predicate function and a collection. The return will always be a subset of the original collection.

Lets say we have a list of names but we only want names starting with “S”,

(def names ["Andy" "Jimmy" "Sarah" "Sam" "Zach"])

(def s-names
  (filter #(clojure.string/starts-with? names "S")))
  
(prn s-names)
;;=> ["Sarah" "Sam"]

Filter is very straight forward in what it’s intended use is and how it’s used. As long as you understand the idea of higher order functions, filter comes easy.

Reduce: Accumulate Data

Input: Collection and combining function
Output: Recombination of original data
Uses: To Aggregate, accumulate, or otherwise transform data

Reduce, also known as fold, is primarily used to recursively recombine a collection of data. Reduce is the most flexible of of the three functions in this article. In fact, map and filter can be defined in terms of reduce.

(def my-map
  [f col]
  (reduce (fn [acc v]
            (conj acc (f v)))
          []
          col))

(defn refilter 
  [f col]
  (reduce (fn [acc v]
            (if (f v)
              (conj acc v)
              acc))
          []
          col))

If you have a keen eye for laziness, you’ll notice that something is off. While map and filter are both known to be lazy, reduce isn’t. Our new my-map and my-filter functions are both eager. This isn’t necessarily a bad thing in all problem domains. However, since map and filter are typically used on large data sets, laziness is good. I’ll explain why in a later post.

But what do we do with this thing?

(defn my-sum
  [nums]
  (reduce + nums))

(def costs [100.10 45.70 255.45])

(my-sum costs)
;;=> 401.25

Fun fact, this is how + is defined in Clojure This is actually how many mathematical, collection related functions are defined. (i.e. +, -, *, max, min, etc…) Not quite this simplistic, but kind of close.

We can also use reduce to

Review

Map, filter, and reduce are staple, higher order functions.
Reduce can be used to define a wide variety of data related functions.
Map excels at surjective (onto) transformations.
Filter is a very useful when you need to refine a set of data based on a predicate like odd?.

Written on January 18, 2017