Quick introduction to `map()` and `apply()`

Published

May 14, 2025

Quick comparison: apply() vs. purrr::map()

Both apply() and map() let you apply functions to data without writing loops, but they work differently:

- apply(): Base R function for matrices and data frames
- map(): tidyverse function for vectors and lists

Example 1: Calculate Mean of Each Column

# Create sample data
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(10, 20, 30, 40, 50),
  c = c(100, 200, 300, 400, 500)
)

df

Using apply()

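apply() is one member of the larger apply family in base R (alongside lapply(), sapply(), and others). It works on matrices and arrays; a data frame passed to it is first converted to a matrix.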

# MARGIN = 2 sets this to be column-wise
apply(X = df, MARGIN = 2, FUN = mean)

  a   b   c 
  3  30 300

# compare to MARGIN = 1
apply(X = df, MARGIN = 1, FUN = mean)

[1]  37  74 111 148 185
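To make the role of MARGIN concrete, here is a small sketch comparing both calls with the base R shortcuts colMeans() and rowMeans(). It rebuilds the same df so it runs on its own; the names col_means and row_means are only for illustration.

```r
# Same sample data as above
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(10, 20, 30, 40, 50),
  c = c(100, 200, 300, 400, 500)
)

# MARGIN = 2 applies the function to each column
col_means <- apply(X = df, MARGIN = 2, FUN = mean)

# MARGIN = 1 applies the function to each row
row_means <- apply(X = df, MARGIN = 1, FUN = mean)

# colMeans() and rowMeans() compute the same values
# (unname() drops names so only the numbers are compared)
all.equal(unname(col_means), unname(colMeans(df)))
all.equal(unname(row_means), unname(rowMeans(df)))
```

For plain means and sums, the dedicated functions are usually the simpler choice; apply() becomes useful when the function is something they do not cover.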

We get an error when using apply() on a single column, because df$a is a plain vector with no dimensions:

apply(X = df$a, MARGIN = 2, FUN = mean)

Error in apply(X = df$a, MARGIN = 2, FUN = mean): dim(X) must have a positive length
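One possible way around this error, sketched below, is either to call the function on the vector directly or to keep the column two-dimensional with drop = FALSE. The variable name one_col is hypothetical.

```r
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(10, 20, 30, 40, 50),
  c = c(100, 200, 300, 400, 500)
)

# df$a is a plain vector: it has no dim attribute, which is why apply() fails
dim(df$a)

# Option 1: call the function on the vector directly
mean(df$a)

# Option 2: keep the column as a one-column data frame so apply() still works
one_col <- df[, "a", drop = FALSE]
apply(X = one_col, MARGIN = 2, FUN = mean)
```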

Using map()

library(purrr)

We can accomplish similar things with map(), from the purrr package.

The different variants of map() let us specify the type of value we want returned. For example, map_dbl() requires each element being iterated over to produce a single double and returns the results as a numeric vector:

Calculate mean of all the columns using map_dbl()

map_dbl(df, mean)

  a   b   c 
  3  30 300
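As a rough illustration of the other typed variants, the sketch below uses map_chr(), map_int(), and map_lgl() on the same data frame. The result names (col_classes and so on) are made up for this example.

```r
library(purrr)

df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(10, 20, 30, 40, 50),
  c = c(100, 200, 300, 400, 500)
)

# map_chr(): one character string per column
col_classes <- map_chr(df, class)

# map_int(): one integer per column
col_lengths <- map_int(df, length)

# map_lgl(): one logical per column
col_is_numeric <- map_lgl(df, is.numeric)

col_classes
col_lengths
col_is_numeric
```

Each variant returns a named atomic vector of the stated type, which is often easier to work with downstream than a list.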

What happens if we use map() by itself? It returns a different structure:

map(df, mean)

$a
[1] 3

$b
[1] 30

$c
[1] 300

map() always returns a list, so each element can hold a complex object rather than a single value.

test <- map(df, mean)

We can see that it returns a list:

str(test)

List of 3
 $ a: num 3
 $ b: num 30
 $ c: num 300
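To show what returning something richer than a single number can look like, here is a minimal sketch where each list element is a length-2 vector from range(). The names ranges and col_max are only illustrative.

```r
library(purrr)

df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(10, 20, 30, 40, 50),
  c = c(100, 200, 300, 400, 500)
)

# range() returns two values (min and max), so a list is the natural container
ranges <- map(df, range)
str(ranges)

# Pull the second value (the max) out of each element as a numeric vector
col_max <- map_dbl(ranges, 2)
col_max
```

The same pattern applies to larger objects such as model fits: keep them in a list with map(), then extract the pieces you need with a typed variant.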

More complex functions

# create a list of vectors
my_list <- list(
  x = 1:9,
  y = 10:19,
  z = 20:42
)

Create a simple function:

# What does this function do? 
get_stats <- function(x) {
  c(min = min(x), mean = mean(x), max = max(x))
}

get_stats(c(1,2,3))

 min mean  max 
   1    2    3

We can use lapply(), the list version of apply(), to apply the function to each element of our list:

lapply(my_list, get_stats)

$x
 min mean  max 
   1    5    9 

$y
 min mean  max 
10.0 14.5 19.0 

$z
 min mean  max 
  20   31   42

map() handles lists directly, so the same call works without switching to a different function:

# Using map()
map(my_list, get_stats)

$x
 min mean  max 
   1    5    9 

$y
 min mean  max 
10.0 14.5 19.0 

$z
 min mean  max 
  20   31   42
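If a table is more convenient than a list, one option in base R is sapply() or vapply(), which simplify same-length results into a matrix. This is a sketch of that alternative, not part of the comparison above; stats_matrix and stats_checked are hypothetical names.

```r
my_list <- list(
  x = 1:9,
  y = 10:19,
  z = 20:42
)

get_stats <- function(x) {
  c(min = min(x), mean = mean(x), max = max(x))
}

# sapply() simplifies the list of named vectors into a 3 x 3 matrix:
# one row per statistic, one column per list element
stats_matrix <- sapply(my_list, get_stats)
stats_matrix

# vapply() does the same but checks that every result is a numeric vector of length 3
stats_checked <- vapply(my_list, get_stats, FUN.VALUE = numeric(3))
stats_checked["mean", "y"]
```

vapply() is the stricter choice: it fails early if one element returns a result of a different type or length.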

`map2()`

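map2() works like map(), but iterates over two inputs in parallel, passing the matching elements of each to the function as .x and .y. With this, we can run a binomial test for each pair of successes and trials: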

# create data for binomial test
set.seed(42)
v1 <- sample(x = seq(0, 25), 10)
v2 <- rep(50, 10)

map2(v1, v2, ~ binom.test(.x, .y, p = .5))

[[1]]

    Exact binomial test

data:  .x and .y
number of successes = 16, number of trials = 50, p-value = 0.01535
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.1952042 0.4669938
sample estimates:
probability of success 
                  0.32 


[[2]]

    Exact binomial test

data:  .x and .y
number of successes = 4, number of trials = 50, p-value = 4.462e-10
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.02222796 0.19234278
sample estimates:
probability of success 
                  0.08 


[[3]]

    Exact binomial test

data:  .x and .y
number of successes = 0, number of trials = 50, p-value = 1.776e-15
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.00000000 0.07112174
sample estimates:
probability of success 
                     0 


[[4]]

    Exact binomial test

data:  .x and .y
number of successes = 9, number of trials = 50, p-value = 5.614e-06
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.08576208 0.31436941
sample estimates:
probability of success 
                  0.18 


[[5]]

    Exact binomial test

data:  .x and .y
number of successes = 3, number of trials = 50, p-value = 3.708e-11
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.01254859 0.16548195
sample estimates:
probability of success 
                  0.06 


[[6]]

    Exact binomial test

data:  .x and .y
number of successes = 17, number of trials = 50, p-value = 0.03284
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.2120547 0.4876525
sample estimates:
probability of success 
                  0.34 


[[7]]

    Exact binomial test

data:  .x and .y
number of successes = 25, number of trials = 50, p-value = 1
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.355273 0.644727
sample estimates:
probability of success 
                   0.5 


[[8]]

    Exact binomial test

data:  .x and .y
number of successes = 14, number of trials = 50, p-value = 0.002602
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.1623106 0.4249054
sample estimates:
probability of success 
                  0.28 


[[9]]

    Exact binomial test

data:  .x and .y
number of successes = 6, number of trials = 50, p-value = 3.244e-08
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.04533532 0.24310132
sample estimates:
probability of success 
                  0.12 


[[10]]

    Exact binomial test

data:  .x and .y
number of successes = 21, number of trials = 50, p-value = 0.3222
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.2818822 0.5679396
sample estimates:
probability of success 
                  0.42
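The full test objects are verbose to read. If only the p-values are of interest, a possible shortcut is map2_dbl(), which returns a numeric vector instead of a list. The sketch below assumes the same v1 and v2 as above; p_values and p_values_base are illustrative names, and the mapply() call is shown only as a base R comparison.

```r
library(purrr)

# Same data as above
set.seed(42)
v1 <- sample(x = seq(0, 25), 10)
v2 <- rep(50, 10)

# Keep only the p-value from each test; map2_dbl() returns a numeric vector
p_values <- map2_dbl(v1, v2, ~ binom.test(.x, .y, p = 0.5)$p.value)
p_values

# Base R equivalent using mapply()
p_values_base <- mapply(
  function(x, n) binom.test(x, n, p = 0.5)$p.value,
  v1, v2
)

all.equal(p_values, p_values_base)
```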