What are ABMs?

Agent-Based Models (ABMs) are computational simulations, in which individual actors (agents) interact with each other and their environment according to specified rules. The goal of ABMs is to understand how macro-level phenomena emerge from micro-level interactions.

Especially in the field of analytical sociology (mechanism based explaining), ABMs have emerged as a central tool Bruch and Atwell (2015).

ABMs can help precising and formalizing theories, by making assumptions explicit and by allowing to test the implications of these assumptions. ABMs can also be used to explore the consequences of different assumptions, and to identify which assumptions are most important for explaining a given phenomenon.

An ABM typically consists of the following components:

Agents: These are the individual actors in the model, which can represent people, organizations, or other entities. Agents have attributes (e.g., age, income) and behaviors (e.g., decision rules, interactions with other agents).
Environment: This is the context in which the agents operate, which can be a physical space (e.g., a grid) or a social network. The environment can also have its own dynamics (e.g., changes in resources, policies).
Rules: These are the decision rules that govern how agents interact with each other and with the environment. Rules can be simple (e.g., “move to a new location if unhappy”) or complex (e.g., “form a social tie with another agent if they have similar opinions”)

We will have a closer look at how ABMs work, by recreating Schellings Social Segregation Model (Schelling 1971).

Schelling’s Social Segregation Model

Code

library(needs)
needs(tidyverse,
      gridExtra)

The Schelling model of segregation, formulated by Thomas Schelling (1921–2016), who received the Nobel Prize in 2005, is an agent-based framework illustrating how even slight preferences for one’s social group can drive significant societal segregation.

This model delves into neighborhood segregation driven by socio-economic factors. While slight differences might exist within each neighborhood, substantial disparities emerge between neighborhoods, curtailing inter-area interactions. Income inequalities often underlie this segregation, propelling affluent individuals toward luxury renovated neighborhoods (gentrification) or suburban locales (suburbanization) in pursuit of specific community compositions.

The Schelling model is pertinent for studying social dynamics where individual actions hinge on others’ behavior, yielding self-reinforcing trends.

Additionally, the model showcases a thought experiment highlighting that segregation can arise without explicit preferences for homogeneous neighborhoods. The key assumptions are that individuals aim to avoid minority status in their neighborhood and neighborhoods necessitate at least a 50:50 resident ratio, prompting migration if unmet.

In essence, the Schelling model underscores how seemingly minor individual preferences can unintentionally yield substantial social segregation. It provides insightful perspectives on comprehending how intricate social dynamics mold neighborhoods, offering a framework to scrutinize unintended effects of social actions.

Initialization

The original model is depicted on a \(N x N\) grid, where each cell can be in one of the three states: uninhabited, inhabited by a member og group 1, or inhabited by a member of group 2. This can also be represented in a matrix, with each element being either 0,1, or 2.

Code

world_size   = 10          # the grid will be 10x10
world_empty  = 0.2         # 20% of the vertices are empty
pop_shares   = c(0.5, 0.5) # two groups with equal sizes

The code above sets the parameters of our model. We start with just a few components:

world_size determines the dimensions of the grid (10x10 in this case).
world_empty specifies the proportion of unoccupied cells in the grid (20%)
pop_shares defines two things: the amount of types of agents, that we have and their relative shares.

To ensure reproducibility, we set a seed for the random number generator. This allows us to generate the same random numbers each time we run the code, which is crucial for debugging and sharing results.

Code

set.seed(123)

For clean documentation, we can bundle these informations into a single list, which we can easily pass to functions later on. This also allows us to keep track of the parameters we used for each simulation run.

Code

params <- list(
  world_size = 25,
  world_empty = 0.1,
  pop_shares = c(0.5, 0.5),
  seed = 123
)

When we have these parameters, we can initialize the world.

Code

cells <- params$world_size^2
occupied <- floor((1 - params$world_empty) * cells)

group_vector <- c(
  rep(0, cells - occupied),
  rep(1, floor(occupied * params$pop_shares[1])),
  rep(2, floor(occupied * params$pop_shares[2]))
)

set.seed(params$seed)

grid <- matrix(
  sample(group_vector, cells, replace = FALSE),
  nrow = params$world_size,
  ncol = params$world_size
)

Preference treshold

In the next step, we define the ‘alike preference’, the treshold or tolerance level that an agent has for being surrounded by neighbors who are similar to them. This preference determines whether an agent considers its current neighborhood as “acceptable” or “unacceptable” based on the proportion of neighbors that belong to the same group as the agent. By adjusting the “alike preference” parameter, you can control the degree of tolerance an agent has for living in a mixed or segregated neighborhood. A higher “alike preference” means that agents are more particular about living near similar neighbors, which can lead to more segregation. On the other hand, a lower “alike preference” allows for greater diversity in the neighborhood.

Code

params$iterations <- 50
params$alike_preference <- 0.5

Lastly we define a function that takes a set of xy-coordinates as a vector and identifies the neighbors of a given patch using a Moore neighborhood (consisting of the 8 surrounding patches of a cell). The coordinates are adjusted to handle boundaries effectively.

Code

get_neighbors <- function(coords) {

  offsets <- matrix(c(
     1, 0,
     1, 1,
     0, 1,
    -1, 1,
    -1, 0,
    -1,-1,
     0,-1,
     1,-1
  ), ncol = 2, byrow = TRUE)

  n <- sweep(offsets, 2, coords, "+")

  # torus wrapping
  n[n < 1] <- params$world_size
  n[n > params$world_size] <- 1

  n
}

Then we need a dataframe to store the results of our simulation, which we can use for analysis and visualization later on.

Code

results <- data.frame(
  iteration = 1:params$iterations,
  happy = rep(NA_real_, params$iterations),
  unhappy = rep(NA_real_, params$iterations)
)

Simulation

This is where the fun starts. We run the simulation of the simplified Schelling model for a specified number of iterations within a loop to obtain (visual) results. We’ll use two tracker vectors, happy_cells and unhappy_cells, to keep coordinates of happy and unhappy cells. Happiness in the Schelling model of segregation means, that e.g. with an “alike preference” threshold of 0.6, a cell is deemed “happy” if at least 60% of its neighboring cells (within its given range) belong to the same group. This means the cell prefers having similar neighbors and can tolerate a bit of diversity nearby. Moving on, we’ll use nested loops to go through each row and column of the matrix. For each cell (referred to as ‘current’), its value (0, 1, or 2) is checked. If the cell isn’t empty (value not 0), variables are created to count like-minded neighbors and total neighbors (inhabited cells). The get_neighbors function generates a vector ‘neighbors’. A loop then compares values of neighbors to the current cell’s value. If they match, the count of like neighbors increases. If inhabited, the total neighbor count also goes up. Afterward, the ratio of like neighbors to total neighbors is calculated. This ratio is compared to the like-neighbor preference to determine the cell’s happiness (preventing division by 0 with is.nan function). Happy and unhappy patches are stored as coordinate matrices.

Overall happiness is calculated by dividing the count of happy cells by total occupied cells. This value is added to the happiness tracker.

Code

is_happy <- function(x, y, grid, pref) {

  group <- grid[x, y]
  if (group == 0) return(NA)   # empty cell

  neigh <- get_neighbors(c(x, y))
  neigh_vals <- apply(neigh, 1, \(p) grid[p[1], p[2]])

  neigh_vals <- neigh_vals[neigh_vals != 0]  # ignore empty cells
  if (length(neigh_vals) == 0) return(TRUE)

  alike_share <- mean(neigh_vals == group)
  alike_share >= pref
}

To neatly visualize this, we can write a plotting function:

Code

plot_grid <- function(grid, iter) {

  df <- expand.grid(
    x = 1:nrow(grid),
    y = 1:ncol(grid)
  )

  df$group <- as.vector(grid)

  ggplot(df, aes(x = x, y = y, fill = factor(group))) +
    geom_tile(color = "grey80") +
    scale_fill_manual(values = c(
      "0" = "white",
      "1" = "pink",
      "2" = "lightblue"
    )) +
    coord_equal() +
    scale_y_reverse() +
    labs(title = paste("Iteration", iter), fill = "") +
    theme_void() +
    theme(plot.title = element_text(hjust = 0.5))
}

plot_happiness <- function(results) {

  plot_data <- pivot_longer(
    results,
    cols = c(happy, unhappy),
    names_to = "state",
    values_to = "share"
  )

  ggplot(plot_data,
         aes(iteration, share, color = state)) +
    geom_line(linewidth = 1.2, na.rm = TRUE) +
    scale_x_continuous(
      limits = c(1, params$iterations),
      breaks = seq(0, params$iterations, by = 20)
    ) +
    scale_y_continuous(
      limits = c(0,1)
    ) +
    labs(
      x = "Time",
      y = "Share of agents",
      color = ""
    ) +
    theme_minimal(base_size = 13)
}

Lastly, the unhappy patches need to move to unoccupied spots. To achieve this without spatial bias, we’ll randomly select unhappy cells. Each cell selected becomes a “mover” and picks a random unoccupied spot (‘moveto’) on the grid. A while loop ensures an uninhabited ‘moveto’ is chosen. Once found, the mover’s value is transferred to the new spot, and the mover’s original patch is cleared.

Code

dev.new(width = 10, height = 5)

for (t in 1:params$iterations) {

  ## --- find unhappy agents ---
  unhappy <- list()

  for (i in 1:params$world_size) {
    for (j in 1:params$world_size) {

      if (grid[i, j] != 0) {
        if (!is_happy(i, j, grid, params$alike_preference)) {
          unhappy[[length(unhappy) + 1]] <- c(i, j)
        }
      }
    }
  }

  ## --- empty cells ---
  empty_cells <- which(grid == 0, arr.ind = TRUE)

  ## --- move agents ---
  if (length(unhappy) > 0 && nrow(empty_cells) > 0) {

    move_order <- sample(seq_along(unhappy))

    for (k in move_order) {

      if (nrow(empty_cells) == 0) break

      old <- unhappy[[k]]
      new_id <- sample(1:nrow(empty_cells), 1)
      new <- empty_cells[new_id, ]

      grid[new[1], new[2]] <- grid[old[1], old[2]]
      grid[old[1], old[2]] <- 0

      empty_cells[new_id, ] <- old
    }
  }

  ## --- compute happiness shares ---
  agents <- which(grid != 0, arr.ind = TRUE)

  happy_vec <- apply(
    agents,
    1,
    \(pos) is_happy(pos[1], pos[2], grid, params$alike_preference)
  )
  
  happy_share <- mean(happy_vec)
  unhappy_share <- 1 - happy_share
  
  results$happy[t] <- happy_share
  results$unhappy[t] <- unhappy_share
  
  
  #----------- update plot
  
  p1 <- plot_grid(grid, t)
  p2 <- plot_happiness(results)

  grid.arrange(p1, p2, ncol = 2)

  
}

TODO: ADD GIF!

Congratulations! This is your first ABM!

Now think about our model:

On which parameters does the behavior of this Schelling model depend?
How do the different parameters affect the strength of the observed segregation in the simulated population?
Play with the different parameters of the model. Do the results correspond to your expectations from question 2?

To better grasp the effects of homophily preference on social segregation, we might be interested in understanding how stable our results are. Let’s add this to our simulation.

Code

params <- list(
  world_size = 25,
  world_empty = 0.1,
  pop_shares = c(0.5, 0.5),
  seed = 123,
  iterations = 50,
  alike_preference = 0.5,
  runs = 10       # <-- number of simulation repetitions
)

get_neighbors <- function(coords, world_size) {
  offsets <- matrix(c(
     1, 0,
     1, 1,
     0, 1,
    -1, 1,
    -1, 0,
    -1,-1,
     0,-1,
     1,-1
  ), ncol = 2, byrow = TRUE)

  n <- sweep(offsets, 2, coords, "+")

  # torus wrapping
  n[n < 1] <- world_size
  n[n > world_size] <- 1
  n
}

is_happy <- function(x, y, grid, pref) {
  group <- grid[x, y]
  if (group == 0) return(NA)

  neigh <- get_neighbors(c(x, y), nrow(grid))
  neigh_vals <- apply(neigh, 1, \(p) grid[p[1], p[2]])
  neigh_vals <- neigh_vals[neigh_vals != 0]
  if (length(neigh_vals) == 0) return(TRUE)

  alike_share <- mean(neigh_vals == group)
  alike_share >= pref
}

# store happiness for all runs
all_results <- array(NA, dim = c(params$iterations, 2, params$runs),
                     dimnames = list(NULL, c("happy", "unhappy"), NULL))

set.seed(params$seed)

for (r in 1:params$runs) {
  
  # --- initialize grid ---
  cells <- params$world_size^2
  occupied <- floor((1 - params$world_empty) * cells)
  group_vector <- c(
    rep(0, cells - occupied),
    rep(1, floor(occupied * params$pop_shares[1])),
    rep(2, floor(occupied * params$pop_shares[2]))
  )
  
  grid <- matrix(
    sample(group_vector, cells, replace = FALSE),
    nrow = params$world_size,
    ncol = params$world_size
  )
  
  results <- data.frame(
    iteration = 1:params$iterations,
    happy = rep(NA_real_, params$iterations),
    unhappy = rep(NA_real_, params$iterations)
  )
  
  for (t in 1:params$iterations) {
    # find unhappy agents
    unhappy <- list()
    for (i in 1:params$world_size) {
      for (j in 1:params$world_size) {
        if (grid[i, j] != 0 && !is_happy(i, j, grid, params$alike_preference)) {
          unhappy[[length(unhappy) + 1]] <- c(i, j)
        }
      }
    }
    
    # empty cells
    empty_cells <- which(grid == 0, arr.ind = TRUE)
    
    # move agents
    if (length(unhappy) > 0 && nrow(empty_cells) > 0) {
      move_order <- sample(seq_along(unhappy))
      for (k in move_order) {
        if (nrow(empty_cells) == 0) break
        old <- unhappy[[k]]
        new_id <- sample(1:nrow(empty_cells), 1)
        new <- empty_cells[new_id, ]
        grid[new[1], new[2]] <- grid[old[1], old[2]]
        grid[old[1], old[2]] <- 0
        empty_cells[new_id, ] <- old
      }
    }
    
    # compute happiness
    agents <- which(grid != 0, arr.ind = TRUE)
    happy_vec <- apply(agents, 1, \(pos) is_happy(pos[1], pos[2], grid, params$alike_preference))
    results$happy[t] <- mean(happy_vec)
    results$unhappy[t] <- 1 - mean(happy_vec)
  }
  
  # --- define array to store results ---
  all_results <- array(NA, dim = c(params$iterations, 2, params$runs),
                       dimnames = list(NULL, c("happy", "unhappy"), NULL))
  
  # inside the runs loop:
  all_results[ , 1, r] <- results$happy
  all_results[ , 2, r] <- results$unhappy
}

# compute average across runs
avg_results <- data.frame(
  iteration = 1:params$iterations,
  happy = rowMeans(all_results[, "happy", ]),
  unhappy = rowMeans(all_results[, "unhappy", ])
)

# plot average dynamics
plot_data <- pivot_longer(avg_results, cols = c(happy, unhappy), names_to = "state", values_to = "share")
ggplot(plot_data, aes(iteration, share, color = state)) +
  geom_line(linewidth = 1.2) +
  scale_x_continuous(limits = c(1, params$iterations), breaks = seq(0, params$iterations, by = 10)) +
  scale_y_continuous(limits = c(0,1)) +
  labs(x = "Time", y = "Share of agents", color = "") +
  theme_minimal(base_size = 13)

In order to systematically analyze the effects of homophily on socila segregation, we would need to run the model multiple times with different parameter values and analyze the results.

For the purpose of replication, we can wrap the code in a function and run it multiple times with different parameter values. This allows us to systematically analyze the effects of homophily on social segregation.

Variations

Limitations of ABMs

Can yield generative sufficiency but cannot rule out other mechanisms. ABMs are not a silver bullet for social science research, and they have limitations that researchers should be aware of. Some of the main limitations include:

References

Bianchi, Federico, and Flaminio Squazzoni. 2015. “Agent-Based Models in Sociology.” WIREs Computational Statistics 7 (4): 284–306. https://doi.org/10.1002/wics.1356.

Bruch, Elizabeth, and Jon Atwell. 2015. “Agent-Based Models in Empirical Social Research.” Sociological Methods & Research 44 (2): 186–221. https://doi.org/10.1177/0049124113506405.

Jarvis, Benjamin F., Marc Keuschnigg, and Peter Hedström. 2021. “Analytical Sociology Amidst a Computational Social Science Revolution.” In Handbook of Computational Social Science, Volume 1, 1st ed., 33–52. London: Routledge. https://doi.org/10.4324/9781003024583-4.

Schelling, Thomas C. 1971. “Dynamic Models of Segregation.” The Journal of Mathematical Sociology 1 (2): 143–86. https://doi.org/10.1080/0022250X.1971.9989794.

Week 09 - Agent-based modeling

What are ABMs?

Limitations of ABMs

References

Copyright