Code
library(needs)
needs(tidyverse,
gridExtra)Simulating macro behavior from micro rules
Agent-Based Models (ABMs) are computational simulations, in which individual actors (agents) interact with each other and their environment according to specified rules. The goal of ABMs is to understand how macro-level phenomena emerge from micro-level interactions.
Especially in the field of analytical sociology (mechanism based explaining), ABMs have emerged as a central tool Bruch and Atwell (2015).
ABMs can help precising and formalizing theories, by making assumptions explicit and by allowing to test the implications of these assumptions. ABMs can also be used to explore the consequences of different assumptions, and to identify which assumptions are most important for explaining a given phenomenon.
An ABM typically consists of the following components:
We will have a closer look at how ABMs work, by recreating Schellings Social Segregation Model (Schelling 1971).
The Schelling model of segregation, formulated by Thomas Schelling (1921–2016), who received the Nobel Prize in 2005, is an agent-based framework illustrating how even slight preferences for one’s social group can drive significant societal segregation.
This model delves into neighborhood segregation driven by socio-economic factors. While slight differences might exist within each neighborhood, substantial disparities emerge between neighborhoods, curtailing inter-area interactions. Income inequalities often underlie this segregation, propelling affluent individuals toward luxury renovated neighborhoods (gentrification) or suburban locales (suburbanization) in pursuit of specific community compositions.
The Schelling model is pertinent for studying social dynamics where individual actions hinge on others’ behavior, yielding self-reinforcing trends.
Additionally, the model showcases a thought experiment highlighting that segregation can arise without explicit preferences for homogeneous neighborhoods. The key assumptions are that individuals aim to avoid minority status in their neighborhood and neighborhoods necessitate at least a 50:50 resident ratio, prompting migration if unmet.
In essence, the Schelling model underscores how seemingly minor individual preferences can unintentionally yield substantial social segregation. It provides insightful perspectives on comprehending how intricate social dynamics mold neighborhoods, offering a framework to scrutinize unintended effects of social actions.
The original model is depicted on a \(N x N\) grid, where each cell can be in one of the three states: uninhabited, inhabited by a member og group 1, or inhabited by a member of group 2. This can also be represented in a matrix, with each element being either 0,1, or 2.
The code above sets the parameters of our model. We start with just a few components:
world_size determines the dimensions of the grid (10x10 in this case).world_empty specifies the proportion of unoccupied cells in the grid (20%)pop_shares defines two things: the amount of types of agents, that we have and their relative shares.To ensure reproducibility, we set a seed for the random number generator. This allows us to generate the same random numbers each time we run the code, which is crucial for debugging and sharing results.
For clean documentation, we can bundle these informations into a single list, which we can easily pass to functions later on. This also allows us to keep track of the parameters we used for each simulation run.
When we have these parameters, we can initialize the world.
cells <- params$world_size^2
occupied <- floor((1 - params$world_empty) * cells)
group_vector <- c(
rep(0, cells - occupied),
rep(1, floor(occupied * params$pop_shares[1])),
rep(2, floor(occupied * params$pop_shares[2]))
)
set.seed(params$seed)
grid <- matrix(
sample(group_vector, cells, replace = FALSE),
nrow = params$world_size,
ncol = params$world_size
)In the next step, we define the ‘alike preference’, the treshold or tolerance level that an agent has for being surrounded by neighbors who are similar to them. This preference determines whether an agent considers its current neighborhood as “acceptable” or “unacceptable” based on the proportion of neighbors that belong to the same group as the agent. By adjusting the “alike preference” parameter, you can control the degree of tolerance an agent has for living in a mixed or segregated neighborhood. A higher “alike preference” means that agents are more particular about living near similar neighbors, which can lead to more segregation. On the other hand, a lower “alike preference” allows for greater diversity in the neighborhood.
Lastly we define a function that takes a set of xy-coordinates as a vector and identifies the neighbors of a given patch using a Moore neighborhood (consisting of the 8 surrounding patches of a cell). The coordinates are adjusted to handle boundaries effectively.
Then we need a dataframe to store the results of our simulation, which we can use for analysis and visualization later on.
This is where the fun starts. We run the simulation of the simplified Schelling model for a specified number of iterations within a loop to obtain (visual) results. We’ll use two tracker vectors, happy_cells and unhappy_cells, to keep coordinates of happy and unhappy cells. Happiness in the Schelling model of segregation means, that e.g. with an “alike preference” threshold of 0.6, a cell is deemed “happy” if at least 60% of its neighboring cells (within its given range) belong to the same group. This means the cell prefers having similar neighbors and can tolerate a bit of diversity nearby. Moving on, we’ll use nested loops to go through each row and column of the matrix. For each cell (referred to as ‘current’), its value (0, 1, or 2) is checked. If the cell isn’t empty (value not 0), variables are created to count like-minded neighbors and total neighbors (inhabited cells). The get_neighbors function generates a vector ‘neighbors’. A loop then compares values of neighbors to the current cell’s value. If they match, the count of like neighbors increases. If inhabited, the total neighbor count also goes up. Afterward, the ratio of like neighbors to total neighbors is calculated. This ratio is compared to the like-neighbor preference to determine the cell’s happiness (preventing division by 0 with is.nan function). Happy and unhappy patches are stored as coordinate matrices.
Overall happiness is calculated by dividing the count of happy cells by total occupied cells. This value is added to the happiness tracker.
is_happy <- function(x, y, grid, pref) {
group <- grid[x, y]
if (group == 0) return(NA) # empty cell
neigh <- get_neighbors(c(x, y))
neigh_vals <- apply(neigh, 1, \(p) grid[p[1], p[2]])
neigh_vals <- neigh_vals[neigh_vals != 0] # ignore empty cells
if (length(neigh_vals) == 0) return(TRUE)
alike_share <- mean(neigh_vals == group)
alike_share >= pref
}To neatly visualize this, we can write a plotting function:
plot_grid <- function(grid, iter) {
df <- expand.grid(
x = 1:nrow(grid),
y = 1:ncol(grid)
)
df$group <- as.vector(grid)
ggplot(df, aes(x = x, y = y, fill = factor(group))) +
geom_tile(color = "grey80") +
scale_fill_manual(values = c(
"0" = "white",
"1" = "pink",
"2" = "lightblue"
)) +
coord_equal() +
scale_y_reverse() +
labs(title = paste("Iteration", iter), fill = "") +
theme_void() +
theme(plot.title = element_text(hjust = 0.5))
}
plot_happiness <- function(results) {
plot_data <- pivot_longer(
results,
cols = c(happy, unhappy),
names_to = "state",
values_to = "share"
)
ggplot(plot_data,
aes(iteration, share, color = state)) +
geom_line(linewidth = 1.2, na.rm = TRUE) +
scale_x_continuous(
limits = c(1, params$iterations),
breaks = seq(0, params$iterations, by = 20)
) +
scale_y_continuous(
limits = c(0,1)
) +
labs(
x = "Time",
y = "Share of agents",
color = ""
) +
theme_minimal(base_size = 13)
}Lastly, the unhappy patches need to move to unoccupied spots. To achieve this without spatial bias, we’ll randomly select unhappy cells. Each cell selected becomes a “mover” and picks a random unoccupied spot (‘moveto’) on the grid. A while loop ensures an uninhabited ‘moveto’ is chosen. Once found, the mover’s value is transferred to the new spot, and the mover’s original patch is cleared.
dev.new(width = 10, height = 5)
for (t in 1:params$iterations) {
## --- find unhappy agents ---
unhappy <- list()
for (i in 1:params$world_size) {
for (j in 1:params$world_size) {
if (grid[i, j] != 0) {
if (!is_happy(i, j, grid, params$alike_preference)) {
unhappy[[length(unhappy) + 1]] <- c(i, j)
}
}
}
}
## --- empty cells ---
empty_cells <- which(grid == 0, arr.ind = TRUE)
## --- move agents ---
if (length(unhappy) > 0 && nrow(empty_cells) > 0) {
move_order <- sample(seq_along(unhappy))
for (k in move_order) {
if (nrow(empty_cells) == 0) break
old <- unhappy[[k]]
new_id <- sample(1:nrow(empty_cells), 1)
new <- empty_cells[new_id, ]
grid[new[1], new[2]] <- grid[old[1], old[2]]
grid[old[1], old[2]] <- 0
empty_cells[new_id, ] <- old
}
}
## --- compute happiness shares ---
agents <- which(grid != 0, arr.ind = TRUE)
happy_vec <- apply(
agents,
1,
\(pos) is_happy(pos[1], pos[2], grid, params$alike_preference)
)
happy_share <- mean(happy_vec)
unhappy_share <- 1 - happy_share
results$happy[t] <- happy_share
results$unhappy[t] <- unhappy_share
#----------- update plot
p1 <- plot_grid(grid, t)
p2 <- plot_happiness(results)
grid.arrange(p1, p2, ncol = 2)
}TODO: ADD GIF!
Congratulations! This is your first ABM!
Now think about our model:
To better grasp the effects of homophily preference on social segregation, we might be interested in understanding how stable our results are. Let’s add this to our simulation.
params <- list(
world_size = 25,
world_empty = 0.1,
pop_shares = c(0.5, 0.5),
seed = 123,
iterations = 50,
alike_preference = 0.5,
runs = 10 # <-- number of simulation repetitions
)
get_neighbors <- function(coords, world_size) {
offsets <- matrix(c(
1, 0,
1, 1,
0, 1,
-1, 1,
-1, 0,
-1,-1,
0,-1,
1,-1
), ncol = 2, byrow = TRUE)
n <- sweep(offsets, 2, coords, "+")
# torus wrapping
n[n < 1] <- world_size
n[n > world_size] <- 1
n
}
is_happy <- function(x, y, grid, pref) {
group <- grid[x, y]
if (group == 0) return(NA)
neigh <- get_neighbors(c(x, y), nrow(grid))
neigh_vals <- apply(neigh, 1, \(p) grid[p[1], p[2]])
neigh_vals <- neigh_vals[neigh_vals != 0]
if (length(neigh_vals) == 0) return(TRUE)
alike_share <- mean(neigh_vals == group)
alike_share >= pref
}
# store happiness for all runs
all_results <- array(NA, dim = c(params$iterations, 2, params$runs),
dimnames = list(NULL, c("happy", "unhappy"), NULL))
set.seed(params$seed)
for (r in 1:params$runs) {
# --- initialize grid ---
cells <- params$world_size^2
occupied <- floor((1 - params$world_empty) * cells)
group_vector <- c(
rep(0, cells - occupied),
rep(1, floor(occupied * params$pop_shares[1])),
rep(2, floor(occupied * params$pop_shares[2]))
)
grid <- matrix(
sample(group_vector, cells, replace = FALSE),
nrow = params$world_size,
ncol = params$world_size
)
results <- data.frame(
iteration = 1:params$iterations,
happy = rep(NA_real_, params$iterations),
unhappy = rep(NA_real_, params$iterations)
)
for (t in 1:params$iterations) {
# find unhappy agents
unhappy <- list()
for (i in 1:params$world_size) {
for (j in 1:params$world_size) {
if (grid[i, j] != 0 && !is_happy(i, j, grid, params$alike_preference)) {
unhappy[[length(unhappy) + 1]] <- c(i, j)
}
}
}
# empty cells
empty_cells <- which(grid == 0, arr.ind = TRUE)
# move agents
if (length(unhappy) > 0 && nrow(empty_cells) > 0) {
move_order <- sample(seq_along(unhappy))
for (k in move_order) {
if (nrow(empty_cells) == 0) break
old <- unhappy[[k]]
new_id <- sample(1:nrow(empty_cells), 1)
new <- empty_cells[new_id, ]
grid[new[1], new[2]] <- grid[old[1], old[2]]
grid[old[1], old[2]] <- 0
empty_cells[new_id, ] <- old
}
}
# compute happiness
agents <- which(grid != 0, arr.ind = TRUE)
happy_vec <- apply(agents, 1, \(pos) is_happy(pos[1], pos[2], grid, params$alike_preference))
results$happy[t] <- mean(happy_vec)
results$unhappy[t] <- 1 - mean(happy_vec)
}
# --- define array to store results ---
all_results <- array(NA, dim = c(params$iterations, 2, params$runs),
dimnames = list(NULL, c("happy", "unhappy"), NULL))
# inside the runs loop:
all_results[ , 1, r] <- results$happy
all_results[ , 2, r] <- results$unhappy
}
# compute average across runs
avg_results <- data.frame(
iteration = 1:params$iterations,
happy = rowMeans(all_results[, "happy", ]),
unhappy = rowMeans(all_results[, "unhappy", ])
)
# plot average dynamics
plot_data <- pivot_longer(avg_results, cols = c(happy, unhappy), names_to = "state", values_to = "share")
ggplot(plot_data, aes(iteration, share, color = state)) +
geom_line(linewidth = 1.2) +
scale_x_continuous(limits = c(1, params$iterations), breaks = seq(0, params$iterations, by = 10)) +
scale_y_continuous(limits = c(0,1)) +
labs(x = "Time", y = "Share of agents", color = "") +
theme_minimal(base_size = 13)In order to systematically analyze the effects of homophily on socila segregation, we would need to run the model multiple times with different parameter values and analyze the results.
For the purpose of replication, we can wrap the code in a function and run it multiple times with different parameter values. This allows us to systematically analyze the effects of homophily on social segregation.
Can yield generative sufficiency but cannot rule out other mechanisms. ABMs are not a silver bullet for social science research, and they have limitations that researchers should be aware of. Some of the main limitations include: