Usage
How it works
stochatreat assigns treatments within each stratum independently. For a given set of treatment probabilities, it:
- Sets aside, in each stratum, the largest block of units that can be split in exact proportion to those probabilities
- Assigns treatments to the remaining units (misfits) using one of three strategies
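To make the block and misfit split concrete, here is a minimal sketch of the arithmetic for a single stratum. It is not stochatreat's own code: `split_stratum` is a hypothetical helper, and it assumes the smallest exactly divisible group is the least common multiple of the probability denominators. It needs Python 3.9+ for `math.lcm`.

```python
from fractions import Fraction
from math import lcm  # available from Python 3.9


def split_stratum(n_units, probs):
    """Hypothetical helper: return (block_size, n_misfits) for one stratum."""
    # Recover exact fractions such as 1/3 from their float representation.
    fracs = [Fraction(p).limit_denominator(1000) for p in probs]
    # Assumption: the smallest group that splits exactly is the LCM
    # of the probability denominators.
    unit = lcm(*(f.denominator for f in fracs))
    n_misfits = n_units % unit
    return n_units - n_misfits, n_misfits


probs = [1 / 3, 2 / 3]
block, misfits = split_stratum(112, probs)
# Units per treatment inside the exactly divisible block.
per_treatment = [int(block * Fraction(p).limit_denominator(1000)) for p in probs]
print(block, misfits, per_treatment)
```

Under these assumptions, a stratum of 112 units with probabilities 1/3 and 2/3 yields a block of 111 units (37 and 74 per treatment) and 1 misfit.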
Misfit strategies
| Strategy | Behavior |
|---|---|
| "stratum" (default) | Misfits in each stratum are assigned randomly and independently using the given probabilities |
| "global" | All misfits across strata are pooled into one group and assigned together |
| "none" | Misfits are left unassigned (treat = NA) and marked with stratum_id = NA for manual handling |
Examples
Single stratum
from stochatreat import stochatreat
import numpy as np
import pandas as pd
np.random.seed(42)
df = pd.DataFrame(
data={"id": range(1000), "nhood": np.random.randint(1, 6, size=1000)}
)
treats = stochatreat(
data=df,
stratum_cols="nhood",
treats=2,
idx_col="id",
random_state=42,
misfit_strategy="stratum",
)
df = df.merge(treats, how="left", on="id")
df.groupby("nhood")["treat"].value_counts().unstack()
Multiple strata and unequal probabilities
np.random.seed(42)
df = pd.DataFrame(
data={
"id": range(1000),
"nhood": np.random.randint(1, 6, size=1000),
"dummy": np.random.randint(0, 2, size=1000),
}
)
treats = stochatreat(
data=df,
stratum_cols=["nhood", "dummy"],
treats=2,
probs=[1 / 3, 2 / 3],
idx_col="id",
random_state=42,
misfit_strategy="global",
)
df = df.merge(treats, how="left", on="id")
df.groupby(["nhood", "dummy"])["treat"].value_counts().unstack()
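| nhood | dummy | treat = 0 | treat = 1 |
|---|---|---|---|
| 1 | 0 | 37 | 75 |
| 1 | 1 | 33 | 65 |
| 2 | 0 | 35 | 69 |
| 2 | 1 | 29 | 57 |
| 3 | 0 | 30 | 58 |
| 3 | 1 | 34 | 68 |
| 4 | 0 | 36 | 72 |
| 4 | 1 | 32 | 66 |
| 5 | 0 | 33 | 68 |
| 5 | 1 | 35 | 68 |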
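As a rough cross-check of the counts above, the sketch below estimates how many misfits the "global" strategy would pool in this example. The stratum sizes are the row totals of the output above; treating misfits as the remainder after dividing each stratum by 3 (the common denominator of 1/3 and 2/3) is an assumption about the arithmetic, not code taken from stochatreat.

```python
# Row totals (treat 0 + treat 1) for each (nhood, dummy) stratum shown above.
stratum_sizes = {
    (1, 0): 112, (1, 1): 98,
    (2, 0): 104, (2, 1): 86,
    (3, 0): 88, (3, 1): 102,
    (4, 0): 108, (4, 1): 98,
    (5, 0): 101, (5, 1): 103,
}

# Assumed smallest group that splits exactly for probs [1/3, 2/3].
unit = 3

# Units left over in each stratum after removing the exactly divisible block.
misfits_per_stratum = {key: n % unit for key, n in stratum_sizes.items()}
pooled = sum(misfits_per_stratum.values())
print(f"{pooled} misfits pooled out of {sum(stratum_sizes.values())} units")
```

Under these assumptions, 13 of the 1000 units are misfits, which would explain why some strata deviate slightly from an exact 1:2 split.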
Sampling from a larger population
Use size to draw a stratified sample before assigning treatments:
treats = stochatreat(
data=df,
stratum_cols="nhood",
treats=2,
idx_col="id",
size=500,
random_state=42,
)
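For intuition about what a stratified sample of a given size looks like, here is a minimal sketch of proportional allocation across strata. The helper name `proportional_allocation` and the largest-remainder rounding rule are assumptions made for illustration; stochatreat's internal allocation may differ. The stratum sizes are the per-nhood totals implied by the output of the previous example.

```python
def proportional_allocation(strata_sizes, size):
    """Hypothetical helper: split `size` across strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    # Ideal (possibly fractional) number of sampled units per stratum.
    quotas = {key: size * n / total for key, n in strata_sizes.items()}
    # Start from the integer part of each quota.
    alloc = {key: int(q) for key, q in quotas.items()}
    leftover = size - sum(alloc.values())
    # Hand the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda key: quotas[key] - alloc[key], reverse=True)
    for key in by_remainder[:leftover]:
        alloc[key] += 1
    return alloc


# Units per nhood, summed over dummy, from the earlier example output.
nhood_sizes = {1: 210, 2: 190, 3: 190, 4: 206, 5: 204}
allocation = proportional_allocation(nhood_sizes, 500)
print(allocation)
```

With these sizes, sampling 500 of 1000 units takes exactly half of each neighborhood, so every stratum keeps its share of the population.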
Manual misfit handling
The "none" strategy identifies misfits but leaves their treatment unassigned:
# Identify misfits without assigning treatments to them
treats = stochatreat(
data=df,
stratum_cols="nhood",
treats=2,
idx_col="id",
random_state=42,
misfit_strategy="none",
)
# Misfits are marked with stratum_id = NA and treat = NA
misfits = treats[treats["stratum_id"].isna()]
print(f"Found {len(misfits)} misfits")
# Option 1: Assign all misfits to control
treats.loc[treats["stratum_id"].isna(), "treat"] = 0
# Option 2 (an alternative to Option 1): Exclude misfits from the study.
# Assumes df does not already carry a "treat" column from an earlier merge.
df = df.merge(treats, how="left", on="id")
df_assigned = df[df["treat"].notna()]
References
- stochatreat is inspired by Alvaro Carril's Stata package randtreat, published in The Stata Journal.
- Tools of the trade: Doing Stratified Randomization with Uneven Numbers in some Strata, on stratified randomization for the World Bank.
- In Pursuit of Balance: Randomization in Practice in Development Field Experiments. Bruhn, McKenzie, 2009