API Reference
stochatreat.stochatreat
Stratified random assignment of treatments to units.
This module provides a function that assigns treatments to units in a stratified manner. It works with pandas dataframes and can handle multiple strata. It also offers several strategies for dealing with misfits (units that are left over after the stratified assignment procedure).
stochatreat(data: pd.DataFrame, stratum_cols: list[str] | str, treats: int, probs: list[float] | None = None, random_state: int | None = 42, idx_col: str | None = None, size: int | None = None, misfit_strategy: MisfitStrategy = 'stratum') -> pd.DataFrame
Assign treatments to units in a stratified manner.
Takes a dataframe and an arbitrary number of treatments over an arbitrary number of strata.
Attempts to return equally sized treatment groups, while randomly assigning misfits (leftovers from strata whose size is not divisible by the number of treatments).
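To make the notion of a misfit concrete, the sketch below counts the leftover units per stratum when every treatment is equally likely. It only illustrates the definition above and is not the library's internal code; `count_misfits` and the stratum labels are hypothetical, and with unequal `probs` the arithmetic may differ.

```python
from collections import Counter


def count_misfits(strata, treats):
    """Return {stratum: leftover units} assuming equal assignment probabilities."""
    sizes = Counter(strata)
    # Units that do not fill a whole block of `treats` units are the misfits.
    return {stratum: n % treats for stratum, n in sizes.items()}


# Hypothetical stratum labels for 12 units: sizes 5, 4 and 3.
strata = ["a"] * 5 + ["b"] * 4 + ["c"] * 3
print(count_misfits(strata, treats=2))  # {'a': 1, 'b': 0, 'c': 1}
```

Here strata `a` and `c` each leave one unit over, and those are the units that `misfit_strategy` decides how to handle.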
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | The data that contains unique ids and the stratification columns. | required |
| `stratum_cols` | `list[str] \| str` | The columns in `data` that you want to stratify over. | required |
| `treats` | `int` | The number of treatments you would like to implement, including control. | required |
| `probs` | `list[float] \| None` | The assignment probabilities for each of the treatments. | `None` |
| `random_state` | `int \| None` | The seed for the rng instance. | `42` |
| `idx_col` | `str \| None` | The column name that indicates the ids for your data. | `None` |
| `size` | `int \| None` | The size of the sample if you would like to sample from your data. | `None` |
| `misfit_strategy` | `MisfitStrategy` | The strategy used to assign misfits. One of `'stratum'` (default): assign misfits randomly within each stratum using `probs`; `'global'`: pool all misfits across strata and assign them together; `'none'`: leave misfits unassigned (`treat` = NA, `stratum_id` = NA) for manual handling. | `'stratum'` |
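The description of `probs` implies one probability per treatment. A minimal pre-flight check might look like the sketch below. It assumes the probabilities should be non-negative and sum to one; `check_probs` is a hypothetical helper, not part of stochatreat, and the library may perform its own validation.

```python
import math


def check_probs(probs, treats):
    """Raise ValueError unless `probs` looks like a distribution over `treats` arms."""
    if len(probs) != treats:
        raise ValueError(f"expected {treats} probabilities, got {len(probs)}")
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    # isclose tolerates floating-point rounding, e.g. in 1/3 + 2/3.
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")


check_probs([1 / 3, 2 / 3], treats=2)  # passes silently
```

Comparing the sum with `math.isclose` rather than `==` avoids spurious failures when the probabilities are written as fractions.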
Returns:
| Type | Description |
|---|---|
| `DataFrame` | A pandas.DataFrame with `idx_col`, `treat` (treatment assignments) and `stratum_id` (the id of the stratum within which the assignment procedure was carried out) columns. Both `treat` and `stratum_id` use pandas nullable integer types (`Int64`) to support NA values. |
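Because `treat` and `stratum_id` are nullable integers, unassigned units (for example with `misfit_strategy='none'`) can be separated with the usual pandas NA tools. The sketch below uses a hand-built stand-in frame with made-up values rather than real output from the function, and assumes the id column is called `myid`.

```python
import pandas as pd

# Hand-built stand-in for a returned frame; values are made up for illustration.
assignments = pd.DataFrame(
    {
        "myid": [1, 2, 3, 4, 5],
        "treat": pd.array([0, 1, 1, 0, pd.NA], dtype="Int64"),
        "stratum_id": pd.array([0, 0, 1, 1, pd.NA], dtype="Int64"),
    }
)

# Units left without a treatment, to be handled manually.
unassigned = assignments[assignments["treat"].isna()]
# Units with a treatment; the column keeps its nullable integer dtype.
assigned = assignments.dropna(subset=["treat"])

print(unassigned["myid"].tolist())  # [5]
print(assigned["treat"].dtype)      # Int64
```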
Examples:
Single stratum:
>>> treats = stochatreat(data=data,                # your dataframe
...                      stratum_cols='stratum1',  # stratum variable
...                      treats=2,                 # including control
...                      idx_col='myid',           # unique id column
...                      random_state=42)          # seed for rng
>>> data = data.merge(treats, how="left", on="myid")
Multiple strata:
>>> treats = stochatreat(data=data,
...                      stratum_cols=['stratum1', 'stratum2'],
...                      treats=2,
...                      probs=[1/3, 2/3],
...                      idx_col='myid',
...                      random_state=42)
>>> data = data.merge(treats, how="left", on="myid")
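After merging, it can be worth checking that every unit received exactly one assignment and how the arms are distributed within each stratum. The sketch below is one way to do that, using a hand-built stand-in for the assignment frame (made-up values, not real output of `stochatreat`) and the column names from the examples above.

```python
import pandas as pd

# Hypothetical input data with one stratification column.
data = pd.DataFrame(
    {
        "myid": [1, 2, 3, 4, 5, 6],
        "stratum1": ["a", "a", "a", "a", "b", "b"],
    }
)

# Stand-in for the frame returned by stochatreat; values are made up.
treats = pd.DataFrame(
    {
        "myid": [1, 2, 3, 4, 5, 6],
        "treat": pd.array([0, 1, 1, 0, 1, 0], dtype="Int64"),
        "stratum_id": pd.array([0, 0, 0, 0, 1, 1], dtype="Int64"),
    }
)

# validate="one_to_one" raises MergeError if an id is duplicated on either side.
merged = data.merge(treats, how="left", on="myid", validate="one_to_one")

# Every unit should have an assignment in this example.
all_assigned = bool(merged["treat"].notna().all())

# Units per (stratum, treatment) cell, sorted by stratum and then treatment.
balance = merged.groupby(["stratum1", "treat"]).size()
print(all_assigned)
print(balance)
```

With `misfit_strategy='none'` the `all_assigned` check would be expected to fail whenever misfits exist, so it only applies to the other strategies.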