fairness
Custom fairness metrics for evaluating synthetic data.
This module provides custom implementations of fairness metrics (demographic parity, equalized odds) for comparing real and synthetic datasets.
Metrics implemented
- Demographic Parity: Measures whether positive prediction rates are equal across groups. Lower values indicate better fairness (less disparity between groups).
- Equalized Odds: Measures whether the true positive rate (TPR) and false positive rate (FPR) are equal across groups. Lower values indicate better fairness.
Usage
These metrics are computed automatically when running evaluation with the --fairness flag and specifying protected attributes via --protected-attributes.
Example
nhssynth evaluate --fairness --protected-attributes age_group gender --downstream-tasks
compute_demographic_parity(data, predictions, protected_attribute)
Compute demographic parity metrics for a protected attribute.
Demographic parity is satisfied when P(Ŷ=1|A=a) = P(Ŷ=1|A=b) for all groups a, b. We report the maximum difference in positive prediction rates between any two groups.
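As an illustration of the definition above, here is a minimal sketch of the maximum pairwise difference in positive prediction rates, written with pandas. It is not the nhssynth implementation: the function name `demographic_parity_gap` and the toy data are hypothetical, and the sketch returns a single float rather than the dictionary that `compute_demographic_parity` returns.

```python
import pandas as pd


def demographic_parity_gap(
    data: pd.DataFrame, predictions: pd.Series, protected_attribute: str
) -> float:
    """Hypothetical helper: largest gap in positive prediction rate between groups."""
    # Positive prediction rate per group, i.e. an estimate of P(Y_hat = 1 | A = a).
    rates = predictions.groupby(data[protected_attribute]).mean()
    # The largest difference between any two groups is simply max minus min.
    return float(rates.max() - rates.min())


# Toy data (made up for illustration): 3 of 4 "F" rows and 1 of 4 "M" rows are positive.
data = pd.DataFrame({"gender": ["F", "F", "F", "F", "M", "M", "M", "M"]})
predictions = pd.Series([1, 1, 1, 0, 1, 0, 0, 0])

gap = demographic_parity_gap(data, predictions, "gender")
print(gap)  # 0.5
```

A gap of 0 would mean every group receives positive predictions at the same rate.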
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | DataFrame containing the protected attribute. | required |
| predictions | Series | Binary predictions. | required |
| protected_attribute | str | Name of the protected attribute column. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary with demographic parity metrics. |
Source code in src/nhssynth/modules/evaluation/fairness.py
compute_equalized_odds(data, predictions, labels, protected_attribute)
Compute equalized odds metrics for a protected attribute.
Equalized odds is satisfied when:
- P(Ŷ=1|Y=1,A=a) = P(Ŷ=1|Y=1,A=b) (equal TPR across groups)
- P(Ŷ=1|Y=0,A=a) = P(Ŷ=1|Y=0,A=b) (equal FPR across groups)
We report two values: the maximum difference in TPR between any two groups, and the maximum difference in FPR between any two groups.
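The two conditions above can be sketched with pandas as follows. This is an illustration under stated assumptions, not the nhssynth implementation: the function name `equalized_odds_gaps`, the dictionary keys `tpr_diff` and `fpr_diff`, and the toy data are all hypothetical.

```python
import pandas as pd


def equalized_odds_gaps(
    data: pd.DataFrame,
    predictions: pd.Series,
    labels: pd.Series,
    protected_attribute: str,
) -> dict[str, float]:
    """Hypothetical helper: largest TPR gap and largest FPR gap between groups."""
    groups = data[protected_attribute]
    positives = labels == 1
    negatives = labels == 0
    # TPR per group: share of positive predictions among rows whose true label is 1.
    tpr = predictions[positives].groupby(groups[positives]).mean()
    # FPR per group: share of positive predictions among rows whose true label is 0.
    fpr = predictions[negatives].groupby(groups[negatives]).mean()
    # A group with no rows of a given label does not appear in that rate and is ignored.
    return {
        "tpr_diff": float(tpr.max() - tpr.min()),
        "fpr_diff": float(fpr.max() - fpr.min()),
    }


# Toy data (made up for illustration).
data = pd.DataFrame({"gender": ["F", "F", "F", "F", "M", "M", "M", "M"]})
labels = pd.Series([1, 1, 0, 0, 1, 1, 0, 0])
predictions = pd.Series([1, 1, 1, 0, 1, 0, 0, 0])

gaps = equalized_odds_gaps(data, predictions, labels, "gender")
print(gaps)  # {'tpr_diff': 0.5, 'fpr_diff': 0.5}
```

In this toy data the "F" group has TPR 1.0 and FPR 0.5, while the "M" group has TPR 0.5 and FPR 0.0, so both gaps are 0.5.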
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | DataFrame containing the protected attribute. | required |
| predictions | Series | Binary predictions. | required |
| labels | Series | True binary labels. | required |
| protected_attribute | str | Name of the protected attribute column. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary with equalized odds metrics. |
Source code in src/nhssynth/modules/evaluation/fairness.py
run_fairness_metrics(data, predictions, protected_attributes, target_column, threshold=0.5)
Compute fairness metrics for a dataset given predictions and protected attributes.
This function computes demographic parity and equalized odds for each protected attribute specified. Lower values indicate better fairness (less disparity between groups).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | The full dataset containing protected attributes and target column. | required |
| predictions | DataFrame | DataFrame containing prediction probabilities from the downstream task. | required |
| protected_attributes | list[str] | List of column names to use as protected attributes. | required |
| target_column | str | Name of the target column (actual labels). | required |
| threshold | float | Classification threshold for binarizing predictions. | 0.5 |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary mapping metric names to values. Lower values indicate better fairness. |
| dict[str, float] | Metrics returned: `dp_{attr}_max_diff` (demographic parity, max difference in positive rates); `eo_{attr}_tpr_diff` (equalized odds, max difference in TPR); `eo_{attr}_fpr_diff` (equalized odds, max difference in FPR). |
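To show how the pieces fit together, the sketch below binarizes probabilities with a threshold and builds one set of metrics per protected attribute. It is a hedged illustration, not the nhssynth implementation. Several details are assumptions: the function name `fairness_summary` is hypothetical, probabilities are passed as a single Series rather than a DataFrame, a probability equal to the threshold is treated as positive, and the key names follow the pattern in the Returns description with underscores assumed between the parts.

```python
import pandas as pd


def fairness_summary(
    data: pd.DataFrame,
    probabilities: pd.Series,
    protected_attributes: list[str],
    target_column: str,
    threshold: float = 0.5,
) -> dict[str, float]:
    """Hypothetical helper: per-attribute demographic parity and equalized odds gaps."""
    # Assumption: probabilities at or above the threshold count as positive predictions.
    preds = (probabilities >= threshold).astype(int)
    labels = data[target_column]
    positives = labels == 1
    negatives = labels == 0

    results: dict[str, float] = {}
    for attr in protected_attributes:
        groups = data[attr]
        # Demographic parity: gap in positive prediction rate across groups.
        rates = preds.groupby(groups).mean()
        results[f"dp_{attr}_max_diff"] = float(rates.max() - rates.min())
        # Equalized odds: gaps in TPR and FPR across groups.
        tpr = preds[positives].groupby(groups[positives]).mean()
        fpr = preds[negatives].groupby(groups[negatives]).mean()
        results[f"eo_{attr}_tpr_diff"] = float(tpr.max() - tpr.min())
        results[f"eo_{attr}_fpr_diff"] = float(fpr.max() - fpr.min())
    return results


# Toy data (made up for illustration).
data = pd.DataFrame(
    {
        "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
        "outcome": [1, 1, 0, 0, 1, 1, 0, 0],
    }
)
probabilities = pd.Series([0.9, 0.8, 0.6, 0.2, 0.7, 0.4, 0.3, 0.1])

summary = fairness_summary(data, probabilities, ["gender"], "outcome")
print(sorted(summary))
# ['dp_gender_max_diff', 'eo_gender_fpr_diff', 'eo_gender_tpr_diff']
```

Because every value is a gap between groups, a value of 0 means no measured disparity for that attribute, and larger values mean more disparity.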