Getting Started
Installation
Requires Python ≥ 3.11 and numpy ≥ 2.0.
Concepts
Segmentation
Bias analysis requires dividing data into advantaged and disadvantaged groups.
fair_perf_ml handles this via segmentation criteria — either a label (equality)
or a threshold (ordering).
See Segmentation for the full API.
Analysis reports
Batch analysis functions (data_bias_perform_analysis, model_bias_perform_analysis,
linear_regression_analysis, etc.) return a plain dict[str, float] mapping metric
names to values. These dicts can be stored and passed back to runtime comparison
functions later.
Runtime comparison / drift reports
Runtime check functions compare a latest analysis report against a stored baseline.
They return a DriftReport indicating whether any metric has drifted
beyond the configured threshold (default 10%).
{
"passed": True, # False if any metric exceeded the threshold
"failed_report": None, # list of {"metric": ..., "drift": ...} when passed=False
}
Streaming monitors
Streaming classes maintain a baseline and accumulate runtime examples incrementally.
They expose drift_report() and performance_snapshot() for point-in-time queries
without needing to collect and replay the full dataset.
Data bias — walkthrough
from fair_perf_ml.data_bias import (
data_bias_perform_analysis,
data_bias_runtime_comparison,
data_bias_partial_runtime_comparison,
DataBiasStreaming,
)
from fair_perf_ml.bias import LabeledBiasSegmentation
from fair_perf_ml.models import DataBiasMetric
# Batch analysis
baseline = data_bias_perform_analysis(
feature=[0, 1, 1, 0, 1, 0],
ground_truth=[1, 1, 0, 0, 1, 1],
feature_label_or_threshold=1,
ground_truth_label_or_threshold=1,
)
latest = data_bias_perform_analysis(
feature=[1, 1, 1, 0, 0, 0],
ground_truth=[1, 0, 1, 0, 0, 1],
feature_label_or_threshold=1,
ground_truth_label_or_threshold=1,
)
# Full runtime check
report = data_bias_runtime_comparison(baseline, latest, threshold=0.10)
# Partial (selected metrics only)
partial = data_bias_partial_runtime_comparison(
baseline, latest,
metrics=[DataBiasMetric.ClassImbalance, DataBiasMetric.KlDivergence],
)
# Streaming
feat_seg = LabeledBiasSegmentation(label=1)
gt_seg = LabeledBiasSegmentation(label=1)
monitor = DataBiasStreaming(feat_seg, gt_seg, feature_data=[0, 1, 1], ground_truth_data=[1, 1, 0])
monitor.push(feature_value=1, ground_truth_value=0)
snapshot = monitor.drift_report(drift_threshold=0.10)
Model performance — walkthrough
import numpy as np
from fair_perf_ml.model_perf import (
binary_classification_analysis,
logistic_regression_analysis,
linear_regression_analysis,
runtime_check_full,
BinaryClassificationStreaming,
)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 1])
baseline = binary_classification_analysis(y_true, y_pred)
# Later, compare against a new batch
y_true_new = np.array([1, 1, 0, 0, 1, 0, 0, 1])
y_pred_new = np.array([0, 1, 1, 0, 0, 0, 1, 1])
latest = binary_classification_analysis(y_true_new, y_pred_new)
report = runtime_check_full(latest, baseline, threshold=0.10)
# Streaming
monitor = BinaryClassificationStreaming(label=1, y_true=y_true, y_pred=y_pred)
monitor.update_stream_batch(y_true_new, y_pred_new)
drift = monitor.drift_report(drift_threshold=0.10)
Data drift — walkthrough
import numpy as np
from fair_perf_ml.drift import (
ContinuousDataDrift,
CategoricalDataDrift,
DataDriftType,
StreamingContinuousDataDriftFlush,
StreamingCategoricalDataDriftDecay,
)
# Batch continuous
baseline = np.random.normal(0, 1, 2000)
monitor = ContinuousDataDrift(baseline, quantile_type="FreedmanDiaconis")
runtime = np.random.normal(0.3, 1, 500)
scores = monitor.compute_drift_multiple_criteria(
runtime,
[DataDriftType.JensenShannon, DataDriftType.PopulationStabilityIndex],
)
# Streaming continuous with flush
stream = StreamingContinuousDataDriftFlush(baseline, quantile_type="Scott", flush_rate=1000)
stream.update_stream_batch(runtime)
js = stream.compute_drift(DataDriftType.JensenShannon)
# Categorical
cat_baseline = ["cat", "dog", "cat", "bird", "dog", "cat"]
cat_monitor = CategoricalDataDrift(cat_baseline)
cat_runtime = ["cat", "cat", "cat", "dog", "bird"]
psi = cat_monitor.compute_drift(cat_runtime, DataDriftType.PopulationStabilityIndex)