
Model Bias

Post-training model bias analysis. Import from fair_perf_ml.model_bias.

Supported metrics: Difference in Positive Predicted Labels (DPPL), Disparate Impact (DI), Accuracy Difference, Recall Difference, Difference in Conditional Acceptance (DCA), Difference in Acceptance Rate (DAR), Specificity Difference, Difference in Conditional Rejection (DCR), Difference in Rejection Rate (DRR), Treatment Equity (TE), Conditional Demographic Disparity in Predicted Labels (CDDPL), Generalized Entropy (GE).
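To make the first two metrics concrete, here is a minimal sketch that computes DPPL and DI by hand with numpy. It assumes the common definitions, where q_a and q_d are the shares of positive predictions in the advantaged and disadvantaged groups, DPPL = q_a - q_d and DI = q_d / q_a. The helper positive_rate and the toy arrays are hypothetical, and the library's own sign and ordering conventions may differ.

```python
import numpy as np


def positive_rate(predictions: np.ndarray, mask: np.ndarray) -> float:
    """Share of positive predictions among the rows selected by mask."""
    return float(predictions[mask].mean())


# Toy data (hypothetical): True marks membership in the advantaged group,
# 1 marks a positive prediction.
facet_a = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
predictions = np.array([1, 1, 1, 0, 1, 0, 0, 0])

q_a = positive_rate(predictions, facet_a)   # 3 of 4 positive
q_d = positive_rate(predictions, ~facet_a)  # 1 of 4 positive

# Assumed definitions, not taken from the library source.
dppl = q_a - q_d
di = q_d / q_a
```

A DPPL near 0 and a DI near 1 indicate that both groups receive positive predictions at similar rates.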


Batch analysis

fair_perf_ml.model_bias.core.model_bias_perform_analysis

model_bias_perform_analysis(feature: list[F] | NDArray, ground_truth: list[G] | NDArray, predictions: list[P] | NDArray, feature_label_or_threshold: F, ground_truth_label_or_threshold: G, prediction_label_or_threshold: P) -> AnalysisReport

Performs model bias analysis on the data passed. The feature, prediction, and ground truth arrays must all be of the same length. The collection type passed must be coercible to a numpy array.

The element type of each data container must match the type of the label or threshold value passed for that criterion.

Args:
feature: list[F] | NDArray -> the feature data; most efficient when passed as a numpy array
ground_truth: list[G] | NDArray -> the ground truth data; most efficient when passed as a numpy array
predictions: list[P] | NDArray -> the prediction data; most efficient when passed as a numpy array
feature_label_or_threshold: F -> segmentation parameter for the feature
ground_truth_label_or_threshold: G -> segmentation parameter for the ground truth
prediction_label_or_threshold: P -> segmentation parameter for the predictions
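The sketch below illustrates how a single label-or-threshold value could split each input into two segments. It is an assumption for illustration only: the hypothetical segment helper treats numeric data as "value >= threshold" and everything else as "value == label", which may not match the rule applied by the library. It also shows the equal-length check described above.

```python
import numpy as np


def segment(data, label_or_threshold):
    """Return a boolean mask for the rows that fall in the selected segment.

    Assumption: numeric data is compared with >= (threshold), any other
    data with == (label). The library's actual rule may differ.
    """
    arr = np.asarray(data)
    if np.issubdtype(arr.dtype, np.number):
        return arr >= label_or_threshold
    return arr == label_or_threshold


# Toy inputs (hypothetical): a categorical feature, binary labels, scores.
feature = ["a", "b", "a", "b"]
ground_truth = [1, 0, 1, 1]
predictions = [0.9, 0.2, 0.4, 0.7]

# All three inputs must have the same length.
if not (len(feature) == len(ground_truth) == len(predictions)):
    raise ValueError("feature, ground_truth and predictions must have equal length")

feature_mask = segment(feature, "a")     # label match
truth_mask = segment(ground_truth, 1)    # threshold on integer labels
pred_mask = segment(predictions, 0.5)    # threshold on scores
```

Note that the value passed for each criterion has the same type as the elements it is compared against: a string for the string feature, a number for the numeric columns.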

fair_perf_ml.model_bias.core.model_bias_perform_analysis_explicit_segmentation

model_bias_perform_analysis_explicit_segmentation(feature: BiasDataPayload[F], ground_truth: BiasDataPayload[G], prediction: BiasDataPayload[P]) -> AnalysisReport

Provides explicit segmentation criteria for ad hoc bias analysis, as opposed to relying on the default logic in the Rust core, which derives the segmentation from heuristics. Segmentation and data are passed as a single unit in BiasDataPayload.

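Parameters:

Name          Type                Description                                        Default
feature       BiasDataPayload[F]  feature data with its segmentation criteria        required
ground_truth  BiasDataPayload[G]  ground truth data with its segmentation criteria   required
prediction    BiasDataPayload[P]  prediction data with its segmentation criteria     required

Returns: AnalysisReport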


Runtime comparison

fair_perf_ml.model_bias.core.model_bias_runtime_comparison

model_bias_runtime_comparison(baseline: AnalysisReport, comparison: AnalysisReport, threshold: float | None = 0.1) -> DriftReport

Evaluates a runtime analysis report for drift relative to the baseline, across all metrics defined in the library suite. A metric fails when its drift from the baseline value exceeds the user-defined threshold.

Args:
baseline: AnalysisReport -> the result of calling perform_analysis on the baseline data
comparison: AnalysisReport -> the result of calling perform_analysis on the current data
threshold: float | None -> the comparison threshold; defaults to 0.10 in the Rust module

Returns: DriftReport

fair_perf_ml.model_bias.core.model_bias_partial_runtime_comparison

model_bias_partial_runtime_comparison(baseline: AnalysisReport, comparison: AnalysisReport, metrics: list[ModelBiasDriftMetric], threshold: float | None = 0.1) -> DriftReport

Performs the same logic as model_bias_runtime_comparison, but on a limited set of metrics defined by the user. This allows users to narrow the scope to the metrics they are concerned about.

Args:
baseline: AnalysisReport -> the result of calling perform_analysis on the baseline data
comparison: AnalysisReport -> the result of calling perform_analysis on the current data
metrics: list[ModelBiasDriftMetric] -> the metrics to evaluate
threshold: float | None -> the comparison threshold; defaults to 0.10 in the Rust module

Returns: DriftReport
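The idea behind both comparison functions can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's implementation: reports are modeled as flat name-to-value dicts, drift is taken to be the absolute difference from the baseline, and compare_reports and the metric names are hypothetical.

```python
def compare_reports(baseline, comparison, metrics=None, threshold=0.1):
    """Flag metrics whose absolute change from the baseline exceeds threshold.

    Assumptions: reports are flat dicts and drift is an absolute difference.
    Passing metrics=None checks every metric in the baseline.
    """
    names = list(baseline) if metrics is None else list(metrics)
    report = {}
    for name in names:
        drift = abs(comparison[name] - baseline[name])
        report[name] = {"drift": drift, "passed": drift <= threshold}
    return report


# Toy reports (hypothetical metric names and values).
baseline = {"dppl": 0.05, "di": 0.90, "accuracy_difference": 0.02}
latest = {"dppl": 0.20, "di": 0.85, "accuracy_difference": 0.03}

full = compare_reports(baseline, latest)                     # all metrics
partial = compare_reports(baseline, latest, metrics=["di"])  # narrowed scope
```

In this toy data only dppl moves by more than 0.1, so it is the only metric flagged in the full comparison, while the partial comparison reports on di alone.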


Streaming

fair_perf_ml.model_bias.streaming.ModelBiasStreaming

Container that maintains the state of a model bias monitoring session and computes bias metric drift on demand. Accepts arbitrary types that implement eq and ge, which establish the ordering used for segmentation.

Wraps the core Rust logic.

push

push(feature: F, prediction: P, ground_truth: G) -> None

Push a single feature, prediction, and ground truth example into the stream. Type checkers will enforce that the type passed for each value matches the type defined in the segmentation criteria.

Args:
feature: F
prediction: P
ground_truth: G

Returns: None

push_batch

push_batch(feature_data: Sequence[F], ground_truth_data: Sequence[G], prediction_data: Sequence[P]) -> None

Push a batch of feature, prediction, and ground truth examples into the stream. Type checkers will enforce that the type passed for each value matches the type defined in the segmentation criteria. The sequences passed should all be of the same length. If either invariant is violated, an exception will be thrown.

Parameters:

Name               Type         Description                    Default
feature_data       Sequence[F]  feature values to push         required
ground_truth_data  Sequence[G]  ground truth values to push    required
prediction_data    Sequence[P]  prediction values to push      required

Returns: None

reset_baseline

reset_baseline(feature_data: Sequence[F], ground_truth_data: Sequence[G], prediction_data: Sequence[P]) -> None

Resets the baseline state using a new baseline dataset. This reuses the segmentation criteria defined at construction.

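Parameters:

Name               Type         Description                     Default
feature_data       Sequence[F]  new baseline feature data       required
ground_truth_data  Sequence[G]  new baseline ground truth data  required
prediction_data    Sequence[P]  new baseline prediction data    required

Returns: None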

reset_baseline_and_segmentation_criteria

reset_baseline_and_segmentation_criteria(feature_segment_criteria: BiasSegmentationProtocol[F], ground_truth_segment_criteria: BiasSegmentationProtocol[G], prediction_segment_criteria: BiasSegmentationProtocol[P], feature_data: Sequence[F], ground_truth_data: Sequence[G], prediction_data: Sequence[P]) -> None

Resets the baseline state using a new baseline dataset. This is the only method that allows a change in the segmentation criteria used for class segmentation. Resetting the baseline set is required to avoid inconsistent state in runtime class bucketing. Type checkers will enforce that the segmentation criteria use the same types as the data.

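Parameters:

Name                           Type                         Description                              Default
feature_segment_criteria       BiasSegmentationProtocol[F]  new segmentation criteria, feature       required
ground_truth_segment_criteria  BiasSegmentationProtocol[G]  new segmentation criteria, ground truth  required
prediction_segment_criteria    BiasSegmentationProtocol[P]  new segmentation criteria, predictions   required
feature_data                   Sequence[F]                  new baseline feature data                required
ground_truth_data              Sequence[G]                  new baseline ground truth data           required
prediction_data                Sequence[P]                  new baseline prediction data             required

Returns: None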

flush

flush() -> None

Clear all runtime state.

performance_snapshot

performance_snapshot() -> PerformanceSnapshot

Generate a performance snapshot of runtime state, irrespective of the baseline state. An exception will be thrown if no runtime data has been pushed into the stream.

drift_snapshot

drift_snapshot() -> DriftSnapshot

Generate a drift report, detailing the drift from the baseline state observed in the runtime data stream. An exception will be thrown if no runtime data has been pushed into the stream.
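The lifecycle described in this section (set a baseline, push runtime examples, take snapshots, flush) can be illustrated with a toy stand-in. ToyBiasStream below is hypothetical and is not ModelBiasStreaming: it tracks a single quantity, the positive prediction rate, takes only predictions as input, and defines drift as the runtime rate minus the baseline rate. It mirrors only the method names and the documented behavior of raising when no runtime data has been pushed.

```python
from typing import Sequence


class ToyBiasStream:
    """Hypothetical stand-in that tracks only the positive prediction rate."""

    def __init__(self, baseline_predictions: Sequence[int]) -> None:
        self.reset_baseline(baseline_predictions)
        self.flush()

    def reset_baseline(self, predictions: Sequence[int]) -> None:
        """Replace the baseline state with a new baseline dataset."""
        if len(predictions) == 0:
            raise ValueError("baseline data must not be empty")
        self._baseline_rate = sum(predictions) / len(predictions)

    def push(self, prediction: int) -> None:
        """Add one runtime example to the stream."""
        self._count += 1
        self._positives += prediction

    def push_batch(self, predictions: Sequence[int]) -> None:
        """Add several runtime examples to the stream."""
        for prediction in predictions:
            self.push(prediction)

    def flush(self) -> None:
        """Clear all runtime state; the baseline is kept."""
        self._count = 0
        self._positives = 0

    def performance_snapshot(self) -> float:
        """Runtime positive rate, independent of the baseline."""
        if self._count == 0:
            raise RuntimeError("no runtime data has been pushed")
        return self._positives / self._count

    def drift_snapshot(self) -> float:
        """Runtime positive rate minus the baseline positive rate."""
        return self.performance_snapshot() - self._baseline_rate


stream = ToyBiasStream([1, 0, 0, 0])  # baseline: 1 of 4 positive
stream.push(1)
stream.push_batch([1, 0, 1])          # runtime: 3 of 4 positive
perf = stream.performance_snapshot()
drift = stream.drift_snapshot()
stream.flush()                        # snapshots now raise until new data arrives
```

Keeping only running counts means each push is constant-time and snapshots never need to revisit earlier examples, which is the usual motivation for a streaming container of this kind.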