odtlearn.utils.validation
Functions
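| check_ipw(X, ipw) | Check and validate inverse probability weights (IPW). |
| check_y_hat(X, treatments, y_hat) | Check and validate counterfactual predictions (y_hat). |
| check_y(X, y) | Check and validate target values (y). |
| check_columns_match(original_columns, new_data) | Check if the columns in new_data match the original_columns. |
| check_binary(df) | Check if all values in the DataFrame are binary (0 or 1). |
| check_integer(df) | Check if all values in the DataFrame are integers. |
| check_same_as_X(X, X_col_labels, G, G_label) | Check if a DataFrame G has the same structure as X. |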
Module Contents
- odtlearn.utils.validation.check_ipw(X: numpy.ndarray, ipw: numpy.ndarray | pandas.core.series.Series) -> numpy.ndarray
Check and validate inverse probability weights (IPW).
- Parameters:
- X : array-like of shape (n_samples, n_features)
The input samples.
- ipw : array-like of shape (n_samples,)
The inverse probability weights to be checked.
- Returns:
- ipw : ndarray of shape (n_samples,)
The validated and potentially converted inverse probability weights.
- Raises:
- ValueError
If the number of entries in ipw differs from the number of samples in X.
- AssertionError
If any value in ipw is not in the range (0, 1].
Examples
>>> import numpy as np
>>> from odtlearn.utils.validation import check_ipw
>>> X = np.array([[1, 2], [3, 4], [5, 6]])
>>> ipw = np.array([0.5, 0.7, 0.3])
>>> validated_ipw = check_ipw(X, ipw)
>>> print(validated_ipw)
[0.5 0.7 0.3]
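The two documented failure modes can be made concrete with a small stand-in. This is a minimal sketch, not odtlearn's implementation: `sketch_check_ipw` is a hypothetical helper that only mirrors the behavior described above (a ValueError on a sample-count mismatch, an AssertionError for weights outside (0, 1]), and its error messages are invented for illustration.

```python
import numpy as np


def sketch_check_ipw(X, ipw):
    """Hypothetical stand-in mirroring the documented checks of check_ipw."""
    ipw = np.asarray(ipw, dtype=float)  # lists and pandas Series are converted
    n_samples = np.asarray(X).shape[0]
    if ipw.shape[0] != n_samples:
        raise ValueError(f"Found {ipw.shape[0]} weights for {n_samples} samples.")
    # The documented range is the half-open interval (0, 1].
    assert np.all((ipw > 0) & (ipw <= 1)), "ipw values must lie in (0, 1]"
    return ipw


X = np.array([[1, 2], [3, 4], [5, 6]])
ok = sketch_check_ipw(X, [0.5, 0.7, 1.0])

# Record which exception type each invalid input triggers.
bad_inputs = {"too_short": [0.5, 0.7], "zero_weight": [0.5, 0.0, 0.3]}
outcomes = {}
for name, weights in bad_inputs.items():
    try:
        sketch_check_ipw(X, weights)
        outcomes[name] = "ok"
    except (ValueError, AssertionError) as exc:
        outcomes[name] = type(exc).__name__
```

In this sketch a weight of exactly 1.0 is accepted and a weight of 0.0 is rejected, which is what the half-open range implies.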
- odtlearn.utils.validation.check_y_hat(X: numpy.ndarray, treatments: numpy.ndarray, y_hat: pandas.core.frame.DataFrame | numpy.ndarray) -> numpy.ndarray
Check and validate counterfactual predictions (y_hat).
- Parameters:
- X : array-like of shape (n_samples, n_features)
The input samples.
- treatments : array-like
The unique treatment values.
- y_hat : array-like of shape (n_samples, n_treatments)
The counterfactual predictions to be checked.
- Returns:
- y_hat : ndarray of shape (n_samples, n_treatments)
The validated and potentially converted counterfactual predictions.
- Raises:
- ValueError
If the shape of y_hat is inconsistent with the number of samples in X or the number of treatments.
- AssertionError
If y_hat is None.
Examples
>>> import numpy as np
>>> from odtlearn.utils.validation import check_y_hat
>>> X = np.array([[1, 2], [3, 4], [5, 6]])
>>> treatments = [0, 1]
>>> y_hat = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
>>> validated_y_hat = check_y_hat(X, treatments, y_hat)
>>> print(validated_y_hat)
[[0.1 0.2]
 [0.3 0.4]
 [0.5 0.6]]
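The shape requirement on y_hat (one row per sample, one column per treatment) can be illustrated as follows. This is a hedged sketch under the documented contract only: `sketch_check_y_hat` is a hypothetical helper, not the library function, and the exact comparison odtlearn performs may differ.

```python
import numpy as np


def sketch_check_y_hat(X, treatments, y_hat):
    """Hypothetical stand-in mirroring the documented checks of check_y_hat."""
    assert y_hat is not None, "y_hat must be provided"
    y_hat = np.asarray(y_hat, dtype=float)  # DataFrames are converted to ndarray
    expected = (np.asarray(X).shape[0], len(treatments))
    if y_hat.shape != expected:
        raise ValueError(f"Expected y_hat of shape {expected}, got {y_hat.shape}.")
    return y_hat


X = np.array([[1, 2], [3, 4], [5, 6]])
treatments = [0, 1]

good = sketch_check_y_hat(X, treatments, [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

# Three prediction columns for two treatments: the shapes disagree.
try:
    sketch_check_y_hat(X, treatments, np.zeros((3, 3)))
    wrong_width = "ok"
except ValueError:
    wrong_width = "ValueError"

# A missing y_hat trips the assertion before any conversion happens.
try:
    sketch_check_y_hat(X, treatments, None)
    missing = "ok"
except AssertionError:
    missing = "AssertionError"
```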
- odtlearn.utils.validation.check_y(X: numpy.ndarray, y: numpy.ndarray | pandas.core.series.Series) -> numpy.ndarray
Check and validate target values (y).
- Parameters:
- X : array-like of shape (n_samples, n_features)
The input samples.
- y : array-like of shape (n_samples,)
The target values to be checked.
- Returns:
- y : ndarray of shape (n_samples,)
The validated and potentially converted target values.
- Raises:
- ValueError
If the number of entries in y differs from the number of samples in X.
Examples
>>> import numpy as np
>>> from odtlearn.utils.validation import check_y
>>> X = np.array([[1, 2], [3, 4], [5, 6]])
>>> y = np.array([0, 1, 0])
>>> validated_y = check_y(X, y)
>>> print(validated_y)
[0. 1. 0.]
- odtlearn.utils.validation.check_columns_match(original_columns: numpy.ndarray | pandas.core.indexes.base.Index, new_data: numpy.ndarray | pandas.core.frame.DataFrame) -> None
Check if the columns in new_data match the original_columns.
- Parameters:
- original_columns : list
The list of column names from the original data.
- new_data : array-like or pandas.DataFrame
The new data to be checked.
- Returns:
- bool
True if the columns match, False otherwise.
- Raises:
- ValueError
If new_data is a DataFrame and contains columns not present in original_columns.
- AssertionError
If new_data is not a DataFrame and has a different number of columns than original_columns.
Notes
This function performs different checks based on whether new_data is a pandas DataFrame or not:
- For DataFrames: It checks if all columns in new_data are present in original_columns.
- For non-DataFrames: It checks if the number of columns matches the length of original_columns.
Examples
>>> import pandas as pd
>>> from odtlearn.utils.validation import check_columns_match
>>> original_cols = ['A', 'B', 'C']
>>> new_data = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
>>> result = check_columns_match(original_cols, new_data)
>>> print(result)
True
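The two branches described in the Notes can be sketched as below. `sketch_check_columns_match` is a hypothetical helper written only from that description, not the library's code; it assumes a return value of True on success, as the Returns entry states, and its error messages are invented for illustration.

```python
import numpy as np
import pandas as pd


def sketch_check_columns_match(original_columns, new_data):
    """Hypothetical stand-in for the two branches described in the Notes."""
    if isinstance(new_data, pd.DataFrame):
        # DataFrame branch: compare by column name.
        known = set(original_columns)
        unknown = [col for col in new_data.columns if col not in known]
        if unknown:
            raise ValueError(f"Columns {unknown} are not in the original data.")
    else:
        # Array branch: no names are available, so only the width is compared.
        n_cols = np.asarray(new_data).shape[1]
        assert n_cols == len(original_columns), (
            f"Expected {len(original_columns)} columns, got {n_cols}."
        )
    return True


original_cols = ["A", "B", "C"]

matching = sketch_check_columns_match(
    original_cols, pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
)

try:
    sketch_check_columns_match(original_cols, pd.DataFrame({"A": [1], "D": [2]}))
    unknown_name = "ok"
except ValueError:
    unknown_name = "ValueError"

try:
    sketch_check_columns_match(original_cols, np.zeros((2, 2)))
    wrong_width = "ok"
except AssertionError:
    wrong_width = "AssertionError"
```

The array branch cannot detect reordered or renamed columns, since a plain ndarray carries no labels; only the column count is available to compare.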
- odtlearn.utils.validation.check_binary(df: pandas.core.frame.DataFrame | numpy.ndarray) -> None
Check if all values in the DataFrame are binary (0 or 1).
- Parameters:
- df : pandas.DataFrame or array-like
The data to be checked.
- Raises:
- ValueError
If df is a DataFrame and contains columns with non-binary values.
- AssertionError
If df is not a DataFrame and contains non-binary values.
Notes
This function performs different checks based on whether df is a pandas DataFrame or not:
- For DataFrames: It identifies columns containing non-binary values.
- For non-DataFrames: It checks if all values are either 0 or 1.
Examples
>>> import pandas as pd
>>> from odtlearn.utils.validation import check_binary
>>> df = pd.DataFrame({'A': [0, 1, 0], 'B': [1, 1, 0]})
>>> check_binary(df)  # This will not raise an error
>>> df['C'] = [0, 1, 2]
>>> check_binary(df)  # This will raise a ValueError
ValueError: Found columns (['C']) that contain values other than 0 or 1.
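The per-column reporting for DataFrames versus the all-or-nothing check for arrays can be illustrated with a short sketch. `sketch_check_binary` is a hypothetical helper based only on the Notes above, not odtlearn's implementation; the ValueError text reuses the message shown in the example, and the assertion message is invented.

```python
import numpy as np
import pandas as pd


def sketch_check_binary(df):
    """Hypothetical stand-in for the two branches described in the Notes."""
    if isinstance(df, pd.DataFrame):
        # Collect the names of offending columns so the error can report them.
        bad = [col for col in df.columns if not df[col].isin([0, 1]).all()]
        if bad:
            raise ValueError(
                f"Found columns ({bad}) that contain values other than 0 or 1."
            )
    else:
        # Arrays have no column names, so a single global check is made.
        assert np.isin(np.asarray(df), [0, 1]).all(), "Found non-binary values."


df = pd.DataFrame({"A": [0, 1, 0], "B": [1, 1, 0]})
sketch_check_binary(df)  # passes silently

df["C"] = [0, 1, 2]
try:
    sketch_check_binary(df)
    frame_message = ""
except ValueError as exc:
    frame_message = str(exc)

try:
    sketch_check_binary(np.array([[0, 1], [1, 2]]))
    array_outcome = "ok"
except AssertionError:
    array_outcome = "AssertionError"
```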
- odtlearn.utils.validation.check_integer(df)
Check if all values in the DataFrame are integers.
- Parameters:
- df : pandas.DataFrame or array-like
The data to be checked.
- Raises:
- ValueError
If df contains non-integer values.
Examples
>>> import pandas as pd
>>> from odtlearn.utils.validation import check_integer
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
>>> check_integer(df)  # This will not raise an error
>>> df['C'] = [1.5, 2.0, 3.0]
>>> check_integer(df)  # This will raise a ValueError
ValueError: Found non-integer values.
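One way such a check could be written is sketched below. `sketch_check_integer` is a hypothetical helper, and it rests on a loudly stated assumption: a value counts as an integer when its fractional part is zero, so a float such as 2.0 passes. The documentation above does not say whether odtlearn treats whole-valued floats this way.

```python
import numpy as np
import pandas as pd


def sketch_check_integer(df):
    """Hypothetical stand-in; ASSUMES whole-valued floats count as integers."""
    values = np.asarray(df, dtype=float)
    # A value is treated as an integer when its remainder modulo 1 is zero.
    if not np.all(np.mod(values, 1) == 0):
        raise ValueError("Found non-integer values.")


df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
sketch_check_integer(df)  # passes silently

df["C"] = [1.5, 2.0, 3.0]
try:
    sketch_check_integer(df)
    outcome = "ok"
except ValueError as exc:
    outcome = str(exc)
```

Under this assumption only the 1.5 entry in column C triggers the error; the 2.0 and 3.0 entries would pass on their own.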
- odtlearn.utils.validation.check_same_as_X(X, X_col_labels, G, G_label)
Check if a DataFrame G has the same structure as X.
- Parameters:
- X : pandas.DataFrame
The reference DataFrame.
- X_col_labels : array-like
The column labels of X.
- G : pandas.DataFrame or array-like
The DataFrame or array to be checked against X.
- G_label : str
A label for G to be used in error messages.
- Returns:
- pandas.DataFrame
G converted to a DataFrame if it wasn’t already.
- Raises:
- ValueError
If G has a different number of columns than X.
- KeyError
If G is a DataFrame and its columns don’t match X_col_labels.
- TypeError
If G is not a DataFrame and X has non-default column labels.
Examples
>>> import pandas as pd
>>> from odtlearn.utils.validation import check_same_as_X
>>> X = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
>>> G = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
>>> result = check_same_as_X(X, X.columns, G, 'Test DataFrame')
>>> print(result)
   A  B
0  5  7
1  6  8