
Module Contents



Return tuple of the train and test dataframes from the prescriptive tree example notebook


Return a dataframe containing the balance-scale data set from the UCI ML repository.


Returns a tuple with two numpy arrays containing the data used in the first example in the Flow OCT example notebook in the ODTlearn documentation.


Returns tuple with three numpy arrays containing the data used in example 1


An example data set used to demonstrate usage of Flow OCT.


A simulated data set used in the FairOCT example notebook.


Return a dataframe containing the data set for MONK's second problem from the UCI ML repository.


Return tuple of the train and test dataframes from the prescriptive tree example notebook


Return a dataframe containing the balance-scale data set from the UCI ML repository. For attribute information, see https://archive.ics.uci.edu/ml/datasets/Balance+Scale
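The structure of the balance-scale data can be illustrated without downloading anything, because the UCI attribute information defines the class by a simple rule: compare left weight times left distance with right weight times right distance. The sketch below regenerates all attribute combinations from that rule using only the standard library. The field order and the helper name `balance_class` are assumptions made for illustration; the column names of the dataframe returned by the library are not stated here.

```python
from collections import Counter
from itertools import product


def balance_class(left_weight, left_distance, right_weight, right_distance):
    """Hypothetical helper: class label from the UCI balance-scale rule."""
    left = left_weight * left_distance
    right = right_weight * right_distance
    if left > right:
        return "L"  # scale tips to the left
    if left < right:
        return "R"  # scale tips to the right
    return "B"      # balanced


# Each of the four attributes takes the values 1..5, giving 5**4 = 625 rows.
rows = [
    (balance_class(lw, ld, rw, rd), lw, ld, rw, rd)
    for lw, ld, rw, rd in product(range(1, 6), repeat=4)
]

class_counts = Counter(label for label, *_ in rows)
print(len(rows), dict(class_counts))
```

The class counts show that the balanced class is much rarer than the other two, which is worth keeping in mind when evaluating a tree on this data.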


Returns a tuple with two numpy arrays containing the data used in the first example in the Flow OCT example notebook in the ODTlearn documentation. The diagram within the code block shows the training dataset. Our dataset has two binary features (X1 and X2) and two class labels (+1 and -1).

X2
|               |
|               |
1    + +        |    -
|               |
|               |
|---------------|---------------
|               |
0    - - - -    |    + + +
|    - - -      |
|_______0_______|_______1_______ X1

X: numpy array of covariates from training set
y: numpy array of responses from training set
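The diagram can be transcribed into arrays by hand, which is a quick way to check the counts in each cell. This is a minimal sketch, assuming the columns of X are ordered (X1, X2); the row order is arbitrary here and need not match the order the library returns.

```python
import numpy as np

# One entry per cell of the diagram: (X1, X2) coordinates, class label, point count.
coords = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
labels = np.array([-1, 1, 1, -1])
counts = np.array([7, 3, 2, 1])

# Repeat each cell's coordinates and label once per point in that cell.
X = np.repeat(coords, counts, axis=0)
y = np.repeat(labels, counts)

print(X.shape, y.shape)  # (13, 2) (13,)
```

The labels follow an XOR-like pattern across the four cells, so no single split on X1 or X2 separates the two classes.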

Returns tuple with three numpy arrays containing the data used in example 1 of the RobustTree example notebook in the ODTlearn documentation. The diagram within the code block shows the training dataset. Our dataset has two binary features (X1 and X2) and two class labels (+1 and -1).

X2
|               |
|               |
1    + +        |    -
|               |
|               |
|---------------|---------------
|               |
0    - - - -    |    + + +
|    - - -      |
|_______0_______|_______1_______ X1

The third array returned contains a cost vector with the following form:
- Uncertainty in 5 points at [0,0] on X1 can cause it to flip to [1,0] if needed to misclassify
- Uncertainty in 1 point at [1,1] on X2 can cause it to flip to [1,0] if needed to misclassify
- All other points certain

X: numpy array of covariates from training set
y: numpy array of responses from training set
costs: numpy array of costs for each observation in the training set
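One way to picture the cost description is as a per-observation, per-feature array in which a low cost marks a feature value that may flip and a prohibitively high cost marks a certain one. The sketch below is an illustration under several assumptions that are not confirmed by the text above: the (n, 2) layout, the cost values 1 and n + 1, and the choice of which five of the seven points at [0,0] are uncertain. The array actually returned by the library may be laid out differently.

```python
import numpy as np

# Training points transcribed from the diagram, columns ordered (X1, X2).
coords = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
counts = np.array([7, 3, 2, 1])
X = np.repeat(coords, counts, axis=0)
n = X.shape[0]

# Assumption: a cost of n + 1 stands for "certain" (too expensive to perturb).
costs = np.full(X.shape, n + 1)

# Assumption: the first five points at [0,0] are the uncertain ones.
at_00 = np.flatnonzero((X[:, 0] == 0) & (X[:, 1] == 0))
costs[at_00[:5], 0] = 1  # flipping X1 moves these points to [1,0]

# The single point at [1,1] is uncertain on X2.
at_11 = np.flatnonzero((X[:, 0] == 1) & (X[:, 1] == 1))
costs[at_11[:1], 1] = 1  # flipping X2 moves this point to [1,0]

print(costs.shape, int((costs == 1).sum()))  # (13, 2) 6
```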

An example data set used to demonstrate usage of Flow OCT. The diagram within the code block shows the training dataset. Our dataset has two binary features (X1 and X2) and two class labels (+1 and -1). Here the data is imbalanced with the positive class being the minority class.

X2
|               |
|               |
1    + - -      |    -
|               |
|               |
|---------------|---------------
|               |
0    - - - +    |    - - -
|    - - - -    |
|_______0_______|_______1_______ X1

X: numpy array of covariates from training set
y: numpy array of responses from training set
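The degree of imbalance can be read off the diagram by transcribing it into arrays and counting the labels. This is a minimal sketch, assuming the columns of X are ordered (X1, X2); the row order is arbitrary and need not match what the library returns.

```python
import numpy as np

# One entry per (cell, label) group in the diagram, with its point count.
coords = np.array([[0, 0], [0, 0], [1, 0], [0, 1], [0, 1], [1, 1]])
labels = np.array([-1, 1, -1, 1, -1, -1])
counts = np.array([7, 1, 3, 1, 2, 1])

X = np.repeat(coords, counts, axis=0)
y = np.repeat(labels, counts)

# Count how many points carry each class label.
classes, class_counts = np.unique(y, return_counts=True)
minority_share = class_counts.min() / class_counts.sum()

print(dict(zip(classes.tolist(), class_counts.tolist())))  # {-1: 13, 1: 2}
```

With only 2 positive points out of 15, a tree that predicts -1 everywhere already reaches high accuracy, which is why this data set is useful for demonstrating objectives other than plain accuracy.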

A simulated data set used in the FairOCT example notebook. The diagram within the code block visualizes the training data. We have two binary features (X1, X2) and two class labels (+1 and -1). The protected feature is race, which has two levels (B and W). In the visualization of the training data we see, for example, that there are 7 instances with (X1,X2) = (0,1); of these 7 instances, 5 are from race W and 2 are from race B. We also show the breakdown of the instances by class label.

X2
|                     |
|                     |
1    5W: 4(-) 1(+)    |     2W: 1(-) 1(+)
|    2B: 2(-)         |     5B: 3(-) 2(+)
|                     |
|---------------------|---------------------
|                     |
0    4W: 3(-) 1(+)    |     3W: 1(-) 2(+)
|    1B:      1(+)    |     6B: 1(-) 5(+)
|                     |
|__________0__________|__________1__________ X1

X: numpy array of covariates from training set
y: numpy array of responses from training set
protect_feat: numpy array of the protected feature
legit_factor: numpy array of the legitimate factor feature
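The breakdown in the diagram is enough to rebuild the covariates, labels, and protected feature by hand and to compare the positive rate across the two race levels. This is a minimal sketch, assuming the columns of X are ordered (X1, X2) and that race is stored as the strings "W" and "B"; the encoding used by the library and the contents of legit_factor are not stated above, so legit_factor is left out.

```python
import numpy as np

# (X1, X2, race, label, count) for every non-empty group in the diagram.
groups = [
    (0, 1, "W", -1, 4), (0, 1, "W", 1, 1), (0, 1, "B", -1, 2),
    (1, 1, "W", -1, 1), (1, 1, "W", 1, 1), (1, 1, "B", -1, 3), (1, 1, "B", 1, 2),
    (0, 0, "W", -1, 3), (0, 0, "W", 1, 1), (0, 0, "B", 1, 1),
    (1, 0, "W", -1, 1), (1, 0, "W", 1, 2), (1, 0, "B", -1, 1), (1, 0, "B", 1, 5),
]

rows, labels, races = [], [], []
for x1, x2, race, label, count in groups:
    rows.extend([[x1, x2]] * count)
    labels.extend([label] * count)
    races.extend([race] * count)

X = np.array(rows)
y = np.array(labels)
protect_feat = np.array(races)

# Share of positive labels within each level of the protected feature.
positive_rate = {
    level: float((y[protect_feat == level] == 1).mean())
    for level in ("W", "B")
}
print(X.shape, positive_rate)
```

The two groups have the same size (14 instances each) but different positive rates, 5/14 for W and 8/14 for B.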

Return a dataframe containing the data set for MONK’s second problem from the UCI ML repository. For attribute information, see https://archive.ics.uci.edu/ml/datasets/MONK%27s+Problems