[6]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import OneHotEncoder

from odtlearn.flow_oct import FlowOCT, BendersOCT
from odtlearn.utils.binarize import binarize

FlowOCT Examples#

Example 0: Binarization#

The following example shows how to binarize a dataset with categorical and integer features using the built-in function binarize.

[7]:
number_of_child_list = [1, 2, 4, 3, 1, 2, 4, 3, 2, 1]
age_list = [10, 20, 40, 30, 10, 20, 40, 30, 20, 10]
race_list = [
    "Black",
    "White",
    "Hispanic",
    "Black",
    "White",
    "Black",
    "White",
    "Hispanic",
    "Black",
    "White",
]
sex_list = ["M", "F", "M", "M", "F", "M", "F", "M", "M", "F"]
df = pd.DataFrame(
    list(zip(sex_list, race_list, number_of_child_list, age_list)),
    columns=["sex", "race", "num_child", "age"],
)
[8]:
df_enc = binarize(
    df, categorical_cols=["sex", "race"], integer_cols=["num_child", "age"]
)
df_enc
[8]:
   sex_M  race_Black  race_Hispanic  race_White  num_child_1  num_child_2  num_child_3  num_child_4  age_10  age_20  age_30  age_40
0      1           1              0           0            1            1            1            1       1       1       1       1
1      0           0              0           1            0            1            1            1       0       1       1       1
2      1           0              1           0            0            0              0            1       0       0       0       1
3      1           1              0           0            0            0            1            1       0       0       1       1
4      0           0              0           1            1            1            1            1       1       1       1       1
5      1           1              0           0            0            1            1            1       0       1       1       1
6      0           0              0           1            0            0            0            1       0       0       0       1
7      1           0              1           0            0            0            1            1       0       0       1       1
8      1           1              0           0            0            1            1            1       0       1       1       1
9      0           0              0           1            1            1            1            1       1       1       1       1
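The age_* columns in the output above appear to follow a "value <= threshold" indicator encoding (e.g., a row with age = 20 has age_10 = 0 and age_20 = age_30 = age_40 = 1). The sketch below reconstructs the age columns under that assumption; binarize's exact rule is documented in ODTLearn, so treat this as an illustration rather than the library's implementation.

```python
import pandas as pd

# Hypothetical reconstruction of the "value <= threshold" encoding that
# the age_* columns above appear to follow (an assumption, not ODTLearn's
# actual implementation of binarize).
ages = pd.Series([10, 20, 40, 30, 10, 20, 40, 30, 20, 10])
age_enc = pd.DataFrame(
    {f"age_{t}": (ages <= t).astype(int) for t in sorted(ages.unique())}
)
print(age_enc.head(3))
```

The first three rows match the age_10 through age_40 columns shown above.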

Example 1: Varying depth and _lambda#

In this part, we study a simple example and investigate different parameter combinations to provide intuition on how they affect the structure of the tree.

First, we generate the data for our example; the diagram in the code block below depicts the training set. The dataset has two binary features (X1 and X2) and two class labels (+1 and -1).

[9]:
from odtlearn.datasets import flow_oct_example

"""
    X2
    |               |
    |               |
    1    + +        |    -
    |               |
    |---------------|-------------
    |               |
    0    - - - -    |    + + +
    |    - - -      |
    |______0________|_______1_______X1
"""


X, y = flow_oct_example()

Tree with depth = 1#

In the following, we fit a classification tree of depth 1, i.e., a tree with a single branching node and two leaf nodes.

[10]:
stcl = FlowOCT(depth=1, solver="gurobi", time_limit=100)
stcl.fit(X, y)
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-27
Set parameter TimeLimit to value 100
Set parameter NodeLimit to value 1073741824
Set parameter SolutionLimit to value 1073741824
Set parameter IntFeasTol to value 1e-06
Set parameter Method to value 3
Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (mac64[arm])

CPU model: Apple M1 Pro
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 110 rows, 89 columns and 250 nonzeros
Model fingerprint: 0x5e9209d1
Variable types: 84 continuous, 5 integer (5 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Presolve removed 102 rows and 82 columns
Presolve time: 0.00s
Presolved: 8 rows, 7 columns, 16 nonzeros
Variable types: 6 continuous, 1 integer (1 binary)
Found heuristic solution: objective 9.0000000

Root relaxation: objective 1.000000e+01, 2 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

*    0     0               0      10.0000000   10.00000  0.00%     -    0s

Explored 1 nodes (2 simplex iterations) in 0.01 seconds (0.00 work units)
Thread count was 8 (of 8 available processors)

Solution count 2: 10 9

Optimal solution found (tolerance 1.00e-04)
Best objective 1.000000000000e+01, best bound 1.000000000000e+01, gap 0.0000%
[10]:
FlowOCT(solver=gurobi,depth=1,time_limit=100,num_threads=None,verbose=False)
[11]:
predictions = stcl.predict(X)
print(f"Optimality gap is {stcl._solver.optim_gap}")
print(f"In-sample accuracy is {np.sum(predictions==y)/y.shape[0]}")
Optimality gap is 0.0
In-sample accuracy is 0.7692307692307693

As we can see above, the solver finds the optimal tree, and the in-sample accuracy is about 77%.

ODTlearn provides two different ways of visualizing the structure of the tree. The first method prints the structure of the tree in the console:

[12]:
stcl.print_tree()
#########node  1
branch on X_0
#########node  2
leaf 0
#########node  3
leaf 1

The second method plots the structure of the tree using matplotlib:

[13]:
fig, ax = plt.subplots(figsize=(5, 5))
stcl.plot_tree(ax=ax)
plt.show()
../_images/notebooks_FlowOCT_13_0.png

Tree with depth = 2#

Now we increase the depth of the tree to achieve higher accuracy.

[14]:
stcl = FlowOCT(depth=2, solver="gurobi")
stcl.fit(X, y)
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-27
Set parameter TimeLimit to value 60
Set parameter NodeLimit to value 1073741824
Set parameter SolutionLimit to value 1073741824
Set parameter IntFeasTol to value 1e-06
Set parameter Method to value 3
Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (mac64[arm])

CPU model: Apple M1 Pro
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 274 rows, 209 columns and 642 nonzeros
Model fingerprint: 0x51470803
Variable types: 196 continuous, 13 integer (13 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective -0.0000000
Presolve removed 250 rows and 187 columns
Presolve time: 0.00s
Presolved: 24 rows, 22 columns, 76 nonzeros
Found heuristic solution: objective 8.0000000
Variable types: 16 continuous, 6 integer (6 binary)

Root relaxation: objective 1.300000e+01, 14 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

*    0     0               0      13.0000000   13.00000  0.00%     -    0s

Explored 1 nodes (14 simplex iterations) in 0.00 seconds (0.00 work units)
Thread count was 8 (of 8 available processors)

Solution count 3: 13 8 -0

Optimal solution found (tolerance 1.00e-04)
Best objective 1.300000000000e+01, best bound 1.300000000000e+01, gap 0.0000%
[14]:
FlowOCT(solver=gurobi,depth=2,time_limit=60,num_threads=None,verbose=False)
[15]:
predictions = stcl.predict(X)
print(f"In-sample accuracy is {np.sum(predictions==y)/y.shape[0]}")
In-sample accuracy is 1.0

As we can see, with depth 2, we can achieve 100% in-sample accuracy.

[16]:
fig, ax = plt.subplots(figsize=(10, 5))
stcl.plot_tree(ax=ax, fontsize=20)
plt.show()
../_images/notebooks_FlowOCT_18_0.png

Tree with depth = 2 and positive _lambda#

As we saw in the example above, a tree of depth 2 fully classifies the training data. However, if we add a regularization term with a high enough value of _lambda, pruning one of the branching nodes becomes worthwhile, yielding a sparser tree. In the following, we observe that as we increase _lambda from 0 to 0.51, one of the branching nodes is pruned and, as a result, the in-sample accuracy drops to 92%.

[17]:
stcl = FlowOCT(solver="gurobi", depth=2, _lambda=0.51)
stcl.fit(X, y)
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-27
Set parameter TimeLimit to value 60
Set parameter NodeLimit to value 1073741824
Set parameter SolutionLimit to value 1073741824
Set parameter IntFeasTol to value 1e-06
Set parameter Method to value 3
Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (mac64[arm])

CPU model: Apple M1 Pro
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 274 rows, 209 columns and 642 nonzeros
Model fingerprint: 0x3e3f455b
Variable types: 196 continuous, 13 integer (13 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [5e-01, 5e-01]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective -0.0000000
Presolve removed 250 rows and 187 columns
Presolve time: 0.00s
Presolved: 24 rows, 22 columns, 76 nonzeros
Found heuristic solution: objective 3.9200000
Variable types: 16 continuous, 6 integer (6 binary)

Root relaxation: objective 5.105000e+00, 14 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0    5.10500    0    2    3.92000    5.10500  30.2%     -    0s
H    0     0                       4.8600000    5.10500  5.04%     -    0s

Explored 1 nodes (14 simplex iterations) in 0.01 seconds (0.00 work units)
Thread count was 8 (of 8 available processors)

Solution count 3: 4.86 3.92 -0

Optimal solution found (tolerance 1.00e-04)
Best objective 4.860000000000e+00, best bound 4.860000000000e+00, gap 0.0000%
[17]:
FlowOCT(solver=gurobi,depth=2,time_limit=60,num_threads=None,verbose=False)
[18]:
predictions = stcl.predict(X)
print(f"In-sample accuracy is {np.sum(predictions==y)/y.shape[0]}")
In-sample accuracy is 0.9230769230769231
[19]:
fig, ax = plt.subplots(figsize=(10, 5))
stcl.plot_tree(ax=ax, fontsize=20)
plt.show()
../_images/notebooks_FlowOCT_22_0.png
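For intuition, the numbers in the log above are consistent with a regularized objective of the form (1 - _lambda) * (number of correctly classified points) - _lambda * (number of branching nodes): with _lambda = 0.51, a tree that classifies 12 of the 13 points correctly using 2 branching nodes scores 0.49 * 12 - 0.51 * 2 = 4.86, matching the solver's best objective. This is a back-of-the-envelope reconstruction, not ODTLearn's exact formulation; see Aghaei et al. (2021) for the precise model.

```python
# Hypothetical reconstruction of the regularized objective (an assumption,
# not ODTLearn's exact formulation -- see Aghaei et al. 2021):
#   (1 - lam) * (# correctly classified) - lam * (# branching nodes)
def regularized_objective(n_correct, n_branching, lam):
    return (1 - lam) * n_correct - lam * n_branching

# depth-2 run with _lambda = 0: all 13 points correct, regularization off
print(regularized_objective(13, 3, 0.0))   # 13.0, as in the unregularized run
# pruned tree with _lambda = 0.51: 12 of 13 correct, 2 branching nodes
print(regularized_objective(12, 2, 0.51))  # ~4.86, the solver's best objective
```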

Example 2: Different Objective Functions#

In the following, we consider a toy example with imbalanced data, where the positive class is the minority class.

[20]:
"""
    X2
    |               |
    |               |
    1    + - -      |    -
    |               |
    |---------------|--------------
    |               |
    0    - - - +    |    - - -
    |    - - - -    |
    |______0________|_______1_______X1
"""
X = np.array(
    [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0],
     [1, 0], [1, 0], [1, 0],
     [1, 1],
     [0, 1], [0, 1], [0, 1]]
)
y = np.array([0, 0, 0, 0, 0, 0, 0, 1,
              0, 0, 0,
              0,
              1, 0, 0])

Tree with classification accuracy objective#

[21]:
stcl_acc = BendersOCT(solver="gurobi", depth=2, obj_mode="acc")
stcl_acc.fit(X, y)

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-27
Set parameter TimeLimit to value 60
Set parameter NodeLimit to value 1073741824
Set parameter SolutionLimit to value 1073741824
Set parameter LazyConstraints to value 1
Set parameter IntFeasTol to value 1e-06
Set parameter Method to value 3
Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (mac64[arm])

CPU model: Apple M1 Pro
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 14 rows, 42 columns and 44 nonzeros
Model fingerprint: 0x012b1ab9
Variable types: 29 continuous, 13 integer (13 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Presolve removed 4 rows and 4 columns
Presolve time: 0.00s
Presolved: 10 rows, 38 columns, 36 nonzeros
Variable types: 29 continuous, 9 integer (9 binary)
Root relaxation presolved: 10 rows, 36 columns, 46 nonzeros


Root relaxation: objective 1.500000e+01, 2 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0   15.00000    0    -          -   15.00000      -     -    0s
     0     0   14.00000    0    -          -   14.00000      -     -    0s
     0     0   14.00000    0    -          -   14.00000      -     -    0s
     0     0   14.00000    0    -          -   14.00000      -     -    0s
     0     0   14.00000    0    -          -   14.00000      -     -    0s
     0     0   14.00000    0    -          -   14.00000      -     -    0s
     0     0   14.00000    0    6          -   14.00000      -     -    0s
H    0     0                      13.0000000   14.00000  7.69%     -    0s
     0     0   14.00000    0    -   13.00000   14.00000  7.69%     -    0s
     0     0   14.00000    0    4   13.00000   14.00000  7.69%     -    0s
     0     0   14.00000    0    6   13.00000   14.00000  7.69%     -    0s
     0     0   13.80000    0    6   13.00000   13.80000  6.15%     -    0s
     0     0   13.66667    0    6   13.00000   13.66667  5.13%     -    0s
     0     0   13.66667    0    6   13.00000   13.66667  5.13%     -    0s
     0     2   13.66667    0    6   13.00000   13.66667  5.13%     -    0s

Cutting planes:
  MIR: 3
  Flow cover: 4
  Lazy constraints: 24

Explored 6 nodes (43 simplex iterations) in 0.02 seconds (0.00 work units)
Thread count was 8 (of 8 available processors)

Solution count 1: 13

Optimal solution found (tolerance 1.00e-04)
Best objective 1.300000000000e+01, best bound 1.300000000000e+01, gap 0.0000%

User-callback calls 221, time in user-callback 0.01 sec
[21]:
BendersOCT(solver=gurobi,depth=2,time_limit=60,num_threads=None,verbose=False)
[22]:
predictions = stcl_acc.predict(X)
print(f"In-sample accuracy is {np.sum(predictions==y)/y.shape[0]}")
In-sample accuracy is 0.8666666666666667
[23]:
stcl_acc.print_tree()
#########node  1
branch on X_0
#########node  2
branch on X_1
#########node  3
branch on X_1
#########node  4
leaf 0
#########node  5
leaf 0
#########node  6
leaf 0
#########node  7
leaf 0
[24]:
fig, ax = plt.subplots(figsize=(10, 5))
stcl_acc.plot_tree(ax=ax, fontsize=20)
plt.show()
../_images/notebooks_FlowOCT_29_0.png

Tree with balanced classification accuracy objective#

[25]:
stcl_balance = FlowOCT(
    solver="gurobi",
    depth=2,
    obj_mode="balance",
    _lambda=0,
    verbose=False,
)
stcl_balance.fit(X, y)
predictions = stcl_balance.predict(X)
print(f"In-sample accuracy is {np.sum(predictions==y)/y.shape[0]}")
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-27
Set parameter TimeLimit to value 60
Set parameter NodeLimit to value 1073741824
Set parameter SolutionLimit to value 1073741824
Set parameter IntFeasTol to value 1e-06
Set parameter Method to value 3
Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (mac64[arm])

CPU model: Apple M1 Pro
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 314 rows, 237 columns and 734 nonzeros
Model fingerprint: 0x3bc6df12
Variable types: 224 continuous, 13 integer (13 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [4e-02, 2e-01]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective -0.0000000
Presolve removed 269 rows and 203 columns
Presolve time: 0.00s
Presolved: 45 rows, 34 columns, 134 nonzeros
Found heuristic solution: objective 0.5000000
Variable types: 27 continuous, 7 integer (7 binary)

Root relaxation: objective 7.500000e-01, 31 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0    0.75000    0    1    0.50000    0.75000  50.0%     -    0s
H    0     0                       0.5288462    0.75000  41.8%     -    0s
H    0     0                       0.6730769    0.75000  11.4%     -    0s
     0     0    0.67308    0    4    0.67308    0.67308  0.00%     -    0s

Cutting planes:
  MIR: 1
  Flow cover: 1

Explored 1 nodes (36 simplex iterations) in 0.00 seconds (0.00 work units)
Thread count was 8 (of 8 available processors)

Solution count 3: 0.673077 0.5 -0
No other solutions better than 0.673077

Optimal solution found (tolerance 1.00e-04)
Best objective 6.730769230769e-01, best bound 6.730769230769e-01, gap 0.0000%
In-sample accuracy is 0.8
[26]:
fig, ax = plt.subplots(figsize=(10, 5))
stcl_balance.plot_tree(ax=ax, fontsize=20)
plt.show()
../_images/notebooks_FlowOCT_32_0.png

As we can see, when we maximize accuracy, i.e., when obj_mode = 'acc', the optimal tree predicts the majority class for the whole dataset (every leaf of the fitted tree predicts class 0, so its branching is vacuous). But when we switch to the balanced accuracy objective, we account for the minority class at the cost of overall accuracy.
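Balanced accuracy is the mean of the per-class recalls, so the minority class carries the same weight as the majority class. The sketch below uses hypothetical predictions that get 11 of 13 negatives and 1 of 2 positives right; the resulting value, (11/13 + 1/2) / 2 ≈ 0.6731, is consistent with the best objective reported for the balanced run above, while the plain accuracy of the same predictions is 12/15 = 0.8, matching the printed in-sample accuracy.

```python
import numpy as np

# Balanced accuracy = mean of per-class recalls (equivalent in spirit to
# sklearn.metrics.balanced_accuracy_score).
def balanced_accuracy(y_true, y_pred):
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))

# 13 negatives and 2 positives, as in the toy data of Example 2;
# hypothetical predictions: 11/13 negatives and 1/2 positives correct
y_true = np.array([0] * 13 + [1] * 2)
y_pred = np.array([0] * 11 + [1] * 2 + [1, 0])
print(balanced_accuracy(y_true, y_pred))  # (11/13 + 1/2) / 2 ~ 0.6731
```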

Example 3: UCI Data Example#

In this section, we fit a tree of depth 3 on a real-world dataset, the `balance scale dataset <https://archive.ics.uci.edu/ml/datasets/Balance+Scale>`__ from the UCI Machine Learning Repository.

[27]:
import pandas as pd
from sklearn.model_selection import train_test_split
from odtlearn.datasets import balance_scale_data
[28]:
# read data
data = balance_scale_data()
print(f"shape: {data.shape}")
data.columns
shape: (625, 21)
[28]:
Index(['V2.1', 'V2.2', 'V2.3', 'V2.4', 'V2.5', 'V3.1', 'V3.2', 'V3.3', 'V3.4',
       'V3.5', 'V4.1', 'V4.2', 'V4.3', 'V4.4', 'V4.5', 'V5.1', 'V5.2', 'V5.3',
       'V5.4', 'V5.5', 'target'],
      dtype='object')
[29]:
y = data.pop("target")

X_train, X_test, y_train, y_test = train_test_split(
    data, y, test_size=0.33, random_state=42
)
[30]:
stcl = BendersOCT(solver="gurobi", depth=3, time_limit=200, obj_mode="acc", verbose=True)

stcl.fit(X_train, y_train)
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-27
Set parameter TimeLimit to value 200
Set parameter NodeLimit to value 1073741824
Set parameter SolutionLimit to value 1073741824
Set parameter LazyConstraints to value 1
Set parameter IntFeasTol to value 1e-06
Set parameter Method to value 3
Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (mac64[arm])

CPU model: Apple M1 Pro
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 30 rows, 618 columns and 249 nonzeros
Model fingerprint: 0xaeff5467
Variable types: 463 continuous, 155 integer (155 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Presolve removed 8 rows and 8 columns
Presolve time: 0.00s
Presolved: 22 rows, 610 columns, 233 nonzeros
Variable types: 463 continuous, 147 integer (147 binary)
Root relaxation presolved: 412 rows, 610 columns, 7100 nonzeros

Concurrent LP optimizer: primal simplex, dual simplex, and barrier
Showing barrier log only...

Root barrier log...

Ordering time: 0.00s

Barrier performed 0 iterations in 0.18 seconds (0.00 work units)
Barrier solve interrupted - model solved by another algorithm


Solved with dual simplex

Root relaxation: objective 4.180000e+02, 77 iterations, 0.01 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0  418.00000    0    4          -  418.00000      -     -    0s
H    0     0                     269.0000000  418.00000  55.4%     -    0s
     0     0  418.00000    0    4  269.00000  418.00000  55.4%     -    0s
     0     2  413.60000    0    9  269.00000  413.60000  53.8%     -    0s
H  308   294                     272.0000000  413.60000  52.1%  75.5    1s
H  483   460                     274.0000000  413.60000  50.9%  75.6    1s
*  533   475              54     290.0000000  413.60000  42.6%  73.6    1s
H 1310   970                     291.0000000  412.40000  41.7%  75.5    4s
  1528  1113  344.00000   11    4  291.00000  412.40000  41.7%  71.8    5s
  1568  1140  353.62500   27   27  291.00000  408.55187  40.4%  70.0   10s
  1641  1190  354.00000   43   21  291.00000  403.00436  38.5%  82.0   15s
H 2163  1398                     293.0000000  400.54472  36.7%  92.9   19s
  2327  1443  399.64465   48   27  293.00000  399.64465  36.4%  92.2   20s
* 2375  1361             111     296.0000000  399.64465  35.0%  91.6   20s
H 3419  1561                     299.0000000  398.14472  33.2%  86.5   22s
H 4454  1024                     312.0000000  398.14472  27.6%  85.7   23s
  6323  1758  351.06250   59   15  312.00000  367.56601  17.8%  84.0   25s
 12327  3931  314.50000   61    4  312.00000  345.59655  10.8%  80.1   30s
H17827  5595                     314.0000000  338.16497  7.70%  67.8   31s
 25506  8599  329.85714   80   15  314.00000  331.00000  5.41%  59.9   35s
H31789  9084                     315.0000000  329.54167  4.62%  57.5   38s
 32844  9456  324.87500   56    9  315.00000  329.05882  4.46%  57.1   40s
H35037  8771                     316.0000000  328.02174  3.80%  56.5   40s
H36500  6760                     318.0000000  327.50000  2.99%  56.0   41s
 44613  6611     cutoff   67       318.00000  324.00000  1.89%  53.7   45s
 57158  4133     cutoff   93       318.00000  321.00000  0.94%  50.0   50s

Cutting planes:
  MIR: 120
  Flow cover: 193
  Lazy constraints: 236

Explored 70775 nodes (3292092 simplex iterations) in 54.97 seconds (141.79 work units)
Thread count was 8 (of 8 available processors)

Solution count 10: 318 316 315 ... 290

Optimal solution found (tolerance 1.00e-04)
Best objective 3.180000000000e+02, best bound 3.180000000000e+02, gap 0.0000%

User-callback calls 150649, time in user-callback 1.65 sec
[30]:
BendersOCT(solver=gurobi,depth=3,time_limit=200,num_threads=None,verbose=True)
[31]:
stcl.print_tree()
#########node  1
branch on V3.2
#########node  2
branch on V3.1
#########node  3
branch on V5.1
#########node  4
branch on V2.1
#########node  5
branch on V5.1
#########node  6
branch on V4.1
#########node  7
branch on V2.1
#########node  8
leaf 2
#########node  9
leaf 3
#########node  10
leaf 3
#########node  11
leaf 2
#########node  12
leaf 3
#########node  13
leaf 2
#########node  14
leaf 2
#########node  15
leaf 3
[32]:
fig, ax = plt.subplots(figsize=(20, 10))
stcl.plot_tree(ax=ax, fontsize=20, color_dict={"node": None, "leaves": []})
plt.show()
../_images/notebooks_FlowOCT_40_0.png
[33]:
test_pred = stcl.predict(X_test)
print(f"The out-of-sample accuracy is {np.sum(test_pred==y_test)/y_test.shape[0]}")
The out-of-sample accuracy is 0.6859903381642513
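Beyond a single accuracy number, a confusion matrix shows *where* the tree errs on the test set. The sketch below uses stand-in arrays (the names y_test and test_pred, and the class labels 1-3, are placeholders for the objects in the cell above) so that it runs on its own:

```python
import numpy as np
import pandas as pd

# Hypothetical follow-up: tabulate true vs. predicted labels.
# Stand-in arrays keep this sketch self-contained; substitute the real
# y_test and test_pred from the cell above to inspect the fitted tree.
y_test = pd.Series([1, 2, 3, 3, 2, 1, 2], name="true")
test_pred = np.array([1, 2, 3, 2, 2, 1, 3])
cm = pd.crosstab(y_test, pd.Series(test_pred, name="pred"))
print(cm)  # diagonal entries are correct predictions
```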

References#

  • Dua, D. and Graff, C. (2019). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

  • Aghaei, S., Gómez, A., & Vayanos, P. (2021). Strong optimal classification trees. arXiv preprint arXiv:2103.15965. https://arxiv.org/abs/2103.15965.