CART continued -- incorporating forced choice trials
We relax our earlier assumption of discarding forced choice trials. We now include them, both successful and violation trials, as lookback features in the CART model, so that we can examine the feature importance of forced-choice and violation history.
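A minimal sketch of the pipeline, under stated assumptions: the arrays below are synthetic stand-ins for the real session records, and the feature-building helper is hypothetical, but the feature names (choice_k, reward_k, forced_k, violated_k for each lookback k) mirror the importance tables that follow, and the classifier, cross-validation, and importance readout use standard scikit-learn calls.

```python
# Sketch of the CART-with-lookbacks pipeline. Synthetic binary trial
# records stand in for the real rat sessions; with n_lookback = 2 the
# feature matrix has 4 * 2 = 8 columns, matching the tables below.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials = 1000
choice = rng.integers(0, 2, n_trials)    # left/right choice on each trial
reward = rng.integers(0, 2, n_trials)    # rewarded or not
forced = rng.integers(0, 2, n_trials)    # forced-choice trial flag
violated = rng.integers(0, 2, n_trials)  # violation flag

def lookback_features(n_lookback):
    """Build an (observations x 4*n_lookback) matrix of lagged features."""
    cols, names = [], []
    for lag in range(1, n_lookback + 1):
        for name, arr in [("choice", choice), ("reward", reward),
                          ("forced", forced), ("violated", violated)]:
            # Feature "<name>_<lag>" is that trial variable, lag trials back.
            cols.append(arr[n_lookback - lag:n_trials - lag])
            names.append(f"{name}_{lag}")
    X = np.column_stack(cols)
    y = choice[n_lookback:]              # target: the current trial's choice
    return X, y, names

X, y, names = lookback_features(n_lookback=2)
clf = DecisionTreeClassifier(max_depth=5, random_state=0)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"ROC AUC: {auc.mean():.2f} (+/- {auc.std():.2f})")

clf.fit(X, y)
for name, imp in sorted(zip(names, clf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:12s} {imp:.4f}")
```

Note the alignment convention assumed here: feature `choice_1` is the choice one trial before the predicted trial, `choice_2` two trials before, and so on.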
Here are the feature importances for randomly chosen sessions from one randomly chosen rat:
M018
chosen sessions: [660 919 745 741 722 970 858 831 629 867]
************************************************************
lookback number: 2
************************************************************
Feature space holds 7089 observations and 8 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.04)
ROC AUC: 0.84 (+/- 0.01)
Average feature importances:
1 choice_1 0.370855
2 choice_2 0.289385
3 reward_1 0.188118
4 reward_2 0.097207
5 forced_2 0.024999
6 forced_1 0.022097
7 violated_1 0.004253
8 violated_2 0.003087
************************************************************
lookback number: 3
************************************************************
Feature space holds 7079 observations and 12 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.04)
ROC AUC: 0.84 (+/- 0.02)
Average feature importances:
1 choice_1 0.257356
2 choice_2 0.194156
3 reward_1 0.183929
4 choice_3 0.123622
5 reward_2 0.089906
6 reward_3 0.052798
7 forced_2 0.035486
8 forced_1 0.029653
9 forced_3 0.026102
10 violated_1 0.002853
************************************************************
lookback number: 4
************************************************************
Feature space holds 7069 observations and 16 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.02)
ROC AUC: 0.83 (+/- 0.02)
Average feature importances:
1 choice_1 0.191105
2 reward_1 0.163980
3 choice_2 0.134632
4 choice_3 0.095324
5 choice_4 0.090378
6 reward_2 0.082217
7 reward_3 0.053217
8 reward_4 0.047355
9 forced_2 0.037090
10 forced_1 0.034463
************************************************************
lookback number: 5
************************************************************
Feature space holds 7059 observations and 20 features
Unique target labels: [0 1]
Accuracy: 0.75 (+/- 0.02)
ROC AUC: 0.82 (+/- 0.03)
Average feature importances:
1 choice_1 0.149854
2 reward_1 0.132662
3 choice_2 0.110324
4 choice_3 0.077954
5 choice_4 0.077137
6 reward_2 0.073266
7 choice_5 0.061055
8 reward_3 0.052285
9 reward_4 0.048660
10 reward_5 0.046412
And, as before, averaging over many rats and many sessions, here are the average ROC AUC scores for each n_lookback setting:
n = 2 : average ROC across folds = 0.8391 (+/- 0.0596)
n = 3 : average ROC across folds = 0.8344 (+/- 0.0695)
n = 4 : average ROC across folds = 0.8232 (+/- 0.0792)
n = 5 : average ROC across folds = 0.8175 (+/- 0.0848)
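The summary lines above can be reproduced with a simple aggregation step. A sketch under stated assumptions: `fold_scores` is a hypothetical container holding the pooled per-fold ROC AUC values from every (rat, session, CV fold) run at a given lookback, and the values filled in here are placeholders, not the real scores.

```python
# Aggregate per-fold ROC AUC scores by lookback setting and report
# mean +/- std, in the same format as the summary above. The score
# arrays here are placeholder values, not the actual results.
import numpy as np

fold_scores = {
    2: np.array([0.90, 0.84, 0.78, 0.83, 0.85]),
    3: np.array([0.88, 0.83, 0.77, 0.84, 0.85]),
}
for n, scores in sorted(fold_scores.items()):
    print(f"n = {n} : average ROC across folds = "
          f"{scores.mean():.4f} (+/- {scores.std():.4f})")
```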
Some subsequent questions to look at:
- interpretability: the model is probabilistic in that each leaf stores the proportion of each class among the training observations it holds. Is ROC AUC a better metric here than normalized likelihood?
- tree GP
- logistic regression at the nodes (see the Michael Jordan paper and the Logistic Model Tree paper)
- cross validation strategies
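On the first question above: a fitted CART already exposes the per-leaf class proportions through `predict_proba`, so the probabilistic reading costs nothing extra. A minimal sketch on synthetic data (hypothetical features, not the rat sessions):

```python
# Each leaf of a fitted CART stores class counts; predict_proba returns
# the class-1 proportion at the leaf an observation falls into, so every
# observation routed to the same leaf gets the same predicted probability.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(500, 4))
y = (X[:, 0] ^ (rng.random(500) < 0.2)).astype(int)  # noisy copy of col 0

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
leaf_ids = tree.apply(X)             # leaf index for each observation
proba = tree.predict_proba(X)[:, 1]  # P(y=1): class-1 proportion at the leaf

# Verify: observations sharing a leaf share a predicted probability.
for leaf in np.unique(leaf_ids):
    p = proba[leaf_ids == leaf]
    assert np.allclose(p, p[0])
print(f"{len(np.unique(leaf_ids))} leaves; leaf probabilities:",
      np.round(np.unique(proba), 3))
```

These leaf proportions are exactly what a normalized-likelihood evaluation would score, so the metric question reduces to whether we care about ranking (ROC AUC) or calibrated probability (likelihood).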
Written on October 12, 2015