CART continued -- incorporating forced choice trials

We now relax our previous assumption, under which forced choice trials were thrown out. Instead, we incorporate forced choice trials (both successful and violated ones) into our CART model and examine the feature importance of the forced-trial and violation lookback features.
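
To make the feature layout concrete, here is a minimal sketch of how such lookback features could be built. The column names (`session`, `choice`, `reward`, `forced`, `violated`), the helper `make_lookback_features`, and the use of the current trial's choice as the target are all assumptions for illustration, not the code behind the results in this post. Lags are computed within each session, which is consistent with the observation counts below dropping by 10 per extra lookback step for 10 sessions.

```python
import pandas as pd

def make_lookback_features(trials, n_lookback):
    """Build lagged features per session (hypothetical helper).

    For each trial, X holds choice/reward/forced/violated from the previous
    n_lookback trials; y is the current choice (an assumed target).
    """
    cols = ["choice", "reward", "forced", "violated"]
    frames = []
    for _, sess in trials.groupby("session", sort=False):
        feats = {}
        for k in range(1, n_lookback + 1):
            shifted = sess[cols].shift(k)  # values from k trials back
            for c in cols:
                feats[f"{c}_{k}"] = shifted[c]
        block = pd.DataFrame(feats, index=sess.index)
        block["target"] = sess["choice"]
        # The first n_lookback trials of a session have no full history.
        frames.append(block.dropna())
    out = pd.concat(frames)
    return out.drop(columns="target"), out["target"].astype(int)

# Toy data: two sessions of five trials each (made-up values).
trials = pd.DataFrame({
    "session":  [1] * 5 + [2] * 5,
    "choice":   [0, 1, 1, 0, 1, 1, 0, 0, 1, 0],
    "reward":   [1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
    "forced":   [0, 0, 1, 0, 0, 1, 0, 0, 0, 1],
    "violated": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
})
X, y = make_lookback_features(trials, n_lookback=2)
print(X.shape)
print(list(X.columns))
```

Each session loses its first `n_lookback` trials, so lags never reach across a session boundary.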

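Here are the feature importances for ten randomly chosen sessions from one randomly chosen rat (M018). The suffix on each feature name is the lookback index (choice_1 is the choice one trial back). For lookbacks of 3 and above, the listing shows only the ten highest-ranked features: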


M018
chosen sessions: [660 919 745 741 722 970 858 831 629 867]
************************************************************
lookback number: 2
************************************************************

Feature space holds 7089 observations and 8 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.04)
ROC AUC: 0.84 (+/- 0.01)
Average feature importances:
1	choice_1	0.370855
2	choice_2	0.289385
3	reward_1	0.188118
4	reward_2	0.097207
5	forced_2	0.024999
6	forced_1	0.022097
7	violated_1	0.004253
8	violated_2	0.003087
************************************************************
lookback number: 3
************************************************************

Feature space holds 7079 observations and 12 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.04)
ROC AUC: 0.84 (+/- 0.02)
Average feature importances:
1	choice_1	0.257356
2	choice_2	0.194156
3	reward_1	0.183929
4	choice_3	0.123622
5	reward_2	0.089906
6	reward_3	0.052798
7	forced_2	0.035486
8	forced_1	0.029653
9	forced_3	0.026102
10	violated_1	0.002853
************************************************************
lookback number: 4
************************************************************

Feature space holds 7069 observations and 16 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.02)
ROC AUC: 0.83 (+/- 0.02)
Average feature importances:
1	choice_1	0.191105
2	reward_1	0.163980
3	choice_2	0.134632
4	choice_3	0.095324
5	choice_4	0.090378
6	reward_2	0.082217
7	reward_3	0.053217
8	reward_4	0.047355
9	forced_2	0.037090
10	forced_1	0.034463
************************************************************
lookback number: 5
************************************************************

Feature space holds 7059 observations and 20 features
Unique target labels: [0 1]
Accuracy: 0.75 (+/- 0.02)
ROC AUC: 0.82 (+/- 0.03)
Average feature importances:
1	choice_1	0.149854
2	reward_1	0.132662
3	choice_2	0.110324
4	choice_3	0.077954
5	choice_4	0.077137
6	reward_2	0.073266
7	choice_5	0.061055
8	reward_3	0.052285
9	reward_4	0.048660
10	reward_5	0.046412
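
For reference, an "average feature importances" table like the ones above could be produced by fitting one tree per cross-validation fold and averaging `feature_importances_`. The sketch below assumes that procedure and runs on synthetic data; the feature names, tree depth, and fold count are illustrative choices, not the settings used for M018.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
names = ["choice_1", "reward_1", "forced_1", "violated_1"]

# Synthetic binary features; the toy target follows choice_1 85% of the time.
X = rng.integers(0, 2, size=(2000, len(names)))
y = np.where(rng.random(2000) < 0.85, X[:, 0], 1 - X[:, 0])

importances = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, _ in cv.split(X, y):
    clf = DecisionTreeClassifier(max_depth=4, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    importances.append(clf.feature_importances_)  # sums to 1 within a fold

# Average over folds and rank from most to least important.
avg = pd.Series(np.mean(importances, axis=0), index=names)
avg = avg.sort_values(ascending=False)
print(avg)
```

Because the importances sum to 1 within each fold, the averaged values also sum to 1, so adding more lookback features necessarily spreads the same total over more columns.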

As before, here are the average ROC AUC scores for each n_lookback setting, computed over many rats and many sessions:


n = 2 : average ROC across folds = 0.8391 (+/- 0.0596)
n = 3 : average ROC across folds = 0.8344 (+/- 0.0695)
n = 4 : average ROC across folds = 0.8232 (+/- 0.0792)
n = 5 : average ROC across folds = 0.8175 (+/- 0.0848)
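
A sweep like the one above can be sketched with scikit-learn's `cross_val_score` and `scoring="roc_auc"`. The data here are synthetic (a choice sequence that repeats the previous choice with probability 0.8), and the tree depth, fold count, and the `lagged` helper are assumptions. The post does not say whether its "+/-" is one or two standard deviations; this sketch prints one.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for a session: each choice repeats the previous one
# with probability 0.8; rewards are random and carry no signal.
n_trials = 2000
choice = np.zeros(n_trials, dtype=int)
for t in range(1, n_trials):
    stay = rng.random() < 0.8
    choice[t] = choice[t - 1] if stay else 1 - choice[t - 1]
reward = rng.integers(0, 2, size=n_trials)

def lagged(x, n_lookback):
    """Column k-1 holds x from k trials back; rows start at trial n_lookback."""
    return np.column_stack(
        [x[n_lookback - k : len(x) - k] for k in range(1, n_lookback + 1)]
    )

mean_auc = {}
for n in (2, 3, 4, 5):
    X = np.hstack([lagged(choice, n), lagged(reward, n)])
    y = choice[n:]
    clf = DecisionTreeClassifier(max_depth=5, random_state=0)
    scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    mean_auc[n] = scores.mean()
    print(f"n = {n} : average ROC across folds = "
          f"{scores.mean():.4f} (+/- {scores.std():.4f})")
```

With real data spanning several sessions, one option is to keep each session inside a single fold (for example with `GroupKFold`), so that neighbouring trials from the same session do not appear in both the training and test sets.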

Some subsequent questions to look at:

  • interpretability: the tree is probabilistic in the sense that each leaf holds the proportion of training trials in each class. Is ROC AUC a better metric than normalized likelihood?
  • tree GP
  • logistic regression at the nodes (see the Michael Jordan paper and the Logistic Model Tree paper)
  • cross-validation strategies
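
On the first question, both metrics can be computed from the same leaf proportions. The sketch below takes "normalized likelihood" to mean the geometric mean per-trial likelihood, exp(mean log-likelihood), which is an assumption about what is intended here. The data are synthetic and the tree settings are illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Synthetic data: P(y = 1) is 0.8 when the first feature is 1, else 0.2.
X = rng.integers(0, 2, size=(3000, 4))
p_true = 0.2 + 0.6 * X[:, 0]
y = (rng.random(3000) < p_true).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20, random_state=0)
clf.fit(X_tr, y_tr)

# predict_proba returns the class proportions of the leaf each trial lands in.
proba = clf.predict_proba(X_te)[:, 1]

# ROC AUC depends only on how the predicted probabilities rank the trials.
auc = roc_auc_score(y_te, proba)

# Normalized likelihood depends on calibration: exp of the mean log-likelihood.
# Clipping guards against log(0) when a leaf is pure.
proba_c = np.clip(proba, 1e-6, 1 - 1e-6)
mean_ll = np.mean(y_te * np.log(proba_c) + (1 - y_te) * np.log(1 - proba_c))
norm_lik = np.exp(mean_ll)

print(f"ROC AUC = {auc:.3f}, normalized likelihood = {norm_lik:.3f}")
```

The two numbers answer different questions. ROC AUC is unchanged by any monotonic rescaling of the leaf probabilities, while normalized likelihood penalizes confident wrong predictions and sits at 0.5 for a model that always predicts 0.5.
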
Written on October 12, 2015