CART continued -- incorporating forced choice trials
We relax our earlier assumption of discarding forced choice trials. We now include them, both successful and violation trials, as lookback features in the CART model, so that we can examine the feature importance of forced-choice and violation history.
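A minimal sketch of the pipeline, under stated assumptions: the arrays below are synthetic stand-ins for the real session records, and the feature-building helper is hypothetical, but the feature names (choice_k, reward_k, forced_k, violated_k for each lookback k) mirror the importance tables that follow, and the classifier, cross-validation, and importance readout use standard scikit-learn calls.

```python
# Sketch of the CART-with-lookbacks pipeline. Synthetic binary trial
# records stand in for the real rat sessions; with n_lookback = 2 the
# feature matrix has 4 * 2 = 8 columns, matching the tables below.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials = 1000
choice = rng.integers(0, 2, n_trials)    # left/right choice on each trial
reward = rng.integers(0, 2, n_trials)    # rewarded or not
forced = rng.integers(0, 2, n_trials)    # forced-choice trial flag
violated = rng.integers(0, 2, n_trials)  # violation flag

def lookback_features(n_lookback):
    """Build an (observations x 4*n_lookback) matrix of lagged features."""
    cols, names = [], []
    for lag in range(1, n_lookback + 1):
        for name, arr in [("choice", choice), ("reward", reward),
                          ("forced", forced), ("violated", violated)]:
            # Feature "<name>_<lag>" is that trial variable, lag trials back.
            cols.append(arr[n_lookback - lag:n_trials - lag])
            names.append(f"{name}_{lag}")
    X = np.column_stack(cols)
    y = choice[n_lookback:]              # target: the current trial's choice
    return X, y, names

X, y, names = lookback_features(n_lookback=2)
clf = DecisionTreeClassifier(max_depth=5, random_state=0)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"ROC AUC: {auc.mean():.2f} (+/- {auc.std():.2f})")

clf.fit(X, y)
for name, imp in sorted(zip(names, clf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:12s} {imp:.4f}")
```

Note the alignment convention assumed here: feature `choice_1` is the choice one trial before the predicted trial, `choice_2` two trials before, and so on.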
Here are the feature importances for randomly chosen sessions from one randomly chosen rat:
M018
chosen sessions: [660 919 745 741 722 970 858 831 629 867]
************************************************************
lookback number: 2
************************************************************
Feature space holds 7089 observations and 8 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.04)
ROC AUC: 0.84 (+/- 0.01)
Average feature importances:
1 choice_1 0.370855
2 choice_2 0.289385
3 reward_1 0.188118
4 reward_2 0.097207
5 forced_2 0.024999
6 forced_1 0.022097
7 violated_1 0.004253
8 violated_2 0.003087
************************************************************
lookback number: 3
************************************************************
Feature space holds 7079 observations and 12 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.04)
ROC AUC: 0.84 (+/- 0.02)
Average feature importances:
1 choice_1 0.257356
2 choice_2 0.194156
3 reward_1 0.183929
4 choice_3 0.123622
5 reward_2 0.089906
6 reward_3 0.052798
7 forced_2 0.035486
8 forced_1 0.029653
9 forced_3 0.026102
10 violated_1 0.002853
************************************************************
lookback number: 4
************************************************************
Feature space holds 7069 observations and 16 features
Unique target labels: [0 1]
Accuracy: 0.77 (+/- 0.02)
ROC AUC: 0.83 (+/- 0.02)
Average feature importances:
1 choice_1 0.191105
2 reward_1 0.163980
3 choice_2 0.134632
4 choice_3 0.095324
5 choice_4 0.090378
6 reward_2 0.082217
7 reward_3 0.053217
8 reward_4 0.047355
9 forced_2 0.037090
10 forced_1 0.034463
************************************************************
lookback number: 5
************************************************************
Feature space holds 7059 observations and 20 features
Unique target labels: [0 1]
Accuracy: 0.75 (+/- 0.02)
ROC AUC: 0.82 (+/- 0.03)
Average feature importances:
1 choice_1 0.149854
2 reward_1 0.132662
3 choice_2 0.110324
4 choice_3 0.077954
5 choice_4 0.077137
6 reward_2 0.073266
7 choice_5 0.061055
8 reward_3 0.052285
9 reward_4 0.048660
10 reward_5 0.046412
And, as before, averaging over many rats and many sessions, here are the average ROC AUC scores for each n_lookback setting:
n = 2 : average ROC across folds = 0.8391 (+/- 0.0596)
n = 3 : average ROC across folds = 0.8344 (+/- 0.0695)
n = 4 : average ROC across folds = 0.8232 (+/- 0.0792)
n = 5 : average ROC across folds = 0.8175 (+/- 0.0848)
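The summary lines above can be reproduced with a simple aggregation step. A sketch under stated assumptions: `fold_scores` is a hypothetical container holding the pooled per-fold ROC AUC values from every (rat, session, CV fold) run at a given lookback, and the values filled in here are placeholders, not the real scores.

```python
# Aggregate per-fold ROC AUC scores by lookback setting and report
# mean +/- std, in the same format as the summary above. The score
# arrays here are placeholder values, not the actual results.
import numpy as np

fold_scores = {
    2: np.array([0.90, 0.84, 0.78, 0.83, 0.85]),
    3: np.array([0.88, 0.83, 0.77, 0.84, 0.85]),
}
for n, scores in sorted(fold_scores.items()):
    print(f"n = {n} : average ROC across folds = "
          f"{scores.mean():.4f} (+/- {scores.std():.4f})")
```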
Some subsequent questions to look at:
- interpretability: the model is probabilistic in that each leaf stores the proportion of each class among the training observations it holds. Is ROC AUC a better metric here than normalized likelihood?
- tree GP
- logistic regression at the nodes (see the Michael Jordan paper and the Logistic Model Tree paper)
- cross validation strategies
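On the first question above: a fitted CART already exposes the per-leaf class proportions through `predict_proba`, so the probabilistic reading costs nothing extra. A minimal sketch on synthetic data (hypothetical features, not the rat sessions):

```python
# Each leaf of a fitted CART stores class counts; predict_proba returns
# the class-1 proportion at the leaf an observation falls into, so every
# observation routed to the same leaf gets the same predicted probability.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(500, 4))
y = (X[:, 0] ^ (rng.random(500) < 0.2)).astype(int)  # noisy copy of col 0

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
leaf_ids = tree.apply(X)             # leaf index for each observation
proba = tree.predict_proba(X)[:, 1]  # P(y=1): class-1 proportion at the leaf

# Verify: observations sharing a leaf share a predicted probability.
for leaf in np.unique(leaf_ids):
    p = proba[leaf_ids == leaf]
    assert np.allclose(p, p[0])
print(f"{len(np.unique(leaf_ids))} leaves; leaf probabilities:",
      np.round(np.unique(proba), 3))
```

These leaf proportions are exactly what a normalized-likelihood evaluation would score, so the metric question reduces to whether we care about ranking (ROC AUC) or calibrated probability (likelihood).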
Written on October 12, 2015