Project Proposal

Description of Dataset and Project Goals

Description of Dataset

Overview

We will focus on a dataset of decisions made by rats playing a dynamic two-armed bandit task in which they selected between two actions with constantly changing reward probabilities. Rats had to continually learn about the reward probability of each action, and make decisions in order to maximize their chance of reward.

Task Details

Rats performed the task in a box with three nose-ports arranged horizontally. To begin each trial, the rat had to enter the central port, after which one (“forced-choice trial”) or both (“free-choice trial”) of the side ports would light up. The lit port(s) were then available for the rat to enter, and dispensed a water reward with the appropriate probability. Entering an unlit port on a forced-choice trial resulted in an aversive sound and a long time-out (“violation trial”). After each trial, the reward probability on each port evolved according to a Gaussian random walk with drift parameter 0.15, bounded at 0 and at 1.
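The reward schedule described above can be sketched in a few lines of Python. This is only an illustration under two assumptions that the task description does not settle: the "drift parameter" is read as the standard deviation of the Gaussian step, and the bounds are enforced by clipping (a reflecting boundary would be another reasonable reading). The function names and the starting probabilities of 0.5 are hypothetical.

```python
import random


def step_reward_prob(p, sigma, rng):
    """Advance one reward probability by a Gaussian step, clipped to [0, 1].

    Assumption: sigma is the step standard deviation, and the bounds are
    applied by clipping rather than reflection.
    """
    return min(1.0, max(0.0, p + rng.gauss(0.0, sigma)))


def simulate_reward_probs(n_trials, p_left=0.5, p_right=0.5, sigma=0.15, seed=0):
    """Return a list of (p_left, p_right) pairs, one per trial."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        history.append((p_left, p_right))
        # The two ports evolve independently after every trial.
        p_left = step_reward_prob(p_left, sigma, rng)
        p_right = step_reward_prob(p_right, sigma, rng)
    return history


probs = simulate_reward_probs(1000)
```

Plotting the two columns of `probs` against trial number gives a quick sense of how often the better port changes under these assumptions.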

Dataset Details

Data are available for 20 rats, totalling ~1.2 million trials.

Data for each trial consist of:

Trial type: Free choice, forced-left, or forced-right
Choice: Left or right
Reward: Rewarded or unrewarded
Response Time: Time it took for the rat to enter the side-port
Inter-trial Interval: How much time elapsed since the previous trial
Session ID
Rat ID
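For analysis, each trial could be held in a small typed record. The sketch below mirrors the fields listed above; the field names, the string encodings, and the integer ID types are assumptions made for illustration, and the units of the two timing fields are not stated in the dataset description. The `is_violation` helper assumes that violation trials appear in the data as a forced trial with the opposite choice recorded, which may not be how they are actually stored.

```python
from dataclasses import dataclass
from enum import Enum


class TrialType(Enum):
    FREE = "free"
    FORCED_LEFT = "forced_left"
    FORCED_RIGHT = "forced_right"


@dataclass(frozen=True)
class Trial:
    trial_type: TrialType
    choice: str                  # "left" or "right" (encoding is an assumption)
    rewarded: bool
    response_time: float         # time to enter the side port; units unspecified
    inter_trial_interval: float  # time since the previous trial; units unspecified
    session_id: int              # ID type is an assumption
    rat_id: int                  # ID type is an assumption


def is_violation(trial):
    """True if the rat entered the unlit port on a forced-choice trial.

    Assumption: violations are recorded as a forced trial whose choice
    is the opposite side.
    """
    if trial.trial_type is TrialType.FORCED_LEFT:
        return trial.choice == "right"
    if trial.trial_type is TrialType.FORCED_RIGHT:
        return trial.choice == "left"
    return False


example = Trial(TrialType.FORCED_LEFT, "left", True, 0.8, 4.2, session_id=1, rat_id=7)
```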

Project Goals

The major goal of this project is to gain insight into the algorithms employed by the rat to learn about the reward probabilities and to decide which action to take. One approach to this involves building computational models of the rats, that is, models that attempt to predict a rat’s upcoming choice using only information available at the time he made it. A model that captures the patterns in the rats’ choices well is a candidate hypothesis for the mechanism by which those choices were made.
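As a concrete illustration of what "predicting the upcoming choice from available information" means, the sketch below scores a deliberately simple baseline: win-stay/lose-shift, which repeats the previous choice after a reward and switches after an omission. It is a standard baseline and not a model proposed here; the function names and the 0/1 coding of choices (0 = left, 1 = right) are assumptions.

```python
def wsls_predictions(choices, rewards):
    """Predict every choice after the first from the previous trial only.

    choices: list of 0 (left) or 1 (right)
    rewards: list of 0 (unrewarded) or 1 (rewarded)
    """
    preds = []
    for prev_choice, prev_reward in zip(choices[:-1], rewards[:-1]):
        # Stay after a reward, switch after an omission.
        preds.append(prev_choice if prev_reward else 1 - prev_choice)
    return preds


def prediction_accuracy(choices, rewards):
    """Fraction of choices (from the second trial on) predicted correctly."""
    preds = wsls_predictions(choices, rewards)
    if not preds:
        raise ValueError("need at least two trials")
    actual = choices[1:]
    return sum(p == a for p, a in zip(preds, actual)) / len(preds)


accuracy = prediction_accuracy([0, 0, 1, 1], [1, 0, 1, 0])
```

Any richer model can be compared against this baseline on the same trials, since both use only information from before the choice being predicted.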

Particular questions of interest include:

What features of the available information does the rat use to guide his choice?
Can the rat be modeled as maintaining and updating summary statistics, rather than remembering each trial independently?
Do rats learn in the same way from forced-choice trials as they do when they are allowed to choose freely? That is, are actions chosen for them stored or processed differently?
Do rats change their strategy as they gain experience with the task over many sessions?
Does their strategy differ between early and late in each session?
How do the strategies of individual rats vary?
How would optimal performance look, and how close are the rats to such an optimum?
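Two of these questions, whether the rat keeps summary statistics and whether forced-choice trials are learned from differently, can be phrased as a single parameterized model. The sketch below is one possible formulation and not a committed design: it keeps one running value per port, updates it with a delta rule, and allows separate learning rates for free and forced trials. The names `alpha_free`, `alpha_forced`, and `beta`, the initial values of 0.5, and the softmax choice rule are all assumptions. Only free-choice trials contribute to the likelihood, since forced trials involve no real choice.

```python
import math


def log_likelihood_split_alpha(trials, alpha_free, alpha_forced, beta=3.0):
    """Log-likelihood of the free choices under a two-learning-rate delta rule.

    trials: iterable of (is_free, choice, reward), with choice 0 = left,
            1 = right, and reward 0 or 1.
    """
    q = [0.5, 0.5]  # running value estimate for each port (assumed start)
    total = 0.0
    for is_free, choice, reward in trials:
        if is_free:
            # Softmax over two options reduces to a logistic function.
            p_right = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
            total += math.log(p_right if choice == 1 else 1.0 - p_right)
        # Every trial updates the value of the port that was entered.
        alpha = alpha_free if is_free else alpha_forced
        q[choice] += alpha * (reward - q[choice])
    return total


demo = [(False, 1, 1), (True, 1, 1)]
ll_ignoring_forced = log_likelihood_split_alpha(demo, 0.3, 0.0)
ll_learning_forced = log_likelihood_split_alpha(demo, 0.3, 1.0)
```

If fitting such a model to a rat's data yields similar values for the two learning rates, that would be consistent with forced and free trials being processed alike; a clear difference would suggest otherwise.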

Written on September 27, 2015