Week 1 exploratory data analysis

Description of the data

There are 1087140 rows, and each row represents one trial. We have the following columns:

  • choice (right 53%, left 46%, violation of forced choice 1%)
  • reward (0 or 1, mean 0.572214, std 0.494758)
  • session number (1946 sessions total)
  • trial type (free choice 90%, left only 5%, right only 5%)
  • reaction time (mean 0.515855, std 7.778415)
  • trial number (1087140 trials total)
  • rat number (there are 20 total)

The number of trials per rat ranges from 12324 to 237974.

How the data is organized

Each rat's trials are stored contiguously in the data, i.e. there are no gaps in the trial numbers for any rat:

Trial numbers are continuous for each rat:
Rat 42 : min  0  max  20677
Rat 44 : min  20678  max  102494
Rat 25 : min  102495  max  136573
Rat 14 : min  136574  max  190471
Rat 15 : min  190472  max  236763
Rat 16 : min  236764  max  387498
Rat 17 : min  387499  max  410524
Rat 18 : min  410525  max  648498
Rat 22 : min  648499  max  684993
Rat 23 : min  684994  max  769229
Rat 27 : min  769230  max  822278
Rat 29 : min  822279  max  834602
Rat 31 : min  834603  max  849296
Rat 37 : min  849297  max  873762
Rat 39 : min  873763  max  907607
Rat 43 : min  907608  max  931664
Rat 47 : min  931665  max  988275
Rat 46 : min  988276  max  1008326
Rat 40 : min  1008327  max  1047021
Rat 41 : min  1047022  max  1087139
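
A contiguity check of this kind can be sketched as follows. This is a minimal illustration, not the code that produced the listing above: the function names and the `(trial_number, rat_id)` row layout are assumptions, and the rows are toy data.

```python
def trial_ranges(rows):
    """rows: iterable of (trial_number, rat_id) pairs (hypothetical layout).

    Returns {rat_id: (min_trial, max_trial, n_trials)}.
    """
    stats = {}
    for trial, rat in rows:
        lo, hi, n = stats.get(rat, (trial, trial, 0))
        stats[rat] = (min(lo, trial), max(hi, trial), n + 1)
    return stats


def is_contiguous(stats):
    # Assuming trial numbers are unique, a rat's trials form one gap-free
    # block exactly when max - min + 1 equals the number of trials.
    return {rat: hi - lo + 1 == n for rat, (lo, hi, n) in stats.items()}


# Toy rows, not the real dataset: rat 25 is missing trial 6.
rows = [(0, 42), (1, 42), (2, 42), (3, 44), (4, 44), (5, 25), (7, 25)]
for rat, (lo, hi, n) in trial_ranges(rows).items():
    print("Rat", rat, ": min ", lo, " max ", hi)
print(is_contiguous(trial_ranges(rows)))
```

The same min/max/count summary also gives the number of trials per rat, which is how the range quoted earlier (12324 to 237974) can be cross-checked against the listing.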

How long are sessions?

The mean session size is 558.653649, with std 192.371081. Session size ranges from 56 to 1389.
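
As a rough sketch, session sizes can be counted and summarized like this. The function names are illustrative, and the input is assumed to be one session number per trial. Note that `pstdev` is the population standard deviation while `stdev` is the sample version; the text above does not say which convention produced the reported figure.

```python
from collections import Counter
from statistics import mean, pstdev


def session_sizes(session_ids):
    """session_ids: one session number per trial. Returns {session: n_trials}."""
    return Counter(session_ids)


def summarize(sizes):
    values = list(sizes.values())
    return {
        "mean": mean(values),
        "std": pstdev(values),  # swap in statistics.stdev for the sample version
        "min": min(values),
        "max": max(values),
    }


# Toy data: session 0 has 3 trials, session 1 has 1 trial.
print(summarize(session_sizes([0, 0, 0, 1])))
```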

Here is a histogram of session size:

Session size histogram

Here is how session number increments over trials:

How session number increments over trials

When are rewards given within a session?

Here is the reward distribution in relation to how many trials we are into the session, for one session:

When rewards are given within a session

In this session, rewards are given frequently and throughout the entire session.

Is the overall reward probability similar between rats?

By reward probability, we don’t mean the latent reward probability that the experimental system uses. Rather, we mean the observed reward incidence, i.e. the proportion of trials when each rat actually receives a reward.

The histogram below suggests that the observed reward rate is similar across rats:

Rat reward histogram
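
The per-rat observed reward incidence defined above is just the mean of the 0/1 reward column within each rat. A minimal sketch, assuming rows of `(rat_id, reward)` pairs (a hypothetical layout):

```python
from collections import defaultdict


def reward_rate_by_rat(rows):
    """rows: iterable of (rat_id, reward) pairs with reward in {0, 1}.

    Returns {rat_id: fraction of that rat's trials that were rewarded}.
    """
    totals = defaultdict(lambda: [0, 0])  # rat -> [rewarded trials, all trials]
    for rat, reward in rows:
        totals[rat][0] += reward
        totals[rat][1] += 1
    return {rat: rewarded / n for rat, (rewarded, n) in totals.items()}


# Toy data, not the real dataset.
print(reward_rate_by_rat([(42, 1), (42, 0), (44, 1), (44, 1)]))
```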

Reverse engineering the experimental system (1): how are trials fed?

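How is the next trial type determined? Per the transition probabilities computed below, it seems to be chosen at random at each step, with roughly 90% probability of a free-choice trial, 5% probability of a forced-left trial, and 5% probability of a forced-right trial. The three rows of the matrix are nearly identical, which is what we would expect if the next trial type does not depend on the current one: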


Average transition probabilities over all sessions:
[[ 0.88704614  0.05639817  0.0565557 ]
 [ 0.88556099  0.05649934  0.05793967]
 [ 0.88833638  0.05723813  0.05442549]]
+/-
[[ 0.08631192  0.04622885  0.046599  ]
 [ 0.14779596  0.10531393  0.09640627]
 [ 0.14263365  0.09678701  0.0930963 ]]

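In those matrices, each row is the current trial type and each column is the next trial type (each row of the first matrix sums to 1). The row/column order is: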

  1. Free choice
  2. Left only
  3. Right only
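
For one session, a transition matrix like the one above can be estimated by counting consecutive pairs of trial types and normalizing each row. The sketch below assumes trial types are coded 0 (free choice), 1 (left only), 2 (right only). That coding and the function name are illustrative, and the averaging over sessions is not shown.

```python
import random


def transition_matrix(types, n_states=3):
    """types: sequence of trial-type codes in range(n_states) for one session.

    Returns a row-normalized matrix m where m[a][b] estimates
    P(next type is b | current type is a).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(types, types[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A type that never occurs (or only occurs last) has no outgoing
        # transitions, so its row is undefined.
        matrix.append([c / total if total else float("nan") for c in row])
    return matrix


# Simulated check: if types are drawn independently with weights 90/5/5,
# all three rows should come out close to [0.9, 0.05, 0.05].
rng = random.Random(0)
simulated = rng.choices([0, 1, 2], weights=[0.90, 0.05, 0.05], k=100_000)
for row in transition_matrix(simulated):
    print([round(p, 3) for p in row])
```

Rows that look alike, as in the simulated check, are what independent draws produce. Rows that differed would instead point to some dependence on the current trial type.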

Reverse engineering the experimental system (2): how does reward probability modulate?

First, let’s look at the overall probability of getting a reward given the choice made:

  • probability of reward when the rat goes left, over all trials: 57.4%
  • probability of reward when the rat goes right, over all trials: 57.7%
  • probability of reward when the rat violates a forced-choice constraint, over all trials: 0.0%

Now, let us try to examine how the observed (not underlying) reward probability modulates with time. We do so by:

  1. binning by N=30 consecutive trials
  2. counting, within each bin, the proportion of go-left trials that resulted in a reward – shown in blue
  3. counting the proportion of go-right trials that gave a reward – shown in red
  4. counting the proportion of all trials (except violations) where a reward was earned – shown in black.
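
The binning steps above can be sketched as follows for a single session. The choice codes `"L"`, `"R"`, and `"V"` (violation) and the function name are assumptions made for illustration. A bin with no trials of a given kind yields `None` rather than 0, and a final partial bin is kept.

```python
def binned_reward_rates(choices, rewards, bin_size=30):
    """choices: list of "L", "R", or "V" (violation); rewards: list of 0/1.

    Returns one dict per bin with the rewarded proportion of go-left trials,
    go-right trials, and all non-violation trials.
    """
    def rate(pairs, keep):
        hits = [reward for choice, reward in pairs if choice in keep]
        return sum(hits) / len(hits) if hits else None

    bins = []
    for start in range(0, len(choices), bin_size):
        pairs = list(zip(choices[start:start + bin_size],
                         rewards[start:start + bin_size]))
        bins.append({
            "left": rate(pairs, {"L"}),      # blue curve
            "right": rate(pairs, {"R"}),     # red curve
            "all": rate(pairs, {"L", "R"}),  # black curve, violations excluded
        })
    return bins


# Toy session with a bin size of 2 instead of 30.
print(binned_reward_rates(["L", "L", "R", "V"], [1, 0, 1, 0], bin_size=2))
```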

Below are plots of those proportions in bins of 30 trials from 5 random sessions:

  1. Session 888 with length: 665 trials
  2. Session 424 with length: 854 trials
  3. Session 629 with length: 689 trials
  4. Session 1233 with length: 351 trials
  5. Session 1854 with length: 216 trials

Observed reward probability per 30-trial bin, one plot for each of the 5 sessions listed above

Notice that the observed reward probability fluctuates without any obvious pattern. This seems to fit what we have heard about the experimental design: namely, that the reward probabilities follow Brownian motion bounded in the interval [0, 1].

Also, notice how varied the sessions can be. For example, the fourth session shown (1233) has an observed reward probability of 0 for going left throughout the whole session! This deserves further digging: was the rat blind on the left side, perhaps?
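
To get a feel for what a bounded Brownian motion might look like, here is a small simulation. It is only a sketch under assumptions: Gaussian steps, clipping at the boundaries (reflection would be another option), and the parameters `p0` and `sigma` are all invented for illustration. We do not know how the real experimental system generates its reward probabilities.

```python
import random


def bounded_random_walk(n_trials, p0=0.5, sigma=0.05, seed=None):
    """Simulate a latent reward probability drifting over trials.

    Each step adds Gaussian noise with standard deviation sigma and then
    clips the result to [0, 1]. All parameters are hypothetical.
    """
    rng = random.Random(seed)
    p = p0
    path = []
    for _ in range(n_trials):
        path.append(p)
        p = min(1.0, max(0.0, p + rng.gauss(0.0, sigma)))
    return path


path = bounded_random_walk(600, seed=1)
print(min(path), max(path))
```

With clipping, the simulated probability can sit at exactly 0 or 1 for a stretch of trials. That would be one possible source of long runs of unrewarded choices on one side, though the simulation says nothing about whether this is what happened in session 1233.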

Modeling as a Markov chain

How well can we model the rat’s actions as a Markov chain? Given the past 2 actions (L/R) and whether or not they were rewarded (0/1), what is the probability that the rat will go left on the next trial? Shown below are the graphs for 3 rats (42, 44, and 25). For simplicity, I threw out all the data that included forced trials.

Probability of going left given the previous 2 actions and rewards, one graph for each of rats 42, 44, and 25

Here are tables showing the probability of going left given the previous 2 actions and rewards, for a number of rats.

Tables of the probability of going left given the previous 2 actions and rewards

This confirms the intuition that a rat is more likely to choose a side again after being rewarded on it, and less likely to choose it again after going unrewarded. Overall the probabilities are similar between rats, but there is some variation that we can further investigate.
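
The conditional probabilities discussed in this section can be estimated by counting, for each history of the previous 2 (action, reward) pairs, how often the next action was left. The sketch below handles a single session and skips any window that contains a forced trial. The function name, the `"L"`/`"R"` codes, and the `free` flag list are illustrative assumptions, not the original code.

```python
from collections import defaultdict


def p_left_given_history(choices, rewards, free, order=2):
    """Estimate P(next choice is left | previous `order` choices and rewards).

    choices: list of "L"/"R"; rewards: list of 0/1;
    free: list of bools, True where the trial was a free-choice trial.
    Returns {((choice, reward), ...): probability of going left next}.
    """
    counts = defaultdict(lambda: [0, 0])  # history -> [left count, total count]
    for t in range(order, len(choices)):
        # Skip windows that include any forced trial.
        if not all(free[i] for i in range(t - order, t + 1)):
            continue
        history = tuple((choices[i], rewards[i]) for i in range(t - order, t))
        counts[history][0] += choices[t] == "L"
        counts[history][1] += 1
    return {h: left / total for h, (left, total) in counts.items()}


# Toy session, not real data.
probs = p_left_given_history(list("LLLRLL"), [1, 1, 0, 0, 1, 1], [True] * 6)
for history, p in probs.items():
    print(history, p)
```

Applying this per session and pooling the counts avoids building histories that span a session boundary.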

Written on September 27, 2015