Theta Oscillations in the Orbitofrontal Cortex and Reward Learning

Post by Stephanie Williams

What's the science?

Theta oscillations (brain activity waves at a frequency of 4-8 Hz) have been implicated in a wide range of brain functions, such as memory, exploration, and navigation. Recently, a research study noted increased theta-band power in a brain region called the orbitofrontal cortex during a task that involved reward. However, it remains unclear whether increased power in the theta band is causally related to reward-guided behavior. Further, theta oscillations in the orbitofrontal cortex are related to theta oscillations in the hippocampus, and this relationship could also play a role in reward-guided behavior. This week in Neuron, Knudsen and Wallis used a closed-loop stimulation paradigm to disrupt theta activity and demonstrate a causal role for hippocampal and orbitofrontal theta activity in reward-guided behavior.

How did they do it?

The authors recorded local field potentials from two macaque monkeys while the monkeys performed a reward task. During each session, the authors presented the monkeys with three new pictures, each associated with a reward probability: one high, one medium, and one low. In some trials, the monkeys chose between two pictures by fixating on their choice, and in other trials, the monkeys saw only one picture. When given a choice, the monkeys tended to choose optimally, selecting the more valuable picture a majority of the time. Within a single session, the authors changed the reward contingencies for each picture and tracked how well the monkeys updated their choices.

To investigate the causal relationship between the 4-8 Hz oscillations and task behavior, the authors applied a targeted stimulation paradigm designed to disrupt the 4-8 Hz oscillations without changing single-neuron firing rates. They delivered the stimulation during different parts of the task, and also delivered “decoupled” stimulation as a control. As a second control, they implemented an identical paradigm using information extracted from a different frequency range (13-30 Hz). The authors tracked the phase at which each stimulation pulse was delivered and calculated the corresponding change in power that the pulse evoked. They then tested whether the pulse changed behavior during the task. The authors also analyzed the activity of single neurons during stimulation periods to understand why disruption during a given epoch might change behavior.

To understand the relationship between the 4-8 Hz activity in the hippocampus and the 4-8 Hz activity in the orbitofrontal cortex, the authors recorded from both areas and measured the degree to which the phases of the signals in the two regions were aligned. To better understand the directionality of the information flow between the hippocampus and orbitofrontal cortex, the authors computed how well one of the signals could predict the future values of the other signal, a value called generalized partial directed coherence. Finally, the authors stimulated the hippocampus using their closed-loop stimulation paradigm, while recording from both the hippocampus and the orbitofrontal cortex.
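To make the idea of phase alignment concrete, here is a minimal sketch of one common way to quantify it, the phase-locking value (PLV). The post does not say which estimator the authors used, so treat the choice of PLV as an assumption. The function names, the 6 Hz rhythm, the 1000 Hz sampling rate, and the synthetic phase series are all hypothetical and stand in for phases that would normally be extracted from filtered recordings. The sketch does not attempt generalized partial directed coherence.

```python
import cmath
import math
import random


def phase_locking_value(phases_a, phases_b):
    """Length of the mean unit vector of the phase differences (0 = none, 1 = perfect)."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)


def theta_phases(n_samples, freq_hz=6.0, fs_hz=1000.0, lag_rad=0.0):
    """Hypothetical helper: phase (radians) of an ideal rhythm at each sample."""
    two_pi = 2.0 * math.pi
    return [(two_pi * freq_hz * t / fs_hz + lag_rad) % two_pi for t in range(n_samples)]


rng = random.Random(0)  # fixed seed so the synthetic example is repeatable

hippocampus = theta_phases(2000)
# "Aligned" case: same rhythm with a constant phase lag (assumed 0.8 rad).
ofc_aligned = theta_phases(2000, lag_rad=0.8)
# "Unaligned" case: phases drawn at random, so there is no consistent lag.
ofc_unaligned = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(2000)]

plv_aligned = phase_locking_value(hippocampus, ofc_aligned)
plv_unaligned = phase_locking_value(hippocampus, ofc_unaligned)
print(f"aligned PLV:   {plv_aligned:.3f}")
print(f"unaligned PLV: {plv_unaligned:.3f}")
```

A constant lag between the two signals still gives a PLV near 1, because the measure reflects how consistent the phase difference is, not whether it is zero.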

What did they find?

The authors confirmed that there were significant increases in 4-8 Hz power in the orbitofrontal cortex relative to other frequency ranges during major events of the reward task. In the stimulation experiments, they found that delivering theta stimulation during the fixation epoch of the task severely disrupted the monkeys’ ability to update their choices. Strikingly, a single pulse of microstimulation on a single electrode in the orbitofrontal cortex could disrupt behavior. Stimulation during the choice epoch did not affect the monkeys’ learning. When the authors delivered either the decoupled stimulation or the stimulation in the 13-30 Hz range, they found no disruption of task performance. These control experiments allowed them to conclude that the behavioral disruption was not a non-specific result of electrical stimulation, but was specifically caused by the 4-8 Hz stimulation.
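As a rough illustration of what comparing 4-8 Hz power with another frequency range involves, the sketch below sums squared Fourier magnitudes within each band for a synthetic signal. This is not the authors' analysis pipeline. The 250 Hz sampling rate, the two-second window, the signal amplitudes, and the function name band_power are all assumptions chosen for the example.

```python
import cmath
import math


def band_power(signal, fs_hz, lo_hz, hi_hz):
    """Sum of squared DFT magnitudes (scaled by n^2) over positive-frequency bins in [lo_hz, hi_hz]."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must be non-empty")
    power = 0.0
    for k in range(1, n // 2):
        freq_hz = k * fs_hz / n
        if lo_hz <= freq_hz <= hi_hz:
            # Direct DFT coefficient for bin k (slow, but needs no external library).
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            power += abs(coeff) ** 2 / n ** 2
    return power


FS_HZ = 250.0     # assumed sampling rate
N_SAMPLES = 500   # two seconds of data, giving 0.5 Hz frequency resolution

# Synthetic "local field potential": a strong 6 Hz rhythm plus a weaker 20 Hz rhythm.
lfp = [math.sin(2.0 * math.pi * 6.0 * t / FS_HZ)
       + 0.3 * math.sin(2.0 * math.pi * 20.0 * t / FS_HZ)
       for t in range(N_SAMPLES)]

theta_power = band_power(lfp, FS_HZ, 4.0, 8.0)
beta_power = band_power(lfp, FS_HZ, 13.0, 30.0)
print(f"theta/beta power ratio: {theta_power / beta_power:.1f}")
```

With real recordings, a windowed or averaged spectral estimate would normally replace the raw DFT used here, which is kept only because it is easy to read.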

The authors observed a negative relationship between power and phase alignment during theta stimulation, compared to sham stimulation. They found that when they delivered stimulation on the rising phase of the oscillation, there was an increase in 4-8 Hz power. When they compared the behavioral effects of pulses delivered at different points in the oscillation cycle, they found that pulses delivered at positive phases were more disruptive to behavior than those delivered at negative phases. These findings suggest that the phase alignment of the 4-8 Hz oscillations is related to adapting behavior to changing reward contingencies. The authors therefore suggest that firing of the orbitofrontal neurons that encode rewards may preferentially occur at a specific phase of the 4-8 Hz oscillation. Analysis of single-unit activity during the stimulation confirmed that the 4-8 Hz stimulation did not change the firing rate of individual neurons. The authors found that half of the single neurons they recorded from fired spikes that were phase-locked to the 4-8 Hz oscillation during the fixation period.
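One standard way to ask whether a neuron's spikes are phase-locked to a rhythm is to collect the oscillation phase at each spike time and test whether those phases cluster. The sketch below uses the mean resultant length and a Rayleigh test for that purpose. The post does not state which test the authors applied, so this is an assumption, and the spike phases are simulated, not real data. The p-value uses the first-order approximation exp(-n R^2), which is only reasonable for fairly large spike counts.

```python
import cmath
import math
import random


def rayleigh_test(spike_phases):
    """Return (mean resultant length, preferred phase in radians, approximate p-value)."""
    n = len(spike_phases)
    if n == 0:
        raise ValueError("need at least one spike phase")
    mean_vector = sum(cmath.exp(1j * p) for p in spike_phases) / n
    resultant = abs(mean_vector)           # 0 = uniform phases, 1 = identical phases
    preferred = cmath.phase(mean_vector)   # direction of the mean vector
    z = n * resultant ** 2
    p_value = math.exp(-z)                 # first-order approximation for large n
    return resultant, preferred, p_value


rng = random.Random(1)  # fixed seed so the simulated example is repeatable

# Simulated phase-locked neuron: spikes cluster around a phase of pi/2 (assumed).
locked_phases = [rng.gauss(math.pi / 2.0, 0.5) for _ in range(300)]
# Simulated unlocked neuron: spikes fall at random phases.
unlocked_phases = [rng.uniform(-math.pi, math.pi) for _ in range(300)]

r_locked, pref_locked, p_locked = rayleigh_test(locked_phases)
r_unlocked, pref_unlocked, p_unlocked = rayleigh_test(unlocked_phases)
print(f"locked:   R = {r_locked:.2f}, preferred phase = {pref_locked:.2f} rad")
print(f"unlocked: R = {r_unlocked:.2f}")
```

A neuron would be counted as phase-locked when its p-value falls below a chosen threshold. The preferred phase then indicates where in the cycle its spikes tend to occur.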

When the authors compared the 4-8 Hz oscillations in the hippocampus and orbitofrontal cortex, they found that there was a strong phase alignment during the fixation period. There were changes in the synchrony of the two regions that matched changes in behavior: when the subjects had to adapt to new reward contingencies and their performance initially dropped, the synchrony between the hippocampus and orbitofrontal cortex decreased. Once the new rules were learned, and the monkeys showed improved performance on the task, the authors observed a corresponding increase in synchrony between the hippocampus and orbitofrontal cortex. The authors found that information primarily flows from the hippocampus to the orbitofrontal cortex and that there is more influence between the two areas during the drift period than during the stable periods in the learning cycle. These results suggest that the hippocampus provides theta input to the orbitofrontal cortex to enable value learning.

What's the impact?

This study used closed-loop microstimulation to show that 4-8 Hz oscillations in the orbitofrontal cortex are causally important for reward-guided behavior, and that these oscillations in turn depend on hippocampal input. The findings advance our understanding of how single neurons may encode value during reward tasks by phase-locking to underlying theta rhythms. Future work could build on these findings to develop treatments that apply microstimulation to disrupt maladaptive patterns of activity.

Knudsen and Wallis. Closed-loop Theta Stimulation in the Orbitofrontal Cortex Prevents Reward-Based Learning. Neuron. (2020).