Are methamphetamine users compulsive? Faulty reinforcement learning, not inflexibility, underlies decision making in people with methamphetamine use disorder

Alex H. Robinson, José C. Perales, Isabelle Volpe, Trevor T.J. Chong, Antonio Verdejo-Garcia

Research output: Contribution to journal › Article › Research › peer-review

11 Citations (Scopus)


Methamphetamine use disorder involves continued use of the drug despite negative consequences. Such ‘compulsivity’ can be measured by reversal learning tasks, which involve participants learning action-outcome task contingencies (acquisition-contingency) and then updating their behaviour when the contingencies change (reversal). Using these paradigms, animal models suggest that people with methamphetamine use disorder (PwMUD) may struggle to avoid repeating actions that were previously rewarded but are now punished (inflexibility). However, difficulties in learning task contingencies (reinforcement learning) may offer an alternative explanation, with meaningful treatment implications. We aimed to disentangle inflexibility and reinforcement learning deficits in 35 PwMUD and 32 controls with similar sociodemographic characteristics, using novel trial-by-trial analyses on a probabilistic reversal learning task. Inflexibility was defined as (a) weaker reversal phase performance, compared with the acquisition-contingency phases, and (b) persistence with the same choice despite repeated punishments. Conversely, reinforcement learning deficits were defined as (a) poor performance across both acquisition-contingency and reversal phases and (b) inconsistent postfeedback behaviour (i.e., switching after reward). Compared with controls, PwMUD exhibited weaker learning (odds ratio [OR] = 0.69, 95% confidence interval [CI] [0.63–0.77], p < .001), though no greater accuracy reduction during reversal. Furthermore, PwMUD were more likely to switch responses after one reward/punishment (OR = 0.83, 95% CI [0.77–0.89], p < .001; OR = 0.82, 95% CI [0.72–0.93], p = .002) but just as likely to switch after repeated punishments (OR = 1.03, 95% CI [0.73–1.45], p = .853). These results indicate that PwMUD's reversal learning deficits are driven by weaker reinforcement learning, not inflexibility.
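The trial-by-trial measures described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis: it only tallies raw switch proportions for one participant, and it makes no attempt to estimate the reported odds ratios. The function name `switch_rates`, the input layout (one choice code and one reward flag per trial), and the rule that "repeated punishments" means two consecutive unrewarded trials on the same option are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def switch_rates(choices: List[int], rewarded: List[bool]) -> Dict[str, Optional[float]]:
    """Proportion of trials on which the choice changed, split by preceding feedback.

    Hypothetical helper: `choices[t]` is the option picked on trial t and
    `rewarded[t]` is True if that trial was rewarded, False if punished.
    """
    # Each entry holds [number of switches, number of opportunities].
    counts = {
        "after_reward": [0, 0],
        "after_one_punishment": [0, 0],
        "after_repeated_punishment": [0, 0],
    }
    for t in range(1, len(choices)):
        if rewarded[t - 1]:
            key = "after_reward"
        elif t >= 2 and not rewarded[t - 2] and choices[t - 2] == choices[t - 1]:
            # Assumption: two punishments in a row on the same option.
            key = "after_repeated_punishment"
        else:
            key = "after_one_punishment"
        counts[key][0] += int(choices[t] != choices[t - 1])
        counts[key][1] += 1
    # None marks a condition that never occurred in this sequence.
    return {k: (s / n if n else None) for k, (s, n) in counts.items()}


example = switch_rates(
    choices=[0, 0, 1, 1, 1, 0],
    rewarded=[True, False, False, False, True, False],
)
```

Under the abstract's definitions, a high `after_reward` value would correspond to inconsistent postfeedback behaviour, and a low `after_repeated_punishment` value would correspond to persistence despite repeated punishments.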

Original language: English
Article number: e12999
Number of pages: 11
Journal: Addiction Biology
Issue number: 4
Publication status: Published - Jul 2021


  • cognitive inflexibility
  • compulsivity
  • methamphetamine use disorder
  • reinforcement learning
  • reversal learning
