Evidence suggests that catchment state variables such as groundwater can exhibit multiyear trends. Their state may therefore reflect not only recent climatic conditions but also climatic conditions in past years or even decades. Here we demonstrate that five commonly used conceptual “bucket” rainfall‐runoff models are unable to replicate the multiyear trends exhibited by natural systems during the “Millennium Drought” in south‐east Australia. As a result, the models extrapolate poorly to different climatic conditions and perform poorly in split‐sample tests. Simulations from the five models, applied in 38 catchments, are compared with groundwater data from 19 bores and with Gravity Recovery and Climate Experiment (GRACE) data for two geographic regions. Whereas the groundwater and GRACE data decline gradually from high to low values over the 13‐year drought, the model storages already move from high to low values within a typical seasonal cycle. This is particularly the case in the drier, flatter catchments. Once the drought begins, there is little room for further decline in the simulated storage, because the model “buckets” are already “emptying” on a seasonal basis. Since the effects of sustained dry conditions cannot accumulate within these models, we argue that they should not be used for runoff projections in a drying climate. Further research is required to (a) improve conceptual rainfall‐runoff models, (b) better understand the circumstances in which multiyear trends in state variables occur, and (c) investigate links between these multiyear trends and changes in rainfall‐runoff relationships in the context of a changing climate.
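The seasonal "emptying" argument can be illustrated with a toy calculation. The following is a minimal sketch of a generic single‐store bucket, not one of the five models examined here. The function names, the sinusoidal forcing, the 150 mm capacity, and the 30% rainfall reduction are all hypothetical values chosen only to mimic a dry catchment.

```python
import math

def seasonal_forcing(n_years, rain_scale=1.0):
    """Monthly rainfall and potential ET in mm (hypothetical dry-catchment values)."""
    rain, pet = [], []
    for month in range(12 * n_years):
        phase = 2 * math.pi * (month % 12) / 12
        rain.append(rain_scale * 50.0 * (1 + 0.8 * math.cos(phase)))  # wet-season peak
        pet.append(100.0 * (1 - 0.8 * math.cos(phase)))               # peaks six months later
    return rain, pet

def simulate_bucket(rain, pet, capacity=150.0):
    """Single store: add rain, remove ET up to what is available, spill above capacity."""
    storage, series = 0.0, []
    for p, e in zip(rain, pet):
        storage += p
        storage -= min(storage, e)         # ET cannot exceed available water
        storage = min(storage, capacity)   # excess leaves the store as runoff
        series.append(storage)
    return series

def annual_min_max(series):
    """(minimum, maximum) storage for each consecutive 12-month block."""
    return [(min(series[i:i + 12]), max(series[i:i + 12]))
            for i in range(0, len(series), 12)]

# Five "normal" years followed by a 13-year drought with 30% less rainfall (assumed).
wet_rain, wet_pet = seasonal_forcing(5)
dry_rain, dry_pet = seasonal_forcing(13, rain_scale=0.7)
storage = simulate_bucket(wet_rain + dry_rain, wet_pet + dry_pet)
extremes = annual_min_max(storage)

for year, (low, high) in enumerate(extremes, start=1):
    print(f"year {year:2d}: min storage {low:6.1f} mm, max storage {high:6.1f} mm")
```

In this sketch the store drains to zero every dry season, both before and during the drought, so the annual minimum has nowhere lower to go. The annual maximum steps down once the carry‐over from the last normal year is exhausted and then repeats unchanged, rather than declining gradually over the drought as the observed groundwater and GRACE records do.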