Background: Rapid change in the commercial market can threaten the consistency of activity data comparisons as devices are superseded. Purpose: To determine the level of agreement between two generations of Fitbit™ device for step count and activity level in a free-living environment. Methods: Thirty-seven healthy participants (17 women, 20 men; M ± SD: age 34 ± 8 y; body mass index 25.4 ± 3.9 kg/m²) wore a Fitbit Flex™ and a Flex 2™ on their non-dominant wrist over two weeks in a free-living environment. A waist-mounted ActiGraph GT3X+ was also worn so that step count data from the commercial devices could be compared against a research-grade device. Results: Comparison of step count between the two generations of Fitbit™ device (Mean Absolute Percentage Error, 12%; Standard Error of Mean, 102.58 steps/d, p = .039; ICC = 0.955) showed closer inter-device agreement than comparison of step count between commercial (Fitbit™) and research (ActiGraph GT3X+) grades of device (Mean Absolute Percentage Error, 31%; Standard Error of Mean, 124.6 steps/d, p < .001; ICC = 0.915). Statistically significant differences were identified for the Standard Error of Mean both between generations of Fitbit™ device (p = .039) and between grades of device (p < .001). A comparison of ‘fairly’ and ‘very’ active minutes showed no statistically significant difference between generations of Fitbit™ device (p = .980; Mean Absolute Percentage Error, 38%; ICC = 0.908). The number of days of data captured for step count was comparable between the two grades of device. Conclusion: Users should be aware of potential variations in data estimates from different generations of Fitbit™ device, with step count data providing a more consistent comparison metric than active minutes.
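
For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how a Mean Absolute Percentage Error between two devices could be computed from paired daily step counts. The abstract does not state the exact formula or which device served as the reference, so the function name, the choice of reference series, the skipping of zero-reference days, and the sample values are all assumptions made for illustration; the values are hypothetical and are not study data.

```python
def mean_absolute_percentage_error(reference, comparison):
    """Return MAPE (%) of `comparison` relative to `reference`.

    Both arguments are equal-length sequences of paired daily values
    (for example, steps per day from two devices worn on the same days).
    Days where the reference value is zero are skipped (an assumption),
    because the percentage error is undefined there.
    """
    if len(reference) != len(comparison):
        raise ValueError("paired series must have equal length")
    pairs = [(r, c) for r, c in zip(reference, comparison) if r != 0]
    if not pairs:
        raise ValueError("no days with a non-zero reference value")
    return 100.0 * sum(abs(c - r) / abs(r) for r, c in pairs) / len(pairs)


# Hypothetical daily step counts, for illustration only (not study data).
flex_steps = [10000, 8000, 12000]   # assumed reference device
flex2_steps = [9000, 8800, 12000]   # assumed comparison device

print(f"MAPE: {mean_absolute_percentage_error(flex_steps, flex2_steps):.1f}%")
```

Because the error is expressed relative to the reference series, swapping the two arguments generally gives a slightly different value, so the choice of reference device should be reported alongside the statistic.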