I Tested My VO2Max in a Lab to Find Out Which of These Nine Fitness Devices Was the Most Accurate

I have at least nine different devices at my disposal that can assess my cardio fitness. They all express it as a number that scientists call VO2max. But the only way to find out your actual VO2max is to get tested in a lab, so I knew what I had to do. Based on my results, I’ll tell you which devices gave me the best and worst readings, and what that means for my (and your) training going forward.

My test included devices from Apple, Coros, Fitbit, Garmin, Oura, Suunto, Ultrahuman, Withings, and Whoop. I wasn’t surprised that Garmin did well, but I expected more from some of the other brands, like Apple. There were also a few notable exceptions—you’ll have to read on to find out which devices performed worse.

What is VO2max and why is it important?

VO2max is a measure of cardiovascular fitness, so athletes and their coaches have long been interested in their VO2max numbers. But recently, VO2max has become a buzzword in the health and fitness space for some reasons that make sense, and some that are probably a little overblown.

I say “overblown” because VO2max is just one measure of fitness, not a comprehensive metric, even for athletes. And like virtually every metric you get from fitness tech, it’s on your watch because it’s easy to assess with a device, not because it’s necessarily the best thing to focus on. (Never mind that the estimate might not even be accurate.)

In any case, the main reason for the hype around VO2max is that it’s associated with longevity. Fitter people tend to be healthier and live longer, and VO2max gives a simple numerical value to the otherwise nebulous concept of fitness. A 2016 American Heart Association statement suggested that cardiorespiratory fitness may be a better predictor of mortality than traditional risk factors like cholesterol levels.

VO2max is also useful to track if you’re interested in your fitness for the sake of fitness. If you enjoy running or playing sports, your VO2max tells you how well your body can handle aerobic exercise, which directly relates to your improvement as an athlete.

So if your VO2max increases over time, that’s a good sign, whether you’re interested in winning races or just living a healthy lifestyle. Smartwatches often estimate your VO2max based on your training data, so almost every wearable these days will give you a VO2max estimate, sometimes called a “cardio fitness” score.

How do you actually test your VO2max?

Credit: Dr. Michelle Stehman

The easiest way to get an estimate of VO2max is to look at your watch and assume it will do the math for you. The next easiest is to take a field test, like the Cooper Test, which asks how far you can run in 12 minutes. But these are all estimates that may or may not be close to the truth.
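To make the field-test idea concrete, here is a minimal sketch of the commonly cited Cooper formula, which turns the distance you cover in 12 minutes into a VO2max estimate. The function name and the sample distances are my own, chosen for illustration, and the result is still only an estimate, not a lab measurement.

```python
def cooper_vo2max(distance_m: float) -> float:
    """Estimate VO2max (ml/kg/min) from the meters covered in a 12-minute run.

    Uses the commonly cited Cooper test regression:
        VO2max = (distance in meters - 504.9) / 44.73
    """
    if distance_m <= 0:
        raise ValueError("distance must be a positive number of meters")
    return (distance_m - 504.9) / 44.73


if __name__ == "__main__":
    # Hypothetical 12-minute distances, not data from the article.
    for meters in (1800, 2400, 3000):
        print(f"{meters} m in 12 min -> about {cooper_vo2max(meters):.1f} ml/kg/min")
```

The formula only sees distance, so pacing, terrain, and how hard you were willing to push all leak into the result.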

To actually test your VO2max, you need to go to a lab. And that’s why I headed to the Human Performance Lab at St. Francis University one sunny Tuesday, where Dr. Christopher Wisniewski and Dr. Michelle Stehman gave me a treadmill test.

I’ll describe how the testing went for me, but if you do your own VO2max test, things may be a little different. You may find yourself on a bike instead of a treadmill, for example, or you may do just a walking session, or you may combine a VO2max session with other health or fitness tests.

I was told not to drink alcohol for 48 hours before the test. No intense exercise for the last 12 hours. No caffeine or food for the last three hours. That last one scared me a little until I realized I had plenty of time to eat a normal breakfast before the midday appointment. I showed up in workout clothes and brought a water bottle, even though I couldn’t drink from it during the test. In hindsight, I should have also brought a snack to eat afterward while I waited for the results.

At the lab, I confirmed my answers on the health form I filled out when I made my appointment, and before we began, I took two puffs from my inhaler (I have mild asthma, which can sometimes be triggered by intense exercise). The scientists measured my weight and height, and then began hooking me up to the equipment that would monitor me during the test.

There was a chest strap to measure my heart rate, which they wanted to make “uncomfortably tight” and tuck under the elastic of my sports bra. Then there was a mask over my mouth and nose, measured to size and secured in place with straps that fit snugly behind my head. You can see it in the photo above, and it, too, was, by design, uncomfortably tight.

The tube attached to the mask didn’t actually pump oxygen into my mouth, as I had mistakenly assumed. Instead, I breathed normal room air, and the air I exhaled was sampled to determine how much oxygen and carbon dioxide it contained. The tube was rigid and supported by a stand, so every now and then I had to ask the scientists to move it slightly to the left or right so I could stay centered on the treadmill.

Before the treadmill started, there were a lot of little things to be aware of. For example, I couldn’t see the treadmill while I was running, which was more disconcerting than all the physical discomforts. The sign on the wall in front of me was perfectly centered on the treadmill, so I could use it as a visual anchor. If I drifted off-center, Dr. Stehman would tell me to move slightly to the right or left. If I wanted to steady myself on the handrail, I had to do so with my hand palm up, since palm down could affect my blood pressure readings. And yes, Dr. Stehman took my blood pressure with a cuff and stethoscope at several points during the test. Every few minutes, Dr. Wisniewski would ask how I was feeling and how hard I was working on a scale of 1 to 10.

We started with a brisk walk, 3.5 mph. Every three minutes it got progressively harder: a slow jog at 4.5 mph, then a more comfortable jog at 5.5, then my usual easy-running pace at 5.7 mph. After that, instead of speed, the incline increased. First 5 percent at 5.7, then 10 percent at 5.7. About a minute into this last stage, I gave up, grabbed the handrail, and signaled it was time to stop. The rest is a blur—I remember a walking cool-down and at least one other blood pressure reading. Dr. Wisniewski analyzed my results while I recovered and drank water. Not counting the cool-down, I was on the treadmill for just over 16 minutes.

Why is VO2max measured this way?

I’m going to get a little technical here to explain why I had to be hooked up to all this equipment on a treadmill. VO2max literally means the volume (V) of oxygen (O2) your body can use per minute, at its maximum (max), during exercise. It’s measured in milliliters of oxygen per minute per kilogram of your body weight. (Big people breathe more air than small people, even if they’re not necessarily in better shape, so the equation takes that into account.)
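Here is a small sketch of why the unit includes “per kilogram.” It converts an absolute oxygen uptake in liters per minute into the body-weight-relative number that watches and labs report. The function name and the example figures are hypothetical, chosen only to show the arithmetic.

```python
def relative_vo2(absolute_l_per_min: float, body_mass_kg: float) -> float:
    """Convert absolute oxygen uptake (L/min) to relative uptake (ml/kg/min)."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    # 1 liter = 1000 milliliters, then divide by body mass.
    return absolute_l_per_min * 1000.0 / body_mass_kg


if __name__ == "__main__":
    # Two made-up people who use the same absolute amount of oxygen.
    for mass_kg in (60.0, 90.0):
        value = relative_vo2(3.0, mass_kg)
        print(f"3.0 L/min at {mass_kg:.0f} kg -> {value:.1f} ml/kg/min")
```

Same oxygen use, very different scores: the lighter person ends up with the higher relative VO2max, which is the adjustment the parenthetical above describes.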

In common parlance, we often write this as “VO2max,” but I’ll format it scientifically just once so you can see it: “V̇O₂max.” The dot over the V means it’s a rate (volume per unit of time), not a total volume. If you hear runners talk about their VDOT numbers, that’s also referring to a VO2max estimate.

Why do we care about the amount of oxygen you use? Because it corresponds to how much work your body does. If you remember the cellular respiration equation from high school biology (glucose plus oxygen go in, and energy in the form of ATP comes out), knowing your oxygen consumption tells us how much energy your body is producing aerobically.

So if you put an elite athlete on a treadmill and increase the speed and incline, their body will be able to do a huge amount of work while consuming a lot of oxygen, and the test will show that they have a high VO2max.

On the other hand, someone who is out of shape and sedentary will not be able to do what an elite athlete does. They will be able to walk fast, maybe jog a little, but they will not be able to work as hard as the athlete, and therefore they will not consume as much oxygen. They will measure a lower VO2max.

Your VO2max can change over time. If that sedentary person starts exercising consistently and takes the treadmill test again a few months later, he’ll likely find that he can walk or run faster, maybe even handle a higher incline. The test will show that his VO2max has improved. Heck, he might even become an elite athlete someday.

On average, younger people tend to have higher (better) VO2max than older people, and men tend to have higher VO2max than women. Elite athletes have been recorded with VO2max scores in the 70s and 80s, but among recreational athletes, many of us will have scores in the 30s and 40s, maybe 50s. (For context, Garmin has a chart that breaks down what’s considered “good” by age and gender.)

How Smartwatches and Fitness Trackers Measure VO2max

Apple Health on a phone, with a Garmin watch on the left and a Suunto watch on the right. Credit: Beth Skwarecki

Your smartwatch (or tracker or bracelet) doesn’t know how much oxygen you’re breathing. Most of these devices use an algorithm that compares the intensity of your work (like how fast you’re running) to your heart rate.

For example, Garmin devices use GPS-tracked activities that last at least 10 minutes. Garmin can trim parts of your activity that aren’t useful, like when you stop to tie your shoes or chat with a neighbor.

From GPS, the device knows your speed. And from your heart rate, it knows how hard your body is working to maintain that speed. This approach is sometimes called a “submaximal” algorithm, because you don’t have to run at top speed to get useful data. Even a light jog can tell your Garmin or Apple Watch a lot about your fitness. If you can move at a good pace while your heart is beating at a steady, easy rhythm, you’re likely much fitter than someone whose heart is pounding out of their chest to maintain the same pace.
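To show the general shape of a submaximal estimate, here is a simplified sketch. It fits a straight line through heart rate and running speed, extrapolates to your maximum heart rate, and converts that projected top speed into oxygen use with the published ACSM running equation. This is not Garmin’s, Apple’s, or anyone else’s actual algorithm; the function names and sample numbers are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two different heart rates")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def running_vo2(speed_m_per_min, grade=0.0):
    """ACSM running equation: oxygen cost (ml/kg/min) of running at a speed and grade."""
    return 0.2 * speed_m_per_min + 0.9 * speed_m_per_min * grade + 3.5


def estimate_vo2max(samples, max_hr):
    """samples: (heart_rate_bpm, speed_m_per_min) pairs from steady easy running."""
    heart_rates = [hr for hr, _ in samples]
    speeds = [speed for _, speed in samples]
    slope, intercept = fit_line(heart_rates, speeds)
    # Project the speed you could hold if your heart were at its maximum.
    speed_at_max_hr = slope * max_hr + intercept
    return running_vo2(speed_at_max_hr)


if __name__ == "__main__":
    # Made-up easy-run data: (bpm, meters per minute).
    easy_runs = [(130, 160.0), (150, 200.0), (170, 240.0)]
    print(f"Estimated VO2max: {estimate_vo2max(easy_runs, max_hr=185):.1f} ml/kg/min")
```

Notice how much rides on max_hr: nudge it up or down by ten beats and the estimate moves by several points, which is one reason a wrong maximum heart rate can throw off a watch’s number.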

Each device has its own algorithm for converting the data it collects into a VO2max estimate, and it starts with recognizing when an activity can provide the algorithm with enough data. This varies by device; Garmin wants a minimum of 10 minutes of activity, while Coros wants 25 minutes. Often, you need to have a certain minimum heart rate for the algorithm to work. Here’s an example from Apple’s developer documentation that describes when and how it calculates VO2max from an activity:

“The system can generate VO2max samples after an outdoor walk, outdoor run, or hiking workout. During outdoor activity, the user must walk on a relatively flat surface (less than 5% incline or decline) with adequate GPS, heart rate signal quality, and sufficient exertion. The user must maintain a heart rate approximately greater than or equal to 130% of resting heart rate. The system can estimate VO2max ranges from 14 to 60 ml/kg/min.”
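The conditions in that quote read a lot like a checklist, so here is a rough sketch of them as code. It only restates the criteria quoted above; the class, field names, and activity labels are hypothetical and are not part of Apple’s API, and “sufficient exertion” is reduced here to the 130 percent heart rate rule.

```python
from dataclasses import dataclass

# Hypothetical labels for the three activity types named in the quote.
ELIGIBLE_ACTIVITIES = {"outdoor_walk", "outdoor_run", "hike"}


@dataclass
class Workout:
    activity: str               # one of the labels above, or anything else
    steepest_grade: float       # steepest incline or decline, as a fraction (0.03 = 3%)
    avg_heart_rate: float       # beats per minute during the workout
    resting_heart_rate: float   # beats per minute at rest
    good_signal: bool           # adequate GPS and heart rate signal quality


def can_estimate_vo2max(workout: Workout) -> bool:
    """Return True if the workout meets the quoted conditions."""
    return (
        workout.activity in ELIGIBLE_ACTIVITIES
        and abs(workout.steepest_grade) < 0.05
        and workout.good_signal
        and workout.avg_heart_rate >= 1.3 * workout.resting_heart_rate
    )


def in_reportable_range(vo2max: float) -> bool:
    """The quote says estimates fall between 14 and 60 ml/kg/min."""
    return 14.0 <= vo2max <= 60.0


if __name__ == "__main__":
    flat_run = Workout("outdoor_run", 0.02, 140.0, 55.0, True)
    hilly_hike = Workout("hike", 0.08, 140.0, 55.0, True)
    print(can_estimate_vo2max(flat_run), can_estimate_vo2max(hilly_hike))
```

The range check matters more than it looks: anyone whose true VO2max sits above 60 cannot get an accurate reading from a system that tops out there.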

This data varies by device. Some Garmin watches may use power meter data from your bike instead of GPS. These algorithms typically require the device to know your maximum heart rate, which they’re notoriously bad at estimating, but which they can measure directly if the device is programmed to do so. For a deep dive into what one of these algorithms looks like, here’s an article published by Firstbeat Analytics, which developed Garmin’s VO2max algorithm. (It’s unclear whether the data described here is exactly the same as what’s currently used by Garmin watches.)

But some devices don’t give you details about how they estimate your VO2max, and some seem to say they can come up with a number without collecting any exercise data at all. For example, Whoop says that “to calculate your score, the algorithm takes into account your continuous physiological data (including resting heart rate and heart rate variability), your exercise patterns, and performance metrics tracked by GPS (if enabled). It also takes into account how VO2 Max naturally changes with age and includes physical factors that affect oxygen use, such as height, weight, and biological sex.” My Whoop app tells me to do more GPS-tracked activities to improve my VO2max estimate, but the company says the app can provide a number even if it has no GPS data to work with.

Oura is a little different from the other devices I’ve tested. Instead of calculating a VO2max estimate based on your usual workouts, it asks you to take a six-minute walk test. This type of test is well-known in the medical field and has been used to estimate VO2max, albeit imperfectly.

But there’s one depressing thing to remember about all of this. When it comes to how accurate fitness watches actually are, we don’t have enough information to make a scientific judgment. I’ve discussed this issue before: Device makers aren’t required to back up their numbers or publish their methodology. They just stick whatever algorithm they want into whatever device they want, and leave it up to the rest of us to study if we want. By the time scientists can design a study, conduct it, and report the results, often enough time has passed that the model they tested is outdated.

Studies of smartwatch VO2max estimates generally show that they correlate with VO2max test results — the higher the smartwatch estimate, the higher the tested VO2max for the same person — but the exact number can vary quite a bit. For example, this study of the Apple Watch Series 9 and Ultra 2 concluded that “for people with good or excellent fitness, Apple Watch tended to underestimate VO2 max, whereas for people with poor fitness, it tended to overestimate.”

My results and winners

Credit: Beth Skwarecki

I got my official lab results shortly after finishing my treadmill test, and then at home I looked at the various fitness trackers I’ve been wearing lately. Some I tested for review, like the Garmin Forerunner 570, some I wear because they’re my personal devices and I use them out of habit (like the Oura ring), and some I had left over from a previous review. You’ll also see a few devices I haven’t finished reviewing yet; consider this a preview.

For any devices that didn’t have the latest data, I made sure to run them once or twice to allow them to recalibrate. When I had multiple devices of the same brand, they all fed data to the same app or algorithm, so I organized the results by brand rather than device. A full list of the devices I used is at the end of this article.

My VO2max, tested in the lab, was 42.8 ml/kg/min. That’s higher than most of the estimates I’ve gotten from my wearables, so I’m probably in better shape than most of them give me credit for. Still, some overestimated me: Garmin by just over one point, Whoop by about three, Ultrahuman by a staggering amount. Here’s the full list, sorted from closest to farthest:

  • Tested VO2max: 42.8

  • Garmin : 44 (1.2 points higher)

  • Fitbit : 41 (1.8 points lower)

  • Suunto : 40 (2.8 points lower)

  • Whoop : 46 (3.2 points higher)

  • Apple Watch : 37.9 (4.9 points lower)

  • Coros : 37 (5.8 points lower)

  • Oura : 37 (5.8 points lower)

  • Withings : 36 (6.8 points lower)

  • Ultrahuman : 61 (18.2 points higher)
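If you want to check the arithmetic behind that ranking, here is a short sketch that recomputes each brand’s difference from my lab result and sorts by how far off it was. The numbers are the ones listed above; the variable and function names are just for illustration.

```python
LAB_VO2MAX = 42.8  # ml/kg/min, from the treadmill test

ESTIMATES = {
    "Garmin": 44.0,
    "Fitbit": 41.0,
    "Suunto": 40.0,
    "Whoop": 46.0,
    "Apple Watch": 37.9,
    "Coros": 37.0,
    "Oura": 37.0,
    "Withings": 36.0,
    "Ultrahuman": 61.0,
}


def signed_errors(lab, estimates):
    """Device estimate minus lab value; positive means the device overestimated."""
    return {brand: round(value - lab, 1) for brand, value in estimates.items()}


def rank_by_accuracy(lab, estimates):
    """Sort brands from smallest to largest absolute error."""
    errors = signed_errors(lab, estimates)
    return sorted(errors.items(), key=lambda item: abs(item[1]))


if __name__ == "__main__":
    for brand, error in rank_by_accuracy(LAB_VO2MAX, ESTIMATES):
        direction = "higher" if error > 0 else "lower"
        print(f"{brand}: {abs(error):.1f} points {direction}")
```

Sorting by absolute error treats a miss in either direction the same way, which is why Whoop’s overestimate lands between Suunto and Apple Watch rather than in a group of its own.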

Garmin came out on top with a VO2max estimate of 44, just 1.2 points higher than the actual value. I expected Garmin to be pretty good, since it knows my exact max heart rate, and I’ve already seen that its 5K run time estimate was pretty close to my actual time.

I didn’t expect Fitbit to be next in line, but hey, good job Fitbit. I’ve seen other reviews that mention Suunto as having fairly accurate VO2max estimates, so it was nice to see Suunto do well here, even if it was still a bit behind.

After that, Whoop stands out with its 46, which is about three points too high. Whoop doesn’t disclose how exactly it estimates VO2max, but since it’s not supposed to require any exercise data at all, I don’t put too much faith in it. (I did provide it with some GPS data during testing, and it said that improved the accuracy of my estimate.) If it’s a guess, it’s at least a flattering guess.

Ultrahuman’s estimate is so far off the mark that I almost didn’t include it. I only started testing the Ultrahuman ring a few days ago and have only done two workouts with it so far, but every other device on my list was able to give a reasonable estimate once a number appeared. I checked my settings and found that I can’t edit the max heart rate that Ultrahuman calculates for me, which likely impacts the accuracy of my VO2max estimate. But if the Ultrahuman app is working with bad data as a design decision, I’m not being unfair by using the number it gives me. So it’s on the list, with my concerns noted.

The rest are all about five or more points below my lab result. If I trusted my Apple Watch, I’d think I was a lot less fit than I actually am. Together with Coros, Oura, and Withings, it put me in the 30s. I really can’t be too impressed with any of them.

Limitations

The biggest caveat to my results is this: I’m just one person. If you did the same experiment with 100 different people, we probably wouldn’t get the same results. Some devices may be more accurate with young athletes, some with regular people, some with people who naturally have higher or lower heart rates, and so on. Devices change. Software updates. Please think of my results as a snapshot of one person on one day with this particular set of devices.

Each device’s VO2max estimates have their own parameters that I’m not necessarily aware of. I’ve done my best to ensure that each app has the correct (or close enough) weight, age, and, where possible, maximum heart rate. But since companies don’t all disclose what variables they use in their calculations, I don’t have a full list of numbers to go back and double-check.

There is also no perfect test, even if it is done as well as possible. If I had done the VO2max test on a different day or in a different lab, my result might have been slightly different, and the ranking order would not be quite the same.

How useful is the VO2max metric on your device?

I’ll be honest: After all this science, I can boil down the practical advice to three words: “Increase the number.” Whether your VO2max comes from a lab test or a smartwatch, it will increase as you exercise more, and more consistently.

If the number increases or stays steady at a relatively high level, you’re probably doing something right. If it decreases over time, you can take it as a nudge to do a little more cardio.

(If your watch’s estimate isn’t getting better as you feel like you’re getting fitter, I’d check by testing your fitness in another way, like timing yourself on a run of a certain distance, or even assessing how you felt during a workout you’ve done in the past and seeing if it improves over time. But generally, we’d expect changes in these VO2max estimates to correspond with improving fitness.)

In addition to estimating your VO2max, most of these devices also tell you how your VO2max compares to other people in your gender and age group. Garmin rates me as “excellent,” and once, for a moment, I had a brief rating of “excellent.” Apple says my cardio fitness is “high,” Oura says my cardio fitness is “peak,” and Suunto says I’m “excellent.”

Without being too picky about where the edges of these ranges might be, I think they’re fair judgments, given that the lab said I’m in the 96th percentile of my cohort of middle-aged women. That sounds impressive on paper, but in real life, I’m a pretty average runner. That “for your age and gender” asterisk does a lot of the work.

But let’s step back for a minute. VO2max is just a number. My real goals in life are to be healthy and happy, and maybe improve my 5K time for fun. If I were a true masochist like some people, I might add that I want to run faster and faster marathons.

Your VO2max is related to all of these, but they’re not literally the same thing. You can have a high VO2max and still have health problems. Athletes often find that their actual race times are faster or slower than their VO2max test results suggest. Coaches don’t just say, “Let’s raise your VO2max.” They make runners work on their lactate threshold, running economy, mental toughness, leg strength, and dozens of other things.

Health and fitness are multifaceted and cannot be reduced to a single number. So while you can use VO2max (or its estimate on a smartwatch) as a shorthand for cardio fitness, it is certainly not a direct measurement, and reaching a certain VO2max number does not unlock a certain level of health or longevity.

Specific models of devices I used

In some cases, multiple devices were reporting data to the same app or algorithm. For example, even if you have three Garmin watches linked to the same account, you will only get one VO2max reading that will show up in the Garmin Connect app and on any watch. The watches will not disagree with each other in their readings.

I’ve tested other devices from these brands in the past and have never seen a significant difference between devices within the same brand. For example, I recall seeing similar cardio fitness scores from the Fitbit app whether I was wearing the Charge 6 or the Pixel Watch 3, so I feel pretty confident reporting these scores by app rather than by device.

With that in mind, the list below includes the devices I used during my VO2max lab testing as the primary sources for each brand’s assessment.

  • Apple Watch : Series 10 (GPS + Cellular, 42mm)

  • Coros : Pace 3 (and, less often, the Pace Pro)

  • Suunto : Suunto Run

  • Withings : Scanwatch 2

  • Whoop : Whoop 4.0

  • Ultrahuman : Ring AIR

I made sure to get an updated VO2max estimate from each device within a week of my VO2max lab test (either before or after the test, whichever was more convenient). The only exception was Whoop, which requires 14 days of recent sleep data to provide you with an up-to-date VO2max estimate. My last VO2max estimate from Whoop was three weeks before my VO2max lab test.
