(Also note: A few weeks after the test was published Garmin released a new firmware update that fixes the FR610 issues, and I redid some of the tests here).
Last fall, at the request of many, I put together a bit of a Sport Device GPS Accuracy Showcase Showdown. That test had just four devices – the Garmin FR310XT, Edge 500, Edge 800, and the Timex Global Trainer. The goal at the time was to look into how accurate sports GPS devices really were against a measured course, and that goal remains unchanged this year.
This time though, I doubled-down and went to nine devices:
I wanted to take out the most popular new devices (and a few vintage/popular devices) and see how they fared against each other at the same time on differing courses. In doing so I scouted out five different locales that covered a variety of scenarios, and then did each course three different ways – walking, running and cycling.
Finally, with so many of the devices now using the exact same underlying GPS chip – how important is the software that runs the device when it comes to distance accuracy? And would devices made by the same manufacturer with the same chips yield the same results?
Let’s get into how it went down.
Over the course of a number of weekends I did five separate and distinct tests. The tests involved five different courses with varying levels of ‘difficulty’. These included:
1) The Straight and Narrow: A perfectly straight out and back course with low tree coverage – exactly 1.00 miles.
2) The Rambling Loop: A half a mile loop with plenty of turns, a bit of trees and some ups and downs – basically your average run.
3) Circles around the track: Two laps around a 400m track, for a total of 800m, all on the inside lane.
4) The underpass and bridge test: Many of us run/ride under bridges and overpasses – how would a short out and back session under the four lane bridge fare? Would the GPS handle drops in satellite availability and then regain the signal?
5) The Deep Tree Adventure: Into the woods I went, on a fairly twisty route on dirt trails.
Of course, you may be wondering how I was measuring these. Well, you might remember back to some of my previous course measuring posts, where I picked up a rolling measuring tape – similar to those used to mark out distances for cross country meets.
I took that little roller and then strapped all of the GPS devices onto it. By doing so I could guarantee they were all going the same distance at the same time. Plus, it looked way less sketchy than strapping them on my wrist. For more on my methodology see the end of the post.
Test 1: The Straight and Narrow:
This is my de facto first test, which is about as simple as it gets for a GPS device. All it’s gotta do is just go half a mile out, and half a mile back on an almost perfectly straight course with little tree cover, no telephone poles or anything else blatantly blocking the view.
Here’s what the trail looks like from the ground:
Last year when I did this, all of the devices were within 1.2% of the baseline distance, with the majority even closer than that.
For the methodology I staked out a start line with some pink tape (easier to see from a distance than chalk), and then walked outwards half a mile until the measuring wheel read 2,640 feet (half a mile). At that point I planted another pink flag in the ground, did an about face, and came back. I gave the units about 8-10 seconds to stop and register the turn – the same process as last year. I did this once walking (with the measuring wheel), and then running, and then finally cycling. I worked to stay within one foot of the path’s edge in both directions.
This year I saw a tiny bit more variance in the numbers, but still all contenders did very well on this test (as they should) – with all but one performing within the usual 2.5% that GPS is quoted as being accurate to. Here are the results:
I thought it was interesting to see that the units seemed to do best at speed, which is a trend I’ve noticed a bit over the past two years of this. It’s almost as if the walking allows the GPS to wander a bit and decrease accuracy.
The only outlier was the walk test with the Android phone. It’s a bit unclear why it went over so much, but it seemed to get back on the mark with the run and the bike portions.
With the straight and simple tests complete, it was time to kick it up a notch.
Test 2: Circles around the track:
I added this test this year for two reasons. Firstly, because you asked me to…and secondly…it’s really easy to do. Since I don’t have to measure the track (known 400m per loop), and I’m not allowed to ride my bike on it, it makes it a rather quick test to knock out and at the same time answers many questions about how well the units perform on a track.
Now, I should point out that generally speaking when I’m running on a track during a workout I’m using the track as the distance ruler – not my GPS (I use that instead as a general reminder of roughly where I am in a workout). But I do understand that some prefer the GPS measurements from a tracking standpoint.
This test had two parts. First, I ran 800m (two loops) on the inside lane. After that I walked 800m, again on the same inside lane. I started and stopped at the same line both times.
Here are the results of the first test (running this time). Keep in mind that since we’re only talking half a mile, being off by just .01mi gets a unit knocked down 2% to 98%. Which means it’s plausible that the watch was all of a few feet from rounding over/under. In other words, in shorter distance tests don’t read too much into a unit being only .01mi off.
As you can see, most of the devices did quite well. Track corners do tend to throw GPS for a bit of a loop in general (no pun intended), but these are pretty close to spot on. And as I kinda said before, if you’re on a track, use the known distances there if getting down-to-the-second precision is necessary. The RCX5 did perform a bit lower than the rest, a data point that seemed to continue on the track in the walk test. The Edge 800 also had some issues on the run on the track, but did better in the walk test.
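To put a number on the .01mi caveat above, here’s a minimal sketch of how much a single display tick is worth on courses of different lengths. The function name is made up for illustration, and it assumes the display resolution is exactly .01 miles.

```python
def one_tick_percent(course_miles, resolution_miles=0.01):
    """Percent of the course that one display increment represents."""
    return resolution_miles / course_miles * 100


# Course lengths taken from the tests in this post (track, loop, straight).
for course in (0.50, 0.52, 1.00):
    print(f"{course:.2f} mi course: one tick = {one_tick_percent(course):.1f}%")
```

The shorter the course, the more a single tick costs, which is why the half-mile tests look harsher than the one-mile test for the same .01mi miss.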
Now let’s add in that walk. For most of the devices we see pretty much the same, except there’s one big outlier:
That’d be the one at 43% marked in red, by the way. So what happened? How does it go from what should be .50 miles to 1.15 miles? Well, it didn’t actually get lost…it just got distracted. You can see that below in the map, where it starts off good for the first 30 or so meters, and then poof – it goes and visits some houses, before visiting another set of houses:
The catch is that this isn’t terribly uncommon with GPS data from iPhones, or from cell phones in general. You’ll find plenty of complaints about incorrect data points, often caused when phones attempt to pick up location data using WiFi hotspots instead. In fact, if you look carefully you’ll notice that it goes straight to some houses on the right side, and then again back to houses on the left side – exactly as if it picked up WiFi as I got close to each set of houses (with nothing but thin air in between the track and the homes). In my tests I didn’t disable WiFi, and while I could have – to me that’s just an annoyance that many users wouldn’t bother with. Additionally, this is the only time I saw this occur throughout the tests, and it only occurred with the iPhone.
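As for where a figure like 43% can come from: one formula that is consistent with both that number and the 98% for a .01mi miss is to divide the smaller of the two distances by the larger. This is an assumption for illustration, not necessarily the exact math behind the results tables, and the function name is hypothetical.

```python
def accuracy_percent(known_miles, reported_miles):
    """Symmetric accuracy: smaller distance over larger, as a percentage.

    Assumed formula; it penalizes overshooting and undershooting alike.
    """
    low, high = sorted((known_miles, reported_miles))
    return low / high * 100


print(round(accuracy_percent(0.50, 1.15)))  # the iPhone walk on the track
print(round(accuracy_percent(0.50, 0.49)))  # a unit that comes up .01mi short
```

The nice property of a ratio like this is that it never goes negative, even when a device more than doubles the real distance.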
All in all though, most units did just fine with the track test, so let’s move onto a more common scenario.
Test 3: The Rambling Loop
The rambling loop is simply what most people likely run on a day to day basis. It’s half on a running path/sidewalk next to the street before cutting across a parking lot, ducking under a small grove of trees onto a gravel road near a small pond before looping back around. All in, the loop is .52 miles long. Measured this day as 2,727 feet (last year it was 2,733 feet, so only six feet difference – impressive year over year measurement). Here’s an overview:
And here are a few pictures from the ground. First, looking across the parking lot, I run towards the camera, and then the trail behind the waterpark, where I run away from the camera:
For this test, I did three variations: Walk, Run, Bike. Each variation got progressively faster with the change from the walk to the bike – which is the idea, to see how it tracks at varying speeds.
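Going back to the wheel measurements for this loop, here’s a quick sketch of the feet-to-miles conversion and the gap between the two years’ readings. The helper name is just for illustration; the only inputs are the two wheel readings quoted above and 5,280 feet per mile.

```python
FEET_PER_MILE = 5280


def feet_to_miles(feet):
    """Convert a measuring-wheel reading in feet to miles."""
    return feet / FEET_PER_MILE


this_year_ft = 2727
last_year_ft = 2733

# Both readings round to the same value at .01mi display resolution.
print(round(feet_to_miles(this_year_ft), 2))
print(round(feet_to_miles(last_year_ft), 2))

# Gap between the two wheel measurements, in feet and as a percentage.
diff_ft = last_year_ft - this_year_ft
print(diff_ft, f"{diff_ft / last_year_ft * 100:.2f}%")
```

In other words, the wheel itself repeats to well within what any of the watches can display.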
With that, here’s the results for the rambling test:
As you can see – almost all the units were within 2% on all the tests. The only outlier this time was the Garmin FR610, which was steadily set in thinking it was .50 miles each time. It should be noted that the FR610, FR305 and RCX5 all were set to 1-second recording rates, while the other units were using either Smart Recording or proprietary recording algorithms (could not change). And finally, again you notice a very subtle trend where higher speed tests actually do better than lower speed tests.
So with Part 1 complete you’re seeing that the majority of the time in ‘normal’ conditions the units do fairly well. Sure, there are some outliers, but we’re not seeing huge 5-10% issues like in the past. It’s also really important to stress that the colors are there merely to allow the brain to quickly see the differences, but in some cases – we’re talking only .01-miles differential between a ‘green’ and a ‘yellow’.
Tomorrow though, we’ll dive into the trees and tunnels and start to really separate the boys from the men…or the girls from the women, and then we’ll wrap up with all of the details consolidated so you don’t have to scroll up and down a bunch.
For the rest of the tests, as well as the summary results, head on over to Part II…
Q&A Post Show For Part I:
When you start doing testing of any sort there are invariably just as many questions as answers. So, I wanted to outline some of the reasons I did what I did, in an effort to answer as much as possible. Do keep in mind that while my goal is to be as accurate as possible, it’s also to be as realistic as possible to a normal runner. Additionally, I wanted to cover as many devices as possible in as many scenarios as possible…while still trying to make it possible to spend some time with my fiancée. Sometimes you have to eventually draw the line on data collection methodology (especially since this isn’t my day job).
Q: Why didn’t you include XYZ watch?
A: It depends, but in most cases it’s because I didn’t have it or already tested it, or it’s a duplicate of one I tested. For example, I didn’t do the FR410 because I never re-purchased it after the review since it’s too similar to the FR405/FR405CX (same chipset) and I’m not personally a huge fan of the bezel. I didn’t add the FR110 because it’s just the exact same watch as the FR210, except without footpod support and the instant-pace data field. I didn’t add the Timex Global Trainer because I did that last fall, and saw no reason to beat that dead horse again. And finally, I didn’t add any Suunto stuff…simply because I don’t have any Suunto stuff.
Q: You should have used XYZ phone app, it’s better.
A: True, there are no doubt better apps. And any time you choose apps there are difficult decisions – you know, like Angry Birds Classic or Angry Birds Rio. But I specifically used RunKeeper for a few reasons. First, it’s offered on three platforms (iPhone, Android, Windows Phone 7), so it makes it super easy for me to consolidate the data afterwards since it ends up in one account. Second, they have the largest install base as far as running apps go – thus, the most people are probably using it. Third, they’ve done incredible work to try and fix phone GPS data coming into it. Of course, others have done this work as well, but then see point #1 again. :) Fourth, since they offer it on multiple platforms you can attempt to compare hardware a bit more than comparing developer coding skills (though, not 100% of course). Fear not though, I’ll be doing an app comparison at some point, though I realize I’ve been saying that for some time. Oh, and as a side note, the iPhone was running version 4.3.2 and the Android phone kernel 2.6.29.
Q: Why don’t you change the position/orientation/sunlight levels/etc of the watches/phones?
A: I tried to remain as close as possible to how a person would normally wear their watch or phone. For the phones, they went in pouches designed for them and onto my arm – exactly as intended. For the watches, they went on the pipe portion of the measuring wheel that I ran in front of my chest with – positioned about the same distance to my body as my wrist would be. While it’s plausible that one watch got a slightly better ‘deal’ placement wise than another – that’s life. If a watch can’t get good results because I moved it three inches to the left, then quite frankly…the watch sucks.
Q: What about XYZ watch that comes out in XYZ timeframe?
A: Eventually I had to draw a line on waiting. If/when Garmin and/or Timex and/or someone else comes out with a new watch, I’ll circle back and do accuracy tests with those. At this point I’d suspect I’ll wait until all of the summer related releases occur (including things leading up to EuroBike/Interbike in late August).
Q: Why didn’t you do any really long courses – like 30 miles?
A: I have no desire to either run or walk 30 miles, it breaks a key running rule I have: Thou shalt not utilize thy feet to go more than 26.2 miles without wheels. (Exceptions to the rule are made if people are running after me with weaponry, or a giant Smurf is chasing me.) Even if I had ignored the walking/running piece and simply done bike tests – they would only be comparing against themselves. While you could use a speed sensor, once you look into that a bit more – they’re not quite as accurate as a Jones Counter bicycle, and that involves a lot more complexity around PSI and everything else.
Q: Why did you reduce from exact meters down to one-hundredths of a mile?
A: I did this for two reasons. First, none of these units display meters to the end user, but rather all display in .01 increments, so it seemed a bit odd to measure more precisely when, at the end of the day, the unit only shows the rounded value to the user. After all – it’s what you see that matters, not what the unit internally calculates. Secondly, I did this to make it more manageable for me. With nine devices, five umbrella tests, and upwards of three subtests per major test, I was looking at upwards of 135 files to consolidate, analyze and record – across many different platforms, file types and services.
Q: Why don’t you pick up a Jones counter bicycle to measure the routes?
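As a rough back-of-the-envelope check on those two points, here’s a small sketch of the file count and of what one .01mi display tick works out to in meters. The variable names are just for illustration; the counts come from the answer above, and the mile-to-meter factor is the standard 1,609.344.

```python
METERS_PER_MILE = 1609.344

devices = 9
umbrella_tests = 5
max_subtests = 3  # walk, run, bike (not every test had all three)

# Upper bound on the number of files to consolidate.
max_files = devices * umbrella_tests * max_subtests

# Size of one .01mi display increment, in meters.
tick_meters = 0.01 * METERS_PER_MILE

print(max_files)
print(round(tick_meters, 1))
```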
A: Because my fiancée would kill me if I added yet another bicycle to the stable. But more practically, some of the routes I’ve chosen only permit running/walking – such as the one in the park in Part II and the track, where bicycles are not allowed on the trails/tracks. Further, my little roller does just fine for my purposes. :)
For the rest of the tests, head on over to Part II…