Is Google Analytics Accurate?


By Jay Adamsson

Google Analytics is not perfect. Its data has accuracy limitations that most casual users do not realize. But does this mean that Google Analytics cannot be trusted? Of course not. It provides a wealth of information that is invaluable to anyone running a website strategically. It simply means that the numbers have to be interpreted with those limitations in mind.

First, there are three main technical areas in which visitors, or some of their activity, may not be counted:

Google Analytics depends on two technologies: JavaScript and cookies. Both are common technologies available in every modern browser, including those on tablets and smartphones. So, if everyone has these technologies, why does everyone not get counted? Simply because they are easy to turn off. Particularly with the recent increase in awareness of privacy issues, browsers have made it easier to turn off anything that can be used to track information about the user, including JavaScript and cookies. So, to protect their privacy, some people turn these off routinely. These people do not get tracked in Google Analytics. Is this a big concern? No. The vast majority of people leave these on, allowing Google Analytics to track their activity. But you should be aware that there is a segment of the population that does not show up in your Analytics data.

As your website gets larger and more popular, another factor can come into play. Google limits the amount of data that is captured, at least in the free version of Analytics. Once your hits reach large numbers (millions of hits monthly), Google no longer captures all of the information, only a sample. In addition, some reports start to show sampled data much sooner. So, although all the data may be on Google’s servers, these reports will not show all the detail, only an estimate based on a smaller sample. This does not mean that the data cannot be trusted, as the sampling process is quite sophisticated and produces accurate results. But it is not an exact count.
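To see why a sampled report is an estimate rather than a count, here is a minimal sketch in Python. It is not Google's actual sampling method, which is not described here; the function name, the hit format, and the 10% sample rate are all assumptions made for illustration. It looks at a random subset of hits and scales the result back up.

```python
import random


def estimate_from_sample(hits, sample_rate, predicate, seed=0):
    """Estimate how many hits match `predicate` from a random sample.

    Hypothetical helper for illustration only; this is not how
    Google Analytics is implemented.
    """
    rng = random.Random(seed)
    # Keep each hit with probability `sample_rate`.
    sampled = [hit for hit in hits if rng.random() < sample_rate]
    matches = sum(1 for hit in sampled if predicate(hit))
    # Scale the sampled count back up to the full data set.
    return round(matches / sample_rate)


# Made-up traffic: 100,000 hits, 30% of them on a /pricing page.
hits = [{"page": "/pricing" if i % 10 < 3 else "/home"} for i in range(100_000)]
estimate = estimate_from_sample(hits, 0.1, lambda hit: hit["page"] == "/pricing")
print("true count: 30000, estimate from a 10% sample:", estimate)
```

The estimate should land close to the true figure, but it will rarely match it exactly, which is the point of the paragraph above.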

For webmasters who have implemented events on their website, there are data limitations as well. Google wants to prevent people from going overboard with their event generation. So, after the first few events are generated on a page, Google limits the number of events that are captured from that point on. This really only affects people who are generating large numbers of events, such as trying to capture all mouse movements, or generating time-based events with a very short interval. But it can have a significant impact in these instances.
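One common way to build this kind of limit is a "token bucket": a small allowance of events up front, refilled slowly over time. The sketch below assumes that model purely for illustration. The function name and the numbers (10 events up front, 1 more per second) are assumptions, not Google's documented limits.

```python
def count_accepted_events(event_times, burst=10, refill_per_second=1.0):
    """Return how many events a simple token bucket would accept.

    event_times: timestamps in seconds, in ascending order.
    burst and refill_per_second are illustrative values only.
    """
    tokens = float(burst)
    last_time = None
    accepted = 0
    for t in event_times:
        if last_time is not None:
            # Refill for the time elapsed, never above the initial allowance.
            tokens = min(float(burst), tokens + (t - last_time) * refill_per_second)
        last_time = t
        if tokens >= 1.0:
            tokens -= 1.0
            accepted += 1
        # Otherwise the event is dropped and never reaches the reports.
    return accepted


# A page firing an event every 1/8 of a second for ten seconds.
rapid_fire = [k * 0.125 for k in range(80)]
print(count_accepted_events(rapid_fire), "of", len(rapid_fire), "events captured")
```

Under these assumed numbers, a site that fires an event every couple of seconds never notices the limit, while a page that tracks every mouse movement loses most of its events.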

Besides these technical areas, the way Google interprets information also causes limitations. Essentially, what Google captures is when a browser goes to your website. It can then read a few pieces of information – the IP address for the visitor’s computer, what pages they visit on your site, and some web history of that browser.

Notice one important factor here – I say the history of the browser, not the user.

To Google Analytics, a visitor is a browser. So, if one person visits your site with Internet Explorer, then immediately visits your site again with Firefox, that gets counted as two visits from two visitors in Google Analytics. Someone does not even have to change computers – they simply have to change browsers. Likewise, if one person visits your site today, then your site gets visited by their son tomorrow using the same computer and the same browser, Google Analytics interprets this as a single visitor with two visits. There is no way for Analytics to tell the difference. This again produces inaccuracies in these reports.
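The two scenarios above can be sketched in a few lines of Python. This is a simplified model, not Analytics code: the function and the cookie IDs are made up for illustration, and the only assumption is that the tracker identifies a visitor by an ID stored in the browser's cookie.

```python
def summarize_visits(visits):
    """Compare what a cookie-based tracker sees with what really happened.

    Each visit is a (person, browser_cookie_id) pair. The tracker never
    sees `person`; it only sees the cookie ID stored in the browser.
    """
    visit_count = len(visits)
    visitors_seen = len({cookie_id for _person, cookie_id in visits})
    actual_people = len({person for person, _cookie_id in visits})
    return visit_count, visitors_seen, actual_people


# One person, two browsers on the same computer.
two_browsers = [("alice", "ie-cookie-1"), ("alice", "firefox-cookie-2")]
# Two people sharing one browser.
shared_browser = [("parent", "cookie-3"), ("son", "cookie-3")]

for label, visits in [("two browsers", two_browsers), ("shared browser", shared_browser)]:
    visit_count, visitors_seen, actual_people = summarize_visits(visits)
    print(f"{label}: {visit_count} visits, {visitors_seen} visitors reported, "
          f"{actual_people} actual people")
```

In the first case the tracker reports one more visitor than really exists, and in the second case one fewer.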

Finally, Google has recently added two new items to Analytics: the Demographics and Interests reports. The data in these reports is again based on browser usage, but it is inferred rather than measured, which makes it even less accurate. To get this information, Google looks at the browser’s history and essentially makes its best guess at the age, gender, and interests of the user. There is no way for Google to know exactly who is using the browser, or their exact demographics and interests. This adds yet another level of abstraction, making the data even more inaccurate.

Now, does all this mean that the data is useless? No, of course not. Google is made up of some very smart people, and they have built some very advanced ways to interpret data, and they do a very good job of it. However, it is not exact. Google Analytics is best used to spot trends, not as an exact measure.