What Is the Best Way to Reliably Track Outbound Clicks?

Although we try to show our users all the information they need about a course, sometimes they still want to take a look at the course provider’s own website. That’s why we include a link to the provider’s webpage on our own training pages. We do of course track these clicks: we consider them conversions, and we get paid per click in addition to being paid per lead. But how do you reliably track these clicks?

We used to track these clicks with a slew of JavaScript onClick handlers (Google Analytics, Clicky and our own internal click handler), but we soon found that the numbers reported by each method differed greatly. There can be many reasons for this. The most likely one is that not all of our click handlers fired before the browser navigated to the external site, or that their assets had simply not loaded in time, as explained by Andre Scholten.

More information about this course

So which statistics do you trust? Worse, we had a gut feeling we were missing external clicks because people didn’t have JavaScript enabled or were blocking tracking cookies, either through personal settings or a company firewall.

Solution: track clicks on the server

We could have solved the issue of the click handler not firing by adding a timeout to the redirect. That would mean making users wait after they have clicked, just because we want to track the click. Honestly, we hate waiting for sites to load, so why bother our users with a delay because we want to track their behavior? It would hurt the user experience. So we went with the old-fashioned approach: registering the click on the server and then sending the user on to the external site with a 301 redirect.

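# http://springest.nl/provider/training/registerClick?ref=http://www.example.com
function registerClick($ref) {
    // Record the click before sending the user on.
    saveClick();
    // Redirect to the provider's site, then stop so nothing else is output.
    header("Location: $ref", true, 301);
    exit;
}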

Because we do a lot of processing after a click, the whole operation can take quite some time. To keep the redirect fast, the request itself only creates a job in Beanstalk, and the actual processing happens later. When this was deployed we suddenly noticed a huge increase in the number of external clicks we registered! Our gut feeling was right: we were missing clicks from people who can’t use JavaScript! Yeah!
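
To make the "queue now, process later" idea concrete, here is a minimal sketch of what the request side could look like. It is not our production code: the `ClickQueue` interface, the `InMemoryClickQueue` class, the `enqueueClick` function and the payload fields are all hypothetical names chosen for illustration, and the in-memory queue only stands in for a real Beanstalk client so that the example runs on its own.

```php
<?php
// Hypothetical queue abstraction. In a real setup this would wrap a
// Beanstalk client; the names here are illustrative assumptions.
interface ClickQueue
{
    public function put(string $payload): void;
}

// In-memory stand-in so the sketch runs without a queue server.
final class InMemoryClickQueue implements ClickQueue
{
    /** @var string[] */
    public $jobs = [];

    public function put(string $payload): void
    {
        $this->jobs[] = $payload;
    }
}

// Capture only what a worker needs later; no heavy processing happens here.
function enqueueClick(ClickQueue $queue, string $ref, string $ip, string $userAgent, int $timestamp): void
{
    $queue->put(json_encode([
        'ref'        => $ref,
        'ip'         => $ip,
        'user_agent' => $userAgent,
        'clicked_at' => $timestamp,
    ]));
}

$queue = new InMemoryClickQueue();
enqueueClick($queue, 'http://www.example.com', '203.0.113.7', 'ExampleBrowser/1.0', 1300000000);
// In the real handler, the Location header would be sent right after this call.
```

The point of the design is that the only work done before the redirect is serialising a small payload, so the user never waits for the statistics to be updated.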

A new problem: crawlers

Our joy didn’t last long: we found out that a lot of these clicks came from crawlers and other automated processes that visit our site. Most crawlers are quite polite and identify themselves, so these were easily blocked by filtering on their user agent strings. However, we still found a large number of “illegal” clicks coming from (mostly) Chinese IP addresses. When we were using JavaScript to track clicks, these bots weren’t counted because they don’t parse JavaScript, but now that we are tracking on the server side they suddenly show up in large numbers.
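
Filtering the polite crawlers can be as simple as a substring check on the user agent. The sketch below is an assumption about how such a filter might look, not our actual list: the `BOT_UA_MARKERS` constant and the `isSelfIdentifiedCrawler` function are hypothetical, and the markers are only a few common examples.

```php
<?php
// Illustrative substrings only; a real list would be longer and needs upkeep.
const BOT_UA_MARKERS = ['bot', 'crawler', 'spider', 'slurp'];

function isSelfIdentifiedCrawler(string $userAgent): bool
{
    // Treating an empty user agent as automated is a policy assumption.
    if (trim($userAgent) === '') {
        return true;
    }
    // Case-insensitive match against each known marker.
    foreach (BOT_UA_MARKERS as $marker) {
        if (stripos($userAgent, $marker) !== false) {
            return true;
        }
    }
    return false;
}
```

A click would then only be counted when `isSelfIdentifiedCrawler($_SERVER['HTTP_USER_AGENT'] ?? '')` returns false. Substring matching is blunt and can misclassify a legitimate browser whose user agent happens to contain one of the markers, and it does nothing against bots that pretend to be a browser.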

It’s not difficult to filter these IPs, but the problem is that they change faster than you can block them. Of course we do block them and correct our statistics, so that our clients don’t pay for fake clicks. But this is obviously not a very elegant or scalable solution.
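
One way to slow down the cat-and-mouse game a little is to block whole ranges instead of single addresses. The following is only a sketch under that assumption: `ipInCidr` and `isBlockedIp` are hypothetical helpers, the code handles IPv4 only, and the ranges in the usage note are reserved documentation ranges rather than anything we actually block.

```php
<?php
// Returns true when $ip falls inside $cidr (IPv4 only in this sketch).
function ipInCidr(string $ip, string $cidr): bool
{
    // A bare address without "/bits" is treated as a /32.
    [$network, $bits] = explode('/', $cidr, 2) + [1 => '32'];
    $ipLong  = ip2long($ip);
    $netLong = ip2long($network);
    if ($ipLong === false || $netLong === false || !ctype_digit($bits) || (int) $bits > 32) {
        return false;
    }
    $bits = (int) $bits;
    // Build the netmask; a /0 prefix matches every address.
    $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

// Checks an address against a list of blocked CIDR ranges.
function isBlockedIp(string $ip, array $blockedRanges): bool
{
    foreach ($blockedRanges as $range) {
        if (ipInCidr($ip, $range)) {
            return true;
        }
    }
    return false;
}
```

For example, `isBlockedIp('198.51.100.9', ['203.0.113.0/24', '198.51.100.0/24'])` returns true. Range blocking is still reactive, and a range that is too wide will also discard clicks from real users, so it eases the maintenance burden rather than removing it.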

Is there a silver bullet?

It seems like there are three main approaches to this problem:

  1. use JavaScript and accept some false negatives.
  2. use JavaScript with a delay and maybe still have some false negatives, plus annoy people a bit with the waiting time.
  3. use a server-side approach and have lots of false positives, which you will be battling like there’s no tomorrow.

Oh, and we’re not even considering putting one of those ugly “Thanks for clicking, wait till we redirect you to the other website while we let you look at some more advertising” redirect pages…

We’d like to hear other people’s experiences: how do you track external clicks and how do you cope with these problems? Is there a silver bullet for external click tracking?
