Please note, this is a STATIC archive of website neilpatel.com from April 2020, cach3.com does not collect or store any user information, there is no "phishing" involved.
Neil Patel

‘Dark Traffic’ is Stealing Your Data (Here’s How to Rescue It)

Everyone says they’re data-driven.

But are they, really?

Because here’s the thing:

You’re only truly data-driven if you’re making decisions based on accurate data.

And that’s exactly the problem.

Because in most cases I see, a company’s data is inaccurate.

It seems OK on the surface. At first, it doesn’t raise any red flags.

However, when you dig a little deeper, you start uncovering all of these different issues.

For example, social traffic is almost always underreported inside Google Analytics.

A company launches a new social campaign to bring in leads. But three weeks later, results don’t look encouraging.

Yet many times, we can cross-reference other data points to see that something is missing.

We can look at a Facebook ad account and see the results.

But visits and conversions aren’t being displayed properly inside Google Analytics.

What’s happening, exactly?

In many cases, ‘dark traffic’ is stealing your data.

I’m going to show you how to recover all that lost data in this article.

But first, you need to understand what ‘dark traffic’ is and how it works.

So let’s start with a brief history lesson.

Why “Direct Traffic” is a direct lie

‘Dark traffic’ isn’t new. In fact, it’s been around for decades.

The problem is that it’s only getting worse.

For example, way back in 2001, Gary Price, Chris Sherman, and Danny Sullivan published a book called The Invisible Web.

In this book, they talked about how it was difficult or impossible to track certain web activities at the time.

A lot of it had to do with how much data was being passed by each referral source. But the rest addressed how analytics programs typically work.

Without getting into a long tangent, let’s just say they’re not foolproof.

It’s like email open rates, for instance.

Yes, you should always try to improve them.

But many times that number is wildly inaccurate.

Email marketing tools look for a little image pixel to load. If it doesn’t, they won’t track the open.

Now, think of all the times you view emails with images turned off. Many email service providers block them by default for performance or security.

Analytics programs work similarly. They often are looking for a tiny clue.

If they don’t see it, they don’t track it.

And guess what it gets called?

That’s right: ‘Direct traffic.’

Keep in mind that this book talked about these issues nearly two decades ago now.

Not only have we not fixed them by now, but they keep getting worse!

Back in 2013, The Atlantic made headlines for the wrong reasons when their president and COO claimed that they “have no idea where 25% of their readers come from.”

The Guardian has been struggling with this same issue.

They often can’t track large groups of Google or Facebook referrals. Traffic data from a viral story illustrates this perfectly.

Check out the massive brown section in the image below. The Guardian literally has no idea where this traffic is coming from.

This is a disaster for publishers. And that’s putting it mildly.

Think about it for a second.

How do these companies make money?

By serving online ads to their website visitors.

So, how can they make more money?

Get more visitors to their sites. The problem is that’s impossible when you have no idea where your current visitors are coming from or why they’re visiting your site in the first place.

Incorrect traffic data isn’t just a minor inconvenience for their marketing team. It’s literally holding them back from printing more revenue.

Another curious trend has been happening over the past few years.

As some sources of traffic grow, others erode. Check out how Google Search traffic has dropped while Facebook’s has grown in this example from a few years back:

Producing more traffic from your Facebook Groups is awesome.

But graphs like this can also lead you to confirmation bias.

Looking at this graph makes you think that social media is simply stealing traffic away from Google Search.

In other words, SEO is dying, and social is the new king of the Internet.

Of course, it’s not that simple. It’s not even that accurate.

BuzzFeed did a deep dive into publisher traffic in 2013 and kept seeing the same trend lines:

Let’s keep fast forwarding to see how the problem only gets worse from there.

Remember all that talk about Mobilegeddon back in 2015?

It was all we heard about. And then… nothing. Right?

We didn’t see huge drops like there were with earlier algorithm updates like Panda or Penguin.

Except there were, if you knew where to look.

Turns out, Mobilegeddon had been happening all along, as far back as 2013.

Chances are, most people just couldn’t see it because it was getting trapped as ‘dark’ traffic.

Around the same time, Groupon performed an experiment to see how much of their traffic was being misreported.

And they found that 60% of their ‘direct’ traffic should actually have been attributed to organic search.

Think about the ramifications of that for a second.

You do SEO for Groupon. You get paid for increasing traffic and conversions from search engines.

You’re doing your job, but it’s actually being underreported by as much as 60%!

So to your bosses and clients, it looks like you’re just wasting their time and money.

The problem is especially bad on mobile devices for a few reasons.

For starters, more people are surfing the web on mobile than desktop devices.

Then, as much as 84% of someone’s mobile time is spent inside apps.

And apps, specifically, are notoriously difficult to track properly.

Both Google and Facebook apps, for example, won’t always pass referrer information. So if someone starts inside one of them, that data never reaches your website.

Again, it’s similar to how email open tracking works.

Website analytics packages rely completely on referral data to tell them where someone is coming from.

But if there’s any interference, or if someone passes from one thing to another before reaching your site, the chances of getting that data are slim to none.
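To make that concrete, here’s a minimal sketch of referrer-based bucketing in Python. It’s only an illustration, not how Google Analytics actually works under the hood. The function name and the short host lists are made up for this example.

```python
from urllib.parse import urlparse

# Hypothetical lookup tables; real analytics tools keep far longer lists.
SEARCH_HOSTS = {"www.google.com", "www.bing.com", "duckduckgo.com"}
SOCIAL_HOSTS = {"www.facebook.com", "t.co", "www.linkedin.com"}


def classify_visit(referrer):
    """Bucket a visit using nothing but its referrer string (simplified)."""
    if not referrer:
        # No referrer at all: the visit lands in the "direct" bucket,
        # even if it really came from an app, an email client, or Slack.
        return "direct"
    host = urlparse(referrer).netloc.lower()
    if host in SEARCH_HOSTS:
        return "organic search"
    if host in SOCIAL_HOSTS:
        return "social"
    return "referral"
```

Notice the very first branch. Any click that arrives with an empty referrer gets labeled “direct,” no matter where it actually started.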

Apps on mobile or desktop are one major problem area.

So Apple Mail or Outlook, for example, might not pass everything.

Links passed through Slack would also be an issue.

Transitions from secure to unsecured pages (HTTPS to HTTP) can cause problems, too, because browsers usually strip the referrer along the way.

In a second, we’ll look at a few ways to try and solve each scenario. However, let’s unravel what this means for marketers everywhere, first.

Your job is on the line if you’re not getting the credit you deserve.

Marketing is a results-oriented business.

People run AdWords campaigns or hire SEO specialists to do one thing and one thing only:

Improve the bottom line.

Clients and bosses don’t care about followers, rankings, or CPCs.

They care about the number of new deals being closed or purchases being made.

Chances are, you meet with them on a weekly or monthly basis and present these results in the same exact way:

You pull up traffic and conversion data, then compare it to the channels that produced those results.

And almost every time, you feel the need to justify results by pointing to leading indicators like rankings or keywords.

All because your primary metrics aren’t properly reflecting the value you provide.

Trust me: I’ve sat in countless meetings exactly like this.

In the past few years, new dashboards like Cyfe or Supermetrics have helped make the data look more convincing.

But guess what?

They’re pulling data from the same sources!

All you’re doing is authenticating your AdWords or Analytics account. And then they’re pulling that data directly.

In other words, this ‘dark traffic’ problem is corrupting the data from the very beginning.

It doesn’t matter how you slice or dice or present the information if it’s wrong to begin with.

Specifically, what’s happening is that “Direct” traffic is literally stealing your results.

Technically, Direct traffic is just supposed to apply to the people who type in your web address directly.

Instead, your social, advertising, or SEO data is being taken and lumped under it.

So it’s overreported while those channels are all underreported:

This example shows that 64% of the traffic’s referral data is incorrect.

Now, imagine if you had 64% more traffic to report at your next meeting. Or 64% more conversions and revenue!

Not only would you actually look forward to those meetings, but you’d also be due for a proper raise.

So let’s solve this problem.

The first step is to figure out just how bad the problem is on your site.

We’re going to start by segmenting out this ‘dark traffic’ to see exactly how bad it is.

First, segment your ‘dark traffic’

This problem is already lurking in Google Analytics.

Here’s how to find it.

Once logged in, head over to the Audience section. We’re going to create a “New Segment” by clicking on the upper right-hand side:

If “Direct” traffic is people typing in your URL, it’s safe to assume that they’re hitting your homepage.

Right? I mean, how many people are really going to type in, “https://neilpatel.com/blog/rank-where-you-belong/,” off the top of their head?

Not many.

So that’s the clue. We want to see how many “Direct” visitors are going to these complex URLs, as opposed to the homepage.

When creating your new segment, set the “(direct)” traffic source, first.

Then, we want to pull out all of the people going to pages other than the homepage.

The homepage is denoted with only a forward slash (“/”) in Analytics. So set your second Landing Page condition to “is not” the slash, like this:

Pretty easy, right?

This is the simplistic way to get a quick read on all potential ‘dark traffic.’
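If you export your session data, the same two conditions are easy to apply yourself. Here’s a rough sketch in Python. The field names (“source” and “landing_page”) and the sample rows are assumptions for illustration, not an actual Google Analytics export format.

```python
def is_potentially_dark(session, homepage="/"):
    """A 'direct' session whose landing page is not the homepage."""
    return session["source"] == "(direct)" and session["landing_page"] != homepage


def dark_share(sessions):
    """Fraction of direct sessions that landed on a deep URL."""
    direct = [s for s in sessions if s["source"] == "(direct)"]
    if not direct:
        return 0.0
    dark = [s for s in direct if is_potentially_dark(s)]
    return len(dark) / len(direct)


# Made-up sample rows, purely for demonstration.
sessions = [
    {"source": "(direct)", "landing_page": "/"},
    {"source": "(direct)", "landing_page": "/blog/rank-where-you-belong/"},
    {"source": "google", "landing_page": "/blog/rank-where-you-belong/"},
    {"source": "(direct)", "landing_page": "/pricing/"},
]

share = dark_share(sessions)
```

In this made-up sample, two of the three direct sessions hit a deep URL, so roughly two-thirds of the “direct” bucket is suspect.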

But there’s a more advanced method to nail this down even further.

Sayif Sharif from Seer Interactive calls these “filters.”

Basically, you’re adding a little more nuance to the equation. You’re going to draw a line between what’s definitely ‘dark’ versus what’s just ‘kinda dark.’

Here’s the tweak you’d make to the Landing Page section:

Add this additional context, then head over to your Multi-Channel Funnels report path to see what the breakdown looks like now:

This isn’t perfect, but you are a little closer to properly identifying which traffic sessions are being stolen from right under your nose.

Last but not least, let’s take individual campaigns, events, or promotions into account.

For example, you can look at data correlations when you have content go viral to narrow down your time windows.

Random spikes aren’t actually that random when you know what caused them.

So despite what your analytics might say, you know what drove that sudden increase in visits.

Don’t just compare data against the previous period, though. You should also compare this same period to years prior to make sure to account for any other seasonal, external factors outside your control.

Again, it’s not ideal. But having even a rough idea of the problem can help you take the next step to prevent it from occurring again.

Next, make sure you tag each and every campaign

Now you know what you’re up against.

Next, let’s set out to fix it.

The first trick is knowing all of the various points that could send you traffic.

These include all of your own campaigns and ads, obviously. But this chart from Marshall Simmonds is a helpful starting point:

‘Dark traffic’ is usually the end result of dark search, social, and mobile. This chart can help you figure out which campaigns affect each one of those three categories.

For example, you see a few familiar faces we’ve touched on, like apps, email, or secure search.

Next, we’re going to add Urchin Tracking Module (UTM) parameters to each link we can control.

Visiting one of the earlier blog posts about The Guardian we discussed gives you a URL that looks like this:

Those UTM parameters include everything that happens after the question mark.

The format for these is the same, no matter which tool you use.

So you can add them in the same format off the top of your head.

Specifically, you want to add at least the Campaign Source, Medium, and Name. Term and Content are helpful to narrow results even further, but not required.

You can reference this chart and type out the code now, giving each UTM a value, like “utm_medium=email.”

Ampersand signs (“&”) are used in between each UTM element.
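If you’d rather script it than type it, here’s a minimal sketch using Python’s standard library. The function name and the sample values (“newsletter,” “spring_launch”) are placeholders I made up for the example. Only the utm_ parameter names themselves are the standard ones.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_utm(url, source, medium, campaign, term=None, content=None):
    """Return url with UTM parameters appended to its query string."""
    parts = urlparse(url)
    # Keep any query parameters the URL already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    # Term and Content are optional, so only add them when given.
    if term:
        query.append(("utm_term", term))
    if content:
        query.append(("utm_content", content))
    return urlunparse(parts._replace(query=urlencode(query)))


tagged = add_utm(
    "https://neilpatel.com/blog/rank-where-you-belong/",
    source="newsletter",
    medium="email",
    campaign="spring_launch",
)
```

Everything after the question mark in the result is the UTM string, with an ampersand between each element.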

If I’ve already lost you, head over to the Campaign URL Builder from Google. They help break out each element, so all you have to do is drop in the values for each.

Here’s a quick breakdown of what we’re talking about:

See? Not that hard once you break it all down.

When finished, this tool will generate a brand new URL to copy and paste into your campaign.

Next comes the boring part: Actually tagging your campaigns.

I know, I know. It sucks. It really does. But there’s no way around it.

Fire up a spreadsheet and start thinking through your naming conventions.

For example, you can create a few UTM strings and then automatically add them to different campaigns depending on which channel they fall under.

If you were building out tweets to promote a new offer, you can simply rely on the same tracking token for most of the work.

Just a little customization can now go a long way. You can plan everything, here, before uploading them in bulk to Buffer or MeetEdgar.
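Here’s one way that planning step might look if you generate the spreadsheet with a short script. Treat it as a sketch: the channel tokens, campaign name, labels, and example.com URL are all hypothetical, and it assumes the base URL has no query string of its own.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical naming convention: one fixed token per channel.
CHANNEL_TOKENS = {
    "twitter": {"utm_source": "twitter", "utm_medium": "social"},
    "email": {"utm_source": "newsletter", "utm_medium": "email"},
}


def tag_rows(base_url, campaign, posts):
    """Yield (channel, label, tagged_url) rows ready for a spreadsheet."""
    for channel, label in posts:
        params = dict(CHANNEL_TOKENS[channel])
        params["utm_campaign"] = campaign
        params["utm_content"] = label  # the only per-post customization
        yield channel, label, f"{base_url}?{urlencode(params)}"


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["channel", "label", "url"])
    writer.writerows(rows)
    return buffer.getvalue()


posts = [("twitter", "tweet_1"), ("twitter", "tweet_2"), ("email", "issue_12")]
rows = list(tag_rows("https://example.com/offer/", "new_offer", posts))
sheet = to_csv(rows)
```

The channel token does most of the work. Only the label changes from one tweet to the next, which keeps your naming consistent across the whole campaign.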

Holini was kind enough to give you the spreadsheet template they use on client sites.

Then, repeat this step for email, blog posts, press releases, outreach campaigns, etc.

Fortunately, there are a few apps that can help speed this up.

CampaignTrackly takes the same standardized approach. You just have to drop in a few basic details and they’ll give you outputs for each channel.

Terminus is another option that helps from start to finish, from adding links initially to tracking the results at the end of the process.

You can even drop in multiple URLs and get UTM parameters added in bulk:

It works off the same naming convention idea. So you can customize these campaign-wide settings up front, then apply them when it’s time to tag a whole list at once.

My other favorite feature is that you can copy and paste content into the tool. It will automatically scan your text and replace all standard links with tagged ones.

This feature is perfect for email newsletters or outreach campaigns.
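To give you a feel for the idea, here’s a simplified sketch of scan-and-replace link tagging in Python. This is not how Terminus does it. The function name, the regex, and the sample values are my own assumptions, and it ignores edge cases like URL fragments (“#”). It also tags every link it finds, so you’d only run it on copy where all the links point back to your own site.

```python
import re
from urllib.parse import urlencode

# Deliberately simple pattern: a URL runs until whitespace or a bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>()]+")


def tag_links(text, params):
    """Append UTM parameters to every untagged http(s) link in text."""
    suffix = urlencode(params)

    def _tag(match):
        url = match.group(0)
        # Peel off sentence punctuation that is not part of the URL.
        trailing = ""
        while url and url[-1] in ".,;:!?":
            trailing = url[-1] + trailing
            url = url[:-1]
        if "utm_" in url:
            return url + trailing  # already tagged, leave it alone
        separator = "&" if "?" in url else "?"
        return f"{url}{separator}{suffix}{trailing}"

    return URL_PATTERN.sub(_tag, text)


params = {"utm_source": "newsletter", "utm_medium": "email", "utm_campaign": "may"}
result = tag_links("Read https://example.com/a, then https://example.com/b?x=1.", params)
```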

There are also a few WordPress plugins that will do this for you.

Easy UTM builder is one that will help you set UTM codes based on your site’s page or post categories.

Now, if people share content from your site that ends up bringing visits back with it, you’ll be able to see the original action that started it.

If you’re running AdWords campaigns, you can set up auto-tagging that will add parameters to each landing page URL you drop in.

A final caveat, though.

Although it seems counterintuitive, you do not want to tag internal links.

These are links from one page on your site to another, for instance.

The problem is that you will inadvertently overwrite referral data.

So if someone clicks on your tagged email campaign link, but then clicks another tagged link on your site, it will start a new session and credit the internal tag instead of the email campaign.

Instead of increasing your data’s accuracy, it will only make it much, much worse.

Keep your link tagging reserved for external links that are pointing back to your site.
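If you’re tagging links in bulk, a quick guard can enforce that rule. Here’s a minimal sketch in Python. The function name is hypothetical, and it assumes a simple exact-host match, so subdomains count as separate sites.

```python
from urllib.parse import urlparse


def should_tag(placement_url, target_url, site_host):
    """True when a link points to your site from somewhere outside it.

    Assumption: hosts are compared exactly, so subdomains such as
    blog.example.com are treated as different from example.com.
    """
    placement_host = urlparse(placement_url).netloc.lower()
    target_host = urlparse(target_url).netloc.lower()
    points_to_site = target_host == site_host
    placed_on_site = placement_host == site_host
    return points_to_site and not placed_on_site
```

A link in a tweet pointing at your blog passes the check. A link from your homepage to that same blog post does not, so it stays untagged.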

Conclusion

You can’t be data-driven if your data is incorrect.

And unfortunately, that’s typically the case.

Not because you did anything wrong.

Many times, it’s already happening without you even realizing it.

The problem is that you, personally, pay the price.

It looks as if you’re not doing your job. Or that you’re only producing average results.

When, in reality, ‘dark traffic’ is stealing them all right out from under you.

Start by segmenting your traffic details to make it easy to identify just how dark your traffic might be.

Apply those additional filters by the landing page people are visiting to estimate the traffic your SEO, ad, email, or social campaigns are losing.

Then, you can add that back to your reporting for clients and bosses.

The next step is to start tagging as many campaigns as possible. This step isn’t exactly fun or enjoyable.

Sticking with consistent naming conventions can make it easier. Set up your campaign links in a spreadsheet, first.

Or, look for UTM tools and plugins that will automatically apply ‘rules’ to each set of URLs you provide.

Ultimately, performing this boring but critical task is the only way to reduce the amount of dark traffic you have.

You’ll be able to prevent the problem from cropping up in the first place. Which means you, personally, will be able to finally prove your worth.

How much of your traffic is ‘dark’ right now?
