How to Filter Bot Traffic from GA4

Prevent bot traffic from being collected by GA4

If you’ve seen suspicious activity in your Google Analytics property, you know how frustrating it is. Your historical data is polluted with non-human traffic, making it hard to understand what’s really happening on your site. Your first step is to identify the signature attributes of the bot. Next, you’ll want to use those characteristics to filter out the traffic in reporting. And if the traffic is continuing to hit your website, setting up a filter so it’s not being logged in your GA4 data is a smart move. This article describes how to do that, leveraging GA4’s “Internal traffic filters” in a way you may not have realized is possible.

In Universal Analytics, we had the ability to filter traffic from a property view based on a variety of dimensions. In GA4 our options for filtering at the property level are a lot more limited. GA4 lets us create ‘Internal traffic filters’ and that’s about it. This feature is poorly named – we can actually use it to filter any traffic based on IP address, internal or otherwise.

What is less obvious and less well-documented is that the actual filtering takes place based on an optional ‘traffic_type’ parameter. This method of filtering traffic requires two steps:

  1. Add an Internal traffic rule – when you do this you specify an IP address or range of addresses. The rule sets the value of the traffic_type parameter for all incoming events that match the IP address(es).
  2. Add a Traffic filter – if you set the filter type to ‘Internal traffic’, you can label or exclude traffic based on the value of the traffic_type parameter.
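The two steps above can be pictured with a small conceptual model. This is only a sketch of the logic as described here, not GA4's actual implementation: GA4 does this work on its own servers, and every name in the block (applyTrafficRule, applyTrafficFilter, the event fields, the example IP addresses) is made up for illustration. It also assumes exact IP matches, where a real rule can cover ranges.

```javascript
// Conceptual model only. None of these names are GA4 APIs.

// Step 1: an internal traffic rule stamps traffic_type on events
// whose IP address matches the rule (exact match assumed here).
function applyTrafficRule(event, rule) {
  if (rule.ipAddresses.indexOf(event.ip) !== -1) {
    return Object.assign({}, event, { traffic_type: rule.trafficType });
  }
  return event;
}

// Step 2: a traffic filter drops events whose traffic_type equals
// the filter's parameter value. It never looks at the IP address.
function applyTrafficFilter(events, filter) {
  return events.filter(function (event) {
    return event.traffic_type !== filter.parameterValue;
  });
}

var rule = { ipAddresses: ['203.0.113.7'], trafficType: 'internal' };
var filter = { parameterValue: 'internal' };

var incoming = [
  { name: 'page_view', ip: '203.0.113.7' },   // matches the rule
  { name: 'page_view', ip: '198.51.100.20' }  // does not match
];

var tagged = incoming.map(function (event) {
  return applyTrafficRule(event, rule);
});
var kept = applyTrafficFilter(tagged, filter);

console.log(kept.length); // 1
```

In this model the filter step depends only on the traffic_type value, so anything that sets that value can drive the filter.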

But you don’t actually need to do step 1! If you set a value for the traffic_type parameter in your Google Tag, you can create a Traffic filter without creating an Internal traffic rule. This gives you A LOT more power to exclude traffic using Google Tag Manager (GTM). At a high level, this process looks like this:

  1. Create a traffic_type variable in Tag Manager using the full capabilities of JavaScript in GTM.
  2. Add a traffic_type parameter to your GA4 Google Tag that takes the value of your variable.
  3. Add a Traffic filter in GA4 based on the value you set.

I walk through a real-world example of this approach in my article Hunting for Bots. In that case, I used JavaScript in GTM to identify a specific browser version and screen resolution that were associated with a bot.
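As a rough sketch of what that kind of check might look like, the function below returns a traffic_type value only when both the browser version and the screen resolution match. The signature values (Chrome/100.0.0.0 at 800x600), the 'bot' label, and the function name are placeholders I've assumed for illustration. They are not the values from the Hunting for Bots article.

```javascript
// Hypothetical bot signature. Replace these placeholder values with
// the attributes you actually identified in your own data.
var BOT_SIGNATURE = {
  userAgentPattern: /Chrome\/100\.0\.0\.0/,
  screenWidth: 800,
  screenHeight: 600
};

// Returns 'bot' only when every attribute matches, otherwise undefined.
function getTrafficType(userAgent, screenWidth, screenHeight) {
  var matches =
    BOT_SIGNATURE.userAgentPattern.test(userAgent) &&
    screenWidth === BOT_SIGNATURE.screenWidth &&
    screenHeight === BOT_SIGNATURE.screenHeight;
  return matches ? 'bot' : undefined;
}

// In a GTM Custom JavaScript variable, the same checks would be inlined
// in a single anonymous function that reads the live browser values:
// navigator.userAgent, screen.width, and screen.height.
```

Requiring all attributes to match keeps the check narrow, which lowers the risk of excluding real visitors who happen to share one trait with the bot.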

I also use this technique a lot to filter out dev traffic. Developers often work on a version of a site that has a different domain name or URL structure. To exclude dev activity from GA4, I create a regex lookup variable (a RegEx Table variable) in GTM that outputs “dev” or “staging” based on this URL pattern. Then I follow steps 2 and 3 above to exclude the traffic. Note that when the traffic_type variable has a value of null, nothing happens – no harm, no foul.
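The lookup itself needs no code in GTM, but the logic is easy to sketch in JavaScript. The hostname patterns and the function name below are assumptions for illustration, and the first-match-wins loop is my reading of how a lookup table behaves, so adjust both to your own setup.

```javascript
// Hypothetical hostname patterns. Swap in your own dev and staging URLs.
var TRAFFIC_TYPE_RULES = [
  { pattern: /^dev\./i, value: 'dev' },
  { pattern: /^staging\./i, value: 'staging' }
];

// Returns the value of the first matching rule, or null when none match.
function trafficTypeForHostname(hostname) {
  for (var i = 0; i < TRAFFIC_TYPE_RULES.length; i++) {
    if (TRAFFIC_TYPE_RULES[i].pattern.test(hostname)) {
      return TRAFFIC_TYPE_RULES[i].value;
    }
  }
  return null; // production traffic gets no traffic_type value
}

console.log(trafficTypeForHostname('dev.example.com'));     // dev
console.log(trafficTypeForHostname('staging.example.com')); // staging
console.log(trafficTypeForHostname('www.example.com'));     // null
```

Production hostnames fall through to null, which is the "nothing happens" case described above.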

Nico Brooks

Nico loves marketing analytics, running, and analytics about running. He's Two Octobers' Head of Analytics, and loves teaching.
