Project: AdBlock Mirage

Looking into a generic method for preventing the detection of element hiding, a method of ad blocking

9 minutes read about 7 years ago

Motivation

You may want to skip this section if you are already familiar with ad blocking techniques.

Ad blocking is the process of disabling or removing ad content from a page, often through the use of a browser plugin. With the extensive use of ad blocking and its measurable impact, many publishers have decided to employ countermeasures which restrict access to site content when this type of activity is detected. If your intent is to browse the internet without third-party messaging, then ad blocking can pose some challenges these days.

Current ad blocking solutions employ element removal, resource blocking, and script rewriting. The problem with these techniques is that they can be detected rather well.

Element removal is often done by modifying the element’s style (ex: “display: none”). This can be detected by inspecting an ad element’s computed style at random intervals and checking properties which are indicative of element hiding (ex: “offsetParent”, “offsetHeight”, “offsetLeft”, “clientHeight”). Script blocking can be detected by including a unique indicator in the content and, after all content has loaded, checking for that indicator (ex: content hash check). Now one can imagine how these detection techniques could be circumvented by either blocking the anti ad block messaging (ex: Adblock Warning Removal List) or even modifying the detection scripts themselves (ex: anti-adblock-killer). However, this becomes an obfuscation-based game of cat and mouse in which preventing ad block detection requires ever more complexity and processing time.
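To make the element probing concrete, here is a minimal sketch of what such a check might look like. The function name looksHidden and the plain objects standing in for DOM elements are assumptions made for illustration; a real detection script would read these properties from an actual ad element.

```javascript
// Hypothetical probe: true when an element shows the usual signs of hiding.
// In a browser `el` would be an HTMLElement; any object exposing the same
// properties works for this sketch.
function looksHidden(el) {
  return el.offsetParent === null ||
         el.offsetHeight === 0 ||
         el.clientHeight === 0;
}

// Stand-ins for a visible ad and for one hidden with "display: none".
const visibleAd = { offsetParent: {}, offsetHeight: 250, clientHeight: 250 };
const hiddenAd = { offsetParent: null, offsetHeight: 0, clientHeight: 0 };

console.log(looksHidden(visibleAd)); // false
console.log(looksHidden(hiddenAd));  // true
```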

Is there a better way? Maybe.

Overview

“Mirage” is a first attempt at looking into preventing ad block detection while still staying performant.

If we can prevent the detection of element hiding by lying to anti ad block element probing scripts in an undetectable way, then we stand a chance.

Since JavaScript is a prototype-based language and the __proto__ chain of key DOM objects can be modified, we can intercept all element style inspection functions, disable the ad block styles before the function call, and re-enable them after. This happens fast enough that the style changes are never painted, so nothing changes visually. However there is a non-trivial overhead to this forced relayout. Caching of ad element styles allows this approach to be more performant. We can also choose to hide elements with visibility: hidden, which does not take the element out of the document flow. Visually this does not look as nice, since the ad space is not collapsed, but a relayout should no longer be necessary.

Approach

Replacing Functions

The first step to this facade is to replace a few key functions. Specifically we will be looking at wrapping calls to:

  • HTMLElement
    • .offsetTop
    • .offsetLeft
    • .offsetWidth
    • .offsetHeight
    • .offsetParent
  • Element
    • .clientWidth
    • .clientHeight
  • CSSStyleDeclaration
    • .getPropertyValue()

Now to intercept the function call we could add new functions to the ad element’s prototype. As so…

let adElementStyle = getAdStyle();
adElementStyle.prototype.getPropertyValue = function( ){ ... };


However this sort of interception can be detected by walking the prototype chain and looking for targeted function name conflicts or observing that the targeted function is defined on an unexpected object.
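A rough sketch of how such a check could work is below. BaseStyle and AdStyle are stand-in classes invented for this example (they are not real DOM interfaces), and definedOn is a hypothetical helper; the point is only that a function appearing on the wrong level of the chain gives the interception away.

```javascript
// Stand-ins for a DOM interface and a more specific interface inheriting from it.
class BaseStyle {
  getPropertyValue(name) { return "block"; }
}
class AdStyle extends BaseStyle {}

// Hypothetical detector helper: returns the constructor whose prototype
// actually owns `name`, or null if nothing on the chain defines it.
function definedOn(ctor, name) {
  while (ctor !== null && ctor.prototype !== undefined) {
    if (Object.prototype.hasOwnProperty.call(ctor.prototype, name)) {
      return ctor;
    }
    ctor = Object.getPrototypeOf(ctor);
  }
  return null;
}

// Untouched page: the function lives where the detector expects it.
console.log(definedOn(AdStyle, "getPropertyValue") === BaseStyle); // true

// Naive interception: a new function is added lower in the chain.
AdStyle.prototype.getPropertyValue = function (name) { return "block"; };

// The function is now found on an unexpected object.
console.log(definedOn(AdStyle, "getPropertyValue") === AdStyle); // true
```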

To get around this we will want to replace the function on the object in which it is defined. We can do this by walking the prototype chain ourselves until we find the function definition. As so…

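// find implementing object; getObj() is assumed to return a constructor
// such as the ad element's style interface
let obj = getObj();
while( obj !== null &&
       obj.prototype !== undefined &&
       !Object.prototype.hasOwnProperty.call( obj.prototype, 'getPropertyValue' ) )
{
    // only own properties are checked above, since a plain lookup would
    // also find inherited functions and stop at the first object
    obj = Object.getPrototypeOf( obj );
}
// replace
let originalFn = obj.prototype.getPropertyValue;
obj.prototype.getPropertyValue = function( ){ ... };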


This also prevents the replaced function from ever being called again in an unexpected way if we guard access to its original reference.

Now you might be wondering if the new function itself is detectable simply by virtue of being different than the original function. It is. See Function.prototype.toString() as an example. However we can override the toString() method and all other methods which leak this information if that becomes necessary.
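As a sketch of what masking toString() could look like, the snippet below keeps a private map from each replacement function to the original it stands in for. The target object, the replacement function, and the disguises map are all made up for this example; this is one possible shape, not necessarily how “Mirage” does it, and other properties (such as a function's name) can leak the same information.

```javascript
// Stand-in for an object whose function we want to replace.
const target = {
  getPropertyValue(name) { return "block"; },
};
const originalFn = target.getPropertyValue;
const nativeToString = Function.prototype.toString;

// The replacement simply forwards to the original in this sketch.
function replacement(name) {
  return originalFn.call(this, name);
}
target.getPropertyValue = replacement;

// Private map: replacement function -> the function it pretends to be.
const disguises = new WeakMap([[replacement, originalFn]]);

function maskedToString() {
  // Report the original's source for any function we have replaced.
  const subject = disguises.has(this) ? disguises.get(this) : this;
  return nativeToString.call(subject);
}
// The masking function has to disguise itself as well.
disguises.set(maskedToString, nativeToString);
Function.prototype.toString = maskedToString;

console.log(target.getPropertyValue.toString() === nativeToString.call(originalFn)); // true
console.log(Function.prototype.toString.toString().includes("[native code]")); // true
```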

Finally many of the object attributes we are interested in have defined getters and setters, and these too can be replaced in a similar manner with the help of getOwnPropertyDescriptor() and defineProperty().
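A minimal sketch of replacing a getter this way follows. FakeElement is a stand-in class (in a browser the target would be something like HTMLElement.prototype), and the intercepted counter only exists to show that the new getter runs.

```javascript
// Stand-in for a DOM interface that exposes a read-only layout property.
class FakeElement {
  constructor(height) { this.height = height; }
  get offsetHeight() { return this.height; }
}

let intercepted = 0;

// Fetch the accessor from the object on which it is defined.
const descriptor = Object.getOwnPropertyDescriptor(FakeElement.prototype, "offsetHeight");
const originalGetter = descriptor.get;

// Redefine it in place, keeping the other descriptor fields unchanged.
Object.defineProperty(FakeElement.prototype, "offsetHeight", {
  ...descriptor,
  get() {
    intercepted += 1;                 // the style toggling would happen here
    return originalGetter.call(this); // defer to the original getter
  },
});

const el = new FakeElement(90);
console.log(el.offsetHeight); // 90
console.log(intercepted);     // 1
```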

Being First

Now all this work of setting up a facade for the ad block detection scripts will be for naught if we cannot guarantee that we run before them. This is actually pretty simple to accomplish. When writing a Chrome extension we can set the run_at attribute in the manifest for our injected script to document_start. To quote the extension documentation: “In the case of "document_start", the files are injected after any files from css, but before any other DOM is constructed or any other script is run.” As so…

{
    "content_scripts": [
        {
            "all_frames": true,
            "js": ["path/to/our/script.js"],
            "match_about_blank": false,
            "matches": [
                "http://*/*",
                "https://*/*"
            ],
            "run_at": "document_start"
        }
    ]
}

Creating a “Mirage”

Now that all the prep work is out of the way, we are ready to start looking at the intercepting function. This function is actually quite simple if we ignore performance for now. We will simply disable the ad blocking style rules, call the original function, re-enable the ad blocking style rules, and return the result. As so…

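let adBlockStyle = getABStyle();
let originalGetPropertyValue = getOriginalFn();
function newGetPropertyValue( )
{
    // lift the ad blocking rules so the original function sees the page
    // as it would look without ad blocking
    adBlockStyle.disabled = true;
    try
    {
        return originalGetPropertyValue.apply(this, arguments);
    }
    finally
    {
        // restore the rules even if the original function throws
        adBlockStyle.disabled = false;
    }
}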
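To see the effect end to end outside of a browser, here is a small simulation. The adBlockStyle object and the fake originalGetPropertyValue are stand-ins invented for this sketch: the fake simply reports “none” while the blocking rules are active, which is roughly what a real style lookup would report for a hidden ad.

```javascript
// Stand-in for the ad block style sheet (a CSSStyleSheet in a browser).
const adBlockStyle = { disabled: false };

// Stand-in for the native function: the ad reads as hidden while the
// blocking rules are enabled and as visible while they are disabled.
function originalGetPropertyValue(name) {
  if (name !== "display") return "";
  return adBlockStyle.disabled ? "block" : "none";
}

function newGetPropertyValue(...args) {
  adBlockStyle.disabled = true;        // lift the blocking rules
  try {
    return originalGetPropertyValue.apply(this, args);
  } finally {
    adBlockStyle.disabled = false;     // always put them back
  }
}

console.log(originalGetPropertyValue("display")); // none
console.log(newGetPropertyValue("display"));      // block
console.log(adBlockStyle.disabled);               // false
```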

Performance

You should now have a pretty good conceptual idea of how we are deceiving ad block detection. However a lot more work is required to make this performant. This is necessary since ad detection scripts will run checks with many iterations shortly after the page loads, when new ads in frames are loaded, and most importantly during onscroll events.

To keep things snappy the element styles are cached and recomputed in batch. The cache is invalidated when a DOM element is added, removed, or a style property is modified since these events can alter what an element’s computed style might be.
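The caching idea can be sketched roughly as below. StyleCache and its compute callback are hypothetical names, and this version recomputes lazily per lookup rather than in batch, so treat it as an illustration of the invalidation pattern only. In a browser, invalidate() would presumably be called from a MutationObserver callback watching for added or removed nodes and attribute changes.

```javascript
// Hypothetical cache: element -> (property -> value), cleared wholesale
// whenever something happens that could change computed styles.
class StyleCache {
  constructor(compute) {
    this.compute = compute;   // the expensive "toggle styles and measure" step
    this.entries = new Map(); // a WeakMap would avoid pinning removed elements
  }

  get(element, property) {
    let perElement = this.entries.get(element);
    if (perElement === undefined) {
      perElement = new Map();
      this.entries.set(element, perElement);
    }
    if (!perElement.has(property)) {
      perElement.set(property, this.compute(element, property));
    }
    return perElement.get(property);
  }

  invalidate() {
    this.entries.clear();
  }
}

// Usage with a stand-in element and a counter for the expensive step.
let expensiveCalls = 0;
const cache = new StyleCache((element, property) => {
  expensiveCalls += 1;
  return element[property];
});

const bannerAd = { offsetHeight: 250 };
cache.get(bannerAd, "offsetHeight");
cache.get(bannerAd, "offsetHeight");
console.log(expensiveCalls); // 1

cache.invalidate();          // e.g. after a DOM mutation
cache.get(bannerAd, "offsetHeight");
console.log(expensiveCalls); // 2
```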

As another means of improving performance, the ad block style has been modified to trigger an animation event whenever an element matches it. This is much faster than calling querySelectorAll on DOM mutations. However this approach has its own drawbacks due to it being asynchronous.

Using this animation event technique also means display: none cannot be used to hide the ad elements since an animation event will not be triggered. Instead visibility: hidden and position: absolute are used for element hiding.
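A sketch of how the pieces could fit together is below. The animation name, the .ad-banner selector, the keyframes, and the helper names are all assumptions made for this example; the actual rules used by “Mirage” may differ. The browser wiring is guarded so the snippet also runs where no document exists.

```javascript
const ANIMATION_NAME = "mirage-ad-detected"; // hypothetical name

// Build a hiding rule that also starts a (practically invisible) animation
// on every element matching one of the ad selectors.
function buildHidingCss(selectors, animationName) {
  return (
    `@keyframes ${animationName} { from { outline-color: #fff; } to { outline-color: #000; } }\n` +
    `${selectors.join(", ")} { visibility: hidden !important; position: absolute !important; ` +
    `animation-name: ${animationName} !important; animation-duration: 0.001s !important; }`
  );
}

// Build an animationstart handler that reacts only to our own animation.
function makeAnimationHandler(animationName, onAdElement) {
  return function (event) {
    if (event.animationName === animationName) {
      onAdElement(event.target);
    }
  };
}

const detectedAds = new Set();
const handler = makeAnimationHandler(ANIMATION_NAME, (el) => detectedAds.add(el));

// Browser-only wiring; skipped when there is no DOM.
if (typeof document !== "undefined") {
  const style = document.createElement("style");
  style.textContent = buildHidingCss([".ad-banner"], ANIMATION_NAME);
  document.documentElement.appendChild(style);
  document.addEventListener("animationstart", handler, true);
}
```

The handler only learns about an element once the browser dispatches the event, which is where the asynchronous behaviour comes from.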

Limitations

There are a few. Since ad element detection is being done through an asynchronous browser event, animationstart, an ad element is not always detected before an ad detection function is run and, due to the asynchronous nature of events, this cannot be prevented. An element can also have no events fired for it if the element is added and removed from the DOM too quickly. This no-event-fired condition becomes much more likely when an ad element is added, inspected, and removed in rapid succession.

Alternative synchronous approaches, such as determining whether an element is an ad element at inspection time, cause a noticeable performance impact on complex pages, even with caching. Worse still, memory usage increases drastically because the ad blocking selectors needed for calls to Element.matches() must be included in each page or frame. This approach does however work reliably.
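For comparison, the synchronous check could look something like this sketch. The selector list and the fakeElement helper are invented for illustration; in a browser the elements would be real DOM nodes and the list would hold the full set of ad blocking selectors, which is where the memory cost comes from.

```javascript
// Hypothetical selector list; a real filter list is far larger.
const AD_SELECTORS = [".ad-banner", "#sponsored"];

// Remember the verdict per element so repeated inspections stay cheap.
// (A cached verdict goes stale if the element's class or id changes.)
const classification = new WeakMap();

function isAdElement(element) {
  if (!classification.has(element)) {
    classification.set(
      element,
      AD_SELECTORS.some((selector) => element.matches(selector))
    );
  }
  return classification.get(element);
}

// Stand-in for a DOM element: `matches` just consults a fixed list.
function fakeElement(matchingSelectors) {
  return {
    calls: 0,
    matches(selector) {
      this.calls += 1;
      return matchingSelectors.includes(selector);
    },
  };
}

const sponsored = fakeElement(["#sponsored"]);
console.log(isAdElement(sponsored)); // true
console.log(isAdElement(sponsored)); // true
console.log(sponsored.calls);        // 2
```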

Where Do We Go From Here?

“Mirage” has served as a good proof of concept and has given some insight into alternative solutions.

Parallel DOM

Perhaps the most promising is the construction of a parallel DOM. We can clone the DOM on load into a Shadow DOM, pass all style related function calls to the shadow DOM, and apply our ad blocking styles to the main DOM. This will require us to keep the main DOM and shadow DOM in sync but there will no longer be a need to recompute styles upon ad element inspection.

There are some concerns about memory usage but my hunch is that the DOM, ignoring external resources, is pretty small, and resource references will be shared between the page and shadow DOM since they are a part of the same page.

Computing Styles Ourselves

We could compute the style changes ourselves, instead of relying on the browser. If we only need to handle a subset of styles then this becomes somewhat reasonable. For example if an element blocking method such as visibility: hidden is used then element layout will not need to be recomputed. This means that only the styles which affect the visibility state of the ad elements need to be parsed and stored. And when an ad element is inspected only the stored visibility state needs to be returned.

But this approach does seem complicated and does not allow collapsing of the ad space.

Performant and Reliable Ad Element Detection

Finally one of the issues with using asynchronous DOM modification events was that the creation of ad elements could not always be handled in time. We can overcome this by either marking the ad element with a new ad block style (ex: content: 'AD-ELEMENT') or having the equivalent of a whitelist and a blacklist, where any unclassified element is treated as blacklisted.
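The second option could be sketched as follows. The set names and helper functions are hypothetical; the only idea being illustrated is that an element nobody has classified yet falls on the blacklisted side.

```javascript
// Elements land in one of these once they have been classified.
const whitelisted = new WeakSet();
const blacklisted = new WeakSet();

// Record a verdict, e.g. when the asynchronous detection catches up.
function classify(element, isAd) {
  (isAd ? whitelisted : blacklisted).delete(element);
  (isAd ? blacklisted : whitelisted).add(element);
}

// Decide how to answer an inspection call for this element.
function treatAsAd(element) {
  if (blacklisted.has(element)) return true;
  if (whitelisted.has(element)) return false;
  return true; // unclassified: assume the worst
}

const freshElement = {};
console.log(treatAsAd(freshElement)); // true

classify(freshElement, false);
console.log(treatAsAd(freshElement)); // false
```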

However if we use the ‘Parallel DOM’ approach this will no longer be an issue since we can pass all element inspection calls directly to the shadow DOM with hopefully a negligible performance overhead.

Closing

We have looked at a way to thwart ad block detection if some compromises, such as still loading the ad content, can be made. Then we touched on how it could be made more performant. Finally, this approach has some limitations which affect performance and reliability, but we do have some good options to look into for addressing these.

GitHub Repo: AdBlock Plus + “Mirage”