Why two-sided testing is reducing your A/B testing program’s impact by 25%

Jones’ and Burke’s vigorous debate. (Image taken from SNL).

Identifying significant negative results isn’t often that valuable.

In A/B testing, we are always trying to create a positive effect. Consider two hypotheses by a Growth experimenter:

  • The classic hypothesis: By changing variable X, we will increase CVR / Retention or decrease Churn / Pausing.
  • The mad hypothesis: By changing variable X, we will either increase or decrease CVR / Retention / Churn / Pausing.

1. Significant negative results rarely give learnings that you can build upon in future testing

How often can significant negative results give us concrete learnings we can build on in the future? For nearly all of the tests we run, they cannot. In each of the following hypothetical examples, there are no clear follow-up tests when we observe a significant negative result.

  1. We add new education about our product to our very simple signup flow
  2. We introduce a new Abandoned Cart email journey
  3. We completely redesign our homepage and test it against the old one
Let’s say we need to know the breed of dog to make our product properly, so this step in the signup flow is necessary. Our recent test was to add the paragraph at the top to explain why customers are required to put in this information. If the test ended up reducing Signup CVR, there’s no more information we can take away. (Taken from the Tails.com signup flow.)
If Airbnb’s old page performed better than the new page, what exactly could we infer? Do we reason that people love nostalgic 2005-era web design? (Taken from their excellent blog.)
There are, however, two situations in which a significant negative result can be worth having:
  1. The costs or barriers to testing in the negative direction are high. Let’s say you think the latest Brand refresh might hurt, not improve, conversions on your Homepage. You might need to produce stringent (i.e. statistically significant) proof that the new design doesn’t work if you want to test removing it elsewhere on your website, as you’ll be upsetting a lot of people by doing so. (Example purely illustrative: our Brand team is amazing.)
  2. You are experiencing severe pressure from HiPPOs to validate a particular feature. Demonstrating that what you’re testing is not only failing to improve things but is significantly harming a key success metric might help convince stakeholders to leave you alone.

2. In business, nobody needs you to “significantly prove” your negative result.

In academia, the community might find your significantly negative result fascinating and a source of ground-breaking new research. Academics often value any significant effect, negative or otherwise. Tens or hundreds of academic papers usually validate broader theories by demonstrating different types of outcomes under different test conditions.

With the dummy data above, we cannot reject the null, and therefore cannot say with statistical validity that the test performed worse than the control. But we can infer that whatever we tested likely reduced CVR.
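To make that reasoning concrete, here is a minimal sketch of a one-sided two-proportion z-test. The counts are hypothetical and are not the dummy data referred to above; the function name `one_sided_uplift_test` and the 5% significance level are also assumptions made for illustration. It shows a variant whose observed CVR is lower than the control: the one-sided test for an uplift cannot reject the null, yet the sign of the difference still tells us which way things probably went.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_uplift_test(conv_c, n_c, conv_t, n_t, alpha=0.05):
    """One-sided z-test of H1: variant CVR > control CVR (pooled variance)."""
    p_c = conv_c / n_c  # observed control conversion rate
    p_t = conv_t / n_t  # observed variant conversion rate
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    # Upper-tail probability only: we are testing for an improvement.
    p_value = 1 - NormalDist().cdf(z)
    return {
        "diff": p_t - p_c,
        "z": z,
        "p_value": p_value,
        "reject_null": p_value < alpha,
    }


# Hypothetical counts: the variant converts slightly worse than the control.
result = one_sided_uplift_test(conv_c=500, n_c=10_000, conv_t=460, n_t=10_000)
print(result)
```

With these made-up numbers the null is not rejected, so we ship nothing, but the negative `diff` is still a perfectly usable hint that the change did not help.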

The efficiency benefits of one-sided tests are worth it: at the same significance level and power they need a smaller sample, which makes them roughly 25% faster to run than two-sided ones.
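Where does a saving of that size come from? A rough sketch, assuming the standard normal-approximation sample-size formula, in which the required sample is proportional to (z_alpha + z_power)², so the effect size and baseline variance cancel when we take the ratio. The function name `relative_sample_size` and the chosen alpha and power values are assumptions for illustration, not figures from a real test.

```python
from statistics import NormalDist


def relative_sample_size(alpha: float = 0.05, power: float = 0.80) -> float:
    """Ratio of one-sided to two-sided required sample size for a z-test.

    Assumes n is proportional to (z_alpha + z_power)^2, so everything
    except the critical values cancels out of the ratio.
    """
    z = NormalDist().inv_cdf
    z_power = z(power)
    one_sided = (z(1 - alpha) + z_power) ** 2      # all of alpha in one tail
    two_sided = (z(1 - alpha / 2) + z_power) ** 2  # alpha split across two tails
    return one_sided / two_sided


for power in (0.70, 0.80, 0.90):
    ratio = relative_sample_size(alpha=0.05, power=power)
    print(f"power={power:.2f}: one-sided needs {1 - ratio:.0%} fewer samples")
```

Under these assumptions, at a 5% significance level and 80% power the one-sided test needs about 21% fewer samples, or put the other way round, the two-sided test needs about 27% more. The exact figure shifts with the alpha and power you choose, which is why it is worth running the numbers for your own program's settings.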

A final reason advocates of two-sided tests prefer them to one-sided tests is the claim that they are more ‘stringent’. However, it is this over-stringency that may be slowing down your program unnecessarily.

Two-sided and one-sided tests. Taken from UCLA.edu.

Moving to a brave new one-sided world

One-sided tests are significantly more efficient to run than two-sided tests. And since we are in the business of producing positive changes to customer experience, negative results are either not that useful or not required to be significant to inform future testing. Both factors render the multi-directional power of two-sided tests useless, given the goals of online experimentation programs.


