A/B testing plays a key role in validating optimization strategies across search and navigation experiences. A/B test results can be straightforward if one variant significantly improves all KPIs. However, when one variant only improves certain KPIs or both variants perform similarly, additional analysis can help uncover meaningful shopper behavior patterns.
The Insights dashboard helps you analyze A/B test performance. By reviewing trends, validating assumptions, and investigating supporting data, you can better understand test outcomes and make informed decisions about future merchandising strategies.
This article outlines a workflow for reviewing A/B test results and investigating unexpected outcomes using the Insights A/B Testing report.
Review how success is measured
Before setting up your A/B test, you should have defined how success will be measured for the test in question. Each test should have a clearly defined primary KPI based on your hypothesis, such as:
- View conversion
- Revenue per unique impression
- Basket conversion
- Purchase conversion
This KPI determines whether a test is successful.
Secondary KPIs as well as other supporting metrics help explain why a change occurred.
Important: A winning variant does not need to improve every metric. In many cases, trade-offs between metrics are expected.
Review overall test performance
Once the A/B test has concluded, begin by reviewing the overall test performance. These steps should be followed for all A/B tests regardless of the result.
(1) Evaluate KPI performance
Review the chosen primary KPI as well as the secondary KPIs to ensure you have a full understanding of the shopper journey and how well your results match your hypothesis.
If one variant outperforms the other across all KPIs, this indicates a clear winner. However, further validation by following steps (2) and (3) below is still highly recommended.
If the results are mixed, i.e., the winning variant doesn’t improve all KPIs or both variants perform equally well, continue with steps (2) and (3) below. You should also review the sections "Interpret mixed KPI results" and "Troubleshoot unclear results" later in this article.
Learn more: For a more detailed description of the A/B test report interface see Understanding A/B Testing Reports in Insights.
(2) Analyze performance over time
Review the time series for relevant KPIs as well as other supporting metrics.
Warning: The time series graph only reflects the data included in the currently selected Global Filters. Clear all filters to show all data collected for the test.
Review total impressions to validate that traffic was distributed according to the chosen ratio between both variants throughout the test period.
Warning: Unexpected traffic allocation or data inconsistencies can invalidate test results and point to tracking issues.
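The traffic check described above can also be run on exported impression totals. The following is a minimal sketch, not part of Insights: the function name `traffic_split_p_value` and its parameters are hypothetical, and it assumes you have the total impressions per variant and know the intended split. It applies a chi-square goodness-of-fit test with one degree of freedom.

```python
import math


def traffic_split_p_value(impressions_a, impressions_b, expected_share_a=0.5):
    """Return the p-value of a chi-square test (1 degree of freedom)
    comparing the observed traffic split with the intended split.

    expected_share_a is the intended share of traffic for variant A
    (an assumption you supply, e.g. 0.5 for a 50/50 test).
    """
    total = impressions_a + impressions_b
    if total <= 0 or not 0 < expected_share_a < 1:
        raise ValueError("need positive traffic and a share between 0 and 1")
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = ((impressions_a - expected_a) ** 2 / expected_a
            + (impressions_b - expected_b) ** 2 / expected_b)
    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Illustrative numbers only, not real test data.
p_value = traffic_split_p_value(5050, 4950)
print(f"p-value for the observed split: {p_value:.3f}")
```

A very small p-value suggests the observed split is unlikely under the intended ratio and is worth investigating for tracking issues. The cut-off you use (for example 0.01) is a judgment call rather than a rule from this article.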
Review view conversions. A clear winner should consistently outperform the other variant. If the winning variant only outperforms the losing variant for a certain stretch of time, this could indicate a trend and requires further investigation (see step (3) below). Similarly, sudden spikes or drops should be further investigated.
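To make "consistently outperforms" more concrete, you can summarize a daily KPI series in two numbers. This is a rough sketch under assumptions: the function `lead_summary` is hypothetical, and it expects one KPI value per day for each variant, in the same day order, taken from your own export of the time series.

```python
def lead_summary(daily_kpi_a, daily_kpi_b):
    """Summarize how consistently variant B leads variant A.

    Returns (share_of_days_b_ahead, lead_changes), where lead_changes
    counts how often the leading variant flips between consecutive days.
    """
    if not daily_kpi_a or len(daily_kpi_a) != len(daily_kpi_b):
        raise ValueError("both series must be non-empty and equally long")
    b_ahead = [b > a for a, b in zip(daily_kpi_a, daily_kpi_b)]
    share = sum(b_ahead) / len(b_ahead)
    lead_changes = sum(
        1 for previous, current in zip(b_ahead, b_ahead[1:])
        if previous != current
    )
    return share, lead_changes


# Illustrative values only: B leads on the first two days, A afterwards.
share, changes = lead_summary([0.10, 0.10, 0.10, 0.10],
                              [0.12, 0.12, 0.08, 0.08])
print(share, changes)
```

A share close to 1 with no lead changes matches the picture of a consistent winner. A single lead change partway through the test, as in the illustrative values, is the kind of pattern that step (3) asks you to investigate.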
(3) Identify changing trends
In some cases, performance shifts during the test period, i.e. one variant performs better initially, while the other improves later. This indicates a change in shopper behavior.
Performance shifts can be caused by:
- Pre- and post-payday spending patterns
- Seasonal changes
- Product launches or new releases
- Items going into or out of sale/promotions
- Differences between weekday and weekend activity
- Marketing email campaigns or outreach timing
- Changes in weather conditions
Review the results carefully to determine whether the shift in performance can be explained by any of the causes listed above or by another condition you can identify.
In such cases:
- Interpret results separately for different periods.
- Consider re-running the test for each context.
- Use separate strategies for the different periods to address the different behavior patterns accordingly.
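Interpreting results separately for different periods can be sketched as follows, using weekday versus weekend as the example split. The function `conversion_by_day_type` and the row layout (day, variant, conversions, impressions) are assumptions for illustration; adapt them to whatever daily export you work with.

```python
from datetime import date


def conversion_by_day_type(rows):
    """Aggregate a conversion rate per (day type, variant).

    rows: iterable of (day, variant, conversions, impressions), where
    day is a datetime.date. This layout is a hypothetical export format.
    """
    totals = {}
    for day, variant, conversions, impressions in rows:
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        day_type = "weekend" if day.weekday() >= 5 else "weekday"
        conv, imp = totals.get((day_type, variant), (0, 0))
        totals[(day_type, variant)] = (conv + conversions, imp + impressions)
    return {
        key: conv / imp
        for key, (conv, imp) in totals.items()
        if imp > 0
    }


# Illustrative rows only, not real test data.
rows = [
    (date(2024, 1, 5), "A", 10, 100),  # Friday
    (date(2024, 1, 6), "A", 30, 100),  # Saturday
    (date(2024, 1, 8), "A", 20, 100),  # Monday
]
print(conversion_by_day_type(rows))
```

The same grouping idea applies to other splits from the list above, such as before and after a promotion starts; only the rule that assigns a day to a period changes.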
Interpret mixed KPI results
When one variant does not win across all KPIs, interpret the results in the context of your hypothesis. A positive outcome does not require every metric to increase; trade-offs between metrics are expected.
Example 1: Showing more available products
If you adjust a ranking cocktail to show more available products by avoiding products with fragmented stock, this can often:
- Decrease or increase views depending on whether your catalog lends itself to exploratory shopper journeys
- Increase purchases and revenue as available products are shown
This indicates improved efficiency in product discovery.
Example 2: Improving search relevance
If you adjust the search configuration, expecting to increase search relevance, this will most likely:
- Increase add-to-basket, purchases, and revenue
- Increase or decrease views
The expected behavior for views depends on your catalog:
- Broad catalogs (e.g. fashion): Views may increase as shoppers see more relevant products and browse more
- Specific catalogs (e.g. technical products): Views may decrease as shoppers find what they are looking for more quickly
Example 3: Promoting higher-priced products
If you adjust rankings to prioritize higher-priced items, depending on the price sensitivity of your shoppers, this can:
- Not affect or decrease views
- Decrease basket and purchase conversions
- Increase revenue
This can still be a successful outcome depending on your goal.
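One way to read trade-offs like those in the examples above is to compute the relative change of each KPI and then judge the test by the primary KPI only. The sketch below is illustrative: the function `relative_changes`, the KPI keys, and all numbers are made up for the example and do not come from a real test.

```python
def relative_changes(control, variant):
    """Return the relative change per KPI, (variant - control) / control.

    KPIs missing from either side, or with a control value of 0,
    are skipped because no relative change can be computed.
    """
    return {
        name: (variant[name] - control[name]) / control[name]
        for name in control
        if name in variant and control[name] != 0
    }


# Hypothetical outcome of a "promote higher-priced products" test.
control = {"purchase_conversion": 0.040, "revenue_per_unique_impression": 1.00}
variant = {"purchase_conversion": 0.038, "revenue_per_unique_impression": 1.10}

changes = relative_changes(control, variant)
primary_kpi = "revenue_per_unique_impression"  # chosen before the test
for name, change in changes.items():
    marker = " (primary)" if name == primary_kpi else ""
    print(f"{name}{marker}: {change:+.1%}")
```

In this made-up case purchase conversion drops while revenue per unique impression rises, so the variant counts as successful only if revenue per unique impression was the primary KPI defined up front.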
Troubleshoot unclear results
Some tests produce mixed signals that are difficult to explain, even when considered in the context of your original hypothesis. These cases allow you to explore shopper behavior in detail.
Consider metric reliability
When trying to understand inconsistent results, keep in mind that not all metrics are equally stable:
- High-volume metrics (e.g. views) are less affected by randomness
- Low-volume metrics (e.g. revenue) are more variable
For example, revenue changes may be less reliable without sufficient data.
Note: In Insights, significance is only calculated for primary and secondary KPIs at the FAS environment level (e.g. live_1). Significance levels are not available for individual report environments or locales.
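The effect of volume on reliability can be illustrated with a confidence interval for a rate. This sketch does not reproduce how Insights calculates significance; it uses the Wilson score interval, a standard approximation for proportions, and the function name `conversion_interval` is hypothetical. It applies to rate metrics such as conversions, not directly to revenue amounts.

```python
import math


def conversion_interval(conversions, impressions, z=1.96):
    """Approximate 95% Wilson score interval for a conversion rate.

    z=1.96 corresponds to a 95% confidence level.
    """
    if impressions <= 0 or not 0 <= conversions <= impressions:
        raise ValueError("need 0 <= conversions <= impressions, impressions > 0")
    rate = conversions / impressions
    denominator = 1 + z ** 2 / impressions
    centre = (rate + z ** 2 / (2 * impressions)) / denominator
    half_width = z * math.sqrt(
        rate * (1 - rate) / impressions + z ** 2 / (4 * impressions ** 2)
    ) / denominator
    return centre - half_width, centre + half_width


# Same 5% rate, very different volumes (illustrative numbers only).
low_volume = conversion_interval(50, 1_000)
high_volume = conversion_interval(5_000, 100_000)
print(f"1,000 impressions:   {low_volume[0]:.4f} to {low_volume[1]:.4f}")
print(f"100,000 impressions: {high_volume[0]:.4f} to {high_volume[1]:.4f}")
```

The interval at the lower volume is much wider, so a difference between variants that falls inside it may be noise. Metrics built on rarer events, such as purchases, reach a given precision later than metrics built on views.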
Diagnose unexpected results
If results are unclear or do not match expectations, follow the structured investigation process below.
(1) Re-check your setup
Validate that the A/B test was set up correctly:
- Does the configuration match your intended strategy?
- Are the correct products shown in each variant?
- Are there overrides (e.g. result modifications) affecting results?
Review your setup in MS Preview to confirm that the business rules are behaving as expected.
(2) Break down the data
Identify other patterns:
- Use global filters to identify regional or country differences.
- Consider breaking the A/B test down per category.
This helps determine whether specific segments drive the result.
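If you export the filtered results, the breakdown above can be summarized as a relative uplift per segment. This is a sketch under assumptions: the function `uplift_by_segment`, the variant labels "A" and "B", and the row layout (segment, variant, conversions, impressions) are hypothetical.

```python
def uplift_by_segment(rows):
    """Return the relative uplift of variant B over variant A per segment.

    rows: iterable of (segment, variant, conversions, impressions).
    Segments without usable data for both variants are skipped.
    """
    totals = {}
    for segment, variant, conversions, impressions in rows:
        by_variant = totals.setdefault(segment, {})
        conv, imp = by_variant.get(variant, (0, 0))
        by_variant[variant] = (conv + conversions, imp + impressions)

    uplift = {}
    for segment, by_variant in totals.items():
        if "A" not in by_variant or "B" not in by_variant:
            continue
        conv_a, imp_a = by_variant["A"]
        conv_b, imp_b = by_variant["B"]
        if imp_a == 0 or imp_b == 0 or conv_a == 0:
            continue  # no meaningful relative change can be computed
        rate_a = conv_a / imp_a
        rate_b = conv_b / imp_b
        uplift[segment] = (rate_b - rate_a) / rate_a
    return uplift


# Illustrative rows only: B wins in one country and loses in another.
rows = [
    ("DE", "A", 100, 1000), ("DE", "B", 120, 1000),
    ("FR", "A", 100, 1000), ("FR", "B", 90, 1000),
]
for segment, change in uplift_by_segment(rows).items():
    print(f"{segment}: {change:+.1%}")
```

Opposite signs across segments, as in the illustrative rows, can cancel out in the overall result and make a test look flat. Keep in mind that each segment has less data than the whole test, so segment-level differences are noisier.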
(3) Use behavioral signals to explain changes
Analyze supporting metrics to understand shopper behavior, e.g.:
- If view conversions decrease:
  - Increased pagination or filtering may indicate low relevance of the products shown to the shopper.
- If purchase conversions decrease:
  - A decreased purchase-to-add rate indicates that shoppers are adding products to the basket but abandoning before checkout. This could signal an issue with, for example, pricing competitiveness, shipping costs, or low purchase confidence.
(4) Refine your hypothesis
If you still cannot make sense of the results after following the steps above, don't immediately repeat the same test. Instead:
- Combine the learnings from both variants to develop a refined hypothesis.
- Use the findings you have gathered thus far to create a testing plan.
- Design a more targeted follow-up test. Depending on your findings, consider:
  - Increasing the footprint of the test.
  - Making more radical changes that align with your hypothesis.
  - Targeting categories/countries/user groups that are likely to share behavioral patterns.
(5) Document insights
Record findings for future use:
- Add notes explaining results and interpretations.
- Use labels to organize tests.
Important: Documenting your findings with the current context fresh in mind helps to build long-term understanding of shopper behavior. This, in turn, will help you to optimize your site over time and avoid repeated analysis.
Apply learnings and validate changes
Use insights from your analysis to guide further optimizations.
- Adjust your strategy based on observed behavior.
- Test refined hypotheses using A/B testing.
- Validate changes through consistent improvements in KPIs.
A/B testing should be treated as an iterative optimization process. Even inconclusive or mixed results can provide valuable insights into shopper behavior and help refine future strategies.