How Scripts Test Your App

May 9, 2023

If you are looking to automate your mobile tests, you have likely been looking at tools and frameworks (Appium, Selenium, etc.) that help you write code to automate your suite.

Scripting is far and away the most common automation solution, but is it the best way to automate testing for mobile specifically?

In this blog we are going to compare the fundamental makeup of tests created through scripting versus tests created with Waldo. Below, we will break down the differences with a focus on how a test is built, how it is executed, and the results each approach provides.

How a script will test your app

Let’s start with the basics: a scripted test consists of a series of programmatic commands that check certain behaviors within your test scenario. Ideally, the behaviors you check correspond to where bugs could be present in the given scenario.

When you create a scripted test, you base it on the existing build that you have. One of the ways a script can navigate future builds is using the element IDs or coordinates of certain elements as they existed in this initial build, which can make scripts more fragile and prone to break. You can also layer specific assertions into a script: asking it to check for things like text matching before completing an action.
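To make the fragility concrete, here is a minimal sketch in Python. It does not use any real automation framework: the `Element` class, the `find_by_id` and `find_at` helpers, and both builds are hypothetical stand-ins invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical model of one on-screen element (not a real framework type)."""
    element_id: str
    text: str
    x: int
    y: int
    width: int
    height: int


def find_by_id(screen: List[Element], element_id: str) -> Optional[Element]:
    """Return the element with this ID, or None if the ID no longer exists."""
    return next((e for e in screen if e.element_id == element_id), None)


def find_at(screen: List[Element], x: int, y: int) -> Optional[Element]:
    """Return whichever element occupies the point (x, y), or None."""
    for e in screen:
        if e.x <= x < e.x + e.width and e.y <= y < e.y + e.height:
            return e
    return None


# The build the script was written against.
build_1 = [Element("btn_battery", "Battery", 0, 200, 320, 50)]

# An assumed later build: the ID was renamed and a new row pushed "Battery" down.
build_2 = [
    Element("row_storage", "Storage", 0, 200, 320, 50),
    Element("row_battery", "Battery", 0, 250, 320, 50),
]

print(find_by_id(build_1, "btn_battery") is not None)  # True
print(find_by_id(build_2, "btn_battery") is not None)  # False: the ID lookup breaks
print(find_at(build_2, 160, 225).text)                 # Storage: the old coordinates hit the wrong row
```

In this toy model, both locator strategies go stale after a routine layout change, which is the kind of breakage that forces script updates.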

Starting with something simple as an example: you want to include a step in your test scenario where you click on an element labeled “Battery”. What this looks like in practice is:

Example of a basic script

The script will locate the item based on its associated text (in this case, the word “Battery”) and attempt to tap the element. Your scripted test will then return a boolean result, “yes” or “no”, to indicate whether the tap succeeded.
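The step described above can be sketched in a few lines of Python. This is an illustration only, not the API of Appium or any other framework: `Element`, `find_by_text`, `tap_by_text`, and the sample settings screen are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for one UI element on a screen."""
    text: str
    enabled: bool = True
    tapped: bool = False


def find_by_text(screen: List[Element], text: str) -> Optional[Element]:
    """Return the first element whose text matches exactly, or None."""
    for element in screen:
        if element.text == text:
            return element
    return None


def tap_by_text(screen: List[Element], text: str) -> bool:
    """Try to tap the element labeled `text`; report only success or failure."""
    element = find_by_text(screen, text)
    if element is None or not element.enabled:
        return False
    element.tapped = True
    return True


# An assumed settings screen with three rows.
settings_screen = [Element("Wi-Fi"), Element("Battery"), Element("Display")]

print(tap_by_text(settings_screen, "Battery"))  # True
print(tap_by_text(settings_screen, "Storage"))  # False
```

Note that the only thing the caller learns is `True` or `False`. Anything else the test should report has to be added by hand.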

And that, in a nutshell, is how a scripted test works. I will pause here for applause.

The benefits of scripting

One of the core benefits of this approach (and what has led to it becoming the de facto solution for automated mobile tests) is that your developers can customize it to suit your app’s needs.

Their familiarity with both your app’s codebase and the coding language they select can simplify the process of designing the right commands for your scripted tests, giving more meaning to those boolean “yes”/“no” results.

Let’s look at how Waldo goes about executing a test.

The differences start with how a test is built.

In Waldo, you upload a build to the platform and then navigate through it on a simulated device to create an initial “baseline” recording of ideal user behavior. This baseline demonstrates what a “good path” through your app looks like, so Waldo can attempt to replicate it on future builds.

Starting with a simple use case: you need to click on the Log In button in your test scenario.

Waldo attempts to advance from step to step in your test scenario, following the path you provided in that initial baseline. It applies a proprietary source tree, analyzing every element present in every layer of your app and identifying the available elements (e.g. buttons or text fields) that can be interacted with on a given screen.

This allows Waldo to avoid relying on element IDs or coordinates and to take a multi-layered approach when breaking down each screen it is presented with, evaluating:

What elements are present?
What elements are accessible for interaction on this particular screen?
Where are the elements we identified in the baseline located on this new screen?

Much like a script, you can also add assertions in Waldo, asking it to check for specific text, elements, etc. Once Waldo completes its analysis of the screen, it will attempt to take the designated path by clicking the Log In button.
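Waldo's matching is proprietary and is not described in this post, so the following is only a toy sketch of the general idea of locating a baseline element on a new screen without IDs or coordinates. Everything in it is an assumption made for illustration: the `Element` model, the text-similarity score (Python's standard `difflib`), and the 0.6 threshold.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical model of an on-screen element."""
    kind: str                 # e.g. "button", "text_field", "label"
    text: str
    x: int
    y: int
    interactive: bool = True


def similarity(a: Element, b: Element) -> float:
    """Score from 0 to 1 using element kind and text only; position is ignored."""
    if a.kind != b.kind:
        return 0.0
    return SequenceMatcher(None, a.text.lower(), b.text.lower()).ratio()


def locate(baseline: Element, screen: List[Element],
           threshold: float = 0.6) -> Optional[Element]:
    """Find the interactive element on `screen` that best matches `baseline`."""
    candidates = [e for e in screen if e.interactive]
    best = max(candidates, key=lambda e: similarity(baseline, e), default=None)
    if best is None or similarity(baseline, best) < threshold:
        return None
    return best


# Element recorded in the baseline.
baseline_button = Element("button", "Log In", x=40, y=600)

# An assumed later build: the button moved and its label changed case.
new_screen = [
    Element("label", "Log in to continue", x=40, y=80, interactive=False),
    Element("button", "Sign Up", x=40, y=640),
    Element("button", "Log in", x=40, y=720),
]

match = locate(baseline_button, new_screen)
print(match.text, match.x, match.y)  # Log in 40 720
```

The sketch walks through the three questions in order: it lists what is present, filters to what is interactive, and then finds where the baseline element ended up.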

The image above shows how Waldo identifies and matches elements on each screen.

The benefits of Waldo

Waldo goes beyond pixel-perfect comparisons by attempting to match the elements on one screen to those on another. Slight movements or UI changes will not necessarily cause Waldo to fail your test. It will, however, note those discrepancies or any others it finds on the screen.

This means that instead of having to predict what bugs could appear and programmatically check for them, you can rely on Waldo’s similarity algorithm to flag the discrepancies it finds without requiring specific assertions. This allows Waldo to focus on finding out whether it can navigate through your build following the same happy path provided in your baseline, prioritizing whether a user would be able to complete the flow.

When a test fails in Waldo, you know what caused that failure. Unlike scripts that return a boolean result and no additional detail, Waldo’s results will show you how it broke down your screen, what element caused the failure (where applicable), and provide you with the additional details necessary to begin debugging faster (logs, network requests, full video playback of your test, etc.).

The image above shows the Replay artifact generated with every test run in Waldo.

Drawbacks to Waldo

Now, Waldo is not a magic, quick fix for all your automation needs. One of the limitations of the platform is its limited out-of-the-box customization. Since Waldo focuses on confirming that it can replicate a “happy path” through your test scenario, it is not as flexible as a script when it comes to highly custom or complex tests.

Scripts are code, which means that, in theory, they can do anything you program them to do! Waldo does not have the same inherent flexibility. While this allows Waldo customers to limit their suite maintenance, it may slightly limit the types of tests you can automate in the platform.

Drawbacks to Scripting

Like Waldo, scripting is not the silver bullet for mobile test automation either. One of the biggest drawbacks to scripting is that it requires updating with every change to your app…and those updates have to be made by an engineering resource who can write them.

When a scripted test returns a failed result (or even a passing one), it won’t provide you with any additional information outside of “yes I could complete this action” or “no I could not.” Investigating the root cause starts with determining the validity of that simple result, and then ruling out other possibilities one by one.

For example, a script could navigate a screen where the UI is completely broken, still find the button you are asking it to tap, and pass your test while providing no warning about the UI issues. Since the results give you no real insight into what happened during the test, they provide limited value.
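This scenario can be reproduced with a small Python sketch. It is a toy model rather than a real framework: the `Element` class, the two layouts, and the `overlapping_pairs` check are hypothetical and exist only to show that the scripted step returns the same result on a healthy screen and a broken one.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Element:
    """Hypothetical on-screen element with a bounding box."""
    text: str
    x: int
    y: int
    width: int
    height: int


def tap_by_text(screen: List[Element], text: str) -> bool:
    """The scripted step: succeed if an element with this text exists."""
    return any(e.text == text for e in screen)


def overlaps(a: Element, b: Element) -> bool:
    """True when the two bounding boxes intersect."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def overlapping_pairs(screen: List[Element]) -> List[Tuple[str, str]]:
    """An extra layout check the basic script never runs."""
    return [(a.text, b.text)
            for i, a in enumerate(screen)
            for b in screen[i + 1:]
            if overlaps(a, b)]


healthy = [Element("Welcome back", 20, 40, 280, 40),
           Element("Log In", 20, 400, 280, 48)]

# Assumed regression: the title is now drawn on top of the button.
broken = [Element("Welcome back", 20, 380, 280, 40),
          Element("Log In", 20, 400, 280, 48)]

print(tap_by_text(healthy, "Log In"))  # True
print(tap_by_text(broken, "Log In"))   # True: the script still passes
print(overlapping_pairs(healthy))      # []
print(overlapping_pairs(broken))       # [('Welcome back', 'Log In')]
```

The boolean from `tap_by_text` is identical for both layouts. The overlap only surfaces because a separate check was written for it, which means someone had to predict that bug in advance.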

Lastly, scripting solutions do not come with their own infrastructure, so your team will be responsible for building and maintaining it themselves. This means device setup, cleaning, user account management, server updates, etc.

There you have it: the breakdown of how scripting compares to Waldo, and vice versa. If you think Waldo may be a good fit for your team, reach out today to book a demo!