Appium Architecture, Explained

Nabendu Biswas
November 1, 2022

In this post, you'll learn about Appium, its server, and its architecture. Appium is an open-source mobile app testing framework that's completely free to use. You can use it to test all types of mobile applications, whether they're written using Kotlin or Java for Android or Swift or Objective-C for iOS. You can even test cross-platform apps built with frameworks like React Native or Flutter, as well as hybrid apps that run in a web view.

Test cases in Appium can be written using a variety of languages, including Java, Python, JavaScript, and PHP. When you execute the test cases, the app runs automatically on the connected device and performs the user interactions written in them.

Test scripts can automate actions such as the clicking of a sign-in button as if a real user were interacting with the app.

What Kind of Framework Is Appium?

Appium is a web server that uses the WebDriver protocol. This protocol originated in the Selenium project, a browser automation framework, and Appium extends it to mobile automation.

In fact, the WebDriver protocol became so popular that the World Wide Web Consortium (W3C) adopted it as a web standard. The W3C is the standards body for web technologies such as HTML and CSS. As a result, all major browsers now support this protocol.

How Does the Appium Server Work?

Client programs for Appium can be written in a variety of languages. In a previous post titled "How to Test React Native Apps with Appium," we wrote client-side test code in JavaScript using the Chai assertion library.

The client program converts this code into HTTP requests and sends them to the Appium server, which runs either locally or remotely.

The Appium server interprets the requests and forwards them to the WebDriver. Then, the WebDriver runs these commands on the attached physical device or emulator. As mentioned earlier, these interactions seem like real user interactions.

After this, the test results are sent back to the Appium server, which produces test status logs.
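To make the flow above concrete, here's a minimal sketch of how a client library might turn a test command into an HTTP request. The helper name `to_http_request`, the session ID, and the locator values are made up for illustration, and the server address assumes a local Appium server on its default port. A real client library does this translation for you.

```python
import json
from urllib.request import Request

# Assumed address of a locally running Appium server (4723 is Appium's default port).
# The /wd/hub prefix matches the endpoint used in this post; your server may differ.
SERVER = "http://127.0.0.1:4723/wd/hub"


def to_http_request(session_id, command, **params):
    """Translate a high-level test command into an (unsent) HTTP request."""
    if command == "find_element":
        path = f"/session/{session_id}/element"
        body = {"using": params["using"], "value": params["value"]}
    elif command == "click":
        path = f"/session/{session_id}/element/{params['element_id']}/click"
        body = {}
    else:
        raise ValueError(f"unsupported command: {command}")
    return Request(
        SERVER + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical session ID and locator, for illustration only.
request = to_http_request("abc123", "find_element", using="accessibility id", value="sign-in")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Nothing is sent here. The sketch only builds the request so you can inspect the method, URL, and JSON body that would travel to the server.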

WebDriver Protocol

The WebDriver protocol is central to Appium. It was originally designed to automate browsers, and it's used in Selenium, from which Appium borrows it. It gives us commands that emulate user behavior, such as clicking or swiping.

This protocol uses only the GET, POST, and DELETE methods of HTTP. It's used to execute user interactions and other commands on the attached physical device or emulator.

The WebDriver protocol generally receives information in a POST request, whose body carries a JSON object.

When a session is being created, that JSON object also carries the desired capabilities.
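As a rough illustration, the JSON body of such a POST request might look like the output of the sketch below. The device name and app path are placeholders, and the body uses the legacy `desiredCapabilities` shape discussed in this post.

```python
import json

# Example desired capabilities; the device name and app path are placeholders.
desired_capabilities = {
    "platformName": "Android",
    "deviceName": "Android Emulator",
    "app": "/path/to/app.apk",
}

# Legacy JSON wire protocol body. W3C WebDriver clients instead send a
# "capabilities" object, with Appium-specific keys prefixed "appium:".
body = json.dumps({"desiredCapabilities": desired_capabilities}, indent=2)
print(body)
```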

Client-Server Communication

Appium is mainly a server, written in JavaScript and running on Node.js, a server-side runtime. Client-server communication forms a major part of Appium.

As mentioned earlier, the client side can be written in a variety of languages, and the client sends requests to the server as JSON objects. Appium follows the typical client-server architecture, and all communication happens through a request-and-response mechanism.

JSON (JavaScript Object Notation) is also a common format for passing data between a web browser and a server on the internet.

The Appium server is a web server that communicates via JSON objects. It uses a protocol created by Selenium called the JSON wire protocol, which we'll cover later. Whenever the client makes a request, the Appium server processes it and then sends it to the WebDriver for execution on a connected mobile device. The server also sends the results of the test back to the client program.

Appium Sessions

When the client sends its first request to the server, a session is created. The server assigns the session an ID, and all subsequent requests carry that ID, which keeps each test separate.

Further, the first request that the client program sends is known as a session request, which is a POST request. Requests in the session are carried as JSON objects, and the underlying protocol is called the JSON wire protocol.
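Here's a small sketch of how a client could pull the session ID out of the server's reply and reuse it in later request URLs. The reply text and the helper names `extract_session_id` and `session_url` are invented for illustration.

```python
import json


def extract_session_id(response_text):
    """Return the session ID from a new-session response body."""
    payload = json.loads(response_text)
    # Legacy JSON wire protocol replies carry "sessionId" at the top level;
    # W3C WebDriver replies nest it under "value".
    if payload.get("sessionId"):
        return payload["sessionId"]
    return payload["value"]["sessionId"]


def session_url(base_url, session_id, suffix=""):
    """Build the URL for a later command in the same session."""
    return f"{base_url}/session/{session_id}{suffix}"


# A made-up reply, shaped like a legacy new-session response.
reply = '{"status": 0, "sessionId": "1f2e3d", "value": {"platformName": "Android"}}'
sid = extract_session_id(reply)
print(session_url("http://127.0.0.1:4723/wd/hub", sid, "/source"))
```

Because every later URL embeds the session ID, the server can tell apart requests that belong to different test runs.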

JSON Wire Protocol

Appium uses a modified version of the JSON wire protocol called the mobile JSON wire protocol, which is built on top of the Selenium JSON wire protocol. It's used to communicate between the client and the server. In practice, it's a small, predefined set of RESTful API endpoints on the server.

When a client wants to send some data to the server, the client program converts it into a JSON object. Then, it's sent to the server, which does its tasks. After that, the response is sent to the client.
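One way to picture a predefined set of RESTful endpoints is as a routing table that maps an HTTP method and path to a command. The sketch below uses a few WebDriver-style endpoints; the command names on the right are illustrative labels, not Appium's internal names, and the list is far from complete.

```python
import re

# A small, non-exhaustive sample of WebDriver-style endpoints.
ROUTES = [
    ("GET", r"/status", "get_status"),
    ("POST", r"/session", "create_session"),
    ("DELETE", r"/session/(?P<session_id>[^/]+)", "delete_session"),
    ("GET", r"/session/(?P<session_id>[^/]+)/source", "get_page_source"),
    ("POST", r"/session/(?P<session_id>[^/]+)/element", "find_element"),
]


def resolve(method, path):
    """Return (command_name, path_params) for a request, or None if nothing matches."""
    for route_method, pattern, command in ROUTES:
        match = re.fullmatch(pattern, path)
        if route_method == method and match:
            return command, match.groupdict()
    return None


print(resolve("POST", "/session"))
print(resolve("DELETE", "/session/1f2e3d"))
print(resolve("PUT", "/session"))
```

The last call returns `None` because PUT isn't one of the methods the protocol uses.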

What Is Appium's Architecture?

Appium's main part is the server, which is written for Node.js, a server-side JavaScript runtime. Client programs are written in different languages and communicate with the Appium server using the mobile JSON wire protocol.

The first request the client sends includes the desired capabilities. These contain all the information about the connection, like the APK file, platform, and mobile device being used.

The first call is a POST request to the endpoint /wd/hub/session. Here, the Appium server creates a unique session ID and uses it throughout the session until the testing is done.
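The session lifecycle can be sketched with a tiny in-memory stand-in for the server, so the example runs without Appium or a device. `FakeAppiumServer` is entirely hypothetical. It only mimics the shape of legacy-style replies, where a status of 0 means success and 6 means the session doesn't exist.

```python
import uuid


class FakeAppiumServer:
    """In-memory stand-in for the server, so the sketch runs without a device."""

    def __init__(self):
        self.sessions = {}

    def handle(self, method, path, body=None):
        # The first call: POST /wd/hub/session creates a session with a unique ID.
        if (method, path) == ("POST", "/wd/hub/session"):
            session_id = uuid.uuid4().hex
            capabilities = body["desiredCapabilities"]
            self.sessions[session_id] = capabilities
            return {"status": 0, "sessionId": session_id, "value": capabilities}
        # Every later path looks like /wd/hub/session/<id>[/...].
        session_id = path.split("/")[4]
        if session_id not in self.sessions:
            return {"status": 6, "sessionId": session_id,
                    "value": {"message": "no such session"}}
        if method == "DELETE":
            del self.sessions[session_id]  # testing is done; the ID stops working
        return {"status": 0, "sessionId": session_id, "value": None}


server = FakeAppiumServer()
created = server.handle("POST", "/wd/hub/session",
                        {"desiredCapabilities": {"platformName": "Android"}})
session_id = created["sessionId"]
during = server.handle("GET", f"/wd/hub/session/{session_id}/source")
server.handle("DELETE", f"/wd/hub/session/{session_id}")
after = server.handle("GET", f"/wd/hub/session/{session_id}/source")
print(during["status"], after["status"])
```

The same ID works for every request until the session is deleted, after which the stand-in rejects it.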

However, Appium works in a different way in Android and iOS. Let's look into that in detail now.

Appium on Android

On Android devices, we have the UI Automator framework. Appium uses it to perform the interactions on the connected Android device or emulator. The steps, as shown in the figure below, are performed in this order:

  1. The test commands written by the user are converted by the underlying client libraries. Then, they're sent to the Appium server as a JSON object through the mobile JSON wire protocol.
  2. Now, the Appium server, using WebDriver, sends these requests to the connected physical Android device or emulator.
  3. On the device, Bootstrap.jar receives these commands from the Appium server and converts them using the UI Automator framework.
  4. After the conversion, these tests, which are generally interactions, are run on the app installed from the APK file.
  5. The test results are also sent back to the Appium server by Bootstrap.jar.
  6. Again, these responses are sent back to the client, which can display them or use them in any way.
[Figure: Appium on Android]

Appium on iOS

On iOS devices, we have the XCUITest framework. Appium uses it to perform the interactions on the connected iOS device or simulator. The steps, as shown in the figure below, are performed in this order:

  1. The test commands written by the user are converted by the underlying client libraries. Then, they're sent to the Appium server as a JSON object through the mobile JSON wire protocol.
  2. Now, the Appium server, using WebDriver, sends these requests to the connected physical iOS device or simulator.
  3. On the device, WebDriverAgent.app receives these commands from the Appium server and converts them using the XCUITest framework.
  4. After the conversion, these tests, which are generally interactions, are run on the app installed from the IPA file.
  5. The test results are also sent back to the Appium server by WebDriverAgent.app.
  6. Again, these responses are sent back to the client, which can display them or use them in any way.
[Figure: Appium on iOS]
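From the client's point of view, the main difference between the two platforms shows up in the desired capabilities. The sketch below builds a capability set for either platform. The helper `capabilities_for`, the device names, and the app paths are placeholders, and `UiAutomator2` is assumed as the Android backend, which is the newer driver rather than the Bootstrap.jar flow described above.

```python
# Automation backends by platform. "UiAutomator2" is the newer Android driver;
# the Bootstrap.jar flow described above belongs to the older UI Automator driver.
AUTOMATION_BY_PLATFORM = {"Android": "UiAutomator2", "iOS": "XCUITest"}


def capabilities_for(platform, device_name, app_path):
    """Build a capabilities dictionary for the given platform."""
    if platform not in AUTOMATION_BY_PLATFORM:
        raise ValueError(f"unsupported platform: {platform}")
    return {
        "platformName": platform,
        "automationName": AUTOMATION_BY_PLATFORM[platform],
        "deviceName": device_name,
        "app": app_path,
    }


# Device names and app paths below are placeholders.
android = capabilities_for("Android", "Android Emulator", "/path/to/app.apk")
ios = capabilities_for("iOS", "iPhone", "/path/to/app.ipa")
print(android["automationName"], ios["automationName"])
```

The rest of the test code can stay largely the same, since the server picks the platform-specific automation framework based on these capabilities.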

Why Should or Shouldn't We Use Appium?

There are several benefits and drawbacks to using Appium, and we should know them before we choose Appium as our automation test framework.

Top Benefits

The top benefits of Appium are that it's completely open source and free to use. Because the whole codebase is available, a lot of paid tools like Appium Studio are built on top of it.

Additionally, Appium can be used with all types of mobile apps, whether it's a native iOS app made with Swift or Objective-C or it's a native Android app created with Kotlin or Java.

Cross-platform apps, in which a common codebase is used for both Android and iOS, are supported too. These apps are created with frameworks like React Native and Flutter, or with hybrid (web view based) frameworks like Ionic.

Top Drawbacks

The biggest drawback of Appium is that the tester needs to write test cases. Generally, these test cases are written in Java, JavaScript, or Python.

So, the tester has to learn a programming language, which can be difficult for someone without a coding background. There are a lot of paid, fully automated tools like Waldo, in which testers only need to provide the interactions.

Another big drawback of Appium is that the setup process is quite complicated and error-prone.

Conclusion

In this post, we've learned about Appium, including how the server works and the famous WebDriver protocol.

We've also learned about terms like Appium sessions and the JSON wire protocol. Finally, we learned about Appium's architecture and the different ways it works on Android and iOS.

If this all seems complicated, we can also get the APK or IPA file and try Waldo. With this testing tool, we can avoid going through the pain of writing test cases.
