Wednesday, October 8, 2014

Android UI testing with Appium

The final product of Android application development is not a set of Activities, Services, Fragments and Views but all of these pieces working together to produce a system with certain functionality. Customers and users are not interested in the internal architecture of a mobile app; they want the app to return the correct UI output in response to the user’s actions on the device. Therefore, functional UI testing does not require testers to know implementation details.
Manual testing has a lot of disadvantages: it can be time-consuming, tedious and error-prone. Automated testing is more efficient and reliable. All you need to do is write test cases that cover specific usage scenarios, and then let a testing framework run them automatically and repeatedly.

Inspiration

The most notable limitation of Android Instrumentation frameworks, including Robotium, is that they can click only within the application under test. For example, if the application opens the camera and tries to take a photo, the test fails.
The reason is that one application needs permission to perform a click in another, which is restricted by Android’s security model. The uiautomator framework does not have this limitation: it can, for example, take a picture in one application and change settings in another.

Why Appium?

  • It provides a cross-platform solution for native and hybrid mobile automation, i.e. Android and iOS.
  • It allows you to communicate with other Android apps, not only the app under test. For example, you can start another app (such as the Camera app) from the app under test.
  • You don’t have to re-compile your app or modify it in any way, because Appium uses standard automation APIs on all platforms.
  • It is a “black box” tool. You can test not only apps you developed yourself but any *.apk installed on your phone or emulator. The internal implementation of the app does not limit testing (apart from some rules related to UI definition, like defining content-description text).
  • You can write tests with your favorite dev tools using any WebDriver-compatible language such as Java, Objective-C, JavaScript with node.js, PHP, Ruby, Python, C#… All you need are Selenium WebDriver and the language-specific libraries.

How it works?

Appium supports a subset of the Selenium WebDriver JSON Wire Protocol and extends it so that users can specify mobile-specific desired capabilities for a test run. On Android, Appium uses the UiAutomator framework for newer platforms and Selendroid for older Android platforms.
diagram
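To make the role of desired capabilities more concrete, here is a minimal sketch of the JSON body that a client library sends when it opens a session (POST /session in the JSON Wire Protocol). It uses only the Java standard library and builds the JSON by hand. The class name NewSessionPayload is made up for illustration, and the capability keys "device" and "app" as well as the APK path are assumptions based on the example in this post, not a definitive list of what Appium accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NewSessionPayload {

    // Escapes backslashes and double quotes so a value can sit inside a JSON string.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Wraps the capabilities in the "desiredCapabilities" object sent with POST /session.
    static String build(Map<String, String> capabilities) {
        StringBuilder json = new StringBuilder("{\"desiredCapabilities\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : capabilities.entrySet()) {
            if (!first) {
                json.append(',');
            }
            json.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
            first = false;
        }
        return json.append("}}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> caps = new LinkedHashMap<>();
        caps.put("device", "android");                    // assumed key, value as used in this post
        caps.put("version", "4.2");
        caps.put("app", "/path/to/TestedAndroidApp.apk"); // hypothetical path
        System.out.println(build(caps));
    }
}
```

In a real test you never build this string yourself; DesiredCapabilities and RemoteWebDriver do it for you.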

Example

My simple example does the following:
  1. Runs MainActivity, which has a button with the label “Button 1”.
  2. Clicks on the button, which starts the second Activity.
  3. Checks that the second screen contains a TextView with the text “Activity2”.
  4. Clicks on the “back” button.
  5. Checks that we are on MainActivity again.



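import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.HashMap;
import java.util.concurrent.TimeUnit;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.remote.CapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;

// CapabilityConstants is a helper class with capability names; it is not shown in this post
public class AppiumExampleTest {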
	private WebDriver driver = null;
 
	@Before
	public void setup() {
		File appDir = new File("..//TestedAndroidApp//bin//");
		File app = new File(appDir, "TestedAndroidApp.apk");
 
		DesiredCapabilities capabilities = new DesiredCapabilities();
		capabilities.setCapability(CapabilityType.BROWSER_NAME, "");
		capabilities.setCapability(CapabilityType.VERSION, "4.2");
		capabilities.setCapability(CapabilityType.PLATFORM, "WINDOWS");
		capabilities.setCapability(CapabilityConstants.DEVICE, "android");
		capabilities.setCapability(CapabilityConstants.APP_PACKAGE, "com.example.android");
		capabilities.setCapability(CapabilityConstants.APP_ACTIVITY, "MainActivity");
		capabilities.setCapability(CapabilityConstants.APP, app.getAbsolutePath());
 
		try {
			driver = new RemoteWebDriver(new URL("http://127.0.0.1:4723/wd/hub"), capabilities);
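		} catch (MalformedURLException e) {
			// fail fast: with a null driver the next line would throw a NullPointerException
			throw new IllegalStateException("Invalid Appium server URL", e);
		}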
 
		driver.manage().timeouts().implicitlyWait(80, TimeUnit.SECONDS);
 
	}
 
	@Test
	public void appiumExampleTest() throws Exception {
		// find button with label or content-description "Button 1"
		WebElement button = driver.findElement(By.name("Button 1"));
		// click on button and start second Activity
		button.click();
 
		// we are on second screen now
		// check if second screen contains TextView with text "Activity2"
		// (findElement throws NoSuchElementException if it is missing)
		driver.findElement(By.name("Activity2"));
 
		// click back button (4 is Android's KEYCODE_BACK)
		HashMap<String, Integer> keycode = new HashMap<String, Integer>();
		keycode.put("keycode", 4);
		((JavascriptExecutor) driver).executeScript("mobile: keyevent", keycode);
 
		// we are again in main activity, so the same button must be found
		driver.findElement(By.name("Button 1"));
	}
 
	@After
	public void tearDown() {
		if (driver != null) {
			driver.quit();
		}
	}
 
}
As you can see in the code example, we use WebDriver to find elements in the UI. The driver is created in the setup() method, where we define a set of desired capabilities. Once we have found a UI element, we can perform an action on it, such as clicking it or typing text into an input field.

WebView testing

One feature that uiautomator lacks is a way to directly access Android objects (Views), which limits how it can handle a WebView. Because there is no way to access the WebView, testers cannot inject JavaScript, which is clearly the easiest and best way to handle those tests. Currently there is nothing testers can do inside a WebView with uiautomator.
But the Appium developers found a solution for this limitation. As Appium supports both uiautomator and Selendroid, you can use Selendroid to test a WebView. Here is a simple example of how to do that:

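import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.concurrent.TimeUnit;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.remote.CapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

// CapabilityConstants is a helper class with capability names; it is not shown in this post
public class LoginTest {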
	private WebDriver driver = null;
 
	@Before
	public void setup() {
		File appDir = new File("..//TestedAndroidApp//bin//");
		File app = new File(appDir, "TestedAndroidApp.apk");
 
		DesiredCapabilities capabilities = new DesiredCapabilities();
		capabilities.setCapability(CapabilityType.BROWSER_NAME, "");
		capabilities.setCapability(CapabilityType.PLATFORM, "WINDOWS");
		capabilities.setCapability("device", "selendroid");
		capabilities.setCapability(CapabilityConstants.APP_PACKAGE, "com.example.android");
		capabilities.setCapability(CapabilityConstants.APP_ACTIVITY, "LoginActivity");
		capabilities.setCapability(CapabilityConstants.APP, app.getAbsolutePath());
 
		try {
			driver = new RemoteWebDriver(new URL("http://127.0.0.1:4723/wd/hub"), capabilities);
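		} catch (MalformedURLException e) {
			// fail fast: with a null driver the next line would throw a NullPointerException
			throw new IllegalStateException("Invalid Appium server URL", e);
		}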
		driver.manage().timeouts().implicitlyWait(80, TimeUnit.SECONDS);
 
	}
 
	@Test
	public void loginTest() throws Exception {
		WebDriverWait wait = new WebDriverWait(driver, 10);
 
		// this is the important part: switch from the native app to the WebView
		driver.switchTo().window("WEBVIEW");
 
		// find user-name input field
		WebElement userNameInput = driver.findElement(By.id("input_user_name"));
		wait.until(ExpectedConditions.visibilityOf(userNameInput));
 
		// type user-name in input field
		userNameInput.clear();
		userNameInput.sendKeys("android1@example.com");
		driver.findElement(By.name("password")).sendKeys("password");
 
		// submit login form
		driver.findElement(By.name("login")).click();
 
		WebElement confirmButton = driver.findElement(By.name("grant"));
		wait.until(ExpectedConditions.visibilityOf(confirmButton));
		confirmButton.click();
 
		// we are now logged in, so we switch back and proceed with the native app
		driver.switchTo().window("NATIVE_APP");
 
		// find button with label "button1".
		driver.findElement(By.name("button1"));
	}
 
	@After
	public void tearDown() {
		// driver may be null if setup() failed
		if (driver != null) {
			driver.quit();
		}
	}
 
}

Backward compatibility

Appium supports all Android API levels, with one limitation: it uses uiautomator for tests running on API >= 17, so for older API levels you need to run tests using Selendroid.
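The rule above can be sketched as a small helper that picks the value of the "device" capability from the target API level. This is only an illustration: the class and method names are made up, and the two values are the ones used in the examples of this post.

```java
public class AutomationBackend {

    // First API level for which, according to this post, Appium runs tests through uiautomator.
    static final int FIRST_UIAUTOMATOR_API = 17;

    // Returns the "device" capability value for the given Android API level.
    static String deviceFor(int apiLevel) {
        return apiLevel >= FIRST_UIAUTOMATOR_API ? "android" : "selendroid";
    }

    public static void main(String[] args) {
        System.out.println("API 16 -> " + deviceFor(16));
        System.out.println("API 19 -> " + deviceFor(19));
    }
}
```

A test suite that has to cover both old and new devices could call such a helper in setup() instead of hard-coding the capability.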

Selendroid vs Appium

Selendroid and Appium are very similar:
  • both use Selenium WebDriver
  • both can be used for native, hybrid and mobile web apps
  • both can run tests on emulators or real devices
  • both are suitable for Cloud Based Testing
Selendroid, or “Selenium for Android”, is a test automation framework which drives off the UI of Android native and hybrid applications (apps) and the mobile web. As you can see from its name, it can be used only for Android, which is not the case with Appium (it supports iOS and FirefoxOS, too).
Selendroid supports multiple Android target APIs (10 to 19) and does not have the WebView testing limitation of Appium, which uses uiautomator for API >= 17.
Locating UI elements is easier in Selendroid. In Selendroid you can find a UI element by its id, class, name, xpath, link text or partial link text. Appium, for example, does not support locating elements by id (defined in the layout *.xml file as “android:id=@+id/some_id”), because uiautomator does not support it for API levels below 18. Locating elements by link text and partial link text is also not supported by Appium.
Selendroid has a very useful tool called Selendroid Inspector, which simplifies locating UI elements. Although the Android SDK has uiautomatorviewer, Selendroid Inspector is more user-friendly.

Limitations

Robotium is much more accurate at recognizing UI elements because it lets tests click on elements by their resource ID, which identifies an element more precisely. In addition to the ID, elements can be recognized by their content. Uiautomator relies on general accessibility labels, e.g. text, description, etc. But if several elements have the same text, you need to add an index, for instance. And if the UI changes dynamically, that can be a big problem. Because uiautomator lets a test click through the whole device, text descriptions such as “Settings” can cause issues, as there may be both “Settings” and “Option settings”. For this reason it is much harder to write a universal test in uiautomator.
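The “Settings” problem can be illustrated with plain Java, without any testing framework. The sketch below does not model how uiautomator or Appium actually match text; it only shows, on a made-up list of labels, why a loose text match is ambiguous while an exact match (or a unique resource ID) is not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LabelMatching {

    // Loose match: every label that contains the text, ignoring case.
    static List<String> containing(List<String> labels, String text) {
        List<String> hits = new ArrayList<>();
        for (String label : labels) {
            if (label.toLowerCase().contains(text.toLowerCase())) {
                hits.add(label);
            }
        }
        return hits;
    }

    // Strict match: only labels that are exactly equal to the text.
    static List<String> equalTo(List<String> labels, String text) {
        List<String> hits = new ArrayList<>();
        for (String label : labels) {
            if (label.equals(text)) {
                hits.add(label);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // hypothetical labels visible on a screen
        List<String> labels = Arrays.asList("Settings", "Option settings", "About");
        System.out.println("loose:  " + containing(labels, "Settings"));
        System.out.println("strict: " + equalTo(labels, "Settings"));
    }
}
```

With the loose match the test would have to pick one of two candidates by index, which breaks as soon as the order of elements on the screen changes.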
Basically, you can easily find every View which defines the “contentDescription” attribute or which extends the TextView class. If you have a custom View which does not extend TextView, it will be very hard to find it in a test. Of course, there is an option to find a view by xpath, but it is not trivial.
At the time when I was researching Appium, I was not able to test screen orientation changes or connectivity changes. I also did not find a way to confirm an AlertDialog in my tests. There were some proposals to use JavaScript methods for this, but they did not work for me. The last thing I was not able to test was auto-complete text suggestions: I did not find out how to select one of the suggestions.
Limited support for gestures: if your app uses only simple gestures, like tap, you should be fine. Appium allows you to write JavaScript wrappers to support other gestures, but you will probably spend a lot of time writing them.

Cloud Based Testing

Cloud testing is a form of software testing in which web applications use cloud computing environments (a “cloud”) to simulate real-world user traffic. It is interesting to me because Appium is suitable for running tests in the cloud. For example, SauceLabs and testdroid provide services to run Appium tests on real devices or simulators. Of course, you need to pay for this, but it has a lot of advantages compared to running tests on a local machine or Jenkins. Simulators in the cloud are much faster than emulators running locally.

Conclusion

Appium is still young. I think it needs to grow more to cover all testing requirements, and I hope it will. I like the idea, especially that I can communicate with other apps on my phone while running a test for a certain app, which is a limitation of Robotium, for example. Cloud Based Testing has a lot of advantages. For example, our tests often fail on Jenkins because it runs tests on emulators, which are slow and unpredictable, especially when you have wait-for-view conditions in your tests.

Tuesday, October 7, 2014

Understanding the JSON wire protocol

All this while, in many places, we have mentioned that WebDriver uses the JSON wire protocol to communicate between client libraries and the different driver implementations (that is, Firefox Driver, IE Driver, Chrome Driver, and so on). In this section, we will see exactly what it is and which JSON APIs a client library should implement to talk to the drivers.
JavaScript Object Notation (JSON) is used to represent objects with complex data structures. It is used primarily to transfer data between a server and a client on the web. It has very much become an industry standard for various REST web services, serving as a strong alternative to XML.
A sample JSON file, saved as a .json file, will look as follows:
{
    "firstname": "John",
    "lastname": "Doe",
    "address": {
        "streetnumber": "678",
        "street": "Victoria Street",
        "city": "Richmond",
        "state": "Victoria",
        "country": "Australia"
    },
    "phone": "+61470315430"
}
A client can send a person's details to a server in the preceding JSON format, which the server can parse and create an instance of the Person object for use in its execution. Later, the response can be sent back by the server to the client in the JSON format, the data of which the client can use to create an object of a class. This process of converting an object's data to the JSON format and JSON-formatted data to an object is named serialization and de-serialization, respectively, which is quite common in REST web services these days.
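As a rough illustration of the serialization half of this process, the sketch below turns a small Person object into JSON by hand, using only the Java standard library. The Person and Address classes and their fields are assumptions made up for this example (and cover only part of the sample above); real code would normally leave this work to a JSON library.

```java
public class PersonJson {

    // Escapes backslashes and double quotes so a value can sit inside a JSON string.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static class Address {
        final String street;
        final String city;

        Address(String street, String city) {
            this.street = street;
            this.city = city;
        }

        String toJson() {
            return "{\"street\":" + quote(street) + ",\"city\":" + quote(city) + "}";
        }
    }

    static class Person {
        final String firstname;
        final String lastname;
        final Address address;
        final String phone;

        Person(String firstname, String lastname, Address address, String phone) {
            this.firstname = firstname;
            this.lastname = lastname;
            this.address = address;
            this.phone = phone;
        }

        // Serialization: the object's fields become a JSON object, nested objects included.
        String toJson() {
            return "{\"firstname\":" + quote(firstname)
                    + ",\"lastname\":" + quote(lastname)
                    + ",\"address\":" + address.toJson()
                    + ",\"phone\":" + quote(phone) + "}";
        }
    }

    public static void main(String[] args) {
        Person john = new Person("John", "Doe",
                new Address("Victoria Street", "Richmond"), "+61470315430");
        System.out.println(john.toJson());
    }
}
```

De-serialization is the reverse step: the receiver parses this text and fills the fields of a new Person instance.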
WebDriver uses the same approach to communicate between client libraries (language bindings) and drivers, such as Firefox Driver, IE Driver, Chrome Driver, and so on. Similarly, the RemoteWebDriver client and the RemoteWebDriver server use the JSON wire protocol to communicate with each other. All of these drivers use it under the hood, hiding the implementation details from us and making our lives simpler. Any existing or new client library should provide implementations for building all of the WebDriver JSON API requests, and any existing or new WebDriver should handle these requests and provide implementations for them. The list of APIs for various actions that we can take on a web page is as follows:
/status
/session
/sessions
/session/:sessionId
/session/:sessionId/timeouts
/session/:sessionId/timeouts/async_script
/session/:sessionId/timeouts/implicit_wait
/session/:sessionId/window_handle
/session/:sessionId/window_handles
/session/:sessionId/url
/session/:sessionId/forward
/session/:sessionId/back
/session/:sessionId/refresh
/session/:sessionId/execute
/session/:sessionId/execute_async
/session/:sessionId/screenshot
/session/:sessionId/ime/available_engines
/session/:sessionId/ime/active_engine
. . .
. . .
/session/:sessionId/touch/flick
/session/:sessionId/touch/flick
/session/:sessionId/location
/session/:sessionId/local_storage
/session/:sessionId/local_storage/key/:key
/session/:sessionId/local_storage/size
/session/:sessionId/session_storage
/session/:sessionId/session_storage/key/:key
/session/:sessionId/session_storage/size
/session/:sessionId/log
/session/:sessionId/log/types
/session/:sessionId/application_cache/status
The complete documentation is available at https://code.google.com/p/selenium/wiki/JsonWireProtocol.
The client libraries will translate your test script commands to the JSON format and send the requests to the appropriate WebDriver API. The WebDriver will parse these requests and take the necessary actions on the web page.
Let us see that with an example. Suppose your test script has the following code:
driver.get("http://www.google.com");
The client library will translate that to JSON by building a JSON payload and post the request to the appropriate API. In this case, the API that handles the driver.get(URL) method is as follows:
/session/:sessionId/url
The following code shows what happens in the client library layer before the request is sent to the driver; the request is sent to the RemoteWebDriver server running on 10.172.10.1:4444:
HttpClient httpClient = new DefaultHttpClient();
HttpPost postMethod  = new HttpPost("http://10.172.10.1:4444/wd/hub/session/"+sessionId+"/url");
JSONObject jo=new JSONObject();
jo.put("url","http://www.google.com");
// pass the charset to the entity; Content-Encoding is meant for codings such as gzip
StringEntity input = new StringEntity(jo.toString(), "UTF-8");
input.setContentType("application/json");
postMethod.setEntity(input);
HttpResponse response = httpClient.execute(postMethod);
The RemoteWebDriver server will forward that request to the driver; the driver will execute the test script commands that arrive in the preceding format on the web application under the test that is loaded in the browser.
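For comparison, the same request can be sketched with only the Java standard library. This is not what the WebDriver client library does internally, just a minimal illustration of the endpoint and payload described above; the class name NavigateRequest and the session id "1234" are made up, and send() needs a running RemoteWebDriver server, so it is not called in main.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class NavigateRequest {

    // Builds the endpoint that handles driver.get(URL): /session/:sessionId/url
    static String endpoint(String serverBase, String sessionId) {
        return serverBase + "/session/" + sessionId + "/url";
    }

    // Builds the JSON payload {"url": "..."} with minimal escaping.
    static String payload(String targetUrl) {
        String escaped = targetUrl.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"url\":\"" + escaped + "\"}";
    }

    // Posts the payload and returns the HTTP status code.
    static int send(String serverBase, String sessionId, String targetUrl) throws IOException {
        URL url = new URL(endpoint(serverBase, sessionId));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload(targetUrl).getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) {
        String base = "http://10.172.10.1:4444/wd/hub";
        System.out.println(endpoint(base, "1234")); // "1234" is a made-up session id
        System.out.println(payload("http://www.google.com"));
    }
}
```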
The following diagram shows what data flows at each stage:
Understanding the JSON wire protocol
The following table shows which command is executed at each stage:
Stage in the preceding diagram | Command executed at that stage
a | driver.get("http://www.google.com");
b | "http://10.172.10.1:4444/wd/hub/session/"+sessionId+"/url" { "url": "http://www.google.com" }
c | "http://localhost:7705/ { "url": "http://www.google.com" }
Native Code | Talks natively to the browser
d | "http://www.google.com"
In the previous diagram, the first stage is communication between your test script and client library. The data or command that flows between them is represented as a in the image; a is nothing but the following code:
driver.get("http://www.google.com");
The client library, as soon as it receives the preceding command, will convert it to the JSON format and communicate with the RemoteWebDriver server, which is represented as b.
Next, the RemoteWebDriver server forwards the JSON payload request to the Firefox Driver (in this case), and the data that flows through is represented as c.
Firefox Driver will speak to the Firefox browser natively, and then the browser will send a request for the asked URL to load, which is represented as d.
