Automation and Accessibility Integration (AAI) (Total Control 8+)
AAI identifies UI elements on screen as objects. The traditional (x, y) coordinate approach treats the screen as one giant object, so image/color seeking and OCR are required to identify an object on screen.
Accessibility is a feature that represents the UI elements on screen as underlying nodes. A node has many properties, such as text/description, dimensions, boolean properties (clickable, editable, scrollable), the underlying class name, etc. The text/description can be read directly (OCR is not needed), and the dimensions (together with clickable) ensure that a button can still be clicked even if the node is moved to another location.
A node can represent a UI element (e.g. a button), a group of UI elements, or a layout of certain elements. A node (element, group or layout) is identified by a node ID (a hex string). We integrate Accessibility, the TC scripting framework and the UI Automator library to achieve the following goals:
- Coordinate independence makes scripts more portable across different resolutions, screen sizes and brands.
- Synchronous APIs wait until the screen is repainted, which makes scripts simpler: there is no need to guess how long to sleep.
- Strings can be retrieved from the app with ease, instead of using error-prone OCR.
The simplest cases of AAI:
- Click "OK" on the screen if found, which is far better than click(100, 100) for one specific resolution: devices.clickSync("OK")
- Enter text into a text entry. AAI can find all text entry lines on the current screen.
devices.inputTextSync([position], "text") // Enter text, the position is used for multiple inputs
- Run or restart an application. Without a query, the call returns on screen refresh; with a query, it matches the query after the screen is refreshed.
devices.runAppSync(<package name>, [query])
devices.restartAppSync(<package name>, [query])
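The calls above can be chained without any sleep() in between, because each synchronous call waits for the screen. The following is a minimal sketch, not a definitive script: `openAndConfirm`, `makeRecordingDevices` and the package name `com.example.app` are hypothetical, the recording object merely stands in for the real `devices` object so the sketch runs outside Total Control, and treating position 0 as the first text entry is an assumption.

```javascript
// Hypothetical stand-in for the Total Control `devices` object.
// It only records the calls it receives so the flow can be inspected.
function makeRecordingDevices() {
  const calls = [];
  const record = (name) => (...args) => { calls.push([name, ...args]); };
  return {
    calls,
    runAppSync: record("runAppSync"),
    inputTextSync: record("inputTextSync"),
    clickSync: record("clickSync"),
  };
}

// Open an app, type into a text entry, then confirm with "OK".
function openAndConfirm(devices, packageName, text) {
  // With a query, the call matches the query after the screen is refreshed.
  devices.runAppSync(packageName, "T:OK");
  // Assumption: position 0 selects the first text entry on screen.
  devices.inputTextSync(0, text);
  devices.clickSync("OK");
  return devices;
}

const demo = openAndConfirm(makeRecordingDevices(), "com.example.app", "hello");
console.log(demo.calls);
```

Inside Total Control, the real `devices` object would be passed in place of the recording stand-in.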
The entire screen is composed of many nodes. A node can be the smallest UI element or a container of many nodes, and some nodes are invisible. The entire screen is a tree structure starting from a single root node. Depending on the complexity of the app, a screen can contain 50-300 nodes.
Since users are interested in only a small subset of the nodes on screen, the challenge is finding the correct nodes and then extracting information from them or performing actions on them.
To find nodes, we invented a query language that describes them. The FindNode program is installed on every device and performs the query to obtain the nodes that meet the criteria. The intent is to reduce the large number of nodes to one or a few intended nodes, so that users can obtain information from them or apply actions to them.
For example, UI Automator in Java provides "UiSelector" and "BySelector" for UiDevice.findObject() or findObjects() to locate nodes, which can become complex with multiple conditions:
new UiSelector().className("android.widget.TextView").text("OK")
We created a simple query language that is shorter and portable, since the query will be sent to many devices. The above code can be rewritten in our query language as:
"C:android.widget.TextView||T:OK"
The AAI project includes the following:
- Query language: a simple one-line syntax for searching for the intended nodes. It is the core of AAI.
- FindNode: carries out the queries and actions on each device. All queries and certain actions are done in FindNode, which contains 40+ commands. See the FindNode documentation for more information.
- Object mode in one-to-many synchronization: sends the node (or UI object) to all devices instead of coordinates, so clicking "OK" works on all devices with different resolutions, unlike click(100,100).
- UI Explorer: obtains node information and can visually test queries. It is a learning and exploration tool.
- AAIS: a simple language for performing automation on multiple devices. Capture and replay generates this language; see the AAIS documentation for more information.
- REST and JS APIs: include access to FindNode.
- UiElement class: built on top of FindNode to access nodes with ease.
Query
The following are the basic queries, which match information about the node itself:
C:<class name> Class name (S)
R:<resource ID> Resource ID (S)
D:<text> Description (S)
T:<text> Text (S)
IT:<number> Text input type (I)
CC:<number> Child count (I)
ID:<ID> Node ID in hex (S)
BI:[x, y] Nodes that contain the point (x, y)
BI:[x1, y1, x2, y2] Nodes enclosed by the rectangle; any x or y given as -1 is ignored
BP:<prop name> Boolean properties (S)
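To make the two BI forms concrete, here is a minimal sketch of how such bounds tests could be expressed. It illustrates the rule described above and is not FindNode's actual implementation: `containsPoint` and `enclosedBy` are hypothetical names, bounds are assumed to be [left, top, right, bottom] in screen pixels, and treating the edges as inclusive is also an assumption.

```javascript
// Bounds and rectangles are [left, top, right, bottom] in screen pixels.
// Assumption: edges are inclusive; FindNode's real rule may differ.
function containsPoint(bounds, x, y) {
  const [left, top, right, bottom] = bounds;
  return x >= left && x <= right && y >= top && y <= bottom;
}

// A rectangle coordinate of -1 means "do not constrain this side".
function enclosedBy(bounds, rect) {
  const [left, top, right, bottom] = bounds;
  const [x1, y1, x2, y2] = rect;
  return (x1 === -1 || left >= x1) &&
         (y1 === -1 || top >= y1) &&
         (x2 === -1 || right <= x2) &&
         (y2 === -1 || bottom <= y2);
}

const button = [100, 1800, 300, 1900];
console.log(containsPoint(button, 150, 1850));       // true
console.log(enclosedBy(button, [-1, 1700, -1, -1])); // true: only the top edge is constrained
console.log(enclosedBy(button, [0, 0, 200, 2000]));  // false: right edge 300 exceeds 200
```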
The following are the expanded queries:
IX:<number> Obtain one node from the list of matching nodes based on position
OX:<number> Offset to neighbor nodes horizontally (positive – right, negative – left)
OY:<number> Offset to neighbor nodes vertically (positive – down, negative – up)
TP:<template name> Short form of a complex search
ON:<type> Different ways to pick one node out of the list of matching nodes
LT:<line number> Return top nodes on the top of the scrollable nodes
LB:<line number> Return bottom nodes of the bottom of scrollable nodes
ST:<sort type> Return sorted nodes based on the position of the nodes on screen
TX Return nodes that intersect with reference node horizontally (see "intersectX").
TY Return nodes that intersect with reference node vertically (see "intersectY").
VG:[level number] Return nodes that intersect with reference node vertically (see "intersectY").
RN Return the optimized nodes from a list of matching nodes (see "reduceNodes")
PQ:<query> Apply after everything is done.
The query syntax can contain "!" for not, ">" and "<" for greater than and less than, "*" for wildcard matching and "/<regexp>/" for regular expressions. These can be applied to package name, class name, resource ID, text, description, child count and input type.
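Because a query is a one-line string of key:value conditions, it can be assembled from a plain object before it is sent. The helper below is a minimal sketch: `buildQuery` and its `separator` parameter are hypothetical and not part of the Total Control API, and the default "||" simply follows the UiSelector comparison earlier in this document.

```javascript
// Join key/value conditions into a one-line query string.
// `buildQuery` is a hypothetical helper, not a Total Control function.
function buildQuery(conditions, separator = "||") {
  return Object.entries(conditions)
    .map(([key, value]) => `${key}:${value}`)
    .join(separator);
}

const query = buildQuery({ C: "android.widget.TextView", T: "OK" });
console.log(query); // C:android.widget.TextView||T:OK
```

Passing "&&" as the second argument produces the separator style used in the sendAai examples in this document.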
FindNode is installed on every device (as part of the Total Control app). It is the only program that recognizes the query syntax: it parses the query, locates nodes and performs actions on the nodes found. FindNode offloads JavaScript complexity and CPU utilization from Total Control, since all searches are conducted on the devices.
device.sendAai() and devices.sendAai() are the direct way to communicate with FindNode on one device or a list of devices. The JS object is translated to JSON before it is sent to the device, and the returned value is a JS object. If an error is encountered, the return value is null and lastError() contains the error message.
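The null-on-error convention means every call should be checked before its result is used. One way to centralize that check is sketched below; `sendAaiOrThrow` is a hypothetical wrapper, `getLastError` stands in for Total Control's lastError(), and the stub device with its "missing query" message is invented only so the sketch runs outside Total Control.

```javascript
// Hypothetical wrapper: turn the null-on-error convention into an exception.
// `getLastError` stands in for Total Control's lastError().
function sendAaiOrThrow(device, request, getLastError) {
  const result = device.sendAai(request);
  if (result === null) {
    throw new Error("sendAai failed: " + getLastError());
  }
  return result;
}

// Stub device for running outside Total Control (behavior is made up):
// it fails when the request has no query and succeeds otherwise.
let stubError = "";
const stubDevice = {
  sendAai(request) {
    if (!request.query) {
      stubError = "missing query";
      return null;
    }
    return { retval: true };
  },
};
const stubLastError = () => stubError;

console.log(sendAaiOrThrow(stubDevice, { query: "T:OK", action: "click" }, stubLastError));
try {
  sendAaiOrThrow(stubDevice, { action: "click" }, stubLastError);
} catch (err) {
  console.log(err.message);
}
```

In a Total Control script, the real device object and lastError would be passed in place of the stubs.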
A simple query example: to obtain the text of the model name, use an X offset of 1 (the next node to the right):
>> device.sendAai({query:"T:Model name||OX:1", postAction:"getText"})
{retval: 'Galaxy S10+'}
FindNode can even detect the fixed icons at the top/bottom of the screen:
>> device.sendAai({query:"LB:-1", action:"getText"})
{retval: ['Chats','Calls','Contacts','Notifications']}
The following 3 commands do the same thing, clicking on the "Calls" text:
>> device.sendAai({query:"LB:-1&&T:Calls", action:"click"})
{retval: true}
>> device.sendAai({query:"LB:-1&&IX:1", action:"click"})
{retval: true}
>> device.sendAai({query:"LB:-1&&T:Chats&&OX:1", action:"click"})
{retval: true}
Click the "Contacts" icon:
>> device.sendAai({query:"LB:-1&&T:Contacts&&OY:-1", action:"click"})
{retval: true}
>> device.sendAai({query:"LB:-2&&IX:2", action:"click"})
{retval: true}
>> device.sendAai({query:"T:Contacts&&IX:-1&&OY:-1", action:"click"})
{retval: true}
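The getText and click examples above can be combined: read the bottom-bar labels first, then click the one at the matching index. This is a minimal sketch under stated assumptions: `clickBottomItem` is a hypothetical helper, `fakeDevice` is a stub that returns the labels from the example above so the code runs outside Total Control, and the IX index is assumed to follow the order of the returned labels (as "LB:-1&&IX:1" selecting "Calls" suggests).

```javascript
// Hypothetical helper: click a bottom-bar item by its label.
// Returns false if the label is absent or any sendAai call fails (null).
function clickBottomItem(device, label) {
  const labels = device.sendAai({ query: "LB:-1", action: "getText" });
  if (labels === null) return false;
  const index = labels.retval.indexOf(label);
  if (index < 0) return false;
  const clicked = device.sendAai({ query: `LB:-1&&IX:${index}`, action: "click" });
  return clicked !== null && clicked.retval === true;
}

// Stub device that mimics the responses shown in the examples above.
const sent = [];
const fakeDevice = {
  sendAai(request) {
    sent.push(request.query);
    if (request.action === "getText") {
      return { retval: ["Chats", "Calls", "Contacts", "Notifications"] };
    }
    return { retval: true };
  },
};

console.log(clickBottomItem(fakeDevice, "Calls")); // true
console.log(sent);
```

Matching by text with "LB:-1&&T:Calls" is shorter; the two-step form is useful when a script wants to confirm that the label exists before clicking.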
Please read the FindNode User Guide for more information.