AO Jobs


What goes into a solid bot testing strategy – By QA Engineer, Chris.


Hi, I’m Chris, and I’ve spent the last 9 months testing AO’s native Chatbot.
After completing a graduate employment scheme with a Chorley-based logistics company, I moved over to AO, taking on the role of QA Engineer.

Devising the initial strategy was an exciting opportunity, because knowledge of chatbots and chatbot testing within the company was limited.
When we started, the experience was as exploratory as it gets. The current approach is still a work in progress, but out of that trial and error come the five most valuable tips I can offer for testing a Chatbot.


1) Have a clear understanding of what is going on within the Chatbot

As development progresses and the functionality within a Chatbot grows, it is vital that a mechanism exists to validate the behaviour of the backend system.

When the user inputs a word or phrase, we expect the Chatbot to respond in a specific way. We can validate this with an integration test solution that communicates directly with the backend API. This lets us make assertions against actual responses from the backend engine, giving us confidence that everything is performing as expected from a backend perspective.
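As a rough sketch of what such an integration test can look like (this is not AO's actual suite): the example below starts a stand-in backend on a local port and makes assertions against its replies. The `/message` endpoint, the JSON shape and the canned replies are all assumptions made for illustration. Against a real bot you would point `ask_bot` at your own API instead of the stub.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical canned replies standing in for the real backend engine.
CANNED_REPLIES = {"track my order": "Sure, what is your order number?"}
FALLBACK_REPLY = "Sorry, I didn't understand that."


class StubBotHandler(BaseHTTPRequestHandler):
    """Stand-in backend: answers POST requests with a JSON reply."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        text = payload["text"].strip().lower()
        body = json.dumps({"reply": CANNED_REPLIES.get(text, FALLBACK_REPLY)})
        encoded = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(encoded)))
        self.end_headers()
        self.wfile.write(encoded)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


def ask_bot(host, port, text):
    """Send one user message to the backend API and return the reply text."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request(
            "POST",
            "/message",  # assumed endpoint name
            body=json.dumps({"text": text}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(connection.getresponse().read())["reply"]
    finally:
        connection.close()


def run_integration_checks():
    """Start the stub, assert on its replies, and return what was received."""
    server = HTTPServer(("127.0.0.1", 0), StubBotHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address[0], server.server_address[1]
    try:
        known = ask_bot(host, port, "Track my order")
        unknown = ask_bot(host, port, "asdfgh")
        assert known == "Sure, what is your order number?"
        assert unknown == FALLBACK_REPLY
        return known, unknown
    finally:
        server.shutdown()
        server.server_close()
```

Checking the fallback reply as well as the happy path is deliberate: it confirms the backend behaves sensibly when the input is something it has no content for.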

In addition to a strong integration test suite, here on the AO Chatbot team we have a bespoke dashboard that provides further insight into the interactions taking place within the bot. It lets us see what users are typing into the bot and look into live conversations. Data gathered from live usage then steers our direction: it shows what customers are asking and whether we have the right content to satisfy their enquiries. We classify this as a form of live continuous testing.

Without an understanding of how customers are using your Chatbot, you will never know whether the content you are serving is useful and satisfies the customer’s query.


2) Validate that the user experiences the software as you intend

So, you have faith in the backend? What about what the front end does with the information it receives?

Tools such as Selenium are great for functional testing, but they leave much to be desired when it comes to validating how a front end is rendered. Selenium can tell whether an element is displayed and enabled on screen, but it will not fail a test case when an element is rendered incorrectly (e.g. misaligned).

Through trial and error, I have found image comparison solutions to be a very effective answer to this issue. These tools take a snapshot of the Chatbot and compare it with a prepared image of exactly how we expect the chatbot to look when it is given a specific piece of content to respond to.

Where things such as timestamps are shown against messages, these will naturally vary depending on:
1) When the expected response image was captured
2) When the image comparison tests are run

Image comparison solutions often let you set a percentage tolerance, which allows for variation in timestamps between the expected image and the actual image. Alternatively, they can be configured to require a like-for-like match when there are no variables to cater for. Which you choose will depend on how your chatbot is designed and functions.
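To make the tolerance idea concrete, here is a minimal sketch that treats an image as a plain grid of pixel values and fails the comparison only when the share of differing pixels exceeds the allowed percentage. The function names are made up for illustration, and real image comparison tools work on actual screenshots with their own options, so treat this as the principle rather than any particular tool's behaviour.

```python
def diff_percentage(expected, actual):
    """Return the percentage of pixels that differ between two images.

    Each image is a list of rows, and each row is a list of pixel values.
    """
    same_shape = len(expected) == len(actual) and all(
        len(row_e) == len(row_a) for row_e, row_a in zip(expected, actual)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    total = 0
    different = 0
    for row_e, row_a in zip(expected, actual):
        for pixel_e, pixel_a in zip(row_e, row_a):
            total += 1
            if pixel_e != pixel_a:
                different += 1
    return 100.0 * different / total if total else 0.0


def images_match(expected, actual, tolerance_percent=0.0):
    """True when the difference is within tolerance (0.0 means like for like)."""
    return diff_percentage(expected, actual) <= tolerance_percent


# A 5x4 "screenshot" where one pixel (say, part of a timestamp) has changed.
expected_image = [[0] * 5 for _ in range(4)]
actual_image = [row[:] for row in expected_image]
actual_image[3][4] = 1

strict_result = images_match(expected_image, actual_image)
tolerant_result = images_match(expected_image, actual_image, tolerance_percent=5.0)
```

With one changed pixel out of twenty, the strict like-for-like comparison fails while the 5% tolerance passes. The trade-off is that a tolerance wide enough to absorb a timestamp can also hide a small genuine rendering defect, so it is worth keeping it as tight as the variable content allows.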


3) Proactively seek a TDD (Test Driven Development) approach

Andrea Kutaisi’s article, which states ‘All code is guilty until proven innocent’, is a perfect place to start when discussing this tip.

To be ready for test driven development, you must first have requirements and test cases defined. These are a crucial part of the method, as they form the foundation of the failing test cases we must create before any production code is written.

An example of where I have found this methodology extremely useful:

There is a requirement for the user to receive an audible alert when a new message is received within the Chatbot. In this instance we can use a tool such as Cypress to simulate the user activity that triggers a response from the chatbot, then assert that the sound event is triggered. Until production code is written, this test case will fail.

Once the test case has been implemented and is in a failing state, writing production code can commence. This ensures that development is geared towards getting the test case to pass. A passing test case then indicates that the requirements of the software/code have been satisfied.
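The same test-first cycle can be sketched outside the browser. The example below is not Cypress and not AO's code: `ChatWindow` and `SpySoundPlayer` are hypothetical names, and the sound player is a simple test double that records what it was asked to play. The test is written first; it fails until the marked line in `receive_message` exists.

```python
import unittest


class SpySoundPlayer:
    """Test double that records which sounds were requested."""

    def __init__(self):
        self.played = []

    def play(self, sound_name):
        self.played.append(sound_name)


class ChatWindow:
    """Hypothetical production class for the chat front end."""

    def __init__(self, sound_player):
        self.sound_player = sound_player
        self.messages = []

    def receive_message(self, text):
        self.messages.append(text)
        # Written only after the test below was seen to fail:
        self.sound_player.play("new-message")


class NewMessageAlertTest(unittest.TestCase):
    """The requirement expressed as a test, created before the feature."""

    def test_alert_sounds_when_a_message_arrives(self):
        player = SpySoundPlayer()
        window = ChatWindow(player)

        window.receive_message("Hello, how can I help?")

        self.assertEqual(player.played, ["new-message"])
```

Removing the `play` call shows the failing state the tip describes; putting it back turns the test green, which is the signal that the requirement has been met.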


4) Ensure your product can scale

You have a great product, but can a mass of people use it?
The answer to that question can be found by having a load test strategy. A load test strategy defines a specified load/demand to place on a system and measures its response.

You could have the best product in the world, but if it is vulnerable under load the Chatbot could fall over or perform below the standard you expect.

The load you should put on your system depends on known traffic rates. Generally, it is a good idea to work with a percentage increase over your current known maximum, to ensure that the product can scale should you unexpectedly experience an influx of traffic/usage.



When carrying out this kind of testing, it is important to simulate network traffic as realistically as possible. Generally, you would not experience a 5000% increase in network traffic all hitting the Chatbot at one time. Realistically, there may be a slight initial influx, after which traffic increases incrementally.
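As an illustration of how that could be planned, the sketch below works out a target load as a percentage above a known peak and then spreads it into stages: an initial influx followed by equal increments. All the figures (200 users, 50% headroom, five stages) are made-up examples, and the resulting list is only a plan that you would feed into whichever load testing tool you use.

```python
def target_load(known_peak_users, headroom_percent):
    """Peak to test against: the known maximum plus a percentage of headroom.

    Uses integer arithmetic and rounds up, so both arguments should be whole
    numbers.
    """
    return (known_peak_users * (100 + headroom_percent) + 99) // 100


def ramp_profile(target_users, initial_users, steps):
    """Return the simulated user count for each stage of the test.

    The first stage is the initial influx; the remaining stages climb in
    equal increments until the target is reached.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if not 0 <= initial_users <= target_users:
        raise ValueError("initial_users must be between 0 and target_users")
    increment = (target_users - initial_users) / steps
    return [round(initial_users + increment * stage) for stage in range(steps + 1)]


# Example figures only: a known peak of 200 concurrent users, tested with
# 50% headroom, starting from an influx of 60 users and rising in 4 steps.
peak = target_load(200, 50)
stages = ramp_profile(peak, initial_users=60, steps=4)
```

Here `peak` comes out at 300 users and `stages` at 60, 120, 180, 240 and 300, which is closer to how real traffic builds than sending the full load in one burst.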


5) Use a combination of black and white box testing

Pairing with developers provides an extra pair of eyes that can potentially spot loopholes in logic and offer suggestions based on observations. This benefits the quality of the product, as a bug prevented is better than a bug caught.

Relating back to the TDD tip above, this can be incorporated into development collaboration. While development is taking place, it is useful to have the specified test cases to hand. They can be referred to throughout the development process, ensuring that features meet all requirements and that the output of development work matches the expected results of the shared test cases.


I personally find this the better approach compared with leaving developers to implement features and then testing from a black box perspective later down the line. Carrying out white box testing allows for an understanding of the system and code. Logic loopholes can be quickly identified and patched at this stage, as opposed to testing the feature, finding the issue and then putting the ticket back into a development state, which costs money and makes the overall quality assurance process less efficient.

I hope this helps anyone out there looking to test chatbots! If you want to access any other technical engineering blogs we have available just visit our Tech blog –
