If you just had 1 Testing question to hire/reject a QA candidate

It is unfair to judge a candidate through just one challenge or exercise, but imagine that you are in a (non-violent & harmonious) Squid Game situation, and as the hiring manager you were only allowed one Testing challenge to pose to the candidate,

what would that be and why?

Something that is related to the Testing craft, can be applied agnostic of the experience level of the candidate, and can be used as a vehicle to elicit their core testing mindset.

For me, it goes something like the below…

  1. I will draw a whiteboard diagram of the product or system under test
  2. I will explain a typical end-to-end use case of the product/system
  3. I will explain the integrations and touch points that the system has with other sub-systems/products

and then I would commence the challenge with an open-ended question:

“What do you think could go wrong with this Product/System?”

Good testers that I have had the fortune to hire & work with usually engage with this exercise along the below lines:

  • They will probe the context in which this question is being asked, and try to understand what “wrong” means here, i.e. are we talking about functionality going wrong? Scalability of the system? End-user experience? Data integrity? Security of the components? Deployment & availability?

  • They will try to understand how and at what stages a human interacts with the system, and in which roles (UI end user, admin, deployment, tech support).

  • They will ask counter-questions: how does data flow through the system? Architecturally, how do the integrations work, and to which spec? Is there a shared understanding of the API specs? Which operations can be performed on the data? Where is it stored? How is it retrieved & displayed?

  • They will inquire about testability & monitoring of the system and its sub-components. How do I know data has travelled from A to B in the system? What does A hear back from B when the transaction finishes? How are errors logged, retrieved, cleared?

  • They will frame questions around understanding change to the system. What is our last working version in this context? Which patterns of past failures might be relevant here? How do we track changes to the code, config, and test environments of the product/system?
  • They will try to establish the modes of failure of the components of the system and how to simulate them, and how the system is deployed and redeployed.
  • They will delve into what happens when parts of the system are loaded or soaked, e.g. exposed to heavy user interaction, to voluminous bulk-data transactions, or to constraints on infrastructure availability/scalability.

These are just some of the rudimentary but important aspects of critical thinking that I would expect from promising or established Testers.

Of course, a holistically capable Tester’s skills go way beyond the above points, but this challenge has served me as a handy guide and screener during interviews, and usually sets the trajectory for the remainder of the interview.

Lessons learnt from a POC to automate Salesforce Lightning UI

My recent client work has been on testing a migration (data & business processes) to the Salesforce CRM platform.

As part of Test execution, I took the initiative to build a POC to exercise automation of Salesforce, both by interacting with the Lightning UI and via the Salesforce Apex API interface.

This post is to share the hurdles I faced and lessons I learnt in building the POC for UI automation.

1. Choice of tools – Cypress.io & Selenium WebDriver

I exercised two tool sets that I am experienced with for UI automation – Cypress.io and the Selenium WebDriver API (using Python).

I could not go far with Cypress, as it has limited support for iframes (by design), covered in this open issue thread.

Basically, as soon as I automated the login process, Cypress terminated with the error “Whoops there is no test to run”.

https://github.com/cypress-io/cypress/issues/2367

I tried some of the workarounds mentioned in the thread that had worked for other folks, but without success.

So, once I exhausted my time box for Cypress, I moved on to Selenium WebDriver.

2. Bypassing Email verification after login

The first hurdle I hit was the email 2FA that was set up on the Salesforce sandbox environment I was testing.

If I had been automating at the API layer, there are various secure ways to authenticate (e.g. API tokens), but for email verification from the UI, Salesforce bases the check on IP addresses. So, to work around that, I had to create another test user & profile that either had my IP whitelisted or had no IP filtering at all.

Instructions from here were helpful –> https://github.com/neozenith/testcafe-sfdc-poc
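
Once the IP restriction was out of the way, the login step itself was straightforward. Below is a minimal sketch of the WebDriver login flow; the sandbox URL, credentials and element ids are illustrative placeholders rather than the exact values from the POC.

```python
# Minimal sketch of the Selenium login step against a Salesforce sandbox,
# assuming a test user whose profile has the IP restriction relaxed.
# URL, credentials and element ids are illustrative placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://test.salesforce.com")

driver.find_element(By.ID, "username").send_keys("poc.user@example.com.sandbox")
driver.find_element(By.ID, "password").send_keys("not-a-real-password")
driver.find_element(By.ID, "Login").click()
```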

3. Explicit waits & dealing with dynamic UI ids

It goes without saying that we have to explicitly wait for elements to load in order to create robust tests; I had to write heaps of explicit waits to interact with the Lightning UI (as it is a very “busy” UI).

Another interesting quirk I found was that even though some of the elements I wanted to interact with had unique ids that I could use as selectors, those ids – especially for modal dialogs – were being generated dynamically, most likely per login, something I only discovered later through flakiness in my tests.

For example, the following piece of code, although using an id, was flaky because WebDriver could not find the same id across sessions: the value “11:1355;a” would change to “11:3435;a” on subsequent test runs for the same modal dialog box.
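
A minimal reconstruction of what that flaky lookup looked like (the target element and id value are illustrative, and the driver is assumed to be the logged-in session from the sketch above):

```python
# Flaky: locates a modal dialog button by its id, but Lightning generates
# these ids dynamically (e.g. "11:1355;a" in one session, "11:3435;a" in
# the next), so the locator breaks across sessions.
from selenium.webdriver.common.by import By

save_button = driver.find_element(By.ID, "11:1355;a")
save_button.click()
```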

So instead I went with a different approach: do not rely on the id for the dynamic modal dialogs, but search by XPath in this case and wait for the element to be clickable.
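
A sketch of that approach; the XPath keys off the button’s visible label rather than its generated id, and is illustrative rather than the exact locator from the POC:

```python
# More robust: wait explicitly until the modal dialog button is clickable,
# locating it by a stable attribute (its label) instead of a dynamic id.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 20)
save_button = wait.until(
    EC.element_to_be_clickable((By.XPATH, "//button[@title='Save']"))
)
save_button.click()
```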

That worked, and finally I was able to complete a basic UI automation flow of logging in, interacting with dynamic elements, adding user input and asserting some test conditions šŸ™‚

Heuristics for debugging integration problems

Outstanding Testers (that I have had the chance to work with or coach) did not just “report that there was a fire”; they were skilled at investigating and communicating –

  • How long has the fire been burning?
  • What is the scale of the impact?
  • Which areas are affected vs not?
  • What is the nature of the impact?
  • When did it start?
  • When did we last check?
  • What could have caused it?
  • What could we do better next time to help answer the above questions (when the next fire hits)?

For Exploratory Testing, one of the key challenges in testing an unfamiliar (and complex) system is ascertaining where to look for the source of an error, for debugging and root cause analysis purposes.

From my experience in testing multi-technology integrated systems, I have put together a bunch of generic heuristics that I use to investigate and look for information that helps in debugging and contributes towards articulating the root cause of end-user errors.

1. “Top-down” heuristics

By top-down (in this context) I mean debugging the application stack of the system component where the symptom has cropped up.

The intention here is to ask questions that ascertain whether the root cause lies in this vertical slice of the architecture. If it does not, we can start looking at the second set of heuristics (i.e. the integration of the current system component with other components of the solution architecture).

  • Symptom Repeatability – Are you able to repeat the error consistently from the UX? Which browser + platform combination is the best bet to reproduce the symptom?
  • API traffic for the stack – Which underlying API endpoints are called by the stack’s UX when the error happens? Are those endpoints responding? What do the browser developer tools (or alternate methods) tell you about the request payload and response when the error happens? Invoke the API endpoint directly (with exactly the same request payload) and compare the response with the one received via the UX (a sketch of replaying a call this way follows this list). Are there any errors logged in the developer tools console? Are those errors related, and how do you know?
  • DB transactions within the stack – Which tables is the API supposed to write to? Which fields? Are those tables/fields being correctly populated? Are your DB schema definitions up to date and correct? If a stored procedure is supposed to be called, is it actually being called, and how do you know? Do you log API/database errors in the database? If yes, have any errors been logged when the UX error happens? If not, you should advocate persistent logging of errors for debuggability with your Product Owner.
  • Last working version of the stack – What was the last working version of the stack, i.e. the one that did not have this error? Revert the stack to that version; can you still reproduce the error? If not, hold a peer review of the changes since then. Have you got automated checks that tell you the status of all the versions between the working and non-working ones? By reviewing those checks, or by manually changing one variable at a time, can you pinpoint the version of the stack in which this error started?
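
To illustrate the API-traffic heuristic above, here is a minimal sketch (in Python with the requests library; the endpoint, payload and captured UX response are hypothetical placeholders) of replaying the same call the UX makes and comparing the two responses:

```python
# Sketch: replay the API call observed in the browser dev tools with exactly
# the same payload, then compare it against the response the UX received.
# Endpoint, payload and captured response body are hypothetical placeholders.
import json
import requests

endpoint = "https://app.example.com/api/v1/orders"
payload = {"customerId": "C-1001", "items": [{"sku": "SKU-42", "qty": 3}]}

# Response body captured from the failing UX call (e.g. via dev tools "Copy response")
ux_response_body = json.loads('{"status": "error", "code": "TIMEOUT"}')

direct = requests.post(endpoint, json=payload, timeout=30)
print("Direct call status:", direct.status_code)

if direct.json() != ux_response_body:
    print("Responses diverge - suspect the UI layer, session state or request construction")
else:
    print("Same failure via the direct call - suspect the API or the layers below it")
```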

2. “End-to-end” architecture heuristics

OK, so running our top-down (through the stack) debugging checks did not yield success; now we need to inspect the integration points and the other system components that your application interacts with.

  • Data flow and events across integration points – Do you have (or can you draw) a solution architecture diagram to confirm which other system components your application stack deals with? When the error happens, can you confirm what data and events your application is expecting from the system components it is integrated with? Is your application actually receiving the data it expects? Is the data in the right format? When the error happens, can you confirm which data is being written to which other system components? Is it actually happening, and how do you know? Is there logging evidence to confirm the answers? If not, you should advocate persistent logging of errors for debuggability with your Product Owner.
  • Last working version of the architecture – Do you know the last working versions (i.e. not displaying this error) of all the integration points and system components? Can the whole architecture be rolled back to a working version? Have you got automated checks that tell you the status of all the system components between the working and non-working copies of the architecture? By reviewing those checks, or manually, can you pinpoint in which version, or by which change of a system component/integration point, this error started?
  • Completeness of the architecture – Is the architecture complete, i.e. are all the system components and integration points responding? Is there logging to confirm (or negate) that there is no missing system component or disabled integration point? If not, have a discussion with your solution architect on how this could be improved to aid debuggability.
  • Non-functional/timing activities across the architecture – When the error happens, are there any resource-intensive (CPU, memory, disk I/O) processes running and/or kicked off in other parts of the architecture? How can you monitor resources across the components and integration points (see the monitoring sketch after this list)? How do you know whether those resource-intensive processes have completed or are stuck? Where do you look for evidence of failure of those processes/tasks? Are there any timeouts involved, i.e. is any system component awaiting a response from another and not getting it? Is there logging to that effect? If not, you know what to do šŸ˜‰
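
To illustrate the resource-monitoring question above, here is a minimal sketch using the psutil library (the sampling interval and duration are arbitrary placeholders) that snapshots CPU, memory and disk I/O on one host while you reproduce the error:

```python
# Sketch: sample CPU, memory and disk I/O while reproducing the error, to spot
# a resource-intensive process coinciding with the failure.
# Interval and sample count are arbitrary placeholders.
import time
import psutil

for _ in range(12):  # roughly one minute of samples at 5-second intervals
    cpu = psutil.cpu_percent(interval=None)
    mem = psutil.virtual_memory().percent
    io = psutil.disk_io_counters()
    print(f"cpu={cpu}% mem={mem}% read_bytes={io.read_bytes} write_bytes={io.write_bytes}")
    time.sleep(5)
```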

Eye the fish’s eye

Arjuna (a prince from the Indian epic Mahabharata) was a highly skilled warrior, his primary skill being archery.

In the epic there is a chapter on how he won the princess’s hand by competing in a royal competition.

The task was to shoot a fish in the eye (the fish was perched on top of a pole and rotating) by looking at its reflection in the water.

Hard, eh?

Arjuna did it and won his prize. He accomplished it by applying his archery skills through his legendary powers of concentration. Legend says that only Arjuna could have done it, because he could concentrate perfectly on the fish’s eye and nothing could distract him (no direct vision of the target, the target being in motion, a high-pressure situation, etc.).

We testers encounter similar challenging situations often.

We have no direct vision of the target == only a symptom of the problem to start with.

target is in motion == bugs “move” as software changes, bugs “hide”, bugs “hide” and “come back”

pressure is usually high

But one thing is different: we cannot look at just the fish’s eye, or at just one fish’s eye.

While investigating a bug, a tester needs to constantly focus, defocus, or maintain multiple simultaneous points of focus all over the SUT. Pardon the cliché, but we need to see the bigger picture as well.

What is the UI doing, what do the logs say, what does the task manager say, what is passing through the input/output routes, what other system elements is the problem impacting, what is the impact on the project, what would be the impact on the customer, why has the problem come back, is it really a problem?

Are we looking at a fish with multiple eyes? Or even a shoal (we are looking at one fish’s eye when we see another fish)?

I think we are!