Software Quality Assurance SQA

Thursday, May 22, 2014

What's a Unit Test?

Well, what's a "Unit"?

A unit is a sub-component of a modular architecture. Wikipedia defines a software unit as "...the smallest testable part of an application." (http://en.wikipedia.org/wiki/Unit_testing)

If a car engine is just a single object and all of its components are fabricated (built) together, it's not modular. That is, if you can't build the components separately there are no sub-units. You're left with the ineffective and costly testing method of system testing the entire engine. But you can't effectively test an alternator by driving the car.

Car manufacturers get this. They design automobiles to be modular. This means they can design, build, and test individual components and they can do this in parallel. It also means individual components can be repaired or replaced in the field. Car manufacturers often outsource unit production. They can do this because they have a modular design. They design the required interfaces and behaviors of the sub-components and provide those specifications as requirements to the sub-component's manufacturers. Those specifications inform the sub-component manufacturers' unit test planning, including test jig design. (A test jig is an instrumented tool that connects to and exercises a sub-component's interfaces.) Unit testing is done before the sub-component is sent to the car manufacturer to be bolted into the car. It's the sub-component manufacturers' responsibility to assure their product's quality.

Take the example of a car's alternator. Chevrolet provides specifications to ACDelco. Those specifications include the overall dimensions, the size and positioning of the mount points, the diameter of the belt sheave, the electrical output at different speeds, heat output, the electrical cable dimensions and connections, and so on... ACDelco creates unit tests to assure their alternators meet Chevy's specifications. They build a test jig based on the specifications. They use the jig to execute unit tests before shipping their alternators to Chevy. Notice that although Chevy may do further testing, it is ACDelco's responsibility to test their product. It's. Their. Product.

So modularity is a requirement for software unit testing. If you can only build the monolithic application, you can't unit test. A software architecture must allow modules to be individually built for them to be individually tested. The developer is responsible for writing unit tests alongside product content. This investment pays us back every time the unit tests find a bug.
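
Here's a minimal sketch of what that looks like in software, sticking with the alternator. Everything in it is invented for illustration: the function alternator_output_volts, the numbers in SPEC, and the 14.4 volt cap are assumptions, not anybody's real specification. The point is only that the unit is built and exercised on its own, with the test class playing the role of the jig.

```python
import unittest


def alternator_output_volts(rpm: int) -> float:
    """Hypothetical unit under test: output voltage as a function of shaft speed."""
    if rpm < 0:
        raise ValueError("rpm must be non-negative")
    # Invented behavior: output ramps up linearly, then a regulator caps it at 14.4 V.
    return min(14.4, rpm * 0.008)


# Invented specification: (shaft speed in rpm, required output in volts).
SPEC = [(0, 0.0), (1000, 8.0), (1800, 14.4), (6000, 14.4)]


class AlternatorSpecTest(unittest.TestCase):
    """Plays the role of the test jig: drives the interface, checks the spec."""

    def test_output_matches_spec(self):
        for rpm, volts in SPEC:
            self.assertAlmostEqual(alternator_output_volts(rpm), volts, places=6)

    def test_negative_speed_rejected(self):
        with self.assertRaises(ValueError):
            alternator_output_volts(-1)


def run_unit_tests() -> bool:
    """Run the suite and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlternatorSpecTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```

Notice that nothing here needs the rest of the "car" to exist. If the only thing you can build is the whole application, you can't write this test.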

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Wednesday, May 14, 2014

New PEST Features

I just checked in two powerful new PEST features:

wait

This causes the harness to pause to wait on a list of tests before starting the one containing the wait attribute.

haltfail

This causes the entire run to be aborted if the process containing the haltfail attribute fails, forcibly killing all remaining processes and exiting the harness. Thus, the testbed drops into forensic state for debugging.
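
To make the two rules concrete, here's a rough model of them in Python. This is not PEST's code (PEST is written in Perl and runs remote processes over SSH, in parallel). It's a serial sketch under my own assumptions: ProcDef and run_all are hypothetical names, and a "test" is just a callable that returns an exit code. Only the attribute names wait and haltfail come from the harness.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProcDef:
    name: str
    run: Callable[[], int]                          # returns a process-style exit code
    wait: List[str] = field(default_factory=list)   # tests that must finish first
    haltfail: bool = False                          # abort the whole run if this one fails


def run_all(procs: List[ProcDef]) -> Dict[str, str]:
    """Run serially, honoring wait and haltfail. Returns name -> status."""
    status: Dict[str, str] = {}
    pending = list(procs)
    while pending:
        # Pick the first entry whose whole wait list has already finished.
        ready = next((p for p in pending if all(w in status for w in p.wait)), None)
        if ready is None:
            raise ValueError("wait list refers to a test that never runs")
        pending.remove(ready)
        status[ready.name] = "pass" if ready.run() == 0 else "fail"
        if ready.haltfail and status[ready.name] == "fail":
            # Abort: nothing else is started, so the testbed stays as it was.
            for p in pending:
                status[p.name] = "aborted"
            break
    return status


results = run_all([
    ProcDef("setup", lambda: 0),
    ProcDef("check", lambda: 1, wait=["setup"], haltfail=True),
    ProcDef("cleanup", lambda: 0, wait=["check"]),
])
print(results)
```

In this run "cleanup" never starts, because "check" failed and carries haltfail. That's the whole idea: the cleanup step doesn't get a chance to destroy the evidence.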

Perl System Test (PEST) is a test harness, written in Perl, for running processes on a distributed testbed via SSH. It's available for free on Sourceforge:

Project Page

https://sourceforge.net/projects/perlsystemtest/

Wiki

https://sourceforge.net/p/perlsystemtest/wiki/Home/

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Thursday, May 1, 2014

Test Data Types

There are two kinds of test data generated by testing. As I've said elsewhere in this blog, it's the Release Candidate that separates them. If we want to run an efficient testing organization, we need to understand how we use each kind of data so we can decide how to store and summarize it in managing our software projects.

Pre-Release Candidate Testing

Testing we do prior to having a completed product doesn't count toward qualifying the product for shipping to customers. We may do a lot of testing during development. We may test every integration build or every check-in. That produces a lot of test results data. Testing during this phase focuses on maintaining product quality during development, so we're trying to find bugs. We're managing by exception. We want to know about regressions as soon as possible. We're very interested in the failures and not at all in tests that pass. We file bugs for the failures. The passing tests we ignore.

The Data - We need to keep test failure forensic data (cores, logs, config, test logs, etc...) for debugging and associate it with the bugs we file. But we have no use for the test results data from the tests beyond that. After filing bugs the data not associated with the bugs can be discarded.

Release Candidate Testing

By contrast, testing we do on a Release Candidate is official. We very much care about all the test results and we want to see them all pass. We want to manage execution of a Test Plan through to completion. After all, the goal of this testing is to ship the product. We should have already found the bugs. We're hoping to ship the product and we need all the test results to prove that it's ready.

The Data - We need to keep the test results data forever. We're going to use it for comparison in future regression testing. These are the results that customers and partners will want to see. Of course, we keep the forensic data with bugs as usual but unlike the test results generated by Pre-Release Candidate testing, we need to keep all of the results data too.
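
The retention rule for the two kinds of data is simple enough to write down. Here's a minimal sketch, assuming a hypothetical Result record with an optional bug ID; the names are mine and the forensic data itself (cores, logs) is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    test: str
    passed: bool
    bug_id: str = ""   # empty when no bug was filed


def results_to_keep(results: List[Result], release_candidate: bool) -> List[Result]:
    """Apply the retention rule described above to one test run."""
    if release_candidate:
        # Official qualification run: every result is kept.
        return list(results)
    # Pre-RC run: keep only what is tied to a filed bug.
    return [r for r in results if r.bug_id]
```

Same test run, same results, two different answers depending on which side of the Release Candidate we're on.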

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Monday, January 20, 2014

Advice to Recruiters

LinkedIn is awesome. I'm a big fan of professional networking. I love the idea of keeping track of people I've worked with and keeping my finger on the pulse of my industry. I don't mind unsolicited emails at all. For the most part, I enjoy hearing about job openings from recruiters even when I'm not actively looking for another opportunity. I often learn about exciting new startups and technical trends that way. I like that. But over the years I've accumulated some advice for recruiters.

Here are a few pointers for recruiters:

Sell the company and the work. Tell the prospective employee why the company and the position will enhance their career and professional well-being. Is the company doing interesting, cutting edge work? Are there interesting people working there? How will the position benefit the recruit? Don't make the mistake of just listing the company's technical demands. If you've done your homework, you already know the prospect is an approximate fit.
Unless there's a private jet involved, don't sell the benefits. They're all the same. Who cares whether they have a foosball table?
Stop asking the prospect to send you an updated Word resume (pre-Internet technology) if the prospect has a LinkedIn account. This is not 1950. Just ask whether the LinkedIn information is current. Web links work. Mine looks like this: http://www.linkedin.com/in/toddshoenfelt/.
Tell the prospective employee where the job is right up front. By this, I mean the city. "The San Francisco Bay Area" is not a city. You're not going to trick anybody into a 2 hour commute, so weed those people out early by being forthcoming with the location. Somebody who lives in Silicon Valley is not going to commute to San Francisco every day no matter how great the fit, certainly not for long.
If the job requires the prospect to move, have a plan, and tell the prospect what it is initially. If there is no relocation assistance (and there never is), say so. You're not going to fool anybody into moving to another state on their own dime. You're not that clever. You should probably entirely stop recruiting people in remote geographic regions anyway. It's a long shot at best.
Respect the prospect's preferred method of contact. For instance, I prefer LinkedIn email. I don't answer the phone unless the number is in my contact list, nor do I answer emails to my personal address. My LinkedIn profile says so.
Don't ask for referrals if we've never successfully worked together. I don't give personal recommendations for people unless I'm familiar with their work. I wouldn't recommend a house painter if I've never seen a house they painted.
Do your homework. Know who you're talking to. If you're recruiting for a technical position, learn about the technologies. Generally, companies and employees want a good technical fit.
If you are an internal recruiter for a big company, promote the candidate aggressively. And I mean you should have interviews lined up within 24 hours. Your external competition will have delivered 3 offers in the time it takes you to circulate a resume among the "hiring managers" and wait for their email replies. (This advice is targeted at the web search company whose name rhymes with "Wahoo!", which ironically really doesn't get how fast things move in the Internet age.)
Recognize that an employment relationship is like any other business relationship. Businesses can choose whom to employ. The employee can choose for whom to work. Experienced people have options and they are interviewing you and your company too.
Respect the prospect's preferred job type. For instance, I'm not currently interested in contract work. My LinkedIn profile says so. Why send me an email for a contract gig?
Realize that being accepted into somebody's professional network is a privilege, one that can be leveraged to your advantage or revoked. A reputation for slipshod work will get you shut out. And once you're out, you're probably not going to get back in.
Be humble. Even if you're recruiting for Apple or Google, don't fool yourself into believing that big names will sell themselves. There are plenty of reasons not to work for big companies. Good employees have to be earned, just like good jobs.
Don't be overly aggressive with the prospect. Very few people are tolerant of pushy salespeople. Being too aggressive may get you shunned permanently.
Always, always, always ask to be added to the prospective employee's LinkedIn network, whether you close the deal or not.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Wednesday, December 4, 2013

Perl System Test (PEST) Harness

First release on Sourceforge!

Perl System Test (PEST) is a test harness, written in Perl, for running processes on a distributed testbed via SSH. Test processes are defined in a simple config file for execution on remote hosts, enabling a tester to run tests serially or in parallel. Processes are as non-blocking and un-buffered as possible. The harness redirects process output to an individual log for each process and logs process metadata. It captures the return code of each remote process and interprets 0 as pass and everything else as fail, enabling it to run tests written in any language. The harness depends on a standard testbed descriptor configuration file to facilitate portability across testbeds.
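
The pass/fail rule and the per-process log are the heart of it, and they're easy to show. What follows is not PEST (which is Perl and runs things over SSH); it's a local Python sketch under my own assumptions, with run_and_log as a hypothetical name. It runs a command, sends its output to its own log file, and reads the return code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


def run_and_log(name: str, command: List[str], log_dir: Path) -> str:
    """Run one process, send its output to its own log, and return 'pass' or 'fail'."""
    log_path = log_dir / f"{name}.log"
    with open(log_path, "wb") as log:
        # stdout and stderr both go to this process's individual log file.
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    # Exit code 0 is a pass; anything else is a fail, whatever language wrote the test.
    return "pass" if completed.returncode == 0 else "fail"


log_dir = Path(tempfile.mkdtemp())
ok = run_and_log("ok", [sys.executable, "-c", "print('hello')"], log_dir)
bad = run_and_log("bad", [sys.executable, "-c", "import sys; sys.exit(3)"], log_dir)
print(ok, bad)
```

Because the only contract is the exit code, the test itself can be a shell script, a compiled binary, or anything else that can return 0.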

PEST is the culmination of over a decade of test automation experience, designed by drawing on background with companies like NetApp, Dell, and HP. Its requirements are the synthesis of all that is good about the automation technologies with which I've worked and none of the bad.

PEST is simple, yet powerful. Anybody can control complex distributed systems with zero programming knowledge.

Project Page

https://sourceforge.net/projects/perlsystemtest/

Wiki

https://sourceforge.net/p/perlsystemtest/wiki/Home/

More information to come...

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, June 21, 2011

Black and White

Software comes in layers. Higher layers operate and are dependent on the layers below. Each layer presents one or more interfaces. We test these interfaces to their specifications.

Black Box Testing

Black box testing means testing the interfaces presented by the outermost layer, the layer operated by a user, for instance. It typically includes testing things like GUIs and CLIs. Accordingly, test cases are defined by the actions you can take in manipulating the interfaces (GUI or the CLI).

Black box testing of an automobile, for instance, would include test cases describing using the steering wheel, gas, and brakes. It would not include test cases directly manipulating the alternator.

White Box Testing

White box testing means testing any of the layers below the outermost layer. These layers are not directly manipulated by a user. Instead, their interfaces are accessed by the other layers. Our test cases are therefore defined by the actions taken in manipulating the interfaces they present to the other layers. This kind of testing requires a test jig.

To continue the car metaphor, white box testing would mean testing engine parts that are under the hood. The alternator, for example, is bench tested outside of the car. The test jig is designed to imitate the car's connections to the alternator. It contains structural mount points to which the alternator can be bolted, an electric motor with a rheostat (continuous speed control) and pulley to which the alternator's belt can be attached, and electric leads to an amp/voltage meter to monitor electrical output. Test cases would include manipulating the rheostat and evaluating amp meter readings.
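
In code, the difference is just which interface the test touches. Here's a small sketch with two invented layers: press_gas is the outer layer a driver operates, and alternator_amps is an inner layer the driver never calls. The functions and all the numbers are made up for illustration.

```python
def alternator_amps(shaft_rpm: int) -> float:
    """Inner layer: never called by the driver directly."""
    return min(100.0, shaft_rpm / 50.0)


def press_gas(pedal_fraction: float) -> dict:
    """Outer layer: the interface a driver actually touches."""
    engine_rpm = 800 + int(pedal_fraction * 5200)
    return {"engine_rpm": engine_rpm, "charging_amps": alternator_amps(engine_rpm * 2)}


def black_box_test() -> bool:
    # Only the outermost interface is manipulated.
    return press_gas(0.0)["engine_rpm"] == 800 and press_gas(1.0)["engine_rpm"] == 6000


def white_box_test() -> bool:
    # The "jig": set the shaft speed directly, like turning the rheostat,
    # and read the meter at each setting.
    readings = {rpm: alternator_amps(rpm) for rpm in (0, 2500, 5000, 10000)}
    return readings == {0: 0.0, 2500: 50.0, 5000: 100.0, 10000: 100.0}
```

The white box test can hit shaft speeds the gas pedal may never produce, which is exactly why the jig exists.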

Understanding our product's software layers enables us to weigh the costs and benefits of black and white box testing and helps us allocate our test resources to best effect. I'll discuss the costs and benefits in a future post.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Thursday, March 31, 2011

Quality - What is it?

"God is in the details" - Ludwig Mies van der Rohe

Mercedes, Apple, Google...why is it we think of their products as exceptionally high quality?

Let's take the example of Mercedes. Is it enough for them to try to deliver a car that never breaks down on the freeway? Do you think that's the end to which they aspire? Is that enough to separate them from, say, Volkswagen? No, that can't be it. VWs don't break down on the freeway either. In fact, isn't that just meeting the bare minimum customer expectation? Isn't it (rightly) assumed that your car won't break down?

If we should be able to expect any car to never break down, then robustness can't be what characterizes high quality.

Can you guess why I love this commercial?

http://www.youtube.com/watch?v=6sr3Rh7Yjnc

Look at the focused attention to detail. There is a product specification that details the width of that gap. It specifies how wide it can be, how narrow it can be, and the permissible variance in width. There are test cases about that gap. Mr. Labcoat is going to send that car back for adjustment and possible redesign if the tests fail and the product doesn't meet those specifications.

"God is in the details."

So is quality.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, March 15, 2011

Development and Testing

There are two distinct phases to QA:

1) Testing we do before Code Freeze
2) Testing we do after Code Freeze

The testing we do after Code Freeze is the only testing that "counts" toward qualifying the product. Before that, our testing is designed to keep product quality high during development, but doesn't "count" because we don't have a completed product. Here's a graph showing how I see the development and testing milestones in the production process.

Branch Milestone

The line at the left of the graph is the point at which we branch our new release. At that point, its content is exactly equal to the previous release. There have been no check-ins, neither new content nor bug fixes. Consequently, all the tests in the previous release still apply, assuming no features were deleted. We test a build of the product that may include special test features and privileges that won't ship with the product. During this development phase, pre-checkin unit tests assure new features work. Post-checkin nightly regression tests assure existing features don't break. Weekly disruptive system testing keeps robustness high. The goal of the testing is to assure bugs are identified with the check-ins that caused them, before reaching Code Freeze. Defects found later are harder to diagnose and assign, expensive to fix, and can seriously delay the release.

Release Candidate Milestone

The next line is where the first Release Candidate (RC) is declared (for the purposes of simplicity and clarity I assume only one RC). At that point all principal development of new features and bug fixes has been checked in, unit tested, and fully regression tested, including upgrade and revert testing. All the previous release's tests still apply along with the new features' tests. All the known bugs that would stop shipment have been fixed. This is the starting point of our Test Plan execution and official product qualification. Here, we're starting to test in order to decide whether we can ship the RC. If show stopper bugs occur, we fix the bug(s), then produce a new RC and restart Test Plan execution.

Release Milestone

The last line is the product release. We have a completed product and all the tests in the Test Plan have been executed on the RC. Bugs found during qualification testing are not severe enough to warrant stopping shipment.

I'll be referring to the graph in future posts in describing how testing relates to the Development and Testing phases and how we can make our test automation count.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Wednesday, March 9, 2011

Utilities

Test utilities are scripts or compiled programs that can be run from the command line that automate routine tasks associated with development and testing. They have several use cases and hence pay us back continuously for the initial investment of creating them. They have the dual advantage of benefiting both manual testers and automation. They can be strapped in to CGI GUIs as well to allow the utilities and tests to be executed from a browser. They can be used in configuration management to maintain the readiness of lab equipment. Here are some of the utilities I've found to be of general use:

Install - Newly install the software on a testbed.

Upgrade - The software is already installed on the system. Upgrade the software to a newer version while maintaining data, state, configuration, and (possibly) service continuity.

Revert - As with upgrade, the software is already installed on the system. Revert the software to an older version. As much of the data, state, configuration, and service continuity as possible should remain intact, net of new features that didn't exist in the version to which we're reverting.

Configure - Set the standard software configuration for our lab after the install script is run. This can include extra configuration specific to testing our software.

Monitor - aka 'Alert Daemon'. Monitor the testbed and tell us when the product is unhealthy. Alert criteria can include things like cores, missing services/processes, or unplanned reboots. The monitor can be designed to run snarf when it sees problems, assuring we don't lose important forensic data or bugs.

Snarf - Copy forensic data off the testbed to somewhere safe, typically a NAS share. Copied data can include logs, config files, binaries, and cores for each node in the testbed.

Reset - Return the nodes in the testbed to their pre-install state, leaving them ready for the install script to be run.
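
As an example of how small these utilities can be, here's a sketch of the core of snarf in Python. It's an assumption-heavy toy: the snarf function, its arguments, and the file names are all hypothetical, and it copies local files rather than pulling them off remote nodes. Files with the same name would overwrite each other in this version.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def snarf(node_name: str, sources: Iterable[Path], share: Path) -> List[Path]:
    """Copy whatever forensic files exist for one node into share/node_name."""
    dest_dir = share / node_name
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sources:
        if src.is_file():   # a missing log should not stop the rest of the copy
            copied.append(Path(shutil.copy2(src, dest_dir)))
    return copied


work = Path(tempfile.mkdtemp())
(work / "messages.log").write_text("node1 log line\n")
saved = snarf("node1", [work / "messages.log", work / "missing.core"], work / "share")
print([p.name for p in saved])
```

A manual tester can run it by hand after a failure, and the monitor or harness can call the same thing automatically. Write once, use everywhere.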

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, February 22, 2011

The Testbed Monitor

With the increasing prevalence of distributed, clustered, or cloud products, monitoring the status of your testbed for problems is more important than ever. Distributed systems are computers, networking, and software working together as a single system. It's very difficult, if not impossible, to monitor the condition of a distributed system competently by hand. Transitory events can be missed. Even obvious defects can go unnoticed.

Testbed Monitor to the rescue.

The Testbed Monitor is a flexible custom tool that detects problems by continually tracking essential processes and services in parallel across all the systems in the testbed. It runs on a management host and not on the testbed (we can't assume the testbed will stay up). The Monitor notifies the user via script output to the console and email when it detects a problem and tracks known problems to assure it doesn't spam the user repeatedly. It has flags to control the conditions the user wants to check. It has the option to 'snarf' forensic data from the testbed to a network share when it finds problems. The Monitor returns 0 to the shell if no problems were found and 1 if one or more problems were found.
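
Two of those behaviors, reporting each problem only once and the 0/1 return code, can be sketched in a few lines. This is an illustration, not the Monitor itself: check_once, exit_code, and the two example checks are hypothetical, and a real monitor would be probing remote hosts and sending email rather than calling lambdas.

```python
from typing import Callable, Dict, List, Set


def check_once(checks: Dict[str, Callable[[], bool]], known: Set[str]) -> List[str]:
    """Run every check; return only the problems not already reported."""
    new_problems = []
    for name, healthy in checks.items():
        if not healthy() and name not in known:
            known.add(name)          # remember it so the user is told only once
            new_problems.append(name)
    return new_problems


def exit_code(known: Set[str]) -> int:
    """0 if no problems were found, 1 if one or more were."""
    return 1 if known else 0


known: Set[str] = set()
checks = {"sshd running": lambda: True, "no core files": lambda: False}
first = check_once(checks, known)
second = check_once(checks, known)
print(first, second, exit_code(known))
```

The second pass sees the same failing check but reports nothing new, so nobody gets the same email every polling interval.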

If you design your test infrastructure right, the Monitor will be useful for manual testers as well as in automation. A manual user launches the Monitor before they start their testing. The execution harness launches the Monitor at the beginning of the test batch. The reservation system uses the Monitor to verify system readiness.

The Monitor has 3 general use cases:

1) Initial - Check the testbed once and exit. This is intended to check for existing testbed issues.
2) Initial Continuous - Check the testbed continuously until killed, including initial conditions.
3) Continuous - Check the testbed continuously until killed, ignoring initial conditions.
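
The three use cases differ only in how they treat what's already wrong when the Monitor starts. Here's a sketch of that distinction, with assumptions stated up front: Mode, monitor, and replay are names I made up, poll() stands in for "look at the testbed now", and max_polls replaces "until killed" so the example terminates.

```python
from enum import Enum
from typing import Callable, List, Set


class Mode(Enum):
    INITIAL = "initial"
    INITIAL_CONTINUOUS = "initial continuous"
    CONTINUOUS = "continuous"


def monitor(mode: Mode, poll: Callable[[], Set[str]], max_polls: int) -> List[str]:
    """Return the problems reported, in order. poll() returns the problems visible now."""
    first = poll()
    if mode is Mode.INITIAL:
        return sorted(first)                 # check once and exit
    if mode is Mode.INITIAL_CONTINUOUS:
        seen, reported = set(first), sorted(first)
    else:
        seen, reported = set(first), []      # pre-existing problems are ignored
    for _ in range(max_polls - 1):           # a real monitor loops until killed
        for problem in sorted(poll() - seen):
            seen.add(problem)
            reported.append(problem)
    return reported


def replay(snapshots):
    """Helper for the example: a poll() that returns canned snapshots in turn."""
    it = iter(snapshots)
    return lambda: next(it)


snapshots = [{"core on node1"}, {"core on node1"}, {"core on node1", "node2 rebooted"}]
print(monitor(Mode.CONTINUOUS, replay(snapshots), max_polls=3))
```

With these snapshots, Continuous reports only the reboot, because the core was already there when it started. Initial Continuous would report both.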

The issues to monitor vary by product. Common things to monitor are:

1) Cores
2) Required services
3) Daemon processes
4) Log entries
5) Bounces (reboots)
6) Server protocols (ICMP, NFS, etc...)

The only effective way to deal with the increasing complexity of distributed systems is increasing automation. The Testbed Monitor is one of the tools you need to tame your distributed beast.

Todd Shoenfelt

http://www.linkedin.com/in/toddshoenfelt

Tuesday, February 15, 2011

Hardware Utilization

Where's your car?

It's in the garage? So you're not using it? Sounds like you could improve your hardware utilization. You should drive it somewhere.

Goofy right?

If you don't have anywhere to go, it's not under-utilized. There's no reason to operate it more than you need to. A resource is only under-utilized if you have a legitimate use for it, but aren't using it.

There are several stages to developing software. Only when we have a release candidate can we test the product (before that, we don't have a product). So there are significant lengths of time during which equipment will be idle and periods in which it will be in great demand. During idle time, our engineers should focus on reducing their automation backlogs or plugging testing gaps by writing new test specifications, investments that pay us back repeatedly.

We shouldn't contrive uses for our testbeds any more than we should contrive uses for our car. Write a test plan and execute to plan. Creating make-work is wasteful. The worst sin is to waste human resources in trying to make hardware look busy. People are more expensive. Let's make their time count.

Automation anyone?

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfe

Monday, August 23, 2010

How To Name a Test Script

As I've said elsewhere in this blog (http://qa4software.blogspot.com/2010/07/test-specifications-dont-have-variables.html), test cases don't have variables, they have specifications. It follows then, that automated test cases also don't have variables. In practice, that means scripts don't take variable arguments and they don't have config files. Config files are just a way to populate variables so when I talk about variables below, I'm also talking about config files, and vice versa.

If we want to assure the quality of our software doesn't degrade over time, we need to do regression testing. Regression tests are deterministic and therefore repeatable. They don't vary. Variable arguments, including those populated via config files, seriously undermine the ability to do regression testing.

Test planning generally works like this:

1) Review the Product Requirements Document and Development Response for new features.
2) Create new test cases in the test case planning database for the new features.
3) Automate the test cases.
4) Execute the automated test cases.

Not all automation approaches are created equal. Take these two execution synopses:

$> testcase123.pl

$> testutility.pl -A 123

In the first case, the script name equals the test case because it doesn't take variables. We can track our test planning and execution by script name from test case design through execution. If we find a bug, we can tell the developer he can reproduce the bug by running 'testcase123'.

In the second case, we can't just refer to the script name when we talk about the test case. The full name of the test case has to be 'testutilityA123', because the script name no longer uniquely identifies the test case. Config files are rarely used to specify just one variable, so we should identify test cases with something like 'testutilityA123B432C484D543'. Of course, that doesn't actually happen. People shorten the name to just 'testutility', as if the variables passed to the utility all result in the same test case. Then they incorrectly record all the test results to the same test case in the test case database.

The usual objection to the former approach is that you have to write more scripts. While that's true, in the second case you have to write and include config files in the test case specifications along with the utility script. There are no config files to keep track of in the first approach. Essentially, the automation of the test has been split into a utility and config files in the second approach.

In the first approach, the script is the test case, so you end up with these files:

testcase123.pl
testcase456.pl

In the second approach, the-script-plus-the-config is the test case, so you have these files:

testutility.pl
config123
config456

You're not really avoiding creating files in the second approach. But you are guaranteeing that QA engineers will fail to track and manage their config files. In the first approach, you can create testcase456 from testcase123 by simply copying the file and making the necessary changes internally.

The next objection to the first approach is that there is much duplicate code between the copied test scripts. The answer to that is utility modules. Factor out the common code to library functions for reuse.
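
Here's roughly what that looks like, sketched in Python rather than the Perl of the synopses above. All of it is illustrative: create_volume is a hypothetical stand-in for whatever the product really offers, and in practice the shared code would live in its own library module with each test case in its own script file.

```python
# --- shared utility code (would live in a library module) ---
def create_volume(size_gb: int) -> dict:
    """Hypothetical stand-in for the product call a real test would make."""
    return {"size_gb": size_gb, "online": size_gb > 0}


def verify_volume(size_gb: int) -> int:
    """Common steps factored out of the test scripts. Returns an exit code."""
    volume = create_volume(size_gb)
    return 0 if volume["online"] and volume["size_gb"] == size_gb else 1


# --- testcase123: takes no arguments and reads no config file ---
def run_testcase123() -> int:
    return verify_volume(100)    # the specification says 100 GB, so the script does too


# --- testcase456: copied from testcase123, then edited internally ---
def run_testcase456() -> int:
    return verify_volume(500)
```

The duplicated part of each test case shrinks to a line or two, and the script name still identifies exactly one test.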

Obviously, I'm a proponent of the first approach. The test script is the fully automated test case, so test cases can be managed as a single file and regression testing is greatly simplified.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Thursday, August 19, 2010

Should I "Break Stuff"?

"We want people in QA who are good at breaking stuff." - Generic Manager

Sorry, but that's just patently incorrect. I've known a lot of QA engineers who think that's their job. As a consequence, they feel unproductive when they're not finding (creating?) bugs. That feeling is often validated by their manager's focus on measuring their performance by their bug rate. This misunderstanding results in searching for and finding a lot of non-bugs and the consequent wasted QA and development cycles. They drive the car into the lake and say, "Look! It doesn't float! That's a bug!" But it's only a bug if the product specifications say the car should float.

Contrast the above attitude with the correct understanding of QA's mission: Our job is to demonstrate the product conforms to its specifications. Of course, among other things, the specifications should include robustness goals. And demonstrating we meet those robustness goals is a legitimate QA function. Demonstrating the product behaves correctly in negative test cases is also legitimate. The reason so many QA engineers and managers focus on "breaking the product", is that they often don't have specifications from which to plan their tests. They don't know whether the car is supposed to float or not. Trouble is, it's their job to know and if they don't, to find out.

QA's value, then, in the correct paradigm is in providing the information necessary to decide whether the product is fit to ship. This results in a different assessment of QA's value that doesn't depend on the bug find rate. It means the tests that pass are every bit as valuable as the tests that fail, a dramatically different mindset from the "breaking stuff" mentality. The QA engineer who believes he must, "break the product" will see no value in tests that pass. He will also see no value in issues he considers trivial, like whether the product is easy to use or whether the documentation is correct. As the product's quality improves with time, he will view his value as diminishing as it becomes increasingly difficult to find robustness bugs.

Can you imagine the QA engineer at the end of the car assembly line banging on the car with a sledge hammer? Why is that absurd? He could easily beat holes into it and claim to have found a defect. If you agree with me that that's not his job, what exactly is his value in the production process?

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, July 27, 2010

Test Portability

"Is your test portable?" - Generic Manager

"Huh?" - Todd

Test cases are dependent on interfaces. A test case is a test of something and that something has a specification. If the interfaces required by a test case exist across multiple testbeds, the test case will be portable. A well documented test case is specific and includes the testbed specification. A test case will be runnable when its testbed requirements are met. If not, not.

If a testbed doesn't meet the required specifications for a test, why would we want to run it there?

If we want to test the bash 'pwd' command, our test cases should specify that the test must be run where the command exists, namely on the Unix/Linux command line (the dependency exists whether we understand it or not). The 'pwd' command doesn't exist on the Windows command line. What does it mean to make the value judgment that the test should be portable to Windows?

Like I said: "Huh?"

I could force our test cases to be portable by creating a virtual shell layer that runs on both Linux and Windows. Yeah, I could do that. It'd be fun. The virtual shell could offer its own 'pwd' command. It would just pass inputs and outputs to and from the Unix/Linux command line. On Windows, it would simulate the Linux command's behavior. The resulting test cases would be portable. Silly, but portable. Windows DOS still doesn't have a 'pwd' command! So why are we testing it there?

Tests cannot be run where their testbed requirements aren't met. Nor should they be. Trying to force test cases to be portable is founded on a misguided value judgment. The portability goal is driven by the laudable desire for efficient reuse, but is based on an imprecise understanding of test case dependencies. There is a mapping between test cases and their required testbeds. It's a better use of our time to improve our understanding of that mapping to assure we're running tests where they should be run than it is to try to contrive tests to run where they shouldn't.
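
That mapping doesn't have to be complicated. Here's a minimal sketch of it: each test case lists the interfaces it depends on, each testbed lists the interfaces it presents, and a test is runnable wherever its needs are a subset of what's offered. The test names, testbed names, and interface labels are all invented for the example.

```python
from typing import Dict, FrozenSet, List

# Hypothetical interface names; each test case lists what it depends on.
TEST_REQUIREMENTS: Dict[str, FrozenSet[str]] = {
    "pwd_prints_working_directory": frozenset({"unix shell", "pwd command"}),
    "dir_lists_files": frozenset({"windows command line", "dir command"}),
}

# Hypothetical testbeds and the interfaces each one presents.
TESTBEDS: Dict[str, FrozenSet[str]] = {
    "linux-lab-1": frozenset({"unix shell", "pwd command", "ssh"}),
    "windows-lab-1": frozenset({"windows command line", "dir command"}),
}


def runnable_on(test: str) -> List[str]:
    """Testbeds that meet every requirement of the named test case."""
    needs = TEST_REQUIREMENTS[test]
    return sorted(name for name, offers in TESTBEDS.items() if needs <= offers)
```

Ask where the 'pwd' test can run and you get the Linux testbed and nothing else. No virtual shell required.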

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Friday, July 16, 2010

Test Specifications Don't Have Variables

Specifications are, um, specific.

In the network storage industry, I've seen hundreds of test cases that read something like, "STEP 1: Create a volume between 100 and 500 gigabytes in size".

Well, which?

Assuming the system doesn't take a size range for an argument, eventually we have to decide how big to make it. We're not getting out of the decision by being indecisive in the specification.

Decisions like this should be made during test planning, because:

1) The test planners are the most qualified to make them.
2) By documenting decisions, you can get input during test plan review. You can get buy-in from the stakeholders on test coverage to avoid surprises and second guessing.
3) You can evaluate your testbeds to assure you have the required equipment.

By listing a range, the test planner is saying, "I want somebody else to make the decision". The implication is that the person executing the test is more qualified to fill in the blanks in the specification. That shouldn't be the case. The test designers should be more knowledgeable than the test executors. They therefore shouldn't leave these decisions to the executors. Test designers should nail down the test case specification to assure it becomes a reliable, deterministic, and repeatable test of quality and to do that they must remove all ambiguity.

An open specification undermines regression testing. For instance, during release 1.0 testing, the tester uses a 100 gig volume and it passes. During release 1.1 testing, the tester uses a 500 gig volume for the same test case and it fails. Is it a regression, or not? This sort of mismatch is very common even if the same person executes the test in both releases. This variability is even more likely when the test is executed on different testbeds by different people. Further, when we eventually automate the test, we will certainly have to specify the volume size anyway. The automation requires that the specification's details be known.

Regression test specifications don't have variables. They are deterministic. They have to be. You'll have to make up your mind eventually, so why not do it in the specification?

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Wednesday, May 26, 2010

Baking Bread

If you want to make bread, there are steps you have to follow.

You must:

1) Have a recipe
2) Mix the ingredients
3) Knead the dough
4) Let the dough rise
5) Put the pan in the oven and bake it

You don't get to skip steps. If you want to end up with bread, you have to have a recipe, and follow the steps. The output of each step is the input to the following step. It makes the next step possible. If you don't mix the ingredients before trying to knead the dough, you're not going to end up with bread.

What would you think of a baker who put flour still in the bag, a stick of butter still in the wrapper, and 2 whole eggs still in their shells into a pan, then directly into the oven? Would you consider him a process genius who came up with a brilliant shortcut that the rest of us dummies never thought of? Or would you think he's a little, um, "special"? Would you expect to be able to dine on a delicious loaf of warm bread for dinner? My guess is that if this describes your chef, you can expect to order takeout a lot.

Making software is like baking bread. There are sequential steps you have to follow to produce high quality software. There are no shortcuts. Notice that it doesn't matter whether you're in a hurry. It doesn't matter whether you have a desired delivery date. It doesn't matter whether you're a startup. It doesn't matter that you're short-handed. It doesn't matter whether the customers are nagging you or that the board of directors is breathing down your neck. You have to follow the steps if you want to make software.

The steps are well known. It's not necessary to invent them. You just have to know what they are and follow them if you want to end up with high quality bread...

...or software.

http://en.wikipedia.org/wiki/Systems_Development_Life_Cycle

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Monday, May 24, 2010

We Test the Release Candidate

I rather like Wikipedia's definition of a Release Candidate:

"The term Release Candidate (RC) refers to a version with potential to be a final product, ready to release unless fatal bugs emerge. In this stage of product stabilization (read QA cycle), all product features have been designed, coded and tested through one or more Beta cycles with no known showstopper-class bug." - Wikipedia.

It's important to understand this definition because the RC is the primary criterion for entering official QA release testing on behalf of the customer. QA may participate in testing prior to having a Release Candidate, but that testing doesn't count, because we don't have a product yet.

Let's break up the Wikipedia definition, for clarity:

"a version" means a build of the software product.

"with potential to be a final product" means you could sell that exact version of the product if it passes QA qualification testing. It's a software version candidate that you could release to the customer.

"ready to release" means all the content is there. There is no coding still outstanding and there are no known bugs that still have to be fixed.

"unless fatal bugs emerge" means that the release candidate will not be released if such bugs are found in the official code freeze testing. This also means if such bugs are found, they must be fixed and a new RC declared with their fixes included. The new RC results in a new code freeze and a restart of qualification testing from QA.

The RC is the official hand off from development to QA, in which development says, "Here, I think this is ready." It's the start of QA's official testing on behalf of the customer.

The RC is important because the testing that occurred prior to it doesn't count. That testing is important and it may help the developers find bugs, but the product isn't complete, so you're not really testing the product. You're testing the prior version of the product. Say the next release is 2.0. You're still testing 1.0 until all the code and bug fixes for 2.0 are checked in.

That is, until you have a Release Candidate.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, May 18, 2010

Combinations and Permutations

Denial makes for poor quality software.

"If you choose not to decide, you still have made a choice." - Neil Peart

Software and hardware combinations quickly get out of hand. The number can easily be in the millions. Software companies respond to this in two ways, both of them bad:

1) Build billion dollar interoperability labs
2) Allow enormous testing gaps and keep customers in the dark

Here's a simple combinatorics computation:

Supported Hardware Options

* Hardware platforms: 3 (3)
* Memory sizes: 4 (3 x 4 = 12)
* Disk types and sizes: 6 (12 x 6 = 72)
* Network cards: 4 (72 x 4 = 288)

Supported Software Options

* Operating Systems software versions: 6 (288 x 6 = 1,728)

To qualify our product software to run on hardware with that many options, we need 288 testbeds, assuming we have the ability and sophistication to install/uninstall the 6 operating systems on those testbeds on demand. We must execute 1,728 instances of each test.

If we want to dedicate operating systems to testbeds, we need 1,728 testbeds. And we still need to execute each test 1,728 times.

Every time another hardware platform or OS version is proffered by marketing, management must make the necessary calculations to estimate the impact on schedules, resources, and budget. They must understand what they're signing up for. In the above example, each additional O/S version means each test has to be executed another 288 times in order to be qualified across all the hardware.
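
The arithmetic above is easy to script, which removes any excuse for not doing it. Here's a minimal sketch in shell. The function name combinations is my own, and the counts are the ones from the example above.

```shell
# Multiply the option counts together to get the number of combinations.
combinations() {
    total=1
    for count in "$@"; do
        total=$((total * count))
    done
    echo "$total"
}

# 3 platforms, 4 memory sizes, 6 disk options, 4 network cards.
hardware=$(combinations 3 4 6 4)
echo "hardware testbeds: $hardware"                                   # 288
echo "runs per test, 6 OS versions: $(combinations "$hardware" 6)"    # 1728
echo "runs per test, 7 OS versions: $(combinations "$hardware" 7)"    # 2016
```

Going from 6 to 7 operating system versions moves each test from 1,728 runs to 2,016, which is the extra 288 runs per test described above.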

Companies have to get a handle on this problem. Denial is all too common, but it isn't an answer. Your customer has a right to believe that you've tested what you say works. Short of building a billion dollar lab, which isn't an option for most companies, the only real solution is to limit the combinatoric variables carefully.

Sadly, these calculations are never made. Most companies unknowingly choose option 2 and choose to live in denial. So my advice to software customers is to think about the enormity of this, and always buy the support contract. ;^)

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, January 19, 2010

Leveraging the Automation Harness

In previous posts (http://qa4software.blogspot.com/2009/10/automation-framework.html and http://qa4software.blogspot.com/2009/07/startup-automation-recipe.html), I detailed the requirements and benefits of a test execution harness. After being initially developed, the test harness can be repeatedly leveraged to improve development quality and massively increase test execution frequency. The test harness logs every detail of its activities and testbed configuration, makes the results browseable and linkable via a web server, and emails the user the test results. Automated execution can easily be scheduled to run from cron for each codeline and platform to increase coverage at little cost.

Here are the ways in which the test harness can be leveraged to pay you back for the initial investment.

Nightly Functional Regression Tests

A fixed batch of regression tests can be scheduled to run nightly from cron with the latest build for each codeline and platform. The product can be installed on virtual machines or simulators.

Nightly Upgrade/Revert Tests

A batch of upgrade/revert pairs can be scheduled to run nightly from cron for each codeline and platform. Upgrade from each fixed release version to the latest build of each platform. Revert from the latest build of each platform to a fixed release version. Run sanity tests after each upgrade or revert. The released versions are the major versions of the software that are already being run by customers and supported by the company. The product can be installed on virtual machines or simulators.
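
To show what "a batch of upgrade/revert pairs" might look like, here is a small shell sketch that generates the list of operations. The function name, the release numbers, and the "latest-build" label are all placeholders I made up for the example.

```shell
# Print one upgrade and one revert operation per released version.
# Usage: upgrade_revert_pairs LATEST RELEASE...
upgrade_revert_pairs() {
    latest=$1
    shift
    for release in "$@"; do
        echo "upgrade $release -> $latest"
        echo "revert $latest -> $release"
    done
}

# Placeholder versions, not real product releases.
upgrade_revert_pairs latest-build 1.0 1.1 2.0
```

Each printed line would become one harness run followed by the sanity tests.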

Nightly Performance Tests

A batch of performance regression tests can be scheduled to run nightly from cron for each codeline and platform. The performance tests will establish performance baselines for each platform and compare results to those baselines. The product must be installed on hardware.

Test Automation Verifier Web Tool

A simple Perl CGI can be written to allow an automation developer to execute new tests in the harness. This is intended to assure that newly written automated tests can be added to the nightly tests safely. Obviously, the execution testbed configuration should match the regression testbeds. The developer supplies his automated test. The CGI launches the harness to install standard product builds and runs the new test repeatedly. The product can be installed on virtual machines or simulators.

Pre-Checkin Test Web Tool

A simple Perl CGI can be written to allow a product developer to execute tests of his private builds before checking in his code. The developer supplies his privately built executable and chooses the tests he wants to run. The CGI launches the harness to install the developer's build and executes the tests. The product can be installed on virtual machines or simulators.

Software Install/Uninstall Web Tool

This is just a scaled down version of the pre-checkin test tool above. A simple Perl CGI can be written to allow anyone to reboot hosts, install, configure, uninstall, and/or unconfigure the product software. The user supplies the build executable, either privately or automatically built, and chooses the operations he wants to run. The CGI launches the harness to execute the chosen operations. The product can be installed on virtual machines or simulators.

Per-Build Tests

A daemon or cron job can be written to monitor each codeline and run a series of quick smoke and integration tests via the harness for each product source code checkin. The product can be installed on virtual machines or simulators.

A well designed test execution harness has many uses and can do the work of literally dozens of test engineers. It shifts your attention from manual testing to developing automated tests and monitoring test results - a higher order, more sophisticated mode of operation that saves you money.

In short, the test execution harness is something you simply can't live without.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, January 12, 2010

Customer Zero

QA is the customer before the customer is the customer. We're Customer Zero. When we feel pain, the customer feels pain. We just feel it first.

The customer may feel the pain to a different degree than QA, but they still feel it, so pay attention to QA's feedback. Listen to QA's experience as if you're listening to the customer. Don't wait for the customer's support call to hear about your software's quality.

If the product is difficult for QA to install or configure, it will be difficult for the customer to install and configure. Even if there isn't anything wrong with the software, that is, it meets specifications, installation difficulties are still part of the customer's quality experience. QA can give you an idea of what that will be.

If QA has automated regression tests that suddenly start failing after a check-in, it's tempting to code around the problem, as if the automated test is the source of the failure. Developers are often quick to suggest that QA "fix" their test script to run successfully with the latest change. But that sweeps the problem under the rug. If QA's automated testing fails, the customer's automation will fail as well.

These deficiencies can be magnified by QA's experience, but shouldn't be dismissed as only a QA problem. They should be viewed as an opportunity to improve quality for the customer, because QA is the customer.

We're Customer Zero.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, October 6, 2009

Automation Framework

What's a test automation harness? It's the means by which a batch of test scripts is serially executed. It's fairly easy to build one in Perl and it can be leveraged in several ways once it's built to pay you back for the investment.

You must first define the Test Contract because the test harness will rely on it. This is absolutely essential. The Test Contract is the agreement between test input/output and the test harness. Test Automators must follow the rules. Perl is preferred for tests, but as long as the Test Contract's rules are followed, the programming language you use for tests is unimportant (although you should have a very good reason for your choice).

The harness will be designed to run from the command line and will eventually run from cron nightly and per checkin. All the forensic data produced by the run will be plain text, so it can be viewed and parsed at the command line with grep and viewed in a browser.

Here are the requirements for the test harness script:

Arguments

Take a software configuration file as an argument to specify the software versions for the test.
Take a testbed configuration file as an argument specifying the testbed on which to execute the tests.
Take a batch file listing the test scripts to be run as an argument, with timeouts for each.

Setup

Create a unique run ID
Create a unique directory named after the run ID to contain run logs.
Reboot hosts, install the software, and configure the testbed before running the batch of scripts.
Create a detailed log file containing all test output.
Copy the batch file and testbed configuration file to the run directory.
Create a metadata file listing the software version, hosts, and log locations.
Start memory measurements at the beginning of the run.

Execution

Capture standard out and standard error from each test execution to the detailed log.
Timestamp the start and end time of each test script in the detailed log.
Timestamp the start and end time of the entire batch run.
Clean up the testbed between each test script.
Check for configuration issues between each test script.

Cleanup

Create a summary log detailing the start/end time, duration, exact command executed, and results for each test.
Run a triage script at the end of the run to look for problems with the testbed (e.g., cores, configuration disruption, reboots, memory leaks, etc.).
Email results to concerned parties.
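
To make a few of these requirements concrete, here is a deliberately small sketch of the core loop in shell. It covers only the unique run ID, the run directory, the detailed log, the timestamps, and the summary log. Timeouts, testbed setup, triage, and email are left out. The function name run_batch, the file names, and the batch file format (one test command per line) are assumptions made for this example, and it assumes a date command that supports +%s.

```shell
# Minimal harness sketch. Usage: run_batch BATCH_FILE LOG_ROOT
# BATCH_FILE lists one test command per line. Prints the run directory.
run_batch() {
    batch_file=$1
    log_root=$2
    run_id="run-$(date +%Y%m%d-%H%M%S)-$$"      # unique run ID
    run_dir="$log_root/$run_id"                 # directory named after it
    mkdir -p "$run_dir" || return 1
    cp "$batch_file" "$run_dir/batch.txt"       # keep a copy of the batch
    detail="$run_dir/detail.log"
    summary="$run_dir/summary.log"

    echo "BATCH START $(date)" >>"$detail"
    while IFS= read -r test_cmd || [ -n "$test_cmd" ]; do
        [ -n "$test_cmd" ] || continue          # skip blank lines
        start=$(date +%s)
        echo "TEST START $(date) $test_cmd" >>"$detail"
        # Capture standard out and standard error in the detailed log.
        code=0
        sh -c "$test_cmd" >>"$detail" 2>&1 </dev/null || code=$?
        end=$(date +%s)
        echo "TEST END $(date) exit=$code" >>"$detail"
        # Assumption: any exit code other than 0 or 1 is treated as an abort.
        case $code in
            0) result=PASS ;;
            1) result=FAIL ;;
            *) result=ABORT ;;
        esac
        echo "$result duration=$((end - start))s cmd=$test_cmd" >>"$summary"
    done <"$batch_file"
    echo "BATCH END $(date)" >>"$detail"
    echo "$run_dir"
}
```

A call like run_batch nightly.batch /var/tmp/runs (both names hypothetical) leaves behind plain text files that can be searched with grep, which is the whole point of keeping the forensic data as text.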

For extra credit, build a simple Perl CGI to allow the plain text test results to be viewed in the browser. After you've built the framework and are running tests nightly from cron, your job becomes monitoring test results daily and writing automated tests to add to the nightly batch. In my experience, 12 hours of nightly tests executed from such a framework can deliver the work of twelve full time testers with a single testbed. You'll find a tremendous number of bugs far in advance of formal qualification testing.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Wednesday, September 30, 2009

The Test Contract

As I've said elsewhere in the blog, regression tests are by definition deterministic. Automated regression tests are just regression tests whose algorithms have been automated. Their most fundamental attribute is that they must be easy to execute. Accordingly, they should follow a standard execution synopsis, a 'Test Contract'. We establish these rules because we know we will want to execute our tests automatically without custom plumbing.

If we can't do this, we haven't accomplished that goal:

for test in *
do
    ./"$test"
done

I recommend these rules for CLI based automated testing with Perl:

I. Inputs

You must resist the temptation to give in to the automation developer's desire to add arguments to the standard synopsis. Fully automated regression tests don't have variables. They have specifications. Variables are an indication that you don't have a complete test specification. Here's the synopsis I recommend:

test.pl -cfg <testbed configuration file>
test.pl -help

The testbed configuration file specifies the testbed on which the test is to be executed. It tells the test the names of the hosts it should talk to. Again, refrain from polluting the configuration file with unnecessary attributes. Do yourself a favor and write it in eval-able Perl, like this:

$Testbed = [
    node => {
        role => 'master',
        type => 'physical',
        ostype => 'linux',
        osversion => 'Centos 5.2',
        interface => {
            eth0 => '10.10.23.223',
            eth1 => '10.10.3.8'
        }
    }
];

II. Outputs

Print anything you want to be logged to standard out.

III. Script Exit Code

0 = PASS
1 = FAIL

Perl's "die" normally produces exit code 255 (Perl uses the value of $! or $? >> 8 instead when either is non-zero). This should be used when a test wants to abort, that is, when it makes no sense for the test to continue running. Don't make the mistake of calling that FAIL. A test that doesn't run isn't a product test failure.
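
The contract isn't tied to Perl. As a rough illustration, here is how a test written in shell could honor the same three outcomes. The helper names pass, fail, and abort are hypothetical, and 'expr' merely stands in for the product under test.

```shell
# Hypothetical helpers for a test script that follows the contract.
# Each prints its message to standard out (section II) and exits.
pass()  { echo "PASS: $*";  exit 0; }
fail()  { echo "FAIL: $*";  exit 1; }
abort() { echo "ABORT: $*"; exit 255; }   # the test could not run at all

# Example test body. The parentheses run it in a subshell so a caller can
# observe its exit code. The argument is the expected result of 2 + 2.
example_test() (
    command -v expr >/dev/null 2>&1 || abort "expr is not on this testbed"
    actual=$(expr 2 + 2)
    [ "$actual" = "$1" ] || fail "expected $1, got $actual"
    pass "expr 2 + 2 = $actual"
)
```

A missing 'expr' is reported as an abort, not a failure, because the product was never actually exercised.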

IV. Setup Assumptions

These are the general assumptions that each automated test script can make. Each of these is a separate piece of automation used to set up the testbed in preparation for test execution. There is room for debate about where to draw the line between these, but what is important is that a test writer knows what to assume has already been done.

1) Hosts have been rebooted as necessary and have the required operating system and platform packages installed
2) The product software has been installed
3) The product software has been initially configured

V. Cleanup Assumption

Tests will leave behind some persistent configuration. We need the means to clean up between tests without a full reset of the testbed. This is done before each test execution, rather than after. This is because we can't rely on automated tests to exit normally, so cleanup may not be conducted, and because we may need to leave the systems in their failed state for bug triage.

1) The testbed is returned to the state equivalent to initial setup (IV, above)

VI. Automatic Execution

Regression tests should be written with automated execution in mind. Assume the following algorithm for serially running a batch of test scripts. Reference the above sections.

1) Initial Setup (Do this once, at the beginning of the batch)
     a. Reboot hosts and install OS and required packages (IV.1)
     b. Install product software (IV.2)
     c. Configure product software (IV.3)

2) Test Execution (Do this for each test script execution)
     a. Cleanup testbed (V.1)
     b. Execute test
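
The algorithm above can be sketched in a few lines of shell. The four setup and cleanup functions are hypothetical stand-ins that only print what the real automation would do. The part worth noticing is the ordering: setup runs once, and cleanup runs before every test.

```shell
# Hypothetical stand-ins for the real setup and cleanup automation.
reboot_and_install_os() { echo "IV.1 reboot hosts, install OS and packages"; }
install_product()       { echo "IV.2 install product software"; }
configure_product()     { echo "IV.3 configure product software"; }
cleanup_testbed()       { echo "V.1 clean up testbed"; }

# Run the initial setup once, then clean up before (not after) every test.
run_tests_serially() {
    reboot_and_install_os
    install_product
    configure_product
    for test_cmd in "$@"; do
        cleanup_testbed
        echo "execute $test_cmd"
        code=0
        sh -c "$test_cmd" || code=$?
        echo "exit code $code"
    done
}

run_tests_serially "true" "exit 1"
```

Because nothing is cleaned up after the last test, a failing test at the end of the batch leaves the testbed in its failed state for triage.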

This recipe has worked well for me in building a robust automation framework. I hope you find it helpful. The Test Contract is an important part of the foundation of reliable automated testing.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Saturday, September 26, 2009

International Outsourcing

This is a touchy subject, but I don't regard confronting reality as optional, so here goes...

There are business costs associated with international outsourcing that are often overlooked. Decision makers like to compare pay rates, because those benefits are easy to see. The costs, however, are subtle and harder to quantify. That doesn't make them less real. If you want to do the right thing for your company, you should dig a little deeper. Here are some of the barriers to international outsourcing that a responsible decision maker should confront:

Language Barrier

It's not enough to be good with technology. You have to be able to participate effectively in teams as well. Technical ideas are difficult enough to formulate and express without additional language problems. Technology professionals are not exactly chatty to begin with. Poor language skills further hamper teamwork and impact schedules.

Interaction Barrier

When they're working, you're sleeping. When you're working, they're sleeping. Scheduling meetings means some people work late and others work early. In any case, you only have a tiny overlapping window during which to communicate. If your CEO wants to hold an all hands lunch meeting, remote workers are excluded. In fact, they'll be excluded from most meetings. Effective teams communicate all day long. Without constant accessibility, work slows drastically. Problems that should be solved in minutes take days. Many don't get solved at all.

Leadership Barrier

A remote team must have strong, competent, organized, and responsible leadership. The leader must show initiative and own all the issues associated with the success of his team. International micromanagement cannot work. Everyone who tries fails. There must be a single point of contact leading the remote team, a single throat to choke. That person must over-communicate. They must document everything they're doing because local management has to rely heavily on written documentation due to the narrowness of the communication pipeline. That documentation is an additional continuous drag on productivity.

Workdays Barrier

When you have a holiday, they have a work day. When they have a holiday, you have a work day. They work on your Sunday. You work on their Saturday. You only get four days of overlap instead of five. This means the already serious communications lag is extended often by as much as an entire weekend. Things you could have accomplished on Friday don't get done until Tuesday. Again, communication, teamwork, and productivity all take a hit.

Modularity Barrier

Not all work can be delegated to the remote team. This means there is a division of labor that may be unnatural and contrived. You can only give highly modular, documentable, and measurable tasks to the remote team because micromanagement isn't an option. The local team will end up doing whatever is left over, even when they're not the right people for the job. A more senior local team may end up doing junior work that should have been delegated. All work tasks must be documented in great detail. Much money is wasted if you get this wrong.

Infrastructure Barrier

This is expensive. There are many technical lab problems to address that you wouldn't otherwise have. International VPN connections will be slow. This means you have to limit the amount of data transferred between sites and possibly limit the window during which such transfers can take place. Your source code repository will be located locally. You will need the means to replicate the source code across sites if you plan remote development. You may need to copy compiled binaries across sites as well. You will need to copy cores and logs across sites for troubleshooting. You will need to have a means to image operating systems and distribute the appropriate software tools for the remote team. If you want the remote team to have their own lab, you will do everything twice. If you need to deploy an Active Directory server in your lab to support a new feature, you will deploy two instead of one. You can multiply the number of service outages by two as well.

Support Barrier

Any time there are technical difficulties that require remote assistance, and there will be, the issues will be resolved at a snail's pace. Usually problems that would ordinarily take only minutes to resolve will take days.

Training Barrier

Technical instructions have to be documented in painstaking detail and made accessible on the intranet. In-person training has to be videotaped. The video files must be transferred across the WAN. Question and answer may not happen at all and will be done via email at best. The remote team will have to wait to learn about new product features until the local team is up to speed and can document them and their testing requirements. This is another unnatural division of labor. Work items you'd like to delegate may have to be done locally until you understand them well enough to delegate and explain them.

Meeting Barrier

You'll need to have a remote meeting infrastructure like Webex or NetMeeting. You'll need a plan for handling frequent international phone calls. Regardless, the technologies often don't work well for remote teams due to inadequate telephony infrastructure. Get used to asking people to repeat what they said. "What? Huh? Can you sit any further from the microphone?" A picture is worth a thousand words, but you can't whiteboard effectively with distributed teams. Remote management robs you of many of the benefits of visual communication.

Travel Barrier

It probably makes sense periodically to bring the remote team to the local office for face time and personal interaction given the communications barriers. The travel cost is obvious. But this is lost time in terms of the remote team's productivity. The local team will have to entertain the visitors and carve out time for redundant meetings, so you pay twice.

Productivity Barrier

Most of the communication and teamwork costs have a local counterpart. That is, it's not just that the remote team is seriously less productive than they would be if they were local, but the local team's productivity suffers as they're distracted by all the preceding problems and get blocked on slow communications. You can't just grab the person in the next cube and ask them to reboot the Interocitor. You have to email a precise, detailed, and time-bound set of instructions, then wait a day until they say, "Okay". If they didn't understand the instructions, you wait another 2 days to clarify their understanding.

Market Barrier

International outsourcing has the same effect as arbitrage (http://en.wikipedia.org/wiki/Arbitrage#Price_convergence). That is, you increase the quantity of labor demanded in one market and decrease it in another. Prices (labor costs) increase in the remote location and decrease locally. This has the effect of equalizing the prices across markets. Once the price gap closes, you're left with all the foregoing costs without the benefits you thought you'd have.

All barriers can be overcome, but reality isn't optional. International outsourcing will cost you. Anyone who tells you differently is selling something. Make the decision in full awareness of these barriers.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, September 22, 2009

Clever Frameworks Aren't

Clever, that is.

In QA, we test interfaces. We always test interfaces, whether we're testing a CLI, a GUI, an API, a web service, or an individual subroutine. Our job is to understand the interface's specifications, then create test cases to assure the interface performs to those specifications. That is most effectively done by operating the interface directly.

The last thing we want to do is to interject a translation layer between our test and the interface we're testing. Such a layer will inevitably include limitations.

Here's an example of what not to do. A large software company made a product that had a CLI. The QA engineers wrote an object oriented framework that automatically builds a class hierarchy of the available CLI commands from the appropriate version of the product source code. So you could do stuff like this (where setIpAddress() is translated to a CLI operation). Very clever stuff.

CLI cli = new CLI;
cli.setProductVersion( "1.0" );
bool r = cli.setIpAddress( "10.10.22.123" );

Clever, yes, but unwise. Only correct commands and arguments were permitted by the framework, because only those commands were generated from the source code. Think about it. The cutesy framework made it illegal to do negative testing. The layer's "help" prevented QA from doing its job.
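
By contrast, a negative test that operates the interface directly is trivial to write. Here's a minimal shell sketch. The helper name expect_rejection is made up, 'ls' stands in for the product CLI, and the path is assumed not to exist.

```shell
# Hypothetical helper: a negative test passes only when the command under
# test rejects its input, i.e. exits with a non-zero code.
expect_rejection() {
    if "$@" >/dev/null 2>&1; then
        echo "FAIL: accepted invalid input: $*"
        return 1
    fi
    echo "PASS: rejected invalid input: $*"
    return 0
}

# Drive the interface directly with an argument it should refuse.
expect_rejection ls /no/such/directory/for-this-test
```

Nothing sits between the test and the CLI, so nothing can forbid the bad argument. A real negative test would also check for the specific error the specification calls for, not just a non-zero exit.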

I think you'll find that wherever such framework layers exist, testing limitations exist. It's therefore a mistake to create frameworks that insulate you from the object under test. It isn't that clever after all.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Friday, August 14, 2009

1/10/100 Rule

Quality Assurance wasn't born with the advent of computer software. Manufacturing had QA for many years before computers came along, so it's wise to learn from their experience. It has been long understood that the earlier defects are found, the less they cost to fix. That's the essence of the 1/10/100 rule. The rule states:

If a defect is found in the Design Stage, it costs $1 to remedy. If it is found in the Production Stage, it costs $10 to remedy. If it is found by Quality Assurance Testing, it costs $100 to remedy.

Let's see how that looks in the context of the SDLC:

1) Initiation
2) System Concept Development
3) Planning
4) Requirements Analysis
5) Design $1 (opportunity)
6) Development $10 (opportunity)
7) Integration and Test $100 (too late)
8) Implementation
9) Operations and Maintenance
10) Disposition

From this mapping, it's really obvious how important good design is. It's not difficult to come up with examples of products that flopped fabulously because of poor designs. Design reviews are extremely important to quality because defective designs are often too expensive to ever be fixed. Total Quality Management (http://en.wikipedia.org/wiki/TQM) teaches us that we should, "Do the Right Things Right". Implementing the wrong thing with high quality is an obvious waste of effort. QA must therefore be active and strong partners during design. Design defects that escape the design phase should be considered failures of the preceding quality processes.

It's also obvious that the next opportunity to drive quality is during development. The common view that software quality is something that only QA engineers worry about after development very clearly multiplies the costs of finding and repairing bugs by 10. Remember Mom's advice, "If you don't have time to do it right, how will you have time to do it twice?" Defects that are found by QA should therefore also be considered failures of the preceding quality processes.

The QA engineer standing at the end of the car assembly line doesn't expect to find defects. His job is to determine whether the car is of sufficiently high quality to sell. At that point, all defects should have already been found and rectified. Every defect he finds indicates: 1) a flaw in manufacturing, and 2) a flaw in the preceding quality processes. This is the philosophy we must cultivate in building our SQA processes.

If we want to produce high quality software repeatably, SQA processes must be designed with the 1/10/100 Rule in mind. Finding bugs during qualification testing is just too late and way too expensive.
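
To get a feel for the multipliers, here is a small shell sketch that applies the rule's $1, $10, and $100 figures to defect counts. The function name remedy_cost and the defect counts are invented purely for illustration.

```shell
# Relative cost of fixing defects under the 1/10/100 rule.
# Arguments: defects found in design, in development, and in QA testing.
remedy_cost() {
    echo $(( $1 * 1 + $2 * 10 + $3 * 100 ))
}

# The same 30 defects, found at different stages (made-up numbers).
echo "mostly found in design: \$$(remedy_cost 20 8 2)"    # 20 + 80 + 200 = 300
echo "mostly found in QA:     \$$(remedy_cost 2 8 20)"    # 2 + 80 + 2000 = 2082
```

Same defects, same product, roughly seven times the cost, simply because they were found late.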

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Wednesday, July 29, 2009

Use Perl

In my blog "Why Automate?" (http://qa4software.blogspot.com/2009/07/why-automate.html), I made the case for automation. Here, I consider the answer to that question a given and ask what's the best language to use? Of course, this question only applies when there is an option in the first place. White box or unit testing, for instance, may dictate the automation language. In that case, do what's dictated. However, when there is a choice, use Perl.

I'm not a Perl evangelist. A lot of people I've worked with would be surprised by that. It's my strength. But I gather data, then make decisions objectively, so I find myself in the uncomfortable position of technology advocacy. This is not what I consider a flattering label, so here I'll list my excuses for the apparent prejudice.

We should always work from requirements to make decisions, so here they are:

Requirement #1: Automation Code Should Be Portable

Automated tests should be able to run everywhere when they can. This essentially means test automation should run on both Windows and Unix systems if possible. Why write code that reduces your options? The chosen automation language shouldn't limit where you can run tests. Perl is interpreted and runs on both Unix and Windows hosts with no script changes.

Requirement #2: The Runtime Environment Should be Ubiquitous

The more systems that already have the required runtime, the less testbed configuration management is required. Most Unix systems already have Perl installed, so automation can start immediately.

Requirement #3: Developer Skills Should be Common

The more people that already have the automation skills, the better. We don't want to have to train our automators. If our firm can hire people from a large pool of automators that already have the skills, we're better off. New automators are therefore productive earlier. Perl is a very common automation and system administration skill, especially in the Unix community. A shorter ramp-up for new hires saves the company money.

Requirement #4: No Compilation

Automation code should not need to be compiled either. Write it. Then run it. You may need to install test code on hundreds of machines. You don't need the compile step complicating configuration management. We don't want to have to worry about chip architecture, multiple processors, or 32-bit vs 64-bit. This basically leaves us with scripting languages. Perl is an interpreted scripting language. The same script runs on any platform without compilation.

Requirement #5: Mature

We want to test our product, not the scripting language. We do not need cutting edge, nor do we need to follow the latest language fads. We need dependable. This is a business decision. It is not about entertaining the automation developers. Perl has been around for quite a while and is very reliable.

Requirement #6: Free Modules

Our productivity is greatly enhanced if there is a large developer community freely sharing code. This allows us to leverage other people's efforts and avoid re-inventing the wheel. The Comprehensive Perl Archive Network (CPAN) contains modules and scripts for solving every problem you will encounter. It's amazing how quickly new technologies will be reflected in CPAN in freely downloadable modules.

Requirement #7: Easy Extensibility

It should be easy to download, install, and test required modules once you've found them. Installation should resolve dependencies automatically. Perl comes equipped with utilities that will download modules and their dependencies from CPAN and automatically run them through the author's tests.

Requirement #8: Built-in Documentation

The language itself should contain internal documentation facilities for every facet of the language. It should contain easy-to-access documentation and the ability to search the documentation and Frequently Asked Questions (FAQ) or How Tos. It should be possible to write test code that contains its own Command Line Interface (CLI) readable documentation. Perl comes with a markup language called Plain Old Documentation (POD) and a CLI POD reader called perldoc. perldoc has an extensive FAQ library and search facilities. Standard Perl installations also contain modules for parsing POD and using POD for user help.

Requirement #9: Rapid Development

This is where the rubber meets the road. Rapid development is the desired effect of all of the other requirements. We should be able to write code quickly. Lines is money. This means we should be able to do powerful things with very few lines of code. Perl is one of the most terse and therefore powerful languages available.

Requirement #10: Integrated Text Manipulation

Most of our automation will be about reading text, from log files, CLI commands, or sockets. Perl was designed for this and has built-in syntax for manipulating text through regular expressions.

Requirement #11: Easy Data Structures

We should be able to model data accurately without the design overhead of an object oriented approach. This means we need native hashes and arrays and the ability to nest them arbitrarily. Perl's data structures are well suited for this, without complex object oriented syntax.

Requirement #12: Easy Support

We should be able to get free support easily. Perl has an extremely active self-help open source support community. There are many web sites with tutorials and numerous newsgroups. It's very unlikely you'll have a problem that hasn't already been solved. The solutions are easily found with simple web or newsgroup searches.

As you can see, Perl meets all the requirements. That is not to say there aren't other languages that meet many of them. Python, for instance, would come in a close second. Perl wins over Python primarily due to ubiquity and legacy momentum. For these reasons, where there's a choice, it's the best option for test automation.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Thursday, July 16, 2009

Startup Automation Recipe

Here is a short list of the testing ingredients needed to get a startup's automated testing up to speed. Follow it and you'll have all of the capabilities of a large company, or more, at a fraction of the cost. My recommendations are designed to get your QA planning and testing running quickly and cheaply, and they fall into three major categories.

I. Setting Up

Install test planning and results database
Install bug tracking database
Create a test script contract and automation standards
Create a testbed specifier

II. Automating

Automate product builds
Automate product installation, upgrade, revert, uninstall, and configuration
Automate testbed reset
Automate soak tests
Automate disruptive soak tests
Automate testbed triage
Automate performance regression tests

III. Automating the Automation

Create a test harness to run a batch of tests and email results
Create a web CGI GUI for viewing the test harness results
Run nightly regression tests from the test harness, launched from cron
Run nightly install/upgrade/revert/uninstall tests from the test harness, launched from cron
Run tests on demand for pre-check-in testing, launched from a web CGI
Test new tests from a web CGI

I'll elaborate on these suggestions in separate posts.
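
As a minimal sketch of the first item in section III, here is one way a harness could run a batch of tests and prepare an emailed summary. It assumes a test script contract in which every test is a command whose exit status is 0 for PASS and non-zero for FAIL; that contract, the test names, and the email addresses are hypothetical. Actually sending the message (for example with smtplib) and scheduling the harness from cron are left out.

```python
import subprocess
import sys
from email.message import EmailMessage


def run_batch(commands):
    """Run each test command; exit status 0 means PASS, anything else FAIL."""
    results = {}
    for name, argv in commands.items():
        completed = subprocess.run(argv, capture_output=True, text=True)
        results[name] = "PASS" if completed.returncode == 0 else "FAIL"
    return results


def build_report(results, sender, recipient):
    """Build (but do not send) an email summarizing the batch."""
    failed = sum(1 for result in results.values() if result == "FAIL")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Test batch: {len(results) - failed} PASS, {failed} FAIL"
    msg.set_content("\n".join(f"{result}  {name}"
                              for name, result in sorted(results.items())))
    return msg


# Hypothetical tests: stand-ins that simply exit with status 0 or 1.
batch = {
    "install_test": [sys.executable, "-c", "raise SystemExit(0)"],
    "upgrade_test": [sys.executable, "-c", "raise SystemExit(1)"],
}
report = build_report(run_batch(batch), "harness@example.com", "qa@example.com")
print(report["Subject"])  # Test batch: 1 PASS, 1 FAIL
```

Keeping the contract to a bare exit status means the harness never needs to know what language a test is written in.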

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, July 14, 2009

Why Automate?

Automation is an investment. We make an investment because we expect a return. Here is how investing in test automation pays us back.

Determinism

Properly written automated regression tests are deterministic. A computer will doggedly follow exactly the same test algorithm every time. There is no human variability. Test determinism is the necessary prerequisite for regression testing. A software company that doesn't have deterministic automated tests can't say they're doing regression testing with a straight face. We run regression tests for each release, so they save us money every time they're run.
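
To make the idea concrete, here is a small sketch of a deterministic test. The function under test is a hypothetical stand-in, and the seed value is arbitrary; the point is only that fixing the inputs removes run-to-run variability, so a change in the result can only come from a change in the product.

```python
import random


def function_under_test(values):
    """Stand-in for product code: returns the values in sorted order."""
    return sorted(values)


def regression_test(seed=2009, count=100):
    """Generate the same inputs on every run, then evaluate one assertion."""
    rng = random.Random(seed)  # fixed seed: identical inputs on every run
    inputs = [rng.randint(0, 1000) for _ in range(count)]
    actual = function_under_test(inputs)
    in_order = all(a <= b for a, b in zip(actual, actual[1:]))
    same_length = len(actual) == len(inputs)
    return "PASS" if in_order and same_length else "FAIL"


# Running the test repeatedly gives the same result each time.
print({regression_test() for _ in range(5)})  # {'PASS'}
```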

Large Scale

Everything about information systems scales geometrically. Hiring thousands of manual testers isn't an option. Some of the most important system, soak, load, and performance tests are impossible without automation. Automating these tests is what makes some software products possible, and it translates into higher product quality and more business.

Run Tests Early and Often

It's not necessary to wait until code freeze to begin to test. Automated tests allow testing to begin immediately after a new release is branched. Some people call this continuous testing. This helps assure quality stays high throughout development and therefore higher quality when QA starts qualification testing. Automated tests can be run nightly, or even with every build. And every time they're run, they pay us back.

Automated tests allow developers to get immediate feedback when they break the product. They can run automated tests prior to check-in to pre-empt the costly triaging necessary in post-check-in integration testing. Bugs that can be found prior to check-in should be. QA is a circuitous and therefore expensive way to find bugs. We save money every time a developer finds his own bug.

Testing early and often reduces the bug backlog entering QA qualification. Fewer bugs entering QA qualification means more reliable delivery. Reliable delivery means cheaper production, wider margins, and more business.

Scalability

System, soak, and performance tests and utilities can be written to scale. Executing an automated test with 1,000 client systems can be done with the same command that launches the test for 1 client. People are the biggest expense in testing. Scalable tests allow a limited staff to command ever-growing testbeds, saving personnel expense.
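
One way to picture "the same command" is a test whose client count is just a parameter. In this sketch the --clients option, the run_client stand-in, and the use of a thread pool are all assumptions for illustration; a real testbed would drive remote client systems rather than local threads.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


def run_client(client_id):
    """Stand-in for driving one client system; returns PASS or FAIL."""
    return "PASS"


def run_test(num_clients, max_workers=32):
    """Run the same client workload num_clients times, concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_client, range(num_clients)))
    return "PASS" if all(r == "PASS" for r in results) else "FAIL"


def main(argv=None):
    """Parse the client count and run the test at that scale."""
    parser = argparse.ArgumentParser(description="Scalable test sketch")
    parser.add_argument("--clients", type=int, default=1)
    args = parser.parse_args(argv)
    result = run_test(args.clients)
    print(f"{args.clients} client(s): {result}")
    return result


# The same entry point serves both scales; only the argument changes.
main(["--clients", "1"])
main(["--clients", "1000"])
```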

Automated Automation

The nirvana of automation is automatically executing automated tests. Automated tests can be run as a batch from a scheduler like cron or in an event driven way upon each check-in, further reducing human costs and allowing personnel to focus on writing new automated tests and triaging test results. We save money when QA can focus on better testing.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Wednesday, July 8, 2009

Process

As producers, we hate process because of the constraints, delays, and general bureaucracy it imposes. It's frustrating and slows us down. As consumers, we like it because it helps make the services we consume reliable and improves quality. Process is ultimately about repeatable success.

Here are the most important standard process models and how an understanding of them within the QA context can contribute to repeatable success.

Capability Maturity Model Integration (CMMI)

http://en.wikipedia.org/wiki/CMMI

The CMMI characterizes a company's maturity. What is important about the model is that it tells us the characteristics that we need to exhibit to improve our chances of repeatable success. All of our processes in QA should work toward, and be evaluated by, this goal. In this blog, I will expound upon the essential QA components that contribute to climbing the CMMI hierarchy.

Systems Development Life Cycle (SDLC)

http://en.wikipedia.org/wiki/Systems_Development_Life_Cycle

I will also discuss how QA test processes correlate to the SDLC and how they contribute to reliable schedules. The SDLC informs us that process is all about deliverables. QA processes help assure deliverable readiness at each phase transition to help assure a software firm consistently meets its delivery schedules with a high quality product.

Every company needs to be successful, so appropriate processes that result in repeatable success are important at any scale.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt

Tuesday, July 7, 2009

What is a test?

As my first post, I want to define the basic terminology I'll be using to establish the foundation for future posts.

What is a test?
A test is the evaluation of an assertion.

To test our software, we evaluate its quality by comparing actual results to expected results. If the actual result equals the expected result, the test PASSes. If it doesn't, it FAILs. The expected results are specified in a test specification, which is derived from a functional specification. That is, we must know how our software is supposed to behave before we test it. A specification is specific. It is unambiguous. There are no variables in a test specification and there are no test results other than 'PASS' or 'FAIL'.

What isn't a test?
Simply operating something is not a test. You drive your car to work. That doesn't mean you're testing your car.
A test of your car's top speed, for instance, requires that:

the product specification includes its top speed,
the top speed specification provides the testbed specification (straight road, flat, asphalt, etc.),
the test specification specifies the assertion and PASS/FAIL criterion,
the test specification is executed to completion, and
the actual top speed is evaluated against the expected top speed, resulting in an unambiguous test result: PASS or FAIL.
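
The requirements above can be sketched in code. The specification values, the dictionary layout, and the "at least the specified speed" assertion are hypothetical choices for illustration; what matters is that the expected result exists before the test runs and that the only possible results are PASS and FAIL.

```python
def evaluate(assertion, actual, expected):
    """A test is the evaluation of an assertion: the result is PASS or FAIL."""
    return "PASS" if assertion(actual, expected) else "FAIL"


# Hypothetical product specification and test specification.
product_spec = {"top_speed_mph": 120}
test_spec = {
    "characteristic": "top_speed_mph",
    "testbed": "straight, flat, asphalt road",
    "assertion": lambda actual, expected: actual >= expected,
}


def run_top_speed_test(measured_mph):
    """Compare the measured top speed against the specified top speed."""
    expected = product_spec[test_spec["characteristic"]]
    return evaluate(test_spec["assertion"], measured_mph, expected)


print(run_top_speed_test(123))  # PASS
print(run_top_speed_test(117))  # FAIL
```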

Demonstrating the software lacks a feature isn't a test failure.

If you drive your car into the ocean and it sinks, you can't say the car failed the float test. The product specification doesn't say the car can float. We test to specifications. The fundamental responsibility of QA is to demonstrate the product meets its specifications. If QA doesn't know how the software is supposed to behave, they can't assure its quality.
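
The float test can be expressed the same way. In this sketch, with a hypothetical specification and hypothetical names, asking for a result on an unspecified characteristic produces neither PASS nor FAIL; it is rejected as not being a test at all.

```python
class NotSpecified(Exception):
    """Raised when the product specification says nothing about a characteristic."""


# Hypothetical specification: it says nothing about floating.
product_spec = {"top_speed_mph": 120}


def run_test(characteristic, actual, spec=product_spec):
    """Return PASS or FAIL, but only for characteristics the spec defines."""
    if characteristic not in spec:
        raise NotSpecified(f"no expected result for {characteristic!r}")
    return "PASS" if actual >= spec[characteristic] else "FAIL"


print(run_test("top_speed_mph", 123))  # PASS
try:
    run_test("floats", False)
except NotSpecified as err:
    print(f"not a test: {err}")  # not a test: no expected result for 'floats'
```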

Often, QA will be engaged in what I call discovery. In the absence of a specification, they're trying to 'discover' how the product works (performs, scales...whatever). This may be important work, but it is not QA testing.

As the quality evaluator, QA is responsible for knowing the characteristics of the software.

Your teacher gives you a piece of paper with questions on it. You answer them and return the paper. You ask, "What's my score?" She says, "I don't know. I don't know what the correct answers are." If that's the case, how can you call that a test? It's her job to know the correct answers, then evaluate your answers against the correct answers, and finally calculate your score.

I'll address the many types of tests in another post.

Todd Shoenfelt
http://www.linkedin.com/in/toddshoenfelt