Slide #5
Learn how to design test cases that are most likely to uncover errors, an activity that should begin during the Discovery phase, before the Development phase (design and coding). By specifying these tests up front, you add measurable quality requirements: clear-cut acceptance criteria for the software. The testing process includes technical tests of the software’s components and architecture (white-box tests), tests to see if it works as advertised (requirements-based tests), and tests that see whether the software does these things “well enough” (system tests). Who does these tests and how does the PM fit in? The PM’s responsibility is to support testing. What this means, exactly, depends on the organization. Some organizations have a quality-assurance (QA) team responsible for testing. The PM is often a member of this team. In fact, many organizations have people with a PM title who do nothing but testing.
If you’re a PM involved in testing, most of your testing time will be absorbed by requirements-based testing. If your organization does not have specifically trained usability testers, you may also be asked to specify and/or run usability tests. You’ll need to know how to design effective tests and how to write them up. In general, the rest of the tests (white-box and most system tests) are too “techie” for the Product Manager. You still need to know enough about them to know what to ask for and to be able to understand the significance of the results, however.
What is testing? Testing is any activity aimed at proving that the software system does not do what it is supposed to. The negative phrasing is intentional. Each time a test uncovers a bug, it has proven itself—it means the bug won’t be released to the end-user. The term quality assurance is sometimes used because it suggests that more than the physical testing of the software may be required. For example, verifying a draft of a system use-case description with stakeholders is a testing activity.
What Exactly Is a Bug? We’ll take the broad view. A bug is any variance between what the system is supposed to do and what it actually does. Some of these bugs are going to be introduced by the programmers. But others are introduced earlier on by Product Managers through inaccurate, ambiguous, or missing requirements. It’s your job to eliminate as many of these bugs as possible.
Slide #6
General Guidelines
Ivar Jacobson, a founder of OO, advises that you derive test cases from system use cases as follows:
- Test scenarios that cover the basic flow of each use case
- Test scenarios that cover the alternate and exception flows of each use case
- Tests of line-item requirements in the PRD, where the requirements are traced to use case(s)
- Tests of features in the user documentation, where the documentation is traced to use case(s)
You’ll need more than these suggestions to plan appropriate testing. Fortunately, much groundwork on testing was done before OO.
This pre-OO approach is called structured testing. We’ll look at how structured analysis can be integrated with OO techniques and applied to an OO project. Structured Testing In 1979, Glenford Myers pioneered the field of testing with his book The Art of Software Testing. He laid out a discipline he termed structured testing. His work remains the basis for testing to this day. Structured testing is the process of using standardized techniques to locate flaws (bugs). Flaws detected by structured testing include those introduced during business analysis, design, and programming.
Slide #8
Principles of Structured Testing (Adapted for OO) You begin by establishing some principles, adapted from structured testing to OO projects. Why not go directly to the techniques? A sound understanding of basic principles helps avoid the kinds of institutional problems that lead to buggy systems.
Structured Walkthroughs: Most people think of testing as a process involving the execution of a program by the computer. This is only one type of testing, referred to, appropriately, as computer-based testing. Testing, however, also includes non-computer-based tests. How can you test a system without actually executing it? You walk through some aspect of the system manually with a group of participants. The formal method for doing this is the structured walkthrough. Why Are Structured Walkthroughs an Important Aspect of Testing? Errors are often thought to be exclusively due to bad programming; in fact, they can be introduced at any stage of a project. The beauty of walkthroughs is that, unlike computer-based testing, they can be performed before the software is written. Early testing means early detection of errors. The sooner errors are found, the easier they are to fix.
Also, unlike computer-based tests, walkthroughs tend to find the cause of a problem, not just its symptom. For example, a computer-based test may find symptoms, such as scattered situations where credit is advanced to non-worthy applicants; a walkthrough may uncover the cause—an incorrectly documented decision table for evaluating credit applications.
Slide #10
Requirements-Based (Black-Box) Testing The purpose of requirements-based testing is to find variances between the software and the requirements. The product requirements document (PRD) acts as the reference point for these tests. The term black-box tests is also used for these types of tests, since the tester does not need to know anything about the internals of the software, such as the code and table structure, to design and run them. This is in contrast to white-box tests, which are technical tests that are based on a knowledge of the inner workings of the IT system. Limitations Since no knowledge of the code is assumed with requirements-based testing, the only way to know definitively the effect that a particular set of inputs will have on the system is to test the system’s response to it. This means that for full coverage, you’d have to test every possible set of input values and conditions.
In practice, this is an unachievable standard, so instead you use techniques that help you design black-box tests that will uncover the greatest number of bugs in a given amount of time. Use-Case scenario testing Use-case scenario testing is one approach to requirements-based testing that tests the various scenarios of each use case. Use cases lend themselves well to testing. Because of the narrative style of the use case documentation, it is already very close to being a testing script. And the way the use cases are organized—into end-to-end business use cases and user-goal system use cases—matches the way the tests are organized.
I have often been asked if use cases, then, are all you need to design the test. The answer is no. To design tests, you need more detail, such as graphical user interface (GUI) screens, the structural model (which provides validation rules for attributes), and the documentation on business rules (stored in a business rules engine or kept manually in a folder). The system use cases may refer to these artifacts, but the artifacts are not part of the use-case documentation itself. Deriving Use-Case Scenario Tests Recall that your processing requirements were grouped around system use cases, each with their own group of scenarios. The flows were chosen so that they would cover all important scenarios:
- Basic flow: The normal path through the use case
- Alternate flows: Rarely occurring flows and other variations from the norm
- Exception flows: Unrecoverable errors and any other flows that result in the interaction ending without the user achieving the goal of the use case
Use the flows of the system use case to derive scenarios, then test each scenario.
For example, one test scenario might walk through the basic flow. You might be able to design another test scenario that walks through the basic flow and all of the alternate flows. At a bare minimum, you’ll want to ensure that the basic flow and each of the alternate and exception flows are covered at least once in the test scenarios. But you may also be interested in designing key tests that use certain combinations of the alternate flows. For example, one alternate flow for a stock-trading site may be “non-standard lot size” and another may be “order can only be partially filled.” The software may be able to handle each of these alternates one at a time but not when they occur together. To test the system’s response to this situation, you’ll want to include a test scenario that walks through both alternate flows.
If you need to ensure you’ve covered all possible combinations of a set of alternate flows, use decision tables, covered later in this topic. During test execution, you’ll be looking to see whether the sequence of events during the test matches that described in the use case. One way to do this is to use the steps of the system use case as the source for the following test template. Place steps that begin with “The user…” in the “Action/Data” column of the template; place steps that begin with “The system…” in the “Expected Result/Response” column.
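The column rule just described can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the use-case steps, the function names, and the row layout are all assumptions made for the example.

```python
# Hypothetical use-case steps; the wording is illustrative only.
BASIC_FLOW = [
    "The user selects a stock and enters a lot size.",
    "The system displays the current price and the order total.",
    "The user confirms the order.",
    "The system records the order and displays a confirmation number.",
]


def build_test_template(steps):
    """Split use-case steps into rows of (action, expected results)."""
    rows = []
    for step in steps:
        if step.startswith("The user"):
            # A user step opens a new row in the Action/Data column.
            rows.append({"action": step, "expected": []})
        elif step.startswith("The system"):
            if not rows:
                # A system step with no preceding user action still needs a row.
                rows.append({"action": "", "expected": []})
            # A system step goes in the Expected Result/Response column.
            rows[-1]["expected"].append(step)
        else:
            raise ValueError(f"Step has no recognizable actor: {step!r}")
    return rows


def print_template(rows):
    """Print the rows in the two-column order of the test template."""
    for number, row in enumerate(rows, start=1):
        print(f"{number}. Action/Data: {row['action']}")
        for expected in row["expected"]:
            print(f"   Expected Result/Response: {expected}")


print_template(build_test_template(BASIC_FLOW))
```

Each row pairs one user action with the system responses that should follow it, which is the sequence a tester checks during execution.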
Slide #12
Decision Tables for Testing When the input conditions affecting a system use case are interrelated, it is not enough to test for each input condition separately; you must test all combinations of input conditions. An input condition is any condition that will have an impact on the system response. Examples from a Web retail site include Item on Sale, Customer Discount, and Fast Delivery Method Selected. You’ll find some input conditions documented in the system use case as alternate flows—for example, the alternate flow Non-Standard Lot Size. You might also see them already documented as part of the requirements in a decision table appended to a step of the use case or to the use case as a whole. In this case, you can reuse the decision table for testing purposes. From a testing perspective, each column in the table identifies a test scenario. Keep in mind, though, that the column only identifies the test scenario; it does not fully specify it. To properly specify a test, you need to complete the test template for each test scenario you’ve identified from the columns. Also, as discussed in the upcoming section “Boundary-Value Analysis,” you may need to create more than one test scenario per column. The use of decision tables in this context may be complex, but that does not mean that this approach to testing is “white box.” The input conditions and the expected system responses are still derived from the requirements—not from an examination of the code. This classifies the technique as “black box.”
What the Decision Table Does Not Say about Testing The decision table shows only the net result of each test; it does not show the required sequencing of steps. For this reason, you need to complete the test template for each test scenario, indicating the expected sequence of actions. Use the system use-case description to work out the expected workflow for each case. Also, each column tells you something about the input data for a given test, but does not specify exactly which data to test for. For example, for the test corresponding to column 2 in Figure, the number of Peace Committee members may be three, four, or five. Which of these should you use? What about tests for invalid data? These issues are addressed by boundary-value analysis.
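To make the "all combinations" idea concrete, here is a small sketch that enumerates the columns of a decision table for the three Web retail input conditions named earlier. It assumes each condition is a simple yes/no value; the function name is hypothetical. As the text notes, each column only identifies a test scenario and still needs a completed test template.

```python
from itertools import product

# Input conditions taken from the Web retail example in the text.
CONDITIONS = ["Item on Sale", "Customer Discount", "Fast Delivery Method Selected"]


def decision_table_columns(conditions):
    """Return one dict per decision-table column: every yes/no combination."""
    columns = []
    for values in product([True, False], repeat=len(conditions)):
        columns.append(dict(zip(conditions, values)))
    return columns


columns = decision_table_columns(CONDITIONS)
for number, column in enumerate(columns, start=1):
    settings = ", ".join(
        f"{name}={'Y' if value else 'N'}" for name, value in column.items()
    )
    print(f"Test scenario {number}: {settings}")
```

With three yes/no conditions there are 2 × 2 × 2 = 8 columns, so the table grows quickly as conditions are added.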
Slide #13
Boundary-Value Analysis Boundary-value analysis is a technique for targeting test data most likely to reveal bugs. The technique is based on the premise that the system is most error-prone at points of change. Boundary-value analysis can help you pinpoint test data for any requirements-based (black-box) test. If you are working from a decision table, then boundary-value analysis can help you decide which data to use for the test(s) indicated by each column of the table. The technique covers both positive and negative testing:
- A positive test is one that tests the system’s response to valid conditions (success scenarios).
- A negative test is one that tests the system’s response to invalid conditions (errors).
The following is a summary of boundary-value analysis rules:
- If the condition states that the number of input (or output) values must lie within a specific range: -Create two positive tests, one at either end of the range. -Create two negative tests, one just beyond the valid range at the low end and one just beyond the high end. For example, for the system use case Update Case that accepts 2–10 parties to a dispute, the positive tests would have the user enter exactly 2 and exactly 10 parties. Negative tests would try for 1 and 11 parties.
- Similarly, if an input or output value must lie within a range of values and the whole range is treated the same way: -Create two positive tests, one at either end of the range. -Create two negative tests, one just beyond the valid range at the low end and one just beyond the high end. For example, if the Ward Number attribute of the Peace Committee class has a valid range of 1–100, create two positive tests: 1, 100. Also create two negative tests: 0, 101.
- If an input or output value must lie within a range of values and different valid ranges are treated differently: -Create a positive test for each end of each valid range. -Create two negative tests, one just below the smallest acceptable value and one just above the highest. For example, the decision table for the system use case Review Case Report indicates that system response depends on # Peace Committee Members. The valid ranges are 0–2, 3–5, and 6–99. The positive test values are 0, 2, 3, 5, 6, and 99. The negative tests are –1 and 100.
- If an input or output value must be one of a set of valid options and all options are treated the same way: -Create one test for valid data using any value from the set. -Create one invalid test using any value not in the set. For example, the Reason Code No Gathering attribute of Case must match the code of one of the reason codes in the lookup table. Create one positive test where the code is found in the table and one negative test where it is not.
- If an input or output value must belong to a set of values and each one is treated differently: -Create one test for each valid option. -Create one invalid test using a value not in the set. For example, if the system treats criminal offenses differently from civil offenses, create a positive test for a case whose dispute code refers to a criminal dispute and another test where the code refers to a civil one. Also, create a negative test for where the dispute code is not found in the lookup table.
- If the requirements state that a certain condition must be true: -Create one valid test where the condition is satisfied. -Create one invalid test where the condition is not satisfied. For example, the Testimony attribute of Party to Dispute must be non-null. Create a positive test where testimony is entered and a negative test where it is not.
- To limit the number of tests you have to run, you can combine as many valid tests as possible in a single run. However, you may not combine invalid tests.
- Look out for any boundaries not covered in the preceding rules. For example, for any reports or screens, test one case where exactly one page or screen is filled and one test where the output goes over by one line. Wherever sorting occurs, test cases where everything is presorted; where all values are the same; and where one value is the lowest possible, and one is the highest. For input values, try negative numbers and zero. Try entering no value at all.
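The range rules above are mechanical enough to sketch in code. The following is an illustration only: it assumes integer values and contiguous ranges listed in ascending order, and the function name is hypothetical.

```python
def boundary_tests(ranges):
    """Derive boundary-value test data for integer ranges.

    `ranges` is a list of (low, high) pairs in ascending order. One pair
    models a range that is treated uniformly; several pairs model valid
    ranges that the system treats differently.
    """
    positive = []
    for low, high in ranges:
        # One positive test at each end of each valid range.
        for value in (low, high):
            if value not in positive:
                positive.append(value)
    # Negative tests: just below the smallest and just above the largest valid value.
    negative = [ranges[0][0] - 1, ranges[-1][1] + 1]
    return positive, negative


# Update Case accepts 2-10 parties to a dispute.
print(boundary_tests([(2, 10)]))

# Ranges treated differently: 0-2, 3-5, and 6-99 members.
print(boundary_tests([(0, 2), (3, 5), (6, 99)]))
```

For the 2–10 range this yields positive tests at 2 and 10 and negative tests at 1 and 11. For the three member ranges it yields a positive test at each end of each range and negative tests at –1 and 100.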
Slide #14
White-Box Testing White-box testing is a testing methodology based on knowledge of the internal workings of the IT system. Who Does White-Box Testing? Developers perform these tests, since knowledge of programming code is required. But as a Product Manager, you have a supporting role: You may be required to specify the level of white-box testing that the software must pass through before it is accepted. And after the tests are run, you might be called on to inspect evidence that the white-box tests have been carried out successfully. This proof sometimes comes in the form of a report produced by an automated testing product, confirming the level of white-box testing to which the software has been exposed. To support white-box testing, you need a basic understanding of what such testing can and cannot achieve and of the meaning of the white-box testing levels.
Limitations of White-Box Testing To ensure that software is completely error-free, white-box testing would have to include enough tests to thoroughly “exercise” the code. In practice, however, this is impossible. Why? At first glance, it might seem sufficient to execute a set of tests that causes every statement to be executed at least once. Unfortunately, this does not supply sufficient coverage, because some errors show up only when a program’s execution follows a specific path through the code. To white-box test a program fully, then, you would need to try all possible paths of statement execution. Because the number of tests usually required for full coverage is so high, other approaches are used to winnow the set of tests to a manageable size. Even Small Programs Can Have an Astronomical Number of Pathways Consider an operation containing 20 statements that are repeated up to 20 times. The body of the loop includes several nested IF-THEN-ELSE statements. It would take about 10^14 (100 trillion) tests to cover all of the possible sequences in which those statements could be executed.
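The arithmetic behind a figure of this size can be sketched as follows. The text does not say how many routes the nested IF-THEN-ELSE statements create, so the value of 5 routes per pass through the loop body is an assumption made for this example, as is the function name.

```python
def count_paths(paths_per_pass, max_iterations):
    """Count distinct execution sequences through a loop.

    A run that makes k passes through the loop body can take any of
    `paths_per_pass` routes on each pass, giving paths_per_pass ** k
    sequences. Summing over k = 1..max_iterations covers every run length.
    """
    return sum(paths_per_pass ** k for k in range(1, max_iterations + 1))


# Assumption: the nested IF-THEN-ELSE statements give 5 routes through the body.
total = count_paths(paths_per_pass=5, max_iterations=20)
print(f"{total:,} paths")
```

Under that assumption the total is 119,209,289,550,780, on the order of 10^14. Adding even one more route per pass would multiply the total many times over, which is why full path coverage is not a practical goal.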
Slide #16
Sequencing of White-Box Tests When software is written, it is developed in modules, or units. In structured systems, the software unit is the process, known by various terms such as subroutine, function, or subprogram. In OO, the basic software unit is the class, which contains code for attributes and operations. In both structured and OO environments, a plan must be put together to sequence the testing of these units and their proper integration within the software. The process of planning and executing these piecemeal tests is called unit testing.
Unit Testing and the PM While the developers usually carry out unit testing, the PM needs to be able to consult with the developers about the planning and scheduling of these tests. Since most systems in large organizations involve a hybrid of structured software (typically for back-end legacy systems) and OO (typically for Web-enabled front-end systems), as a PM, you’ll need a basic understanding of unit testing in both environments.
Big Bang Approach to Unit Testing There are a number of approaches for the sequencing of unit tests. In the big bang approach, each unit is first tested individually. Once this is complete, all units are integrated and tested in one “big bang” test. In a structured system, these units are subroutines or functions. In an OO system, they are classes. In either environment, the developers often need to create “dummy” software to stand in for other units not being tested at that time. One of the disadvantages of the big bang approach is that, since units are first tested in complete isolation from the rest of the program, a large amount of dummy software has to be written. Another disadvantage is that the final big bang test is the first opportunity to test whether the units have been integrated properly in the software. If an integration problem shows up at this time, it will be very hard to diagnose. For this reason, the big bang approach is not advised—but is still used because it is easy to manage.
Incremental Approaches to Unit Testing A preferred approach is the incremental approach, where each unit is added to the system one by one. With each incremental test, the internal workings of a unit and its integration with the rest of the system are tested. Since not much is being added with each test, diagnosis is easier. In a structured environment, there are two types of incremental testing to choose from: top-down or bottom-up.
Top-Down Testing In top-down testing, testing starts from the mainline program (the high-level module that coordinates the major functions) and advances toward the low-level units that carry out basic functions. The advantage of this sequence is that it mirrors the order in which software units are usually developed. The disadvantage is that since high-level subroutines are tested before the low-level routines on which they depend, the tester must create stubs: “fake” units that take the place of the real low-level routines during testing.
Bottom-Up Testing In bottom-up testing, the order is reversed: First the low-level routines are tested, followed by higher-level routines. The advantage of this approach is that it does not require the overhead of creating stubs. It does, however, require the creation of other stand-in software, called drivers, but these are usually easier to develop. The big disadvantage is that this sequence may not match the order in which the units are actually coded.
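The difference between a stub and a driver is easier to see in code. The sketch below is illustrative only: the credit and payment routines, their names, and the threshold value are all hypothetical.

```python
# Hypothetical units: a high-level routine that depends on a low-level one.
def lookup_credit_score(applicant_id):
    """Real low-level routine (imagine a database call); not yet available."""
    raise NotImplementedError("low-level routine has not been written yet")


def approve_credit(applicant_id, score_lookup=lookup_credit_score, threshold=600):
    """High-level routine: approve when the score meets the threshold."""
    return score_lookup(applicant_id) >= threshold


# Top-down: test the high-level routine first, with a stub
# standing in for the low-level routine it depends on.
def stub_lookup(applicant_id):
    """Stub: returns canned scores instead of doing the real lookup."""
    return {"A1": 720, "A2": 480}[applicant_id]


def test_approve_credit_top_down():
    assert approve_credit("A1", score_lookup=stub_lookup) is True
    assert approve_credit("A2", score_lookup=stub_lookup) is False


# Bottom-up: test a low-level routine first, with a driver
# standing in for the higher-level code that will eventually call it.
def monthly_payment(principal, months):
    """Low-level routine under test (interest-free, for simplicity)."""
    return round(principal / months, 2)


def driver_for_monthly_payment():
    """Driver: feeds test inputs to the routine and checks each result."""
    cases = [((1200, 12), 100.0), ((1000, 3), 333.33)]
    return [monthly_payment(*args) == expected for args, expected in cases]


test_approve_credit_top_down()
print(driver_for_monthly_payment())
```

The stub has to imitate the behavior of code that does not exist yet, while the driver only has to call code that does, which is one reason drivers are usually the easier of the two to write.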
Incremental Testing in an OO Environment In an OO environment, the units are not organized top to bottom. Rather, objects are seen as being on the same level, collaborating with each other to carry out system use cases. It makes no sense, therefore, to speak of a “top-down” or “bottom-up” approach. Instead, the system use cases direct the sequencing of tests. When software is developed iteratively (as is commonly the case with OO systems), a set of system use cases is developed and released (internally or to the user). During each iteration, only the classes and operations required for the scheduled use cases are developed and tested. With each iteration, more classes and operations are developed and tested until the entire system has been covered.
Slide #17
System Tests Once the black-box tests have been completed, another battery of tests is executed. These are called system tests. With the exception of usability testing (a type of system test), you will not typically perform these tests, but you may be involved in planning them and in verifying that the tests have been conducted, so you should be aware of the tests in this category. The term system tests refers to a grab-bag of tests that go beyond functionality. The purpose of system tests is to test compliance with the non-functional requirements, also known as service-level requirements, or SLRs. The non-functional requirements specify the required level of service, for example, the maximum acceptable response time.
Myers on System Testing Myers defines system testing as follows: “The purpose of system testing is showing that the product is inconsistent with its original objectives.”
The idea behind system testing is that even if the code has been adequately tested for coverage (white-box testing) and has been shown to do everything expressed in the user requirements, it may still fail because it doesn’t do these things well enough. It may not meet other objectives—such as those related to security, speed, and so on. Myers laid out a set of system tests designed to catch these kinds of failures. These tests are still widely in use today. Following are some of the more popular of these tests.
- Regression testing: Regression testing validates whether features that were supposed to be unaffected by a new release still work as they should. The test helps avoid the “one step forward, two steps backward” problem: a programming modification designed to fix one problem inadvertently creating new ones. How much regression testing should you do? That depends on the level of risk. Often, organizations create a problem review board to set standards for regression testing and to evaluate on a case-by-case basis the degree of regression testing required.
- Volume testing: Volume testing verifies whether the system can handle large volumes of data. Why is this necessary? Some systems break down only when volume is high, such as a system that uses disk space to store files temporarily during a sort. When the volume is high, the system crashes because there isn’t enough room for these temporary files. Also, some systems may become unbearably slow when volume is high. Often, this is due to the fact that the data tables become so large that searches and lookups take an inordinate amount of time.
- Stress testing: Stress testing subjects the system to heavy loads within a short period of time. What distinguishes this from volume testing is the time element. For example, an automated teller system is tested to see what happens when all machines are processing transactions at the same time, or a network server is tested to see what happens when a large number of users all log on at the same time.
- Usability testing: Usability testing looks for flaws in the human-factors engineering of the system. In other words, it attempts to determine whether the system is user-friendly. Isn’t it enough that the system does what it’s supposed to do? No. Users may reject it due to frustration with the user interface.
--------------------
Usability Testing Questions
Questions investigated during usability testing include the following:
- Is the user interface appropriate for the educational level of the users?
- Are system messages written in easy-to-understand language?
- Do all error messages give clear, corrective direction? The user must always be given a “way out.”
- Are there any inconsistencies in the user interfaces of the system? Look for inconsistencies with respect to screen layout, response to mouse clicks, and so on.
- Does the system provide sufficient redundancy checks on key input? Important data should be entered twice, or in two complementary ways, for example, a social security number and a name for financial transactions.
- Are all system options and features actually useful to the user? Unused “extras” make the system harder to learn and clutter the interface.
- Does the system confirm actions when necessary? The system must confirm important actions, such as the receipt of a customer’s online order.
- Does the flow dictated by the system support the natural flow of the business?
--------------------
- Security testing: Security testing attempts to find holes in the system’s security procedures. For example, the tests will attempt to hack through password protection or to introduce a virus to the system.
- Performance testing: Performance testing locates areas where the system does not meet its efficiency objectives. Performance tests include the measuring and evaluation of the following:
  - Response time: The elapsed time it takes the system to respond to a user request.
  - CPU time: The amount of processing time required.
  - Throughput: The number of transactions processed per second.
- Storage testing: Storage testing checks for cases where storage objectives are not met. These objectives include requirements for random access memory (RAM) and disk requirements.
- Configuration testing: Configuration testing checks for failure of the system to perform under all of the combinations of hardware and software configurations allowed for in the objectives. For example, these tests look for problems occurring when a supported processor, operating system revision, printer driver, or printer model is used.
- Compatibility/conversion testing: Often, the goal of an IT project is to replace some part of an existing system. The objective of compatibility testing is to verify whether the replacement software produces the same result as the original modules (with allowance for new or revised features) and is compatible with the existing system. Conversion testing verifies whether the procedures used to convert the old data into new formats work properly.
- Reliability testing: Reliability testing checks for failure to meet specific reliability objectives. For example, the objectives for one of my early programs (a food-testing program) stated that an automated count of bacteria grown on a grid be correct to a given accuracy. Reliability testing would verify whether this objective was met. Another metric that falls in this category is mean time to failure (MTTF).
- Recovery testing: Recovery testing checks for failure of the recovery procedures to perform as stated in the objectives. For example, an online financial update program keeps a log of all activity. If the master files are corrupted, the objectives state that a recovery procedure will be able to restore files to their state just before the crash by processing the day’s transaction log against a backup of the previous day’s files. A recovery test would look for failure of this procedure to recover the files.
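As one concrete illustration from the list above, the three performance measures (response time, CPU time, and throughput) can be collected with Python’s standard `time` module. This is a rough sketch: `handle_request` is a hypothetical stand-in for the system under test, and the 0.5-second limit is an assumed service-level requirement, not a figure from the text.

```python
import time


def measure(operation, requests):
    """Run `operation` once per request and report basic performance figures."""
    response_times = []
    cpu_started = time.process_time()
    wall_started = time.perf_counter()
    for request in requests:
        before = time.perf_counter()
        operation(request)
        # Response time: elapsed wall-clock time for one request.
        response_times.append(time.perf_counter() - before)
    elapsed = time.perf_counter() - wall_started
    return {
        "max_response_s": max(response_times),
        "mean_response_s": sum(response_times) / len(response_times),
        # CPU time: processor time consumed by this process during the run.
        "cpu_s": time.process_time() - cpu_started,
        # Throughput: requests completed per second of elapsed time.
        "throughput_per_s": len(requests) / elapsed if elapsed > 0 else float("inf"),
    }


def handle_request(n):
    """Hypothetical stand-in for the system under test."""
    return sum(range(n))


MAX_ACCEPTABLE_RESPONSE_S = 0.5  # assumed service-level requirement

result = measure(handle_request, [10_000] * 200)
print(result)
print("Meets response-time objective:",
      result["max_response_s"] <= MAX_ACCEPTABLE_RESPONSE_S)
```

A real performance test would run against the deployed system under a realistic load, but the comparison at the end is the same: a measured figure checked against a stated objective.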
Slide #18
- User acceptance testing (UAT): Acceptance testing is the final testing of the system before the users sign off on it. This test is often performed by the users themselves, although in some organizations, the PM performs the test while the user looks on. There are two alternative approaches to UAT: a formal and an informal approach.
--------------------
Formal UAT Versus Informal UAT
In the formal approach, the developers and users sign a document beforehand that lays out the terms of the UAT. The document stipulates that if the users carry out the UAT under the terms described in the agreement and if the tests are successful, the users will accept the system. By having participants sign off on this document before the UAT, the PM sets the stage for a clean end to the project.
Proponents of the informal approach argue that the formal approach is inappropriate. In their view, users should have free rein to experiment with the system to make sure it can let them do their jobs, which might involve unexpected variations of usage. For example, IBM’s RUP methodology states: “In informal acceptance testing, the test procedures for performing the test are not as rigorously defined as for formal acceptance testing. The functions and business tasks to be explored are identified and documented, but there are no particular test cases to follow. The individual tester determines what to do. This approach to acceptance testing is not as controlled as formal testing and is more subjective than the formal one.”
--------------------
- Beta testing: Alpha testing is the testing of the system by the manufacturer. These are the kinds of tests you have been reading about until this point. Beta testing occurs after the alpha testing is complete. In beta testing, copies of the system are distributed to a wide group of users, selected to represent the various configurations, volume, stress, and functional needs of the target user population. The developers correct any errors uncovered by beta testing before releasing the production version. Beta testing is often used for systems that will have a wide distribution.
- Parallel testing: On some projects, the system undergoes parallel testing before final acceptance. With this approach, the new system is put into place and used while the old system is run concurrently. Both systems should provide equivalent outputs (except for any variations resulting from new enhancements and modifications). Parallel testing minimizes risk. If errors arise, the user can quickly revert to the old system until the problem is resolved.
- Installation testing: Installation testing is performed after the software is installed. Its purpose is to check for errors in the installation process itself. This test checks whether all files that should have been installed are, indeed, present; whether the content of the files is correct; and so on.
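The comparison at the heart of parallel testing, described in the list above, can be sketched as follows. Both fee calculations and the flat-rate enhancement are hypothetical; the point is only that the same inputs go to both systems and that differences caused by known enhancements are set aside.

```python
def compare_parallel_runs(old_system, new_system, inputs, expected_differences=()):
    """Feed the same inputs to both systems and list unexpected differences."""
    mismatches = []
    for item in inputs:
        if item in expected_differences:
            continue  # behavior intentionally changed by an enhancement
        old_output, new_output = old_system(item), new_system(item)
        if old_output != new_output:
            mismatches.append((item, old_output, new_output))
    return mismatches


# Hypothetical old and new versions of a shipping-fee calculation.
def old_fee(weight_kg):
    return 5 + 2 * weight_kg


def new_fee(weight_kg):
    if weight_kg > 20:
        return 40  # new flat rate for heavy parcels (an intended enhancement)
    return 5 + 2 * weight_kg


# A 25 kg parcel is a known, intended difference, so it is excluded.
print(compare_parallel_runs(old_fee, new_fee, [1, 10, 25], expected_differences={25}))
```

An empty result means the two systems agreed on every input that was supposed to be unaffected. Any entry in the list is a candidate bug to investigate before the old system is retired.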