Tuesday, August 12, 2008

Beyond Broken Links

by Rick Hower. Internet Systems, July 1997

Although Web site testing has much in common with the testing of standard client/server applications, there are unique considerations that can affect the focus of your testing strategy. In this article I discuss some of the factors that determine what to test in a Web site, the special considerations of database-driven Web sites, and how new Web test tools can help.

What to Test
Ideally, all components and functionality of a Web site on both the client and server sides should be completely tested. However, this is rarely possible in modern software projects, especially in fast-paced Web-related projects. The best approach is to examine the project's requirements, set priorities based on risk analysis, and then determine where to focus testing efforts within budget and schedule constraints.
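One way to make this prioritization concrete is a simple scoring sheet. The sketch below is only an illustration: the 1-to-5 likelihood and impact scales, the effort estimates, and the site areas are all assumed values, not taken from any particular project. It ranks areas by likelihood times impact and selects the highest-risk ones that still fit within the available test days.

```python
from dataclasses import dataclass


@dataclass
class SiteArea:
    name: str
    likelihood: int   # assumed scale: 1 (unlikely to fail) to 5 (very likely)
    impact: int       # assumed scale: 1 (minor) to 5 (severe)
    cost_days: float  # estimated effort needed to test this area

    @property
    def risk(self) -> int:
        return self.likelihood * self.impact


def plan_tests(areas, budget_days):
    """Greedily pick the highest-risk areas that still fit in the budget."""
    chosen, remaining = [], budget_days
    for area in sorted(areas, key=lambda a: a.risk, reverse=True):
        if area.cost_days <= remaining:
            chosen.append(area)
            remaining -= area.cost_days
    return chosen


# Hypothetical site areas and estimates, for illustration only.
areas = [
    SiteArea("order form (CGI + database)", likelihood=4, impact=5, cost_days=5),
    SiteArea("static company pages", likelihood=1, impact=2, cost_days=1),
    SiteArea("search results under load", likelihood=3, impact=4, cost_days=4),
    SiteArea("login and access control", likelihood=2, impact=5, cost_days=3),
]

for area in plan_tests(areas, budget_days=9):
    print(f"{area.risk:>2}  {area.name}")
```

A greedy pick is the simplest possible policy; a real plan would also weigh dependencies between areas and revisit the scores as the project changes.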

Requirements
Web sites are typically more graphics-oriented than standard software applications, and this, along with other multimedia functionality, will need to be considered in terms of requirements specifications for page layouts, appearance, site-specific standards, video and audio file sizes, and plug-in files. Database-driven Web sites will merit particular focus on performance requirements such as response times at various load levels and rate of error-free interactions per unit time. As in any project, requirements should ideally be clear, complete, reasonably detailed, cohesive, attainable, and testable. When this isn't the case, or when the project's requirements aren't stable, a subjective "how good is it?" testing approach may be more sensible than an objective "is it good enough (to meet requirements)?" testing strategy. This "how good is it?" approach might focus more on load testing, stress testing, usability testing, and link testing, and less on other aspects of testing.

Risk Analysis
Database-driven Web sites can involve a complex interaction among Web browsers, operating systems, plug-in applications, communications protocols, Web servers, databases, CGI programs or use of ISAPI or NSAPI features, security enhancements, and firewalls. Such complexity makes it impossible to test every possible dependency and everything that could go wrong with a site. The typical Web site development project will also be on an aggressive schedule, so the best testing approach will employ risk analysis to determine where to focus testing efforts. Risk analysis should include consideration of how closely the test environment will match the real production environment. Will the test scenarios closely mimic real-life users, Internet connections, modems, communications, hardware, clients, loads, data, database table sizes, and so on? Will the differences be significant? Ideally, the test and production environments will be the same, but budget constraints often prevent this, and the risk is that the differences may distort test results and make certain types of testing irrelevant. Other typical considerations in risk analysis include:
Which functionality in the Web site is most critical to its purpose?
Which areas of the site require the heaviest database interaction?
Which aspects of the site's CGI, applets, ActiveX components, and so on are most complex?
What types of problems would cause the most complaints or the worst publicity?
What areas of the site will be the most popular?
What aspects of the site have the highest security risks?

Factors such as the project's requirements, risk analysis, budget, and schedule determine which of the following types of testing are appropriate for your particular Web project:

Validation or functional testing.
This is typically a core aspect of testing to determine if the Web site functions correctly as per the requirements specifications. Sites utilizing CGI-based dynamic page generation or database-driven page generation will often require more extensive validation testing than static-page Web sites.

Load testing.
Will there be a large number of interactions per unit time on the Web site? If so, you may want to perform testing under a range of loads to determine at what point your system's response time degrades or fails. The Web server software and configuration settings, CGI scripts, database design, and other factors can all have an impact. You will probably want to test the entire system under various conditions to get realistic results, but you may also want to consider separate testing of database response, server response, applet responsiveness, and other areas if the application is especially complex.
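The idea of testing under a range of loads can be sketched in a few lines. This is a minimal illustration, not a substitute for a real load test tool: the function names are hypothetical, and it simply fires a given number of simultaneous requests from threads and summarizes errors and response times. The request function is passed in as a parameter so the harness can be tried against a stub before it is pointed at a test site.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Fetch one page over the network; return True on a 2xx response."""
    with urlopen(url, timeout=10) as response:
        return 200 <= response.status < 300


def timed_call(request, url):
    """Run one request and return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        ok = bool(request(url))
    except Exception:
        # Any failure (timeout, refused connection, HTTP error) counts as an error.
        ok = False
    return ok, time.perf_counter() - start


def run_load(url, users, request=fetch):
    """Issue `users` simultaneous requests and summarize the results."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(lambda _: timed_call(request, url), range(users)))
    times = sorted(elapsed for _, elapsed in results)
    return {
        "users": users,
        "errors": sum(1 for ok, _ in results if not ok),
        "median_s": times[len(times) // 2],  # upper median for even counts
        "max_s": times[-1],
    }
```

Calling run_load with the same test-site URL and increasing values of users (say 1, 10, then 50) gives a rough picture of where response times start to degrade or errors begin to appear.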

Stress testing.
This refers to testing system functionality while the system is under unusually heavy or peak load; it's similar to the validation testing mentioned previously but is carried out in a "high-stress" environment. This requires that you make some predictions about expected load levels of your Web site.

Usability testing.
Is your intended audience the general public? In-house Intranet users? Computer experts? Schoolchildren? The intended audience will determine the "usability" testing needs of your Web site. Additionally, such testing should take into account the current state of the Web and Web culture, because these will influence user expectations (for example, Web site navigation is expected to be extremely intuitive -- Web users do not expect to read manuals or help files).

Security testing.
If your site requires firewalls, encryption, user authentication, financial transactions, or access to databases with sensitive data, you may need to test these and also test your site's overall protection against unauthorized internal or external access.

Unit and integration testing.
Unit testing of code modules, objects, or discrete application functions is a standard part of testing any client/server or distributed application; integration testing may be needed to determine if various modules, other applications, and other parts of your site work together properly.

Regression testing.
If the project is large and complex, you may need to continuously retest everything as the site is initially developed and code is reworked to accommodate changes and bug fixes. Smaller, less complex projects may have minimal regression testing needs. As the site is upgraded and modified over the years, will there be significant changes requiring continuous retesting?

Link testing.
This type of testing determines if your site's links to internal and external Web pages are working. A Web site with many links to outside sites will need regularly scheduled link testing, because Web sites come and go and URLs change. Sites with many internal links (such as an enterprisewide Intranet, which may have thousands of internal links) may also require frequent link testing.
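A basic link check has two steps: collect the links on a page, then ask each target whether it still answers. The following sketch assumes nothing beyond the Python standard library; the function names are illustrative, and the check function is a parameter so the link extraction can be exercised without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html, base_url):
    """Return absolute http(s) URLs for the links found in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    absolute = (urljoin(base_url, href) for href in collector.links)
    return [url for url in absolute if url.startswith(("http://", "https://"))]


def is_alive(url):
    """Real network call: True if the server answers without an error status."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status < 400
    except Exception:
        # Timeouts, DNS failures, and HTTP 4xx/5xx responses all count as broken.
        return False


def broken_links(html, base_url, check=is_alive):
    """List the links on a page for which `check` reports failure."""
    return [url for url in extract_links(html, base_url) if not check(url)]
```

Some servers reject HEAD requests, so a production checker would typically fall back to GET before declaring a link broken; that refinement is left out here for brevity.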

HTML validation.
The need for this type of testing will be determined by the intended audience, the type of browser(s) expected to be used, whether your site delivers pages based on browser type or targets a common denominator, and how strictly you want to adhere to HTML standards and/or Netscape, Microsoft, or other extensions.

Reliability and recovery testing.
Do you require uninterrupted 24x7 availability? Redundant database servers? Scalable Web servers? Depending on how critical your Web site is to your business, you may want to simulate various "emergency" scenarios (such as failure of a hard drive on the Web or database server, or communication link failures) in a test system to be sure that your production system will handle them successfully.

Server log testing/report testing.
Web sites that use advertising and track site usage for marketing needs may need extensive testing to ensure the accuracy of logging and reporting capabilities. (For more information on log analysis, see Dan Rahmel's article on page S12.)
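One way to test reporting accuracy is to count hits directly from the raw server log and compare them with the numbers the reporting tool produces. The sketch below assumes the server writes Common Log Format; the function names and the shape of the reported figures (a mapping of path to hit count) are assumptions made for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_page_hits(lines):
    """Count successful (2xx) requests per path; also count unparseable lines."""
    hits, skipped = Counter(), 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            skipped += 1
        elif match["status"].startswith("2"):
            hits[match["path"]] += 1
    return hits, skipped


def report_mismatches(hits, reported):
    """Return {path: (counted, reported)} where the report disagrees with the log."""
    paths = set(hits) | set(reported)
    return {
        p: (hits.get(p, 0), reported.get(p, 0))
        for p in paths
        if hits.get(p, 0) != reported.get(p, 0)
    }
```

A nonzero count of skipped lines is itself worth investigating, since lines the parser cannot read are also lines a reporting tool may be silently dropping or miscounting.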

Performance
Most non-Web client/server systems involve a much more controlled, predictable environment than Web-based systems. In Web-based systems, the client side may be any one of dozens of Web browsers installed on various operating systems and hardware platforms. Each browser will conform in varying degrees to multiple standards, with numerous possible plug-ins, user-controlled options, and settings (variations in browser cache settings and graphics on/off settings are especially influential on performance). Web site usage and loads can vary with such factors as time of day, publicity, and site changes. With so many variables, system performance can be extremely difficult to predict and plan for, and testing more than a limited number of variations can be difficult or impossible. The primary factors in performance that are under your control are your Web server software and hardware, the database, and the Web server connection capacity. Use of scalable systems can mitigate performance problems that aren't initially apparent, but the true scalability of such systems should be tested if it's a critical factor. Also, Web servers, database servers, and other system servers may need to be tested separately to find any bottlenecks that would negate improvements in other system servers.

Communications
Internet communication pathways and quality can be extremely variable. Intranet-based applications may employ more consistent and tightly controlled communication links, but public Web sites will have little control other than the connections into the Web server and connections between the Web server, database server, and other system servers. For non-Intranet systems, it may be preferable to use end-to-end test environments that are not overly concerned with communication variations, because they'll be outside your control anyway; it will often be sufficient just to emulate realistic communications environments. The same factors apply to client-side communications, especially dial-up type connections.

Security
Because Web-based systems operate in a relatively "open" environment, security can be a much greater concern than in many non-Web client/server systems. In addition to client-side access control (such as passwords and encryption), there are server-side access control issues concerning who can publish and modify HTML files, who gets final publication approval, and who approves graphics files, sound files, and page layouts. Although corporate Intranets may require sharing and wide dissemination of information, they also need some type of control on the publication and availability of that information. Publicly available Web sites may involve outside access to an organization's internal computer systems, which can entail some level of risk no matter what safeguards are put in place. Use of multiple public Internet communications links to pass information between browsers and Web servers entails security risks as well. Another issue is whether to require each Web user to have a personal user ID for logging into the DBMS or instead to funnel all browser users through a single surrogate user ID in the DBMS. However, in my experience Intranet or Internet access is not used for direct access to a DBMS; any database interaction is usually handled by the CGI code, and the HTML pages just show user-friendly forms and aren't used for SQL-level interaction.

The Tools
Test tool vendors have been developing new products to help in several areas of testing database-driven Web sites. These products include load-test tools for performance testing, regression-test tools for capture/playback and test scripting, tools geared specifically to testing of Java applets and applications, and other tools for Web-related testing such as link-checking.

Load Test Tools.
The use of new test tools can significantly improve the efficiency of Web site load testing and scalability testing. Many of the leading test tool vendors have come out with these types of products during the past year, including Mercury Interactive Corp.'s LoadRunner/WebTest, Rational Software Corp.'s recently purchased preVue-Web product (from Performance Awareness Corp.), Pure Atria's PurePerformix/Web (Pure Atria is soon to be bought by Rational Software as well), and Platinum Technology Inc.'s WebLoad. A load test tool can emulate hundreds or even thousands of users accessing a Web site during short periods of time. Typically, such software includes capture/playback capabilities, a programming or scripting language, monitoring and control functions, and test logging and reporting tools. The capture/playback functions can be used to "record" representative Web-user activities during interaction with your Web test site, save them as scripts, and then play back multiple instances of these scripts to simulate real loads (in the form of HTTP requests) on your site. Because these tools are relatively new, they can vary considerably in price, reliability, and capabilities. You'll want to consider the product's limitations and capabilities regarding OS platforms, compatible browsers, secure HTTP capabilities, and the quality of the scripting language. Cost is often based on the number of simulated users. Some tools have a demo version available.

Regression Test Tools.
These are most useful for the automation of tests that will be repeated at least several times. The effort that may go into scripting and debugging these tests is only justified if the same test scripts can be reused without major modifications. For example, a set of tests might need to be carried out each time a new build of your site's CGI software is released or each time there are modifications to your Web site database stored procedures. As with the load test tools, regression test tools are designed to work efficiently with Web-based applications and have capture/playback, scripting, and logging/reporting capabilities. Load test tools create direct HTTP requests; however, regression test tools instead simulate a user interacting with a browser and let the browser create the HTTP requests. Such tools can also be used in tandem with load test tools for stress testing by repeating a suite of tests under varying load conditions. Segue Software Inc.'s SilkTest and Platinum's FinalExam InternetTest are two examples of such tools.
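The core of any regression check is comparing current behavior against a recorded baseline. The sketch below is a much simpler stand-in for the browser-driving tools described above, not a model of how they work: it records a digest of each page body and reports which pages differ after a change. The function names are hypothetical, and get_page is assumed to be any function that returns a page body as a string for a given path.

```python
import hashlib


def snapshot(paths, get_page):
    """Map each path to a SHA-256 digest of the page body from `get_page`."""
    return {
        path: hashlib.sha256(get_page(path).encode("utf-8")).hexdigest()
        for path in paths
    }


def compare(baseline, current):
    """Return the paths whose content changed, appeared, or disappeared."""
    return sorted(
        path
        for path in set(baseline) | set(current)
        if baseline.get(path) != current.get(path)
    )
```

Exact digests only suit pages whose output is supposed to be stable. Pages that embed dates, counters, or session IDs would need those parts stripped or masked before comparison, or every run would report a difference.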

Java Test Tools.
Java applets on the client side and Java applications on the server side can be tested separately, if necessary, with new test tools specifically geared to Java, such as those from the SunTest division of Sun Microsystems Inc. These tools test APIs and code coverage, and they perform capture/playback.

Other Test Tools.
A wide variety of other tools are available, including link-checking tools such as the Electronic Software Publishing Corp.'s LinkScan and Tetranet Software Inc.'s Linkbot.

Database-driven Web sites are taking an increasingly critical role in the day-to-day operations of many organizations. You can minimize your risk of site problems, errors, and failures through careful analysis of requirements and risks, well-planned site test strategies, and appropriate use of new Web test tools.

Rick Hower is an independent software QA and test consultant and maintains an extensive Web site covering software QA and testing at www.charm.net/~dmg/qatest/. You can contact him via his Web site or by email at rhower@netcom.com.
