Smoke and sanity testing begin as soon as the next version of the project is released. To many junior testers this process looks like absolute chaos. Did you recognize yourself? Then this article is for you. Below we will go through the definitions of smoke and sanity testing and show the difference between them with easy-to-understand examples.

Smoke testing:

Smoke testing is carried out to make sure that the resulting build is suitable for testing at all. It is also called a zero-day check.

This type of testing keeps you from wasting time: it makes no sense to test the entire application if key features are broken and critical bugs have not been fixed.

Sanity testing:

Sanity testing is carried out at the release stage to check the main functionality of the application, usually without going any deeper. It is sometimes described as a shortened version of regression testing.
When a release is under time pressure, thorough regression testing is next to impossible. In that case sanity testing does a great job: it checks that the main functions of the application work.

An example to better understand the difference between smoke and sanity testing:

There is a project for which an initial release is planned. The development team hands a build over for testing, and the test team starts work. The very first check is a suitability check: you need to find out whether you can work with this build at all. This is smoke testing. If the team gives the go-ahead for further work with the build, it moves on to deeper stages of testing. Imagine that the build has three modules: “Login”, “Admin” and “Employee”. The test team checks only the main functions of each module, without going into the details. This is sanity testing.

A few more differences between smoke and sanity testing:

  • Smoke testing is done by both developers and testers;
  • Sanity testing is carried out only by testers.
  • Smoke testing covers the main functionality of the application from start to finish;
  • Sanity testing checks only a specific component of the application.
  • Smoke testing is run on both stable and unstable builds;
  • Sanity testing is run on a relatively stable build.

Kirill Flyagin, game designer, QA Lead

Let's draw a summer analogy for these types of testing. Say you want to buy a watermelon. Smoke testing is when you check it visually: you look at the stripes, squeeze it, knock on it, size it up. There are masters who manage to pick a really tasty one this way. In sanity testing, you cut a small pyramid out of the top and check its color (as one of the components), while not knowing at all whether the rest of the watermelon is like that. But you are completely sure about the part you cut out.


If you want to create a simple computer program that consists of a single file, you just need to compile and link that one file. On a typical team project there are hundreds, even thousands of files, which makes the process of producing an executable program far more complex and time-consuming: you have to assemble the program from many separate components.

The practice used at Microsoft and some other software development companies is to build the program daily and supplement the build with smoke testing. Every day, after the files are compiled, linked and combined into an executable program, the program is put through a fairly simple set of tests whose purpose is to see whether the program "smokes" when it runs. These tests are called smoke tests. Most often this process is (or should be) well automated.

BENEFITS. This simple process provides several significant benefits.

Risk minimization during integration

One of the most significant risks a development team faces is that developers work on their code separately and independently of each other, and as a result the combined program does not work as expected when the pieces are put together. Depending on how late in the project the incompatibility is discovered, debugging can take much longer than it would with earlier integration, especially if program interfaces have changed or major changes have been made to core parts of the program.

Daily builds and smoke tests make it possible to reduce the risk of integration errors, respond to them in time and keep them from accumulating.

Reducing the risk of poor software product quality

Poor product quality goes hand in hand with failures and problems during integration. Running a minimal set of smoke tests every day keeps errors and problems from taking over the project. Once you bring the project to a stable state, the daily build helps keep it there, and the quality never slips to the level where blocking errors appear.

Help in diagnosing errors

If one day the product does not build (builds with errors), the cause is much easier to find when you build daily and run a set of smoke tests: a product that worked yesterday and does not work today is a clear hint that something went wrong between the two builds.

Morale Improvement

If the product works and gains new features every day, the morale of the developers, in theory, should grow, no matter what exactly the product is supposed to do. It is always a pleasure for a developer to watch his working “brainchild”, even if all the product does is display a rectangle on the screen :)

Using daily builds and smoke tests

Here are some details of this principle.

Daily app build

A fundamental part of the daily build is building the latest version of the code every day. Jim McCarthy in Dynamics of Software Development (Microsoft Press, 1995) called the daily build the heartbeat of the project: if there is no heartbeat, there is no project, it is dead. Less figuratively, Michael Cusumano and Richard W. Selby described the daily build as the project's sync pulse (Microsoft Secrets, The Free Press, 1995). Each developer writes code in his own way, and that code can drift beyond the conventions accepted on the project - this is normal, but with each sync pulse the code is brought back to the standard. By insisting on developing against a constant sync pulse, you prevent the project from getting completely out of sync.

In some companies it is customary to build the project not every day but once a week. This scheme is flawed: if the build "breaks" during that week, it may take another couple of weeks before the next successful build, and the company loses all the benefits of daily builds.

Check for failed build

With daily builds it is assumed that the project should work. If it does not, fixing it becomes a priority-1 task.

Each project sets its own standard for what counts as "breaking the build". The standard should set a quality bar strict enough to catch defects that "block" the project while still letting minor defects be tracked.

A good build is one where at least:

  • all files, libraries and other components are successfully compiled;
  • all files, libraries and other components are successfully linked;
  • the build does not contain any blocking (showstopper) defects that make it impossible for the application to run correctly;
  • all smoke tests pass.
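
As mentioned above, this process is usually automated. Below is a minimal sketch of such a check in PHP; the script name, the build command and the PHPUnit test group are illustrative assumptions, not part of the original text.

    <?php
    // daily_build_check.php - a hypothetical sketch of automating the "good build" criteria:
    // compile and link everything, then run the smoke tests against the result.
    $build_status = 0;
    passthru('make all 2>&1', $build_status);                 // compile and link all components
    if ($build_status !== 0) {
        fwrite(STDERR, "Build failed: fixing it becomes the priority-1 task\n");
        exit(1);
    }
    $smoke_status = 0;
    passthru('vendor/bin/phpunit --group smoke 2>&1', $smoke_status);  // run the smoke suite
    exit($smoke_status === 0 ? 0 : 2);                        // a non-zero exit marks a broken build

A script like this can be run by the build server every day, so a broken build is reported automatically rather than discovered by hand.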

Daily smoke tests

Smoke tests must cover the entire project from start to finish. They do not have to be exhaustive, but they should exercise all the major functions. Smoke testing should be deep enough that, if it passes, you can call the build stable enough to be subjected to deeper testing.

Without smoke testing, the daily build loses its point. This process guards the quality of the product and catches integration problems early. Without it, the daily build is a waste of time whose only purpose is to check that the code compiles.

Smoke testing should evolve with the project. In the beginning, the smoke tests will check something as simple as whether the project can print a "Hello, World!" message. As the system evolves, the smoke tests become more thorough. The first smoke tests take a few seconds to run; as the system grows, the time required for smoke testing grows too. By the end of a project, smoke testing can last for hours.
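
To make the idea concrete, here is a sketch (not from the original text) of what such an early smoke test might look like in PHPUnit, assuming the daily build produces a hypothetical executable at build/app:

    <?php
    use PHPUnit\Framework\TestCase;

    class FirstSmokeTest extends TestCase
    {
        // The very first smoke test: just check that the freshly built program starts
        // and prints its greeting. The path build/app is an assumption for illustration.
        public function testProgramStartsAndGreets()
        {
            $output = shell_exec('build/app 2>&1');
            $this->assertNotEmpty($output, 'The program produced no output at all.');
            $this->assertStringContainsString('Hello, World!', $output);
        }
    }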

Build group definition

On most projects there is a designated person responsible for the daily build of the system and for running the smoke tests. This work is part of that employee's duties; on large projects there may be several such people for whom it is the main responsibility. For example, the build team on the Windows NT 3.0 project had four people (Pascal Zachary, Show Stopper!, The Free Press, 1994).

Add a revision to the build only when it makes sense.

Individual developers usually do not write code fast enough for it to make sense to add their changes to the system every single day; instead, they work on a meaningful chunk of code and integrate it into the system every few days.

Introduce penalties for breaking the build (releasing a non-working build).

Most projects have a system of penalties for breaking the build. From the very beginning of the project, make it clear that keeping the build working is a top-priority task. A broken build should be the exception, not the rule. Insist that developers drop everything else until the system works again. If the build is broken often, it becomes very difficult to get the project back on track.

Minor penalties emphasize the importance of watching the build quality. On some projects, developers who break the build are handed a lollipop for releasing a non-working build, and a corresponding sign hangs on the door of their office until they fix it (provided the developers have separate offices :)). On other projects, guilty developers have to wear artificial goat horns or contribute a certain amount to a "morale fund" (examples taken from the history of real companies).

Some projects introduce more serious penalties. For example, Microsoft developers on high-priority projects (Windows NT, Windows 95, Excel) wore pagers and had to come in and fix the build when a breakage was detected, even if the breakage or error was discovered at 3 a.m.

Build the system and "smoke" it even under pressure

When pressure on a project's release schedule intensifies, checking the system build every day can seem like a waste of time. It is not. Under stress, developers make more mistakes: they feel a pressure to ship features that they do not feel under normal conditions, and they check their code with unit tests less carefully than usual. In such situations the code slides toward entropy much faster than in calmer ones.

Who benefits from this process? Some developers protest against daily builds, calling them impractical and too time-consuming. Yet all the large, complex systems of recent years have been built with daily builds and smoke tests. At the time of its release, Microsoft Windows NT 3.0 contained 5.6 million lines of code in 40,000 files. A full build took 19 hours and ran on multiple machines, yet the team still managed to build the system daily. As a professional team, the NT development group owes much of its success to daily builds. Developers who work on less complex projects and still do not take advantage of daily builds should think hard about what their excuse is.

Hey Habr! One day at our internal seminar, my supervisor, the head of the testing department, began his speech with the words "testing is not necessary." Everyone in the hall fell silent; some nearly fell off their chairs. He went on: without testing, it is quite possible to create a complex and expensive project, and most likely it will even work. But imagine how much more confident you would feel knowing that the product works as it should.

At Badoo, releases happen quite often. For example, the server side, along with the desktop web, is released twice a day. So we know firsthand that complex and slow testing is a stumbling block for development, while quick testing is a blessing. Today I will talk about how smoke testing is organized at Badoo.

What is smoke testing

This term was first used by stove-makers who, having assembled a stove, closed all the flues, lit it and watched to make sure that smoke came out only where it was supposed to. (Wikipedia)

In its classic form, smoke testing checks the simplest and most obvious cases, without which any other kind of testing would be pointless.

Let's look at a simple example. The pre-production version of our app lives at bryak.com (any resemblance to actual sites is purely coincidental). We have prepared and uploaded a new release for testing. What should we check first? I would start by making sure the app still opens at all. If the web server answers with a "200", everything is fine and we can start testing the functionality.

How do we automate this check? In principle, we could write a functional test that starts a browser, opens the desired page and makes sure it is displayed as it should be. However, this solution has a number of disadvantages. First, it is slow: launching the browser takes longer than the check itself. Second, it requires extra infrastructure: for the sake of such a simple test we would need to maintain a server with browsers somewhere. Conclusion: we need to solve the problem differently.
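
For illustration, here is a minimal sketch of such a browserless check (the URL is the fictional one from the example; in a real project it would come from configuration):

    <?php
    // Check that the pre-production host answers with HTTP 200 without starting a browser.
    $ch = curl_init('https://bryak.com');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not print the body to stdout
    curl_setopt($ch, CURLOPT_NOBODY, true);          // a HEAD-style request is enough for this check
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    echo $code === 200 ? "OK\n" : "FAILED: got HTTP $code\n";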

Our first smoke test

At Badoo, the back-end is mostly written in PHP. Unit tests, for obvious reasons, are written in it too, so we already have PHPUnit. In order not to multiply technologies unnecessarily, we decided to write the smoke tests in PHP as well. In addition to PHPUnit, we need the URL client library (libcurl) and the PHP extension for working with it, cURL.

In fact, the tests simply make the requests we need to the server and check the responses. Everything is tied to the getCurlResponse() method and several types of assertions.

The method itself looks like this:

public function getCurlResponse(
    $url,
    array $params = [
        'cookies'    => null,
        'post_data'  => null,
        'headers'    => null,
        'user_agent' => null,
        'proxy'      => null,
    ],
    $follow_location = true,
    $expected_response = '200 OK'
) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    if (isset($params['cookies']) && $params['cookies']) {
        $cookie_line = $this->prepareCookiesDataByArray($params['cookies']);
        curl_setopt($ch, CURLOPT_COOKIE, $cookie_line);
    }
    if (isset($params['headers']) && $params['headers']) {
        curl_setopt($ch, CURLOPT_HTTPHEADER, $params['headers']);
    }
    if (isset($params['post_data']) && $params['post_data']) {
        $post_line = $this->preparePostDataByArray($params['post_data']);
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $post_line);
    }
    if ($follow_location) {
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    }
    if (isset($params['proxy']) && $params['proxy']) {
        curl_setopt($ch, CURLOPT_PROXY, $params['proxy']);
    }
    if (isset($params['user_agent']) && $params['user_agent']) {
        $user_agent = $params['user_agent'];
    } else {
        $user_agent = USER_AGENT_DEFAULT;
    }
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
    $response = curl_exec($ch);
    $this->logActionToDB($url, $user_agent, $params);
    if ($follow_location) {
        $this->assertTrue(
            (bool)$response,
            "Empty response was received. Curl error: " . curl_error($ch) . ", errno: " . curl_errno($ch)
        );
        $this->assertServerResponseCode($response, $expected_response);
    }
    curl_close($ch);
    return $response;
}
The method returns the server response for a given URL. It accepts parameters such as cookies, headers, user agent and other data needed to build the request. When the response is received from the server, the method checks that the response code matches the expected one; if it does not, the test fails with an error saying so. This makes it easier to determine the cause of the failure: if the test fails on some assertion telling us that an element is missing from the page, the error is less informative than a message that the response code is, for example, "404" instead of the expected "200".
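
The assertServerResponseCode() helper is not shown in the article; a possible sketch (assuming, as in getCurlResponse(), that CURLOPT_HEADER is enabled and the status line is part of $response) might look like this:

    protected function assertServerResponseCode($response, $expected_response = '200 OK')
    {
        // Pull the status ("200 OK", "404 Not Found", ...) out of the first response line.
        preg_match('#^HTTP/\S+\s+(.+?)\r?$#m', $response, $matches);
        $actual = isset($matches[1]) ? trim($matches[1]) : 'no status line found';
        $this->assertSame(
            $expected_response,
            $actual,
            "Unexpected server response: expected '$expected_response', got '$actual'."
        );
    }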

When the request has been sent and the response received, we log the request so that, if the test fails or breaks, it is easy to reproduce the chain of events later. I will talk about this below.

The simplest test looks something like this:

public function testStartPage()
{
    $url = 'bryak.com';
    $response = $this->getCurlResponse($url);
    $this->assertHTMLPresent('<body', $response, 'Error: test cannot find body element on the page.');
}
This test takes less than a second. In that time we have verified that the start page responds with "200" and that there is a body element on it. We could just as well check any number of elements on the page; the duration of the test would not change significantly.
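
The assertHTMLPresent() helper is also not part of PHPUnit itself; in the simplest case it could be a thin wrapper over a substring assertion (a sketch, the real implementation at Badoo may differ):

    protected function assertHTMLPresent($expected_html, $response, $message = '')
    {
        // A piece of HTML is "present" if the raw response contains it as a substring.
        $this->assertStringContainsString($expected_html, $response, $message);
    }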

The advantages of such tests:

  • speed - the test can be run as often as needed. For example, for every code change;
  • do not require special software and hardware to work;
  • they are easy to write and maintain;
  • they are stable.
Regarding the last point: I mean no less stable than the project itself.

Authorization

Let's imagine that three days have passed since we wrote our first smoke test. Of course, during this time we have covered with tests all the unauthorized pages we could find. We sat for a while and rejoiced, but then realized that all the most important things in our project live behind authorization. How do we test those pages too?

The simplest option is an authorization cookie. If we add it to the request, the server "recognizes" us. Such a cookie can be hardcoded in the test if its lifetime is long enough, or it can be obtained automatically by sending requests to the authorization page. Let's take a closer look at the second option.

We are interested in the form where you need to enter the username and password of the user.

Open this page in any browser and open the inspector. We enter user data and submit the form.

A request appears in the inspector, and this is the request we need to simulate in the test. There you can see what data, besides the obvious login and password, is sent to the server. It differs from project to project: it can be a token, data from cookies received earlier, a user agent, and so on. Each of these parameters has to be obtained in the test before forming the authorization request.

In the developer tools of any browser you can copy the request by selecting "Copy as cURL". In this form the command can be pasted into the console and examined there; in the same place it can be tried out by changing or adding parameters.

In response to such a request, the server will return cookies to us, which we will add to further requests in order to test authorized pages.

Since authorization is a rather slow process, I suggest obtaining the authorization cookie only once per user and storing it somewhere. For example, we store such cookies in an array: the key is the user's login, and the value is their cookie data. If there is no key for the next user yet, we log in; if there is, we immediately make the request we are interested in.

public function testAuthPage()
{
    $url = 'bryak.com';
    $cookies = $this->getAuthCookies('[email protected]', '12345');
    $response = $this->getCurlResponse($url, ['cookies' => $cookies]);
    $this->assertHTMLPresent('<body', $response, 'Error: test cannot find body element on the page.');
}
As you can see, a method has been added that obtains the authorization cookie and simply adds it to the next request. The method itself is implemented quite simply:

public function getAuthCookies($email, $password)
{
    // check if the cookie has already been obtained
    if (array_key_exists($email, self::$known_cookies)) {
        return self::$known_cookies[$email];
    }
    $url = self::DOMAIN_STAGING . '/auth_page_adds';
    $post_data = ['email' => $email, 'password' => $password];
    $response = $this->getCurlResponse($url, ['post_data' => $post_data]);
    $cookies = $this->parseCookiesFromResponse($response);
    // save the cookie for further use
    self::$known_cookies[$email] = $cookies;
    return $cookies;
}
The method first checks whether an authorization cookie has already been obtained for the given e-mail (in your case it could be a login or something else). If it has, the method returns it. If not, it makes a request to the authorization page (for example, bryak.com/auth_page_adds) with the required parameters: the user's e-mail and password. In response to this request the server sends headers, among which are the cookies we are interested in. It looks something like this:

HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: name=value; expires=Wed, 30-Nov-2016 10:06:24 GMT; Max-Age=-86400; path=/; domain=bryak.com
From these headers, using a simple regular expression, we need to extract the cookie's name and value (in our example, name=value). Our code that parses the response looks like this:

$this->assertTrue(
    (bool)preg_match_all("/Set-Cookie: (([^=]+)=([^;]+);.*)\n/", $response, $mch1),
    'Cannot get cookies from server response. Response: ' . $response
);
Once the cookies are received, we can safely add them to any request to make it authorized.

Analyzing failing tests

It follows from the above that such a test is a chain of requests to the server: we make a request, manipulate the response, make the next request, and so on. A thought creeps in: if such a test fails on the tenth request, it may be difficult to figure out why. How can we make life easier for ourselves?

First of all, I would advise making tests as atomic as possible. You should not check 50 different cases in one test. The simpler the test, the easier it will be to deal with later.

It is also useful to collect artifacts. When a test fails, it saves the last server response to an HTML file and uploads it to the artifact storage, where this file can be opened in a browser by specifying the name of the test.
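
In PHPUnit this can be hooked up via onNotSuccessfulTest(); the sketch below assumes the test keeps the last server response in a $last_response property and that ARTIFACTS_DIR points at the shared artifact storage (both names are illustrative):

    protected function onNotSuccessfulTest(Throwable $t): void
    {
        if (!empty($this->last_response)) {
            // Save the last response so it can be opened from a browser by the test name.
            $file = ARTIFACTS_DIR . '/' . $this->getName() . '.html';
            file_put_contents($file, $this->last_response);
        }
        parent::onNotSuccessfulTest($t);  // rethrows, so the test is still reported as failed
    }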

For example, suppose our test failed because it cannot find a piece of HTML on the page, say a link element.

We go to our artifact storage and open the corresponding page:

You can work with this page in the same way as with any other HTML page in a browser. You can use a CSS locator to try to find the missing element and, if it really is not there, conclude that it has either changed or been lost. We may have found a bug! If the element is in place, we probably made a mistake somewhere in the test and need to look carefully in that direction.

Logging also makes life easier. We try to log all the requests the failed test made so that they can easily be repeated. Firstly, this lets you quickly reproduce the error by hand with the same set of actions, and secondly, it helps identify frequently failing tests, if we have any.

In addition to helping with error analysis, the logs described above help us build a list of the authorized and unauthorized pages we have tested. Looking at it, it is easy to find and close gaps in coverage.
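
The logActionToDB() method called from getCurlResponse() is not shown in the article either; a simplified sketch that logs to a file instead of a database (the file path and record format are assumptions) could be:

    protected function logActionToDB($url, $user_agent, array $params)
    {
        $record = [
            'time'       => date('c'),
            'test'       => $this->getName(),
            'url'        => $url,
            'user_agent' => $user_agent,
            'params'     => $params,
        ];
        // One JSON line per request makes it easy to replay the chain of events later.
        file_put_contents('/tmp/smoke_requests.log', json_encode($record) . PHP_EOL, FILE_APPEND);
    }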

Last but not least, I can advise making the tests as convenient as possible. The easier they are to run, the more often they will be used. The clearer and more concise the failure report, the more carefully it will be studied. The simpler the architecture, the more tests will be written and the less time it will take to write a new one.

If it seems to you that the tests are inconvenient to use, most likely it is not just your imagination. Deal with it as soon as possible. Otherwise you risk starting to pay less attention to these tests at some point, and that can already lead to an error slipping into production.

In words the idea sounds obvious, I agree. But in practice we all still have plenty to strive for. So simplify and optimize your creations and live without bugs. :)

Results

At the moment we have... *opens TeamCity* wow, already 605 tests. All of the tests, if not run in parallel, finish in a little under four minutes.

During this time we make sure that:

  • our project opens in all languages (of which we have more than 40 in production);
  • for the main countries, the correct forms of payment are displayed with the corresponding set of payment methods;
  • the main requests to the API work correctly;
  • the landing page for redirects works correctly (including to a mobile site with the appropriate user agent);
  • all internal projects are displayed correctly.
Selenium WebDriver tests for all of this would require many times more time and resources.

After making the necessary changes, such as fixing a bug or defect, the software should be retested to confirm that the problem has indeed been fixed. Listed below are the types of testing performed after the software has been installed to confirm that the application works or that the defect has been corrected:

- Smoke testing (Smoke Testing)

- Regression testing (Regression Testing)

- Build testing (Build Verification Test)

- Sanity testing or consistency/health check (Sanity Testing)

The concept of smoke testing came from engineering: when commissioning new equipment ("hardware"), the test was considered successful if no smoke came out of the installation. In software testing, smoke testing is aimed at a shallow check of all application modules for basic operability and for quickly discovered critical and blocking defects. Based on the results of smoke testing, a conclusion is drawn as to whether the installed version of the software is accepted for testing, operation or delivery to the customer. To make the work easier and save time and human resources, it is recommended to automate the smoke test scenarios.

Regression testing (Regression Testing) is a type of testing aimed at verifying changes made to an application or its environment (fixing a defect, merging code, migrating to another operating system, database, web server or application server) to confirm that pre-existing functionality works as before (see also Sanity testing, or consistency/health check). Regression tests can be both functional and non-functional.

As a rule, test cases written at the early stages of development and testing are used for regression testing. This ensures that changes in the new version of the application do not break existing functionality. It is recommended to automate regression tests to speed up the subsequent testing process and detect defects in the early stages of software development.

The term “regression testing” itself can mean different things depending on the context. Cem Kaner, for example, described three main types of regression testing:

- Bug regression - an attempt to prove that a fixed bug has not actually been fixed.

- Old bugs regression - an attempt to prove that a recent change in code or data has broken the fix of old bugs, i.e. old bugs have reappeared.

- Side effect regression - an attempt to prove that a recent code or data change has broken other parts of the application being developed.

Sanity testing, or consistency testing (Sanity Testing), is a narrowly focused test sufficient to prove that a particular function works according to the requirements stated in the specification. It is a subset of regression testing. It is used to determine the health of a particular part of the application after changes have been made to it or to its environment. It is usually performed manually.

The difference between sanity testing and smoke testing. Some sources mistakenly believe that sanity and smoke testing are the same thing. We believe that these types of testing have "vectors" pointing in different directions: unlike smoke testing, sanity testing goes deep into the function under test, while smoke testing goes broad, trying to cover as much functionality as possible in the shortest possible time.

Build testing (Build Verification Test) is testing aimed at determining whether the released version meets the quality criteria required to start testing. In its goals it is analogous to smoke testing: accepting a new version for further testing or operation. It can go deeper, depending on the quality requirements for the released version.

Installation testing (Installation Testing) is aimed at verifying successful installation and configuration of the software, as well as its updating or uninstallation. Currently, the most common way to install software is with installers (special programs that themselves also require proper testing). In real conditions there may be no installer; in that case you have to install the software yourself, using documentation in the form of instructions or readme files that describe step by step all the necessary actions and checks. In distributed systems, where the application is deployed onto an already running environment, a simple set of instructions may not be enough. For such cases a deployment plan is often written, which includes not only the steps for installing the application but also the steps for rolling back to the previous version in case of failure. The installation plan itself must also go through a testing procedure to avoid problems when releasing into actual operation. This is especially true if installation is performed on systems where every minute of downtime means lost reputation and a large amount of money, for example banks, financial companies or even banner networks. Therefore installation testing can be called one of the most important tasks in ensuring software quality.

It is this integrated approach with writing plans, step-by-step installation verification and installation rollback that can rightly be called installation testing or Installation Testing.