Test & Performance - International PHP Conference

Professional Test Management with TestRail – Part 2

IPC editorial team — Fri, 06 Oct 2023 12:28:46 +0000

Testing is more than just running the tests! We have already explained this statement in detail in Part 1 of our series.

The simplified flow from “Test Case Management” to “Test Planning”, “Test Execution” up to the “Final Reports” shows the wide spectrum of activities in the QA environment.

Tools such as “TestRail” exist so that these activities can be carried out in a controlled manner.

This test management software allows us to create a clean, filterable test catalog with detailed instructions for executing the tests.

In Part 1 we created such a test catalog together, and we now use it for further planning.

Now that we have developed our tests and entered them in TestRail as cleanly as possible, it is time to prepare them for execution by means of “Test Runs”.

Creating Test Plans

There are several options available in TestRail for planning. The most basic option is to create a simple test run (“Test Run”), which we can create under the menu item “Test Runs & Results”. This test run will later contain various tests from the catalog, selected either manually or automatically via filtering.

However, if we want a more structured approach, TestRail also offers the possibility to create a test plan. A Test Plan can contain any number of Test Runs, allowing for thematic structuring or subdivision.

Test Plans and Test Runs can be combined in different ways. In the strict sense of the term, a Test Run is a single run with a final result. A Test Plan could then contain several such runs, repeated until everything is finally OK and the feature can be accepted.

Another variant is a test plan that contains different test runs covering different topics. For example, one test run might contain all the automated Cypress tests, another the smoke and sanity tests, and a third the regression tests or tests for new features. This split often helps as a visual overview, and it can also be used to assign runs to different testers on the team. In this case, the test runs would remain open, or be repeated, until everything is ultimately OK.

Before we actually create a test plan, we should look at the “Milestones” section in the main menu. All test plans and test runs can be assigned to milestones. Milestones thus provide a rough subdivision, which can be defined at your own discretion or in coordination with project management.

Now we create our test plan and three test runs, one each for “Cypress Tests”, “Smoke and Sanity” and “Regression Tests”.
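The same structure can also be created through the TestRail API instead of the UI. The following is a minimal sketch, assuming the `add_plan` endpoint of API v2 and its `entries` field. The helper name `buildPlanPayload`, the plan name, the suite ID and the case IDs are hypothetical placeholders. The code only builds the request body and sends nothing.

```javascript
// Hypothetical helper: builds the JSON body for
// POST index.php?/api/v2/add_plan/{project_id}.
// Each entry becomes one test run inside the plan.
function buildPlanPayload(planName, suiteId, runs) {
  return {
    name: planName,
    entries: runs.map((run) => ({
      suite_id: suiteId,
      name: run.name,
      // include_all: false means only the listed case_ids are part of the run
      include_all: false,
      case_ids: run.caseIds,
    })),
  };
}

// Placeholder values; real IDs come from your own TestRail project.
const payload = buildPlanPayload('Release 1.2.0', 1, [
  { name: 'Cypress Tests', caseIds: [123, 54] },
  { name: 'Smoke and Sanity', caseIds: [36] },
  { name: 'Regression Tests', caseIds: [77, 78, 79] },
]);

console.log(JSON.stringify(payload, null, 2));
```

Keeping the payload construction separate from the HTTP call makes it easy to review which tests end up in which run before anything is created in TestRail.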

When creating a single test run, we have several options for selecting the tests. We can choose to add all tests, only certain manually selected tests, or use dynamic filtering to make the selection.

In case of manual selection, a window opens with an overview of our test catalog. Here we can navigate through our structured sections and select desired tests by simply ticking them. After clicking “OK”, the selected tests are applied to the test run.

When using dynamic filtering, we also see a modal. On the right side we have the possibility to specify different filter settings. Depending on how extensive the list is, we need to make sure to click on the “Set Selection” button at the bottom (scrolling may be required). Only then will TestRail highlight the appropriate tests based on our filtering. The rest of the process is the same as for manual selection.

If you now think that this is all TestRail has to offer, you are mistaken. The editing view of a test plan provides many more useful functions. The “Configurations” button opens a small window where we can create groups and configurations. Based on the selected combinations, the prepared tests are duplicated for each specified configuration. For example, we could create groups for browsers, operating systems and devices. The configurations could then be “Chrome”, “Firefox”, or “Windows 11”, “macOS”, etc. We then select which combinations we want to test. After we confirm this, we have a separate test run for each combination, which we can customize or even remove. Of course, it is also possible to assign each Test Run to a different tester in the system.
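To get a feeling for how quickly configurations multiply, the combinations can be sketched as a cartesian product over the groups. This is only an illustration of the idea, not TestRail's implementation. The function name `combineConfigurations` and the group values are assumptions chosen for the example.

```javascript
// Computes every combination of one value per configuration group
// (a cartesian product). Group names and values are examples only.
function combineConfigurations(groups) {
  return Object.values(groups).reduce(
    (combos, values) => combos.flatMap((combo) => values.map((v) => [...combo, v])),
    [[]]
  );
}

const combinations = combineConfigurations({
  browser: ['Chrome', 'Firefox'],
  os: ['Windows 11', 'macOS'],
});

// One test run would exist per selected combination.
combinations.forEach((combo) => console.log(combo.join(', ')));
```

Two browsers and two operating systems already yield four runs, which is why deselecting irrelevant combinations is worth the extra click.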

So with all these features, we have flexible options to find our own customized approach for a project and a way of working.

At the end of the day, it is crucial to have a clear overview of the tests and be able to quickly provide feedback on the current status.

Test Execution

Now we finally get to the execution of our tests. Depending on the strategy and approach, this can be done either during the project, or classically at the end. Combinations are also possible if there are sufficient resources.

To start a run, we simply go to the detail page of the desired test run. This page gives us an efficient overview with statistics, sections and filtering options. A simple master-detail navigation shows the list of tests on the left side and the details of the currently selected test on the right side.

For each test, multiple results can be recorded here. To do this, we simply click on the drop-down menu of the status (e.g. “untested”) in the list or on “Add result” within the details page. We can pre-select anything without consequence, such as “passed”, as a separate window will open anyway where we can adjust the results again. This may seem unexpected at first, but it is easy to learn. Basically, it is up to us which view we want to use to test. The most important thing is to read the steps carefully. However, the modal offers the advantage of marking steps already performed as “passed” to keep track of them, and it also allows us to record times, which can be interesting for planning future test runs.

Once we have captured the result of the test, TestRail does an excellent job of logging. In addition to the status (Passed, Blocked, Retest, Failed), the modal contains a comment function and fields for build number, version, etc. These can be expanded with additional fields as needed. A particularly interesting area concerns defects. Here we can not only enter reference numbers (i.e. ticket IDs), but also create tickets directly in Jira, as long as Jira is connected to TestRail. So if we find a bug in the software, we can create a Jira ticket directly from TestRail, and the ticket ID is automatically linked to the test result. This allows QA teams to track the current status of Jira tickets directly in TestRail and see when a feature can be retested, independent of project management and developers. Within Jira, the ticket shows all relevant information from TestRail, and the template used for this can be edited in TestRail. In this way, developers also receive all the necessary information.
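The fields of the result modal map closely to what a result looks like when it is submitted through the API. The following sketch assumes the `add_result_for_case` endpoint of API v2 and the IDs of the built-in statuses. The helper name `buildResultPayload`, the comment text and the ticket ID `SHOP-42` are invented for illustration.

```javascript
// IDs of TestRail's built-in result statuses.
const STATUS = { passed: 1, blocked: 2, untested: 3, retest: 4, failed: 5 };

// Hypothetical helper: builds the body for
// POST index.php?/api/v2/add_result_for_case/{run_id}/{case_id}
function buildResultPayload({ status, comment, version, elapsed, defects = [] }) {
  const statusId = STATUS[status];
  // "untested" is the initial state and cannot be submitted as a result.
  if (typeof statusId !== 'number' || status === 'untested') {
    throw new Error(`Unsupported result status: ${status}`);
  }
  return {
    status_id: statusId,
    comment,
    version,
    elapsed,                    // time span string such as "5m" or "1m 30s"
    defects: defects.join(','), // comma-separated ticket IDs
  };
}

const result = buildResultPayload({
  status: 'failed',
  comment: 'Checkout button does not react in step 3.',
  version: '1.2.0',
  elapsed: '5m',
  defects: ['SHOP-42'], // placeholder Jira ticket ID
});

console.log(result);
```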

Traceability and Reports

TestRail provides a comprehensive range of reporting options to monitor progress and test coverage. You can compare results from different test runs, configurations and milestones. These reports can be automatically generated with a schedule and shared with both internal team members and external stakeholders, including the ability to generate reports as PDFs.

Learning TestRail’s reporting features may take some time, but once the various options are understood, the reports can be customized in many ways to meet the team’s needs.

In addition to generated reports, TestRail also offers real-time reports. These can be found at the project level, milestone level, test plan level and test run level.
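The numbers behind such a real-time overview are simple to reproduce. As a rough sketch, assume a run object with the per-status counters that the API returns for a test run (`passed_count`, `blocked_count`, `untested_count`, `retest_count`, `failed_count`). Custom statuses are ignored here, and the function name `summarizeRun` and the example numbers are made up.

```javascript
// Summarizes a run from its per-status counters.
function summarizeRun(run) {
  const total =
    run.passed_count + run.blocked_count + run.untested_count +
    run.retest_count + run.failed_count;
  const executed = total - run.untested_count;
  // Percentage of the total, rounded to whole numbers; 0 for an empty run.
  const pct = (n) => (total === 0 ? 0 : Math.round((n / total) * 100));
  return { total, executed, progress: pct(executed), passRate: pct(run.passed_count) };
}

// Example numbers, not real project data.
const summary = summarizeRun({
  passed_count: 12,
  blocked_count: 1,
  untested_count: 5,
  retest_count: 0,
  failed_count: 2,
});

console.log(summary);
```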

In the area of traceability, TestRail provides the ability to assign external reference IDs, for example a Jira ticket ID. If Jira has also been linked correctly, hovering over the reference even opens a tooltip with information directly from Jira. This makes it possible to assign several tests to one Jira ticket (e.g. an Epic). The link can be used for corresponding evaluations, but also for simple filtering when creating test plans.

TestRail API

TestRail has an extremely comprehensive HTTP-based API, which enables the creation of a wide range of interfaces. Using this API, we can retrieve test cases, create new test results, send attachments, and perform basic tasks such as creating test runs and editing configurations.
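As a minimal sketch of what a call against this API looks like, the following helper describes an authenticated request without sending it. It assumes the API v2 URL scheme (`index.php?/api/v2/...`) and HTTP Basic Auth. The helper name `buildRequest`, the domain, the credentials and the case ID are placeholders.

```javascript
// Hypothetical helper: describes an authenticated TestRail API v2 request
// without sending it. The API uses HTTP Basic Auth
// (user name plus password or API key).
function buildRequest({ domain, username, password }, method, endpoint, body) {
  const credentials = Buffer.from(`${username}:${password}`).toString('base64');
  return {
    url: `https://${domain}/index.php?/api/v2/${endpoint}`,
    options: {
      method,
      headers: {
        Authorization: `Basic ${credentials}`,
        'Content-Type': 'application/json',
      },
      // GET requests carry no body
      body: body === undefined ? undefined : JSON.stringify(body),
    },
  };
}

// Placeholder account data
const account = { domain: 'my-company.testrail.io', username: 'myUser', password: 'myPwd' };
const request = buildRequest(account, 'GET', 'get_case/123');

console.log(request.url);
```

On Node.js 18 or newer, the request could then be sent with `fetch(request.url, request.options)`.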

TestRail provides its own GitHub repository with API bindings for PHP, Java, .NET, Ruby and more.

Based on this API, we can now integrate a plugin for our test automation and submit results directly from Cypress to TestRail.

Cypress and TestRail

There are various reasons for pursuing test automation: resource constraints, avoiding repetitive steps, or securing critical areas of the application that are often error-prone.

To begin automation with Cypress, let’s create a Cypress project. Since the focus of this article is on TestRail, we will not go further into the implementation of Cypress tests here. The crucial point is the actual integration of our plugin.

First, we select a test from our test catalog. In collaboration with QA and development team (or Test Automation Engineers), a kick-off is conducted to take a closer look at the desired test and its behavior. After the test is implemented in Cypress, it is reviewed accordingly. If everything fits, we can mark the test as “automated” in TestRail. This will give us a better overview in the future of which tests are automated, and therefore no longer need to be tested manually.

But how do the results from Cypress get into TestRail? Quite simply – via an appropriate plugin based on the TestRail API. We install a compatible plugin like the “Cypress TestRail Integration” [https://github.com/boxblinkracer/cypress-testrail].

The configuration is relatively simple and uses the “setupNodeEvents” function that Cypress provides in its configuration file.

e2e: {
  setupNodeEvents(on, config) {
    return require('./cypress/plugins/index.js')(on, config)
  },
},

This snippet delegates to our manually created “index.js” file, which contains the actual registration of the plugin. Of course, this step can also be done inline.

const TestRailReporter = require('cypress-testrail');

module.exports = (on, config) => {
  new TestRailReporter(on, config).register();
  return config;
};

After this is done, only two simple steps are left. First, we need a configuration for our TestRail instance, and second, we need to link our Cypress test to the corresponding test in TestRail.

Let’s start with the configuration. We have several options to do this. Either we create a “cypress.env.json” file or work directly with environment variables, for example in the CI/CD section.

The plugin offers two basic ways to send results to TestRail. It is possible to send the results directly to an existing and prepared test run, or to have new runs created dynamically. The appropriate approach can vary depending on the team and the project, and the plugin supports both.

The following example shows a JSON file that sends results to a defined Test Run:

{
  "testrail": {
    "domain": "my-company.testrail.io",
    "username": "myUser",
    "password": "myPwd",
    "runId": "R123"
  }
}

After the connection is configured, we just need to map our Cypress test to the appropriate TestRail test. This is done via a simple mapping in the test title using the ID from TestRail. The TestRail ID is displayed next to each test case and always starts with a “C”. It is also possible to link multiple Test Cases to a single Cypress Test.

it('C123: My Test for TestRail case 123', () => {
  // ...
  // ...
})

it('C123 C54 C36: My Test for multiple TestRail case IDs', () => {
  // ...
  // ...
})
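To make the mapping convention more tangible, here is a small sketch of how case IDs could be read from such a title. It only illustrates the naming scheme. The function `extractCaseIds` is hypothetical, and the plugin's real parsing logic may differ.

```javascript
// Illustrative only: extracts TestRail case IDs from the part of a
// test title that precedes the first colon.
function extractCaseIds(title) {
  const [prefix] = title.split(':');
  return (prefix.match(/\bC\d+\b/g) || []).map((id) => Number(id.slice(1)));
}

console.log(extractCaseIds('C123: My Test for TestRail case 123'));
console.log(extractCaseIds('C123 C54 C36: My Test for multiple TestRail case IDs'));
```

A title without any “C” prefix yields an empty list, which corresponds to a Cypress test that is not linked to TestRail.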

That’s all. Now when we start Cypress in “run” mode, we see a hint about our integration and its configuration at the beginning. After a spec file is processed in Cypress, the results of the tests performed in it are finally sent to TestRail.

The integration offers many more options, such as uploading screenshots, adding more metadata and much more.

Conclusion

Testing is more than just running tests. To get the multitude of necessary tasks sorted out, test management tools like “TestRail” help us. TestRail offers a powerful test management solution that covers the entire quality management process, from test case creation to reporting. With features for structuring test catalogs, flexible test plans and comprehensive reporting, it enables efficient test management.

TestRail’s seamless integration with other tools, such as Jira, facilitates collaboration between test and development teams. In addition, the comprehensive API enables integration with automation software such as Cypress.

Overall, TestRail provides a comprehensive solution to streamline the QA process and deliver high-quality software products.

Links & Literature

https://www.testrail.com/

https://github.com/gurock/testrail-api

https://github.com/boxblinkracer/cypress-testrail

The post Professional Test Management with TestRail – Part 2 appeared first on International PHP Conference.

Professional Test Management with TestRail – Part 1

IPC editorial team — Tue, 26 Sep 2023 12:55:42 +0000

But what is the problem with testing? It lies not in testing itself, but in the perception that testing can be done quickly and at short notice. However, professional quality management encompasses much more than just testing. It starts at the very beginning of the project, and over its duration provides answers to questions such as the coverage of planned tests, the progress of the project, the number of known defects, and much more.

Tools are available to us for exactly these tasks, so-called test management applications. In this article, we will take a look at the application “TestRail”, and learn what possibilities such software offers us, and how we can use it.

However, before we get into the details, it is important to consider what is actually meant by the term “testing” and what tasks are associated with it.

What does professional testing mean?

What does testing actually mean? According to the guidelines of the ISTQB (International Software Testing Qualifications Board), testing includes:

The process consisting of all lifecycle activities (both static and dynamic) that deal with planning, preparation, and evaluation of a software product and associated deliverables.

This definition is undoubtedly based on a broad focus on all activities, which means that testing encompasses much more than simply running tests.

If we take a closer look at the start of a new project, it is common knowledge that project management, technical leads and other stakeholders work with customers to create project plans, divide them into work packages and release them for development. What is often neglected, however, is the role of testers in this crucial planning phase of the project.

In the area of quality management or quality assurance, one or more test concepts are developed in the professional approach at the beginning of the project. These test concepts sometimes deal with seemingly simple questions, which, however, play a central role in the development of test cases.

What are the goals of our testing? Do we want to build trust in the software, or just minimize risks? Evaluate conformance, or simply prove the impact of defects? What documents do we create for our tests? What forms the basis of our tests (concepts, specifications, instructions, functions of the predecessor software)? Which test environments are available, when will they be implemented, and which approaches and methods do we use to develop test cases?

For those who have now had their “aha” moment, it should be added that such test concepts can be elaborated for each test level of the V-Model. For example, in component testing we usually strive for things like unit tests, code coverage and white-box testing, while in system testing, black-box methods are increasingly used for test case development (equivalence classes, decision tables, etc.). In addition, system testing may already validate instead of merely verifying. Validation deals with making sense of the result (does the feature really solve the problem?), while verification refers to checking requirements (does it work according to the requirement?).

Due to the considerable amount of information and the number of work steps according to ISTQB (yes, that was by far not all), I would like to divide these into four simple areas:

Test Case Management
Test Planning
Test Execution
Final Reports

This clear structure makes it possible to manage the complexity of the testing process and to ensure that all necessary steps are carried out carefully.

Testing in a Software Project

To facilitate the later use of TestRail, let’s now take a rough look at the flow of a project, using the points simplified above.

After the test base (requirements, concepts, screenshots, etc.) has been defined, various test concepts have been generated, and appropriate kick-off meetings have taken place, it is the responsibility of the testers to develop appropriate test cases. These tests essentially provide step-by-step guidance on how to perform them, whether on a purely written or even visual basis.

Those who have done this before know that there are few templates and limitations in this regard. These range from simple functional tests, such as technical API queries, to extensive end-to-end scenarios, such as a complete checkout process in an e-commerce system, including payment (in test mode).

A key factor in test design is recognizing that quantity does not necessarily mean quality. It makes little sense to have 1000 tests that cannot possibly be run manually over and over again due to scarce capacity. It makes much more sense to create fewer tests, but with a large number of implicit tests so that they automatically test additional peripheral aspects of the actual case, if possible.

Now that a list of tests has been created, it is of course useful if it can be filtered. Therefore, the carefully compiled test catalog is additionally categorized. The so-called “Smoke & Sanity” tests comprise a small number of tests that are so critical that they should be tested with every release. Simple regression tests, in turn, provide an overview of optionally testable scenarios that can be rerun as needed (suspected side effects, etc.).

The list of these categories differs from company to company, as there is no official standard. Ultimately, the most important thing is the ability to easily filter based on requirements. Of course, there are many other interesting filtering options, such as a reference Jira ticket ID for the Epic covered in the test, or possibly specific areas of the software such as “Account”, the “Checkout” or the “Listing” in e-commerce projects.
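The value of such categories becomes clear once we treat the catalog as data. The following sketch uses a tiny in-memory catalog to show filtering by category, area and reference. All entries, field names and ticket IDs are invented for the example and do not reflect a particular tool's data model.

```javascript
// A tiny in-memory catalog; all entries are invented examples.
const catalog = [
  { id: 1, title: 'Login with valid credentials', type: 'smoke', area: 'Account', refs: ['SHOP-10'] },
  { id: 2, title: 'Checkout with test payment', type: 'smoke', area: 'Checkout', refs: ['SHOP-20'] },
  { id: 3, title: 'Change billing address', type: 'regression', area: 'Account', refs: ['SHOP-10'] },
  { id: 4, title: 'Sort product listing by price', type: 'regression', area: 'Listing', refs: [] },
];

// Returns the tests that match every given criterion; omitted criteria are ignored.
function filterCatalog(tests, { type, area, ref } = {}) {
  return tests.filter(
    (t) =>
      (type === undefined || t.type === type) &&
      (area === undefined || t.area === area) &&
      (ref === undefined || t.refs.includes(ref))
  );
}

const smokeTests = filterCatalog(catalog, { type: 'smoke' });
const accountEpic = filterCatalog(catalog, { area: 'Account', ref: 'SHOP-10' });

console.log(smokeTests.map((t) => t.id), accountEpic.map((t) => t.id));
```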

Now that the test catalog has been generated, the question is whether we should directly test it in full. The answer is yes and no! Here it depends on what is crucial for the project management and the stakeholders, i.e. what kind of report they ultimately need.

Therefore, we can create test plans that include either all tests, or only a subset of them. Usually, for example, before a release for a plugin (typically with semantic versioning v1.x.y, …) all “smoke & sanity” tests are tested, as well as some selected tests for new features and old features. Although it would of course be ideal to run all tests, this is unfortunately often unrealistic, depending on team size and time pressure. A relaunch project that is created from scratch should of course be fully tested before final acceptance. However, for a more economical way of working (shift-left), it is possible to plan various test plans for the already completed areas of the software earlier. Thus, tests for the “account” area of an online store could be started before the “checkout” area is testable. This gives an earlier result and also provides a cheaper way to fix bugs (the earlier in development the cheaper). However, this is still a gamble, as side effects could still occur due to integration errors at the end of the project. Thus, additional testing at the end is always advisable.

Planning test executions thus involves selecting and compiling tests from our test catalog, taking into account various factors such as their importance, significance, priority and feasibility.

After the test plans have been created and the work packages have been put into a testable state, perhaps the simplest but most visible step in the QA process begins: the execution of the tests. This step can be quite straightforward, depending on the quality of the prepared tests, but it always requires a step-by-step approach. (A small tip: in addition to running these tests, freer and exploratory testing is also recommended to uncover additional paths and bugs).

During test execution, however, it is critical to log results as accurately as possible. This includes capturing information such as screen sizes, devices used, browsers used, taking screenshots and recording the ticket ID of the work package, and more. Such logging is necessary for tracking and makes troubleshooting much easier for developers.

After the tests have been run, it’s time to create the final reports. Stakeholders and other involved parties naturally want to know what the status of the project is. Among other things, they are interested in the test coverage, the number of critical issues found, and whether they might suggest a premature go-live of the application. The creation of reports is therefore an essential step in the QA process, as they form the basis for decisions and consequences for the entire project.

Fortunately, in order not to lose track of all these tasks, tools and applications are available. Although in theory simple documents based on Word and Excel can suffice, professional test management applications provide a much more efficient and organized workspace for the entire team.

A leading tool in this field is “TestRail”.

Test Management with TestRail

TestRail, developed by Frankfurt-based Gurock Software, is characterized by its specialization in highly efficient and comprehensive solutions for QA teams. Its offerings range from comprehensive test management capabilities to the creation of detailed test plans, precise execution of tests, meticulous logging and extensive reporting. And for those who want to go even further, TestRail offers an extensive API that can be used to develop custom integrations to further customize and optimize the QA process.

When visiting the TestRail website, it quickly becomes clear that there is more than just software on offer here. TestRail’s content team continuously publishes interesting articles on the subject of testing, which offer real added value thanks to their practical and technically appealing content.

TestRail itself can be used either as a cloud solution or via an on-premise installation. The cloud variant offers a comprehensive solution at quite affordable prices, around EUR 380 per user per year. For those who want additional functions, the Enterprise Cloud version is available for around EUR 780 per user per year. This includes single sign-on, extended access rights, version control of tests and much more.

Installation on your own servers is more expensive, at about EUR 7,700 to EUR 15,620 per year, but it already includes a large contingent of users and can be a suitable solution, especially for larger teams and companies.

Once you have chosen a version, such as the cloud solution, it can be used after a short registration.

Create a project

Let’s start by creating a new project in TestRail. In addition to the project title and access rights, there are settings related to Defects and References, which will be discussed in more detail later in this article. Through these two functions, it is possible to link applications such as Jira with TestRail and get smooth navigation as well as a preview of linked Defect tickets or even Epic tickets (references).

Probably the most interesting and important area concerns the type of project we are creating. Here, TestRail offers us three different options for structuring our test catalog.

The user-friendly “Single Repository” option allows us to create a simple and flexible test catalog that can be divided into sections and subsections.

The “Single Repository with Baseline Support” option allows us to keep the simplicity of the first model, but create different branches and versions of test cases. This is especially useful for teams that need to test different product versions simultaneously.

The third variant offers the possibility to use different test catalogs to organize the tests. Test catalogs can be used for functional areas or modules of the application. This type of project is more suitable for teams that need a stricter division of the different areas. A consequence of this is that test executions can only ever include tests from a single test catalog.

For our project launch and greater flexibility, we choose the “Single Repository” type.

Create tests

After the project is created, we are taken to an overview page. Here, at a later stage of the project phase, we will find more useful information.

Now it is time to create our first test. To do this, we open the “Test Cases” section in the project navigation.

On this page we see the currently still empty test catalog. Our task now is to create an appropriate number of tests that are optimally structured and filterable for us.

TestRail offers a variety of options for organizing test cases. In addition to filterable properties, we can also create a hierarchical structure by using sections. There are no hard and fast rules on how this should be done.

We can use sections for different areas of the application like “Checkout” or “Account”, or create them for individual features. The author often finds it helpful to use sections to break down the application by area or feature, as these can be used later as a guide when creating test plans.

Regardless of whether we decide to use sections or not, the next step is to create our first test.

Looking at the input screen, we notice that a lot of emphasis has been placed on relevant information here.

We have the option to define various properties, such as the type of test (smoke, regression, etc.), priority, automation type and much more. If these options are not enough, we can easily create and add new fields through the administration.

When we define the instructions of a test, we have the option to use one of several templates. Besides the variant with a free text field, we also have a template for step-by-step instructions. With the latter, we can define any number of steps with sequences and expected intermediate results. This not only offers the advantage of clear instructions, but also allows us to specify exact results for each step. This way, we can later immediately see from which step an error occurred.
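Why per-step results are so useful can be shown with a short sketch. The field names below (`content`, `expected`, `status_id`) are modeled on how TestRail represents separated steps and their results. The data and the helper `firstFailedStep` are invented for illustration.

```javascript
// IDs of the built-in statuses used in this example.
const PASSED = 1;
const UNTESTED = 3;
const FAILED = 5;

// Step results as they might be recorded for a step-by-step test (example data).
const stepResults = [
  { content: 'Add a product to the cart', expected: 'Cart shows 1 item', status_id: PASSED },
  { content: 'Proceed to checkout', expected: 'Address form is shown', status_id: PASSED },
  { content: 'Pay with the test payment method', expected: 'Order confirmation is shown', status_id: FAILED },
  { content: 'Open the order in the account area', expected: 'Order is listed', status_id: UNTESTED },
];

// Returns the first failed step together with its 1-based number,
// or null if no step failed.
function firstFailedStep(steps) {
  const index = steps.findIndex((step) => step.status_id === FAILED);
  return index === -1 ? null : { number: index + 1, ...steps[index] };
}

const failed = firstFailedStep(stepResults);
console.log(failed);
```

With a free-text test, the same failure would only be visible as a single “failed” result, and the exact step would have to be reconstructed from the comment.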

For testers managing large projects, there is also the option of outsourcing certain steps to separate central tests, such as the “login process on a website”, and then reusing them in different tests.

Thanks to the extensive editing options for tests in TestRail, there are no limitations when it comes to defining test cases efficiently and precisely.

Today we learned about the different processes of a testing team in a software project, and started using TestRail to set up our project.

With the tests we created together and the resulting filterable test catalog, we now have a perfect basis to plan the actual testing of our application.

In the next part we will use this test catalog to create test plans as well as to execute the tests.

We will also take a look at reporting, traceability, and Cypress integrations via the available TestRail API to complete our flow.

The post Professional Test Management with TestRail – Part 1 appeared first on International PHP Conference.

PHPUnit 10 – All you need to know about the latest version

IPC editorial team — Tue, 01 Aug 2023 14:22:31 +0000

PHPUnit 10 should have been released on February 5, 2021, the first Friday in February 2021. It would have followed the tradition of PHPUnit 6, 7, 8 and 9 of being released on the first Friday of February each year, before most people in Germany had their first cup of coffee. PHPUnit 10 was then released on February 3, 2023, the first Friday in February 2023, two years late.

There are reasons for the delay. One of the most substantial may be a pandemic that has affected us all and permanently changed the lives and work habits of many people. Since April 2017, PHPUnit Code Sprints were held every six months, which the author attended with great pleasure and regularity. On the one hand, these sprints gave the opportunity to discover and rediscover the functionality of PHPUnit together with Sebastian Bergmann, friends and acquaintances of PHPUnit, and on the other hand also to contribute to the development of PHPUnit in a concentrated way.

In September 2019, the last Code Sprint for the time being took place in Mannheim. In October 2019, Sebastian Bergmann, Arne Blankerts, Stefan Priebsch, Ewout Pieter den Ouden and the author participated in the EU-FOSSA Cyber Security Hackathon, organized by the European Union, to work on critical infrastructure for the European Union in parallel with other developers. It was there that the idea for one of the biggest changes in PHPUnit came up, the new event system that would find its way into PHPUnit 10.

However, COVID-19 meant that events such as the PHPUnit Code Sprint, official and unofficial hackathons, PHP user groups and conferences could no longer take place in the usual way. These events were cancelled completely or were only held online. The working habits of many of us, who had previously been able to engage in constructive exchange with developers on-site at customer locations, for example, and were now only able to do so online, also underwent lasting changes as a result of the pandemic.

These changes also affected the work on PHPUnit. However, this does not mean that nothing has been achieved since the release of PHPUnit 9 in February 2020. On the contrary, PHPUnit 10, as already indicated, brings major changes, especially beneath the surface.

PHPUnit 10.0.0

PHPUnit 10.0.0 was released on February 3, 2023. Immediately after the release, a number of releases followed in quick succession until the end of March, fixing bugs and flaws and responding to feedback from developers. PHPUnit 10.0.19 was released on March 27, 2023.

PHPUnit 10 requires PHP 8.1 or higher. Developers using versions older than PHP 8.1 must use older versions of PHPUnit, such as PHPUnit 9 (requires PHP 7.3 or higher) or PHPUnit 8 (requires PHP 7.2 or higher). For PHPUnit 10, the documentation has been completely revised. In the following we want to take a look at the new functionalities.

Event system

The TestListener and Hook system available in PHPUnit 9 provide interfaces for extending PHPUnit. Both interfaces have serious drawbacks.

The TestListener system required third-party vendors to create a class that implemented the TestListener interface. As a result, they had to implement every method of this interface, even if a method was not needed. To ease implementation, PHPUnit provided a TestListenerDefaultImplementation trait.

The TestListener system also allowed third-party developers to manipulate the mutable objects passed to their implementation and thereby alter test results. The best-known example might be an implementation that checks the environment in which the tests are executed and then, for example, marks and outputs failed tests as successful in a CI environment.

The Hook system allowed third-party developers to create a class that only needed to implement the interfaces relevant to the extension. In addition, only scalars, and no mutable objects, were passed to these methods. This system improved PHPUnit’s extension interface: it removed the ability to influence test results, but it also required more work from third-party vendors to provide similar functionality.

In PHPUnit 10, both systems have now been replaced with an event system. Almost everything in PHPUnit is now an event. All output, both on the console and in log files, is based on events. The development of this event system was led by Arne Blankerts and the author. As mentioned at the beginning, the development of the event system was started at the EU-FOSSA Cyber Security Hackathon in October 2019 together with Stefan Priebsch and Ewout Pieter den Ouden.

In the process, PHPUnit’s internal code, which previously used the TestListener system and ResultPrinter classes, was completely reworked (and in some cases rewritten) to use the event system instead. Due to the self-imposed constraint of using events for all output, both console and log, many confusing and/or missing events were discovered early on.

The new event system is not only superior to the earlier TestListener and Hook approaches; the work on it also had a ripple effect on the entire PHPUnit codebase. A lot of technical debt was finally paid off. Finding the right places to emit the right events brought to light countless previously hidden inconsistencies and problems.

For example, a concrete event required a canonical and immutable representation of the configuration. As a result, the code that loads the XML configuration could be improved. Likewise, the code that processes the command line options and arguments could be improved. And most importantly, the code that combines these sources into the actual configuration has been significantly improved. When this actual configuration was created, large parts of the command line program could be implemented much more easily. This allowed other parts to be cleaned up, and so on and so forth.

The new event system allows read-only access and now has a large number of event objects (currently 67) that can be created during PHPUnit execution and also processed by extensions to PHPUnit. The event objects that are then passed to these extensions, as well as any value objects that are combined into such an event object, are immutable and contain a variety of information that may be of interest to PHPUnit extensions. For example, all of these objects contain information about runtime, current and maximum memory usage, and much more.
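The idea of immutable events with read-only access can be sketched in plain PHP. The following is a simplified model, not PHPUnit’s actual classes: TestFinished, TestFinishedSubscriber, Dispatcher and SlowTestCollector are hypothetical names chosen for this illustration, and the 0.5 second threshold is arbitrary.

```php
<?php declare(strict_types=1);

// Simplified model of the idea, not PHPUnit's real classes; all names are hypothetical.

/** Immutable event: readonly properties (PHP 8.1+) can only be set in the constructor. */
final class TestFinished
{
    public function __construct(
        public readonly string $testName,
        public readonly float $durationInSeconds,
        public readonly int $memoryUsageInBytes,
    ) {
    }
}

/** An extension implements only the subscriber interface it is interested in. */
interface TestFinishedSubscriber
{
    public function notify(TestFinished $event): void;
}

final class Dispatcher
{
    /** @var list<TestFinishedSubscriber> */
    private array $subscribers = [];

    public function register(TestFinishedSubscriber $subscriber): void
    {
        $this->subscribers[] = $subscriber;
    }

    public function dispatch(TestFinished $event): void
    {
        foreach ($this->subscribers as $subscriber) {
            $subscriber->notify($event);
        }
    }
}

/** Example subscriber that collects slow tests; it can read the event but not change it. */
final class SlowTestCollector implements TestFinishedSubscriber
{
    /** @var list<string> */
    public array $slowTests = [];

    public function notify(TestFinished $event): void
    {
        if ($event->durationInSeconds > 0.5) {
            $this->slowTests[] = $event->testName;
        }
    }
}

$collector  = new SlowTestCollector();
$dispatcher = new Dispatcher();
$dispatcher->register($collector);
$dispatcher->dispatch(new TestFinished('ExampleTest::testSomething', 0.8, 6_291_456));
$dispatcher->dispatch(new TestFinished('ExampleTest::testSomethingElse', 0.1, 6_291_456));

echo implode(', ', $collector->slowTests), PHP_EOL; // ExampleTest::testSomething
```

Because the event’s properties are readonly, a subscriber that tries to assign to them gets an Error, which is the property that makes it impossible to tamper with results from inside an extension in this model.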

PHPUnit 10 and its new event system require third-party developers to make significant changes to their extensions and tools for PHPUnit. The PHPUnit development team regrets that this may require significant effort, but at the same time is confident that in the long run the benefits of the new event system will outweigh the costs.

The PHPUnit development team has received promising feedback in this regard. Back in October 2021, Nuno Maduro reported that migrating Pest (an alternative and popular tool in the Laravel scene for running tests based on PHPUnit) from TestListener to the new event system had been a “great” experience. Discussions that the PHPUnit development team had with Filippo Tessarotto were then instrumental in ensuring that solutions like ParaTest could be updated to work with PHPUnit 10.

Separation of test results and test problems

In PHPUnit 10, a clear separation was introduced between the result of a test (errored, failed, incomplete, skipped or passed) and the problems of a test (considered risky, triggered a warning, etc.).

In PHPUnit 9, the internal error handling routine optionally converted errors of types E_DEPRECATED, E_NOTICE, E_WARNING, E_USER_DEPRECATED, E_USER_NOTICE, E_USER_WARNING, etc. into exceptions. These exceptions aborted the execution of a test and caused PHPUnit to consider the test as failed.

In PHPUnit 10, the internal error handling routine no longer converts these errors to exceptions. Therefore, the execution of a test is no longer aborted when, for example, an E_USER_NOTICE is raised. Consequently, such a test is no longer considered to have errors.
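The difference between the two behaviors can be sketched in plain PHP with set_error_handler(). This is a simplified illustration, not PHPUnit’s actual error handler; runConverting() and runCollecting() are hypothetical helper names.

```php
<?php declare(strict_types=1);

// Simplified illustration only: this is not PHPUnit's real error handler.

/** PHPUnit 9 style: the notice becomes an exception and aborts the callable. */
function runConverting(callable $test): string
{
    set_error_handler(static function (int $type, string $message, string $file, int $line): bool {
        throw new ErrorException($message, 0, $type, $file, $line);
    });

    try {
        $test();

        return 'completed';
    } catch (ErrorException $e) {
        return 'aborted: ' . $e->getMessage();
    } finally {
        restore_error_handler();
    }
}

/**
 * PHPUnit 10 style: the notice is recorded and the callable keeps running.
 *
 * @return list<string>
 */
function runCollecting(callable $test): array
{
    $issues = [];

    set_error_handler(static function (int $type, string $message) use (&$issues): bool {
        $issues[] = $message;

        return true; // handled; do not fall through to PHP's default handler
    });

    try {
        $test();
    } finally {
        restore_error_handler();
    }

    return $issues;
}

$reached = false;
$test = static function () use (&$reached): void {
    trigger_error('message', E_USER_NOTICE);
    $reached = true; // stands in for the assertion line of a test
};

echo runConverting($test), PHP_EOL;                // aborted: message
echo var_export($reached, true), PHP_EOL;          // false
echo implode(', ', runCollecting($test)), PHP_EOL; // message
echo var_export($reached, true), PHP_EOL;          // true
```

In the first run the line after trigger_error() is never executed; in the second run it is, and the notice is kept as a separate piece of information.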

The example in Listing 1 raises an E_USER_NOTICE during the execution of a test.

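```php
<?php declare(strict_types=1);

use PHPUnit\Framework\TestCase;

require_once __DIR__ . '/Example.php'; // assumes both files sit in the same directory

final class ExampleTest extends TestCase
{
    public function testSomething(): void
    {
        $example = new Example();

        self::assertTrue($example->doSomething());
    }

    public function testSomethingElse(): void
    {
        $example = new Example();

        self::assertFalse($example->doSomething());
    }
}
```

```php
<?php declare(strict_types=1);

// Minimal class consistent with the output shown below:
// doSomething() raises an E_USER_NOTICE and returns false.
final class Example
{
    public function doSomething(): bool
    {
        // ...

        trigger_error('message', E_USER_NOTICE);

        return false;
    }
}
```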

In PHPUnit 9, E_USER_NOTICE was converted to an exception and the execution of the test was aborted (Listing 2).

```
➜ php phpunit-9.6.phar --verbose ExampleTest.php
PHPUnit 9.6.0 by Sebastian Bergmann and contributors.
 
Runtime:       PHP 8.2.2
 
EE                                 2 / 2 (100%)
 
Time: 00:00.015, Memory: 6.00 MB
 
There were 2 errors:
 
1) ExampleTest::testSomething
message
 
/path/to/Example.php:11
/path/to/ExampleTest.php:13
 
2) ExampleTest::testSomethingElse
message
 
/path/to/Example.php:11
/path/to/ExampleTest.php:20
 
ERRORS!
Tests: 2, Assertions: 0, Errors: 2.
```

With PHPUnit 9, using PHP functionality that triggers E_DEPRECATED, E_NOTICE, E_STRICT, or E_WARNING, or calling code that triggers E_USER_DEPRECATED, E_USER_NOTICE, or E_USER_WARNING, could therefore hide an error in the executed code; with PHPUnit 10, this is no longer possible. In the example shown above, the assertion line is never reached when PHPUnit 9 is used and the code under test triggers E_USER_NOTICE.

In PHPUnit 10, the E_USER_NOTICE is not converted to an exception and therefore the execution of the test is not aborted (Listing 3). By default, PHPUnit 10 does not display details about deprecations, notices, or warnings. In order for these details to be displayed, the command line options --display-deprecations, --display-notices and --display-warnings (or their counterparts in the XML configuration file) must be used.

```
PHPUnit 10.0.0 by Sebastian Bergmann and contributors.
 
Runtime:       PHP 8.2.2
 
FN                                       2 / 2 (100%)
 
Time: 00:00.015, Memory: 6.00 MB
 
There was 1 failure:
 
1) ExampleTest::testSomething
Failed asserting that false is true.
 
/path/to/ExampleTest.php:13
 
--
 
There were 2 notices:
 
1) ExampleTest::testSomething
message
 
/path/to/ExampleTest.php:13
 
2) ExampleTest::testSomethingElse
message
 
/path/to/ExampleTest.php:20
 
FAILURES!
Tests: 2, Assertions: 2, Failures: 1, Notices: 2.
```

Metadata with attributes

In PHPUnit 10, metadata can be specified for test classes and test methods as well as for tested code units with attributes. Listing 4 shows the specification of metadata with annotations as known from PHPUnit 9 and older versions of PHPUnit. Listing 5 shows the specification of metadata with attributes as it is possible in PHPUnit 10.

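```php
<?php declare(strict_types=1);

use PHPUnit\Framework\TestCase;

final class ExampleTest extends TestCase
{
  /**
   * @dataProvider provideData
   */
  public function testSomething(string $input, string $expected): void
  {
    $example = new Example();

    // Parameter order (input first, expected second) is an assumption.
    $actual = $example->doSomething($input);
 
    self::assertSame($expected, $actual);
  }
 
  public static function provideData(): array
  {
    return [
      [
        'foo', 
        'bar',
      ],
    ];
  }
}
```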

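```php
<?php declare(strict_types=1);

use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\TestCase;

final class ExampleTest extends TestCase
{
  #[DataProvider('provideData')]
  public function testSomething(string $input, string $expected): void
  {
    $example = new Example();

    // Parameter order (input first, expected second) is an assumption.
    $actual = $example->doSomething($input);
 
    self::assertSame($expected, $actual);
  }
 
  public static function provideData(): array
  {
    return [
      [
        'foo', 
        'bar',
      ],
    ];
  }
}
```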

In PHPUnit 10, both annotations and attributes are supported. PHPUnit 10 first searches for attributes for a code unit. If no attributes are found, the system falls back on any existing annotations.
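The lookup order described here (attributes first, annotations as a fallback) can be illustrated with PHP’s reflection API. This is only a sketch of the principle; PHPUnit’s own metadata parser is more elaborate. The Group attribute class, the groupsFor() function and the SampleTest class are defined locally for this example and are assumptions, not PHPUnit code.

```php
<?php declare(strict_types=1);

// Illustration of the lookup order only; names are hypothetical and defined here.

#[Attribute(Attribute::TARGET_METHOD | Attribute::IS_REPEATABLE)]
final class Group
{
    public function __construct(public readonly string $name)
    {
    }
}

/** @return list<string> */
function groupsFor(ReflectionMethod $method): array
{
    // 1. Attributes take precedence.
    $groups = [];

    foreach ($method->getAttributes(Group::class) as $attribute) {
        $groups[] = $attribute->newInstance()->name;
    }

    if ($groups !== []) {
        return $groups;
    }

    // 2. Otherwise fall back to "@group name" annotations in the doc comment.
    $docComment = $method->getDocComment();

    if ($docComment === false) {
        return [];
    }

    preg_match_all('/@group\s+(\S+)/', $docComment, $matches);

    return $matches[1];
}

final class SampleTest
{
    #[Group('fast')]
    public function testWithAttribute(): void
    {
    }

    /**
     * @group slow
     */
    public function testWithAnnotation(): void
    {
    }
}

echo implode(',', groupsFor(new ReflectionMethod(SampleTest::class, 'testWithAttribute'))), PHP_EOL;  // fast
echo implode(',', groupsFor(new ReflectionMethod(SampleTest::class, 'testWithAnnotation'))), PHP_EOL; // slow
```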

Currently there are no concrete plans for whether and when support for annotations will be marked as deprecated and removed.

New assertions

A number of assertions have been added in PHPUnit 10. These include:

assertIsList()
assertStringEqualsStringIgnoringLineEndings()
assertStringContainsStringIgnoringLineEndings()
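What these assertions check can be approximated in plain PHP. The helper functions below are hypothetical names for this sketch and are not PHPUnit’s implementation; they assume that "ignoring line endings" means treating "\r\n" and "\r" like "\n".

```php
<?php declare(strict_types=1);

// Plain-PHP approximation; helper names are hypothetical, not PHPUnit's code.

/** A list is an array whose keys are exactly 0, 1, 2, ... in order (PHP 8.1+). */
function isList(mixed $value): bool
{
    return is_array($value) && array_is_list($value);
}

/** Treats "\r\n" and "\r" as "\n". */
function normalizeLineEndings(string $string): string
{
    return str_replace(["\r\n", "\r"], "\n", $string);
}

function equalsIgnoringLineEndings(string $expected, string $actual): bool
{
    return normalizeLineEndings($expected) === normalizeLineEndings($actual);
}

function containsIgnoringLineEndings(string $needle, string $haystack): bool
{
    return str_contains(normalizeLineEndings($haystack), normalizeLineEndings($needle));
}

var_dump(isList(['a', 'b']));                                  // bool(true)
var_dump(isList([1 => 'a', 2 => 'b']));                        // bool(false)
var_dump(equalsIgnoringLineEndings("a\nb", "a\r\nb"));         // bool(true)
var_dump(containsIgnoringLineEndings("b\r\nc", "a\nb\nc\nd")); // bool(true)
```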

New command line options

A number of command line options have been added in PHPUnit 10. These include:

--display-deprecations, enables the display of deprecations
--display-errors, enables the display of errors
--display-incomplete, enables the display of incomplete tests
--display-notices, enables the display of notices
--display-skipped, enables the display of skipped tests
--display-warnings, enables the display of warnings
--no-extensions, disables all extensions for PHPUnit
--no-output, disables all output from PHPUnit
--no-progress, disables the progress indicator
--no-results, disables the results display

Removed functionalities

In PHPUnit 10, all functionalities that were marked as deprecated in PHPUnit 9 have been removed. Developers who receive warnings about using deprecated PHPUnit functionality when running their tests with PHPUnit 9 will not be able to upgrade to PHPUnit 10 until they have stopped using that deprecated functionality.

Removal of PHPDBG and Xdebug 2 support

In PHPUnit 10, support for PHPDBG and Xdebug 2 for collecting code coverage has been removed. PCOV or Xdebug 3 are required to collect code coverage.

Removal of integration with Prophecy

In PHPUnit 10, the integration with Prophecy for creating test doubles has been removed. Developers who use libraries such as Prophecy or Mockery in their tests to create test doubles will need to rewrite their tests for PHPUnit 10 or wait for Prophecy and Mockery to support PHPUnit 10. At this time, neither Prophecy nor Mockery supports PHPUnit 10.

Removal of assertions

In PHPUnit 10, a number of assertions have been removed, some of which were replaced in PHPUnit 9 with newly added alternatives. These assertions include:

assertNotIsReadable(), replaced by assertIsNotReadable()
assertNotIsWritable(), replaced by assertIsNotWritable()
assertDirectoryNotExists(), replaced by assertDirectoryDoesNotExist()
assertDirectoryNotIsReadable(), replaced by assertDirectoryIsNotReadable()
assertDirectoryNotIsWritable(), replaced by assertDirectoryIsNotWritable()
assertFileNotExists(), replaced by assertFileDoesNotExist()
assertFileNotIsReadable(), replaced by assertFileIsNotReadable()
assertFileNotIsWritable(), replaced by assertFileIsNotWritable()
assertRegExp(), replaced by assertMatchesRegularExpression()
assertNotRegExp(), replaced by assertDoesNotMatchRegularExpression()
assertEqualXMLStructure(), removed without replacement

Removal of matchers

In PHPUnit 10, the at() matcher has been removed. This matcher previously allowed setting expectations on test doubles that methods would be called in a specific order.

The withConsecutive() matcher has also been removed. This matcher previously allowed expectations to be placed on test doubles that methods would be called in a certain order with certain arguments.

Both matchers made it possible to write code that introduced temporal coupling. Removing them emphasizes that such code is no longer considered good practice and should be avoided.

Removal of command line options

In PHPUnit 10, a number of command line options have been removed. These include:

--debug, allowed debug output to be enabled while running tests
--extensions, allowed configuration of extensions for PHPUnit
--printer, allowed configuration of a class to output test results
--repeat, allowed repeated execution of tests
--verbose, allowed configuring more detailed output while running tests

Removal of the TestListener and Hook systems

In PHPUnit 10, both the TestListener and Hook systems have been removed as interfaces for third-party extensions to PHPUnit. Developers who rely on functionality from extensions for PHPUnit 9 will not be able to use PHPUnit 10 until those extensions have been migrated to PHPUnit 10’s new event system or they have found alternative extensions that are compatible with PHPUnit 10.

PHPUnit 10.1.0

PHPUnit 10.1.0 was released on April 14, 2023. This release was followed by only a smaller number of patch releases. PHPUnit 10.1.3 was released on May 11, 2023. Below are the new, changed as well as deprecated functionalities of PHPUnit 10.1.

New assertions

New assertions have been added in PHPUnit 10.1.0. These include:

assertObjectHasProperty()
assertObjectNotHasProperty()

New attributes

New attributes have been added in PHPUnit 10.1.0. These attributes include:

IgnoreClassForCodeCoverage
IgnoreMethodForCodeCoverage
IgnoreFunctionForCodeCoverage

New source element in XML configuration

In PHPUnit 10.1.0, a new <source> element has been added to the XML configuration. This element makes it possible to configure a list of directories and files that PHPUnit should consider to be the source code of a project. In addition, it makes it possible to configure in detail how to handle notices, deprecations and warnings that arise from running the source code.

Accordingly, there is now a new Source object that represents the configuration of the <source> element. The <source> element replaces the <coverage> element, which has now been marked as deprecated.
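As a minimal sketch, a configuration fragment using the source element might look like the following. The paths src and src/generated are placeholders assumed for this example, and the fragment leaves out everything else a real phpunit.xml would contain (test suites, logging and so on).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<phpunit bootstrap="vendor/autoload.php">
  <!-- src/ and src/generated/ are placeholder paths for this sketch -->
  <source>
    <include>
      <directory suffix=".php">src</directory>
    </include>
    <exclude>
      <directory>src/generated</directory>
    </exclude>
  </source>
</phpunit>
```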

New methods for creating test doubles

In PHPUnit 10.1.0, a TestCase::createConfiguredStub() method has been introduced, analogous to the TestCase::createConfiguredMock() method that has been present since PHPUnit 9. This method allows to create a test double that has configured methods and return values, but causes a test to fail when called by other, non-configured methods.

New method for configuration by extensions

In PHPUnit 10.1.2, a method has been added to the extension facade that allows an extension to PHPUnit to indicate that the extension intends to replace the entire output of PHPUnit.

Suppression of deprecations, notices and warnings

In PHPUnit 10.1.0, E_USER_* errors suppressed by the @ operator are ignored again.

coverage element in XML configuration

In PHPUnit 10.1.0, the coverage element of the XML configuration was marked as deprecated. This element is replaced by the newly added source element.

Methods for creating test doubles

In PHPUnit 10.1.0, methods used to create and configure test doubles were marked as deprecated. These include:

MockBuilder::enableProxyingToOriginalMethods()
MockBuilder::disableProxyingToOriginalMethods()
MockBuilder::allowMockingUnknownTypes()
MockBuilder::disallowMockingUnknownTypes()
MockBuilder::enableArgumentCloning()
MockBuilder::disableArgumentCloning()
MockBuilder::addMethods()
MockBuilder::getMockForAbstractClass()
MockBuilder::getMockForTrait()
TestCase::createTestProxy()
TestCase::getMockForAbstractClass()
TestCase::getMockForTrait()
TestCase::getMockFromWsdl()
TestCase::getObjectForTrait()

These methods are expected to be removed in PHPUnit 12.

Methods to access aspects of configured source code

In PHPUnit 10.1.0, with the introduction of the <source> element in the XML configuration, methods to access aspects of the configured source code were marked as deprecated. In their place, alternative and newly introduced methods of the Source object can be used. These methods include:

Configuration::hasNonEmptyListOfFilesToBeIncludedInCodeCoverageReport(), replaced by Source::notEmpty()
Configuration::coverageIncludeDirectories(), replaced by Source::includeDirectories()
Configuration::coverageIncludeFiles(), replaced by Source::includeFiles()
Configuration::coverageExcludeDirectories(), replaced by Source::excludeDirectories()
Configuration::coverageExcludeFiles(), replaced by Source::excludeFiles()

PHPUnit 10.2.0

PHPUnit 10.2.0 was released on June 2, 2023. PHPUnit 10.2.2 was released on June 11, 2023. Below you can see the new functionalities and those marked as deprecated.

Optional suppression of deprecations, notices and warnings

In PHPUnit 10.2.0, enhancements have been made to allow optional suppression of deprecations, notices, and warnings.

Methods to access aspects of the configured source code

In PHPUnit 10.2.0, methods for accessing aspects of configured source code have been marked as deprecated. Instead, alternative and newly introduced methods of the source object can be used. These methods include:

Configuration::restrictDeprecations(), replaced by Source::restrictDeprecations()
Configuration::restrictNotices(), replaced by Source::restrictNotices()
Configuration::restrictWarnings(), replaced by Source::restrictWarnings()

PHPUnit 10.3.0

PHPUnit 10.3.0 is scheduled for release on August 4, 2023. The following is planned for it.

XML format for log files

For PHPUnit 10.3.0 it is roughly planned to release a new XML format for log files. The XML format for log files used by PHPUnit so far has existed for about 20 years and is based on the XML format used by JUnit. This XML format has the disadvantage that it is not under the control of either JUnit or PHPUnit. In addition, there is no official schema in XSD format that can be used to check the validity of log files.

However, the goal of a new XML format is not to produce another standard. Rather, the goal of a PHPUnit proprietary XML format is to be able to accommodate more information. Thanks to the new event system of PHPUnit 10, there is now significantly more information available, which unfortunately cannot be represented with the XML format currently used by PHPUnit 10.

Further planned releases

PHPUnit 10.4.0 is scheduled for release on October 6, 2023, and PHPUnit 10.5.0 on December 1, 2023.

The post PHPUnit 10 – All you need to know about the latest version appeared first on International PHP Conference.

Quality on the assembly line

IPC editorial team — Mon, 09 May 2022 14:27:10 +0000

“I can’t right now, I have to deploy a hotfix!”, Mark said in an upset tone. “Why, what’s the problem?”, one of his colleagues replied. “Tom—who’s on vacation—used a match expression everywhere! But that only exists in PHP 8 and we use 7.4!”

Mark would love to have prevented the rollout in the aftermath, saving himself the trouble of having to rebuild Tom’s changes while under pressure from management. Once the new version is finally ready to run, colleagues ask themselves the same questions that many others do: “How can I secure my software against problems like this?”

In this article, we’ll dedicate ourselves to the most important tools and steps for this task. We’ll use PHP checks, static code analysis, unit tests, and more. For developing long-term quality, we’ll also use fully automated pipelines that prevent faulty code from being adopted in our Main Branch. Our focus is on individual steps and their connections up to the big picture from PHP’s point of view. Areas such as unit testing with JEST etc. can and should be integrated into a fully comprehensive pipeline. There are so many possibilities for individual tools and frameworks, so we’ll only loosely touch upon them so that they’re at least executable and functional.

PHP’s weaknesses

I’m sure many of us can relate to Mark’s emotional state. Unfortunately, this is just one of the many potential issues in the daily life of a developer. There are many error sources that can be caused by different versions, approaches, or more. In order to prevent this, first, we need to understand why these problems happen in the first place. PHP has become quite a powerful language, but it wasn’t always. Unfortunately, PHP’s origins lie in a poorly designed language. Because of its inconsistencies, it’s easy to create bad, unstable code. The lack of type safety alone can quickly lead to unplanned chaos and hidden bugs. For instance, in older PHP versions an argument of a function could only be the variable name without a data type. Now it’s finally possible to resolve this unsafe definition and declare it explicitly. The impact on the project when migrating PHP versions could be fatal.

While migrating a very large application from PHP 5.6 to PHP 7.x, which included many changes regarding type safety, we found in the end that our function signatures weren’t consistently prepared for NULL values. Despite the well-intentioned NULL checks inside our functions, the calls themselves already crashed with a fatal error, because NULL isn’t accepted for a typed parameter unless the type is explicitly declared as nullable. As you can imagine, these NULL cases only occurred in certain situations, in various areas of our application that were overlooked during testing. When I say “we found in the end”, I mean in the live system. If the NULL checks had been done before the functions were called, there wouldn’t have been any problem and the code would have run as intended. This shows that the real source of errors is the developers themselves. But you can deliver code that’s perfect for the moment in every “flexible” language! You just need the right steps and tools, and these work best in combination with pipelines.
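The effect can be reproduced with a few lines of PHP. The functions greet() and greetNullable() are hypothetical and not taken from the application described above; they only show why a NULL check inside the function body cannot help when the parameter type does not allow NULL.

```php
<?php declare(strict_types=1);

// Hypothetical functions for illustration only.

/** The null check inside the body is never reached: the call itself fails first. */
function greet(string $name): string
{
    if ($name === null) {
        return 'Hello, guest';
    }

    return 'Hello, ' . $name;
}

/** Declaring the parameter as nullable makes the same check effective. */
function greetNullable(?string $name): string
{
    if ($name === null) {
        return 'Hello, guest';
    }

    return 'Hello, ' . $name;
}

try {
    greet(null);
} catch (TypeError $e) {
    echo 'TypeError: the call failed before the null check ran', PHP_EOL;
}

echo greetNullable(null), PHP_EOL; // Hello, guest
```

A static analysis tool would typically flag the unreachable check in greet(), which is one reason such tools belong in the pipeline.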

Hello Pipeline

Pipelines contain various subtasks that are executed fully automatically, either sequentially or in parallel. A pipeline always has a single result, which is either positive or—because of an error in a task—negative. This makes pipelines perfect for everything from unit tests and static code analysis to deployments and post-deployment tasks. A pipeline is built quickly. But as always, quickly doesn’t also mean stable. In the beginning, there’s always the question, “Where do I want my pipeline to run?” Depending on the repository hoster, the in-house service can be used for pipelines or an external service connected to the repositories. Bitbucket, GitHub, GitLab and co. now offer smooth integrations for running pipelines. If you want to be independent from them or focus even more on pipelines, there are companies like Buddy (https://buddy.works) dedicated to pipelines and automation.

Once you’ve made your decision, you can start designing your pipelines. There are basically no limits for their construction and use. There are two basic kinds of pipelines: CI and CI/CD. CI (Continuous Integration) tries to merge new code changes back into the main branch as often as possible. With a focus on test automation, it attempts to uncover new errors or other problems as early as possible, while the new code is being integrated. This kind of pipeline can be executed automatically after a pull request has been merged, for example. But it’s also possible to check open PRs in advance to see whether the submitted code meets our quality requirements. CI/CD (CD stands for Continuous Delivery or Continuous Deployment) goes one step further. With Continuous Delivery, changes are available directly for live deployment if they pass the necessary tests and the pipeline’s analyses. With Continuous Deployment, changes are even rolled out to the production system fully automatically after successful tests. The basis for both methods is a comprehensive and strict Continuous Integration pipeline that should uncover as many errors in advance as possible. To apply them, we need some processes and tools that must first be integrated into our software.

In the beginning, there was software

Unit tests, static code analysis, and other operations need to be prepared and embedded in our application in the best way possible. They can be used in pipelines, but they have no direct connection to them. They provide opportunities for measuring the quality of our application. It’s always important to have the best possible coverage with different kinds of tests. Our error concerning the NULL checks should already be discovered here. The number of tools in use grows quickly. So that developers don’t lose sight of the forest for the trees, it pays to keep an eye on the tools’ user-friendliness. Developers can be touchy when it comes to testing and quality control. Only a few really enjoy putting their own code to the test. So it’s crucial that finding and fixing errors in a pipeline is easy and straightforward for developers. Let’s say we create a pull request that starts an extensive pipeline. After a long wait, we find the first error and the pipeline breaks. We fix the problem. Then we wait again, hit the next error, and the pipeline breaks once more. These waiting times can frustrate and exhaust developers. This is often the reason why I end up not opening a pull request for an open source project.

A makefile with a few simple commands can solve this problem. With it, each tool can be called with exactly the configuration intended for the software. So, there could be a make test for unit tests, a make phpstan for running PHPStan, and so on. Since these commands bring the correct configuration with them, they can be reused directly in pipelines. This creates transparency and provides a perfect abstraction layer for easily sharing tools or configurations centrally. If you want to make it even more palatable for developers, you can offer commands like make pr or make review that prepare code changes for a pull request (PHP-CS-Fixer in fix mode, etc.). Or you can execute the actual pipeline on the developer’s machine. Now that all our preparations are in place, it’s time to integrate our tools and build our pipeline piece by piece.

Abstraction layer with makefiles

Handling projects today isn’t a simple matter anymore. What do I need to install? How do I start the tests? This is not only annoying, but it also tempts you to do some things less often. One way of counteracting this problem is to introduce an abstraction layer with makefiles. This makes creating and executing commands on the command line relatively easy. It also has the advantage that technologies and configurations can be exchanged centrally without having to touch the pipelines. For instance, the following options are available to the developer, but they can also be used in a CI/CD pipeline:

make install (installs PROD dependencies)
make dev (installs DEV dependencies)
make build (starts SASS compiler, builds artifacts…)
make tests (starts PHPUnit + Jest…)
make phpstan (runs PHPStan)
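A makefile offering some of these targets could be sketched as follows. The target names come from the list above; the recipes are assumptions based on the commands used later in this article (recipe lines must be indented with a tab). A build target is left out because it depends entirely on the project’s front-end tooling.

```makefile
.PHONY: install dev tests phpstan phpcheck

install: ## Install PROD dependencies only
	composer install --no-dev

dev: ## Install DEV dependencies as well
	composer install

tests: ## Run the PHPUnit test suite
	php vendor/bin/phpunit --configuration=phpunit.xml

phpstan: ## Run static analysis
	php vendor/bin/phpstan analyse -c phpstan.neon

phpcheck: ## Lint all PHP files outside vendor and node_modules
	find . -name '*.php' -not -path "./node_modules/*" -not -path "./vendor/*" | xargs -n 1 -P4 php -l
```

A pipeline step then only has to call, for example, make dev followed by make tests, and a change of tool or configuration stays local to the makefile.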

PHP syntax checks

We’ll start building our pipeline with a few simple syntax checks using the PHP binary itself. This lets us check for missing semicolons or other syntax-related errors without an additional framework. It’s a perfect, easy introduction before we continue with unit tests and other verifications. Because the PHP linter php -l is used here to check one file at a time, we’ll use the find command to search for all *.php files. The results are then piped to xargs, which calls our linter command. Using -n 1, we tell xargs to start one command per result. To make sure that we only check our own code and leave out large directories like the well-known node_modules folder, we can ignore certain paths when searching for files. Whether or not this is really necessary or desirable depends on the project. Last but not least, we can use -P to start several processes in parallel and gain some performance. The following example searches all PHP files in the current directory (and subdirectories), except for node_modules and vendor, and runs up to four PHP linter processes in parallel.

find . -name '*.php' -not -path "./node_modules/*" -not -path "./vendor/*" | xargs -n 1 -P4 php -l

We could now store this, for instance, as make phpcheck in the makefile easily usable for developers.

PHP Minimum Compatibility

As a manufacturer of frameworks or plug-ins, it’s extremely important to pay attention to the PHP versions your software supports. But even with a simple web application rollout, it can quickly happen that you roll out code with a feature that’s not yet available on the server because of an older PHP version. To prevent this problem in the automation chain, you can use phpcompatibility/php-compatibility. It is a set of rules for PHP_CodeSniffer (squizlabs/php_codesniffer, which provides the phpcs binary; not to be confused with friendsofphp/php-cs-fixer) and checks compatibility with a given PHP version from the command line. Using PHP_CodeSniffer installed with Composer together with this extension is very easy, but both packages need to be installed and correctly configured. Once this is done, the line below can be used to check if our software still supports PHP 5.6. Additionally, folders can be ignored during this check. This is perfect for legacy projects where the unit tests are based on a more recent PHP version while the production code still has to be compatible with PHP 5.6.

php vendor/bin/phpcs -p --ignore='*/Tests*,*/OtherFolder/*' \
  --standard=PHPCompatibility --extensions=php --runtime-set testVersion 5.6

In order to use PHP_CodeSniffer and PHPCompatibility, both packages need to be installed first.

composer require --dev squizlabs/php_codesniffer
composer require --dev phpcompatibility/php-compatibility

In order for PHPCompatibility to be recognized by PHP_CodeSniffer, we must register its path. This can be done either manually or automatically in the scripts section of composer.json. After an installation or an update, the code in Listing 1 checks whether phpcs exists (it is only installed as a dev dependency) and automatically configures the corresponding path for PHPCompatibility. It does not interact with the developer (perfect for pipelines), and nothing happens when only the production dependencies are installed!

"scripts": {
  "post-install-cmd": [
    "[ ! -f vendor/bin/phpcs ] || vendor/bin/phpcs --config-set installed_paths vendor/phpcompatibility/php-compatibility"
  ],
  "post-update-cmd": [
    "[ ! -f vendor/bin/phpcs ] || vendor/bin/phpcs --config-set installed_paths vendor/phpcompatibility/php-compatibility"
  ]
}

Unit Tests

Now that our code’s validity has been verified based on syntax and PHP version, we can turn our attention to our application’s functionality. Unit tests play an important role here. Experience with various teams and developers shows that this preparatory step is essential for unit tests to be accepted in a company. All beginnings are difficult, especially when you’re forced to write tests. But experience also shows that there’s greater acceptance towards creating new tests if preparations have already been made. At any rate, a few tests must exist and starting the test suite needs to be easy and fast. Another motivational factor is the automatic execution of these tests in pipelines. If this has already been prepared, a test suite just needs to be extended. This is a much lower conceptual hurdle for developers. The following lines install PHPUnit as a dev dependency and start it with the phpunit.xml configuration file, which still has to be created.

composer require --dev phpunit/phpunit
php vendor/bin/phpunit --configuration=phpunit.xml

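The configuration file can be created either manually or with PHPUnit commands. The simple example in Listing 2 shows an execution definition that loads all of the tests from the ./Tests/PHPUnit folder. The default autoload.php in the vendor directory (Composer) is used as the bootstrap file for autoloader, etc. So, all classes in your project should be found in the unit tests.

<?xml version="1.0" encoding="UTF-8"?>
<phpunit bootstrap="vendor/autoload.php">
  <testsuites>
    <!-- the suite name is an assumption; any name works -->
    <testsuite name="Unit Tests">
      <directory>./Tests/PHPUnit</directory>
    </testsuite>
  </testsuites>
</phpunit>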

With a simple make tests, every developer can start the unit tests without much effort. Pipelines are not just successive commands; they can also tell a story. That means that in our process (if the checks up until now have been successful) our software has a correct, executable syntax and delivers the expected results in terms of logic and functionality at the lowest level. Once this is guaranteed, we should check other things, such as code style, or perform static analyses. Depending on your taste, this can be done in different ways. Parallelizing these steps also has its appeal.

PHPStan

PHPStan focuses on finding errors without actually having to execute the code. All files and classes that need to be checked are analyzed for different error sources. Here, PHPStan differentiates between different levels, with level 8 being the highest—and sometimes most annoying—but safest level. In addition to level configurations, you can also set individual settings for checks and rules. You can even implement your own rules, such as pointing out an incorrect copyright annotation (for open source software). Installation is done with Composer.

composer require --dev phpstan/phpstan

Once installed, it’s time to configure PHPStan. This is done with a phpstan.neon file. This configuration includes the directories we will analyze, ignored directories, and bootstrap files for autoloader, additional settings, rules, and more. Listing 3 shows a simple example of a Level 8 check for the current directory. Files in the Resources and vendor folders are ignored.

parameters:
  level: 8
  paths:
    - .
  excludes_analyse:
    - Resources/*
    - vendor/*

PHPStan can be easily executed with the analyse command and the configuration file. The results are clearly visible in the terminal output.

php vendor/bin/phpstan analyse -c phpstan.neon

PHP-CS-Fixer

The PHP Coding Standards Fixer verifies your code against a selected coding standard. You can choose between PSR-1, PSR-2, PSR-12, and many more. It even supports community styles like Symfony. As the name suggests, the fixer can also automatically modify, optimize, and fix your source code. The tool is perfect if you work in teams and want to enforce a uniform standard, but it’s also highly recommended for lone warriors. Installation with Composer is very simple:

composer require --dev friendsofphp/php-cs-fixer

A .php_cs.php file is used to configure the CS-Fixer. In it, you can define all settings regarding rules, caching, finding files, and more. The example in Listing 4 shows a relatively simple, executable variant. The focus is mainly on specifying the folders that will be ignored, such as vendor.

<?php

return (new PhpCsFixer\Config())
  ->setUsingCache(false)
  ->setRules([
    'array_syntax' => ['syntax' => 'short'],
    'ordered_imports' => true,
  ])
  ->setFinder(
    PhpCsFixer\Finder::create()
      ->exclude(['.git', '.github', 'vendor'])
      ->in(__DIR__)
  );

Execution takes place either as a dry run for purely analytical purposes, or directly with an integrated, automatic fixer. The recommendation in the pipeline is clearly on the dry run, since the focus is on the analysis. But a make pr command could automatically optimize the files with Fixer.

# Analysis with --dry-run
php vendor/bin/php-cs-fixer fix --config=./.php_cs.php --dry-run

# Fixing problems automatically
php vendor/bin/php-cs-fixer fix --config=./.php_cs.php

Integration Tests/E2E

After the previous steps have all been successfully completed, we’ll move on to the top-tier testing class. While this isn’t directly related to PHP, it’s an essential means of testing a PHP (or other) application for operation. Automated front-end testing requires a unique level of skill in both approach and stabilization. You also need a stable running environment, preferably one that’s automated in the cloud system where the pipelines are running. This can be easier or more difficult to implement depending on the project, but it always pays off. Plug-in and framework developers might have it easier here. For instance, if you’re developing a plug-in for a platform like Shopware, you can access ready-made Docker images from dockware.io. It’s enough to start the container, install the plug-in, and then you’ll have a simple localhost environment with demo data running that you can test against. This can also be done in parallel in different versions of Shopware.

For your own projects, you need to boot up an environment and import databases, images, and other data. Of course, this data should be anonymized in line with the GDPR. Often, the simpler variant is to test against a separate test server on which the new software version has been installed. Which framework you use to run the tests is up to you. Whether it’s Cypress, Codeception, Ranorex, or something else, the important thing is that you keep your tests under control in the long run. It’s recommended that you keep the threshold for developers as low as possible. For instance, you can use Cypress and makefiles to create a way to quickly launch the Cypress graphical interface in the correct configuration with make open-ui. Or you can run tests directly at the command line level using make run.

Next steps

There are still a lot of additional tasks that need to be done in pipelines. This includes creating .env files with sensitive data, performing database migrations with Doctrine Migrations, or additional tasks like configuring plug-ins, settings, clearing caches, and more. While the PHP application’s quality is now assured, additional project-dependent steps may be needed in the area of pipeline optimization. The following is, and remains, particularly important: After your pipeline has run through and your software is released, everything should be ready to go. So, pipelines are regarded as first-class citizens subject to constant adjustments and optimizations in order to offer secure deployments in the future.

Conclusion

Increasing the long-term quality in a PHP project isn’t done with just a few clicks. Preventing the variety of issues that PHP can cause in advance requires a perfectly coordinated combination of tools, pipelines, approaches, and developer awareness. But once you’ve configured this and made it available to your teams with simple tools (makefiles, pipelines, etc.) then nothing can stand in the way of successive optimization and reduced deployment problems. Once the strict pipeline has grown and test coverage is sufficient, even concepts like Continuous Deployment can be meaningfully applied in the project.

Links & Literature

[1] https://www.phpunit.de

[2] https://github.com/phpstan/phpstan

[3] https://github.com/FriendsOfPHP/PHP-CS-Fixer

[4] https://github.com/PHPCompatibility/PHPCompatibility

[5] https://www.dockware.io

[6] https://www.cypress.io

The post Quality on the assembly line appeared first on International PHP Conference.

PHPStan ‒ Gamification for developers and teams

IPC editorial team — Tue, 29 Jun 2021 11:24:27 +0000

PHPStan is more than just a simple PHP linter: its possibilities go beyond the usual search for syntax errors and missing dependencies. The tool analyzes code and helps you improve it on that basis. It does not force a big-bang approach, so you don’t have to work on the entire project at once. Users can start small and work their way up, level by level. The levels help you grow as a developer, and that’s exactly what we want: to develop. PHPStan suggests better solutions and code architecture. For example, it validates the array keys that you want to access. If these are not set, you receive a warning. This way, you can also take care of things that might otherwise disappear in the log files. This results in much better code, which makes you happier as a developer and helps you personally. Good software is sustainable and makes everyone happy.

So let’s start right away and install PHPStan in our current project as a Composer Dependency:

composer require --dev phpstan/phpstan

For the first run on level 5 in Sulu Core, a line in phpstan.neon must be removed:

phpstan analyse src/Sulu/Bundle/PageBundle -c phpstan.neon --level 5

As a result, 126 errors are displayed, as shown in Figure 1.

Fig. 1: The error display of the first pass

The error shown in Figure 2 is something more challenging that can be traced back to the software architecture. So, we found at least one nice error with two commands in the command line – it can be that simple.

Fig. 2: This is a somewhat more serious error

Working together as a team against legacy code

Software quality is a team task. No more legacy code should be written, otherwise, you won’t be able to keep up with the cleanup. Once this is ensured, or at least strongly limited, it can be built upon and progress can be made quickly. Of course in a team, there are very different developers with different skills, but they must have a common denominator. Similar to an athlete’s training, it is important to do this on a daily basis. This is where the different levels of PHPStan excel. Starting at level 1, you can work your way up to level 8. The real challenges await you after level 5.

Cheating is allowed, but it’s less fun

If you really want to, you can exclude certain rules, which makes sense in some cases. A level consists of several rules and rule sets. So, you can start at level 5 and take care of certain code sections later. But cheats are not cool and have no style. Exceptions to code standards are understandable if teams explicitly agree on internal rules and record them in rule sets. A level reached through exclusions is not really earned, so you shouldn’t proudly post on Twitter that you reached level 6. Leveling up is not the primary goal. It’s about improving your code and bringing your skills to a higher level in your daily work, as a team. This is the only way to work together and achieve a common goal. You can also see yourself in a kind of league with other teams. You will feel proud and more satisfied with the work you’ve done.

PHPStan Rule Levels [1] at a glance:

Level 0: basic checks, unknown classes, unknown functions, unknown methods called on $this, wrong number of arguments passed to those methods and functions, always undefined variables
Level 1: possibly undefined variables, unknown magic methods and properties on classes with __call and __get
Level 2: unknown methods checked on all expressions (not just $this), validating PHPDocs
Level 3: return types, types assigned to properties
Level 4: basic dead code checking – always false instanceof and other type checks, dead else branches, unreachable code after return, etc.
Level 5: checking types of arguments passed to methods and functions
Level 6: report missing typehints
Level 7: report partially wrong union types – if you call a method that only exists on some types in a union type, level 7 starts to report that; other possibly incorrect situations
Level 8: report calling methods and accessing properties on nullable types

Baseline in TYPO3 project – the future is now

PHPStan offers a great feature with the baseline. A file can be generated automatically that excludes the errors already present in the code. This way, new code can be held to the quality standards right away. This is a very good strategy: it saves a lot of work and greatly limits the immediate refactoring task. With luck, that task is already manageable. The TYPO3 core has successfully done this thanks to a lot of passion from Alexander Schnitzler (Listing 1). Here, a combination is possible: new code is validated at a much higher level than existing code. The feature is no accident; this method is effective and correct. Without it, it is hardly possible to get started once a project reaches a certain complexity (TYPO3 is huge).

phpstan.level4.neon of TYPO3 Core
includes:
  - phpstan.level3.neon
rules:
# - PHPStan\Rules\Arrays\DeadForeachRule
# - PHPStan\Rules\Comparison\NumberComparisonOperatorsConstantConditionRule
  - PHPStan\Rules\DeadCode\NoopRule
# - PHPStan\Rules\DeadCode\UnreachableStatementRule
  - PHPStan\Rules\Exceptions\DeadCatchRule
# - PHPStan\Rules\Functions\CallToFunctionStamentWithoutSideEffectsRule
  - PHPStan\Rules\Methods\CallToMethodStamentWithoutSideEffectsRule
  - PHPStan\Rules\Methods\CallToStaticMethodStamentWithoutSideEffectsRule
  - PHPStan\Rules\TooWideTypehints\TooWideArrowFunctionReturnTypehintRule
  - PHPStan\Rules\TooWideTypehints\TooWideClosureReturnTypehintRule
  - PHPStan\Rules\TooWideTypehints\TooWideFunctionReturnTypehintRule
conditionalTags:
  PHPStan\Rules\Variables\IssetRule:
    phpstan.rules.rule: %featureToggles.nullCoalesce%
  PHPStan\Rules\Variables\NullCoalesceRule:
    phpstan.rules.rule: %featureToggles.nullCoalesce%
services:
  -
    class: PHPStan\Rules\Classes\ImpossibleInstanceOfRule
    arguments:
      checkAlwaysTrueInstanceof: %checkAlwaysTrueInstanceof%
      treatPhpDocTypesAsCertain: %treatPhpDocTypesAsCertain%
    tags:
      - phpstan.rules.rule

From my personal experience, I can confirm that a baseline is needed in projects that have several full-time developers. All other projects or sub-projects such as extensions, plug-ins, web services, etc. can be worked through and upgraded level by level without a baseline. There is no reason to be scared – start at level 1 with one Composer command. This can also be easily introduced in your GitLab pipeline as a quality gate. Starting simple is the best way.

As an example, we can create a baseline file [2] using the Command Line:

vendor/bin/phpstan analyze --generate-baseline

Corresponding example code can be seen in Listing 2.

parameters:
  ignoreErrors:
    -
      message: "#^Only numeric types are allowed in pre\\-decrement, bool\\|float\\|int\\|string\\|null given\\.$#"
      count: 1
      path: src/Analyser/Scope.php
    -
      message: "#^Anonymous function has an unused use \\$container\\.$#"
      count: 2
      path: src/Command/CommandHelper.php

Contributing to open source – give something back and be happier

Personally, I have made my greatest progress as a developer by contributing to open source projects. Besides meeting nice people from the community, you’ll also get to know many practical tools and learn a lot of basics. We all use open source projects, and are happy about updates, patches, and releases. If you look at how many projects are in our Composer file, it is an impressive self-service store. So it can’t hurt to give something back to the community every now and then to appreciate important work. It also feels great to contribute a real line of code to a world-renowned open source software and it shows up by name in the Git log. You don’t have to fix complete bugs or implement new features. Code refactoring is an important task and with PHPStan, it’s possible to contribute code improvements. This can be done practically in all PHP open source projects, even if they do not rely on PHPStan. First and foremost, it is about improving code, as I regularly do for the Sulu CMS.

You can’t do it without a pipeline and continuous integration

Automatic quality gates, such as static code analysis and tests, must quickly become a real gate. If they are not green, code is not allowed into the production branch (formerly called “master”). If they are only executed locally or even only displayed in color in the IDE, that’s nice, but that’s all it is. Either you work with a real code bouncer, or you continue as before. Here, PHPStan is ingenious as a static code analysis tool that can be unleashed on current code without a fully functional environment. Therefore, it is advisable to follow the prescribed path, bring level 0 or level 1 to green, and then run a pipeline with a runner in GitLab. For this, a .gitlab-ci.yml with all Dockerfiles is available as open source on GitHub [3]. The never-code-alone page is available as an example of best practices as a PHP CMS project with the Sulu CMS.

Combination with Captain Hook – an unbeatable set-up

Pipelines are good. But the faster bugs are found, the better. So it only makes sense that PhpStorm includes a plug-in for PHPStan and will implement more quality assurance tools in the IDE in the future. Just as importantly, bugs should not be committed and make their way into the Git repository in the first place. This is where Git hooks are the answer. A CaptainHook configuration for PHP files can be seen in Listing 3 [4].

"pre-commit": {
  "enabled": true,
  "actions": [
    {
      "action": "\\CaptainHook\\App\\Hook\\PHP\\Action\\Linting",
      "options": [],
      "conditions": []
    },
    {
      "action": "vendor/bin/phpcs",
      "options": [],
      "conditions": []
    },
    {
      "action": "vendor/bin/phpstan analyse src -c phpstan.neon",
      "options": [],
      "conditions": []
    },
    {
      "action": "vendor/bin/xmllint config -v",
      "options": [],
      "conditions": []
    }
  ]
}

There’s more than PHPStan – static code analysis is great

One note in advance: There are also commercial tools that can find security vulnerabilities in code. A look into the world of static code analysis is definitely worthwhile. As can be seen in the CaptainHook configuration example above, four static code analyses are used locally before the commit. Here, there is a useful dump finder that validates my Twig files again. It’s worth taking a look, even if it’s not our current topic at hand.

The important thing is: I underestimated static code analysis and started relying on it far too late. As far as I was aware, there were only very simple linters, and those were not needed at all as long as there were no syntax errors.

Thanks to today’s IDEs, syntax errors are very unlikely, whether in Visual Studio Code or PhpStorm. I would never have thought that tools, and PHPStan in particular, could permanently enhance my code and be so much fun at the same time. CaptainHook and PHPStan formed the beginning of a wonderful journey towards better code, new knowledge, and a lot of fun as a tool setup. And that’s what people are looking for right now.

Conclusion: Creative work can’t be done without fun

Legacy code makes you sick. It’s another topic, but quality gates and tools are not used in the end of applications. Internally, programmers tend to have high expectations and are perfectionists. If these expectations cannot be met, a devaluing counter-reaction often follows. In a world of 0 and 1, it can seem as if there are no half measures. That is simply not true: even in code, better is simply better. Thanks to its levels, PHPStan can be adopted appropriately, easily, and right away. Teams can level themselves up. Additionally, new quality requirements offer the opportunity to bring existing open source projects to a better level and give something back to the community. So, friends, just get started: it’s worth it.

Links & Literature

[1] https://phpstan.org/user-guide/rule-levels

[2] https://phpstan.org/user-guide/baseline

[3] https://github.com/nevercodealone/cms-symfony-sulu/blob/master/.gitlab-ci.yml

[4] https://github.com/nevercodealone/cms-symfony-sulu/blob/master/captainhook.json

The post PHPStan ‒ Gamification for developers and teams appeared first on International PHP Conference.

Testing Strategy With the Help of Static Analysis

IPC editorial team — Mon, 28 Sep 2020 12:42:19 +0000

In this article, I’d like to introduce you to the concept of type safety and how it can improve the reliability and stability of your code. Once your code is more type-safe, and that fact is verified by automated tools, you can cherry-pick which parts of your application need extensive unit tests and where you can rely just on well-defined types.

Type System And Type Safety

To have a type system means to clearly communicate what kinds of values travel through the code. Since not all values can be treated the same, the more we know about them, the better. If you currently don’t have any type hints at all, adding information to the code about whether you’re accepting int, float, string or bool can go a long way.

But when a function declares it accepts an integer, does it really mean any integer? Just a positive integer? Or only a limited set of values, like hours in a day or minutes in an hour? Trimming down possible inputs reduces undefined behavior. Going down this road further means you have to start type-hinting your own objects which comes with additional benefits—not only that we know what we can pass to the function, it also tells us what operations (methods) the object offers.

I’m not saying scalar values are never enough, and you should always use objects, but every time you’re tempted to type hint a string, go through a mental exercise on what could go wrong with the input. Do I want to allow an empty string? What about non-ASCII characters? Instead of putting validation logic into a function that does something with a scalar value, create a special object and put the validation logic in its constructor. You don’t have to write the validation in each place where the object is accepted anymore, and you also don’t have to test the function’s behavior for invalid inputs provided the object cannot be created with invalid values.

For example, you might have a function which accepts a string for an email address, but you must check if the email is valid.

function sendMessage(string $email) {
   if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
      throw new \InvalidArgumentException(
         "Invalid email string"
      );
   }
   // do something
}

Instead, you can flip it and make your function—and any other one—explicitly expect an EmailAddress object as in Listing 1.

Listing 1

class EmailAddress
{
    /** @var string */
    private $email;

    public function __construct(string $email) {
       if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
          throw new \InvalidArgumentException(
             "Not a valid email string"
          );
       }
        $this->email = $email;
    }

    public function getAddress(): string {
       return $this->email;
    }
}

function sendMessage(EmailAddress $email) {
    // do something
}

Once your codebase is filled with type hints, IDEs and static analyzers know more about it, and you can benefit from them. For example, if you annotate a property with a phpDoc (native property types are only available from PHP 7.4 onward), these tools can verify:

The type hint is valid, and the class exists.
You’re assigning only objects of this type to it.
You’re calling only methods available on the type-hinted class.

/**
 * @var Address
 */
private $address;

The same benefits stemming from type hints are also applicable to function/method parameters and return types. There’s always the write part (what is returned from a method) and the read part (what the caller does with the returned value).

Listen to the Types

Types can also give you subtle feedback about the design of your application—learn to listen to it. One such case is inappropriate interfaces—when you’re implementing an interface, and you’re forced to throw an exception from half of the methods you have to add to the class, the interface isn’t designed well and will usually benefit from separating into several smaller ones. Using such an interface in production code is dangerous—by implementing it, you’re promising it’s safe to pass the object somewhere the interface is type-hinted but calling its methods will throw unexpected exceptions.
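As a minimal sketch of such a split, consider a storage abstraction. All interface and class names below are hypothetical and only serve as an illustration; the point is that a read-only implementation implements just the interface it can fully honor, so no method has to throw a "not supported" exception.

```php
<?php

// Hypothetical names, for illustration only.
interface ReadableStorage
{
    public function read(string $key): ?string;
}

interface WritableStorage
{
    public function write(string $key, string $value): void;
}

// A read-only store implements only the capability it really has.
final class ConfigFileStorage implements ReadableStorage
{
    /** @var array<string, string> */
    private $values;

    /** @param array<string, string> $values */
    public function __construct(array $values)
    {
        $this->values = $values;
    }

    public function read(string $key): ?string
    {
        return $this->values[$key] ?? null;
    }
}

// A full store opts in to both capabilities.
final class MemoryStorage implements ReadableStorage, WritableStorage
{
    /** @var array<string, string> */
    private $values = [];

    public function read(string $key): ?string
    {
        return $this->values[$key] ?? null;
    }

    public function write(string $key, string $value): void
    {
        $this->values[$key] = $value;
    }
}

// Callers type-hint only the capability they need.
function currentLocale(ReadableStorage $storage): ?string
{
    return $storage->read('locale');
}
```

Code that needs to write asks for WritableStorage, so passing a ConfigFileStorage there is rejected by the type system instead of failing with an exception at runtime.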

Another type of feedback comes from making use of information that’s unknown to the type system. If the developer knows and takes advantage of something that isn’t obvious from looking at the type hints, like checking a condition in advance and knowing what a method will return based on the result, it can make the tools fail with a false positive:

if ($user->hasAddress()) {
    // getAddress() return typehint is ?Address
    $address = $user->getAddress();

    // potentially dangerous - $address might be null
    echo $address->getStreet();
}

There’s no machine-readable connection between hasAddress() and getAddress() in type hints. The knowledge that the above code will always work is available only in the developer’s head or by closely inspecting the source code of the class. You might object that this example is too simple, and everyone understands what’s going on, but there are much more complex examples like this in the wild. For example, a Product object that has every property nullable because they can be empty while the Product is being configured in the content management system, but once it’s published and available for purchase on the site, they are guaranteed to be filled out. Any code that works only with published products has to make $value !== null checks in order to comply with the type system.

A solution to this problem is generally not to reuse objects for different use cases. You can represent published products with a PublishedProduct class where every getter is non-nullable.
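One possible shape of this idea is sketched below. The ProductDraft class, the fromDraft() factory, and the two properties are assumptions made for the example; the text above only names PublishedProduct. The nullable values are checked in exactly one place, and everything that receives a PublishedProduct can rely on non-nullable getters.

```php
<?php

// Hypothetical draft object: fields may still be missing while editing.
final class ProductDraft
{
    /** @var string|null */
    public $name;

    /** @var int|null Price in cents */
    public $priceInCents;
}

// A published product can only exist with complete data.
final class PublishedProduct
{
    /** @var string */
    private $name;

    /** @var int */
    private $priceInCents;

    private function __construct(string $name, int $priceInCents)
    {
        $this->name = $name;
        $this->priceInCents = $priceInCents;
    }

    // The single place where the nullable values are checked.
    public static function fromDraft(ProductDraft $draft): self
    {
        if ($draft->name === null || $draft->priceInCents === null) {
            throw new \InvalidArgumentException('Draft is incomplete.');
        }

        return new self($draft->name, $draft->priceInCents);
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getPriceInCents(): int
    {
        return $this->priceInCents;
    }
}
```

Checkout code would then type-hint PublishedProduct and needs no further null checks.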

Tools for Finding Type System Defects (a.k.a. Bugs)

Because PHP is interpreted at runtime, it does not discover type system defects in advance; that’s usually the job of a compiler. A program in C# or Java will refuse to execute if there’s a problem like an undefined variable, calling an unknown method or passing an argument of a wrong type somewhere deep in the code. In PHP, if there’s an error like that in the third step of the checkout process, the developer (or the user) will find it when they execute that line of code during testing or in production. But thanks to the latest advancements in the language itself, like scalar and nullable type hints, it’s now easier to be sure about types of many variables and other expressions just by looking at the code without the need to execute it.

That’s where static analyzers come into play. They gather all available information about the code—besides native type hints, they understand common phpDoc conventions, employ custom plugins and analyze loops and branches to infer as many types as possible.

One of these tools is PHPStan; it’s open-source and free to use (disclaimer: I’m the main developer of PHPStan). Other similar tools are Phan, Exakat, and Psalm. Besides obvious errors, PHPStan can also point out code that can be simplified, like always-false comparisons using ===, !==, instanceof, or isset() on never-defined variables, duplicate keys in literal arrays, unused constructor parameters, and much more. Because running a comprehensive analysis on an existing codebase for the first time can result in an overwhelming list of potential issues, PHPStan supports gradual checking. Its goal is to enable developers to start using the tool as soon as possible and to feel like they’re leveling up in a video game.

vendor/bin/phpstan analyse src/

If you run the PHPStan executable without any flags, it will run the basic level zero by default, checking only types it’s completely sure about, like methods called statically and on $this. It does not even check types passed to method arguments until level five (only the number of passed arguments is checked on lower levels), but it definitely finds a lot of issues in between.

PHPStan is extensible—you can write custom rules specific to your codebase and also extensions describing behavior of magic __call, __get, and __set methods. You can also write a so-called “dynamic return type extension” for describing return types of functions or methods which vary based on various conditions like types of arguments passed to the function or the type of object the method is called on. There are already plenty of extensions available for popular frameworks like Doctrine, Symfony, or PHPUnit.

How to Save Time with a Static Analyzer

We already established there is a lot of value to gain from the type system. But how can we use it to save precious time and resources? When testing a PHP application, whether manually or automatically, developers spend a lot of their time discovering mistakes that wouldn’t even compile in other languages, leaving less time for testing actual business logic. Typically, there’s duplicate effort because some bugs are discovered by both static analysis and by unit tests as in Figure 1.

Figure 1.

Since tests must be written by humans and represent code as any other, they are very costly—not only during the initial development but for maintenance as well. Our goal should be to make those two sets disjoint, so we don’t write any test which can be covered by static analysis. And while we’re at it, we can try to make the blue circle as big as possible so the type system gains more power and is able to find most bugs on its own.

One could object we’ll save time by not writing redundant tests, but that time will get used up by writing classes, adding type hints, thinking about interfaces and structuring the code differently to benefit from the type system as much as possible. To counter the objection, I’d argue tests get written only to prevent bugs, but solid and strong types have benefits reaching much further—they improve readability and provide necessary communication and documentation about how the code works. Without any types, the orientation in the code is much harder not only for static analyzers but for developers too.

PHP is naturally very benevolent about the handling of data which usually goes against safety and predictability. Many of the tips I’m sharing below are about cutting down the space of possible outcomes, resulting in simpler code.

Tips for More Strongly-Typed Code

1. Don’t Hide Errors

Turn on reporting and logging of all errors using error_reporting(E_ALL);. Especially notices (e.g., E_NOTICE), regardless of their name, are the most severe errors that can occur—things like undefined variables or missing array keys are reported as notices.
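One common way to make sure such errors cannot be overlooked is to convert them into exceptions. The following is a minimal sketch of that idiom, not something the tip above prescribes; readPort() is a hypothetical helper used only to trigger the error.

```php
<?php

error_reporting(E_ALL);

// Turn every reported error (notices and warnings included) into an
// exception so that it cannot slip by unnoticed.
set_error_handler(
    static function (int $severity, string $message, string $file, int $line) {
        throw new \ErrorException($message, 0, $severity, $file, $line);
    }
);

// Hypothetical helper for the demonstration.
function readPort(array $config): int
{
    // A missing key (a notice in PHP 7, a warning in PHP 8) now fails
    // loudly instead of silently yielding null.
    return (int) $config['port'];
}
```

With the handler installed, readPort([]) throws an ErrorException, while readPort(['port' => '8080']) returns 8080.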

2. Enable Strict Types

Use declare(strict_types = 1); on top of every file. This ensures only values of compatible types can be passed to function arguments, basically that “dogs” does not get cast to 0. I can’t recommend this mode enough; its impact can be compared to turning on notice reporting. The per-file basis allows for gradual integration—turn it on in a few selected files and observe the effects, rinse and repeat until it’s on in all the files.
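The effect can be observed with a small experiment. The function names below are made up for the demonstration; the sketch only shows that, in strict mode, a numeric string is no longer silently converted when passed to an int parameter.

```php
<?php
declare(strict_types=1);

function minutesToSeconds(int $minutes): int
{
    return $minutes * 60;
}

// Helper for the demonstration: reports whether the call was rejected.
function isRejected($value): bool
{
    try {
        minutesToSeconds($value);

        return false;
    } catch (\TypeError $e) {
        return true;
    }
}

var_dump(isRejected(5));   // bool(false): an int is accepted
var_dump(isRejected('5')); // bool(true): a numeric string is rejected
```

Without the declare line, the second call would be accepted and '5' would be converted to the integer 5.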

3. Encapsulate All Code

All code should be encapsulated in classes or at least functions. Having all the variables created in the local scope helps tremendously with knowing their type. For the same reason, you shouldn’t use global variables. Instead of procedural scripts stitched together via include and using variables appearing out of nowhere, everything is neatly organized and obvious.

4. Avoid Unnecessary Nullables

Avoid nullables where they’re not necessary. Nullable parameters and return types complicate things. You have to write if statements to avoid operating on null. More branches mean there’s more code to test. Having multiple nullable parameters in a method usually means only some nullability combinations are valid. Consider this example which sets a date range on an object:

public function setDates(\DateTimeImmutable $from, 
                         \DateTimeImmutable $to);

If a requirement comes that the dates should also be removable, you might be tempted to solve it this way:

public function setDates(?\DateTimeImmutable $from, 
                         ?\DateTimeImmutable $to);

But that means you can call the method with only a single date, leaving the object in an inconsistent state.

$object->setDates(new \DateTimeImmutable('2017-10-11'), 
                  null);
$object->setDates(null, 
                  new \DateTimeImmutable('2017-12-07'));

You can prevent this from happening by adding an additional check to the method body, but that bypasses the type system and relies only on runtime correctness. Try to distinguish between different use cases not by making the parameters nullable but by adding another method:

public function setDates(\DateTimeImmutable $from, 
                         \DateTimeImmutable $to);
public function removeDates();
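Put together, a class following this advice could look like the sketch below. The class name Booking and the hasDates() helper are assumptions added for illustration; the two methods mirror the signatures above.

```php
<?php

// Hypothetical class; setDates() and removeDates() follow the signatures above.
final class Booking
{
    /** @var \DateTimeImmutable|null */
    private $from;

    /** @var \DateTimeImmutable|null */
    private $to;

    public function setDates(\DateTimeImmutable $from, \DateTimeImmutable $to): void
    {
        // Both dates always arrive together, so the pair cannot be half-set.
        $this->from = $from;
        $this->to = $to;
    }

    public function removeDates(): void
    {
        $this->from = null;
        $this->to = null;
    }

    public function hasDates(): bool
    {
        return $this->from !== null && $this->to !== null;
    }
}
```

The properties are still nullable internally, but no caller can bring the object into a state where only one of the two dates is set.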

5. Avoid Associative Arrays

When type-hinting an array, the code does not communicate that the developer should pass an array with a specific structure. And the function which accepts the array cannot rely on the array having specific keys and that the keys are of a certain type. Arrays are not a good fit for passing values around if you want maintainability and a reliable type system. Use objects which enforce and communicate what they contain and do. The only exception where arrays are fine is when they represent a collection of values of the same type. This can be communicated with a phpDoc and is enforceable using static analysis.

/**
 * @param User[] $users
 */
public function setUsers(array $users);
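
To illustrate the difference, here is a small sketch in which a value object replaces an associative array such as ['email' => ..., 'age' => ...]. The fields of User and the averageAge() function are assumptions made up for this example; they do not come from the article.

```php
<?php
declare(strict_types=1);

// Hypothetical value object; the constructor enforces which fields exist and their types.
final class User
{
    /** @var string */
    private $email;

    /** @var int */
    private $age;

    public function __construct(string $email, int $age)
    {
        $this->email = $email;
        $this->age = $age;
    }

    public function getEmail(): string
    {
        return $this->email;
    }

    public function getAge(): int
    {
        return $this->age;
    }
}

/**
 * A plain array is fine here because it is a collection of one type.
 *
 * @param User[] $users
 */
function averageAge(array $users): float
{
    if (count($users) === 0) {
        return 0.0;
    }

    $sum = 0;
    foreach ($users as $user) {
        $sum += $user->getAge();
    }

    return $sum / count($users);
}

var_dump(averageAge([new User('a@example.com', 30), new User('b@example.com', 40)]));
```

The caller cannot forget a key or pass a string where an integer is expected, and a static analyzer can check every getAge() call.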

6. Avoid Dynamic Code

Avoid dynamic code like $this->$propertyName or new $className(). Also, don’t use reflection in your business logic. It’s fine if your framework or library uses reflection internally to achieve what you’re asking it to do but stay away from it when writing ordinary code. Do not step outside the comforts of the type system into the dangerous territory of reflection.
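
One possible alternative to new $className() is an explicit factory that lists every class it can create. The following is only a sketch; Exporter, CsvExporter, JsonExporter and createExporter() are hypothetical names chosen for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical interface and implementations, used only for illustration.
interface Exporter
{
    public function export(array $rows): string;
}

final class CsvExporter implements Exporter
{
    public function export(array $rows): string
    {
        $lines = [];
        foreach ($rows as $row) {
            $lines[] = implode(',', $row);
        }

        return implode("\n", $lines);
    }
}

final class JsonExporter implements Exporter
{
    public function export(array $rows): string
    {
        $json = json_encode($rows);
        if ($json === false) {
            throw new \RuntimeException('Rows could not be encoded as JSON.');
        }

        return $json;
    }
}

// Every creatable class is spelled out, so a static analyzer sees each "new".
function createExporter(string $format): Exporter
{
    switch ($format) {
        case 'csv':
            return new CsvExporter();
        case 'json':
            return new JsonExporter();
        default:
            throw new \InvalidArgumentException('Unknown export format: ' . $format);
    }
}

echo createExporter('csv')->export([[1, 2], [3, 4]]), "\n";
```

The switch is more verbose than a dynamic instantiation, but the return type is known and an unsupported format fails in one well-defined place.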

7. Avoid Loose Comparisons

Don’t use loose comparisons like == or !=. When comparing different types, PHP will try to guess what you mean, which leads to very unexpected results. I circled the most surprising ones in Figure 2. Did you know that “0” equals false? Or that array() equals null?

Figure 2.

Instead, use strict comparisons like === and !==. They require both the types and the values to match. In the case of objects, === will return true only if it’s the same instance on both sides. For DateTime objects, where the comparison operators are overloaded by the language, using == used to be acceptable, but when PHP 7.1 introduced microseconds, it broke a lot of code. I recommend comparing DateTime instances by first calling ->format() with the required precision and then comparing the resulting strings using strict operators.
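
The following sketch shows a few of these comparisons side by side, plus one way to compare dates at second precision. The helper sameSecond() is a hypothetical name, and it assumes both objects use the same time zone, because format() renders each object in its own zone.

```php
<?php
declare(strict_types=1);

// Loose comparisons: PHP converts the operands before comparing them.
var_dump('0' == false);  // bool(true)
var_dump([] == null);    // bool(true)
var_dump('1' == '01');   // bool(true), both are numeric strings

// Strict comparisons: types and values must both match.
var_dump('0' === false); // bool(false)
var_dump([] === null);   // bool(false)
var_dump('1' === '01');  // bool(false)

// Hypothetical helper: compare two dates at an explicitly stated precision.
// Assumes both objects are in the same time zone.
function sameSecond(\DateTimeInterface $a, \DateTimeInterface $b): bool
{
    return $a->format('Y-m-d H:i:s') === $b->format('Y-m-d H:i:s');
}

$a = new \DateTimeImmutable('2017-10-11 12:00:00.100000');
$b = new \DateTimeImmutable('2017-10-11 12:00:00.900000');

var_dump($a == $b);           // bool(false) since PHP 7.1, the microseconds differ
var_dump(sameSecond($a, $b)); // bool(true)
```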

8. Avoid Empty Comparisons

A similar case of loose typing is the empty construct. Here’s a list of values considered empty:

“” (an empty string)
0 (an integer)
0.0 (a float)
“0”
null
false
an empty array

This makes empty unusable for any input validation. Instead, work with specific values and write a specific comparison for what you’re trying to achieve. Don’t use empty when asking about an empty array; use count($array) === 0. Don’t use it to detect an empty string either, because it also treats “0” as empty; write $str !== '' instead.
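
A short sketch of the difference, using two hypothetical helper functions (isBlank() and hasNoItems() are names invented for this example):

```php
<?php
declare(strict_types=1);

// Hypothetical helpers that ask one precise question each.
function isBlank(string $value): bool
{
    return $value === '';
}

function hasNoItems(array $items): bool
{
    return count($items) === 0;
}

// empty() lumps very different values together.
var_dump(empty('0'));     // bool(true), although "0" is a perfectly valid input
var_dump(empty(0.0));     // bool(true)

// The explicit comparisons only match what they name.
var_dump(isBlank('0'));   // bool(false)
var_dump(isBlank(''));    // bool(true)
var_dump(hasNoItems([])); // bool(true)
```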

9. Have Only Booleans in Conditions

If you look at the table summarizing loose comparisons, the true and false columns summarize what happens when a bare value is put into a condition, and the result can surprise you. Again, use explicit comparison against the value you’re looking for. Tip: PHPStan’s extension phpstan/phpstan-strict-rules enforces this and other rules for strictly-typed code.

What Tests Do We Need?

All the tips above were of local character—how to improve specific places in your code to get the most out of the type system. What I’m about to share now impacts the architecture of the whole application and makes opinionated decisions about what has to be unit-tested and where the static analysis of the type system is sufficient.

First, let’s clarify why I think a unit test is the most valuable kind of test and what criteria it needs to meet. Unit tests focus on testing one small unit, usually a class or a function. When a unit test fails, we know exactly which place needs to be fixed. In contrast, when an integration or even a UI test fails, we have no idea where to look. A third-party service could be down, our database could have changed, or maybe we just moved some button 10 pixels to the right and changed its text. Unit tests shouldn’t have any dependencies outside of the code. They should not connect to the database, access the filesystem, or send anything over the network. Since they focus on testing just the application code, they tend to be very fast. Having some integration tests helps, mainly for things that can’t be unit-tested because of their nature, but unit tests should form the big and solid base of your test pyramid.

Application code can be divided into two main types: wiring and business logic. Wiring is what holds the application together; controllers, facades, passing values along, getters and setters. Business logic is what justifies the existence of our application and what we’re paid to do, such as mathematical operations, filtering, validation, parsers, managing state transitions, etc. A static analyzer will tell you if you’re calling an undefined method, but it can’t know you should have used ceil() instead of floor() or that you should have written that if condition the other way around.

So how can one justify the existence of wiring? In modern applications, there’s a lot of it, and we’re better for it. Wiring makes reusability possible. We can extract common pieces of code and call them from multiple places. It improves readability by splitting code into smaller chunks. Thanks to wiring, our code is more maintainable.

But because of its nature, testing wiring can become tedious and boring. Testing setters, getters, and whole classes whose only purpose is to forward values to subsequent layers reveals few errors and is not very economical. Once we have type hints for everything and employ a static analyzer, that should be enough to verify the wiring code works as expected.

The role of the entry point to an application, like a controller, is to sanitize all incoming data and pass them further down the line as well-defined and validated types. If you allow strong types to grow through your application, it becomes much more solid and statically verifiable as shown in Figure 3.

Figure 3.

Unit tests should focus on business logic. A classic mistake is to interweave business logic with data querying, making the job of a unit test much harder:

function doSomething(int $id): void {
   $foo = $this->someRepository->getById($id);
   // business logic performed on $foo...
   $bars = $this->otherRepository->getAllByFoo($foo);
   // business logic performed on $foo and $bars
}

With this code structure, you have no option other than mocking to feed data to the interesting lines you want to test. In a well-architected application, mocking shouldn’t be necessary: it is white-box testing, by definition dependent on the inner structure of the tested code, and therefore more prone to breaking.

Instead, move the business logic to a separate class and design its public interface to receive all the data it needs for its job. This class does not need to perform any database queries or other side effects. It will receive data from the real source in production and manually created data in tests. You can write a lot of simple unit tests without any mocking and test every edge case, since you saved a lot of time by not testing the wiring code (Listing 2)!

Listing 2

class Calculator
{
   /**
    * @param Foo $foo
    * @param Bar[] $bars
    */
   public function calculate(Foo $foo,
                             array $bars): CalculatorResult {
      // ...
   }
}
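
To make the idea concrete, the sketch below fills the Calculator from Listing 2 with a trivial body and exercises it with hand-built objects. The class names come from the listing, but their fields (a base price, a surcharge, a total) and the summing logic are assumptions invented purely for illustration.

```php
<?php
declare(strict_types=1);

// Fields and logic below are hypothetical; only the class names appear in Listing 2.
final class Foo
{
    /** @var int */
    private $basePrice;

    public function __construct(int $basePrice)
    {
        $this->basePrice = $basePrice;
    }

    public function getBasePrice(): int
    {
        return $this->basePrice;
    }
}

final class Bar
{
    /** @var int */
    private $surcharge;

    public function __construct(int $surcharge)
    {
        $this->surcharge = $surcharge;
    }

    public function getSurcharge(): int
    {
        return $this->surcharge;
    }
}

final class CalculatorResult
{
    /** @var int */
    private $total;

    public function __construct(int $total)
    {
        $this->total = $total;
    }

    public function getTotal(): int
    {
        return $this->total;
    }
}

final class Calculator
{
    /**
     * Pure business logic: no repositories, no queries, no side effects.
     *
     * @param Bar[] $bars
     */
    public function calculate(Foo $foo, array $bars): CalculatorResult
    {
        $total = $foo->getBasePrice();
        foreach ($bars as $bar) {
            $total += $bar->getSurcharge();
        }

        return new CalculatorResult($total);
    }
}

// A test needs no mocks: it simply constructs the input by hand.
$result = (new Calculator())->calculate(new Foo(100), [new Bar(5), new Bar(10)]);
var_dump($result->getTotal() === 115); // bool(true)
```

The method that talks to the repositories shrinks to fetching $foo and $bars and passing them on, which is exactly the kind of wiring the type system can verify.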

This is also the reason why I like using ORMs like Doctrine. They get a bad reputation because people use them for the wrong reasons. You shouldn’t look to an ORM to generate SQL queries for you because the tool doesn’t know what you will need in advance, resulting in poor performance. You shouldn’t use an ORM so you can switch to a different database engine one day. Quite the opposite—you’re missing out if you’re not using the advanced features of your database of choice. The reason why I like to use an ORM is because it allows me to represent data as objects—they can be type-hinted, constructed by hand for the purpose of unit tests, and contain methods which guarantee consistent modifications.

With tests, you can measure code coverage, the percentage of lines of code executed during test runs. I propose a similar metric for static analysis. If having more type hints means we can rely on the code more, there should be a number associated with that. I propose the term “type coverage” to express how many variables and other expressions resolve to mixed and how many have a more specific type.

Closing Words

A static analyzer should be in the tool belt of every PHP programmer. Like a package manager and a unit testing framework, it’s an indispensable tool for building quality applications. It should run alongside tests on a continuous integration server because its function is similar: to prevent bugs. Once we get used to the feedback from the type system, we can concentrate our testing efforts in places where the static analyzer can’t know how the right code should look. In the rest of the application, it has us covered.

The post Testing Strategy With the Help of Static Analysis appeared first on International PHP Conference.

Crowdtesting in quality assurance – The massive hunt for mistakes

IPC editorial team — Wed, 29 Aug 2018 08:50:29 +0000

Crowdtesting usually describes the paid testing of software by a large crowd of free and independent testers over the internet. It is comparable to an outsourced quality assurance measure. It can be used very flexibly, starting with the very first click dummy in the software development process. The primary goal is to test application software on all relevant end devices and operating systems.

The term crowdtesting is based on the term crowdsourcing (crowd and outsourcing), which was introduced by Jeff Howe in his 2006 article “The Rise of Crowdsourcing” [1]. The first attempts to define the test method were made between 2009 and 2011, accompanied by a number of contributions on the topic.

The professional use of this test method as a service emerged at the end of 2010 and was initially well received in the start-up scene. With the growing number of devices on the market, crowdtesting is becoming increasingly important for companies developing application software, especially in mobile development.

Figure 1 shows the three most important participants in crowdtesting.

The provider and his platform

Numerous providers are available for a crowdtest operation. They are either dedicated exclusively to this service or offer it as an additional service. Popular representatives include the following providers [2]:

RapidUsertests (Berlin, Germany)
Applause (Framingham, USA)
Passbrains (Zurich, Switzerland)
BugFinders (Cheltenham, Great Britain)
99tests (Bangalore, India)
Testbirds (Munich, Germany)

With 150 000 registered testers and 380 000 devices in all parts of the world, Testbirds is one of the world’s biggest crowdtesting platforms. Due to the testers being scattered around the world, they can test application software in different languages and on a wide variety of devices.

The provider and his platform act as a central communication point between the testers and the client:

Providers and testers communicate over a web-based platform, which provides all necessary functionality. Depending on the requirements, a project manager on the provider side, for example from quality assurance, looks after the results provided by the testers and also serves as a first point of contact for other open issues.
Providers and clients use the web platform for communication and also to make direct agreements. On the client side, this is usually done by the test manager. Depending on the framework conditions, these agreements specify constraints, such as the devices or operating systems on which the tests are executed. In this way, the test manager accompanies the crowdtesting process from beginning to end.
Testers and the client’s developers rarely come into contact. If they do, it is indirectly, through posts on the platform about the defects (errors) found.

The tester and his heterogeneity

Testers come from all social classes, interest groups, cultural and religious backgrounds, and age groups, and have very different technical affinities, whether they deal with the topic professionally or just for fun. In theory, there are therefore no restrictions on who can qualify as a tester. In practice, of course, some basic conditions must be fulfilled, for instance a person’s legal capacity.

These people can register on a crowdtesting platform and wait until a test object is commissioned. If they meet the criteria and have the appropriate test environment (device), interested testers can then apply for a test run. In practice, testers are classified according to, for example, their experience or how often they participate, and are selected on that basis. They must then document and record every error they find as specified in the assignment.

The testers receive payment for the services they perform. Payments vary from provider to provider and are usually based on the number of defects found, tiered according to risk classes. Other factors that can affect payment are the experience of the testers and the duration of their platform membership.

The client and his software product

All persons and companies who develop application software, or have it developed, are eligible as clients. The main focus is either on finding errors or on user and market analysis through usability feedback. The client also hopes for a number of advantages, which are considered separately below.

The advantages of crowdtesting for the client

The Internet of Things (IoT for short) [3], and especially the test coverage that has to be achieved on countless mobile devices, are making crowdtesting more popular. Even large software companies with a powerful QA team are no longer able to cover the device market satisfactorily. The availability of software on many different devices is becoming increasingly important and represents a significant competitive advantage. In the long run, you will no longer be in the comfortable position of excluding devices and thus leaving important market share to the competition. This raises the question of how many devices, with different specifications and arbitrary operating systems, can be tested when they will be obsolete again tomorrow.

In summary, this results in six features that clearly speak for the use of crowdtests (Fig. 2).

Flexibility:

Technically, a crowdtest can be placed at any point in the software life cycle. Whether that makes sense has to be weighed up, though, because not every phase is a suitable candidate. The classic use of crowdtests is functional testing in search of errors. In earlier phases, a crowdtest can be applied to click dummies in order to perform usability tests or market analyses. The crowd is highly flexible about when the tests are executed; they can be carried out at any time. For example, at the beginning of the week the client can immediately view the defects found over the weekend and proceed to fix them. The tests can be carried out either exploratively, i.e. free of any specifications, or with specifications, up to and including very strict constraints.

Cost reduction:

It can generally be assumed that all market-relevant devices, in their various versions and designs, are represented in a large crowd. The customer can therefore avoid provisioning an extensive device pool, along with the costs of administering, maintaining and versioning it. Outsourcing the test operation also frees up internal resources in the form of employees who would otherwise be permanently tied up.

Real market conditions:

An end user should test the application unaltered and without being bound to a place or time. This best reflects the real market. To ensure neutrality in testing, the tester should also work independently. A crowdtester meets these requirements.

He tests whenever it is convenient, whether on the sofa, while walking, or on the bus with fluctuating reception. The result is feedback from real end users in real scenarios.

High diversification of end devices:

In theory, crowdtesting is distinguished by the availability of all end devices and their combinations with other influencing factors such as operating system, browser and display size. This makes it possible to test, and ultimately to guarantee, the compatibility of software with as many end devices as possible. The strength of crowdtesting is that it allows comprehensive testing. With traditional methods (regression, smoke or functional tests, etc.), this is no longer affordable and almost impossible in the age of IoT.

Experienced and unbiased testers:

The testers’ experience is quite broad. Some earn their living with it, while others test only sporadically, for fun or out of sheer curiosity. This strikes a balance between experience-oriented testing and the explorative, naive approach. The impartiality of the testers, which the International Software Testing Qualifications Board demands in its ISTQB standard [4], is guaranteed. In addition, the testers are not involved in the development process.

Fast and precise feedback:

Defects are documented via a template and complemented with video recordings or images. With video recordings, the defects can be reproduced quickly and unambiguously. Direct feedback to the testers is platform-dependent and possible, for example, through a chat or a comment function. In rare cases, personal contact data is stored. Since the recordings are created during testing, they are available on the platform almost simultaneously. This makes it possible to improve usability while development is still under way.

Method meets reality

The advantages of crowdtesting inevitably depend on the provider and its platform. We therefore now consider the associated disadvantages, their causes and countermeasures drawn from practice. Take note of the following points, because they are the basics of every successful crowdtest operation.

No acceptance without usability:

The usability of the platform and the allocation of the most important and necessary functions are critical for the use and acceptance of the platform.

A good provider has a modular platform that can be configured quickly, so that only the functions needed for a customer request are provided. Furthermore, the interface must be easy to understand and fun to use. A lengthy registration process for new users is the first indicator of a lack of usability. If the application has too many functions, long navigation paths and never-ending drop-down menus, you are strongly advised not to use this platform.

Defect documentation free of interpretation:

If there is no possibility of direct feedback, defects must be documented in a way that leaves no room for interpretation, so that they can be reproduced quickly. It is recommended to record how the defect arises, for example in a short video. The reason is that only in the rarest of cases is there a direct channel to the testers, such as a telephone number. In most cases the platform does provide a query function, e.g. a forum. However, an answer in this forum can be expected after three hours at the earliest, no matter how urgent the question might be. No tester is obliged to be available on demand outside of their testing hours. And it’s not unheard of for written documentation to turn out to be incomplete. If no direct communication with the tester is possible, it should be checked whether documentation as a video is possible and feasible. This approach prevents a lot of frustration during reproduction and correction.

Use with caution:

Crowdtests are promoted as flexible all-purpose weapons, which they are not. It therefore makes sense to examine their usefulness and area of application. A crowdtest is a good idea if the company structure and the prevailing development model can integrate it. For example, it can be integrated very well as an additional QA method in the V-model. In agile development models that aim for 100% test automation, an explorative crowdtest at the end of development makes sense. We’re talking about the top of the test pyramid. The prevailing methodology should continue to be applied. Widespread use is not recommended, however, as the costs incurred are better invested in other test methods. Furthermore, this form of testing is very well suited for a usability test, a code review or a functional test in the final stage. This is less true for module or integration tests. It is recommended to check to what extent crowdtesting can supplement the existing quality assurance, but not replace it. A reputable provider who has long-term customer loyalty in mind will analyze the circumstances together with the customer and will also point out suitable areas of application.

Restricted real market conditions:

Real market conditions apply only to a limited extent. Who is going to wait on a bad Internet connection when payment depends on results? The primary goal of the tester is to find defects. After all, this is what he gets paid for, unless he is remunerated for his working hours. Testers therefore want to navigate through the application as quickly as possible in order to generate inputs that cause the application to stumble. This works best with a good Internet connection and in a quiet environment. If an on-the-go application is to be tested under real-life conditions, you should be aware that the crowdtest is usually performed at home in order to avoid performance issues.

Many testers, many duplicates:

Many testers mean quite a lot of duplicates, so good support from the provider is absolutely essential. The testers do not know each other, nor are they aware of each other’s test results. The probability is therefore very high that a defect will be recorded several times. Here, the provider is obliged to accompany the crowdtest with a test manager who ensures high-quality documentation and results without duplicates. The scope of the provider’s services should be examined and renegotiated if necessary.

Additional work through crowdtests:

You live and breathe Scrum and, true to the Agile Manifesto, avoid unnecessary extra work? Crowdtests require planning and coordination and thus generate additional costs. Especially for agile development methods with a high degree of test automation (test-driven development) [5], a crowdtest as an explorative test can be of great use. However, this requires a coordinator or a test manager, a role that does not always fit the philosophy (Scrum, for example). With well-thought-out platforms and good interfaces to other tools, good providers minimize the planning and coordination effort for the client. Appointing a person responsible for coordination cannot be avoided, though. Another recommendation is to ask for a demonstration close to your own project in order to gain insight into the processes and to better estimate the additional effort.

Does the crowd meet the requirements?

A crowd is only as versatile and flexible as the mass it embodies. Is the application to be tested today in German and tomorrow in Spanish, Chinese and Russian, on twenty of the most popular devices and operating systems? Not every crowd can keep up with that scale. A provider with 100 000 testers worldwide is usually much more diverse and flexible, can react better to changes, and can continue to meet customer requirements as they scale up. It is advisable to find out about the size of the crowd and to select a suitable provider. It should not simply be the cheapest one, which may no longer fit tomorrow. The advice here is to plan for the medium term to avoid changing providers.

Avoiding effort in defect management:

Interfaces to the most common application lifecycle management (ALM) systems minimize the effort and facilitate further processing [6]. The defects documented on the platform are rarely corrected immediately; they pass through various processing statuses in the company’s regular process. Maintenance and tracking are then carried out in the tool specified by the company, e.g. HP ALM [7] or DOORS [8], so the defects created on the platform have to be transferred to that system. Without an interface to the system, this process is very time-consuming, because the defects have to be re-entered manually. It must be ensured that the provider offers an interface to the customer’s system or exports a format that the in-house system can import.

Final advice for the usage

It is almost impossible to assure quality with crowdtests alone; they are simply not suitable for this task. However, the method is highly recommended as a supplementary and flexibly plannable quality assurance method. Especially with regard to the countless devices and their volatility, crowdtesting avoids an elaborate and expensive device pool. Explorative tests, usability tests and market analyses are the great strengths of the method. It is difficult to give a general recommendation for the use of crowdtests, because their use depends on many individual factors. The client has to examine their usefulness in each individual case; a reputable provider gives independent advice on this and supports prospective customers with the selection.

The success of the method stands or falls with the provider and its platform. This is where the wheat is separated from the chaff. The scope of the service and support must therefore be examined carefully. It makes sense to take a close look at the provider, to ask for a detailed demonstration of the application and the processes, and to use the method first in a non-critical project. Users will quickly notice if the requirements are not met or if the effort involved in processing the defects found has been estimated unrealistically.

Conclusion and outlook

All things considered, a well-placed crowdtest with a carefully selected provider will have an impact on the quality of a software solution.

I am firmly convinced that crowdtesting will bring about far-reaching changes in quality assurance and will see significant growth. Existing test centers that have been outsourced to near-, off- or smart-shore projects for cost reasons will become significantly less important. After all, they are only a reflection of the old internal structures and processes, in a different place, at a different time, with the same or even greater effort and at only slightly lower cost.

These arguments will no longer meet the requirements of modern and flexible quality assurance in the future, because crowdtesting is tied neither to one place nor to one time, involves significantly fewer processes and thus saves time and costs. In the foreseeable future, outsourced test centers will either become crowdtesting platforms themselves or disappear completely, as crowdtesting platforms are already successfully offering test center services.

The post Crowdtesting in quality assurance – The massive hunt for mistakes appeared first on International PHP Conference.

Is my shop up to the task? Realistic performance tests for operative security

IPC editorial team — Tue, 14 Aug 2018 10:19:21 +0000

When a website is relaunched or a new version of a web application is released, an application that is complete enough to be load tested is often only available just in time for the relaunch. For the sake of simplicity, and because not enough time was scheduled for this in the first place, we then fall back on simpler tools such as ab (Apache Benchmark) or siege. Both tools let you call a single URL repeatedly and in parallel, and give you a first indication of the website’s response time. Both produce simple numbers that are easy to communicate: requests per second and minimum, maximum and average response times.

However, in most cases I advise against taking this seductively easy way. None of the problems I found with realistic load tests for customers would have been uncovered this way, and the numbers mentioned are actually quite limited in their significance. The same applies to the problems that caused failed relaunches. Conversely, with realistic load tests I have so far always been able to ensure that the tested website coped with the expected load. How exactly can this be achieved?

Realistic user scenarios

The first and probably most important aspect is realistic user scenarios. This raises the question of what the website is actually used for. We have to ask which actions visitors usually perform and in what ratio these actions stand to each other.

Using the example of an online shop, these could be the following scenarios:

An anonymous random-browser
A new customer log in
A logged in random-browser
A check-out of a filled shopping cart

Depending on the online shop, other relevant user scenarios may also occur, for example if popular product configurators exist or wish lists play an important role. For each scenario, its share of the website’s total traffic should be known. These figures are sometimes difficult to estimate, because user behavior after a relaunch, on a new website or during an advertising campaign cannot necessarily be predicted perfectly. However, using previous figures or figures that are customary in your own line of business usually provides a sufficiently accurate estimate.
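
How such shares translate into a test can be sketched as a weighted choice: each simulated visitor is assigned a scenario with a probability equal to that scenario’s share of traffic. The function pickScenario() and the percentages below are hypothetical; real shares would come from your own analytics or access logs.

```php
<?php
declare(strict_types=1);

/**
 * Picks a scenario according to its share of total traffic.
 *
 * @param array<string, float> $shares scenario name => share, expected to sum to 1.0
 * @param float                $roll   random number in the range [0, 1)
 */
function pickScenario(array $shares, float $roll): string
{
    $cumulative = 0.0;
    $last = '';

    foreach ($shares as $name => $share) {
        $cumulative += $share;
        $last = $name;
        if ($roll < $cumulative) {
            return $name;
        }
    }

    // Fallback in case floating-point rounding leaves the sum slightly below 1.0.
    return $last;
}

// Invented example shares for the four scenarios listed above.
$shares = [
    'anonymous_browser'  => 0.70,
    'new_customer_login' => 0.05,
    'logged_in_browser'  => 0.20,
    'checkout'           => 0.05,
];

// mt_rand() / (mt_getrandmax() + 1) yields a value in [0, 1).
$roll = mt_rand() / (mt_getrandmax() + 1);
echo pickScenario($shares, $roll), "\n";
```

Passing the random number in as a parameter keeps the function deterministic, so the distribution logic can itself be unit-tested.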

Why these scenarios?

Caching is the primary reason for using realistic scenarios. This means not only your own caching in the web application or in reverse caching proxies (Varnish, nginx), but also caching in various layers that are not directly under your control, such as database servers, opcode caches or kernel caches. If, as is the case with ab and siege, a small number of identical URLs is called again and again, all these layers can keep the corresponding data in memory very easily and efficiently. Unfortunately, this does not correspond to the expected behavior of real users, so such tests are not meaningful.

If you look at the simple schematic representation of an online shop in Figure 1, such a shop today consists of many components, which often have their own cache lifetimes and contexts. While the header and navigation are often comparatively static, comments change when new ones are approved, product descriptions change when new data arrives from the ERP system, and the product inventory changes when products are sold. Elements such as the shopping cart are even tied directly to the user and are usually not cached at all.

Often, such individual components are cached either by reverse caching proxies with Edge Side Includes (ESI) or directly inside the web application. Even if none of this is the case, database servers cache identical queries, and the kernel caches repeated requests to the same files. However, calculating or recalculating content is exactly the action that costs the most resources on the server, so it has the greatest effect on system load and must be simulated as realistically as possible.

For this reason, scenarios like the ones above are designed and must be executed in parallel with different users and sessions. This causes cache misses in all systems, caches are recalculated, and cache hit rates drop to a realistic level.

Number of calls

Once you know the user scenarios, you also need to know how many of them to run in parallel. Of course, you could simply raise the number of parallel users higher and higher until your servers collapse under the load. However, it usually makes more economic sense to understand how many users are expected and to optimize the servers accordingly. Keep in mind that the users of most websites are not evenly distributed over the day; there are specific core times that should be simulated.

For example, if all we know about a German online shop with a core time of 18:00 to 22:00 is that it has 240 000 page impressions (PI) per day, we should not aim to simulate 10 000 PI/hour (roughly 3 requests/second), but more likely 40 000 PI/hour (roughly 11 requests/second). Because real user behavior is not evenly distributed even within the core time, simulating somewhat more than that is probably the safest option in this case. In the best case, access logs are available, which help to find out not only a meaningful number of requests per second, but also how the requests are distributed among the individual user scenarios.
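
The conversion from page impressions to a request rate can be sketched in a few lines of PHP. The function name, the assumption that two thirds of the daily traffic fall into the four core hours, and the simplification that one page impression equals one request are illustrative choices, not measured values.

```php
<?php

/**
 * Estimates the request rate to simulate during the core time.
 *
 * @param int   $dailyPageImpressions  page impressions per day
 * @param float $coreHours             length of the core time in hours
 * @param float $coreShare             assumed fraction of daily traffic inside the core time
 * @param float $requestsPerImpression assumed requests caused by one page impression
 */
function peakRequestsPerSecond(
    int $dailyPageImpressions,
    float $coreHours,
    float $coreShare,
    float $requestsPerImpression = 1.0
): float {
    $impressionsPerCoreHour = $dailyPageImpressions * $coreShare / $coreHours;

    return $impressionsPerCoreHour * $requestsPerImpression / 3600;
}

// Evenly distributed over 24 hours: 240 000 PI/day are 10 000 PI/hour.
$flat = peakRequestsPerSecond(240000, 24, 1.0);

// Assumption: two thirds of the traffic arrive between 18:00 and 22:00,
// which corresponds to 40 000 PI/hour.
$peak = peakRequestsPerSecond(240000, 4, 2 / 3);

printf("flat: %.1f req/s, core time: %.1f req/s\n", $flat, $peak);
```

In practice a single page impression usually triggers several requests (assets, XMLHttpRequests), so the last parameter is the one to adjust once access logs are available.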

JMeter

Besides ab and siege, there are now several tools and frameworks that can perform meaningful load tests on the basis of realistic user scenarios. One example is Apache JMeter, which has been around for a very long time, is freely available and open source, and is functionally very complete. Even though user scenarios are normally created in a user interface that takes some getting used to, in my opinion this is still the most sensible tool for carrying out larger and realistic load tests.

JMeter allows tests to be controlled remotely and automatically, can use clusters of servers to generate the necessary load (which is rarely necessary), and implements many protocols besides HTTP(S) for testing special web applications. At the beginning you have to get used to the names of the individual concepts, but experience has shown that they can be used to create meaningful and reusable tests:

Thread group: A thread group is exactly a user scenario as described above. For example, there may be a “Random Surfer” thread group in which we simulate a search engine bot or an unregistered user.
Timer: A timer defines the intervals between several actions in a thread group. This is more important than you think, because real users don’t always wait exactly the same time between two clicks. Usually the click intervals of users follow a normal (Gaussian) distribution around a defined value. In JMeter there is the configurable Gaussian Random Timer for this purpose.
Controllers: Controllers allow you to implement logic, such as decisions or loops, based on variables or input values.
Configuration elements: Data is provided via configuration elements. This can be a cookie manager or sample data from a CSV file to provide JMeter with existing user data in the test system for log-ins.
Sampler: The samplers perform the actual requests based on the elements mentioned above. For Web pages, these are usually HTTPS requests. JMeter also supports FTP, SOAP and many other protocols.
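
To get a feeling for what such a timer does, the following PHP sketch draws normally distributed pauses around a mean value. It is not JMeter code; the function name and the values of 3000 ms mean and 1000 ms deviation are assumptions chosen only for illustration.

```php
<?php

/**
 * Draws one think time in milliseconds from a normal distribution
 * using the Box-Muller transform.
 */
function gaussianThinkTimeMs(float $meanMs, float $deviationMs): float
{
    // Two uniform random numbers in (0, 1]; the +1 avoids log(0).
    $u1 = (mt_rand() + 1) / (mt_getrandmax() + 1);
    $u2 = (mt_rand() + 1) / (mt_getrandmax() + 1);

    $standardNormal = sqrt(-2 * log($u1)) * cos(2 * M_PI * $u2);

    // A pause can never be negative, so clamp at zero.
    return max(0.0, $meanMs + $deviationMs * $standardNormal);
}

mt_srand(42); // fixed seed so that the sketch is reproducible

$pauses = [];
for ($i = 0; $i < 10000; $i++) {
    $pauses[] = gaussianThinkTimeMs(3000, 1000);
}

printf("average pause: %.0f ms\n", array_sum($pauses) / count($pauses));
```

Most pauses end up close to the mean, while a few users click much faster or much more slowly, which is the behavior a fixed delay cannot reproduce.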

With these elements, even more complex interactions such as forms to fill out or XMLHttpRequest-based website interactions can be simulated completely without any problems. Relatively simple and common tasks, such as reloading images and other assets or managing session cookies for individual users, are performed by JMeter itself.

However, JMeter does not interpret JavaScript on the website and therefore cannot automatically simulate the use of a single-page application. It can easily be clustered to generate a very high load and to aggregate the results in a meaningful way.

Measuring correctly

At the beginning it was said that ab and siege provide simple and easily communicated numbers, like requests per second and the average response time. Of course JMeter does this too, but the significance of these numbers is rather limited. What I see as more significant are the following things:

Error rates: The number of errors (status code ≥ 500) that individual systems return. Especially under high load these occur more frequently and require a more detailed analysis.
95th percentile: The 95th percentile is the response time below which 95 percent of all responses fall. This number is much more relevant than the average of all response times, because it provides an estimate of how long users actually have to wait and is less distorted by individual outliers (minimum, maximum). In addition, other percentiles such as the 50th, 90th, 98th and 99th are often considered. These can be calculated correctly from the data supplied by JMeter.
Requests per second: We actually only look at the requests per second to determine whether the load test really reached the announced number of requests.
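
How these figures are derived from raw samples can be sketched as follows. The sample data is invented, the function name is hypothetical, and the percentile uses the simple nearest-rank method; other tools may interpolate and report slightly different values.

```php
<?php

/**
 * Returns the p-th percentile of the given values using the nearest-rank method.
 *
 * @param float[] $values  e.g. response times in milliseconds
 * @param float   $percent percentile, greater than 0 and at most 100
 */
function percentile(array $values, float $percent): float
{
    if ($values === [] || $percent <= 0 || $percent > 100) {
        throw new InvalidArgumentException('Need values and 0 < percent <= 100.');
    }

    sort($values);
    $rank = (int) ceil($percent * count($values) / 100);

    return (float) $values[$rank - 1];
}

// Hypothetical samples: [response time in ms, HTTP status code]
$samples = [
    [120, 200], [135, 200], [140, 200], [150, 200], [155, 200],
    [160, 200], [170, 200], [180, 200], [210, 500], [2400, 200],
];

$times = array_column($samples, 0);
$errors = count(array_filter($samples, function (array $sample): bool {
    return $sample[1] >= 500;
}));

printf("average: %.0f ms\n", array_sum($times) / count($times));
printf("median:  %.0f ms\n", percentile($times, 50));
printf("90th:    %.0f ms\n", percentile($times, 90));
printf("errors:  %.0f %%\n", 100 * $errors / count($samples));
```

With these samples a single slow response pulls the average far above the median, which is why the average alone says little about what most users experience.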

In addition to these statistics, there is another crucial point in measuring: We usually want to know not only whether the website can withstand the load, but also which systems reach their limits and in what way.

In Figure 2 you can see a schematic and simplified server landscape of a web application. At all relevant interfaces and on all systems we should try to measure where the respective bottlenecks are. Network throughput can usually be monitored easily with ifstat or iftop. The most important system metrics, such as memory utilization, IO wait and system load, can be obtained from vmstat. For external systems you want to measure at least the response times, unless you use a monitoring solution like Tideways, which does that for PHP code anyway. Depending on the service, further dedicated monitoring solutions like Tideways, New Relic or AppDynamics may be required.

After analyzing all this data together, in most cases you can already pinpoint the causes of possible performance problems very accurately. For this it can pay off to bring together your own experts from operations and development, or to call in external experts.

Tips and tricks

External services are mentioned for the first time in Figure 2, and of course their response times under load should also be observed. By external services we mean simple components such as mailers, but also external or in-house web services (microservices) that are integrated into our own application. In the case of external web services, the operators should always be informed about an upcoming load test. It would not be the first time that a load test of your own systems brings an external web service to its knees.

Another option is to disable (mock) external services during a load test run. This can sometimes be useful for cost reasons, but knowing whether the external services can withstand the expected load is also extremely important for later operation.

The web application should always be load tested on the real hardware that will later be used in production. A Docker container or a virtual machine behaves completely differently under load than an Amazon EC2 instance or a bare metal server. In PHP applications, IO wait (waiting for the hard drive) is often one of the bottlenecks, and virtualized file systems in particular behave very differently from real ones that operate on real hard drives or SSDs.

The hardware and the network connection of the test servers, i.e. the servers on which JMeter runs, are also relevant. Unless you want to test the hosting provider’s network connection, we usually recommend putting the test systems in the same data center as the systems under load. The primary goal is to ensure that the load can be simulated reliably without the test systems being blocked by the hosting provider’s DDoS (Distributed Denial of Service) detection. Secondarily, this is also a cost issue: a load test can of course generate a great deal of traffic. For example, one customer was asked to pay several thousand euros in traffic costs because faulty routing at the hosting provider sent traffic that was supposed to be internal over an external line. In any case, the respective hosting providers should be informed about such tests in advance, if possible.

Errors found

Many errors that my colleagues and I found during load tests were not what the developers of the web application had expected. And without a dedicated load test, almost all of them would only have become apparent during production operation.

Varnish is used in conjunction with Edge-Side Includes (ESI) to cache individual parts of the website with different cache contexts and runtimes. This was used at a very fine granularity in a large online shop, and it worked wonderfully during development and the usual tests. However, due to the very granular use of ESI and the combinatorial explosion of the cache contexts, Varnish could no longer keep all necessary variants in memory during the load test. A cache hit rate of < 10 % meant that the PHP framework for the online shop was requested not just once for the entire page, but up to forty times for a single page view of a visitor. This large increase in requests was far too much for the application servers. However, the problem could be solved relatively easily by drastically reducing the number of ESI blocks.

NFS (Network File System) is often used in web applications to synchronize static files between several application servers. But NFS often behaves very differently under high load than under low load, because multiple parallel write accesses very often lead to complete blockages that can last minutes. This is an effect that does not occur in normal tests but can be observed quite often under load. By the way, it can be avoided by writing files shared via NFS only once and never changing them again. Multiple reading servers are usually no problem.

A cluster pre-configured by a hosting provider had another problem in store for my team, which would have caused a lot of errors in production but went unnoticed in test operation. The Apache server simply accepted about twice as many connections as the MySQL server. As a result, 50 % of all requests ended with an error because no connection to the MySQL server could be established anymore. It is actually a trivial problem, but one that can cause a relaunch or an advertising campaign to fail completely.

In none of the cases we tested was the MySQL server too slow, which is the most common assumption among developers. With the exception of one case, the systems we have tested so far would never have withstood the announced load. However, after the tests, measurements and corresponding troubleshooting, every system has so far survived the planned event without any problems. Even if the investment in such a load test, whether by building up your own knowledge or by bringing in external experts, seems large at first, it pays off.

Checklist

A checklist can help to identify the most important points before a load test is carried out:

Dedicated hardware for the test server: The tests should never be run on the system that is supposed to be tested, as this would strongly distort the measurement.
Sufficient test hardware within the same data center: Both the network throughput and the load should also be measured and monitored on the test servers to ensure that the announced load can be reliably generated.
Testing the actual hardware: There is no way around testing the actual hardware if you need meaningful results. Infrastructure automation (Ansible, Puppet …) can help to duplicate an existing system.
Use realistic data: The data sizes and structures within the tested software should be as close as possible to the real environment. In particular, the size of the database indexes and their storage consumption are often crucial for the performance of the respective system.
Inform external service providers: All operators of integrated external services that are also tested should be informed in any case. Often you have to agree on how to deal with the data that arises during the test period.
Defining realistic user scenarios: The significance of a load test stands or falls with realistic user scenarios. These should be worked out together with the product owner. It is also important to understand the number of actions that occur.

Conclusion

Setting up and running a really meaningful load test is more work than calling a short script. In return, such a load test can give you the certainty that an advertising campaign or a relaunch can withstand the expected user numbers, and it allows you to plan the necessary hardware much more precisely, which can reduce costs in the long term. In the end, a failed advertising campaign is often more expensive than a test can be in advance – but the security and confidence that a meaningful load test gives is priceless.

The post Is my shop up to the task? Realistic performance tests for operative security appeared first on International PHP Conference.

Testing PHP code more efficiently

IPC editorial team — Wed, 17 Jan 2018 11:21:57 +0000

As a PHP developer, you can easily get into the situation of working with or interacting with files, especially when processing uploaded files or writing cache or log files. And as a good developer, you also want to cover this program flow with tests.

Listing 1 shows a simplified example of a CacheWarmer class that creates a cache directory in the warmUp method if it does not already exist.

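<?php

class CacheWarmer
{
  private $cacheDirectory;

  public function __construct($cacheDirectory)
  {
    $this->cacheDirectory = $cacheDirectory;
  }

  public function warmUp()
  {
    if (!is_dir($this->cacheDirectory) && !mkdir($this->cacheDirectory)) {
      throw new \RuntimeException('Could not create cache directory!');
    }
    // ...
  }
}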

If you want to test such a class with a unit test, the simplest approach would be to do it directly in the file system. The class is instantiated locally, the warmUp method is called, and the test checks at the end whether the corresponding directory has been created and correctly filled. Although this method is quick and effective, it involves a variety of possible problems. For example, testers and developers must keep in mind that they may be working with an operating system other than the actual target system. It is also important to make sure that the tests clean up the file system correctly afterwards, on the one hand to avoid side effects in dependent tests and on the other hand to prevent your own file system from being flooded with test data.
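
A minimal sketch of this direct approach, without any test framework, might look as follows. The CacheWarmer class is a reduced stand-in for Listing 1, and the directory name below the system temp directory is an arbitrary choice.

```php
<?php

// Minimal stand-in for the CacheWarmer class from Listing 1.
class CacheWarmer
{
    private $cacheDirectory;

    public function __construct(string $cacheDirectory)
    {
        $this->cacheDirectory = $cacheDirectory;
    }

    public function warmUp(): void
    {
        if (!is_dir($this->cacheDirectory) && !mkdir($this->cacheDirectory)) {
            throw new \RuntimeException('Could not create cache directory!');
        }
    }
}

// Use a unique directory below the system temp directory so that
// parallel or repeated test runs do not collide.
$cacheDirectory = sys_get_temp_dir() . '/cache-warmer-test-' . uniqid('', true);

try {
    (new CacheWarmer($cacheDirectory))->warmUp();
    $created = is_dir($cacheDirectory);
} finally {
    // Clean up even if the check above fails; otherwise test data piles up.
    if (is_dir($cacheDirectory)) {
        rmdir($cacheDirectory);
    }
}

echo $created ? "directory was created\n" : "directory is missing\n";
```

Even in this tiny case, about half of the code deals with isolation and cleanup rather than with the behavior under test.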

The better solution to check such applications is to use a virtual file system. A virtual file system is created as a separate stream wrapper for PHP. This allows built-in PHP functions such as mkdir or file_exists to be used. The vfsStream library offers just that. One advantage is that vfsStream can be used with almost any PHP test framework, such as PHPUnit. The installation is done as usual with Composer:

composer require --dev mikey179/vfsstream

Now the file system under test exists only virtually and is discarded after the tests have been completed. This has the big advantage that you don’t have to worry about cleaning up and side effects of tests. Better performance for I/O operations can also play a role in large test suites. Listing 2 shows a possible test of our CacheWarmer class using PHPUnit and vfsStream.

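<?php

use org\bovigo\vfs\vfsStream;
use PHPUnit\Framework\TestCase;

class CacheWarmerTest extends TestCase
{
    private $root;

    protected function setUp(): void
    {
        $this->root = vfsStream::setup();
    }

    /**
     * @test
     */
    public function canCreateCacheDirectoryOnWarmUp()
    {
        $cacheWarmer = new CacheWarmer($this->root->url() . '/cache');
        $cacheWarmer->warmUp();

        $this->assertTrue($this->root->hasChild('cache'));
    }
}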

The library has many other useful features such as permission handling and adjustable disk quotas. However, there are also limitations, which should be considered. For example, interaction with symlinks is not possible. Further documentation and examples can be found in the official Wiki of the project.

Testing built-in PHP functions

Another problem you often encounter as a PHP developer is working with built-in PHP functions like time() or exec(). These are usually difficult to test. Listing 3 shows a simplified version of a PdfCreator class that uses wkhtmltopdf internally to create PDFs and executes it via the exec() function.


Of course, this problem can be avoided with an appropriate software architecture. However, this is not always possible. If you still want to make sure that the application handles return values of built-in PHP functions correctly, there are several ways to do this.

Unit instead of integration tests

The first, and often obvious, variant would be to cover this with a functional or integration test. In this case, you would execute the application directly under certain conditions or contexts and check the results accordingly. However, these tests are often very time-consuming because the test environment must be prepared or initialized each time.

To test the expectations of built-in PHP functions directly, we can use php-mock. This tool also supports various test frameworks like PHPUnit or Prophecy (phpspec). To mock built-in PHP functions, php-mock uses PHP’s namespace fallback rule: a built-in function that is called unqualified (without a leading backslash) is first looked up in the current namespace. Only if it is not found there is the function from the global namespace used.
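
The fallback rule can be tried out without any library. The following sketch assumes a PdfCreator class in the namespace My\App whose body is only guessed from the call in Listing 4, and it defines a function named exec in that same namespace by hand.

```php
<?php

namespace My\App;

// Hypothetical sketch of the class under test; the real class may differ.
class PdfCreator
{
    public function execute(string $source, string $target): void
    {
        // Unqualified call: PHP looks for My\App\exec() first and only
        // then falls back to the built-in \exec().
        // (Real code should escape the arguments, e.g. with escapeshellarg().)
        exec('/usr/bin/wkhtmltopdf ' . $source . ' ' . $target, $output, $returnValue);

        if ($returnValue !== 0) {
            throw new \RuntimeException('PDF creation failed: ' . implode("\n", $output));
        }
    }
}

// Test double in the same namespace; it shadows the built-in exec()
// for all unqualified calls made from code inside My\App.
function exec(string $command, &$output = null, &$returnValue = null)
{
    $output = ['failure'];
    $returnValue = 1;

    return 'failure';
}

try {
    (new PdfCreator())->execute('file.html', 'file.pdf');
    $message = 'no exception';
} catch (\RuntimeException $e) {
    $message = $e->getMessage();
}

echo $message, "\n";
```

No external program is started here, because the namespaced function wins over the built-in one. A call written as \exec() would bypass the test double, which is the limitation to unqualified calls.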

The version of php-mock for PHPUnit can also be installed via Composer:

composer require --dev php-mock/php-mock-phpunit

For the tests php-mock provides a trait, so that a possible test for our already shown PdfCreator class could look like Listing 4.

<?php

use My\App\PdfCreator;
use phpmock\phpunit\PHPMock;
use PHPUnit\Framework\TestCase;

class PdfCreatorTest extends TestCase
{
    use PHPMock;

    public function testExecuteCallsWkhtmltopdf()
    {
        $exec = $this->getFunctionMock('My\App', 'exec');
        $exec->expects($this->once())->willReturnCallback(
            function ($command, &$output, &$returnValue) {
                $this->assertEquals('/usr/bin/wkhtmltopdf file.html file.pdf', $command);
                $output = ['failure'];
                $returnValue = 1;
            }
        );

        $pdfCreator = new PdfCreator;
        $pdfCreator->execute('file.html', 'file.pdf');
    }
}

Such simple unit tests obviously have big advantages. But the downside is that you have to know exactly which return values are possible. Otherwise, errors may occur despite extensive test coverage. A further small disadvantage is the already mentioned limitation to unqualified function calls. But you can easily write the calls that way in your own code.

Conclusion and outlook

Sometimes the test procedures that seem most obvious at first glance are the ones with the biggest pitfalls. This is especially true when working with extensive test suites. In this context, the two libraries presented here are real efficiency helpers when it comes to avoiding slow integration tests and replacing them with unit tests. For this reason, a developer should always keep this option in mind.

The post Testing PHP code more efficiently appeared first on International PHP Conference.

Making code refactoring safer with functional tests

IPC editorial team — Thu, 28 Sep 2017 07:18:16 +0000

Only very rarely will the quality of software improve with a rewrite. Instead, you will most likely lose features no stakeholder remembers anymore (usually about 40 per cent), and the implementation will take far more time than anticipated. Furthermore, implementing new features during a rewrite is often impossible unless you set up a new team, which in turn will have less knowledge of and insight into the existing software.

Automated tests are the easiest way to make sure you don’t lose any functionality. Automated tests are often equated with unit tests, though, which usually can’t be written for old code. It is the outdated structures that make unit tests hard to write: static method calls, singletons and static registries are still widely used, making code difficult to test while at the same time being the very reason why you should refactor at all. A vicious cycle!

Course of action

Automated tests are not identical to unit tests. Other kinds of tests like integration tests, acceptance tests and functional tests can also be automated! For refactoring purposes, it is the functional tests that can be of great assistance. They are used to make sure a piece of software performs specific functions or tasks. You might, for instance, test a web application to check whether the checkout process in an online shop still works. The process you want to test is executed as a whole in the test. This is how it works:

A use case for an application is specified in a functional test.
You make sure the code to be refactored is covered by the test.
You refactor that code.
You write unit tests for important new pieces of code.

If you follow these steps, they ensure that the use case covered by the test still works after refactoring. How to adjust and clean up code will be discussed below. Unit tests for new, clean code should be written immediately!

Simple functional tests

There are many ways to create functional tests for websites, fulfilling different requirements as to the difficulty of creating those tests, maintaining them and keeping them stable. In our experience the optimal solution for use cases like this is to utilize a combination of PHPUnit and Mink. The tests we are talking about don’t need to work for a long time, but will be used exclusively while code undergoes refactoring. You can usually delete them afterwards and replace them with something more sensible. When it comes to stable and maintainable functional tests, requirements are quite different; at least some refactoring of frontend code is required before implementing them.

Listing 1 illustrates what a functional test for a successful login procedure could look like. The visible parent class FeatureTest derives from PHPUnit_Framework_TestCase; this way the tests are integrated with other, existing PHPUnit tests. Mink, the framework for browser tests, is initialized by the base class, which also provides some simple helper methods. The full code can be found in our app example.

Listing 1

class LoginTest extends FeatureTest
{
  use FeatureTest\UserHelper;

  public function testLogin()
  {
    $user = $this->createUser('kore', 'password', 'kore@example.com', 'Kore Nordmann');

    $page = $this->visit('/');
    $page->find('css', '#username')->setValue('kore');
    $page->find('css', '#password')->setValue('password');
    $page->find('css', '#submit')->press();

    $page = $this->session->getPage();
    $this->assertNotNull(
      $welcomeBox = $page->find('css', '.welcome'),
      'Login failed'
    );
    $this->assertContains("Hello Kore", $welcomeBox->getText());
  }
}

It’s rather obvious what this code does. The Mink API for browser control allows many other interactions as well. In most cases it is best to use CSS selectors, which can address virtually any element in an existing website. In Listing 1, the first two form fields are filled with the data of the user account created before; afterwards the form is sent by pressing the corresponding button. Next, the test checks whether the following page displays the welcome text. If it doesn’t, some error must have occurred.

On the other hand, this kind of weak error handling illustrates a typical issue with functional tests: if a single test fails, you will most likely get no information on the reason for the failure, which is a major difference from unit tests. After a functional test has failed, the problem needs to be tracked down by the usual debugging processes. But if it’s just about ensuring you don’t break features while refactoring, this test will do. Most problems will have been caused by whatever you changed right before they showed up. That kind of makes finding the source easy, doesn’t it?

How does Mink work?

Mink can work with different browser emulators (Zombie.js, Goutte) or browser remote controls (Selenium, Sahi). The latter kind in particular helps tremendously with manually monitoring and debugging test execution. Browser emulators, though, execute tests much faster, which makes them useful for quick verification. Mink abstracts from the different backends, so you can also use both kinds side by side, as long as you don’t need any special features.

Goutte: Goutte is a browser emulation written in PHP that ultimately just executes HTTP requests. Accordingly, JavaScript used in the frontend won’t work.
Zombie.js: Zombie.js simulates a headless browser, so you don’t need a graphical output system; like Goutte, it therefore works smoothly on servers. Zombie.js can even interpret JavaScript!
Sahi/Selenium: Both are popular frameworks for working with real browsers, which is helpful for watching test execution and also for testing views and JavaScript in different browsers. In our experience, Sahi works better with applications using a lot of JavaScript.

In Figure 1 you can see the differences described above. Goutte and Zombie.js access the frontend of your web application without an intermediary (green, grey), while Selenium and Sahi use a browser to execute the tests (orange).

In the end, PHPUnit tests are just PHP code and nothing more, so we can access the application code directly from the tests (blue), at least if it runs on the same machine. Therefore you can execute operations directly on the database (red), like resetting or adding data sets. However, it is also possible to develop dedicated services for adding data to an application for testing (Listing 1); they, again, will have direct access to the database and the application.

Services

Setting up functional tests is easy and fast; one problem remains to be solved, though: the tested website most likely interacts with external services like databases and web services (REST, SOAP, …) for payment, newsletters or other functionality. Most websites that need to be refactored don’t have a sufficient level of abstraction in the code for this kind of service; therefore, you can’t just swap these services out for testing.

Running functional tests against a test system is highly recommended, since we want to change the code in the next step. The optimal solution is to use a virtual machine; if that is not possible, a staging system can be sufficient too. However, a shared staging system can be problematic, especially when several developers are working on the code at the same time. If there are no such test systems yet, creating them is necessarily the first step. If possible, you can automate the setup at the same time (Ansible, Chef, Puppet, …), but don’t underestimate the amount of effort you need to put into that.

Strategies for handling external services can differ and will most likely be in line with established manual testing strategies. Some services (payment) offer test accounts that won’t trigger any real action; you should definitely use those. Many databases are just too large to be reset completely every time you run a test, so you need to find a suitable way to handle test data in the local environment. It might help to work with new, random user names, to reset just particular tables, or to selectively delete all data that was added after a specific date, if there is such an option.

However, from time to time there will still be services that are accessed from so many places in your code, and are woven so tightly into it, that none of these strategies work. In this case you should first hide the services behind an abstraction layer so you can test and refactor other aspects of the system. Later on we will look into how to refactor these services; cf. the paragraph on Branch by Abstraction.

Code Coverage

Code coverage usually indicates which parts of the code have not yet been exercised by unit tests. You don’t want to distort these statistics, so functional tests and integration tests normally don’t count towards your code coverage. In our case, however, we can deliberately divert code coverage from its intended use.

When writing functional tests for our application, we want to make sure our tests actually cover all the code to be refactored. To do so, we can use Xdebug to record the code coverage on the server while the tests are executed. Be aware that functional tests usually consist of many different requests, so the individual code coverage reports need to be merged afterwards. PHP_CodeCoverage is a library that offers functionality both for recording the collected information and for merging it into an HTML report.
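
The merging step itself is simple and can be sketched in plain PHP. The input format is an assumption modelled on the arrays returned by xdebug_get_code_coverage() (file, line, status); the function name, file paths and line numbers are invented.

```php
<?php

/**
 * Merges line coverage from several requests into one result.
 *
 * Assumed input format: [file => [line => status]] with
 * 1 = executed, -1 = not executed, -2 = dead code.
 */
function mergeCoverage(array ...$reports): array
{
    $merged = [];

    foreach ($reports as $report) {
        foreach ($report as $file => $lines) {
            foreach ($lines as $line => $status) {
                // A line counts as covered as soon as one request executed it,
                // so the highest status wins (1 > -1 > -2).
                $merged[$file][$line] = max($merged[$file][$line] ?? $status, $status);
            }
        }
    }

    return $merged;
}

// Hypothetical coverage data from two requests of one functional test.
$loginPage = ['/app/Login.php' => [10 => 1, 11 => -1, 12 => -1]];
$loginPost = ['/app/Login.php' => [10 => 1, 11 => 1, 12 => -1], '/app/User.php' => [5 => 1]];

$merged = mergeCoverage($loginPage, $loginPost);

foreach ($merged as $file => $lines) {
    $uncovered = array_keys(array_filter($lines, function (int $status): bool {
        return $status === -1;
    }));
    printf("%s: uncovered lines: %s\n", $file, implode(', ', $uncovered) ?: 'none');
}
```

Lines that are still uncovered after merging all requests are the ones for which another functional test is needed before refactoring starts.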

This leads us to the following course of action:

Identify code to be refactored
Write functional tests covering this code
Check the code coverage to make sure you really covered all the code; if not, go back to step 2.

When executing these steps on code with a long history, you will probably run into code that is no longer reachable. Keep in mind that some code isn’t meant to be accessed via the web, but there might be other entry points: cronjobs, command line scripts. If your code can’t be reached in any of these ways, just get rid of it immediately. You can still restore it from version control if necessary.

Renaming: Correcting names and identifiers can help tremendously to make code more readable. Many names are no longer correct for historical reasons, or have at least become inconsistent. Renaming can easily cause side effects, such as interactions with other parts of the code. Luckily, IDEs like PhpStorm nowadays offer good support for changing names.
Extract Method: Code also becomes easier to read if new methods are extracted from old ones that are very long. Inside a method, code blocks with additional comments are good places to start. In the best case, you can extract so-called “pure methods”; they don’t have side effects, because they don’t access class variables or global state. Pay attention to the variables written and read when extracting code, though. IDE support for this step is only moderate.
Extract Class: After successfully extracting some methods, you will most likely see the outlines of groups of methods that belong together. Those can now be extracted into separate classes. Be careful with the class variables you use! IDEs offer rudimentary support for this at most, since new classes need to be made available in all corresponding locations, for example by the dependency injection container in use.
Extract Component: When several classes have been extracted, you sometimes notice that a group of them is actually better represented as a separate component or library. You could move those to a separate project, but I don’t know of any automated support for this.

After the steps Extract Method and especially after Extract Class, it’s often useful to write tests for the code that got extracted. These tests are the ones that can survive over a long period of time. “Pure methods” mentioned above are especially easy to test, because you don’t need any test doubles.
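
What such a pure method and its check can look like is sketched below. The class, the method and the price example are hypothetical and only stand in for a calculation that was extracted from a long method.

```php
<?php

// Hypothetical example: the price calculation was extracted from a long
// order-processing method into a method without side effects.
class PriceCalculator
{
    /**
     * Pure method: the result depends only on the arguments; no class
     * variables, no global state, no I/O.
     *
     * @param int[] $itemPricesInCents
     */
    public function totalInCents(array $itemPricesInCents, int $discountPercent): int
    {
        $sum = array_sum($itemPricesInCents);

        return (int) round($sum * (100 - $discountPercent) / 100);
    }
}

// No test doubles needed: call the method and compare the result.
$calculator = new PriceCalculator();
$total = $calculator->totalInCents([1999, 501], 10);

echo $total, "\n";
```

Because nothing outside the arguments influences the result, a unit test for this method consists of nothing but input values and expected output.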

As mentioned above, some of the steps can be automated by using tools like IDEs. However, they might be prone to errors and need to be checked manually, due to PHP’s weak and dynamic typing. Besides well-established IDEs, there are also some CLI-tools trying to help with implementing these changes, which you can use irrespective of your IDE of choice.

Branch by Abstraction

When doing large refactorings of long-established software, a central point of interest is often a back end (web services, databases, …) that is accessed from all parts of the code. Or there might be implicit logic, representing central business logic, that is scattered throughout the code. Branch by Abstraction is a suitable approach for refactoring such aspects.

Figure 2 illustrates the basic idea of locating all instances of these kinds of access. Every access will be extracted into a wrapper class that will execute calls to the original old code. It is important not to change the code yet but just introduce indirections to notably increase the probability of not breaking anything.

After locating all instances and replacing the code everywhere, you can now simplify the API of the wrapper class. In doing so, you should define an interface for this wrapper, but don’t change the old code yet! This alone generates some value, because the calling code will be significantly easier to read. In most cases, many lines of code can be replaced by a comprehensible call to the wrapper class.

Once the interface is defined, you can start working on a new implementation, if that is still necessary at all. This new implementation only needs to fulfil the requirements of the interface we just defined.

Especially when working with complex systems that were extracted this way, it might be necessary to add verification beyond the tests already written. Following the approach outlined above, you can add another implementation of the interface that calls both the old and the new implementation and compares the results. If the new implementation behaves the same as the old one in practically all cases, you can be reasonably confident that it works correctly. GitHub worked like this to verify the new implementation of their merge algorithm before switching to it.
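A comparing implementation of this kind could be sketched as follows. All names (`UserGateway`, `ArrayUserGateway`, `ComparingUserGateway`) are hypothetical, the in-memory gateway merely stands in for a real old and new implementation, and the constructor syntax assumes PHP 8.0 or later. This is not GitHub's code, only an illustration of the idea.

```php
<?php
declare(strict_types=1);

interface UserGateway
{
    public function findById(int $id): ?array;
}

// Hypothetical in-memory stand-in, used here for both "old" and "new".
final class ArrayUserGateway implements UserGateway
{
    public function __construct(private array $rows) {}

    public function findById(int $id): ?array
    {
        return $this->rows[$id] ?? null;
    }
}

// Calls both implementations, records differences, returns the old result.
final class ComparingUserGateway implements UserGateway
{
    /** @var list<string> */
    private array $mismatches = [];

    public function __construct(
        private UserGateway $old,
        private UserGateway $new
    ) {}

    public function findById(int $id): ?array
    {
        $expected = $this->old->findById($id);

        try {
            if ($this->new->findById($id) !== $expected) {
                $this->mismatches[] = "findById($id) returned a different result";
            }
        } catch (\Throwable $e) {
            // A failing new implementation must never break production.
            $this->mismatches[] = "findById($id) threw " . get_class($e);
        }

        return $expected; // callers keep getting the trusted old result
    }

    /** @return list<string> */
    public function mismatches(): array
    {
        return $this->mismatches;
    }
}

$old = new ArrayUserGateway([1 => ['name' => 'Alice'], 2 => ['name' => 'Bob']]);
$new = new ArrayUserGateway([1 => ['name' => 'Alice'], 2 => ['name' => 'Robert']]);

$comparing = new ComparingUserGateway($old, $new);
$first = $comparing->findById(1);
$second = $comparing->findById(2);
```

In this sketch the mismatches are only collected in memory; in practice you would more likely write them to a log. This kind of comparison is only safe for calls without side effects, because every call is executed twice.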

Finally, you can remove all old pieces of code, with the new code handling all requests. But even if you only carry out step one, your code will be easier to read.

Advice

To close, let me give you two further pieces of advice on refactoring that have proved quite helpful with our customers for getting fast and good results. Let me first remind you of what I said before: Commit early and often.

However, especially when working on large projects, you should not commit to feature branches, as is commonly done with Git. While the refactoring is in progress, development continues elsewhere, which makes it hard or even impossible to re-integrate such a branch into the master branch and doubles the amount of work. When you work in small, non-invasive steps, the code always stays functional, which makes it possible to work directly on the master branch; this also delivers value to all other developers involved immediately.

When you want to refactor complex code or rework complex concepts in your code, it might be a good idea to go beyond pair programming and use mob programming instead. With all developers involved in the project working together on one large monitor, virtually all errors are recognized instantly. Furthermore, this fosters a common understanding of the code and of the changes being made. The final clean-up can be done afterwards in small groups.

The question of the goal of a refactoring still remains to be answered. Sometimes the goal is made obvious by external influences; sometimes certain parts of the application hurt so much that everyone just knows what needs to be done. Generally speaking, you should be familiar with concepts like SOLID and Clean Code to make sensible changes. Furthermore, there are the well-known lists of typical code problems and their solutions, but covering them would go beyond the scope of this article.

Conclusion

Functional tests are an excellent way to make sure your code keeps working throughout a refactoring. Working with our customers, we have successfully applied this approach in projects that previously could no longer be developed further or even fixed. The method outlined in this article also works for large codebases and very complex products. Refactored code can increase productivity substantially, without the need to halt feature development for a long time.

The post Making code refactoring safer with functional tests appeared first on International PHP Conference.