Unlocking PHP 8.3
https://phpconference.com/blog/php-8-3-new-features-enhancements-guide/ (February 5, 2024)

The final version of PHP 8.3 was released in November 2023. As with every year, there are a number of new features and bug fixes, as well as deprecations and breaking changes that need to be considered before updating to PHP 8.3.

The highlight of every new version is, of course, the new features. They often help us to simplify our code and program more securely. Version 8.3 also includes a few adjustments that allow PHP to provide us with better error handling and enable us to keep an even closer eye on our code. This article is intended to provide an overview of the most important changes. A complete overview of all major and minor changes can be found in the official release notes [1].

Cloning of readonly classes

You want to clone an object, but instead you only get an error message from PHP. Anyone who uses the readonly properties of PHP 8.1 or the readonly classes of PHP 8.2 may already be familiar with this problem. This behavior has been adjusted in PHP 8.3: readonly properties can now be reinitialized within the magic method __clone, which makes deep cloning possible (Listing 1) [2].

class PHP {
  public string $version = '8.2';
}
 
readonly class Foo {
  public function __construct(
    public PHP $php
  ) {}
 
  public function __clone(): void {
    $this->php = clone $this->php;
  }
}
 
$instance = new Foo(new PHP());
$cloned = clone $instance;
 
$cloned->php->version = '8.3'; 


Type-safe constants in classes

Constants are a convenient tool for storing and retrieving fixed values. Once defined, they should provide a reliable source of consistently identical data, but that has not always been the case in PHP: a child class can overwrite the constant of a parent class. And not only that, since constants previously could not have a type, a string in a parent class could become an array in the child class, for example. PHP 8.3 addresses this problem and lets you define class constants in a type-safe way (Listing 2) [3].

interface I {
  const string VERSION = '1.0.0';
}
 
class Foo implements I {
  const string VERSION = [];
}
 
// Fatal error: Cannot use array as value
// for class constant Foo::VERSION of type string

Dynamic call of class constants

Let’s stick with the topic of class constants: up until now, these could only be accessed dynamically in a roundabout way, using the constant() function, as a direct dynamic fetch was previously not possible. Luckily, PHP 8.3 has been adapted accordingly. Constants can now be fetched dynamically with the same syntax we already know from dynamic access to class properties. This change applies not only to class constants, but has also been implemented for the enums introduced in PHP 8.1 (Listing 3) [4].

class Foo {
  const PHP = 'PHP 8.3';
}
 
$searchableConstant = 'PHP';
 
var_dump(Foo::{$searchableConstant}); 
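
The same syntax also works for fetching enum cases dynamically; a minimal sketch (the Suit enum is just an example):

enum Suit: string {
  case Hearts = 'H';
  case Spades = 'S';
}
 
$case = 'Hearts';
 
var_dump(Suit::{$case}); // enum(Suit::Hearts)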

 

#[\Override] attribute

With the new #[\Override] attribute, methods of a child class can be marked to emphasize the deliberate overriding of a method of the parent class. This allows errors in the method definition to be caught, as PHP 8.3 issues an error if no such method exists in the parent class. So instead of searching for the reason why the method you want to override is not called, because of a typo in its name, for example, PHP now provides error messages that clearly indicate the problem. Additionally, if you modify a parent class and inadvertently remove a method that a child class overrides, you will now be notified with an error message (Listing 4) [5].

use PHPUnit\Framework\TestCase;
 
final class MyTest extends TestCase {
  protected $logFile;
 
  protected function setUp(): void {
    $this->logFile = fopen('/tmp/logfile', 'w');
  }
 
  #[\Override]
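  // Intentional typo: tearDown() is misspelled, so no parent method matches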
  protected function taerDown(): void {
    fclose($this->logFile);
    unlink('/tmp/logfile');
  }
}
 
// Fatal error: MyTest::taerDown() has #[\Override] attribute,
// but no matching parent method exists 

json_validate() function

JSON is the format of choice in many interfaces when it comes to data exchange. It is therefore quite surprising that, until now, you could not avoid fully parsing a JSON string in PHP just to check whether it is valid. That is no longer the case with PHP 8.3, which introduces the new json_validate() function to check whether a string is valid JSON before further use. If you are not interested in the content, but only in whether the JSON is valid, this function also works more efficiently than json_decode(), which was previously the only way to check [6], [7]:

var_dump(json_validate('{ "test": { "foo": "bar" } }')); // true
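
For comparison, a quick sketch of the old decode-based check next to the new function; invalid input simply yields false:

// Before PHP 8.3: decode the string just to find out whether it is valid
json_decode('{ "test": ');
var_dump(json_last_error() === JSON_ERROR_NONE); // false
 
// PHP 8.3: validation without building the decoded value in memory
var_dump(json_validate('{ "test": ')); // false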

New Randomizer::getBytesFromString() method

With the random extension introduced in PHP 8.2, PHP has taken a real and important step towards cryptographically secure randomness. PHP 8.3 introduces the new method Randomizer::getBytesFromString(), which takes a string of arbitrary characters plus a desired length and returns a random string composed of exactly those characters (Listing 5) [8], [9].

// A \Random\Engine may be passed for seeding,
// the default is the secure engine.
$randomizer = new \Random\Randomizer();
 
$randomDomain = sprintf(
  "%s.example.com",
  $randomizer->getBytesFromString(
    'abcdefghijklmnopqrstuvwxyz0123456789',
    16,
  ),
);
 
echo $randomDomain; 
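
If reproducible results are needed, in tests for example, a seedable engine such as Xoshiro256StarStar can be passed instead of the default secure engine; a minimal sketch:

$seededRandomizer = new \Random\Randomizer(
  new \Random\Engine\Xoshiro256StarStar(seed: 2023),
);
 
// The same seed always produces the same sequence
var_dump($seededRandomizer->getBytesFromString('0123456789abcdef', 8));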


New Randomizer::getFloat() and Randomizer::nextFloat() methods

In addition to the getBytesFromString() method, the Randomizer class now has two more methods that return a random float. Randomizer::getFloat() returns a random float whose limits can be defined using the parameters $min and $max; a third parameter specifies whether or not the limit values should be included in the pool of possible random numbers. Randomizer::nextFloat(), on the other hand, returns a float between 0 and 1 and is therefore equivalent to Randomizer::getFloat(0, 1, \Random\IntervalBoundary::ClosedOpen) (Listing 6) [10], [11].

$randomizer = new \Random\Randomizer();
 
$temperature = $randomizer->getFloat(
  -89.2,
  56.7,
  \Random\IntervalBoundary::ClosedClosed,
);
 
$chanceForTrue = 0.1;
// Randomizer::nextFloat() is equivalent to
// Randomizer::getFloat(0, 1, \Random\IntervalBoundary::ClosedOpen).
// The upper bound, i.e. 1, will not be returned.
$myBoolean = $randomizer->nextFloat() < $chanceForTrue; 

PHP linter with support for multiple files

A practical command on the command line is php -l. This can be used to check any PHP file for syntax errors. With PHP 8.3, it is now possible to validate not just one, but any number of files at once. Not much has changed in terms of the output; for each additional file, an additional line is output to indicate whether the file contains syntax errors or not [12]:

php -l foo.php bar.php
No syntax errors detected in foo.php
No syntax errors detected in bar.php

New classes, interfaces and functions

Of course, these are not all the changes that PHP 8.3 has to offer, but they are definitely the most important ones in the daily life of a PHP developer [13]. The DOM classes DOMElement, DOMNode, DOMNameSpaceNode and DOMParentNode have received additional helper methods to simplify navigation in the DOM of HTML and XML documents. IntlCalendar has received new helpers to set date and time, and IntlGregorianCalendar has received two new methods to create a calendar based on a date and time. With mb_str_pad there is now a function that works analogously to str_pad but supports multibyte strings. And to increment or decrement an alphanumeric string, you can use the new str_increment and str_decrement functions from PHP 8.3 onwards.
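
A few quick illustrations of these string helpers (behavior as documented for PHP 8.3):

var_dump(str_increment('Az')); // string(2) "Ba"
var_dump(str_increment('zz')); // string(3) "aaa"
var_dump(str_decrement('Ba')); // string(2) "Az"
 
// mb_str_pad counts characters, not bytes
var_dump(mb_str_pad('日本', 5, '*')); // string(9) "日本***"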

Deprecations and breaking changes

In the latest version of PHP, there are again deprecations of features that will be removed in later versions, but there are also a few changes that alter the behavior of existing code [14]. Invalid data passed to PHP’s Date/Time extension previously led to warnings or to generic \Exception or \Error instances. These were not always easy to handle, as no specific exceptions were thrown. This changes with PHP 8.3: there is now a general DateException for all errors caused by unparsable dates, extended by several child exceptions such as DateInvalidTimeZoneException. Also, when a value is appended to an otherwise empty array after a negative index n, PHP 8.3 ensures that the next key is n + 1 rather than 0.
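
A short example of the new array behavior (resulting keys shown per version):

$array = [];
$array[-5] = 'a';
$array[]   = 'b';
 
var_dump(array_keys($array));
// PHP 8.2: [-5, 0]
// PHP 8.3: [-5, -4]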


Outlook for PHP 8.4

Of course, the PHP developers aren’t just standing around after version 8.3’s release. The first changes for PHP 8.4 have already been announced [15]. For example, the parsing of HTML5 documents with the DOM extension is to be simplified, and there are a few changes to the just-in-time compiler. The default cost of bcrypt, the hashing algorithm PHP uses to hash passwords, is to be increased, making the resulting hashes more difficult to crack. And with mb_trim, trim is finally getting a sister function that can work with multibyte strings.


Links & Literature

[1] https://www.php.net/releases/8.3/en.php

[2] https://wiki.php.net/rfc/readonly_amendments

[3] https://wiki.php.net/rfc/typed_class_constants

[4] https://wiki.php.net/rfc/dynamic_class_constant_fetch

[5] https://wiki.php.net/rfc/marking_overriden_methods

[6] https://wiki.php.net/rfc/json_validate

[7] https://www.php.net/manual/en/function.json-validate.php

[8] https://wiki.php.net/rfc/randomizer_additions#getbytesfromstring

[9] https://www.php.net/manual/en/random-randomizer.getbytesfromstring.php

[10] https://wiki.php.net/rfc/randomizer_additions#getfloat

[11] https://www.php.net/manual/en/random-randomizer.getfloat.php

[12] https://www.php.net/manual/en/features.commandline.options.php

[13] https://www.php.net/releases/8.3/en.php#other_new_things

[14] https://www.php.net/releases/8.3/en.php#deprecations_and_bc_breaks

[15] https://wiki.php.net/rfc#php_84

Serde for PHP 8: How Functional Purity Drives Serde’s Architecture
https://phpconference.com/blog/interview-larry-garfield-serde-php-8-library/ (January 11, 2024)

Delve into the world of Serde and Crell with Larry Garfield, the PHP expert who created this unique and versatile library. Larry currently works as a staff engineer at LegalZoom but has worked at Platform.sh, written books on PHP, and contributed to the Drupal 8 Web Services initiative to create the modern PHP we’re familiar with today. We caught up with Larry to talk about Serde and its supporting libraries. Read on and learn everything you need to know about Serde and Crell.

IPC-Team: Thank you for taking the time to speak with us today, Larry. Can you introduce yourself for our readers?

Larry Garfield: Hi, I’m Larry Garfield, 20+ year PHP veteran. I’ve worked on a couple of different Free Software projects, and currently work as a Staff Engineer for LegalZoom. I am also a leading member of the PHP Framework Interoperability Group (PHP-FIG).

IPC-Team: Congratulations on the recent release of Serde 1.0.0. How did Serde come about and what was your motivation?

Larry Garfield: Serde came out of a need I had while working for TYPO3, the German Free Software CMS. I wanted a tool to help transition TYPO3 from giant global array blobs for all configuration toward well-typed, explicitly defined objects. Translating arrays into objects is basically a serialization problem, so rather than do something one-off I figured it was a good task for serialization.

Larry Garfield: I first looked at Symfony Serializer, as TYPO3 already uses a number of Symfony components and it was generally regarded as the most robust option. Unfortunately, after spending multiple weeks trying to coax it into doing what I needed, I determined that it just couldn’t. It didn’t have the structure-manipulation features I needed, and its architecture was simply too convoluted to make adding them feasible. That meant I had to build my own.

After some initial experimentation of my own, I looked into Rust’s Serde crate, as it’s generally regarded as the best serializer on the market. Rust, of course, is not the same as PHP, but I was still able to draw a lot of ideas from it. For instance, Crell/Serde is streaming, like Rust’s Serde. It doesn’t have a unified in-memory intermediate representation (necessarily), but there is a fixed set of “low level” types that exporters and importers can map to. (That list is smaller for PHP than for Rust, naturally.)


It took a few months, but I was able to get Crell/Serde to do nearly everything I needed it to for TYPO3. Most especially, I’m very happy with the data restructuring capabilities it has. That is, the serialized form of an object doesn’t have to be precisely the same as the in-memory object. There are robust rules for automatically changing that structure in controlled ways when serializing and deserializing. For instance, a set of 10 JSON properties can be grouped up into three different object properties, with changed names in PHP to avoid prefixes and such, automatically. That was important for TYPO3, because the old array structures had a lot of legacy debt in their design, and this was an opportunity to clean that up.

Along the way, Serde also spawned two supporting libraries: Crell/fp and Crell/AttributeUtils. The latter is where a lot of Serde’s power lives, in fact, in its ability to power nearly everything through PHP attributes. That functionality is now available to any library to use, not just Serde.

In the end, TYPO3 chose not to pursue the array-to-object transition after all. But since it’s an Open Source organization, the code was already written and could be released. After I left TYPO3, I polished the library up a bit further, added a few more features, and released it.

IPC-Team: Serde shares its name with the Serde framework used with Rust, was that an inspiration? Have the two ever been confused?

Larry Garfield: As noted above, yes, it’s definitely named in honor of Rust Serde and drew from its design. So far the name similarity hasn’t been an issue. I did have someone on Mastodon complain that my name choice was going to hurt SEO, but so far that doesn’t seem to have been an issue. It’s too late to change it anyway. 🙂

“Despite all of its power and flexibility, Serde is pretty fast. The last time I benchmarked it, it was faster than Symfony Serializer on the same tasks, despite having more features and options.”

IPC-Team: What formats does Serde support?

Larry Garfield: As of 1.0.0, Crell/Serde can round-trip (serialize and deserialize) PHP arrays, JSON, YAML, and CSV. It can also serialize to JSON and CSV in a streaming fashion. I have a working branch on XML support, but that’s considerably more challenging, and I suspect XML may be better handled with a different approach than a general-purpose serializer.

I would like to add support for TOML, and there has been interest in it, but so far we’ve not found any existing TOML 1.0 parsers for PHP, only for the old 0.4 format. If someone made a good 1.0-compatible library, plugging that into Serde should be pretty simple.


Serde is entirely modular, so new formats can be added by third parties easily. That said, I’m happy to colocate supportable formats in Serde itself. Ease of setup and use is a key goal for the project.

IPC-Team: What sets Serde apart from the competition?

Larry Garfield: I think Serde excels in a number of areas.

  • a. As mentioned, the data-restructuring capabilities are beyond anything else in PHP right now, as far as I’m aware. I think it may even be more flexible than Rust Serde in some ways.
  • b. Support for streaming JSON and CSV output. Combined with the ability to read from generators in PHP, that means Serde has effectively no maximum on the size of data it can serialize.
  • c. It’s “batteries included.” Symfony Serializer is a bit tricky to set up if you’re using it outside of the Symfony framework itself. There’s lots of exposed moving parts. Serde can be used by just instantiating one class and using it. It can be configured in more robust ways, but for just getting started it’s trivially easy to use.
  • d. Despite all of its power and flexibility, Serde is pretty fast. The last time I benchmarked it, it was faster than Symfony Serializer on the same tasks, despite having more features and options. If you don’t need Serde’s capabilities, then a purpose-built lightweight hydrator would still be faster, but in most cases Serde will be fast enough to just use and move on. It also has natural places to hook in and provide custom serialization for certain objects, which can be purpose-built and faster than the general pipeline.
  • e. “Scopes” support. Symfony Serializer also supports multiple ways of serializing an object through serialization groups, which are very similar. The way Serde ties in attributes, however, gives it even more flexibility, and I am not aware of any other serializer besides Serde and Symfony that has that ability.
  • f. This is more of a personal victory, but Serde is about 99% functionally pure. It follows the functional principles of immutable variables, functionally pure methods, statelessness, etc., even though it’s overall object-oriented. Really holding to that line helped drive the architecture in a very good place, and is one of the reasons Serde is so extensible.

IPC-Team: What are some of the advantages of using Serde for serialization and deserialization in PHP applications?

Larry Garfield: I see Crell/Serde as a good fit any time unstructured data is coming into an application. It is always better to be working with well-structured, well-typed objects than array blobs. Because Serde is so robust and fast, it’s straightforward to “guard” everywhere data is coming into the application (from an HTTP request, config file, database, REST response from another service, etc.) with a deserialization layer that ensures you have well-structured, typed, autocompletable data to work with. That can drastically cut down on the amount of error handling needed elsewhere in the application.

 

The same is true when sending requests. Rather than manually build up an array to pass to some API call (as many API bridges expect you to do), you can build up a well-structured object, using all of the good OOP techniques you already know (typed properties, methods, etc.), and then dump that to JSON or a PHP array at the last second before sending it over the wire. That ensures you have the right structured data every time; your PHP code wouldn’t even run otherwise.
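
For illustration, a minimal round trip with Crell/Serde, based on the usage shown in the project’s README; the Order class here is our own stand-in:

use Crell\Serde\SerdeCommon;
 
class Order {
  public function __construct(
    public readonly string $id,
    public readonly float $total,
  ) {}
}
 
// "Batteries included": the default setup needs no configuration
$serde = new SerdeCommon();
 
$json  = $serde->serialize(new Order('A-1001', 99.95), format: 'json');
$order = $serde->deserialize($json, from: 'json', to: Order::class);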

IPC-Team: What are some of Serde’s limitations?

Larry Garfield: As mentioned, XML and TOML support are still pending. I’ve had someone ask about binary formats like protobuf, and I think that could probably be done, but I’ve not tried.

There are some edge cases some users have reported around the data restructuring logic when using “boxed” value objects. For instance, a “Name” class that contains just a string and an “Email” class that contains just a string, both of which are then properties on an object to serialize. That’s only partially supported right now, although I’m working on ways to improve it. Hopefully it will be resolved by the time you read this.

Crell/Serde also supports only objects. It cannot serialize directly from or to arrays or primitives. In practice I don’t think that’s a big issue, as “turning unstructured data into structured objects” is the entire reason it exists.

IPC-Team: How do you recommend getting started with Serde and how can someone get involved with the community?

Larry Garfield: I’m quite proud of the Serde documentation in the project README. It’s long, but very complete and detailed, and gradually works up to more and more features. The best way to get started is to read the first section or two, then try playing with it. It’s deliberately easy to just toy around with, and you can add in capabilities as you find a use for them.

As far as getting involved in the project itself, as in any Free Software project, file good bug reports, file good feature requests. If you want to try and add a feature, please open an issue first to discuss it. I don’t want someone wasting time on a feature or design that won’t work.

In particular, if someone wants to try writing a formatter for writing to protobuf or other binary formats, I’d love to see what can be done there. I’ve not worked with that format myself so that’s a good place to dig in.


At the moment, Crell/Serde is entirely volunteer-developed by me, since it’s no longer sponsored by TYPO3. Please keep that in mind any time you’re working with this or any Free Software project. Of course, if you are interested, I’m happy to accept sponsorship for prioritizing certain requests.

IPC-Team: What’s on your wishlist for future iterations or updates of Serde?

Larry Garfield: Mainly addressing the limitations mentioned above. TOML support would be good to include. XML may or may not make sense. I like the idea of supporting boxed value objects better. Binary formats would be another good differentiating feature.

One feature in particular I’m exploring is allowing attributes to be read from a non-attribute source. AttributeUtils, which handles all attribute parsing, is also clean enough that plugging in an alternate backend should be easy. If that alternate backend reads data from a YAML file, for instance, using Serde, and deserializes into whatever attribute set a given library is using (such as Serde’s own attributes), that would allow any AttributeUtils-using library to easily support YAML or JSON configuration in addition to in-code attributes, but still resulting in the same metadata objects for a library to use. I’m still working on how to make this work, but I’m pretty sure it is feasible. Stay tuned.

17 Years in the Life of ElePHPant
https://phpconference.com/blog/keynote-17-years-in-the-life-of-elephpant/ (November 14, 2023)

In the vast and dynamic world of programming languages, PHP stands out not only for its versatility but also for its unique and beloved mascot, the elePHPant. For 17 years, this charming blue plush toy has been an iconic symbol of the PHP community, capturing the hearts of developers worldwide.


The story of the elePHPant began in Canada, where Damien Seguy, the founder and father of the elePHPant, first brought this adorable creature to life. Little did he know that this creation would become a global ambassador for the PHP language, spreading joy and camaraderie among developers on every continent, including the frosty expanses of Antarctica.

At this year’s International PHP Conference (IPC), Damien Seguy took center stage to share the remarkable journey of the elePHPant. The keynote presentation was a nostalgic trip through the past 17 years, highlighting the elePHPant’s adventures, milestones, and enduring impact on the PHP community.


The elePHPant’s global travels are a testament to the interconnectedness of the PHP community. From North America to Europe, Asia, Africa, Australia, and even the remote corners of Antarctica, the elePHPant has become a cherished companion for PHP developers everywhere. It has been a source of inspiration, a conversation starter at conferences, and a symbol of the shared passion that unites developers across borders.

Beyond its physical presence, the elePHPant has also made its mark in the digital realm. It is a common sight on social media, where developers proudly share photos of their elePHPant companions during meetups, conferences, and coding sessions. The elePHPant’s virtual presence reflects the close-knit and supportive nature of the PHP community.

The IPC keynote offered a glimpse into the evolution of the elePHPant, showcasing the various editions and designs created over the years. Each elePHPant is a unique piece of PHP history, and collectors worldwide treasure them as valuable artifacts.

As the PHP language continues to evolve, so does the legacy of the elePHPant. It remains a symbol of the vibrant and passionate PHP community, which values collaboration, knowledge-sharing, and the joy of coding. The elePHPant’s 17-year journey is a testament to the enduring spirit of PHP developers worldwide. As it continues to travel the globe, it carries the memories and experiences of every coder who has crossed paths with this beloved mascot.


Professional Test Management with TestRail – Part 2
https://phpconference.com/blog/professional-test-management-with-testrail-part2/ (October 6, 2023)

The work of a testing team already begins in the early project phase, with intensive planning of test concepts, optionally for each level of the V-Modell (component test, integration test, system test, acceptance test).


Testing is more than just running the tests! We have already explained this statement in detail in Part 1 of our series.

The simplified flow from “Test Case Management” through “Test Planning” and “Test Execution” up to the “Final Reports” shows the wide spectrum of activities in the QA environment. To keep these activities manageable, there are tools such as “TestRail”. This test management software lets you create a clean, filterable test catalog with detailed instructions for the execution of tests.


In Part 1, we created such a test catalog together; we will now use it for further planning.

Now that we have developed our tests and entered them as optimally as possible in TestRail, it is time to prepare them for execution by means of “Test Runs”.

Creating Test Plans

TestRail offers several options for planning. The most basic is to create a simple “Test Run” under the menu item “Test Runs & Results”. This test run will later contain various tests from the catalog, selected either manually or automatically via filtering.

However, if we want a more structured approach, TestRail also offers the possibility to create a test plan. A Test Plan can contain any number of Test Runs, allowing for thematic structuring or subdivision.

Test Plans and Test Runs can be combined in different ways. For example, as in the definition, a Test Run can be a single run with a completed result. A test plan could then contain several runs until everything is finally OK and the feature can be accepted.

Another variant is a test plan containing different test runs that cover diverse topics. For example, one test run might contain all the automated Cypress tests, another the smoke and sanity tests, and a third the regression tests or new features. This is often helpful as a visual representation, but it can also be used to assign runs to different testers on the team. In this case, the test runs would remain open or be repeated until everything is ultimately OK.

Before we actually create a test plan, we should look at the “Milestones” section in the main menu item. All test plans or test runs can also be assigned to milestones. These thus provide a rough subdivision, which can be done at your own discretion or in coordination with the project management.

Now we create our test plan and three test runs: “Cypress Tests”, “Smoke and Sanity” and “Regression Tests”.

When creating a single test run, we have several options for selecting the tests. We can choose to add all tests, only certain manually selected tests, or use dynamic filtering to make the selection.

In case of manual selection, a window opens with an overview of our test catalog. Here we can navigate through our structured sections and select desired tests by simply ticking them. After clicking “OK”, the selected tests are applied to the test run.


When using dynamic filtering, we also see a modal. On the right side we have the possibility to specify different filter settings. Depending on how extensive the list is, we need to make sure to click on the “Set Selection” button at the bottom (scrolling may be required). Only then will TestRail highlight the appropriate tests based on our filtering. The rest of the process is the same as for manual selection.

If you now think that this is all TestRail offers us, you are considerably mistaken. TestRail offers us many more useful functions in the editing view of a test plan. The “Configurations” button opens a small window where we can create various groups and configurations. Based on the selected combinations, our prepared test cases will be duplicated and created for each specified configuration. For example, we could create groups for browsers, operating systems and devices. The configurations could then be “Chrome”, “Firefox”, or “Windows 11”, “MAC”, etc. We can then select which combinations we want to test. After we confirm this, we have different test runs for all our combinations, which we can customize or even remove. Of course, it is also possible to assign each Test Run to a different tester in the system.

So with all these features, we have flexible options to find our own customized approach for a project and a way of working.

At the end of the day, it is crucial to have a clear overview of the tests and be able to quickly provide feedback on the current status.

Test Execution

Now we finally get to the execution of our tests. Depending on the strategy and approach, this can be done either during the project, or classically at the end. Combinations are also possible if there are sufficient resources.

To start a run, we simply go to the detail page of the desired test run. On this page we have an efficient overview with statistics, sections and filtering options. A simple master-detail navigation allows to see the list of tests on the left side, and the details of the currently selected test on the right side.

For each test, multiple results can be recorded here. To do this, we simply click on the drop-down menu of the status (e.g. “untested”) in the list or on “Add result” within the details page. We can pre-select anything without consequence, such as “passed”, as a separate window will open anyway where we can adjust the results again. This may seem unexpected at first, but it is easy to learn. Basically, it is up to us which view we want to use to test. The most important thing is to read the steps carefully. However, the modal offers the advantage of marking steps already performed as “passed” to keep track of them, and it also allows us to record times, which can be interesting for planning future test runs.

Once we have captured the result of the test, TestRail does an excellent job of logging. The modal contains not only a comment function, but also fields for build number, version, etc., in addition to the status (Passed, Blocked, Retry, Failed). These can be expanded with additional fields as needed. A particularly interesting area concerns defects. Here we not only have the option to enter reference numbers (i.e. ticket IDs), but you can also create tickets directly in Jira, as long as Jira is connected to TestRail. So if we find a bug in the software, we can create a Jira ticket directly from TestRail, and the ticket ID is automatically linked to the test result in TestRail. This allows QA teams to track the current status of Jira tickets directly in TestRail and see when a feature can be retested, independent of project management and developers. Within Jira, all relevant information from TestRail is displayed in the ticket, and the template used can be edited in TestRail. In this way, developers are also provided with all the necessary information.


Traceability and Reports

TestRail provides a comprehensive range of reporting options to monitor progress and test coverage. You can compare results from different test runs, configurations and milestones. These reports can be automatically generated with a schedule and shared with both internal team members and external stakeholders, including the ability to generate reports as PDFs.

Learning TestRail’s reporting features may take some time, but once the various options are understood, they make it possible to customize the reports to meet the team’s unique needs.

In addition to generated reports, TestRail also offers real-time reports. These can be found at the project level, milestone level, test plan level and test run level.

In the area of tracking, TestRail provides the ability to assign external reference IDs. This can be a Jira ticket ID, for example. If one has additionally linked Jira correctly, a tooltip field with information directly from Jira even opens when hovering. This gives you the possibility to assign different tests to a Jira ticket (e.g. Epic). This linking can be used for corresponding evaluations, but also for simple filtering when creating test plans.

TestRail API

TestRail has an extremely comprehensive HTTP-based API, which enables the creation of a wide range of interfaces. Using this API, we can retrieve test cases, create new test results, send attachments, and perform basic tasks such as creating test runs and editing configurations.
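
As a rough sketch of what such an interface can look like in PHP, the following submits a single result via the add_result_for_case endpoint documented in the TestRail API v2; the domain, credentials, run ID and case ID are placeholders:

// Report a single "Passed" result for a test case in an existing test run
$runId  = 123; // ID of an existing test run
$caseId = 456; // case ID (the "C" number without the prefix)
 
$ch = curl_init(
  "https://my-company.testrail.io/index.php?/api/v2/add_result_for_case/$runId/$caseId"
);
curl_setopt_array($ch, [
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_USERPWD        => 'myUser:myApiKey',
  CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
  CURLOPT_POST           => true,
  CURLOPT_POSTFIELDS     => json_encode([
    'status_id' => 1, // 1 = Passed, 5 = Failed
    'comment'   => 'Submitted via the TestRail API',
  ]),
]);
 
$response = curl_exec($ch);
curl_close($ch);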

TestRail provides its own GitHub repository with templates for development in PHP, Java, .NET, Ruby and more.

Based on this API, we can now integrate a plugin for our test automation and submit results directly from Cypress to TestRail.

Cypress and TestRail

There are various reasons why test automation is sought: resource constraints, avoiding repetitive steps, or securing critical areas of the application that are particularly error prone.

To begin automation with Cypress, let’s create a Cypress project. Since the focus of this article is on TestRail, we will not go further into the implementation of Cypress tests here. The crucial point is the actual integration of our plugin.

First, we select a test from our test catalog. In collaboration with the QA and development teams (or test automation engineers), a kick-off is conducted to take a closer look at the desired test and its behavior. After the test is implemented in Cypress, it is reviewed accordingly. If everything fits, we can mark the test as “automated” in TestRail. This gives us a better overview in the future of which tests are automated and therefore no longer need to be tested manually.

But how do the results from Cypress get into TestRail? Quite simply – via an appropriate plugin based on the TestRail API. We install a compatible plugin like the “Cypress TestRail Integration” [https://github.com/boxblinkracer/cypress-testrail].

The configuration is relatively simple using the “setupNodeEvents” function provided by Cypress.

e2e: {
  setupNodeEvents(on, config) {
    return require('./cypress/plugins/index.js')(on, config)
  },
}

This configuration delegates to our manually created “index.js” file, which performs the actual registration of the plugin. Of course, this step can also be done inline.

const TestRailReporter = require('cypress-testrail');
 
module.exports = (on, config) => {
  new TestRailReporter(on, config).register();
  return config;
};

After this is done, there are only two simple steps left: we need a configuration for our TestRail instance, and we need to link our Cypress test to the corresponding test in TestRail.

Let’s start with the configuration. We have several options here: either we create a “cypress.env.json” file or we work directly with environment variables, for example in a CI/CD pipeline.

 

The plugin offers two basic ways to send results to TestRail: sending the results directly to an existing, prepared test run, or having new runs created dynamically. The appropriate approach can vary depending on the team and the project, and the plugin supports both.

The following example shows a JSON file that sends results to a defined Test Run:

{ "testrail": { "domain": "my-company.testrail.io", "username": "myUser", "password": "myPwd", "runId": "R123" } }

After the connection is configured, we just need to map our Cypress test to the appropriate TestRail test. This is done via a simple mapping in the test description using the ID from TestRail. The TestRail ID is displayed next to each test and always starts with a “C”. It is also possible to link multiple test cases to a single Cypress test.

it('C123: My Test for TestRail case 123', () => {
  // ...
});
 
it('C123 C54 C36: My Test for multiple TestRail case IDs', () => {
  // ...
});

That’s all. Now when we start Cypress in “run” mode, we see a hint about our integration and its configuration at the beginning. After a spec file is processed in Cypress, the results of the tests performed in it are finally sent to TestRail.

The integration offers many more options, such as uploading screenshots, adding more metadata and much more.

Conclusion

Testing is more than just running tests. To get the multitude of necessary tasks sorted out, test management tools like “TestRail” help us. TestRail offers a powerful test management solution that covers the entire quality management process, from test case creation to reporting. With features for structuring test catalogs, flexible test plans and comprehensive reporting, it enables efficient test management.


TestRail’s seamless integration with other tools, such as Jira, facilitates collaboration between test and development teams. In addition, the comprehensive API enables integration with automation software such as Cypress.

Overall, TestRail provides a comprehensive solution to streamline the QA process and deliver high-quality software products.


Links & Literature

https://www.testrail.com/

https://github.com/gurock/testrail-api

https://github.com/boxblinkracer/cypress-testrail

Professional Test Management with TestRail – Part 1
https://phpconference.com/blog/professional-test-management-with-testrail-part1/ (September 26, 2023)

“Now just a quick test and we can go live!” Surely most of us have heard this statement before. A professional approach, perfect plans and structured work during the project – and yet this optimistic, at the same time naive, conclusion appears in the home stretch.


But where is the problem with testing? Not in testing itself, but in the perception that it can be done quickly and at short notice. Professional quality management encompasses much more than just testing: it starts at the very beginning of the project and, over its duration, provides answers to questions such as the coverage of planned tests, the progress of the project, the number of known defects, and much more.


Tools are available to us for exactly these tasks, so-called test management applications. In this article, we will take a look at the application “TestRail”, and learn what possibilities such software offers us, and how we can use it.

However, before we get into the details, it is important to consider what is actually meant by the term “testing” and what tasks are associated with it.

What does professional testing mean?

What does testing actually mean? According to the guidelines of the ISTQB (International Software Testing Qualifications Board), testing includes:

The process consisting of all lifecycle activities (both static and dynamic) that deal with planning, preparation, and evaluation of a software product and associated deliverables.

This definition is undoubtedly based on a broad focus on all activities, which means that testing encompasses much more than simply running tests.

If we take a closer look at the start of a new project, it is common knowledge that project management, technical leads and other stakeholders work with the customer to create project plans, divide them into work packages and release them for development. What is often neglected, however, is the role of testers in this crucial planning phase of the project.


In professional quality management or quality assurance, one or more test concepts are developed at the beginning of the project. These test concepts sometimes deal with seemingly simple questions which, however, play a central role in the development of test cases.

What are the goals of our testing? Do we want to build trust in the software, or just minimize risks? Evaluate conformance, or simply prove the impact of defects? What documents do we create for our tests? What forms the basis of our tests (concepts, specifications, instructions, functions of the predecessor software)? Which test environments are available, when will they be implemented, and which approaches and methods do we use to develop test cases?

For those who have now had their “aha” moment, it should be added that such test concepts can indeed be elaborated for each test level of the V-Modell. For example, in the area of component testing, we usually strive for things like unit tests, code coverage and whitebox testing, while in system testing, blackbox testing methods are increasingly used for test case development (equivalence classes, decision tables, etc.). In addition, system testing may already be validating instead of just verifying things.
Validation deals with making sense of the result (does the feature really solve the problem?), while verification refers to checking requirements (does it work according to the requirement?).

Due to the considerable amount of information and work steps according to ISTQB (and that was by far not all), I would like to divide these into four simple areas:

  • Test Case Management
  • Test Planning
  • Test Execution
  • Final Reports

This clear structure makes it possible to manage the complexity of the testing process and to ensure that all necessary steps are carried out carefully.

Testing in a Software Project

To facilitate the later use of TestRail, let’s now take a rough look at the flow of a project, using the points simplified above.

After the test base (requirements, concepts, screenshots, etc.) has been defined, various test concepts have been generated, and appropriate kick-off meetings have taken place, it is the responsibility of the testers to develop appropriate test cases. Each test case essentially provides step-by-step guidance for its execution, whether in purely written or visual form.

Those who have done this before know that there are few templates and limitations in this regard. These range from simple functional tests, such as technical API queries, to extensive end-to-end scenarios, such as a complete checkout process in an e-commerce system, including payment (in test mode).

A key factor in test design is recognizing that quantity does not necessarily mean quality. It makes little sense to have 1,000 tests that cannot possibly be run manually over and over again due to scarce capacity. It makes much more sense to create fewer tests, each covering a large number of implicit checks, so that they automatically cover additional peripheral aspects of the actual case where possible.

Now that a list of tests has been created, it is of course useful if it can be filtered. Therefore, the carefully compiled test catalog is additionally categorized. The so-called “Smoke & Sanity” tests comprise a small number of tests that are so critical that they should be run with every release. Simple regression tests, in turn, provide an overview of optionally testable scenarios that can be rerun as needed (suspected side effects, etc.).

The list of these categories can vary, as there is no official standard and they can vary from company to company. Ultimately, the most important thing is the ability to easily filter based on requirements. Of course, there are many other interesting filtering options, such as a reference Jira ticket ID for the Epic covered in the test, or possibly specific areas of the software such as “Account”, the “Checkout” or the “Listing” in e-commerce projects.

Now that the test catalog has been generated, the question is whether we should directly test it in full. The answer is yes and no! Here it depends on what is crucial for the project management and the stakeholders, i.e. what kind of report they ultimately need.

Therefore, we can create test plans that include either all tests, or only a subset of them. Usually, for example, before a release for a plugin (typically with semantic versioning v1.x.y, …) all “smoke & sanity” tests are run, as well as some selected tests for new and old features. Although it would of course be ideal to run all tests, this is unfortunately often unrealistic, depending on team size and time pressure. A relaunch project that is created from scratch should of course be fully tested before final acceptance. However, for a more economical way of working (shift-left), it is possible to plan various test plans for the already completed areas of the software earlier. Thus, tests for the “account” area of an online store could be started before the “checkout” area is testable. This gives an earlier result and also provides a cheaper way to fix bugs (the earlier in development, the cheaper).

However, this is still a gamble, as side effects could still occur due to integration errors at the end of the project. Thus, additional testing at the end is always advisable.

Planning test executions thus involves selecting and compiling tests from our test catalog, taking into account factors such as their importance, priority and feasibility.

After the test plans have been created and the work packages have been put into a testable state, the perhaps simplest but most prominent step in the QA process begins: the execution of the tests. This step can be quite straightforward, depending on the quality of the prepared tests, but it always requires a step-by-step approach. (A small tip: in addition to running these tests, freer and exploratory testing is also recommended to uncover additional paths and bugs.)

 

During test execution, however, it is critical to log results as accurately as possible. This includes capturing information such as screen sizes, devices used, browsers used, taking screenshots and recording the ticket ID of the work package, and more. Such logging is necessary for tracking and makes troubleshooting much easier for developers.

After the tests have been run, it’s time to create the final reports. Stakeholders and other involved parties naturally want to know what the status of the project is. Among other things, they are interested in the test coverage, the number of critical issues found, and whether they might suggest a premature go-live of the application. The creation of reports is therefore an essential step in the QA process, as they form the basis for decisions and consequences for the entire project.

Fortunately, in order not to lose track of all these tasks, tools and applications are available. Although in theory simple documents based on Word and Excel can suffice, professional test management applications provide a much more efficient and organized workspace for the entire team.

A leading tool in this field is “TestRail”.

Test Management with TestRail

TestRail, developed by Frankfurt-based Gurock Software, is characterized by its specialization in highly efficient and comprehensive solutions for QA teams. Its offerings range from comprehensive test management capabilities to the creation of detailed test plans, precise execution of tests, meticulous logging and extensive reporting. And for those who want to go even further, TestRail offers an extensive API that can be used to develop custom integrations to further customize and optimize the QA process.

When visiting the TestRail website, it quickly becomes clear that there is more than just software on offer here. TestRail’s content team continuously publishes interesting articles on the subject of testing, which offer real added value thanks to their practical and technically appealing content.

TestRail itself can be used either as a cloud solution or via an on-premise installation. The cloud variant offers a comprehensive solution at quite affordable prices, around EUR 380 per user per year. For those who want additional functions, the Enterprise Cloud version is available for around EUR 780 per user per year. This includes single sign-on, extended access rights, version control of tests and much more.


Installation on your own servers is more expensive, about EUR 7,700 to EUR 15,620 per year, but includes a large contingent of users and can be a suitable solution, especially for larger teams and companies.

Once you have chosen a version, such as the cloud solution, it can be used after a short registration.

Create a project

Let’s start by creating a new project in TestRail. In addition to the project title and access rights, there are settings related to Defects and References, which will be discussed in more detail later in this article. Through these two functions, it is possible to link applications such as Jira with TestRail, enabling smooth navigation as well as a preview of linked defect tickets or even epic tickets (references).

Probably the most interesting and important area concerns the type of project we are creating. Here, TestRail offers us three different options for structuring our test catalog.

The user-friendly “Single Repository” option allows us to create a simple and flexible test catalog that can be divided into sections and subsections.

The “Single Repository with Baseline Support” option allows us to keep the simplicity of the first model, but create different branches and versions of test cases. This is especially useful for teams that need to test different product versions simultaneously.

The third variant offers the possibility to use different test catalogs to organize the tests. Test catalogs can be used for functional areas or modules of the application. This type of project is more suitable for teams that need a stricter division of the different areas. A consequence of this is that test executions can only ever include tests from a single test catalog.

For our project launch and greater flexibility, we choose the “Single Repository” type.

Create tests

After the project is created, we are taken to an overview page. Here, at a later stage of the project phase, we will find more useful information.

Now it is time to create our first test. To do this, we open the “Test Cases” section in the project navigation.

On this page we see the currently still empty test catalog. Our task now is to create an appropriate number of tests that are optimally structured and filterable for us.

TestRail offers a variety of options for organizing test cases. In addition to filterable properties, we can also create a hierarchical structure by using sections. There are no hard and fast rules on how this should be done.

We can use sections for different areas of the application like “Checkout” or “Account”, or create them for individual features. The author often finds it helpful to use sections to break down the application by area or feature, as these can be used later as a guide when creating test plans.

Regardless of whether we decide to use sections or not, the next step is to create our first test.

Looking at the input screen, we notice that a lot of emphasis has been placed on relevant information here.

We have the option to define various properties, such as the type of test (smoke, regression, etc.), priority, automation type and much more. If these options are not enough, we can easily create and add new fields through the administration.

When we define the instructions of a test, we have the option to use one of several templates. Besides the variant with a free text field, we also have a template for step-by-step instructions. With the latter, we can define any number of steps with sequences and expected intermediate results. This not only offers the advantage of clear instructions, but also allows us to specify exact results for each step. This way, we can later immediately see from which step an error occurred.


For testers managing large projects, there is also the option of outsourcing certain steps to separate central tests, such as the “login process on a website”, and then reusing them in different tests.

Thanks to the extensive editing options for tests in TestRail, there are no limitations when it comes to defining test cases efficiently and precisely.

Today we learned about the different processes of a testing team in a software project, and started using TestRail to set up our project.

With the tests we created together and the resulting filterable test catalog, we now have a perfect basis to plan the actual testing of our application.

In the next part we will use this test catalog to create test plans as well as to execute the tests.

We will also take a look at reporting, traceability, and Cypress integrations via the available TestRail API to complete our flow.

PHPUnit 10 – All you need to know about the latest version
https://phpconference.com/blog/phpunit-10-all-you-need-to-know-about-the-latest-version/ (August 1, 2023)

PHPUnit 10 is the most important release in PHPUnit’s now 23-year history. It is to PHPUnit what PHP 7 was to PHP: a massive cleanup and modernization that lays the foundation for future development. Let’s take a look inside at what specific changes PHPUnit 10 has brought and will bring in the coming months.

PHPUnit 10 should have been released on February 5, 2021, the first Friday in February 2021. It would have followed the tradition of PHPUnit 6, 7, 8 and 9 of being released on the first Friday of February each year, before most people in Germany had their first cup of coffee. PHPUnit 10 was then released on February 3, 2023, the first Friday in February 2023, two years late.

There are reasons for the delay. One of the most substantial may be a pandemic that has affected us all and permanently changed the lives and work habits of many people. Since April 2017, PHPUnit Code Sprints were held every six months, which the author attended with great pleasure and regularity. On one hand, these sprints gave the opportunity to discover and rediscover the functionality of PHPUnit together with Sebastian Bergmann and friends and acquaintances of PHPUnit; on the other hand, they were a chance to contribute to the development of PHPUnit in a concentrated way.

In September 2019, the last Code Sprint for the time being took place in Mannheim. In October 2019, Sebastian Bergmann, Arne Blankerts, Stefan Priebsch, Ewout Pieter den Ouden and the author participated in the EU-FOSSA Cyber Security Hackathon, organized by the European Union, to work on critical infrastructure for the European Union in parallel with other developers. It was there that the idea for one of the biggest changes in PHPUnit came up, the new event system that would find its way into PHPUnit 10.

However, COVID-19 meant that events such as the PHPUnit Code Sprint, official and unofficial hackathons, PHP user groups and conferences could no longer take place in the usual way. These events were cancelled completely or were only held online. The pandemic also lastingly changed the working habits of many of us who had previously been able to engage in constructive exchange with developers on-site, for example at customer locations, and who could now only do so online.


These changes also affected the work on PHPUnit. However, this does not mean that nothing has been achieved since the release of PHPUnit 9 in February 2020. On the contrary, PHPUnit 10, as already indicated, brings major changes, especially beneath the surface.

PHPUnit 10.0.0

PHPUnit 10.0.0 was released on February 3, 2023. Immediately after the release, a number of releases followed in quick succession until the end of March, fixing bugs and flaws and responding to feedback from developers. PHPUnit 10.0.19 was released on March 27, 2023.

PHPUnit 10 requires PHP 8.1 or higher. Developers using versions older than PHP 8.1 must use older versions of PHPUnit, such as PHPUnit 9 (requires PHP 7.3 or higher) or PHPUnit 8 (requires PHP 7.2 or higher). For PHPUnit 10, the documentation has been completely revised. In the following we want to take a look at the new functionalities.

Event system

The TestListener and Hook systems available in PHPUnit 9 provided interfaces for extending PHPUnit. Both interfaces had serious drawbacks.

The TestListener system required third-party vendors to create a class that implemented the TestListener interface. As a result, they had to implement every method of this interface, even methods that were not required. To facilitate implementation, PHPUnit provided a TestListenerDefaultImplementation trait.

The TestListener system also allowed third-party developers to manipulate the (in fact mutable) objects passed to their implementations and thereby alter test results. The best-known example of this might be an implementation that checks, when executing tests, in which environment those tests are executed and then, for example, marks and outputs failed tests as successful in a CI environment.

The Hook system allowed third-party developers to create a class that only needed to implement the interfaces relevant to the extension. In addition, only scalars and no longer mutable objects were passed to these methods. This system thus improved PHPUnit’s extension interface: it removed the ability to influence test results, but it also required more work from third-party vendors to provide similar functionality.


In PHPUnit 10, both systems have now been replaced with an event system. Almost everything in PHPUnit is now an event. All output, both on the console and in log files, is based on events. The development of this event system was led by Arne Blankerts and the author. As mentioned at the beginning, the development of the event system was started at the EU-FOSSA Cyber Security Hackathon in October 2019 together with Stefan Priebsch and Ewout Pieter den Ouden.

In the process, PHPUnit’s internal code, which previously used the TestListener system and ResultPrinter classes, was completely reworked (and in some cases rewritten) to use the event system instead. Due to the self-imposed constraint of using events for all output, both console and log, many confusing and/or missing events were discovered early on.

The new event system is not only superior to the earlier approaches TestListener and Hook. The work on the event system had a ripple effect on the entire PHPUnit codebase. A lot of technical debt was finally paid off. Finding the right places to emit the right events brought to light countless previously hidden inconsistencies and problems.

For example, a concrete event required a canonical and immutable representation of the configuration. As a result, the code that loads the XML configuration could be improved. Likewise, the code that processes the command line options and arguments could be improved. And most importantly, the code that combines these sources into the actual configuration has been significantly improved. When this actual configuration was created, large parts of the command line program could be implemented much more easily. This allowed other parts to be cleaned up, and so on and so forth.

The new event system allows read-only access and now has a large number of event objects (currently 67) that can be created during PHPUnit execution and also processed by extensions to PHPUnit. The event objects that are then passed to these extensions, as well as any value objects that are combined into such an event object, are immutable and contain a variety of information that may be of interest to PHPUnit extensions. For example, all of these objects contain information about runtime, current and maximum memory usage, and much more.
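To give an impression of the programming model, here is a minimal sketch of an extension built on the documented event system interfaces; the class names MyExtension and TestFinishedSubscriber are invented for illustration:

```php
<?php

declare(strict_types=1);

use PHPUnit\Event\Test\Finished;
use PHPUnit\Event\Test\FinishedSubscriber;
use PHPUnit\Runner\Extension\Extension;
use PHPUnit\Runner\Extension\Facade;
use PHPUnit\Runner\Extension\ParameterCollection;
use PHPUnit\TextUI\Configuration\Configuration;

// Invented example subscriber: reacts to every finished test
final class TestFinishedSubscriber implements FinishedSubscriber
{
    public function notify(Finished $event): void
    {
        // Event objects are immutable; we can only read from them
        print $event->test()->id() . ' finished' . PHP_EOL;
    }
}

// Invented example extension that registers the subscriber
final class MyExtension implements Extension
{
    public function bootstrap(
        Configuration $configuration,
        Facade $facade,
        ParameterCollection $parameters
    ): void {
        $facade->registerSubscriber(new TestFinishedSubscriber());
    }
}
```

Such an extension can then be registered in the XML configuration via the <extensions> element.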


PHPUnit 10 and its new event system require third-party developers to make significant changes to their extensions and tools for PHPUnit. The PHPUnit development team regrets that this may require significant effort, but at the same time is confident that in the long run the benefits of the new event system will outweigh the costs.

The PHPUnit development team has received promising feedback in this regard. Back in October 2021, Nuno Maduro reported that migrating Pest (an alternative and popular tool in the Laravel scene for running tests based on PHPUnit) from TestListener to the new event system had been a “great” experience. Discussions that the PHPUnit development team had with Filippo Tessarotto were then instrumental in ensuring that solutions like ParaTest could be updated to work with PHPUnit 10.

Separation of test results and test problems

In PHPUnit 10, a clear separation was introduced between the result of a test (errored, failed, incomplete, skipped or passed) and the problems of a test (considered risky, triggered a warning, etc.).

In PHPUnit 9, the internal error handling routine optionally converted errors of types E_DEPRECATED, E_NOTICE, E_WARNING, E_USER_DEPRECATED, E_USER_NOTICE, E_USER_WARNING, etc. into exceptions. These exceptions aborted the execution of a test and caused PHPUnit to consider the test as failed.

In PHPUnit 10, the internal error handling routine no longer converts these errors to exceptions. Therefore, the execution of a test is no longer aborted when, for example, an E_USER_NOTICE is raised. Consequently, such a test is no longer considered to have errors.

The example in Listing 1 raises an E_USER_NOTICE during the execution of a test.

```php
<?php
 
declare(strict_types=1);
 
use PHPUnit\Framework;
 
final class ExampleTest extends Framework\TestCase
{
  public function testSomething(): void
  {
    $example = new Example();
 
    self::assertTrue($example->doSomething());
  }
 
  public function testSomethingElse(): void
  {
    $example = new Example();
    self::assertFalse($example->doSomething());
  }
}
```
 
```php
<?php
 
declare(strict_types=1);
 
final class Example
{
  public function doSomething(): bool
  {
    // ...
 
    trigger_error('message', E_USER_NOTICE);
 
    // ...
 
    return false;
  }
}
```

In PHPUnit 9, E_USER_NOTICE was converted to an exception and the execution of the test was aborted (Listing 2).

```
➜ php phpunit-9.6.phar --verbose ExampleTest.php
PHPUnit 9.6.0 by Sebastian Bergmann and contributors.
 
Runtime:       PHP 8.2.2
 
EE                                 2 / 2 (100%)
 
Time: 00:00.015, Memory: 6.00 MB
 
There were 2 errors:
 
1) ExampleTest::testSomething
message
 
/path/to/Example.php:11
/path/to/ExampleTest.php:13
 
2) ExampleTest::testSomethingElse
message
 
/path/to/Example.php:11
/path/to/ExampleTest.php:20
 
ERRORS!
Tests: 2, Assertions: 0, Errors: 2.
```

This means that using PHP functionality that triggers E_DEPRECATED, E_NOTICE, E_STRICT, or E_WARNING, or calling code that triggers E_USER_DEPRECATED, E_USER_NOTICE, or E_USER_WARNING can no longer hide an error in the executed code. In the example shown above, the assertion line is never reached when PHPUnit 9 is used and the code under test triggers E_USER_NOTICE.

 

In PHPUnit 10, the E_USER_NOTICE is not converted to an exception and therefore the execution of the test is not aborted (Listing 3). By default, PHPUnit 10 does not display details about deprecations, notices, or warnings. In order for these details to be displayed, the command line options --display-deprecations, --display-notices and --display-warnings (or their counterparts in the XML configuration file) must be used.
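For instance, the output shown in Listing 3 can be produced with a call along these lines (the PHAR name mirrors the earlier PHPUnit 9 example; only --display-notices is strictly needed here):

```
php phpunit-10.0.phar --display-notices ExampleTest.php
```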

```
PHPUnit 10.0.0 by Sebastian Bergmann and contributors.
 
Runtime:       PHP 8.2.2
 
FN                                       2 / 2 (100%)
 
Time: 00:00.015, Memory: 6.00 MB
 
There was 1 failure:
 
1) ExampleTest::testSomething
Failed asserting that false is true.
 
/path/to/ExampleTest.php:13
 
--
 
There were 2 notices:
 
1) ExampleTest::testSomething
message
 
/path/to/ExampleTest.php:13
 
2) ExampleTest::testSomethingElse
message
 
/path/to/ExampleTest.php:20
 
FAILURES!
Tests: 2, Assertions: 2, Failures: 1, Notices: 2.
```

Metadata with attributes

In PHPUnit 10, metadata can be specified for test classes and test methods as well as for tested code units with attributes. Listing 4 shows the specification of metadata with annotations as known from PHPUnit 9 and older versions of PHPUnit. Listing 5 shows the specification of metadata with attributes as it is possible in PHPUnit 10.

```php
<?php 
 
declare(strict_types=1);
 
namespace App\Test;
 
use App\Example;
use PHPUnit\Framework;
 
/**
 * @covers \App\Example 
 */
final class ExampleTest extends Framework\TestCase
{
  /**
   * @dataProvider provideData
   */
  public function testSomething(
    string $expected, 
    string $input,
  ): void {
    $example = new Example();
 
    $actual = $example->doSomething($input);
 
    self::assertSame($expected, $actual);
  }
 
  public static function provideData(): array
  {
    return [
      [
        'foo', 
        'bar',
      ],
    ];
  }
}
```
```php
<?php 
 
declare(strict_types=1);
 
namespace App\Test;
 
use App\Example;
use PHPUnit\Framework;
 
#[Framework\Attributes\CoversClass(Example::class)]
final class ExampleTest extends Framework\TestCase
{
  #[Framework\Attributes\DataProvider('provideData')]
  public function testSomething(
    string $expected, 
    string $input,
  ): void {
    $example = new Example();
    
    $actual = $example->doSomething($input);
 
    self::assertSame($expected, $actual);
  }
 
  public static function provideData(): array
  {
    return [
      [
        'foo', 
        'bar',
      ],
    ];
  }
}
```

In PHPUnit 10, both annotations and attributes are supported. PHPUnit 10 first searches for attributes for a code unit. If no attributes are found, the system falls back on any existing annotations.

Currently there are no concrete plans as to whether and when support for annotations will be marked as deprecated and removed.

New assertions

A number of assertions have been added in PHPUnit 10. These include:

  • assertIsList()
  • assertStringEqualsStringIgnoringLineEndings()
  • assertStringContainsStringIgnoringLineEndings()
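To illustrate, here is a short usage sketch (inside a test method; the values are made up, the assertion names are as listed above):

```php
// passes: keys are consecutive integers starting at 0
self::assertIsList(['a', 'b', 'c']);

// passes: "\r\n" and "\n" line endings are treated as equal
self::assertStringEqualsStringIgnoringLineEndings("line 1\nline 2", "line 1\r\nline 2");

// passes: the needle is contained, ignoring line endings
self::assertStringContainsStringIgnoringLineEndings("line 1\nline 2", "intro line 1\r\nline 2 outro");
```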

New command line options

A number of command line options have been added in PHPUnit 10. These include:

  • --display-deprecations, enables the display of deprecations
  • --display-errors, enables the display of errors
  • --display-incomplete, enables the display of incomplete tests
  • --display-notices, enables the display of notices
  • --display-skipped, enables the display of skipped tests
  • --display-warnings, enables the display of warnings
  • --no-extensions, disables all extensions for PHPUnit
  • --no-output, disables all output from PHPUnit
  • --no-progress, disables the progress indicator
  • --no-results, disables the results display

 

Removed functionalities

In PHPUnit 10, all functionalities that were marked as deprecated in PHPUnit 9 have been removed. Developers who receive warnings about using deprecated PHPUnit functionality when running their tests with PHPUnit 9 will not be able to upgrade to PHPUnit 10 until they have stopped using that deprecated functionality.

Removal of PHPDBG and Xdebug 2 support

In PHPUnit 10, support for PHPDBG and Xdebug 2 for collecting code coverage has been removed. PCOV or Xdebug 3 are required to collect code coverage.
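With Xdebug 3 installed, for example, coverage collection can be enabled via its mode setting; a common invocation looks like this (assuming a Composer-based setup):

```
XDEBUG_MODE=coverage ./vendor/bin/phpunit --coverage-text
```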

Removal of integration with Prophecy

In PHPUnit 10, the integration with Prophecy for creating test doubles has been removed. Developers who use libraries such as Prophecy or Mockery in their tests to create test doubles will need to rewrite their tests for PHPUnit 10 or wait for Prophecy and Mockery to support PHPUnit 10. At this time, neither Prophecy nor Mockery support PHPUnit 10.

Removal of assertions

In PHPUnit 10, a number of assertions have been removed, some of which were replaced in PHPUnit 9 with newly added alternatives. These assertions include:

  • assertNotIsReadable(), replaced by assertIsNotReadable()
  • assertNotIsWritable(), replaced by assertIsNotWritable()
  • assertDirectoryNotExists(), replaced by assertDirectoryDoesNotExist()
  • assertDirectoryNotIsReadable(), replaced by assertDirectoryIsNotReadable()
  • assertDirectoryNotIsWritable(), replaced by assertDirectoryIsNotWritable()
  • assertFileNotExists(), replaced by assertFileDoesNotExist()
  • assertFileNotIsReadable(), replaced by assertFileIsNotReadable()
  • assertFileNotIsWritable(), replaced by assertFileIsNotWritable()
  • assertRegExp(), replaced by assertMatchesRegularExpression()
  • assertNotRegExp(), replaced by assertDoesNotMatchRegularExpression()
  • assertEqualXMLStructure(), removed without replacement

Removal of matchers

In PHPUnit 10, the at() matcher has been removed. This matcher previously allowed setting expectations on test doubles that methods would be called in a specific order.

The withConsecutive() matcher has also been removed. This matcher previously allowed expectations to be placed on Test Doubles that methods would be called in a certain order with certain arguments.

Both matchers previously made it possible to write code that introduced temporal coupling. Removing these matchers emphasizes that code which introduces temporal coupling is no longer considered good practice and should be avoided.
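PHPUnit itself does not prescribe a replacement for withConsecutive(); one commonly suggested migration pattern is to verify the arguments of consecutive calls in a callback. A sketch, assuming a mock with a save() method created inside a test:

```php
$expectedArguments = [
    ['first argument'],
    ['second argument'],
];

$mock->expects($this->exactly(2))
     ->method('save')
     ->willReturnCallback(function (mixed ...$arguments) use (&$expectedArguments): void {
         // compare the arguments of each call, in call order
         self::assertSame(array_shift($expectedArguments), $arguments);
     });
```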

Removal of command line options

In PHPUnit 10, a number of command line options have been removed. These include:

  • --debug, allowed debug output to be enabled while running tests
  • --extensions, allowed configuration of extensions for PHPUnit
  • --printer, allowed configuration of a class to output test results
  • --repeat, allowed repeated execution of tests
  • --verbose, allowed configuring more detailed output while running tests

Removal of the TestListener and Hook systems

In PHPUnit 10, both the TestListener and Hook systems have been removed as interfaces for third-party extensions to PHPUnit. Developers who rely on functionality from extensions for PHPUnit 9 will not be able to use PHPUnit 10 until those extensions have been migrated to PHPUnit 10’s new event system or until they have found alternative extensions that are compatible with PHPUnit 10.


PHPUnit 10.1.0

PHPUnit 10.1.0 was released on April 14, 2023. This release was followed by only a small number of patch releases. PHPUnit 10.1.3 was released on May 11, 2023. Below are the new, changed, and deprecated functionalities of PHPUnit 10.1.

New assertions

New assertions have been added in PHPUnit 10.1.0. These include:

  • assertObjectHasProperty()
  • assertObjectNotHasProperty()
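Usage is straightforward (the property names and the object are made-up examples):

```php
$object = new stdClass();
$object->name = 'PHPUnit';

self::assertObjectHasProperty('name', $object);
self::assertObjectNotHasProperty('version', $object);
```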

New attributes

New attributes have been added in PHPUnit 10.1.0. These attributes include:

  • IgnoreClassForCodeCoverage
  • IgnoreMethodForCodeCoverage
  • IgnoreFunctionForCodeCoverage

New source element in XML configuration

In PHPUnit 10.1.0, a new <source> element has been added to the XML configuration. This element allows you to configure a list of directories and files that PHPUnit considers to be the source code of a project. In addition, this element allows you to configure in detail how notices, deprecations, and warnings that arise from running the source code are handled.

Accordingly, there is now a new Source object that represents the configuration of the <source> element. The <source> element replaces the <coverage> element, which has now been marked as deprecated.
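A minimal sketch of such a configuration, based on the documented form of the <source> element (the paths and the restrict* attributes are examples):

```xml
<phpunit>
  <source restrictDeprecations="true" restrictNotices="true" restrictWarnings="true">
    <include>
      <directory suffix=".php">src</directory>
    </include>
    <exclude>
      <file>src/legacy.php</file>
    </exclude>
  </source>
</phpunit>
```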

New methods for creating test doubles

In PHPUnit 10.1.0, a TestCase::createConfiguredStub() method has been introduced, analogous to the TestCase::createConfiguredMock() method that has been available since PHPUnit 9. This method allows you to create a test stub whose methods and return values are configured in a single call; methods that are not configured return automatically generated default values.
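A sketch of the call inside a test; the interface and values are invented for illustration:

```php
interface NewsService
{
    public function headline(): string;
}

// stub creation and method configuration in a single call
$stub = $this->createConfiguredStub(NewsService::class, [
    'headline' => 'PHPUnit 10.1 released',
]);

$stub->headline(); // 'PHPUnit 10.1 released'
```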

New method for configuration by extensions

In PHPUnit 10.1.2, a method has been added to the extension facade that allows an extension to PHPUnit to indicate that the extension intends to replace the entire output of PHPUnit.

Suppression of deprecations, notices and warnings

In PHPUnit 10.1.0, E_USER_* errors suppressed by the @ operator are ignored again.

coverage element in XML configuration

In PHPUnit 10.1.0, the <coverage> element of the XML configuration was marked as deprecated. It is replaced by the newly added <source> element.

Methods for creating test doubles

In PHPUnit 10.1.0, methods used to create and configure test doubles were marked as deprecated. These include:

  • MockBuilder::enableProxyingToOriginalMethods()
  • MockBuilder::disableProxyingToOriginalMethods()
  • MockBuilder::allowMockingUnknownTypes()
  • MockBuilder::disallowMockingUnknownTypes()
  • MockBuilder::enableArgumentCloning()
  • MockBuilder::disableArgumentCloning()
  • MockBuilder::addMethods()
  • MockBuilder::getMockForAbstractClass()
  • MockBuilder::getMockForTrait()
  • TestCase::createTestProxy()
  • TestCase::getMockForAbstractClass()
  • TestCase::getMockForTrait()
  • TestCase::getMockFromWsdl()
  • TestCase::getObjectForTrait()

These methods are expected to be removed in PHPUnit 12.

Methods to access aspects of configured source code

In PHPUnit 10.1.0, with the introduction of the <source> element in the XML configuration, methods to access aspects of the configured source code were marked as deprecated. In their place, alternative and newly introduced methods of the source object can be used. These methods include:

  • Configuration::hasNonEmptyListOfFilesToBeIncludedInCodeCoverageReport(), replaced by Source::notEmpty()
  • Configuration::coverageIncludeDirectories(), replaced by Source::includeDirectories()
  • Configuration::coverageIncludeFiles(), replaced by Source::includeFiles()
  • Configuration::coverageExcludeDirectories(), replaced by Source::excludeDirectories()
  • Configuration::coverageExcludeFiles(), replaced by Source::excludeFiles()

PHPUnit 10.2.0

PHPUnit 10.2.0 was released on June 2, 2023. PHPUnit 10.2.2 was released on June 11, 2023. Below you can see the new functionalities and those marked as deprecated.

Optional suppression of deprecations, notices and warnings

In PHPUnit 10.2.0, enhancements have been made to allow optional suppression of deprecations, notices, and warnings.

Methods to access aspects of the configured source code

In PHPUnit 10.2.0, methods for accessing aspects of configured source code have been marked as deprecated. Instead, alternative and newly introduced methods of the source object can be used. These methods include:

  • Configuration::restrictDeprecations(), replaced by Source::restrictDeprecations()
  • Configuration::restrictNotices(), replaced by Source::restrictNotices()
  • Configuration::restrictWarnings(), replaced by Source::restrictWarnings()

PHPUnit 10.3.0

PHPUnit 10.3.0 is scheduled for release on August 4, 2023. The following is planned for it.

XML format for log files

For PHPUnit 10.3.0, there are rough plans to release a new XML format for log files. The XML format for log files used by PHPUnit so far has existed for about 20 years and is based on the XML format used by JUnit. This XML format has the disadvantage that it is under the control of neither JUnit nor PHPUnit. In addition, there is no official schema in XSD format that can be used to check the validity of log files.

However, the goal of a new XML format is not to produce another standard. Rather, the goal of a PHPUnit proprietary XML format is to be able to accommodate more information. Thanks to the new event system of PHPUnit 10, there is now significantly more information available, which unfortunately cannot be represented with the XML format currently used by PHPUnit 10.

Further planned releases

PHPUnit 10.4.0 will be released on October 6, 2023, and PHPUnit 10.5.0 on December 1, 2023.

The post PHPUnit 10 – All you need to know about the latest version appeared first on International PHP Conference.

]]>
The PHPUnuhi Framework at a Glance https://phpconference.com/blog/the-phpunuhi-framework-at-a-glance/ Fri, 14 Jul 2023 12:31:37 +0000 https://phpconference.com/?p=85494 While pipelines, tests, and automation positively influence many aspects of our daily work, there are still topics where manual work makes developers yawn. The platform-independent open source framework PHPUnuhi is trying to revamp the topic of “translations”, enhancing it with possibilities in the areas of CI/CD, storage formats, and even OpenAI.

The post The PHPUnuhi Framework at a Glance appeared first on International PHP Conference.

]]>

Who hasn’t had the following situation? You’re working on an application, a plug-in, or something similar and suddenly discover that translations in some language are missing. Depending on the software’s application area, this can either make the user smile slightly, or it can have far-reaching consequences. But one thing is always the same. The non-functional requirement “trust in the software” is harmed.

It’s a pity that mistakes happen here again and again. By now, there are tools like PHPUnit, PHPStan, and many others that help create high-quality applications. But what about translations? Wouldn’t it be wonderful if the pull request pipeline failed right when a colleague forgot a translation? Or when states arise where individual localizations are out of sync and have a different, invalid structure? This is exactly PHPUnuhi’s approach. But let’s start at the beginning.

Among other things, I’m the developer of the official Mollie payment plug-ins for Shopware. These plug-ins serve as central and optionally installable modules in online stores, based on Shopware 5 or Shopware 6 [1]. Merchants can install these plug-ins in no time and offer a wide range of payment methods from Mollie in their Shopware store [2]. Anyone who’s ever had to do anything in this area knows that payment is a serious sector. In short, it’s about money. There aren’t many excuses when a mistake happens. It just has to work!


Because of this, we’ve already spent a lot of time building pipelines. These range from the usual unit tests and static analysis to many E2E tests based on Cypress. But despite these precautions, it happens again and again that translations for multilingual plug-ins are forgotten. Every developer and tester knows it’s difficult to verify all areas in all languages, especially as a small team. But for the product’s end user, it simply looks embarrassing and untested.

So one day I decided to integrate a small script that would do at least some rough checks. Lo and behold, soon after, the first pipeline failed when I forgot a translation.

From then on, there were always one or two ideas for further tests and features. And so I decided to completely rebuild the previously small script from the Mollie plug-ins and publish it in combination with many other requirements as a platform-independent open source framework. After all, the world only benefits when more developers get something out of it.

But before we begin our first application, why “unuhi”? Quite simply, it means “translate” or “translation” in Hawaiian. Do I speak Hawaiian? No.

First steps

Before we get into the possibilities and basic concepts of PHPUnuhi, I’d like to start directly with its usage, because after just a few steps the tests are ready to be integrated into a pipeline. Let’s imagine we are developers of an application that has several translations based on JSON files. These are already finished and are located in the project or source code of the application.


You can easily install PHPUnuhi with Composer. The recommendation is to do so as a dev dependency:

composer require --dev boxblinkracer/phpunuhi

After installation, all that’s needed is to create an XML-based configuration, and the framework is ready for use.

In our configuration (phpunuhi.xml) we define one or more translation sets. These sets are freely definable bundles of localizations. A localization is then mapped via a file, a section, or similar, depending on the format. One can either create one large set or several topic-based sets, depending on the platform and application requirements (Listing 1).

<phpunuhi>
  <translations>
    <set name="App">
      <format>
        <json/>
      </format>
      <locales>
        <locale name="de">./snippets/de.json</locale>
        <locale name="en">./snippets/en.json</locale>
      </locales>
    </set>
  </translations>
</phpunuhi>

With that, we’re already finished with basic installation and configuration. Now we can start our tests and check how the translations are doing.

php vendor/bin/phpunuhi validate

Who could have seen it coming? Unfortunately, the tests fail. We receive the information that a wrong structure was found, and that a translation exists but doesn’t contain a value.

PHPUnuhi works with individual translations that have unique keys within a localization. In our case, there’s an issue with the key card.btnCancel in both the German and the English version (Fig. 1).

(Editor’s note: This article was originally published in German and has been translated into English. Therefore, the translation example in PHPUnuhi is working from German to English.)

Fig. 1: Example of error output during validation

To solve this problem, we have the option of manually entering the missing entry in the de.json file, or we can use a prepared command to automatically repair the structures:

php vendor/bin/phpunuhi fix:structure

This will give us a uniform structure in both files. Now we can run the following command and automatically correct our empty translation too.

php vendor/bin/phpunuhi translate --service=googleweb

With Google’s support, our empty entry has now been automatically translated and entered into the corresponding JSON file. Besides Google [3], DeepL [4], and OpenAI [5] can also be used for this. But before we delve deeper into this topic, it’s time to get to know the basic framework better.

 

PHPUnuhi’s basic structure

PHPUnuhi exists in the combination of different abstraction layers. This makes it possible to guarantee basic functionality while still being flexible in choosing formats and services. What does this mean?

In the current version, there are three basic pillars: storage formats, exchange formats, and translation services. These are in constant interaction and can be combined with each other as you wish (Fig. 2).

Fig. 2: Basic structure of the PHPUnuhi abstraction layers

Storage formats

Storage formats define how data is persisted. Translations can be stored in JSON files, INI files, PHP (array) files, or directly in a database (Shopware 6). Therefore, the focus of Storages is on reading, converting, and writing translations.

Different formats can also be equipped with individual settings. For instance, the JSON and PHP formats have the option of specifying the number of indentations and alphabetical sorting. In the case of the Shopware 6 storage, the entity of the database entries can (and must) be specified. Listing 2 shows two examples for the INI and Shopware 6 formats.

<set name="Storefront">
  <format>
    <ini indent="4" sort="true"/>
  </format>
  ...
</set>
 
<set name="Products">
  <format>
    <shopware6 entity="product"/>
  </format>
  ...
</set>

While simpler formats like JSON, INI, and PHP are based on simple data structures, there are also formats that divide translations into groups, like Shopware 6. The Shopware 6 format directly connects to the database, so a corresponding connection to the database must be established first. The parameters needed for this connection can be stored easily with an env area in the XML configuration or specified directly via env export (Listing 3).

<phpunuhi>
  <php>
    <env name="DB_HOST" value="127.0.0.1"/>
    <env name="DB_PORT" value="3306"/>
    <env name="DB_USER" value=""/>
    <env name="DB_PASSWD" value=""/>
    <env name="DB_DBNAME" value="shopware"/>
  </php>
</phpunuhi>

But back to our groups. Shopware 6 works as a storage with entities in the database. These are things like products, payment types, currencies, and more. Here, translations don’t refer to the general names of properties, but to product data or user data in the system.


This means that each entry of these entities (for instance, a single product) has multiple properties (name, description, etc.) that can be translated into different languages. The resulting additional dimension in our matrix is solved in PHPUnuhi using groups. Each entity (each product) receives a unique group ID with all associated translations. Table 1 shows an example of this.

Key          Group      DE                    EN
name         product-1  PHP Magazin           PHP Magazine
description  product-1  ein tolles Heft       a great magazine
name         product-2  Entwickler Magazin    Developer Magazine
description  product-2  auch ein tolles Heft  also a great magazine

Table 1: Example of generated translation structures based on groups

Considering that products in particular can have many properties, this list can get very long. There’s also a high chance that only a part of the properties should even be translated at all. This is where another storage format feature comes into play: the filters.

With include or exclude filters, you can include or exclude certain translations. Wildcard placeholders can also be used for this. The configuration in Listing 4 removes the custom_fields property and all properties beginning with meta_ from the translation list.

<set>
  ...
  <filter>
    <exclude>
      <key>custom_fields</key>
      <key>meta_*</key>
    </exclude>
  </filter>
  ...
</set>

Exchange formats

This type of format or abstraction layer is used for exchange with other systems. It focuses on data preparation suitable for the format and the storage (export), as well as reading certain file types for conversion back into PHPUnuhi compatible translations (import).

Of course, the classic CSV format is also on board. It supports the export and import of simple as well as extended storage formats (groups).

In other words, no matter what your storage format is, you will receive a CSV file. If the storage you use supports writing translations, then the CSV file can be automatically imported again.

 

Besides CSV, there’s also an integrated HTML format. This format solves several problems at once. The export creates a single index.html file that can easily be opened in any browser. This file contains an HTML-based spreadsheet with integrated editing options and storage of the adjustments. CSS and JavaScript are directly integrated. This is a great plug-and-play approach, especially for colleagues who tend to send back .xls files instead of the needed CSV files.

However, more than just local processing is possible. There is another variant that’s just as exciting, for staging systems, for instance. Since the export path can be chosen individually, it’s possible to store this file in a public directory on the web server. This way, a certain URL on the staging system can output an overview of all currently available translations. Thanks to the integrated form, these can also be edited directly. The resulting output can be downloaded and imported into the software with the import command for the next iteration. To add even more automation, generating this export can either run as a post-deployment job in the pipeline, or simply at a fixed interval via cronjob or something similar.

The HTML format also supports storage formats with groups. In this case, grouped translations are displayed visually so that translation can be done intuitively. Figures 3 and 4 show examples of HTML and CSV exports.

Fig. 3: Example of HTML export with integrated form

Fig. 4: Example of a CSV export with three languages

Translation services

The last abstraction area in the current version is connecting to different translation providers. Currently, it supports Google, DeepL, and OpenAI. This makes it possible for missing translations to be automatically added with an integrated translate command. Thanks to the framework’s basic concept, this means that all kinds of storage formats that support writing translations can also be combined with translation services at the same time.

PHPUnuhi only needs an existing value in another language as a basis for this automation. If this is the case, the translation can be requested from the external service. The result is automatically persisted with configured storage.

Further individual configuration options are provided when integrating the different providers. For instance, with DeepL, you can use the --deepl-formal argument to specify whether the translation should be formal or informal. This affects the German salutations “du” and “Sie”, for instance.

The googleweb service can be used for a quick start. This sends a simple query to the familiar Google Translate website:

php vendor/bin/phpunuhi translate --service=googleweb

Although this isn’t recommended for continuous mass queries, it usually works quite well and can be used purposefully.

If you want to take a more professional approach, you can also connect to Google Cloud Translate and, as previously mentioned, to DeepL, which is becoming increasingly popular. For AI enthusiasts, there is now also an OpenAI integration. It currently uses the text-davinci-003 model, which is not perfect yet but already delivers surprisingly good results. OpenAI can be used with the following command, along with the specification of a corresponding service including the API key:

php vendor/bin/phpunuhi translate --service=openai --openai-key=(my-api-key)

What functions are available?

Now that we understand the basic framework and some of its possibilities, we can take a closer look at the framework’s extended functionality.


With the help of a few commands, you can perform much more than simple translation testing. State analysis, listings, reporting, imports and exports offer a multitude of possibilities for your project.

Translation coverage

With the status command, you can output coverage in the area of translations. Values are provided on the level of localizations, translation sets, and as an overall view:

php vendor/bin/phpunuhi status

Validation

One of the framework’s core functions is the validate command. As I previously mentioned, you can test translations for completeness. But the command also has some other useful features.

A problem that frequently occurs as software development progresses is unplanned variation in the spelling of translation keys. While much attention is paid to code styles, little consideration is given to the fact that text modules should also follow a consistent structure. Using case style validation, you can maintain the consistency of keys over the project’s lifecycle. PHPUnuhi offers a list of potential options, including the well-known variants Pascal, Camel, Kebab, and more.

A translation set can therefore be assigned several potential case styles. If no styles are specified, the whole test is skipped. The actual test based on this list works for simple storage formats as well as for multi-nested storages like JSON and PHP. Here, all hierarchy levels are checked against the specified styles.

Optionally, you can also fix different styles on certain levels. For a nested structure like JSON, Pascal Case can be defined at the root level, while Kebab Case must be used at all other levels (Listing 5).

<set>
  <styles>
    <style level="0">pascal</style>
    <style>kebab</style>
  </styles>
</set>

Friends of JUnit reports will also get their money’s worth with PHPUnuhi. With the report-format argument, you can generate a JUnit compliant XML file:

php vendor/bin/phpunuhi validate --report-format=junit --report-output=junit.xml

This contains all tests performed with corresponding error reports and can be used in a familiar way and processed by the machine.

Fix structure

With large file-based translations like JSON and INI, manually fixing diverging structures can be extremely time-consuming, even more so if they span several hierarchies or levels. This can be automated and simplified using the integrated fix:structure command.

In the process, PHPUnuhi verifies individual structures and ensures that each localization also receives all of the entries. As a little bonus, the storage formats also rewrite values with previously configured indentations or even in alphabetical order, depending on the type:

php vendor/bin/phpunuhi fix:structure

I should mention that this is only a matter of repairing structures. The values are stored with an empty string, so a validation still fails.

Export/Import

Exports and imports provide a simple variant for working with external agencies and systems. Using a simple export command, you can quickly create files that can be passed to systems or people by selecting a format:

php vendor/bin/phpunuhi export ... --format=csv
php vendor/bin/phpunuhi export ... --format=html

If no special translation set is specified, then all sets will be exported to separate files. However, as with many commands, you can also select a set by argument and have only this set processed. After the customized results have been returned, they can be imported back into the system with an import command:

php vendor/bin/phpunuhi import --set=storefront --file=storefront.csv

It should be noted here that version control using Git or something similar is strongly recommended, especially when working with file-based storage formats. For storage formats using a database, an appropriate back-up should also be made before the import.

Translate

The translate command is one of the more exciting features along with the validate command. As already described in the “Translation Services” section, an external service can be used to automatically translate values. A service is simply selected with the service argument.

Now PHPUnuhi goes through all existing entries and tries to translate empty translations with the specified service. The value of an already translated language serves as the basis; only one such value needs to exist. If none exists, the entry cannot be translated.

php vendor/bin/phpunuhi translate --service=googleweb
php vendor/bin/phpunuhi translate --service=deepl --deepl-key=xyz

If you want to completely retranslate an existing localization, you can use the force argument for this. You must specify the locale that will be retranslated.

php vendor/bin/phpunuhi translate --service=googleweb --force=en-GB

But with automated services, it’s important to always remember that translations should be generated depending on the application’s context. Automatically generated results fit most cases, but manual, human verification is still recommended.


Conclusion

As a platform-independent open source framework, PHPUnuhi tries to simplify translation work for developers and teams, while also increasing the possibilities of quality assurance measures. With its simple configuration options, it can be quickly integrated into existing projects and used efficiently after just a few minutes. PHPUnuhi’s possibilities are far from exhausted. So if you feel like joining or just have some ideas, you can participate via the GitHub repository [6].


Links & Literature

[1] https://www.shopware.com

[2] https://www.mollie.com

[3] https://translate.google.com

[4] https://www.deepl.com

[5] https://openai.com

[6] https://github.com/boxblinkracer/phpunuhi

The post The PHPUnuhi Framework at a Glance appeared first on International PHP Conference.

]]>
Asynchronous Programming in PHP https://phpconference.com/blog/asynchronous-programming-in-php/ Wed, 03 May 2023 12:01:15 +0000 https://phpconference.com/?p=85231 When starting this article I wanted to write about quite a lot of things and quite a lot of concepts. However, trying to explain just the fundamental blocks of what asynchronous programming is, I quickly hit the character limit I had and was faced with a choice. I had to decide between going into details of the A’s and B’s or give an eagle’s eye perspective of what is out there in the async world. I chose the former.

The post Asynchronous Programming in PHP appeared first on International PHP Conference.

]]>

We will cover a very basic, naive and simplistic take on what asynchronous programming is like. However, I do believe that the example we explore will give the reader a good enough picture of the building blocks of a powerful and complex technique.

Enjoy!

A service for fetching news

Imagine we work in a startup! The startup wants to build this really cool new service where users input a topic into a search field and they get a bunch of news collected from the best online news sites there are. We are the back-end engineering team and we are tasked with building the core of this fantastic new product – the news aggregator. Luckily for us, all of the on-line news agencies which we will be querying provide nice APIs. All we need to do is for each requested topic to make a call to each of the APIs, collect and format the data so it’s readable by our front-end and send it to the client. The front-end team takes care of displaying it to the user. As with any startup, hitting the market super fast is of crucial importance, so we create the simplest possible script and release our new product. Below is the script of our engine.

<?php

$europe_news = file_get_contents('https://api.europe-news.org?q=' . $topic);
$asia_news = file_get_contents('https://api.asia-news.org?q=' . $topic);
$africa_news = file_get_contents('https://api.africa-news.org?q=' . $topic);

$formatted = [
  'europe_news' => format_europe($europe_news),
  'asia_news' => format_asia($asia_news),
  'africa_news' => format_africa($africa_news)
];

echo json_encode($formatted);

This is as simple as it gets! We give a big “Thank you” to the creators of PHP for making the wonderful file_get_contents() function which drives our API communications and we launch our first version.


Our product proves to be useful and the number of clients using it increases from day to day. As our business expands, so does the demand for news from the Americas and other countries. Our engine is easy to expand, so we add news from the respective news services in a matter of minutes. However, with each additional news service, our aggregator gets slower and slower.

A couple of months later our first competitor appears on the market. They provide the exact same product, only it’s blazingly fast. We now have to quickly come up with a way to drastically improve our response time. We try upgrading our servers, scaling horizontally with more machines, paying for a faster Internet connection, but still we don’t get even close to the incredible performance of our competitor. We are in trouble and we need to figure out what to do!

The Synchronous nature of PHP

Most of you have probably already noticed what is going on in our “engine” and why adding more news sites makes things slower and slower. Whenever we make a call to a news service in our script, we wait for the call to complete before we make the next call. The more services we add, the more we have to wait. This is because the built-in tools that PHP provides us with are in their nature designed for a synchronous programming flow. This means that operations are done in a strict order and each operation we start must first end before the next one starts. This makes the programming experience nice, as it is really easy to follow and to reason about the flow. Also, most of the time a synchronous flow fits perfectly with our goals. However, in this particular example, the synchronous flow of our program is what in fact slows it down. Downloading data from external services is a slow operation and we have a bunch of downloads. However, nothing in our program requires the downloads to be done sequentially. If we could do the downloads concurrently, this would drastically improve the overall speed of our service.

A little bit about I/O operations

Before we continue, let’s talk a little about what happens when we work with any input/output operations. Whether we are working with a local file or talking to a device in our computer or communicating over a network, pretty much the flow is the same. It goes something like this.

When sending/writing data…

  1. There is some sort of memory which acts as an output buffer. It may be allocated in the RAM or it may be memory on the device we are talking to. In any case, this output buffer is limited in size.
  2. We write some of the data we want to send to the output buffer.
  3. We wait for the data in the output buffer to get sent/written to the device with which we are communicating.
  4. Once this is done, we check if there is more data to send/write. If there is, we go to 2. If not, we go back to whatever we were doing immediately before we requested the output operation (we return).

When we receive data a similar process occurs.

  1. There is an input buffer. It also is limited in size.
  2. We make a request to read some data.
  3. We wait while the data is being read and placed into the input buffer.
  4. Once a chunk of data is available, we append its contents in our own memory (in a variable probably).
  5. If we expect more data to be received, we go to 3. Otherwise we return the read data to the procedure which requested it and carry on from where we left off.

Notice that in each of the flows there is a point in which we wait. The waiting point is also in a loop, so we wait multiple times, accumulating waiting time. And because output and input operations are super-slow compared to the working speed of our CPU, waiting is what the CPU ends up spending most of its time doing. Needless to say, it doesn’t matter how fast our CPU or PHP engine is when all they’re doing is waiting for other slow things to finish.

Lucky for us, there is something we can do.


The above processes describe what we call blocking I/O operations. We call them blocking, because when we send or receive data the flow of the rest of the program blocks until the operation is finished. However, we are not in fact required to wait for the finish. When we write to the buffer we can just write some data and instead of waiting for it to be sent, we can just do something else and come back to write some more data later. Similarly, when we read from an input buffer, we can just get whatever data there is in it and continue doing something else. At a later point we can revisit the input buffer and get some more data if there is any available. I/O operations which allow us to do that are called non-blocking. If we start using non-blocking instead of blocking operations we can achieve the concurrency we are after.

Concurrently downloading files

At this point it would be a good idea for our team to look into the existing tools for concurrent asynchronous programming with PHP, like ReactPHP and AMPHP. However, our team is imaginary and is in the lead role of a proof-of-concept article, so they are going to take the crooked path and try to reinvent the wheel.

Now that we know what are blocking and non-blocking I/O operations, we can actually start making progress. Currently when we are fetching data from news services we have a flow like the following:

  • Get all the data from service 1
  • Get all the data from service 2
  • Get all the data from service 3
  • …
  • Get all the data from service n

Instead, the flow we want to have would look something like the following:

  • Get a little bit of data from service 1
  • Get a little bit of data from service 2
  • Get a little bit of data from service 3
  • Get a little bit of data from service n
  • Get a little bit of data from service 1
  • Get a little bit of data from service 3
  • Get a little bit of data from service 2
  • We have collected all the data

In order to achieve this, we first need to get rid of file_get_contents().

Reimplementing file_get_contents()

The file_get_contents() function is a blocking one. As such, we need to replace it with a non-blocking version. We will start by re-implementing its current behavior and then we will gradually refactor towards our goal.

Below is our drop-in replacement for file_get_contents().

function fetchUrl(string $url) {
    $host = parse_url($url)['host'];
    $fp = @stream_socket_client("tcp://$host:80", $errno, $errstr, 30);
    if (!$fp) {
        throw new Exception($errstr);
    }
    stream_set_blocking($fp, false);
    fwrite($fp, "GET / HTTP/1.1\r\nHost: $host\r\nConnection: close\r\nAccept: */*\r\n\r\n");

    $content = '';
    while (!feof($fp)) {
        $bytes = fgets($fp, 2048);
        $content .= $bytes;
    }
    return $content;
}

Let’s break down what is happening:

  1. We open a TCP socket to the server we want to contact.
  2. We throw an exception if there is an error.
  3. We set the socket stream to non-blocking.
  4. We write an HTTP request to the socket.
  5. We define a variable $content in which to store the response.
  6. We read data from the socket and append it to the response received so far.
  7. We repeat step 6 until we reach the end of the stream.

Note the stream_set_blocking() call we make. This sets the stream to non-blocking mode. We feel the effect of this when we later call fgets(). The second parameter we pass to fgets() is the number of bytes we want to read from the input buffer (in our case – 2048). If the stream mode is blocking, then fgets() will block until it can give us 2048 bytes or until the stream is over. In non-blocking mode, fgets() will return whatever is in the buffer (but no more than 2048 bytes) and will not wait if this is less than 2048 bytes.

 

Although we are now using non-blocking input, this function still behaves like the original file_get_contents(). Because of the loop in it, once we call it, we will be stuck until it’s complete. We need to get rid of this loop, or rather – move it out of the function.

We can break down what the function does in four steps:

  1. Initialization – opening the socket and writing the request
  2. Checking if we’ve reached the end of the stream
  3. Reading some data if not
  4. Returning the data if yes

Disregarding the loop, we can organize those parts in a class. The first three steps we will implement as methods, and instead of returning the data, we will simply expose the buffer as public.

class URLFetcher
{
    public string $content = '';
    private $fp;
    public function __construct(private string $url) {}

    public function start(): void {
        $host = parse_url($this->url)['host'];
        $this->fp = @stream_socket_client(...);
        if (!$this->fp) {
            throw new Exception($errstr);
        }
        stream_set_blocking($this->fp, false);
        fwrite($this->fp, "GET …");
    }

    public function readSomeBytes(): void {
        $this->content .= fgets($this->fp, 2048);
    }

    public function isDone(): bool {
        return feof($this->fp);
    }
}

Rebuilding the loop

Now we need to rebuild the loop. This time, instead of executing one loop per file, we want to have multiple files in one loop.

Because we now have many news services to fetch data from, we have refactored our initial code to hold their names and URLs in an array.

$services = [
    'europe' => 'https://api.europe-news.org?q=%s',
    'asia' => 'https://api.asia-news.org?s=%s',
    ...
];

For each service we will create a URLFetcher and ‘start’ it. We will also keep a reference to each of the fetchers.

$fetchers = [];
foreach ($services as $name => $url) {
    $fetcher = new URLFetcher(sprintf($url, $topic));
    $fetcher->start();
    $fetchers[$name] = $fetcher;
}

Now we will add the loop in which we will iterate through the fetchers, reading some bytes from each of them upon each iteration.

$finishedFetchers = [];
while (count($finishedFetchers) < count($fetchers)) {
    foreach ($fetchers as $name => $fetcher) {
        if (!$fetcher->isDone()) {
            $fetcher->readSomeBytes();
        } else if (!in_array($name, $finishedFetchers)) {
            $finishedFetchers[] = $name;
        }
    }
}

The $finishedFetchers array helps us track which fetchers have finished their work. Once all of the fetchers are done, we exit the loop. The data gathered is accessible through the $content property of each fetcher. This simple way of downloading data concurrently gives us an incredible performance boost.

Having successfully solved our performance issues, we beat the competition and our business continues to grow. With it – the requirements towards our engine.

One of the new features we need to implement in the next trimester is a history of all the topics our users have searched for and the results they got for them. For this we want to use a SQL database, but when attempting to add it to the mix, the numerous inserts we perform for each topic slow down our service significantly. We already know what the problem is – the execution of the database queries is blocking and thus each insert delays the execution of everything else. We immediately take action and develop our own implementation of functionality for concurrent DB inserts. However, adding those to the loop we have proves to be quite a mess. The inserts need looping and tracking of their own, but they also need to track the requests to the services, because we cannot do an insert before having the data from the respective news service. Once again, we have to rethink our lives.

Generalizing the Loop

It is clear that if we want to take advantage of other non-blocking operations, we need some sort of handy, generic way to add more things to the ‘driving’ loop. We need a loop which makes it possible to dynamically add more tasks to it for execution. It turns out creating such a loop is quite simple.

class Loop 
{
    private static array $callbacks = [];

    public static function add(callable $callback) 
    {
        self::$callbacks[] = $callback;
    }

    public static function run() 
    {
        while (count(self::$callbacks)) {
            $cb = array_shift(self::$callbacks);
            $cb();
        }
    }
}

The $callbacks array acts as a FIFO queue. At any point in our program we can add functions to it to get executed. Once we call the run() method, functions on the queue will start being executed. The run()  method will run until there are no callbacks in the queue. This can potentially be forever as each of the callbacks may add new callbacks while being executed.
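As a tiny illustration (not part of our engine), callbacks can enqueue further callbacks while running, which is exactly what makes interleaving possible:

Loop::add(function () {
    echo "task A, step 1\n";
    // re-queue a continuation; it runs after whatever is already queued
    Loop::add(fn () => print("task A, step 2\n"));
});
Loop::add(fn () => print("task B\n"));

Loop::run();
// prints: task A, step 1 / task B / task A, step 2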

The next step would be to adapt our downloading tools. We can create a small function to work with our URL fetcher class and with the loop.

function fetchUrl(string $url) {
    $fetcher = new URLFetcher($url);
    Loop::add(fn () => $fetcher->start());

    $tick = function () use ($fetcher, &$tick) {
        if (!$fetcher->isDone()) {
            $fetcher->readSomeBytes();
            Loop::add($tick);
        }
    };
    Loop::add($tick);
}

In this new version of fetchUrl() we instantiate a fetcher and add a callback to the loop which will start the download. Then we create a closure, which we also add to the loop. When called, the closure checks whether the fetcher is done; if it is not, it reads some bytes and adds itself to the loop again. This will ‘drive’ reading from the stream until the end is reached.

All we have to do now is add all our services to the loop and start it:

foreach ($services as $url) {
    fetchUrl($url);
}
Loop::run();

This will indeed download the data from all of the services we need, but we have a major problem: we don’t have any means to get the results. We cannot get them from fetchUrl(), because it returns before the download has even started. We also want to record the fetched results to the database (remember the new feature we’re implementing), and we want to do this during the download. Otherwise we would have to wait for a new loop to record things, and this would slow us down.


The solution to our problems is to add one more parameter to fetchUrl(): a callback function which will get called when downloading the data is complete. As a parameter this callback will take the downloaded data, and in its body it will initiate the insertion into the database.

Below is the new fetchUrl() with the changes:

function fetchUrl(string $url, callable $done) {
    $fetcher = new URLFetcher($url);
    Loop::add(fn () => $fetcher->start());

    $tick = function () use ($fetcher, $done, &$tick) {
        if (!$fetcher->isDone()) {
            $fetcher->readSomeBytes();
            Loop::add($tick);
        } else {
            $done($fetcher->content);
        }
    };
    Loop::add($tick);
}

And now the updated initialization:

$results = [];
foreach ($services as $name => $url) {
    fetchUrl(
        $url,
        function (string $content) use ($name, &$results) {
            $results[$name] = $content;
            insertIntoDatabase($content);
        }
    );
}
Loop::run();

The callback now collects the results from the news service and initiates the database insert. The database insert will use similar techniques and will take advantage of the Loop to run concurrently with the other tasks and thus we eliminate the need to wait for another loop.
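The article leaves insertIntoDatabase() as an exercise. As a rough sketch under assumptions – mysqli with its asynchronous query mode, a made-up search_history table and placeholder credentials, and one connection per insert to keep things simple – a loop-friendly version might look like this:

function insertIntoDatabase(string $content): void {
    // Placeholder credentials and schema – adjust to your setup.
    $mysqli = new mysqli('localhost', 'user', 'secret', 'history');
    $mysqli->query(
        "INSERT INTO search_history (content) VALUES ('"
        . $mysqli->real_escape_string($content) . "')",
        MYSQLI_ASYNC // returns immediately, the query runs in the background
    );

    $tick = function () use ($mysqli, &$tick) {
        $read = $error = $reject = [$mysqli];
        // Zero timeout: just peek, never block the loop.
        if (mysqli::poll($read, $error, $reject, 0) > 0) {
            $mysqli->reap_async_query(); // collect the result – we are done
        } else {
            Loop::add($tick); // not finished yet – check again next turn
        }
    };
    Loop::add($tick);
}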

Error Handling

There are many things that can go wrong while downloading data from the Internet, but in our example we only throw one exception, from within the start() method of the URLFetcher. For the sake of simplicity, we are going to keep it this way. You may have noticed that so far we haven’t dealt with this exception at all. Time to address this oversight.

A naive colleague from our imaginary team tried to handle the issue by enclosing the calls to fetchUrl() in a try-catch block like this:

foreach ($services as $name => $url) {
    try {
         fetchUrl(...);
    } catch (Exception $e) {
         ...
    }
}
Loop::run();

The production environment quickly and painfully demonstrated to our team that the exception somehow slipped out of the try-catch block and made its way, unhandled, into our script to break it.

Well, the thing is, fetchUrl() does not actually throw any exceptions. It merely adds callbacks to the loop. One of those callbacks throws an exception (the one initializing the fetcher), but it does not get called until later on. It is only when we start the loop (call Loop::run()) that the exceptions start being thrown. Enclosing the Loop::run() call in a try-catch block would allow us to catch exceptions thrown from within it, but at this level of handling we won’t know what threw them. And even if we did know that, how would we return to the flow of the respective function after handling the error?

The way we can deal with this situation is by adding one more callback parameter to the fetchUrl() function. The new callback will get called whenever an error occurs. So fetchUrl() will look something like this:

function fetchUrl(string $url, callable $done, callable $onerror) {
    $fetcher = new URLFetcher($url);
    Loop::add(function () use ($fetcher, $onerror) {
        try {
            $fetcher->start();
        } catch (Exception $e) {
            $onerror($e);
        }
    });

    $tick = function () use ($fetcher, $done, $onerror, &$tick) {
        if (!$fetcher->isDone()) {
            try {
                $fetcher->readSomeBytes();
            } catch (Exception $e) {
                $onerror($e);
                return; // an error ends this download – stop re-queueing the tick
            }
            Loop::add($tick);
        } else {
            $done($fetcher->content);
        }
    };
    Loop::add($tick);
}

And the calling code, respectively, would now look like this:

foreach ($services as $name => $url) {
    fetchUrl(
        $url, 
        function (string $content)  {...}, 
        function (Exception $e) {...}
    );
}
Loop::run();

Now we can handle error situations properly via the new callback.

Retrospect

By the end of the story, in order to allow concurrent operations, our imaginary team had started practicing asynchronous programming in a single-threaded environment, based on non-blocking I/O and an event loop. That last sentence is loaded with terminology, so let us briefly go over the terms.

Concurrency

Both in computing and in general contexts, this means dealing with more than one thing at a time. In our example we were downloading data from multiple Internet services and inserting entries into a database at the same time.

Asynchrony

“Asynchrony, in computer programming, refers to the occurrence of events independent of the main program flow and ways to deal with such events.”

Wikipedia

In our example the main program flow was dealing with downloading data, inserting records, encoding for the clients and sending to them. The “events” outside of the main flow were in fact the events of new data being available for reading, the successful completion of sending data, etc.

Non-blocking I/O

We based our work on the ability to “query” I/O for its availability. The way we did it was to periodically check whether we could use the “device”. This is called polling. Since polling requires CPU cycles, our program becomes more CPU-demanding than it needs to be. It would have been smarter to “outsource” the polling to some sort of lower-level “actor”, like our operating system or a specialized library, and communicate with it via events, interrupts or another mechanism. In any case, whatever this mechanism for communicating with I/O devices is, at the end of the day it is still built upon non-blocking I/O polling and maybe hardware interrupts.
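As an aside, PHP can already hand this waiting over to the operating system: stream_select() blocks until one of the watched streams becomes ready. A minimal sketch, assuming $streams holds a set of non-blocking stream resources:

$read = $streams;        // streams we want to read from
$write = $except = null; // not interested in writability or errors
// Let the OS block for up to one second until at least one stream is ready.
if (stream_select($read, $write, $except, 1) > 0) {
    foreach ($read as $stream) {
        $chunk = fread($stream, 8192); // read only from ready streams
    }
}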

 

Event loop

Notice we didn’t call our loop an event loop, but just a “loop”. This was intentional: it would have brought confusion, as we hadn’t mentioned events anywhere else in the example. An event loop is just a specific version of a loop like ours. It is designed to work in conjunction with event-based I/O communication, hence the name “event loop”. It allows callbacks to be executed when a certain event occurs, making it more “user-friendly” for the programmer, but essentially it is the same thing. Other names for an event loop are message pump, message dispatcher, run loop and more.

… and last, but not least…

Single-threaded

PHP has a single-threaded model of execution and this will most probably always be the case. This means that all of the instructions to the engine (and from there to the CPU) are executed one-by-one and nothing ever happens in parallel. But wait! We just created a program which downloads data in parallel. True – the downloads happen in parallel but the instructions which control the download flow do not. We simply switch from one download to the other, but never in fact execute two things at the same time. This leads to a problem which we must always keep in mind when doing async single-threaded programming. Because instructions are not in fact executed in parallel, if we throw a heavy computation somewhere inside the loop, everything else will be stuck until the computation is complete.

Let’s look into another example in order to better illustrate the problem of blocking the loop.

We want to create a server. As any other server does, it will listen on a port and await connections from clients. Once a client is connected they will be able to make some sort of a request to the server and the server will serve it. Of course, we need to be able to serve multiple clients at the same time.

We can use the technique we’ve discussed so far to create such a server. It would open a port and poll it for connections. When a client connects, it will use non-blocking I/O to communicate with the client and will continue to switch between this communication, checking for new connections, and serving already established connections. However, if, for example, a client requests the server to compute a very long Fibonacci sequence, the server will be stuck on this and will not be able to do anything else for the other clients before it finishes. Connections will time out, new ones will not be accepted, etc. Essentially – the server will be gone. If we want our server to execute heavy computational tasks and still be responsive, we need actual parallelism of execution, either by multi-threading or by spawning new processes to carry the heavy work for us.

So why don’t we do this by default, instead of dealing with this loop-switching thing? Starting and switching between threads and processes is a lot heavier and slower than “staying in” one process/thread and doing the work ourselves. This works perfectly for I/O-heavy, CPU-light programs (and most of what we do falls into this category). If we do need those CPU cycles, however, multi-threading/processing is the way to go.

Final words

These were just the very basic oversimplified building blocks of what asynchronous programming is about. There is a lot more to be said, but this is an article, not a book, so we have to stop somewhere. If you are interested in the topic, I would suggest further research on promises, coroutines and fibers.

Enjoy the rabbit hole that asynchronous programming is!


New awesome features of MySQL https://phpconference.com/blog/new-awesome-features-of-mysql-blog/ Thu, 06 Apr 2023 09:41:24 +0000 https://phpconference.com/?p=85194 MySQL, like any other technology, is constantly evolving and brings many new features. However, not everyone has the time to keep up with all the new features in this popular open-source database. Fortunately, this article provides a quick catch-up!


Before we dive into the features, let’s first look at how versioning works in MySQL, which will provide some interesting insights. The first public version of the latest major version, MySQL 8, was 8.0.11, released on April 19, 2018. Since then, there have been over twenty individual releases. In the terminology of semantic versioning, these would be called “patch” releases. Unfortunately, MySQL does not adhere to semantic versioning, where only major releases may introduce breaking changes. As a result, breaking changes can occur even in patch releases, such as the removal of TLSv1 and TLSv1.1 in 8.0.28 or changes in the MySQL protocol in 8.0.24. Hence, caution should be exercised when upgrading to the latest version. On the other hand, “patch” versions can bring many interesting features, which we will now explore.

Generated columns

We will start with a “left over” feature introduced in MySQL 5.7.7 – generated columns [1]. Generated columns provide a way to store automatically generated data in a table. The value of this data is computed by a predefined expression and cannot be changed manually, but it can be indexed. To create a generated column, we use the keyword “GENERATED ALWAYS AS (expression)“. An example of this can be seen in the following code, where we create a table with a virtual column “full_name” that is composed of the “name” and “surname” columns (Listing 1).

CREATE TABLE users (
   id        INT AUTO_INCREMENT PRIMARY KEY,
   name      VARCHAR(60) NOT NULL,
   surname   VARCHAR(60) NOT NULL,
   full_name VARCHAR(120) 
   GENERATED ALWAYS AS (CONCAT(name, ' ', surname))
);

If we select the data from the table, the last column will contain the full name of the user (Listing 2). There are certain limitations to the expressions used in virtual columns. They cannot reference another generated column, auto-increment columns, use columns outside the table, or contain non-deterministic functions such as NOW().

 

SELECT * FROM users;
+----+------------+-----------+----------------+
| id | name       | surname   | full_name      |
+----+------------+-----------+----------------+
|  1 | Jane       | Doe       | Jane Doe       |
|  2 | Janie      | Stiles    | Janie Stiles   |
|  3 | Richard    | Miles     | Richard Miles  |
+----+------------+-----------+----------------+

There are two types of generated columns: virtual and stored. The value of a virtual column is resolved during a read operation, while the value of a stored column is evaluated during an insert or update operation and then stored on disk. The type of the column can be specified as the last value in the column definition, as either “VIRTUAL” or “STORED“. Examples of how to use both types can be seen in Listing 3.

CREATE TABLE users_alter (
   id         INT AUTO_INCREMENT PRIMARY KEY,
   name       VARCHAR(60) NOT NULL,
   surname    VARCHAR(60) NOT NULL,
   full_name  VARCHAR(120) 
   GENERATED ALWAYS AS (CONCAT(name , ' ', surname )) VIRTUAL,
   hash varchar(32) 
   GENERATED ALWAYS AS (MD5(CONCAT(name , ' ', surname ))) STORED
);

SELECT * FROM users_alter;

+----+------------+-----------+----------------+------------+
| id | name       | surname   | full_name      | hash       |
+----+------------+-----------+----------------+------------+
|  1 | Jane       | Doe       | Jane Doe       | f001124... |
|  2 | Janie      | Stiles    | Janie Stiles   | 55bac62... |
|  3 | Richard    | Miles     | Richard Miles  | f32f3cd... |
+----+------------+-----------+----------------+------------+

As virtual columns are not stored on disk, their usage does not require any additional storage space. Since the results are not stored, INSERT and UPDATE queries also come with no overhead. However, read operations may be slower, as the results need to be evaluated during the read. Stored columns, on the other hand, have a performance penalty only during data-modifying queries.

There are many potential use cases for virtual columns, including simplifying and unifying queries, caching complicated conditions, indexing complex values, and extracting values from JSON data columns. They also serve as a foundation for other features that we will explore later.

JSON support

JSON support [2] has been available since the release of MySQL 5.7.9, with significant improvements made in MySQL 8. It is implemented in the form of a native column data type, providing automatic validation and an optimised binary storage format. While individual values within a JSON document cannot be directly indexed, functional indexes can be used to achieve this. JSONPath can be used to select values within a JSON document.

Inserting JSON data is as straightforward as inserting regular strings and does not require any special care. Values within a JSON document can be used in a field list by using the column path operator (->) and a JSONPath to the field. However, this can result in values being wrapped in quotes, which can be inconvenient. This can be remedied by using the column inline path operator (->>) since MySQL 8.0. Both operators can be used in other parts of a query, such as the WHERE condition (Listing 4).

SELECT id, browser->>'$.os' os
FROM activity
WHERE browser->>'$.name'='Firefox';

+----+---------+
| id | os      |
+----+---------+
|  2 | Windows |
+----+---------+

A partial update is implemented using a set of JSON_* functions. The most useful of these functions is probably JSON_SET(json_doc, path, val), which replaces existing values and adds new ones as necessary. In our example, we will attempt to rename Firefox back to its original name, Phoenix (Listing 5).

UPDATE activity
SET `browser` = JSON_SET(
   `browser`,
   '$.name',
   'Phoenix'
)
WHERE browser->>'$.name'='Firefox';

The JSON_SET function expects a column name as its first parameter, in which the JSON data is stored. The second parameter is the JSONPath, where in our case, “$.name” represents the property name. The last parameter represents the value itself, which in our case is the string ‘Phoenix’. Other useful functions include JSON_INSERT, JSON_REPLACE, and JSON_REMOVE.
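To see how the three differ, here is a small illustration on an inline JSON document (the values are made up): JSON_INSERT only adds properties that do not yet exist, JSON_REPLACE only changes existing ones, and JSON_REMOVE deletes them.

SELECT
  JSON_INSERT('{"name": "Phoenix"}', '$.os', 'Linux')  AS inserted,
  -- {"name": "Phoenix", "os": "Linux"}
  JSON_REPLACE('{"name": "Phoenix"}', '$.os', 'Linux') AS replaced,
  -- {"name": "Phoenix"} – there is no $.os to replace
  JSON_REMOVE('{"name": "Phoenix"}', '$.name')         AS removed;
  -- {}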


Generated columns can be easily combined with JSON data types, as the column inline path operator (->>) can be used in column expressions. This allows us to extract some of the properties and make them available as regular fields (Listing 6).

CREATE TABLE activity (
   id           int auto_increment primary key,
   event_name   ENUM('page-view', 'user-login'),
   user_id      int,
   properties   json,
   browser      json,
   browser_name varchar(20)
   GENERATED ALWAYS AS (`browser` ->> '$.name')
);

JSON support provides the ability to mix document databases with relational databases, offering at least in theory the benefits of both. However, this can be complex and caution should be exercised. Possible use cases include error logging, application event logging, and piloting new ideas.

Instant DDL

DDL stands for Data Definition Language and is an umbrella term for all schema-changing commands. Instant DDL allows for schema changes in InnoDB without making the data in the schema unavailable. It has been partially supported since 8.0.12 and was extended in 8.0.29, making it the first big addition since the initial release of MySQL 8.0 (8.0.11). There is no need to do anything special to enable online DDL, but it is advisable to understand what is happening under the hood.

InnoDB supports three algorithms: COPY, INPLACE, and INSTANT. The difference between INSTANT and the other algorithms is that INSTANT only performs metadata changes and does not touch the data file of the table. The only downside is that INSTANT is only supported for limited DDL operations, namely: adding a column, dropping a column, renaming a column, modifying a column default value, and renaming a table.

Instant DDL is not supported in some situations, such as for tables that use a compressed row format, tables with a FULLTEXT index, temporary tables, and stored columns. Additionally, there are unsupported combinations. For example, a column drop is an instant operation, but an index drop is not. If we try to perform both in one statement, it will fall back to one of the older algorithms, which can negatively impact access to the changed table. Another possible issue is the difference between MySQL versions: prior to 8.0.29, a column could only be added as the last one.

Fortunately, the algorithm can be forced by using the keyword ALGORITHM with the modifier INSTANT, for example: ALGORITHM=INSTANT. If it is not possible to use this algorithm, the operation will fail instead of falling back to another algorithm. Even then, it is always advisable to consult the documentation [3] before making any changes.
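A minimal example of forcing the algorithm on the users table from Listing 1 (the new column is illustrative):

ALTER TABLE users ADD COLUMN nickname VARCHAR(60), ALGORITHM=INSTANT;
-- Fails with an error instead of silently copying the table
-- if the operation cannot be performed instantly.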

 

Indexes

There have been numerous improvements in the field of indexing. Several new types of indexes have been introduced, including multi-valued, functional, descending, and invisible. The multi-valued index [4], introduced in 8.0.17, can be defined on a column that stores an array of values and is primarily used to index JSON arrays. The index is defined by casting the column to an array during index creation with “CAST(… AS … ARRAY)”, as demonstrated in Listing 7.

CREATE TABLE workers (
  id       BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  modified DATETIME DEFAULT CURRENT_TIMESTAMP,
  worker_info JSON,
  INDEX zips((CAST(worker_info ->'$.zipcode' AS UNSIGNED ARRAY)))
);

The multi-valued index can be leveraged using specific condition functions such as MEMBER OF (json_array), JSON_CONTAINS (target, candidate[, path]), and JSON_OVERLAPS (json_doc1, json_doc2).

For example, if we want to search for zip codes stored in the “zipcode” property we should use the MEMBER OF function, which will return true if the specified value is an element in the provided array. A practical example can be found in Listing 8.

SELECT * FROM workers
WHERE 123 MEMBER OF(worker_info->'$.zipcode');

+----+---------------------+----------------------------------------------------+
| id | modified            | worker_info                                        |
+----+---------------------+----------------------------------------------------+
|  2 | 2019-06-29 22:23:48 | {"user": "Alice", "zipcode": [456, 123, 94582]}    |
|  3 | 2022-08-29 12:56:12 | {"user": "Bob", "zipcode": [94477, 123]}           |
|  5 | 2012-01-29 13:23:35 | {"user": "Ana", "zipcode": [456, 123]}             |
+----+---------------------+----------------------------------------------------+

The second function, JSON_CONTAINS, returns true if a given JSON document is contained within a target JSON document. This can be useful if we want to match rows with multiple values. For example, if we want to search for rows that have the zip codes 123 and 456, we can use JSON_CONTAINS (Listing 9).

SELECT * FROM workers WHERE 
JSON_CONTAINS(worker_info->'$.zipcode', CAST('[123,456]' AS JSON));

+----+---------------------+----------------------------------------------------+
| id | modified            | worker_info                                        |
+----+---------------------+----------------------------------------------------+
|  2 | 2019-06-29 22:23:48 | {"user": "Alice", "zipcode": [456, 123, 94582]}    |
|  5 | 2012-01-29 13:23:35 | {"user": "Ana", "zipcode": [456, 123]}             |
+----+---------------------+----------------------------------------------------+

The last-mentioned function, JSON_OVERLAPS, can be used if we want to select rows that have at least one of the specified values. For example, if we provide the zip codes 123 and 456, it will match values that have either 123, 456 or both. You can see an example in Listing 10.

SELECT * FROM workers WHERE 
JSON_OVERLAPS(worker_info->'$.zipcode', CAST('[123,456]' AS JSON));

+----+---------------------+----------------------------------------------------+
| id | modified            | worker_info                                        |
+----+---------------------+----------------------------------------------------+
|  1 | 2000-12-10 18:23:12 | {"user": "Russell", "zipcode": [456, 94536]}       |
|  2 | 2019-06-29 22:23:48 | {"user": "Alice", "zipcode": [456, 123, 94582]}    |
|  3 | 2022-08-29 12:56:12 | {"user": "Bob", "zipcode": [94477, 123]}           |
|  5 | 2012-01-29 13:23:35 | {"user": "Ana", "zipcode": [456, 123]}             |
+----+---------------------+----------------------------------------------------+

Functional indexes [5] have been available since 8.0.13 and allow us to use the result of a function as the basis for an index, thereby speeding up queries. One possible use case is filtering by only a part of a value. We can demonstrate this by aggregating prices for a specific month. This is usually done by calling the MONTH function on one of the date columns in the filter condition: SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;

A functional index is created by using the function in the index definition, as demonstrated in Listing 11. This is the only step required. MySQL will determine whether the index is efficient enough to use when evaluating the query.


CREATE TABLE `products` (
   `id`          int unsigned NOT NULL PRIMARY KEY AUTO_INCREMENT,
   `price`       integer DEFAULT NULL,
   `create_time` timestamp NULL DEFAULT NULL,
   KEY `functional_index` ((month(`create_time`)))
) ENGINE=InnoDB;

Functional indexes are internally implemented as hidden virtual generated columns and therefore have the same limitations. They count towards the total limit of columns in a table and can only use functions that are permitted for generated columns. Functional indexes complement JSON columns well. For example, we can create a functional index that directly extracts a value from JSON. To do this, we use a combination of the CAST, JSON_UNQUOTE, and JSON_EXTRACT functions or, since 8.0.21, the JSON_VALUE function, which combines all three. If we want to use an index created with JSON_VALUE, we must use the same function in our query, as demonstrated in Listing 12.

CREATE TABLE data(
   j JSON,
   INDEX i1 ( (JSON_VALUE(j, '$.id' RETURNING UNSIGNED)) )
);

SELECT * FROM data WHERE JSON_VALUE(j, '$.id' RETURNING UNSIGNED) = 123;

Descending indexes [6] have been available since 8.0.1. We can describe them as indexes that store key values in descending order. The strength of descending indexes lies in combining them with other columns in a multiple-column index. This can improve the performance of the following pattern: “ORDER BY field1 DESC, field2 ASC LIMIT N“. The pattern is frequently used to display the most recently inserted items, with the name as a secondary sorting condition. An example with articles as the items can be found in Listing 13.

CREATE TABLE `articles` (
   `id` int(11) NOT NULL AUTO_INCREMENT,
   `name` varchar(100) DEFAULT NULL,
   `created` datetime DEFAULT NULL,
   PRIMARY KEY (`id`),
   KEY `created_desc_name_asc` (`created` DESC,`name`)
) ENGINE=InnoDB;

SELECT * FROM articles ORDER BY created DESC, name ASC limit 10;

+----+------+---------------------+
| id | name | created             |
+----+------+---------------------+
|  1 | foo  | 2022-10-01 16:20:52 |
|  3 | quz  | 2022-06-18 16:21:27 |
|  2 | bar  | 2022-06-09 16:21:08 |
+----+------+---------------------+

Finally, there are invisible indexes [7]. Invisible indexes are fully maintained and kept in sync with the table data, just like regular MySQL indexes. The only difference is that they are not used during query execution. Any regular index can easily be converted into an invisible index, and vice versa, with an ALTER operation, as demonstrated in Listing 14.
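Listing 14 itself appears to be missing from this version of the article; based on the MySQL documentation, a minimal reconstruction using the articles table from Listing 13 would be:

-- Listing 14 (reconstructed)
ALTER TABLE articles ALTER INDEX created_desc_name_asc INVISIBLE;
-- ... measure the effect on your queries ...
ALTER TABLE articles ALTER INDEX created_desc_name_asc VISIBLE;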

Turning the visibility of an index on or off is an instant operation, so there is no need to worry about table locking. This points to the main use case of invisible indexes: they help us test the effect of removing an index without making a permanent change. There have been several other improvements related to indexes, such as histograms, simultaneous index building (available since 8.0.27), and CHECK constraints.


CTE – Common table expression

Common Table Expressions (CTEs) [8] were introduced in version 8.0.1 and provide a lightweight alternative to derived tables, views, and temporary tables. They simplify complex joins and subqueries, leading to more readable and performant queries. A CTE creates a short-lived result set from a provided query and allows the same statement to use it. However, a CTE exists only for a single query and cannot be reused; it has a more limited scope than a temporary table.

Fig. 1: EAV model and its transformation into regular table

We will demonstrate the use of CTEs in combination with the Entity-Attribute-Value (EAV) model. In this model, data is stored in three columns: an entity’s unique identifier, an attribute name, and an attribute value. This allows for a fully dynamic data model in relation to the database, without the need for schema changes. An example of the EAV model can be seen in Fig. 1. In connection with the EAV model, pivoting is often required. Pivoting enables us to transform EAV data into a regular table (Listing 15), and this is where CTEs come into play.

SELECT user_id,
   MAX(CASE WHEN meta_key='first_name' THEN meta_value END) as first,
   MAX(CASE WHEN meta_key='last_name' THEN meta_value END) as last
FROM `wp_usermeta`
GROUP BY user_id

+---------+--------+---------+
| user_id | first  | last    |
+---------+--------+---------+
|       1 | Emma   | Obrien  |
|       2 | Nial   | Casey   |
|       3 | Keeley | Brookes |
|       4 | Bert   | Mccoy   |
|       5 | Alyce  | Sheldon |
+---------+--------+---------+

We can use a pivoting query as the base for a CTE. A CTE is built around the following pattern: “WITH cte_table_name AS (query used as the base for the CTE) query executed on the CTE table“. After filling the CTE table with the result of the pivot operation, we execute a query that selects data from the CTE table (Listing 16).

WITH cte AS (
   SELECT user_id,
      MAX(CASE WHEN meta_key='first_name' THEN meta_value END) as first,
      MAX(CASE WHEN meta_key='last_name' THEN meta_value END) as last
   FROM `wp_usermeta`
   GROUP BY user_id
)
SELECT * FROM cte
WHERE first = "Emma" OR last = "Sheldon"
ORDER BY first

There is also a special recursive variant of CTEs, in which the subquery refers to itself. This approach can be used to generate time series or traverse hierarchical or tree-structured data (Listing 17). In this case the CTE is extended with the RECURSIVE modifier. The CTE query is also a bit different: it is made up of two parts or members, seed and recursive, connected by a “UNION ALL” statement. The seed member represents the initial query, executed in the first iteration. The recursive member refers to the same CTE name (hence recursive) and generates all remaining rows for the main query during execution.

WITH RECURSIVE cte AS (
   initial_query    -- "seed" member
   UNION ALL
   recursive_query  -- recursive member referring the same CTE
)
SELECT * FROM cte;  -- main query

A demonstration of this can be seen in Listing 18, where we generate a list of ascending dates starting with the date from the seed member (2013-01-01) and subsequently increase this value by one day in the recursive query until we reach the condition specified in the query. The last part of the query is a plain select from the CTE table.

WITH RECURSIVE cte (n) AS
(
  SELECT '2013-01-01'
  UNION ALL
  SELECT n + INTERVAL 1 DAY FROM cte WHERE n < '2013-01-10'
)
SELECT * FROM cte;

+------------+
| n          |
+------------+
| 2013-01-01 |
| 2013-01-02 |
| 2013-01-03 |
| 2013-01-04 |
|     …      |

One of the most powerful uses of a recursive query is retrieving a tree structure from a database. We will demonstrate this with a list of pages in Listing 19. Each page has a parent, represented by its parent id. We start by selecting the parent element (“Home”), which does not have any parent itself, for the seed member. In the recursive part, we select the record from “pages_cte” and join it with the “pages” table. We can then use the values from “pages_cte” to create things such as breadcrumbs: “CONCAT(pc.path,’ -> ‘,pg.name)” or to show the item level: “pc.level + 1“.

CREATE TABLE pages(
   id         INT PRIMARY KEY AUTO_INCREMENT,
   name       VARCHAR(20),
   parent_id INT,
   FOREIGN KEY (parent_id) REFERENCES pages(id)
);

WITH RECURSIVE pages_cte(id, name, path, level) AS (
   SELECT id, name, CAST(name AS CHAR(100)), 1
   FROM pages
   WHERE parent_id IS NULL
   UNION ALL
   SELECT pg.id, pg.name, CONCAT(pc.path,' -> ',pg.name), pc.level+1
   FROM pages_cte pc JOIN pages pg ON pc.id=pg.parent_id)
SELECT * FROM pages_cte ORDER BY level;
+------+----------+---------------------------------------+-------+
| id   | name     | path                                  | level |
+------+----------+---------------------------------------+-------+
|  1   | Home     | Home                                  |     1 |
|  2   | Articles | Home -> Articles                      |     2 |
|  3   | Events   | Home -> Events                        |     2 |

Window functions

Window functions [9] offer aggregate-like functionality over a defined range of rows in a query and were introduced in version 8.0.2. The main difference is that they return a value for every row in the query result, in contrast to regular SQL aggregations, which collapse the rows automatically. Window functions are built around two keywords: “OVER“, which is mandatory and indicates the usage of a window function, and “PARTITION BY“, which is used to divide the rows into groups (Listing 20).

SELECT
   <aggregation>(field) OVER() AS field_name,
   <aggregation>(field) OVER(PARTITION BY field) AS field_name
FROM <table name>

 


The following example demonstrates the use of window functions on the “transactions” table, which holds information about sales across countries and products. One can query the table for statistics using regular aggregations (Listing 21) or a window function (Listing 22).

CREATE TABLE transactions(
   id      INT PRIMARY KEY AUTO_INCREMENT,
   year    INT,
   country VARCHAR(20),
   product VARCHAR(32),
   profit  INT
);

SELECT SUM(profit) AS total_profit FROM transactions;
+--------------+
| total_profit |
+--------------+
|         7535 |
+--------------+
SELECT country, SUM(profit) AS country_profit 
FROM transactions GROUP BY country ORDER BY country;
+---------+----------------+
| country | country_profit |
+---------+----------------+
| Finland |           1610 |
| India   |           1350 |
| USA     |           4575 |
+---------+----------------+

Both solutions return the same values representing the total profit and profit aggregated by country, but the window function does not collapse the result and returns every row.

SELECT
   year, country, product, profit,
   SUM(profit) OVER() AS total_profit,
   SUM(profit) OVER(PARTITION BY country) AS country_profit
FROM transactions
ORDER BY country, year, product, profit;

+------+---------+------------+--------+--------------+----------------+
| year | country | product    | profit | total_profit | country_profit |
+------+---------+------------+--------+--------------+----------------+
| 2000 | Finland | Computer   |   1500 |         7535 |           1610 |
| 2000 | Finland | Phone      |    100 |         7535 |           1610 |
| 2001 | Finland | Phone      |     10 |         7535 |           1610 |
| 2000 | India   | Calculator |     75 |         7535 |           1350 |
| 2000 | India   | Calculator |     75 |         7535 |           1350 |

The usefulness may seem questionable at first glance, but there are other window functions, such as “RANK“. The “RANK” function can be used to assign a rank to individual rows. Another interesting function is “FIRST_VALUE“, which returns the first value in the result set and can be used to calculate the difference between the value of the first row and the other rows, as shown in Listing 23.

SELECT
   year, country, product, profit,
   RANK() OVER(
      PARTITION BY `country`
      ORDER BY `profit` desc, id
   ) total_profit_rank,
   profit - FIRST_VALUE( profit ) OVER (
      PARTITION BY `country`
      ORDER BY `profit` desc, id
   ) profit_back_of_first
FROM transactions
ORDER BY country, total_profit_rank;
+------+---------+------------+--------+-------------------+----------------------+
| year | country | product    | profit | total_profit_rank | profit_back_of_first |
+------+---------+------------+--------+-------------------+----------------------+
| 2000 | Finland | Computer   |   1500 |                 1 |                    0 |
| 2000 | Finland | Phone      |    100 |                 2 |                -1400 |
| 2001 | Finland | Phone      |     10 |                 3 |                -1490 |

These are not the only functions available. For a complete list, refer to the documentation [10].


Closing notes

The latest versions of MySQL are packed with surprises, and even patch versions can introduce intriguing features. It is important to always check which patch version is being used and to consult the documentation and release notes. But above all, don’t be afraid to harness the full power of new features in MySQL!


Links & Literature

[1] https://dev.mysql.com/doc/refman/8.0/en/create-table-generated-columns.html

[2] https://dev.mysql.com/doc/refman/8.0/en/json.html

[3] https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html

[4] https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-multi-valued

[5] https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-functional-key-parts

[6] https://dev.mysql.com/doc/refman/8.0/en/descending-indexes.html

[7] https://dev.mysql.com/doc/refman/8.0/en/invisible-indexes.html

[8] https://dev.mysql.com/doc/refman/8.0/en/with.html

[9] https://dev.mysql.com/doc/refman/8.0/en/window-functions-usage.html

[10] https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html

Clean Code in PHP https://phpconference.com/blog/clean-code-in-php/ Thu, 23 Feb 2023 14:36:36 +0000 https://phpconference.com/?p=85077 Do you want to create software that is easy to maintain and has as few bugs as possible? Or write code that others, but also you, can still understand after weeks or even months? Maybe clean code is something for you.


Most software developers, of course, want to write “good” or “clean” code. But it’s not that trivial; otherwise, we wouldn’t be discussing it. But where do we begin? And what, after all, is “clean” code?

What exactly is “clean code”?

The term “clean code” was popularised by Robert C. Martin, also known as Uncle Bob, and his standard work of the same name [1]. There is no truly uniform definition; in essence, it is code that is written simply and directly and can be read, understood, and modified without difficulty by other developers. Clean code is free of duplication, does not obscure its authors’ intentions, and is thoroughly covered by automated tests.

Over the years, many proven solutions and principles in the field of software development have emerged, on which clean code is based. These are not limited to a single programming language; many of the principles and approaches discussed later in this article also apply to most object-oriented languages.

To be clear, writing clean code is a time-consuming process. Unfortunately, we cannot teach you all the necessary skills in this article. Rather, we want to provide you with the resources you need to get started on this path, as well as pique your interest.


Naming

It is not always easy to name things correctly, that is, in a meaningful way. That is why naming is so important in clean code.

Although it should be common knowledge that variables should no longer be abbreviated to a few letters, this practice persists. Count variables like $i and $j may still be borderline cases for simple loops, but using $row and $column instead, for example, when iterating over a CSV file, will help.

Names should be concise but meaningful. Placeholders like Manager or Processor add no value, and humorous names or puns should be avoided. Cryptic and unpronounceable abbreviations cause more confusion than they resolve. However, it is helpful to define fixed words for specific concepts. So, instead of using a mixture of set, update, and put for updating values, stick to one word.
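A tiny, made-up illustration of the last point – one verb per concept, applied consistently:

interface ProfileRepository
{
    // We settled on "update" for this concept – no setEmail()/putName() mix.
    public function updateEmail(int $userId, string $email): void;
    public function updateName(int $userId, string $name): void;
}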

So don’t be upset if you spend some time deciding on names for variables, classes, or methods. Take your time; it will be well worth it. Also, be open to change and listen to feedback from other developers.

Design principles

Clean code is based on several design principles that we should keep in mind when writing code. We will present a few basic principles in the following part, though you have probably encountered one or two of them elsewhere.

DRY

DRY (Don’t Repeat Yourself) is possibly the oldest programming principle. Even in the early days of software development, it was felt necessary to outsource repetitive code into subroutines, primarily to save precious memory. As programs grew, it became clear that redundant code should be avoided to ensure maintainability.

However, the opposite of DRY is all too common: copy-and-paste programming. A piece of code is quickly duplicated, and evil takes its course. Bugfixes or other changes must be made in multiple locations if they can be found at all. Each copy is then altered slightly to perfect the chaos. Resist the temptation to allow code duplication. It will undoubtedly come back to haunt you one day.

KISS

KISS (Keep It Simple, Stupid) reminds us to look for the simplest solution to a problem. Who hasn’t felt this way: we’re proud of our abilities and want to flaunt them. Then, at the next available opportunity, we overshoot and write a fantastic algorithm for which there is already a thoroughly tested function in the SPL (Standard PHP Library) that comes with PHP by default. A quick internet search on the issue in question would have saved us a lot of time. Clean code is as simple as possible. Adventurous one-liners may demonstrate how well we can juggle ternary operators, but they are much more difficult to read and maintain than an easily readable if statement over several lines, as shown in Listing 1.

// nested ternary operators
echo $number > 0 ? 'Positive' : ($number < 0 ? 'Negative' : 'Zero');
 
// if statement
if ($number > 0) {
    echo 'Positive';
} elseif ($number < 0) {
    echo 'Negative';
} else {
    echo 'Zero';
}

It doesn’t help anyone if a bug cannot be found quickly the next time the production system fails, because it has been successfully hidden in our code art. In any case, our colleagues will not thank us for it.

YAGNI

YAGNI (You Ain’t Gonna Need It) is a continuation of KISS: don’t program anything you don’t need right now. The code remains leaner, making it easier to maintain.

Chances are that the requirements will turn out differently than you expected and you will have to discard the code or radically change it. In the worst-case scenario, the unnecessary code remains in your software. After a while, nobody understands why it was written in the first place. And reading, understanding, testing, and changing source code costs time.

This is not to say that we should not plan ahead of time. If it is clear that a feature we are currently working on will be made multilingually translatable in the next sprint, it can’t hurt to keep this in mind and take sensible precautions.

YAGNI, like KISS, is meant to encourage us to simplify by always questioning our decisions. Is a Composer package really required for a single function? Is it really necessary to program a process ourselves, or are there ready-made solutions on the market?


SOLID

Of course, SOLID cannot be left out of this list. SOLID is an acronym for a set of five principles that are commonly used in object-oriented programming and are thus ideal for PHP. Let’s look at each principle in more detail below.

Single Responsibility Principle (SRP): The SRP states that a module (typically a class) should only change for one reason. This does not, however, imply that each class should have only one responsibility, as is commonly assumed [2]. Rather, it means that the module should be accountable to only one actor.

For example, if a module contains two functions, one of which is used by actor 1 and the other by actor 2, these two functions should not be in the same class. If changes are later required for one actor, the behaviour of the other function may be affected (and thus indirectly affect the other actor).

Open Closed Principle (OCP): The behaviour of a class should be extensible without requiring modification. We can accomplish this primarily through polymorphism, which is the use of a single interface for multiple manifestations of an object.

To be prepared for different use cases, we define an interface for the method rather than inflating the code with multiple if-else constructs. This interface is then implemented by different classes, which implement this method in a variety of ways, depending on the use case.

A common example is a function called readFile() that is supposed to read data from both CSV and Excel files. Rather than combining these two tasks in a single class, we define an interface for readFile(), which is then implemented by two classes: one for CSV files and one for Excel files.
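A sketch of how this could look in code (the interface and class names are illustrative, not prescribed by the article):

interface FileReaderInterface
{
    public function readFile(string $path): array;
}

class CsvFileReader implements FileReaderInterface
{
    public function readFile(string $path): array
    {
        // ... parse the CSV file ...
        return [];
    }
}

class ExcelFileReader implements FileReaderInterface
{
    public function readFile(string $path): array
    {
        // ... parse the Excel file ...
        return [];
    }
}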

Liskov Substitution Principle (LSP): Behind the complicated name lies the simple principle that subtypes must behave like their base type. They may only extend, but not change, the functionality of the base type.

These may be obvious properties, like return types, which an inheriting class should not change. When we use return types, we force the inheriting class to behave like the base class, at least in terms of return values, as shown in Listing 2.

class BaseClass
{
    public function getValue()
    {
        return 1;
    }
}
 
// violation of LSP because the
// return type changes
class SubClass extends BaseClass
{
    public function getValue()
    {
        return 'result: ' . parent::getValue();
    }
}

Without Return Types, the above example would be valid PHP code. If Return Types were used, the LSP violation would result in an error, as shown in Listing 3.

class BaseClass
{
    public function getValue(): int
    {
        return 1;
    }
}
 
// Causes a PHP error, because the return
// type does not match that of BaseClass
class SubClass extends BaseClass
{
    public function getValue(): string
    {
        return 'result: ' . parent::getValue();
    }
}

However, less obvious behaviour, such as throwing exceptions in certain circumstances, should be consistent with the base type. If the behaviour here is inconsistent, bugs are pre-programmed, which are frequently difficult to find.

Interface Segregation Principle (ISP): The Interface Segregation Principle states that a client should only rely on the specifics of a service that it requires. For example, a class that implements an interface should only implement the methods that are truly required for the use case.

Instead of providing a comprehensive interface that is ready for any eventuality and contains a correspondingly large number of methods, it is preferable to implement several specialised interfaces that each focus on a single aspect. The advantage is that the class has fewer dependencies, reducing so-called coupling. Since PHP allows classes to implement multiple interfaces, this principle is simple to put into practice.
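A small hypothetical illustration: two focused interfaces instead of one all-purpose one, so a class only commits to what it actually needs:

interface ReadableInterface
{
    public function read(string $key): string;
}

interface WritableInterface
{
    public function write(string $key, string $value): void;
}

// Implements only the reading aspect – no empty write() stub required.
class ReadOnlyCache implements ReadableInterface
{
    public function read(string $key): string
    {
        return '...';
    }
}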

Dependency Inversion Principle (DIP): To create a system that is easy to maintain and resilient, we must consider which code parts may be particularly affected by changes. The file system is an example: rather than always writing to a local drive, we should extract that part and move it to a low-level class. If, for example, we want to store our files in a cloud bucket because our application no longer runs on a single server but in the cloud (which happens quite often), we simply swap out the low-level class without modifying the high-level class.

We implement an interface for file system accesses and thus reverse the dependency so that the high-level class is not required to be aware of the implementation details of the dependency. The high-level class is only aware of one interface, and we decide which low-level class to pass at runtime. Important: Dependency always flows from high to low; a low-level class never depends on a high-level class.

Consider the following example. The store() method of a ReportService is intended to store the contents of a report on a storage device, in this case an S3-compatible cloud storage (Listing 4).

class ReportService
{
    public function store(
        string $filename,
        string $content
    ): void {
        Filesystem::storage('s3')->put($filename, $content);
    }
}

In the preceding example, a fictitious framework provides an easy-to-use Filesystem helper class that allows access to the configured storages at any time and from any location in the code via static function calls.

This is very convenient, but it has significant drawbacks. Because the dependency on the Filesystem helper is hardcoded into the ReportService class, switching to another medium is not easy and requires changes to the class. Furthermore, the static calls make testing difficult. The approach shown in Listing 5 would be better in terms of DIP.

interface StorageInterface
{
    public function put(
        string $filename,
        string $content
    ): void;
}
 
class ReportService
{
    public function __construct(
        private StorageInterface $storage
    ) {
    }
 
    public function store(
        string $filename,
        string $content
    ): void {
        $this->storage->put($filename, $content);
    }
}

First, the dependency is passed directly to the class in the constructor. As a result, we can easily replace the storage class when instantiating it. By using the StorageInterface, we ensure that the implementation remains compatible. This approach, of course, requires a little more effort, especially since the so-called binding, i.e., the binding of the StorageInterface to the class to be injected, is still missing in this example. However, you will be rewarded with the greatest possible flexibility.
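To make this concrete, here is what the missing binding could look like when done by hand (in practice, a framework’s DI container takes over this job; S3Storage is a hypothetical implementation):

class S3Storage implements StorageInterface
{
    public function put(string $filename, string $content): void
    {
        // ... upload the file to the cloud bucket ...
    }
}

// "Binding" by hand: we decide here which low-level class gets injected.
$reportService = new ReportService(new S3Storage());
$reportService->store('report.csv', 'id;name');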

You are correct if you are thinking of Dependency Injection (DI): DI is the implementation of the Dependency Inversion Principle in all common PHP frameworks.

IPC NEWSLETTER

All news about PHP and web development

 

Other principles

We cannot present all relevant principles in detail in this article due to space constraints, so here are a few more in quick succession:

Scout Rule: The Scout Rule reminds us to leave our campsite (i.e., our codebase) in better condition than we found it. So, if you’re already working on a class, try cleaning it up a little by deleting obsolete comments or splitting an overly long method into several short ones.

Favour Composition Over Inheritance: By composing classes instead of using inheritance, we promote loose coupling within a system; after all, inheritance keeps the subclass dependent on the base class.

Information Hiding Principle: Only the most important details should be visible to the outside world through an interface. This also applies to implicit interfaces, such as the public details of a class. So only make public the methods or characteristics that are absolutely necessary.

Principle of Least Astonishment: Software should not surprise users. The getLastName() method of a user object should return only its last name. It should not change the state of the system (e.g., through write database accesses).

Design patterns

Admittedly, that was already a lot of principles (and by no means all of them). However, these principles are always abstract; if we want clearer instructions for action, we must look more closely at design patterns.

Design patterns are commonly used solutions for specific use cases. Of course, these best practices are not universal solutions to all problems, but they have more than proven themselves over time, so internalising the most common patterns can’t hurt. Design patterns are typically classified into three types:

Creational patterns are used to create objects. Famous examples include the Factory Method and the Abstract Factory, as well as the controversial Singleton pattern.

Structural patterns, such as the Adapter Pattern, the Decorator Pattern, and, of course, the Dependency Injection Pattern, combine objects into larger, but still flexible structures.

Behavioural patterns describe how objects interact with one another. The Iterator Pattern and Observer Pattern, for example, are commonly used in PHP.

The Dependency Injection pattern has already been discussed, but let’s take a closer look at the Observer Pattern. This pattern is useful when you want to react to an object’s actions without hardwiring this dependency into the code.

Assume we have a class that represents a customer account. When an account is terminated, we want one or more actions to be taken. To accomplish this, we can create observer classes that will be notified when this occurs. The actual customer account class, on the other hand, does not need to know which observers are involved in detail. Because this is a common pattern, the previously mentioned SPL library already provides objects to help us implement it. First, you can see the class for the customer account in Listing 6.

class CustomerAccount implements SplSubject
{
    private SplObjectStorage $observers;
 
    public function __construct()
    {
        $this->observers = new SplObjectStorage();
    }
 
    public function attach(SplObserver $observer): void
    {
        $this->observers->attach($observer);
    }
 
    public function detach(SplObserver $observer): void
    {
        $this->observers->detach($observer);
    }
 
    public function notify(): void
    {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }
 
    public function cancelSubscription(): void
    {
        // ...
        $this->notify();
    }
}

Objects such as SplObjectStorage, in which the observers are managed, are already provided. We have a working implementation of the Observer Pattern with only a few lines of code and no framework at all: when the cancelSubscription() method is called, all observers are notified.

Listing 7 shows an example of how such an observer is attached to a customer account.

class CustomerAccountObserver implements SplObserver
{
    public function update(CustomerAccount|SplSubject $splSubject): void
    {
        // ...
    }
}
 
$customerAccount = new CustomerAccount();

$customerAccount->attach(new CustomerAccountObserver());

$customerAccount->cancelSubscription();

The SPL library thus not only provides ready-made objects, but also interfaces like SplObserver, which ensure that the observer implements the necessary update() method.

 

Code reviews

Now for the practical part: if you work in a team with other developers, one of the first tools you should introduce is code reviews. Code reviews implement the four-eyes principle in programming: no code should be allowed to enter the codebase unchecked. Therefore, we ask other developers to review our code and approve it if it passes muster. If it does not, changes are requested.

This not only has the advantage of detecting bugs or other problems earlier, but it also leads to knowledge distribution within the team. As a reviewer, you not only gain an understanding of your colleagues’ code, but you also automatically exchange information about programming techniques, so everyone involved learns something.

Code reviews have become an essential part of many development teams’ workflows. Nevertheless, we would like to share some tips for successful code reviews with you in this article:

Make code reviews mandatory. They are a necessity rather than a nice-to-have. All major code hosting platforms provide the necessary features to prevent code from being merged into the Main Branch without being checked.

Nobody should be afraid of code reviews. The idea of experienced developers tearing code apart can be very intimidating, especially for colleagues with little professional experience. When reviewing, keep the authors’ skill level in mind at all times.

Take note of the wording, both yours and others’. Of course, accusations and insults are strictly prohibited, but feedback that lacks constructive suggestions will only lead to frustration. Because text comments lack important meta-information, such as facial expressions and tone of voice, a defusing emoji can go a long way. Phrase feedback in the first person, and provide examples and justifications for how something could be done better.

Before making assumptions, ask the authors why they programmed something the way they did. Maybe there’s a good reason you don’t know about?

Before you begin, agree on common coding styles and guidelines with the team.

Tools can be used to automate tasks such as reviewing coding styles and guidelines. Don’t waste time discussing how to format code. Tools are much faster and more thorough, and they are always objective.

The last two points, concerning coding styles, guidelines, and tools, will be discussed in greater depth in the following sections.

Coding styles

There are many ways to write and format code. Because we spend far more time reading than writing source code, it should be written as uniformly as possible. This reduces the cognitive load (mental effort) of reading and writing.

Instead of laboriously agreeing on a standard with your team, which usually results in entertaining discussions about “indenting with spaces or tabs,” you should rely on existing standards like PSR-12. PSR-12 is the current PHP-FIG (PHP Framework Interop Group) [3] coding style guide that defines basic code formatting. The benefit is that PSR-12 is widely used, and many tools, such as IDEs and code sniffers, already support it out of the box. Furthermore, many packages and frameworks use this standard, making their code easier to read when looking for errors, for example.
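
To illustrate, here is a minimal sketch of a class formatted according to PSR-12 (the class and method names are invented for this example):

<?php

declare(strict_types=1);

namespace App\Service;

// PSR-12: one blank line after the opening tag and after the namespace
// declaration; class and method braces on their own line; four-space indent
class InvoiceRenderer
{
    public function render(string $template): string
    {
        // Control structures keep the opening brace on the same line
        if ($template === '') {
            throw new \InvalidArgumentException('Template must not be empty.');
        }

        return $template;
    }
}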

Coding guidelines

So, coding styles define how code should be formatted, but they don’t tell you anything about the best practices or standards the team has agreed upon. This is where coding guidelines come in.

For example, you can specify that code targeting PHP 8 or later must use constructor property promotion (Listing 8).

//  Before PHP 8
class ExampleClass
{
    public string $name;
 
    public function __construct(
        string $name
    ) {
        $this->name = $name;
    }
}
 
// PHP 8 with constructor property promotion
class ExampleClass
{
    public function __construct(
        public string $name,
    ) {
    }
}

In addition to code-related guidelines, there are those relating to aspects of software architecture, such as compliance with the Single Responsibility Principle (SRP). However, specifications for folder structures and naming conventions should also be written down. Because most PHP applications are now built on a framework, it’s worth establishing some ground rules here, especially if the framework provides multiple ways to perform a task, such as configuring routing.

Unlike style guides, coding guidelines cannot be taken off the shelf. Of course, there are numerous examples available on the Internet [4], but keep in mind that each codebase is unique. You should therefore adapt these templates in any case, and you probably won’t be able to adopt every point. It is best to create the guidelines as a team, even if it requires several workshops. This ensures that the vast majority agrees with them.

 

Tools

As is so often the case in programming, the same applies to writing clean code: automate as many steps as possible! After all, you don’t want to waste time in code review debating misplaced brackets. In fact, code should not even be submitted for review if it does not meet certain minimum requirements.

This is exactly what we can achieve by employing code sniffers: PHP-CS-Fixer [5] and PHP CodeSniffer [6] are tools that check code for compliance with code style guides faster and more accurately than a human could. If they find a violation, they provide a precise report. Even better, the code sniffers can automatically correct the faulty code if desired. It doesn’t get any easier than this, and there are no more excuses for checking in incorrectly formatted code.
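
As a starting point, PHP-CS-Fixer, for example, is configured via a .php-cs-fixer.dist.php file in the project root. Here is a minimal sketch; the src directory is an assumption about your project layout:

<?php

// .php-cs-fixer.dist.php: check all code under src/ against PSR-12
$finder = PhpCsFixer\Finder::create()
    ->in(__DIR__ . '/src');

return (new PhpCsFixer\Config())
    ->setRules([
        '@PSR12' => true,
    ])
    ->setFinder($finder);

Running vendor/bin/php-cs-fixer fix then corrects violations automatically, while the --dry-run --diff options only report them.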

Which tool you use is a matter of personal preference; both can handle PSR-12, for example, and both can be configured down to the smallest detail. But there is a lot more: the PHP copy-and-paste detector phpcpd [7] can help you find duplicates, effectively preventing code fragments from being arbitrarily copied back and forth.

Static code analysis is the final step. Here, the code is thoroughly reviewed for potential bugs or unclean programming. PHPStan [8] is probably the most well-known example of this genre in the PHP world: it analyses your code and indicates which parts may cause issues. You can begin with low requirements and gradually increase them.
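
A minimal sketch of a phpstan.neon configuration might look like this (the src directory and the chosen level are assumptions; the strictness level starts at 0 and can be raised step by step):

# phpstan.neon: analyse all code under src/ at a low strictness level
parameters:
    level: 1
    paths:
        - src

The analysis is then started with vendor/bin/phpstan analyse.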

There are, of course, other useful tools, such as Psalm [9], phpmd [10], Phan [11], and Exakat [12]. However, especially in the beginning, it is best to focus on one tool at a time to avoid being overwhelmed by the abundance of findings. You can then gradually add more tools.

Code metrics

Not only can code be analysed, but it can also be quantitatively evaluated. This is accomplished by employing code metrics such as Cyclomatic Complexity or NPath Complexity [13]. In short, the metrics indicate the complexity of the code under investigation. The more branches there are, the more difficult it is to understand and maintain. These metrics can help you determine the state of a codebase or provide specific hints about which classes to focus on first during a refactoring.
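
To give you a feeling for what these numbers mean, here is a hypothetical example: every decision point in a function increases its cyclomatic complexity by one.

// Cyclomatic complexity = number of decision points + 1;
// the three if statements give this function a complexity of 4
function shippingCosts(float $total, bool $express): float
{
    if ($express) {          // +1
        return 9.90;
    }

    if ($total >= 50.0) {    // +1
        return 0.0;
    }

    if ($total >= 20.0) {    // +1
        return 2.90;
    }

    return 4.90;
}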

We cannot go into these values in greater detail in this article, but you should take away that there are established code metrics and that they can be collected automatically with the help of appropriate tools. PhpMetrics [14] and PHP Depend [15] are well-known representatives of this field. However, keep in mind that these tools are useless if you do not understand the values; in that case, you will not be spared further study.

Software tests

Software tests are not an afterthought in terms of clean code, but rather an essential component. But why are tests so important? Nobody writes perfect code the first time. We are always learning, and our own code from last year certainly does not meet our current requirements. This is completely normal and should not be suppressed. Rather, we should accept that constant refactoring, or reworking existing code while maintaining functionality, is a necessary part of the job. Everyone understands that machines require routine maintenance and cleaning. It is the same with code.

However, regular changes pose challenges: we must ensure that the actual functionality is not altered. Manual testing, such as clicking through the application monotonously, is error-prone and time-consuming. Automated tests can help in this situation.

However, software tests not only assist us in refactoring, they also help us write better code. If we test every new piece of code from the start, we are forced to write easily testable code, and code is easy to test if, among other things, tight coupling is avoided by following the Dependency Inversion Principle (DIP).
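
As a brief sketch of what this means in practice (the interface and class names are invented), depending on an interface instead of a concrete implementation allows a test to substitute a lightweight fake:

interface MailerInterface
{
    public function send(string $to, string $subject): void;
}

class RegistrationService
{
    // The mailer is injected via the constructor (DIP): production code
    // passes a real SMTP implementation, a unit test passes a fake instead
    public function __construct(
        private MailerInterface $mailer,
    ) {
    }

    public function register(string $email): void
    {
        // ...
        $this->mailer->send($email, 'Welcome!');
    }
}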

A detailed introduction to automated testing could fill an entire book. Writing really good code without proper tests, on the other hand, is likely to be difficult, which is why you should get into the habit of testing every piece of code you write as early as possible. You don’t have to follow Test Driven Development (TDD) exactly “by the book”; the important thing is that you test at all.
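
To give a first impression, a minimal unit test written with PHPUnit might look like this (the Calculator class is a hypothetical example):

use PHPUnit\Framework\TestCase;

class CalculatorTest extends TestCase
{
    public function testAddReturnsSumOfTwoNumbers(): void
    {
        $calculator = new Calculator();

        // The test fails, and thereby protects a later refactoring,
        // as soon as add() no longer returns the expected result
        $this->assertSame(5, $calculator->add(2, 3));
    }
}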

Automation

If you’ve been wondering how the tools mentioned so far can best be integrated into your daily work routine, you are asking exactly the right question. Because no matter how much work these tools relieve us of, as long as they must be executed manually, they are easily forgotten.

In an ideal world, no code should be added to the main branch without first passing a quality check. So, let us wrap up with the topic of continuous integration (CI). This is the repeated automatic execution of the same checking steps. Continuous quality assurance is an important aspect of clean code because without it, certain checks will always be overlooked or even omitted on purpose. We can basically distinguish two levels of checks here:

  1. Checks that are executed before checking into the repository (pre-commit)
  2. Checks that are executed during a pull request

Git hooks are commonly used to implement the first check level [16]. These are scripts that run during a git commit and abort the commit in case of a negative check. As test steps, we can run a code sniffer and a static code analyser, both of which we have already described in detail above.
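
A minimal pre-commit hook that wires in the tools described above could look like this (a sketch; the paths assume the tools were installed via Composer):

#!/bin/sh
# .git/hooks/pre-commit: abort the commit as soon as one check fails
vendor/bin/php-cs-fixer fix --dry-run --diff || exit 1
vendor/bin/phpstan analyse || exit 1

Note that the hook file must be made executable, e.g. with chmod +x .git/hooks/pre-commit.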

This check may also include fast execution tests (typically unit tests). However, the execution time should be limited to a few dozen seconds to avoid overtaxing the programmer’s patience. At the second level, slower checks, such as end-to-end tests, are performed.

Once the code has passed the first set of checks, a pull request is typically opened, in which the code’s author requests that the code be merged into the main branch. This is where the code reviews begin. As previously stated, we want to ensure that the code being reviewed meets certain minimum requirements: all tests have passed, the code is correctly formatted, and it is error-free, at least in terms of static analysis. The code review should be performed only after these checks have been completed without errors.
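
With GitHub Actions, for example, such pull request checks could be described as follows (a sketch under the assumption that the tools above were installed via Composer; setup-php is a widely used community action):

# .github/workflows/ci.yml
name: CI
on: [pull_request]

jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: shivammathur/setup-php@v2
        with:
          php-version: '8.3'
      - run: composer install --no-progress
      - run: vendor/bin/php-cs-fixer fix --dry-run --diff
      - run: vendor/bin/phpstan analyse
      - run: vendor/bin/phpunit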

Continuous integration is often described as a pipeline: the code figuratively travels through a pipe, stopping at various stations along the way. Different aspects of the code are tested at each station. If all checks are successful, the code is ready for delivery as a container image or, more traditionally, as a ZIP archive.

This pipeline can be roughly divided into four sections. Of course, you are free to use more or fewer steps; Figure 1 should only serve as a guide.

Fig. 1: Schematic breakdown of a build pipeline

Finally, the ultimate discipline is continuous delivery. The code is not only thoroughly analyzed and tested but, if there are no objections, also deployed directly to the target system, e.g. a test environment or production, at the push of a button. Of course, this only works if you have a great deal of confidence in the pipeline, and it takes considerable effort to set up. But once you start working this way, you certainly won’t want to do without it.


Conclusion

The subject of clean code is extensive. Many different subareas intertwine here to produce clean code. We were only able to address the most important of them in this article. Regardless, you should have a good idea of how diverse and exciting the world of clean code is.

Don’t be concerned if you didn’t grasp everything right away. Writing clean code requires a significant amount of time, dedication and, most importantly, practice; a lot of practice! However, the journey is worthwhile, because you will quickly notice how the quality of your code improves. Is there a more powerful motivator?

Clean Code in PHP – The book

Have we piqued your curiosity about clean code? Packt Publishing recently published the book “Clean Code in PHP: Expert Tips and Best Practices to Write Beautiful, Human-Friendly, and Maintainable PHP” [13]. It delves into the topic of clean code in depth, with numerous practical PHP examples.


Links & References

[1] Martin, R.: “Clean Code. Refactoring, Patterns, Testen und Techniken für sauberen Code” (1st ed.). mitp, 2009

[2] Martin, R.: “Clean Architecture. A Craftsman’s Guide to Software Structure and Design” (1st ed.). Prentice Hall, 2018

[3] https://www.php-fig.org

[4] https://carstenwindler.de/software-quality/coding-guidelines/

[5] https://github.com/PHP-CS-Fixer/PHP-CS-Fixer

[6] https://github.com/squizlabs/PHP_CodeSniffer

[7] https://github.com/sebastianbergmann/phpcpd

[8] https://phpstan.org

[9] https://psalm.dev

[10] https://phpmd.org

[11] https://github.com/phan/phan

[12] https://www.exakat.io/en

[13] Windler, C. and Daubois, A.: “Clean Code in PHP. Expert tips and best practices to write beautiful, human-friendly, and maintainable PHP” (1st ed.). Packt Publishing, 2022

[14] https://phpmetrics.org

[15] https://pdepend.org

[16] https://githooks.com
