In this article, I will explain the methodology which allowed me to safely upgrade hundreds of PHP applications to the latest PHP version. The majority of those were one or several major PHP versions behind. I will also mention many common caveats so you can avoid them.
IPC NEWSLETTER
All news about PHP and web development
Does it Work on PHP 8?
Unless you have automated and thorough black box tests, you can’t tell for certain. Unit tests won’t be of much help because they or the testing framework might also not be compatible with the latest PHP. If you change both the tests and the system under test at the same time, regressions can slip through the cracks. The code needs to be exercised, whether manually or through a new black box test suite. I will discuss testing strategies a bit later in the article. For now, I will focus on some of the tools I use to help me with finding and understanding PHP compatibility issues.
The first tool has a self-explanatory name: PHPCompatibility. It’s open source, is mostly maintained by Juliette Reinders Folmer, and is currently in need of funding. It can detect a variety of compatibility issues, such as removed extensions, using class names that became reserved, forbidden call-time pass by reference, etc. Once installed by following the README, it can be executed on the command-line like this:
phpcs . --standard=PHPCompatibility --runtime-set testVersion 8.4
This will scan your current directory for compatibility issues with PHP 8.4. You can target other versions, so you could, for example, upgrade to PHP 7.4 before jumping to PHP 8.4. You could add more flags and options to the above command, such as -p to display progress, --colors to display colors in the output, and --extensions=php,inc,phtml to filter the files analyzed. The result looks like this:
FILE: /www/src/app/controllers/MyController.php
------------------------------------------------------------------
FOUND 1 ERROR AFFECTING 1 LINE
------------------------------------------------------------------
 166 | ERROR | Using 'break' outside of a loop or switch structure
     |       | is invalid and will throw a fatal error since PHP
     |       | 7.0
------------------------------------------------------------------
For some applications, this will be a very long list and will require much research to fix. Even then, it doesn’t find every possible issue, because PHPCompatibility doesn’t have cross-file awareness and doesn’t infer types. Moreover, because legacy code rarely has strict type declarations, many errors can’t be detected until the faulty scenario is reached at runtime. In PHP 8, this represents a large portion of all issues because of the type strictness that it introduced. For example:
- Calling count() on non-countable used to be a warning and is now a fatal error.
- Stricter loose comparisons, making expressions such as 'php' == 0 return false instead of true.
- Most functions are stricter on their inputs, such as fopen() no longer accepting nulls or empty strings. For such inputs, fopen() now throws an error instead of returning false, which is problematic with code that relies on that return value.
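The first two changes in the list above can be observed with a few lines of code. This is a minimal sketch that assumes PHP 8.0 or later; the variable names are illustrative, and the comments note what PHP 7 did instead.

```php
<?php
// Sketch only: run on PHP 8.0+ to observe the stricter behavior.

// Loose comparison: a non-numeric string no longer equals 0.
$looseComparison = ('php' == 0); // false on PHP 8, true on PHP 7

// count() on a non-countable value now throws instead of warning.
try {
    $countResult = count(null); // PHP 7.2 to 7.4: warning, returns 0
} catch (TypeError $e) {
    $countResult = 'TypeError';
}

var_dump($looseComparison, $countResult);
```

The comparison is the more dangerous of the two: nothing is thrown or logged, so a branch simply stops being taken.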
Fatal errors, although frustrating, are still preferred to unexpected logic changes, which are harder to catch and can lead to severe effects. It can also require much effort to understand how all these changes impact a given piece of code, since nobody might know what it’s supposed to do in the first place. This is why I usually avoid indiscriminately casting values before passing them to a PHP function, as it hides the real issue or completely changes the behavior. Example:
$string = array();
- $lowercase = strtolower($string);
+ $lowercase = strtolower((string) $string);
Although this change prevents PHP 8 from emitting a fatal error, it also changes the result from null to “array”, which can take the execution down a completely different path, potentially causing destructive actions such as overwriting data with this new string.
Static Analysis
Tools such as PHPStan, Psalm, and Phan can detect some of the same things that PHPCompatibility can, but they are less focused on compatibility. However, they can complement PHPCompatibility with their ability to infer types. Here is an example of issues that PHPStan can detect, which can signal potential runtime issues on PHP 8:
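- Undefined variables. Reading one was only a notice in PHP 7 and is a warning in PHP 8, but the resulting null can end in a fatal error once it reaches stricter PHP 8 code, such as count() or a method call.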
- Static calls to instance methods. These would get promoted from deprecation warnings to fatal errors in PHP 8.
These additional insights are very helpful, as they give you a list that can serve as a basis for planning the upgrade project. However, PHPStan won’t be usable on all codebases, especially those that are written in PHP 5, don’t use PSR-4, or have a lot of dead code. It will complain about pre-existing errors, even if those are false positives or inside dead code, and refuse to perform the full scan until you eliminate them. In large projects or in the early stages of an upgrade, eliminating all these issues might not be practical. PHPStan prioritizes preventing bugs over documenting them. Don’t be discouraged if this happens in your project. You can instead activate all warnings and notices on the original PHP version and exercise the code via a test suite, which I’ll discuss in the next part. The execution of the tests will generate logs. You can then research whether each warning or notice becomes a fatal error in PHP 8 and make a list that way.
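One way to collect those warnings and notices is a small error handler registered early in the request. The sketch below is an assumption, not part of any tool mentioned here: the log path and the tab-separated format are hypothetical, and the closure syntax requires PHP 5.3 or later.

```php
<?php
// Sketch only: the log location and entry format are assumptions.
$logFile = sys_get_temp_dir() . '/php-upgrade-diagnostics.log';

error_reporting(E_ALL);

set_error_handler(function ($severity, $message, $file, $line) use ($logFile) {
    // One line per diagnostic: severity constant value, message, location.
    $entry = sprintf("%d\t%s\t%s:%d\n", $severity, $message, $file, $line);
    file_put_contents($logFile, $entry, FILE_APPEND | LOCK_EX);

    return false; // let PHP's normal error handling continue unchanged
});
```

Returning false keeps the application's existing display and logging behavior intact, so the handler only observes. After a test run, the log can be deduplicated by message and location to produce the list to research.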
Rector is a refactoring tool that I use when I need a very specific refactoring rule. In the PHP versions category, its main focus is on introducing modern features, which isn’t a priority if I want to get off an unsupported PHP version quickly. It also can be incorrect or incomplete, so it should be used carefully. For example, when replacing PHP 4 style constructors, it renames the method but doesn’t update the constructor calls. PHPStorm does this correctly, and I use PHPStorm to fix PHP 4 style constructors. Here are some examples of how I use Rector:
- Swap the order of implode() arguments. Both orders were accepted in the past.
- Add a missing parameter to a method based on the parent’s method.
- Create a custom rule to quickly fix a multitude of similar issues. Example: replace static calls with an instantiation, in the case of the static calls to instance methods issue mentioned above. Be aware that writing custom Rector rules has a steep learning curve.
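To make the first and last items concrete, here is a before-and-after sketch of the two transformations. The class and variable names are hypothetical, and the code shows only the resulting source, not a Rector rule or its configuration.

```php
<?php
// Sketch only: ReportFormatter is a hypothetical legacy class.
class ReportFormatter
{
    public function format($value)
    {
        return '[' . $value . ']';
    }
}

// Before (deprecated in PHP 7, throws an Error in PHP 8):
//   $label = ReportFormatter::format('total');
// After: instantiate, then call the instance method.
$label = (new ReportFormatter())->format('total');

$parts = array('a', 'b');

// Before (legacy argument order, no longer accepted in PHP 8):
//   $csv = implode($parts, ', ');
// After: separator first, array second.
$csv = implode(', ', $parts);

echo $label, ' ', $csv, "\n";
```

Replacing a static call with an instantiation is only safe when the constructor takes no required arguments and has no side effects, which is worth checking per class before applying it in bulk.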
Some of the previously discussed issues can be detected and fixed by PHPStorm, which is a commercial product, but widespread enough to mention here. It has a multitude of useful inspections and quick-fixes, but not enough to replace the previous tools. It does have a Replace Structurally feature, which allows us to create simple yet syntax-aware replacements. For example, say I wanted to write a compatibility adapter for the fopen() function, but only when it’s called with 2 arguments. I would be able to search for all references to this function with exactly 2 arguments, which is not something one should attempt with regular expressions because of the potential complexity. Example: fopen((new MyClass($array['key']))->getPath(), 'r'). PHPStorm does the heavy lifting here with fopen($arguments$). I then tell it to replace it with Compatibility::fopen($arguments$). This allows me to make safe changes that don’t accidentally erase portions of the code or introduce parsing errors.
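What such an adapter might contain depends on the scenarios found in the codebase. The following is a minimal sketch, assuming the only legacy behavior worth preserving is that a null or empty path yields false rather than an exception; the method body is an assumption, not the adapter used in the projects described here.

```php
<?php
// Sketch only: the body of this adapter is an assumption.
final class Compatibility
{
    /**
     * Two-argument fopen() that returns false for null or empty paths
     * instead of letting PHP 8 throw.
     *
     * @param mixed  $filename
     * @param string $mode
     * @return resource|false
     */
    public static function fopen($filename, $mode)
    {
        if ($filename === null || $filename === '') {
            return false;
        }

        return \fopen($filename, $mode);
    }
}
```

Keeping the rule in one method means a newly discovered scenario can be handled in a single place rather than at every call site.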
As you can see, every tool has its advantages and drawbacks, so you need to find how to best combine all these tools for your specific upgrade project. Even with all these tools, you still need to thoroughly test your code to ensure that the behavior didn’t change.
Testing
If the application doesn’t yet have a complete black box test suite, I recommend writing characterization tests. These will ensure that the application continues to behave the same way as on the old PHP version. For this, you would write tests that pass on the old PHP version with the unchanged codebase. Once you put the application on a new PHP version, the same tests will obviously fail. You would then combine the insights from the previously discussed tools, logs, and the newly created tests to fix the compatibility issues until the tests pass. The more thorough the automated test suite, the fewer manual tests you would need.
There are many tools to accomplish this, although I personally use Cypress due to the community size, abundance of plugins, ease of use, and great documentation. Installation instructions and tutorials are available on their website. The tests can run in the browser, which is useful for debugging, or headless on the command-line, which is useful to put in a continuous integration pipeline. Test cases will be written in JavaScript or TypeScript. Here is an example of a test (Listing 1).
describe('Checkout', () => {
  it('Can add items to the shopping cart', () => {
    const productName = 'Product 1';
    cy.visit('/shop');
    cy.contains('.product', productName)
      .siblings('div')
      .contains('button', 'Add to Cart')
      .click();
    cy.title().should('eq', 'My Cart');
    cy.contains('.cart-item', productName).should('exist');
  });
});
This test opens the shop, finds a specific product, finds and clicks the associated Add to Cart button, then ensures we end up in the cart with that product added to it. One advantage of these tests is that they won’t need to be changed even if you replace most of your libraries. In fact, you could even rewrite your entire application in a completely different language, although I don’t recommend it in most cases. In 23 years, I only recommended a rewrite twice, both times because the language was dead. PHP is very much alive, so it’s safer and less expensive to upgrade the code.
Another type of regression you should look out for is performance. Some compatibility solutions might be more resource-intensive. Black box tests can measure little beyond the response time, but the application can be modified to inject performance metrics into the page whenever it detects a test environment. These changes need to be done starting with the original code so that the current performance can be captured. The captured metrics can then be added into the tests’ expectations. Example:
window.phpPerformance = {
memoryUsage: <?php echo $memoryUsage; ?>
};
Let’s say that the current application reports 20MB, and we want the new version to not exceed this value. Here’s an example assertion in a Cypress test:
cy.window().then((win) => {
const megabyte = 1024 * 1024;
expect(win.phpPerformance.memoryUsage)
.to.be.at.most(20 * megabyte);
});
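On the PHP side, the injected value could come from memory_get_peak_usage(). The sketch below is an assumption about how the injection might be guarded: the APP_ENV variable and the renderPerformanceMetrics() function are hypothetical names.

```php
<?php
// Sketch only: APP_ENV and renderPerformanceMetrics() are hypothetical.
function renderPerformanceMetrics($environment)
{
    if ($environment !== 'test') {
        return ''; // never expose metrics outside the test environment
    }

    // Peak number of bytes allocated from the system during this request.
    $memoryUsage = memory_get_peak_usage(true);

    return '<script>window.phpPerformance = { memoryUsage: '
        . (int) $memoryUsage . ' };</script>';
}

// Call this as late as possible in the page so the peak is representative.
echo renderPerformanceMetrics(getenv('APP_ENV') ?: 'production');
```

Peak usage is used here rather than current usage because memory released before the end of the request would otherwise go unnoticed.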
Limiting Behavior Regressions
The approach I favor in PHP upgrades is one where I make minimal changes to individual expressions. Expressions are smaller than statements. For example:
- $object->property
- $a + 1
- number_format($a + 1)
Some expressions can be affected when moving to a new PHP version. It’s much easier to reason about an expression than it is about the state of an entire application, possibly across multiple HTTP requests. State can get extremely complex, especially if it doesn’t follow best practices and abuses globals, which is quite typical of the legacy applications I work with. If an individual expression, given the same values, exhibits the same behavior, then by extension, the entire application should continue to behave the same. With that in mind, I don’t need to understand each one of the millions of lines of code and how they interact. I reduce the application to its most basic elements and fix those.
Let’s take $object->property = 'php'. It seems simple enough until the object is undefined. In PHP 8, this results in a fatal error. In PHP 4 through 7, it magically instantiates the object in that scope before assigning. If you’re lucky, you can initialize the variable just before. But what if it’s passed to the function, and you can no longer say with certainty whether it can be null? In a codebase which relied heavily on this magic behavior, I created a custom Rector rule to find all property assignments where the target object is not declared in the same scope. I then replaced those with Compatibility::initObject($object)->property = 'php', where the new method would check the object’s value at runtime and instantiate it if needed, making this expression retain its old behavior.
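A method like this has to receive the variable by reference so that the new object lands in the caller's scope. The body below is a minimal sketch and an assumption; it treats null, false, and the empty string as the "empty" values that legacy PHP replaced with a default object.

```php
<?php
// Sketch only: this method body is an assumption.
final class Compatibility
{
    /**
     * Mimics the legacy "creating default object from empty value" behavior:
     * null, false, and '' are replaced with a stdClass in the caller's scope.
     *
     * @param mixed $object Passed by reference so the caller's variable is set.
     * @return mixed The (possibly newly created) object.
     */
    public static function initObject(&$object)
    {
        if ($object === null || $object === false || $object === '') {
            $object = new \stdClass();
        }

        return $object;
    }
}

// $settings was never defined; passing it by reference creates it as null.
Compatibility::initObject($settings)->property = 'php';
```

Values that are already objects pass through untouched, so the adapter only changes behavior in the cases that would otherwise be fatal.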
Native PHP functions became stricter in PHP 8. They would reject invalid types and values. In the past, such an input would typically return null or false, depending on the input. For example, in PHP 7, mb_strtolower() would return null if no arguments are provided, but false given an invalid encoding. Many PHP 8 functions are no longer capable of returning null or false. This is why most of my compatibility adapters check for these scenarios and return these values before calling the native function (Listing 2).
public static function mb_strtolower($string = null, $encoding = 'ISO-8859-1'): string|null|false
{
    if (func_num_args() === 0) {
        return null;
    }
    if (self::isValidEncoding($encoding) === false) {
        return false;
    }
    return \mb_strtolower((string)$string, $encoding);
}
Can a full compatibility library be created for legacy PHP? Perhaps, but that would cause performance degradation. All the logic that you see above the native function used to be inside the native function, which was written in C. If we reproduce every single scenario that the function used to have, but in PHP, it would add significant overhead. Instead, I recommend focusing only on functions and scenarios that affect your code. If you never pass invalid encodings to this function, then there’s no point in validating it in this adapter. You can check whether a function can receive invalid input by logging the types and/or values at the top of the adapter, run your test suite, and analyze the log to understand how the application uses this function.
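The logging step could be as small as the helper below. It is a sketch under assumptions: the AdapterUsageLog class, its record() method, and the log file path are all hypothetical names.

```php
<?php
// Sketch only: AdapterUsageLog and its log format are hypothetical.
final class AdapterUsageLog
{
    /**
     * Appends one line such as "mb_strtolower(string, NULL)" to the log.
     */
    public static function record($function, array $arguments, $logFile)
    {
        $types = array_map('gettype', $arguments);
        $line = $function . '(' . implode(', ', $types) . ")\n";
        file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
    }
}

// At the top of an adapter method, for example:
//   AdapterUsageLog::record(__FUNCTION__, func_get_args(), '/tmp/adapter-usage.log');
```

Counting the distinct lines after a test run shows which type combinations actually occur, and therefore which scenarios the adapter needs to handle.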
Most of these solutions make the code less pretty. However, the aim is to make safe changes and enable us to modify the compatibility adapters if we discover a new scenario that we didn’t account for, instead of having to undo all our changes inline. It makes the fix more maintainable. The goal, once the pressure to get off an unsupported PHP version is gone, is to refactor to clean the code so it doesn’t need these fixes in the first place.
A tool that I really like to probe functions and experiment with solutions is 3v4l, an online shell to run PHP code on multiple PHP versions. It makes it easy to compare outputs. I would, for example, supply all kinds of invalid input to a function and compare the outputs across versions. This tells me what scenarios I may need to put in the adapter. I can also write an adapter and test it there. Once satisfied with the adapter, I would write unit tests for it. I can, of course, achieve this locally by running multiple PHP versions, but I like the ability to then share those code snippets with my colleagues and on social media.
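A probing script of that kind might look like the sketch below. The probe() helper is a hypothetical name, the input list is only a starting point, and catching Throwable requires PHP 7 or later.

```php
<?php
// Sketch only: probe() is a hypothetical helper; needs PHP 7+ for Throwable.
function probe(callable $function, array $inputs)
{
    $results = array();
    foreach ($inputs as $input) {
        try {
            // Record the exported return value for this input.
            $results[] = var_export($function($input), true);
        } catch (\Throwable $e) {
            // Record the class of whatever was thrown instead.
            $results[] = get_class($e);
        }
    }

    return $results;
}

$inputs = array(null, '', 'PHP', 42, array(), new stdClass());
print_r(probe('strtolower', $inputs));
```

Running the same script on several PHP versions and comparing the printed arrays side by side shows which inputs changed behavior.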
Third-Party Code
I see many developers disregard third-party code because they expect to simply replace it with the latest community version. That might not be possible or practical if, for example:
- There are hardcoded overrides, so your version of the library no longer has the same behavior as the community version.
- The latest community version has a different behavior. Remember the complete overhaul of Doctrine between versions 1 and 2. Even subtle differences can significantly affect the application that uses it. These differences are often undocumented, that is, if the documentation still exists, which is becoming increasingly an issue as well.
- The community project was abandoned, so there is no replacement compatible with PHP 8.
Once you make a list of all your dependencies, you need to determine the feasibility of using the latest community version. In some cases, you may find community-maintained forks of the legacy library or framework, which both preserve the old behavior and run on the latest PHP. An example of that would be zf1-future, which is compatible with PHP 8.1, and is a close enough starting point. Such forks are common, especially for widely used libraries, since so much other legacy code depends on them. Remember to dig a bit deeper if your code uses an abandoned library.
It’s also possible to replace abandoned libraries with different ones or even develop your own. This might be an opportunity to have something that better satisfies today’s needs. Take pChart 1.x: version 2 was a complete rewrite, so code written against 1.x won’t work with it. If the purpose was to render charts in the browser, then a JavaScript library like Chart.js or Google Charts might work even better than PHP code generating static images.
Another type of third-party code we didn’t discuss yet is extensions. I reason about them in the same way I reason about PHP libraries. Does it exist for the latest PHP version? Does it behave the same? Are there replacements, either as extensions or as Composer packages? Even when you do find a replacement, make sure it behaves the same as in the legacy PHP version. For example, mcrypt_compat is a Composer package that replaces the abandoned mcrypt extension. However, mcrypt, before PHP 5.6, would accept a shorter key and would simply pad it with '\0' to get the required size. Starting in PHP 5.6, it would instead reject the shorter key and return false. For a codebase that has already encrypted everything using the zero-padded key, that would be problematic. To fix it, I got the author to add a PHPSECLIB_MCRYPT_TARGET_VERSION constant to allow this package to replicate the old behavior.
After the upgrade, it is still a good idea to address the underlying issue of using a key that’s too short, as it undermines security, but this approach gives us more granular control for a progressive modernization journey. It’s always better to have things a bit more secure now than wait for everything to be ready later.
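To illustrate what the zero-padding described above amounts to, here is a minimal sketch. The padLegacyKey() function is hypothetical and only demonstrates the padding itself; it is not how mcrypt or mcrypt_compat is implemented.

```php
<?php
// Sketch only: padLegacyKey() is hypothetical and purely illustrative.
function padLegacyKey($key, $requiredLength)
{
    // Append NUL bytes on the right until the key has the required size.
    return str_pad($key, $requiredLength, "\0", STR_PAD_RIGHT);
}

$padded = padLegacyKey('short', 16);

// Hex output makes the trailing NUL bytes visible.
echo bin2hex($padded), "\n";
```

A padded key like this has far less entropy than its length suggests, which is why replacing it remains worthwhile once the upgrade is complete.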
Conclusion
Here is what I would like you to take away from this article:
- Code becomes broken over time, as libraries get abandoned and old PHP versions stop getting security patches.
- Knowing the advantages and limitations of each tool will help you combine them in the best way for your upgrade.
- Focus on preserving existing behavior. You can refactor later once you’re off the old PHP version.
- To avoid regressions, create characterization tests and limit changes to individual expressions when possible.
- Make a list of third-party dependencies, research their state and alternatives, and make a decision on a case-by-case basis.
- Don’t be afraid to ask authors of your favorite tools and libraries about your specific scenario. You can also sponsor their work.
I hope this advice helps you with your next PHP upgrade project. Happy coding!
Frequently Asked Questions (FAQ)
1. Why should legacy PHP applications be upgraded?
Old PHP versions stop receiving security patches, making applications running on them vulnerable over time. Upgrading is necessary to maintain security and functionality.
2. What is the PHPCompatibility tool and how does it help?
PHPCompatibility is an open-source tool that scans PHP code for compatibility issues with specific PHP versions. It detects removed extensions, reserved class names, and invalid syntax such as using break outside of loops.
3. What are the limitations of PHPCompatibility?
PHPCompatibility lacks cross-file awareness and type inference, so it may miss issues related to PHP 8’s stricter type system. Errors often remain undetected until runtime.
4. How can static analysis tools like PHPStan help in upgrades?
PHPStan and similar tools infer types and detect potential runtime issues such as undefined variables or static calls to instance methods. However, they may struggle with older codebases or those lacking PSR-4 structure.
5. What role does Rector play in PHP upgrades?
Rector automates refactoring tasks and helps apply PHP version-specific changes. It can apply custom transformation rules, though it requires careful configuration and validation.
6. How can testing strategies prevent regressions during upgrades?
Creating characterization tests on the old PHP version helps preserve existing behavior after upgrade. Tools like Cypress can automate these tests in browsers or CI pipelines.
7. How can performance regressions be detected in upgraded applications?
By injecting performance metrics into pages and validating them in test assertions, developers can ensure that performance does not degrade after the PHP upgrade.
8. What is the safest approach to upgrading legacy PHP expressions?
Focusing on fixing individual expressions rather than entire states reduces the risk of introducing logic errors. Compatibility adapters can replicate legacy behavior safely.
9. How should third-party dependencies be handled in a PHP upgrade?
Each dependency should be evaluated for PHP 8 compatibility. Options include using community forks, replacing with modern alternatives, or creating custom adapters.
10. Is it feasible to create a full compatibility library for legacy PHP?
While possible, a complete compatibility layer may introduce performance overhead. Developers should focus on only the functions and behaviors actively used by their application.