phpstan: Does PHPStan execute my code?? / Handling invalid UTF-8 data

On PHP 7.1 execute:

composer require phpstan/phpstan:0.10.3
composer require nette/utils:2.5.2
vendor/bin/phpstan analyze --level=1 vendor/nette/utils/src/Utils/Strings.php
 ------ --------------------------------------------------------------------------------------
  Line   Strings.php
 ------ --------------------------------------------------------------------------------------
         Internal error: Malformed UTF-8 data (pattern: #^.{1,19}(?=[\s\x00-/:-@\[-`{-~])#us)
         Run PHPStan with --debug option and post the stack trace to:
         https://github.com/phpstan/phpstan/issues/new
 ------ --------------------------------------------------------------------------------------

# with --debug

/home/viktor/tmp/nette/vendor/nette/utils/src/Utils/Strings.php
PHP Fatal error:  Uncaught Nette\Utils\RegexpException: Malformed UTF-8 data (pattern: #^.{1,19}(?=[\s\x00-/:-@\[-`{-~])#us) in /home/viktor/tmp/nette/vendor/nette/utils/src/Utils/Strings.php:594
Stack trace:
#0 /home/viktor/tmp/nette/vendor/nette/utils/src/Utils/Strings.php(522): Nette\Utils\Strings::pcre('preg_match', Array)
#1 /home/viktor/tmp/nette/vendor/nette/utils/src/Utils/Strings.php(225): Nette\Utils\Strings::match('\xA5\xA3\xBC\x8C\xA7\x8A\xAA\x8D\x8F\x8E\xAF\xB9\xB3\xBE\x9C...', '#^.{1,19}(?=[\\s...')
#2 /home/viktor/tmp/nette/vendor/phpstan/phpstan/src/Type/Constant/ConstantStringType.php(47): Nette\Utils\Strings::truncate('\xA5\xA3\xBC\x8C\xA7\x8A\xAA\x8D\x8F\x8E\xAF\xB9\xB3\xBE\x9C...', 19)
#3 /home/viktor/tmp/nette/vendor/phpstan/phpstan/src/Type/VerbosityLevel.php(47): PHPStan\Type\Constant\ConstantStringType->PHPStan\Type\Constant\{closure}()
#4 /home/viktor/tmp/nette/vendor/phpstan/phpstan/src/Type/Constant/ConstantStringType.php(50): PHPStan\Type\VerbosityLevel->handle(Object(Closure), Object(Closure))
#5 / in /home/viktor/tmp/nette/vendor/nette/utils/src/Utils/Strings.php on line 594

RegexpException is thrown from that class. Please advise.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

Imho we should just use substr and be done with it; there is no reason to believe that the data is in UTF-8.
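To make the substr suggestion concrete, here is a minimal sketch of byte-based truncation. The helper name `truncateBytes` and the `...` suffix are hypothetical and are not taken from PHPStan's code. The sample value reuses the bytes visible in the stack trace, plus a few made-up trailing bytes so that it exceeds the limit.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not PHPStan's actual implementation): shorten a string
// by bytes, so it never has to assume that the value is valid UTF-8.
function truncateBytes(string $value, int $limit, string $suffix = '...'): string
{
    if (strlen($value) <= $limit) {
        return $value; // short enough, return unchanged
    }
    if ($limit <= strlen($suffix)) {
        return substr($value, 0, $limit); // no room for the suffix
    }
    // Keep as many leading bytes as fit together with the suffix.
    return substr($value, 0, $limit - strlen($suffix)) . $suffix;
}

// First 15 bytes as shown in the stack trace; the last 7 are invented padding.
$binary = "\xA5\xA3\xBC\x8C\xA7\x8A\xAA\x8D\x8F\x8E\xAF\xB9\xB3\xBE\x9C"
    . "\x9F\x80\x81\x82\x83\x84\x85";

$short = truncateBytes($binary, 19);
echo strlen($short), "\n"; // 19
```

The trade-off is that a byte-based cut can split a multi-byte character in the middle when the value does happen to be UTF-8, so the shortened string may end in a partial sequence. It never throws on binary input, though.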

@ondrejmirtes I think the problem is that https://github.com/phpstan/phpstan/blob/master/src/Type/Constant/ConstantStringType.php#L47 uses Nette\Utils\Strings, which works only with UTF-8 strings, whereas the value may be binary.

Edit: isolated as https://phpstan.org/r/d0ace396dbe4be26a9d848134bec627d (\xc3\x28 is an INVALID UTF-8 sequence)
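The failure can be reproduced without PHPStan or Nette. The sketch below uses only PHP's built-in `preg_match`, `preg_last_error` and `mb_check_encoding` (the last one assumes the mbstring extension is loaded). The helper name `pcreAcceptsAsUtf8` is hypothetical. Judging by the stack trace, Nette's `Strings::pcre` wrapper appears to turn this kind of PCRE failure into a `RegexpException`.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: true when PCRE accepts the subject as UTF-8.
// With the /u modifier, preg_match() returns false for malformed input.
function pcreAcceptsAsUtf8(string $subject): bool
{
    return preg_match('//u', $subject) !== false;
}

// 0xC3 opens a two-byte sequence, but 0x28 is not a continuation byte.
$invalid = "\xc3\x28";

var_dump(pcreAcceptsAsUtf8($invalid));                // bool(false)
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR);  // bool(true)
var_dump(mb_check_encoding($invalid, 'UTF-8'));       // bool(false)

// Without /u the pattern works on raw bytes and matches normally.
var_dump(preg_match('#^.{1,19}#s', $invalid));        // int(1)
```

A check like this could guard a UTF-8 aware truncation and fall back to a byte-based one for binary values.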

@TomasVotruba When I replace this line …

\Nette\Utils\Strings::truncate($this->value, self::DESCRIBE_LIMIT)

… with …

\voku\helper\UTF8::str_truncate($this->value, self::DESCRIBE_LIMIT)

(https://github.com/voku/portable-utf8)

… then it works as expected (at least in my use-cases). 😃