coreruleset: Inconsistent representation of the backslash character in search patterns

Describe the bug

In brief: Inconsistent representation of the backslash character in search patterns means that some CRS rules behave subtly differently depending on the platform in use.

Following on from #2140, a search of the entire Core Rule Set (v3.4/dev branch) has revealed a handful of instances of the pattern \\ being used to represent a single backslash character.

The pattern \\ works correctly with libmodsecurity; however, Apache requires the pattern \\\\ in order to correctly represent a single backslash character (due to slight differences in how rules are parsed, as explained in detail here).

Most CRS rules use a portable representation of a backslash, e.g. [\\\\], which works as intended with both libmodsecurity and Apache. Some rules only use \\, however. This inconsistency means that some CRS rules behave subtly differently depending on the platform in use.
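The difference can be sketched in a few lines of Python. This is only a rough model under stated assumptions: Apache is assumed to drop one backslash in front of another backslash (or a double quote) before the pattern reaches the regex engine, libmodsecurity is assumed to pass the pattern through untouched, and Python's re module stands in for PCRE. The helper names are hypothetical.

```python
import re

def apache_unescape(raw: str) -> str:
    # Assumed, simplified model of Apache's quoted-argument parsing:
    # a backslash that precedes a backslash or a double quote is dropped.
    out, i = [], 0
    while i < len(raw):
        if raw[i] == "\\" and raw[i + 1:i + 2] in ("\\", '"'):
            i += 1  # skip the escaping backslash, keep the escaped character
        out.append(raw[i])
        i += 1
    return "".join(out)

def matches_backslash(rule_pattern: str, platform: str) -> bool:
    # libmodsecurity is modelled as handing the pattern over unchanged.
    regex = apache_unescape(rule_pattern) if platform == "apache" else rule_pattern
    # The subject contains one literal backslash, as in the 933210 example.
    return re.search(regex, "sys\\tem") is not None

for pattern in (r"[\\\\]", r"[\\/]"):
    print(pattern,
          "apache:", matches_backslash(pattern, "apache"),
          "libmodsecurity:", matches_backslash(pattern, "libmodsecurity"))
```

Under this model the portable form [\\\\] finds the backslash on both platforms, while [\\/] (the form used in rule 930110) collapses to [\/] on Apache and no longer matches a backslash at all.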

Steps to reproduce

Test one of the rules highlighted below on both Apache/mod_security2 and nginx/libmodsecurity.

Test using a pattern containing a \ character that should match.

Observe that the rule may be triggered on one platform but not the other, e.g. the rule may match with nginx but not match with Apache.


Example

Attempting to trigger rule 933210 with the pattern (sys\tem)('uname');

Command: curl -o /dev/null -v "localhost:80/?test=(sys\tem)('uname');"

Test 1, against Apache/mod_security2: < HTTP/1.1 200 OK (no rules triggered)

Test 2, against nginx/libmodsecurity: < HTTP/1.1 403 Forbidden

…ModSecurity: Warning. Matched "Operator `Rx'…against variable `ARGS:test' (Value: `(sys\tem)('uname');' ) …[id "933210"]…
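For repeated runs, the same comparison could be scripted. The sketch below is an assumption-laden convenience, not part of the original report: the helper names are hypothetical, and the payload is percent-encoded (unlike the raw cURL command above), on the assumption that both engines URL-decode ARGS before matching.

```python
import urllib.error
import urllib.parse
import urllib.request

PAYLOAD = "(sys\\tem)('uname');"  # the test value from the example above

def build_url(base: str) -> str:
    # Percent-encode every reserved character, including the backslash.
    return f"{base}/?test={urllib.parse.quote(PAYLOAD, safe='')}"

def status_for(base: str) -> int:
    # Return the HTTP status code; a blocking rule typically yields 403.
    try:
        with urllib.request.urlopen(build_url(base), timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def compare_platforms(bases):
    # bases maps a label to a base URL, one entry per platform under test.
    return {name: status_for(base) for name, base in bases.items()}
```

It could be called as compare_platforms({"apache": "http://localhost:8080", "nginx": "http://localhost:8081"}), where both addresses are placeholders for wherever the two containers actually listen.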

Expected behaviour

All CRS rules are expected to perform similarly on either Apache/mod_security2 or libmodsecurity.

Actual behaviour

Rules using the pattern \\ behave differently depending on the platform.

Additional context

The search below lists the affected lines. The output is much clearer when run on a terminal with colour highlighting:

$ grep -E '[^\\]\\\\[^\\]' rules/*
rules/php-errors.data:Cannot access property started with '\\0'
rules/REQUEST-930-APPLICATION-ATTACK-LFI.conf:SecRule REQUEST_URI|ARGS|REQUEST_HEADERS|!REQUEST_HEADERS:Referer|XML:/* "@rx (?:(?:^|[\\/])\.\.[\\/]|[\\/]\.\.(?:[\\/]|$))" \
rules/REQUEST-932-APPLICATION-ATTACK-RCE.conf:# [\^\.\w '\"/\\\\]*\\\\)?[\"\^]*       \\net\share\dir\cmd
rules/REQUEST-932-APPLICATION-ATTACK-RCE.conf:SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx (?:[*?`\\'][^/\n]+/|\$[({\[#a-zA-Z0-9]|/[^/]+?[*?`\\'])" \
rules/REQUEST-933-APPLICATION-ATTACK-PHP.conf:SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|REQUEST_FILENAME|ARGS_NAMES|ARGS|XML:/* "@rx (?:(?:\(|\[|\")[a-zA-Z0-9_.$\"'\[\](){}*\s\\]+(?:\)|\]|\")[0-9_.$\"'\[\](){}*\s]*\([a-zA-Z0-9_.$\"'\[\](){}*\s].*\)|\([\s]*string[\s]*\)[\s]*(?:\"|'))\s*[;]" \
rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf:SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|REQUEST_HEADERS:User-Agent|REQUEST_HEADERS:Referer|ARGS_NAMES|ARGS|XML:/* "@rx (?i)(?:\W|^)(?:javascript:(?:[\s\S]+[=\\\(\[\.<]|[\s\S]*?(?:\bname\b|\\[ux]\d))|data:(?:(?:[a-z]\w+/\w[\w+-]+\w)?[;,]|[\s\S]*?;[\s\S]*?\b(?:base64|charset=)|[\s\S]*?,[\s\S]*?<[\s\S]*?\w[\s\S]*?>))|@\W*?i\W*?m\W*?p\W*?o\W*?r\W*?t\W*?(?:/\*[\s\S]*?)?(?:[\"']|\W*?u\W*?r\W*?l[\s\S]*?\()|\W*?-\W*?m\W*?o\W*?z\W*?-\W*?b\W*?i\W*?n\W*?d\W*?i\W*?n\W*?g[\s\S]*?:[\s\S]*?\W*?u\W*?r\W*?l[\s\S]*?\(" \
rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf:SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|!REQUEST_COOKIES:/_pk_ref/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx (?:/\*!?|\*/|[';]--|--[\s\r\n\v\f]|--[^-]*?-|[^&-]#.*?[\s\r\n\v\f]|;?\\x00)" \

The \\net\share\dir\cmd line is irrelevant, as it is a comment. The remaining lines appear to be valid issues (and easy fixes, I believe).
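The same search, with comment lines filtered out automatically, might look like the following sketch. The function name and the shortened sample rules are hypothetical; only the regular expression is taken from the grep command above.

```python
import re

# Same expression as the grep: exactly two backslashes, with a
# non-backslash character on either side.
DOUBLE_BACKSLASH = re.compile(r"[^\\]\\\\[^\\]")

def suspicious_lines(lines):
    hits = []
    for number, line in enumerate(lines, start=1):
        if line.lstrip().startswith("#"):
            continue  # comments are never compiled as patterns
        if DOUBLE_BACKSLASH.search(line):
            hits.append((number, line))
    return hits

sample = [
    r"# example path in a comment: \\net\share\dir\cmd",  # skipped
    r'SecRule ARGS "@rx ;?\\x00"',                         # flagged
    r'SecRule ARGS "@rx [\\\\]"',                          # portable form, not flagged
]

for number, line in suspicious_lines(sample):
    print(number, line)
```

Data files such as php-errors.data would still need a manual look, since this filter only understands rule-file comments.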

So, the following are affected:

  • ~The php-errors.data file (which is referenced only by (response) rule 953100, PL1)~
  • Rule 930110, PL1 (already being handled in #2140)
  • Rule 932200, PL2
  • Rule 933210, PL1
  • Rule 941170, PL1
  • Rule 942440, PL2

Your Environment

  • CRS version (e.g., v3.2.0): v3.4 (dev branch)
  • Paranoia level setting: Any
  • ModSecurity version (e.g., 2.9.3): 2.9.4 with Apache, 3.0.5 with nginx
  • Web Server and version (e.g., apache 2.4.41): Apache 2.4, nginx 1.20
  • Operating System and version: Whatever flavour of Linux the modsecurity-crs-docker containers use

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 19 (19 by maintainers)

Most upvoted comments

Okay, now I understand what you mean 😃

I agree with you: adding [\\\\] is incorrect. I originally thought the rule was looking for the string representation \x00 and not a real NUL byte.


If the rule is looking for a NUL byte, should the end of the pattern be: …|;?\x00) ?

I’ve tested how the pattern from rule 942440 is interpreted using the tools pcre4msc2 and pcre4msc3. Because of the differences with escaping, the result is different:

$ echo | ./src/pcre4msc2 -d regexes/942440_1.txt 

RAW pattern:
============
(?:/\*!?|\*/|[';]--|--[\s\r\n\v\f]|--[^-]*?-|[^&-]#.*?[\s\r\n\v\f]|;?\\x00)

ESCAPED pattern:
================
(?:/\*!?|\*/|[';]--|--[\s\r\n\v\f]|--[^-]*?-|[^&-]#.*?[\s\r\n\v\f]|;?\x00)
$ echo | ./src/pcre4msc3 -d regexes/942440_1.txt 

PATTERN:
========
(?:/\*!?|\*/|[';]--|--[\s\r\n\v\f]|--[^-]*?-|[^&-]#.*?[\s\r\n\v\f]|;?\\x00)

ModSecurity v3 leaves the pattern with the extra \.


If the pattern is modified to …|;?\x00) then it ends up being the same:

$ echo | ./src/pcre4msc2 -d 942440_1_modified.txt 

RAW pattern:
============
(?:/\*!?|\*/|[';]--|--[\s\r\n\v\f]|--[^-]*?-|[^&-]#.*?[\s\r\n\v\f]|;?\x00)

ESCAPED pattern:
================
(?:/\*!?|\*/|[';]--|--[\s\r\n\v\f]|--[^-]*?-|[^&-]#.*?[\s\r\n\v\f]|;?\x00)
$ echo | ./src/pcre4msc3 -d 942440_1_modified.txt 

PATTERN:
========
(?:/\*!?|\*/|[';]--|--[\s\r\n\v\f]|--[^-]*?-|[^&-]#.*?[\s\r\n\v\f]|;?\x00)

So, that modified pattern would work the same on both platforms, I think.
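The pcre4msc2/pcre4msc3 results can be mimicked without a web server. This sketch assumes Apache simply collapses each \\ pair into a single \ (which matches the RAW/ESCAPED output above), treats ModSecurity v3 as a pass-through, and uses Python's re module in place of PCRE; the function names are hypothetical.

```python
import re

def apache_view(raw: str) -> str:
    # Assumed simplification: each \\ pair becomes a single \.
    return raw.replace("\\\\", "\\")

def v3_view(raw: str) -> str:
    # ModSecurity v3 is modelled as leaving the pattern untouched.
    return raw

def matches(regex: str, subject: str) -> bool:
    return re.search(regex, subject) is not None

NUL_BYTE = "\x00"        # one character: a real NUL byte
LITERAL_TEXT = "\\x00"   # four characters: backslash, x, 0, 0

original = r";?\\x00"    # tail of the current 942440 pattern
modified = r";?\x00"     # proposed replacement

for label, pattern in (("original", original), ("modified", modified)):
    for platform, view in (("apache", apache_view), ("v3", v3_view)):
        regex = view(pattern)
        print(label, platform, regex,
              "NUL:", matches(regex, NUL_BYTE),
              "text:", matches(regex, LITERAL_TEXT))
```

In this model the original tail matches a real NUL byte only on Apache and the literal text \x00 only on v3, whereas the modified tail matches a NUL byte on both.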

I’m not sure how to test it for real: I’ve struggled to submit a test request containing a NUL byte using cURL.

There is also the question of patterns designed to work with Apache, which do not work with libmodsecurity, e.g. from rule 920460:

(?:^|[^\\\\])\\\\[cdeghijklmpqwxyz123456789]
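A quick way to see the reverse problem is to feed this pattern through the same kind of model. The sketch assumes Apache collapses each \\ pair into a single \, that libmodsecurity leaves the pattern as written, and that Python's re module is an acceptable stand-in for PCRE here; the names are hypothetical.

```python
import re

RULE_920460 = r"(?:^|[^\\\\])\\\\[cdeghijklmpqwxyz123456789]"

def apache_view(raw: str) -> str:
    # Assumed simplification: each \\ pair becomes a single \.
    return raw.replace("\\\\", "\\")

def matches(regex: str, subject: str) -> bool:
    return re.search(regex, subject) is not None

one_backslash = "\\d"     # two characters: backslash, d
two_backslashes = "\\\\d" # three characters: backslash, backslash, d

print("apache, one backslash: ", matches(apache_view(RULE_920460), one_backslash))
print("apache, two backslashes:", matches(apache_view(RULE_920460), two_backslashes))
print("v3, one backslash:     ", matches(RULE_920460, one_backslash))
print("v3, two backslashes:   ", matches(RULE_920460, two_backslashes))
```

Under these assumptions the pattern flags a single backslash followed by one of the listed characters on Apache, but on libmodsecurity it only fires when two backslashes precede the character.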

Actually, I wonder if this should be split into separate issues for wider discussion, as it may be too much to put into one issue?

(I’ve reduced the scope of this issue to just look at rule 941170 and rule 942440 for now, to make it less messy.)

@fzipi I couldn’t get the null character \x00 test to work on one of the platforms, either Apache or libmodsecurity (I forget which one didn’t work). I’ll re-test and post a result here when I can.

Thank you very much for looking for these patterns and for this very clearly structured issue! If I get it right, \\\\ would work for both platforms? So we will have to open a PR that replaces \\ with \\\\?