client-go: jsonpath interprets index expressions incorrectly
The jsonpath package interprets `$.['a.b']` the same as `$.a.b`, which is incorrect. It should be the same as `$.a\.b`.
Here’s some code to demonstrate the issue: https://play.golang.org/p/qIaggWZVHl5
Also demonstrated by the above example: it should be OK to use a double-quoted string as an index expression, but it gives a syntax error. See https://www.ietf.org/id/draft-ietf-jsonpath-base-01.html#section-3.5.4.
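To make the expected behaviour concrete, here is a minimal sketch of what the two expressions should resolve to once they have been parsed into segments. It uses only the standard library. The `lookup` helper and the sample document are hypothetical illustrations, not part of client-go.

```go
package main

import "fmt"

// lookup is a hypothetical helper (not client-go API). It walks a decoded
// JSON document along a list of already-parsed segment names. Each segment
// is treated as one literal key; it is never split on "." again.
func lookup(doc any, segments ...string) (any, bool) {
	cur := doc
	for _, seg := range segments {
		m, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = m[seg]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	// Made-up sample document with both a nested key and a dotted key.
	doc := map[string]any{
		"a":   map[string]any{"b": "nested"},
		"a.b": "dotted key",
	}

	// $.a.b should parse to two segments: "a", then "b".
	v1, _ := lookup(doc, "a", "b")
	// $.['a.b'] (like $.a\.b) should parse to ONE segment: "a.b".
	v2, _ := lookup(doc, "a.b")

	fmt.Println(v1) // nested
	fmt.Println(v2) // dotted key
}
```

The bug described above amounts to the parser producing the first segment list for both expressions.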
About this issue
- State: open
- Created 3 years ago
- Reactions: 2
- Comments: 17
I offer a draft of a re-implementation. I started with trying to do minor fixes and ended up with a complete re-implementation: https://github.com/sthielo/client-go/tree/jsonpath-issue-1018/util/jsonpath (find TODOs in ./doc.go - e.g. nicer error messages, some more JSONPath filter functions, …)
BUT … it will break some (hopefully rare) cases (see TODOs in test files), as the JSONPath spec contradicts some of the examples. I also tried to be as backward compatible as possible, even allowing extensions to the spec as long as they do not contradict it (e.g. dots in quoted strings are interpreted as segment separators).
I would be happy to have/discuss some reviews. I would also appreciate votes from the community whether this would be welcome as a pull request.
Another example of this: if the key of a map contains a space, jsonpath is unable to interpret the value. I’ve created a test that highlights the issue here - https://github.com/kubernetes/client-go/compare/master...emmjohnson:map-spaces?expand=1
Having looked in a bit more detail at the jsonpath implementation, it really needs a very significant refactoring, if not a rewrite.
Here’s an example from the code:
`parseArray` is called after finding a `[` character. This code is assuming that any `]` character will terminate the index expression, but that’s not true when you can have a quoted string in there which can legitimately contain `]` characters itself.
Other places I observed issues:
- In the `parseArray` function again, it does this:
  `dictKeyRex` is the regular expression `^'([^']*)'$`. That is, if the regular expression matches, it is assuming a dict key index (a string). However it won’t match any string that contains an escaped quote. Neither does it work with double-quoted strings.
  Also, the above logic takes place after splitting the index text on `,`, so a dict key index containing a comma won’t work correctly either.
- In `parseQuote`, it does this:
  `end` is either `'` or `"`. That is, the first time it sees a quote character, it looks at the character before it, and if it’s a backslash (`\`), it decides that the quote is escaped and continues. However there are times when a quote can be legitimately preceded by an escape character without being escaped, for example `'\\'`.
I’d suggest refactoring the code so it uses a conventional lexer-based approach with a recursive-descent parser layered on top of that.
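As a rough illustration of the lexer-style scanning suggested above, here is a minimal sketch that handles only quoted-string keys inside an index expression. The names `scanQuoted` and `scanIndex` are hypothetical and are not taken from client-go. Numeric indices, slices, filters and the recursive-descent layer are left out, and escape handling is simplified to "keep the escaped byte as-is".

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// scanQuoted (hypothetical) reads a quoted string starting at s[0], which
// must be ' or ", and returns the unescaped contents plus the number of
// bytes consumed. Escapes are tracked left to right, so '\\' is a complete
// string holding one backslash rather than an unterminated string.
func scanQuoted(s string) (string, int, error) {
	if s == "" || (s[0] != '\'' && s[0] != '"') {
		return "", 0, errors.New("expected quoted string")
	}
	quote := s[0]
	var b strings.Builder
	for i := 1; i < len(s); i++ {
		switch c := s[i]; {
		case c == '\\':
			if i+1 >= len(s) {
				return "", 0, errors.New("dangling escape")
			}
			i++
			b.WriteByte(s[i]) // simplification: keep the escaped byte literally
		case c == quote:
			return b.String(), i + 1, nil
		default:
			b.WriteByte(c)
		}
	}
	return "", 0, errors.New("unterminated quoted string")
}

// scanIndex (hypothetical) parses the text that follows "[" and returns the
// keys plus the unparsed remainder after the closing "]". Because quoted
// strings are consumed as whole tokens, "]" and "," inside them are data.
func scanIndex(s string) ([]string, string, error) {
	var keys []string
	for {
		s = strings.TrimLeft(s, " ")
		key, n, err := scanQuoted(s)
		if err != nil {
			return nil, "", err
		}
		keys = append(keys, key)
		s = strings.TrimLeft(s[n:], " ")
		switch {
		case strings.HasPrefix(s, ","):
			s = s[1:]
		case strings.HasPrefix(s, "]"):
			return keys, s[1:], nil
		default:
			return nil, "", errors.New("expected ',' or ']'")
		}
	}
}

func main() {
	// Keys: a.b, c]d, e,f and a single backslash; the remainder is ".x".
	keys, rest, err := scanIndex(`'a.b',"c]d",'e,f','\\'].x`)
	fmt.Printf("%q %q %v\n", keys, rest, err)
}
```

The point of the sketch is that the closing `]` and the `,` separators are only recognised between tokens, never by searching the raw text, which avoids the three problems listed above.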