moshi: Moshi slower than Gson reading quoted values.

I ran each of the following two snippets in a large loop.

try (com.google.gson.stream.JsonReader reader = new com.google.gson.stream.JsonReader(
    new InputStreamReader(Foo.class.getClassLoader().getResourceAsStream("foo.json"), UTF_8))) {
  reader.setLenient(true);
  reader.beginArray();
  while (reader.hasNext()) {
    reader.nextLong();
  }
  reader.endArray();
}

vs.

try (JsonReader reader = JsonReader.of(
    Okio.buffer(Okio.source(Foo.class.getClassLoader().getResourceAsStream("foo.json"))))) {
  reader.setLenient(true);
  reader.beginArray();
  while (reader.hasNext()) {
    reader.nextLong();
  }
  reader.endArray();
}

With foo.json containing a huge array of single-quoted longs, Gson was about 25% faster. With foo.json containing unquoted longs, Moshi was about as fast as Gson.
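For anyone wanting to reproduce this, here is a minimal sketch of how the two foo.json variants could be generated. The class name, the element values, and the element count are my assumptions, not the fixture used for the numbers above. Single-quoted strings are not strict JSON, which is why both snippets call setLenient(true).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical fixture generator; not part of Moshi or Gson.
public final class FooJsonGenerator {
  /** Builds a JSON array of `count` longs, single-quoted when `quoted` is true. */
  public static String generate(int count, boolean quoted) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < count; i++) {
      if (i > 0) sb.append(',');
      long value = 1_000_000_000L + i; // arbitrary 10-digit values (assumption)
      if (quoted) {
        sb.append('\'').append(value).append('\'');
      } else {
        sb.append(value);
      }
    }
    return sb.append(']').toString();
  }

  public static void main(String[] args) throws IOException {
    // Writes the quoted variant; pass `false` to generate() for the unquoted one.
    Path out = Paths.get(args.length > 0 ? args[0] : "foo.json");
    Files.write(out, generate(1_000_000, true).getBytes(StandardCharsets.UTF_8));
  }
}
```

Running both reader snippets against the quoted and the unquoted file keeps everything else equal, so any gap should come from the quoted-value path.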

I saw the same slowdown when reading other quoted values, for example with nextString and nextName.

I suspect the cause is the performance of nextQuotedValue and BufferedSource’s indexOfElement. (I still need to learn how to use real measurement tools, but I found this last Friday and wanted to file it here so that I remember to come back to it.)
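To make the suspicion concrete, here is a simplified sketch of the scanning pattern I mean: repeatedly search for the next byte that is either the closing quote or a backslash, and copy the run in between. This is not Moshi's or Okio's code. It works on a plain byte array, the helper names are mine, and escape handling is reduced to "take the next byte literally".

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: a byte[]-based stand-in for the quote/backslash scan.
public final class QuotedScanSketch {
  /** Returns the index of the first byte at or after `from` equal to `a` or `b`, or -1. */
  public static int indexOfElement(byte[] data, int from, byte a, byte b) {
    for (int i = from; i < data.length; i++) {
      if (data[i] == a || data[i] == b) return i;
    }
    return -1;
  }

  /** Reads a quoted value; `start` is the index just after the opening quote. */
  public static String readQuoted(byte[] data, int start, char quote) {
    StringBuilder builder = null; // only allocated if an escape is seen
    int pos = start;
    while (true) {
      int index = indexOfElement(data, pos, (byte) quote, (byte) '\\');
      if (index == -1) throw new IllegalStateException("Unterminated string");
      String run = new String(data, pos, index - pos, StandardCharsets.UTF_8);
      if (data[index] != '\\') {
        // Closing quote: the common case is a single run with no builder.
        return builder == null ? run : builder.append(run).toString();
      }
      if (index + 1 >= data.length) throw new IllegalStateException("Unterminated escape");
      if (builder == null) builder = new StringBuilder();
      // Simplification: the escaped byte is taken literally (ASCII only, no \n or \uXXXX).
      builder.append(run).append((char) data[index + 1]);
      pos = index + 2;
    }
  }
}
```

For a short value like a quoted long, nearly all of the work is the one search for the terminator, so the cost of that search would dominate if this is indeed where the time goes.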

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 30 (26 by maintainers)

Most upvoted comments

Good news. I’ve started implementing a trie-based select() and the performance is better. From the SelectBenchmark, these are JVM numbers for a varying number of options:

impl   options    time ± error    units
linear    4     88.197 ± 0.519  us/op
trie      4     87.662 ± 0.399  us/op
linear    8    139.014 ± 0.787  us/op
trie      8     97.934 ± 0.434  us/op
linear   16    224.262 ± 3.278  us/op
trie     16    123.960 ± 0.360  us/op
linear   32    329.384 ± 1.867  us/op
trie     32    155.407 ± 1.191  us/op
linear   64    620.990 ± 5.640  us/op
trie     64    183.513 ± 0.724  us/op

The PR is not quite ready yet. I need to implement both select() and selectPrefix(). That should be straightforward and the performance numbers should carry over.
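For readers following along, here is a rough sketch of the difference between the two strategies. It is not the Okio implementation: it matches against a String rather than bytes in a buffer, uses a HashMap-based trie, and all class and method names are made up for illustration. Both methods return the index of the first option that is a prefix of the input, or -1.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative comparison only; names and data structures are assumptions.
public final class SelectSketch {
  /** Linear scan: tests each option in turn, so cost grows with the option count. */
  public static int selectLinear(String input, String[] options) {
    for (int i = 0; i < options.length; i++) {
      if (input.startsWith(options[i])) return i;
    }
    return -1;
  }

  /** Trie node: children keyed by the next char; optionIndex marks where an option ends. */
  public static final class Node {
    final Map<Character, Node> children = new HashMap<>();
    int optionIndex = -1;
  }

  /** Builds the trie once, up front, from the full option list. */
  public static Node buildTrie(String[] options) {
    Node root = new Node();
    for (int i = 0; i < options.length; i++) {
      Node node = root;
      for (char c : options[i].toCharArray()) {
        node = node.children.computeIfAbsent(c, k -> new Node());
      }
      if (node.optionIndex == -1) node.optionIndex = i; // first duplicate wins
    }
    return root;
  }

  /** Trie walk: one pass over the input, independent of the number of options. */
  public static int selectTrie(String input, Node root) {
    int best = root.optionIndex; // an empty option matches any input
    Node node = root;
    for (int i = 0; i < input.length(); i++) {
      node = node.children.get(input.charAt(i));
      if (node == null) break;
      // Keep the lowest index so the result agrees with the linear scan.
      if (node.optionIndex != -1 && (best == -1 || node.optionIndex < best)) {
        best = node.optionIndex;
      }
    }
    return best;
  }
}
```

The linear version re-reads the input once per option, while the trie walk touches each input character at most once, which fits the shape of the numbers above: close at 4 options and diverging as the option count grows.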

Yeah, we can make it faster for large numbers of fields. Could you share a sample JSON message and the type it maps to? Or perhaps just the min/mean/max lengths of these fields and whether they share a common prefix?