placeholder: Sqlite Error: malformed JSON

Describe the bug When using pelias prepare placeholder with ‘whosonfirst-data-admin-ca-latest.db’, ‘whosonfirst-data-admin-us-latest.db’, ‘whosonfirst-data-postalcode-ca-latest.db’, and ‘whosonfirst-data-postalcode-us-latest.db’, in ‘/data/whosonfirst/sqlite’, download on may 4th, I get the following errors:

Creating extract at /data/placeholder/wof.extract
import...
SqliteError: malformed JSON
    at Statement.run (<anonymous>)
    at DocStore.set (/code/pelias/placeholder/lib/DocStore.js:67:10)
    at Placeholder.insertWofRecord (/code/pelias/placeholder/prototype/wof.js:210:14)
    at DestroyableTransform.insert [as _transform] (/code/pelias/placeholder/cmd/load.js:15:19)
    at DestroyableTransform.Transform._read (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_transform.js:184:1>
    at DestroyableTransform.Transform._write (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_transform.js:172:>
    at doWrite (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_writable.js:428:64)
    at writeOrBuffer (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_writable.js:417:5)
    at DestroyableTransform.Writable.write (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_writable.js:334:11)
    at DestroyableTransform.ondata (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_readable.js:619:20)
INSERT INTO docs (id, json) VALUES ($id, $json)
SqliteError: malformed JSON
    at Statement.run (<anonymous>)
    at DocStore.set (/code/pelias/placeholder/lib/DocStore.js:67:10)
    at Placeholder.insertWofRecord (/code/pelias/placeholder/prototype/wof.js:210:14)
    at DestroyableTransform.insert [as _transform] (/code/pelias/placeholder/cmd/load.js:15:19)
    at DestroyableTransform.Transform._read (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_transform.js:184:1>
    at DestroyableTransform.Transform._write (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_transform.js:172:>
    at doWrite (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_writable.js:428:64)
    at writeOrBuffer (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_writable.js:417:5)
    at DestroyableTransform.Writable.write (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_writable.js:334:11)
    at DestroyableTransform.ondata (/code/pelias/placeholder/node_modules/readable-stream/lib/_stream_readable.js:619:20)
populate fts...
optimize...
close...

Steps to Reproduce

  • git clone https://github.com/getorca/docker
  • install prereqs and configure pelias env for /code/docker/projects/canada-usa
  • pelias download wof
  • pelias prepare placeholder

Expected behavior pelias prepare placeholder extracts wof files and imports into sqlite3.

Environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Docker versions: Docker version 19.03.8, docker-compose version 1.25.0

Additional context Download both US & Canada wof admin & postals

    "whosonfirst": {
      "datapath": "/data/whosonfirst",
      "importPostalcodes": true,
      "countryCode": ["CA", "US"],
      "importPlace": ["85633041", "85633793"]
    }

full pelias config available @ https://github.com/getorca/docker/blob/master/projects/canada-usa/pelias.json

References

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 18 (9 by maintainers)

Most upvoted comments

I’m adding https://github.com/pelias/placeholder/pull/188 so that any future reports will contain enough info in the logs to solve this.

This has been reported twice now, by @getorca and @divE4pip

I had another quick look at the code and it seems that the malformed JSON error should only be generated in one situation, when the json_extract function is being called on an invalid JSON value:

example:

sqlite3 :memory: "SELECT json_extract( 'not valid json', '$.geom.bbox' );"
Error: malformed JSON

however if the JSON is simply empty, or doesn’t contain the property we are targeting (‘geom.bbox’) then it doesn’t emit an error:

sqlite3 :memory: "SELECT json_extract( '{}', '$.geom.bbox' );"

there is a little complexity here as the json_extract function is executed in an SQL trigger but I’m confused how it’s possible to insert invalid JSON from this line https://github.com/pelias/placeholder/blob/master/lib/DocStore.js#L67 which is using JSON.stringify

We host pre-generated copies of the data here https://geocode.earth/data.

I suspect that there may be one or more JSON objects in the source data which are invalid for some reason, so I ran this query against the database:

sqlite3 store.sqlite3 'SELECT * from docs WHERE json_valid(json) = 0;'

This returned zero rows, so I’m still not able to track down the error… could you please try that query against your database and hopefully we can find out which ID is causing it?

In the meantime, if there are only one or two of these errors then you’re likely safe to continue with the build, you’ll only be missing those records and it may not be a reason to discard the whole build.

Another option is to download the copies from our data portal and use those instead.

sadly nope, same issue.

It also seems to affect our download hosted on data.geocode.earth.

I’ve asked the 🤖 to rebuild the file and hope to have that uploaded tonight, pending testing.