google-play-scraper: TypeError: Cannot read property 'split' of undefined
version: 5.0.0
invoke params:
gPlay.list({
  category: gPlay.category.GAME,
  collection: gPlay.collection.NEW_FREE,
  lang: 'zh',
  country: 'CN',
  fullDetail: true,
  start: 0,
  num: 100,
}).then(function (appList) {
  //console.log(appList);
  cb(null, appList);
}).catch(function (err) {
  console.error("try to spider googleplay app list failed, err=", err);
  cb(err);
});
log:
at parseFields (/googleplay/node_modules/google-play-scraper/lib/app.js:48:64)
at tryCatcher (/googleplay/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (/googleplay/node_modules/bluebird/js/release/promise.js:512:31)
at Promise._settlePromise (/googleplay/node_modules/bluebird/js/release/promise.js:569:18)
at Promise._settlePromise0 (/googleplay/node_modules/bluebird/js/release/promise.js:614:10)
at Promise._settlePromises (/googleplay/node_modules/bluebird/js/release/promise.js:693:18)
at Async._drainQueue (/googleplay/node_modules/bluebird/js/release/async.js:133:16)
at Async._drainQueues (/googleplay/node_modules/bluebird/js/release/async.js:143:10)
at Immediate.Async.drainQueues (/googleplay/node_modules/bluebird/js/release/async.js:17:14)
at runCallback (timers.js:794:20)
at tryOnImmediate (timers.js:752:5)
at processImmediate [as _immediateCallback] (timers.js:729:5)
About this issue
- State: closed
- Created 6 years ago
- Comments: 26 (14 by maintainers)
@facundoolano looks like Google is doing some A/B testing. The structure and the CSS class naming differ from time to time. Unfortunately, it looks like they are now generating the CSS class names, which results in non-human-readable names (such as `AHFaub`). It could also be that those names change every time a change to the website is made 😦
@tanqhnguyen I’ve created #205 following the logic described above. I think that solution is good as long as Google doesn’t start changing the shape of the data.
I’ve added the first few fields to the result. Now we just need to track down each of the remaining ones in the data object and add the paths (I’ll get to it eventually, but any help to move this forward is appreciated).
So… this is what I have so far
This map contains the basic data needed to query values from the weird Google data structure found on a game page:
- `path` is whatever gets passed to `_.get`
- `from` indicates which “node” the data should be read from
Unfortunately, finding the correct node and its path is purely manual work 😦
And the regex is
I also need to use `vm` to execute the script content matched by the above regex to construct an array of `{key: string, data: () => Array}`.
@tanqhnguyen just a suggestion, it would be neat to parse the scripts into a map object with ‘ds:3’, ‘ds:10’ as the keys, and the arrays as the values.
Then you could express the field paths like this:
There are ramda functions to facilitate extracting data from paths like those.
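To illustrate the idea of paths that start with a 'ds:N' key, here is a minimal sketch. The data, the field names and the paths are all made up for illustration (they are not the real Play Store paths), and `getPath` is a hypothetical stand-in written in plain JavaScript so the snippet has no dependencies; ramda's `R.path` does a similar lookup.

```javascript
// Hypothetical parsed script data, keyed by the 'ds:N' labels.
// The contents are invented for this example; real arrays are far larger.
const scriptData = {
  'ds:3': [['Example Game', null, ['Example Studio']]],
  'ds:10': [[null, 4.5]],
};

// Hypothetical field mappings: the first element selects the map key,
// the remaining elements are indices into the nested arrays.
const MAPPINGS = {
  title: ['ds:3', 0, 0],
  developer: ['ds:3', 0, 2, 0],
  score: ['ds:10', 0, 1],
};

// Walk the path one step at a time and return undefined as soon as
// a step is missing, instead of throwing a TypeError.
function getPath(path, obj) {
  return path.reduce(
    (node, key) => (node === null || node === undefined ? undefined : node[key]),
    obj
  );
}

// Build a result object by resolving every mapped path against the data.
function extractFields(mappings, data) {
  const result = {};
  for (const [field, path] of Object.entries(mappings)) {
    result[field] = getPath(path, data);
  }
  return result;
}

const fields = extractFields(MAPPINGS, scriptData);
console.log(fields);
```

A missing key or index yields `undefined` for that field rather than an exception, which is the failure mode that triggered this issue in the first place.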
Also, don’t use `vm` or `eval` to extract the arrays. You can just remove `AF_initDataCallback(` and replace strings like `data:function(){return` with `data:` to get a proper JSON literal.
This applies only to parsing the application detail page. AFAIK the rest of the parsers are still working (or at least the tests are passing).
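As a rough sketch of the suggestion above to avoid `vm` and `eval`, the snippet below builds a map keyed by 'ds:N' from the script blocks. It is a variation on the string replacement described: instead of rewriting the whole call into JSON, it pulls out the key and the returned array with regular expressions and passes only the array to `JSON.parse`. The shape of the sample script tags, the regexes and the `parseScriptData` name are assumptions made for this example, not the library's actual code.

```javascript
// Assumed shape of the inline scripts on a detail page (invented sample).
const sampleHtml = `
<script>AF_initDataCallback({key: 'ds:3', isError: false, hash: '1', data:function(){return [["Example Game"],null,[1,2]]}});</script>
<script>AF_initDataCallback({key: 'ds:10', isError: false, hash: '2', data:function(){return [[null,4.5]]}});</script>
`;

// One AF_initDataCallback(...) call, up to its first closing "});".
const CALLBACK_RE = /AF_initDataCallback\(\{[\s\S]*?\}\);/g;
// The 'ds:N' key inside a call.
const KEY_RE = /key:\s*'(ds:\d+)'/;
// The array literal returned by the data function.
const DATA_RE = /data:\s*function\s*\(\)\s*\{\s*return\s+([\s\S]*?)\}\s*\}\);/;

// Returns an object such as { 'ds:3': [...], 'ds:10': [...] }.
// Blocks that do not match the assumed shape, or whose array is not
// valid JSON, are skipped instead of being evaluated as code.
function parseScriptData(html) {
  const result = {};
  for (const block of html.match(CALLBACK_RE) || []) {
    const keyMatch = block.match(KEY_RE);
    const dataMatch = block.match(DATA_RE);
    if (!keyMatch || !dataMatch) continue;
    try {
      result[keyMatch[1]] = JSON.parse(dataMatch[1]);
    } catch (err) {
      // Not a plain JSON array; leave this key out.
    }
  }
  return result;
}

const parsed = parseScriptData(sampleHtml);
console.log(Object.keys(parsed));
```

One limitation of this approach: a string inside the data that happens to contain `});` would cut the match short, so that block would be skipped or parsed incompletely.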
Hey folks, looks like some interesting progress around this.
Just a quick thought, is there any plan to support both the older format and the newer format as a fallback?
From what we’ve observed it seems not everything on the play store has migrated to this newer markup (yet) so it might be good to try both parsers. Perhaps only newly published apps or app updates now generate the newer format.
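One possible shape for such a fallback, purely as a sketch: try each parser in order and use the first one that recognises the page. The two parsers below are hypothetical stand-ins (the markers they look for are invented), and `parseWithFallback` is not an existing function of the library.

```javascript
// Try each parser in turn. A parser signals "not my format" by returning
// null or by throwing; the first non-null result wins.
function parseWithFallback(html, parsers) {
  const errors = [];
  for (const { name, parse } of parsers) {
    try {
      const result = parse(html);
      if (result) return { format: name, result };
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(
    `No parser could handle the page (${errors.join('; ') || 'no match'})`
  );
}

// Stand-in parsers, for illustration only.
const parsers = [
  {
    name: 'new',
    // Pretend the newer markup is identified by the inline data scripts.
    parse: (html) =>
      html.includes('AF_initDataCallback') ? { title: 'from script data' } : null,
  },
  {
    name: 'old',
    // Pretend the older markup exposes the title in a readable element.
    parse: (html) => {
      const match = html.match(/<h1 class="title">([^<]*)<\/h1>/);
      if (!match) throw new TypeError('title element not found');
      return { title: match[1] };
    },
  },
];

const oldPage = '<h1 class="title">Legacy App</h1>';
const newPage = '<script>AF_initDataCallback({key: "ds:3"});</script>';

console.log(parseWithFallback(oldPage, parsers));
console.log(parseWithFallback(newPage, parsers));
```

Collecting the per-parser errors keeps the final message useful when neither format matches, rather than surfacing only the last `TypeError`.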