x-ray: Crawling to another site on collection always undefined

This is the example from the documentation, and it works fine:

  x('http://google.com', {
    main: 'title',
    image: x('#gbar a@href', 'title'), // follow link to google images
  }).write('google.json')

Then I tried the Dribbble example, but fetching info from another site:

  var Xray = require('x-ray')
  var x = Xray()

  x('https://dribbble.com', 'li.group', [{
    title: '.dribbble-img strong',
    image: '.dribbble-img [data-src]@data-src',
    // follow each shot link and read the description from that page
    short_description: x('.dribbble-link@href', '.shot-desc p')
  }])
    .paginate('.next_page@href')
    .limit(3)
    .write('results.json')

But I'm getting [ undefined,undefined,undefined] in the results.json file.

Running DEBUG=x-ray node . prints:

x-ray params: {"source":"https://dribbble.com","scope":"li.group","selector":[{"title":".dribbble-img strong","image":".dribbble-img [data-src]@data-src"}]} +0ms
  x-ray starting at: https://dribbble.com +4ms
  x-ray fetching https://dribbble.com +1ms
  x-ray got response for https://dribbble.com with status code: 200 +585ms
  x-ray params: [Circular] +139ms
  x-ray params: [Circular] +4ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767755-Prepare-And-Gather +0ms
  x-ray fetching https://dribbble.com/shots/2767755-Prepare-And-Gather +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767491-Framer-Code-Folds +1ms
  x-ray fetching https://dribbble.com/shots/2767491-Framer-Code-Folds +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767633-BK-Bridge-WIP +1ms
  x-ray fetching https://dribbble.com/shots/2767633-BK-Bridge-WIP +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767710-Tools +1ms
  x-ray fetching https://dribbble.com/shots/2767710-Tools +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767662-Need-for-Speed +0ms
  x-ray fetching https://dribbble.com/shots/2767662-Need-for-Speed +1ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767882-Cheers +0ms
  x-ray fetching https://dribbble.com/shots/2767882-Cheers +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768238-Happy-Birthday-Donald +1ms
  x-ray fetching https://dribbble.com/shots/2768238-Happy-Birthday-Donald +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +4ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767691-Tarot-Magician +1ms
  x-ray fetching https://dribbble.com/shots/2767691-Tarot-Magician +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +3ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768231-Smile-Season +0ms
  x-ray fetching https://dribbble.com/shots/2768231-Smile-Season +1ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767999-Outside-Lands-Patch-Tree +0ms
  x-ray fetching https://dribbble.com/shots/2767999-Outside-Lands-Patch-Tree +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767908-Hover-State-Animated +0ms
  x-ray fetching https://dribbble.com/shots/2767908-Hover-State-Animated +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +3ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768129-Beer-Mat-Pattern +1ms
  x-ray fetching https://dribbble.com/shots/2768129-Beer-Mat-Pattern +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray undefined is not a url. Skipping! +1ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray undefined is not a url. Skipping! +0ms
  x-ray paginate(".next_page@href") => "https://dribbble.com/shots?page=2" +2ms
  x-ray paginating "https://dribbble.com/shots?page=2" +1ms
  x-ray 2 page(s) left to crawl +0ms
  x-ray fetching https://dribbble.com/shots?page=2 +0ms
  x-ray got response for https://dribbble.com/shots/2767633-BK-Bridge-WIP with status code: 200 +558ms
  x-ray got response for https://dribbble.com/shots/2767882-Cheers with status code: 200 +47ms
  x-ray got response for https://dribbble.com/shots/2767491-Framer-Code-Folds with status code: 200 +21ms
  x-ray got response for https://dribbble.com/shots/2767755-Prepare-And-Gather with status code: 200 +32ms
  x-ray got response for https://dribbble.com/shots/2767710-Tools with status code: 200 +18ms
  x-ray got response for https://dribbble.com/shots/2767662-Need-for-Speed with status code: 200 +20ms
  x-ray got response for https://dribbble.com/shots/2768238-Happy-Birthday-Donald with status code: 200 +28ms
  x-ray got response for https://dribbble.com/shots/2768231-Smile-Season with status code: 200 +22ms
  x-ray got response for https://dribbble.com/shots/2767691-Tarot-Magician with status code: 200 +19ms
  x-ray got response for https://dribbble.com/shots/2767908-Hover-State-Animated with status code: 200 +25ms
  x-ray got response for https://dribbble.com/shots/2767999-Outside-Lands-Patch-Tree with status code: 200 +17ms
  x-ray got response for https://dribbble.com/shots?page=2 with status code: 200 +18ms
  x-ray params: [Circular] +27ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768045-DaBull-Final-Logo +0ms
  x-ray fetching https://dribbble.com/shots/2768045-DaBull-Final-Logo +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +0ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767907-Japanese-Games +0ms
  x-ray fetching https://dribbble.com/shots/2767907-Japanese-Games +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767716-Rift +0ms
  x-ray fetching https://dribbble.com/shots/2767716-Rift +2ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768246-Twitched-it-Viper-Shading-Texture-Process +1ms
  x-ray fetching https://dribbble.com/shots/2768246-Twitched-it-Viper-Shading-Texture-Process +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767669-OH-again +0ms
  x-ray fetching https://dribbble.com/shots/2767669-OH-again +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767734-Born-To-Lose +0ms
  x-ray fetching https://dribbble.com/shots/2767734-Born-To-Lose +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767595-Profile-screen-for-upcoming-ios-app +0ms
  x-ray fetching https://dribbble.com/shots/2767595-Profile-screen-for-upcoming-ios-app +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767743-WALL-E +0ms
  x-ray fetching https://dribbble.com/shots/2767743-WALL-E +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768567-Bulk-Edit-Mode-Interaction +1ms
  x-ray fetching https://dribbble.com/shots/2768567-Bulk-Edit-Mode-Interaction +1ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767651-House-On-Stilts +0ms
  x-ray fetching https://dribbble.com/shots/2767651-House-On-Stilts +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +0ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768218-Sad-Iron-Man +0ms
  x-ray fetching https://dribbble.com/shots/2768218-Sad-Iron-Man +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767552-Bridgestone-Icons +0ms
  x-ray fetching https://dribbble.com/shots/2767552-Bridgestone-Icons +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray undefined is not a url. Skipping! +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray undefined is not a url. Skipping! +0ms
  x-ray paginate(".next_page@href") => "https://dribbble.com/shots?page=3" +2ms
  x-ray paginating "https://dribbble.com/shots?page=3" +0ms
  x-ray 1 page(s) left to crawl +0ms
  x-ray fetching https://dribbble.com/shots?page=3 +0ms
  x-ray got response for https://dribbble.com/shots/2768129-Beer-Mat-Pattern with status code: 200 +198ms
  x-ray got response for https://dribbble.com/shots/2768045-DaBull-Final-Logo with status code: 200 +419ms
  x-ray got response for https://dribbble.com/shots/2767716-Rift with status code: 200 +25ms
  x-ray got response for https://dribbble.com/shots/2767669-OH-again with status code: 200 +20ms
  x-ray got response for https://dribbble.com/shots/2768246-Twitched-it-Viper-Shading-Texture-Process with status code: 200 +21ms
  x-ray got response for https://dribbble.com/shots/2767595-Profile-screen-for-upcoming-ios-app with status code: 200 +20ms
  x-ray got response for https://dribbble.com/shots/2767907-Japanese-Games with status code: 200 +18ms
  x-ray got response for https://dribbble.com/shots/2767743-WALL-E with status code: 200 +19ms
  x-ray got response for https://dribbble.com/shots/2767734-Born-To-Lose with status code: 200 +21ms
  x-ray got response for https://dribbble.com/shots/2767651-House-On-Stilts with status code: 200 +18ms
  x-ray got response for https://dribbble.com/shots/2768567-Bulk-Edit-Mode-Interaction with status code: 200 +19ms
  x-ray got response for https://dribbble.com/shots/2767552-Bridgestone-Icons with status code: 200 +15ms
  x-ray got response for https://dribbble.com/shots?page=3 with status code: 200 +72ms
  x-ray params: [Circular] +40ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768596-Self-Portrait +0ms
  x-ray fetching https://dribbble.com/shots/2768596-Self-Portrait +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768053-Firefly-Direction-3 +1ms
  x-ray fetching https://dribbble.com/shots/2768053-Firefly-Direction-3 +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767742-UEFA-EURO-2016-Poster-Series +1ms
  x-ray fetching https://dribbble.com/shots/2767742-UEFA-EURO-2016-Poster-Series +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767553-Dance-Party +0ms
  x-ray fetching https://dribbble.com/shots/2767553-Dance-Party +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768257-Umbrella-Alert +0ms
  x-ray fetching https://dribbble.com/shots/2768257-Umbrella-Alert +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768113-Hand +0ms
  x-ray fetching https://dribbble.com/shots/2768113-Hand +0ms
  x-ray params: [Circular] +1ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767959-Suns-out-guns-out +0ms
  x-ray fetching https://dribbble.com/shots/2767959-Suns-out-guns-out +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767496-Spaceboy-2 +0ms
  x-ray fetching https://dribbble.com/shots/2767496-Spaceboy-2 +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768325-Coastal-Oasis +1ms
  x-ray fetching https://dribbble.com/shots/2768325-Coastal-Oasis +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +2ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768317-2D-3D-mixed-illustration +0ms
  x-ray fetching https://dribbble.com/shots/2768317-2D-3D-mixed-illustration +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2768060-California-Dreamin +0ms
  x-ray fetching https://dribbble.com/shots/2768060-California-Dreamin +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray resolved ".dribbble-link@href" to a https://dribbble.com/shots/2767949-Dapper-Ink-Signage +0ms
  x-ray fetching https://dribbble.com/shots/2767949-Dapper-Ink-Signage +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +0ms
  x-ray resolving to a url: .dribbble-link@href +1ms
  x-ray undefined is not a url. Skipping! +0ms
  x-ray params: [Circular] +0ms
  x-ray params: [Circular] +1ms
  x-ray resolving to a url: .dribbble-link@href +0ms
  x-ray undefined is not a url. Skipping! +0ms
  x-ray reached limit, ending +1ms
  x-ray got response for https://dribbble.com/shots/2768218-Sad-Iron-Man with status code: 200 +162ms
  x-ray got response for https://dribbble.com/shots/2768596-Self-Portrait with status code: 200 +338ms
  x-ray got response for https://dribbble.com/shots/2767553-Dance-Party with status code: 200 +29ms
  x-ray got response for https://dribbble.com/shots/2767742-UEFA-EURO-2016-Poster-Series with status code: 200 +34ms
  x-ray got response for https://dribbble.com/shots/2768053-Firefly-Direction-3 with status code: 200 +34ms
  x-ray got response for https://dribbble.com/shots/2768113-Hand with status code: 200 +19ms
  x-ray got response for https://dribbble.com/shots/2768257-Umbrella-Alert with status code: 200 +15ms
  x-ray got response for https://dribbble.com/shots/2767959-Suns-out-guns-out with status code: 200 +16ms
  x-ray got response for https://dribbble.com/shots/2768325-Coastal-Oasis with status code: 200 +15ms
  x-ray got response for https://dribbble.com/shots/2767496-Spaceboy-2 with status code: 200 +13ms
  x-ray got response for https://dribbble.com/shots/2767949-Dapper-Ink-Signage with status code: 200 +14ms
  x-ray got response for https://dribbble.com/shots/2768317-2D-3D-mixed-illustration with status code: 200 +20ms
  x-ray got response for https://dribbble.com/shots/2768060-California-Dreamin with status code: 200 +55ms

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 4
  • Comments: 19 (2 by maintainers)

Most upvoted comments

+1, there's not much point in this library without it. Thanks for the effort, but it's hard to use this library until this issue is fixed 😭

Just tried this with version 2.0.2 and it is working, if that helps anyone.

It would be really great to have an example of something more complex than Google Images for crawling to the next site.

Got the same problem as @rkmax. As the docs indicate, a breadth-first crawling flow is recommended. So you'd basically finish the root level of pages, then manually iterate over the second level and extend the data crawled from the first level, then proceed with the third. Solving this issue would really confirm x-ray's claim to be the next web scraper. So far, great job; you're definitely on the right track!
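
For anyone who wants to try the breadth-first workaround described in the comment above, here is a rough sketch rather than an official recipe. It scrapes the list page first, then visits each shot link separately and merges the extra field into the matching item. The helper name crawlTwoLevels and the scrape adapter are hypothetical (they are not part of x-ray), the selectors are copied from the issue, and pagination is left out to keep it short.

```javascript
// `scrape` is a hypothetical adapter: scrape(...args) -> Promise of the result.
// One possible way to build it on top of x-ray's callback form (an assumption,
// not something confirmed in this thread):
//   const x = require('x-ray')();
//   const scrape = (...args) =>
//     new Promise((resolve, reject) =>
//       x(...args)((err, data) => (err ? reject(err) : resolve(data))));

async function crawlTwoLevels(scrape, startUrl) {
  // Level 1: scrape the list page and keep the link to each detail page.
  const shots = await scrape(startUrl, 'li.group', [{
    title: '.dribbble-img strong',
    image: '.dribbble-img [data-src]@data-src',
    link: '.dribbble-link@href'
  }]);

  // Level 2: visit each detail page and extend the first-level item with it.
  const results = [];
  for (const shot of shots) {
    if (!shot.link) {
      // Some list items have no link (the "is not a url. Skipping!" log lines).
      results.push(shot);
      continue;
    }
    const detail = await scrape(shot.link, { short_description: '.shot-desc p' });
    results.push(Object.assign({}, shot, detail));
  }
  return results;
}
```

Fetching the detail pages one at a time keeps the sketch simple and makes it obvious which description belongs to which item. Whether this matches how x-ray is meant to be used is a guess on my part.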