metascraper: [metascraper-amazon] Image selector matches incorrect image

I’m running into an issue where the image value returned by metascraper-amazon is not the main product image. There are actually multiple .a-dynamic-image elements on the page, as seen in the attached screenshot. Can we create some rules that take priority over the current one, like wrapUrl($ => $('#landingImage').attr('src')) or wrapUrl($ => $('.a-dynamic-image').first().attr('src'))?

[Screenshot taken 2018-01-17 at 8:53:27 PM]
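To make the proposal concrete, here is a minimal sketch of the "most specific selector first" idea. The names firstMatch and imageRules are hypothetical and not part of metascraper; the two selectors are the ones suggested above, and $ is assumed to be a cheerio-style function such as the one metascraper passes to its rules.

```javascript
// Hypothetical helper (not a metascraper API): try each rule in order
// and return the first non-empty string it produces.
const firstMatch = (rules, $) => {
  for (const rule of rules) {
    const value = rule($);
    if (typeof value === 'string' && value.trim() !== '') return value;
  }
  return undefined;
};

// Rules ordered from most to least specific. '#landingImage' is assumed
// to be the main product image; '.a-dynamic-image' is the fallback.
const imageRules = [
  $ => $('#landingImage').attr('src'),
  $ => $('.a-dynamic-image').first().attr('src')
];
```

With cheerio installed, something like firstMatch(imageRules, cheerio.load(html)) should return the #landingImage source when that element exists and only fall back to the first .a-dynamic-image otherwise.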

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 2
  • Comments: 20 (7 by maintainers)

Most upvoted comments

Hey @agchou, I think you created your own package to support this new custom rule.

Can you share it with us? I want to improve this in the metascraper-amazon package 😄

@bobber205 it’s probably because your User-Agent header looks automated, as if it were coming from a script (it is), but you should be able to set it to anything you want. I’m forwarding the browser’s User-Agent from the incoming request like this:

  try {
    const { body: html } = await got(url, {
      headers: {
        "User-Agent": req.headers["user-agent"]
      }
    });
    data = await metascraper({ url, html });
    statusCode = 200;
  } catch (err) {
    statusCode = 401;
    data = {
      message: `Scraping the open graph data from "${url}" failed.`,
      suggestion:
        "Make sure your URL is correct and the webpage has open graph data, meta tags or twitter card data."
    };