puppeteer: [Bug]: page.pdf produces corrupt pdf

Bug description

Steps to reproduce the problem:

Occasionally, our PDFs cannot be opened by any program. We’ve narrowed the issue down to the inclusion of certain images: when these images are present, the PDF created by Puppeteer is corrupt. No image tool reports anything wrong with the images themselves, so I believe this is an issue on the Puppeteer side.

Create a test.html file with the following contents

<html>
  <head>
  </head>
  <body>
    <img src="image.jpg">
  </body>
</html>

In the same directory, place the attached image.jpg

Create a save_to_pdf.js file with the following contents

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto(`file://${__dirname}/test.html`, { waitUntil: 'networkidle0', timeout: 60000 });
  await page.pdf({
    path: 'out.pdf',
    printBackground: true
  });

  await browser.close();
})();

Run node save_to_pdf.js
Try to open out.pdf in any program.
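
Rather than opening the file by hand, the result can also be checked with a short script. The following is a minimal sketch, assuming the corruption shows up as a truncated file (which the related commits suggest, but which is not confirmed for every case). The helper names `looksLikeCompletePdf` and `checkPdfFile` are hypothetical, and the check is a heuristic, not a full PDF validator.

```javascript
const fs = require('fs');

// Heuristic only (an assumption, not a real PDF parser): a complete PDF
// starts with "%PDF-" and carries an "%%EOF" marker near the end of the file.
function looksLikeCompletePdf(buffer) {
  const header = buffer.subarray(0, 5).toString('latin1');
  const tailStart = Math.max(0, buffer.length - 1024);
  const tail = buffer.subarray(tailStart).toString('latin1');
  return header === '%PDF-' && tail.includes('%%EOF');
}

// Convenience wrapper that reads the file from disk first.
function checkPdfFile(path) {
  return looksLikeCompletePdf(fs.readFileSync(path));
}
```

After running save_to_pdf.js, calling `checkPdfFile('out.pdf')` should return false if the output was cut off before the trailer was written.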

Attachments: puppeteer_bug.zip, image.jpg

Puppeteer version

10.1.0

Node.js version

12.22.5

npm version

6.14.14

What operating system are you seeing the problem on?

macOS

Relevant log output

No response

About this issue

State: closed
Created 3 years ago
Comments: 18 (2 by maintainers)

Commits related to this issue

More changes to test puppeteer 10.0.0 vs 10.1.0 specifically. Also replicates the bug at https://github.com/puppeteer/puppeteer/issues/7757 and rules out a Chrome version change as the problem (the ve... — committed to MartinFalatic/puppeteer-explorations by MartinFalatic 3 years ago
fix: page.pdf producing an invalid pdf When defining a chunk size for <CDPSession>.send('IO.read', { handle, size }), the CDPSession will occasionally indicate that it has reached the end of file... — committed to tjacobs3/puppeteer by tjacobs3 3 years ago
fix: page.pdf producing an invalid pdf (#7868) When defining a chunk size for <CDPSession>.send('IO.read', { handle, size }), the CDPSession will occasionally indicate that it has reached the end of ... — committed to puppeteer/puppeteer by tjacobs3 2 years ago
chore: revert #7868 to use the size parameter for streaming PDF (#8145) Issues: #7757 — committed to puppeteer/puppeteer by OrKoN 2 years ago
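
The commits above concern how the PDF stream is read in chunks through `IO.read`. For readers unfamiliar with that loop, here is a minimal sketch of the pattern, not Puppeteer's actual implementation. It assumes a session object with a `send(method, params)` method (as a Puppeteer CDPSession has) and relies on the `data`, `base64Encoded`, and `eof` fields that the DevTools protocol returns from `IO.read`. The names `readStream` and `createFakeSession` are hypothetical, and the fake session exists only so the sketch runs without a browser.

```javascript
// Reads a DevTools protocol stream until the session reports eof.
// If `size` is undefined, the chunk size is left to the browser.
async function readStream(session, handle, size) {
  const chunks = [];
  let eof = false;
  while (!eof) {
    const params = size === undefined ? { handle } : { handle, size };
    const response = await session.send('IO.read', params);
    eof = response.eof;
    const encoding = response.base64Encoded ? 'base64' : 'utf8';
    chunks.push(Buffer.from(response.data, encoding));
  }
  await session.send('IO.close', { handle });
  return Buffer.concat(chunks);
}

// Hypothetical in-memory stand-in for a CDP session, for illustration only.
// Unlike the behaviour described in this issue, it never reports eof early.
function createFakeSession(content, defaultSize = 8) {
  let offset = 0;
  const session = {
    closed: false,
    async send(method, params) {
      if (method === 'IO.close') {
        session.closed = true;
        return {};
      }
      if (method !== 'IO.read') {
        throw new Error(`Unexpected method: ${method}`);
      }
      const size = params.size || defaultSize;
      const chunk = content.subarray(offset, offset + size);
      offset += chunk.length;
      return {
        data: chunk.toString('base64'),
        base64Encoded: true,
        eof: offset >= content.length,
      };
    },
  };
  return session;
}
```

The loop trusts `eof` completely, so if the session reports end of file before all bytes have been delivered, the returned buffer is silently truncated. That is consistent with the symptom described in this issue.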

Most upvoted comments

So the Chromium fix landed in M100 and I plan to create a new Puppeteer release early next week.

OrKoN on Jan 28, 2022

https://github.com/puppeteer/puppeteer/pull/7868 won’t actually fix the problem; it will just move the bug to the 10MB boundary, which makes it less likely to happen (because fewer files are bigger than that). We can still land it as a stopgap until the Chromium fix arrives.

OrKoN on Jan 26, 2022