sharp: Segfaults when under pressure?

I’m trying to debug a segfault while using Gatsby that is potentially in the realm of using sharp.

(I’m a performance eng for Gatsby)

It seems that sites with lots of images can trigger a segfault and we’re not sure when exactly this happens.

I get this fairly frequently with a simple site that just has many images (~1500 for my machine) on the example site posted (in http://github.com/gatsbyjs/gatsby/pull/6291) as a repro;

Clone https://github.com/jp887/gatsby-issue6291-repro Update package.json to set all gatsby deps to * Clear the npm cache (~/.npm) Clear node_modules, in case that’s relevant npm install yarn gen-images 1500 (It triggers with 1500 images for me, this will try to process 9000 images in total) gatsby build

For me, often it segfaults roughly between 25% and 75%.

I’m on ubuntu (xfce) 19.10 Node v8.16.2 (through nvm)

I went through some of the older Gatsby issues relating to this bug. I am able to reproduce it with .simd(true) and .simd(false), and same for .cache(true) and .cache(false).

Reducing the available memory to node does not seem to make a difference (I set it to 500mb, half of the default, but saw no increase in crash rate). Was also able to repro it running single core.

Please let me know how I can help 😃

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 5
  • Comments: 58 (39 by maintainers)

Commits related to this issue

Most upvoted comments

That’s very interesting. I’ll have a look today.

v0.24.0 is now available with a prebuilt libvips v8.9.0.

We’ve now bumped Sharp in Gatsby, affecting the following versions. Thanks again!

  • gatsby-plugin-manifest@2.3.1
  • gatsby-plugin-sharp@2.5.1
  • gatsby-remark-images-contentful@2.2.1
  • gatsby-source-contentful@2.2.2
  • gatsby-theme-blog-core@1.3.2
  • gatsby-theme-blog@1.4.2
  • gatsby-theme-notes@1.2.2
  • gatsby-transformer-sharp@2.4.1
  • gatsby-transformer-sqip@2.2.1
  • gatsby@2.20.2

I can no longer reproduce this problem when using libvips compiled from the lock-for-metadata-changes branch.

Brilliant, thanks John, I’ve been unable to reproduce a segfault using the check-metadata-changes branch of libvips with the latest master branch of sharp.

I think I found what’s making those message before the crash:

https://github.com/libvips/libvips/blob/master/libvips/iofuncs/header.c#L311

If you change that to be:

        /*
        if( G_VALUE_TYPE( value ) == G_TYPE_STRING )
                g_value_init( &meta->value, VIPS_TYPE_REF_STRING );
        else
                g_value_init( &meta->value, G_VALUE_TYPE( value ) );
         */

        g_value_init( &meta->value, 0 );

And try vips copy x.jpg y.jpg you see:

$ vips copy k2.jpg x.v

(vips:14583): GLib-GObject-WARNING **: 16:33:11.184: ../../../gobject/gtype.c:4268: type id '0' is invalid

(vips:14583): GLib-GObject-WARNING **: 16:33:11.184: can't peek value table for type '<invalid>' which is not currently referenced

(vips:14583): GLib-GObject-WARNING **: 16:33:11.184: ../../../gobject/gvalue.c:185: cannot initialize GValue with type '(null)', this type has no GTypeValueTable implementation

The same set of errors, so probably the value param to meta_new() is pointing at a lot of zeros.

meta_cp takes a shallow copy of the list of metadata items on each input image, then calls meta_new() for each item. A metadata item is disappearing (probably) some time between the shallow copy of the list and calling meta_new().

This could be caused by an image being unreffed during the copy, but that seems unlikely to me, the chaos would be even worse.

It’s more likely that some other thread is changing the metadata on the input image during the copy. Could this be happening?

libvips images have a list of arbitrary name/value pairs attached to them, and these are inherited as images are processed. They are used for things like ICC profiles and EXIF data, but user code can add extra metadata too if it’s convenient. Image metadata is supposed to be immutable: after the image has been constructed, it’s all frozen and must not be changed or you get bad behaviour like this.

However, at the moment, there’s no lock mechanism to prevent you changing image metadata. How about adding something to enforce this? It could be very simple, eg:

void
vips_image_set( VipsImage *image, const char *name, GValue *value )
{
        g_assert( name );
        g_assert( value );

        /* If this image is shared, block metadata changes.  
         */
        if( G_OBJECT( image )->ref_count > 1 ) {
                g_info( "can't set metadata \"%s\" on shared image", name );
                return;
        }       
                        
        meta_init( image );
        (void) meta_new( image, name, value );

... etc.

Can confirm, no more segfaults with the yield branch.

It seems I can no longer trigger the problem on this branch. 30 or 40 runs on high system load (aka a Zoom call, lol) without signs of trouble. Thank you! (While the repro above was for about 9k thumbs, I was also able to successfully generate 30k thumbnails, back to back, without signs of the problem)

The yield branch that will become the forthcoming sharp v0.25.0 depends on and provides libvips v8.9.1.

npm i lovell/sharp#yield

sharp v0.25.0 will also include the N-API migration (see #1282) as well as the return of support for 32-bit node.exe on Windows to allow for easier use with Azure.

OK, let’s merge and make 8.9.1.

I had another idea: there’s a left-over int field in VipsImage called Length (don’t ask).

How about setting this to 42 in copy(), then in vips_image_set() verifying that it has the value 42. That should find all cases where we’ve forgotten to call copy before modifying metadata.

I’ve made a branch with this change and quite a few tests now fail. I’ll investigate this weekend.

Sadly I’ve been able to get https://github.com/jp887/gatsby-issue6291-repro to segfault with the latest sharp v0.24.0 (libvips v8.9.0) too:

(sharp:28295): GLib-GObject-WARNING **: 13:22:38.300: ../gobject/gtype.c:4268: type id '0' is invalid
(sharp:28295): GLib-GObject-WARNING **: 13:22:38.301: can't peek value table for type '<invalid>' which is not currently referenced
(sharp:28295): GLib-GObject-WARNING **: 13:22:38.301: ../gobject/gvalue.c:187: cannot initialize GValue with type '(null)', this type has no GTypeValueTable implementation
(sharp:28295): GLib-GObject-CRITICAL **: 13:22:38.301: g_value_type_compatible: assertion 'dest_type' failed
#0  0x00007ffff407d26d in g_type_parent () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#1  0x00007ffff40822bb in ?? () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#2  0x00007ffff4082fac in g_value_transform () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libgobject-2.0.so.0
#3  0x00007fffd5a88917 in ?? () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#4  0x00007fffd5a88987 in ?? () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#5  0x00007fffd5a99045 in vips_slist_map2 () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#6  0x00007fffd5a89064 in vips.image_copy_fields_array () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#7  0x00007fffd5a83bf2 in vips_image_pipeline_array () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#8  0x00007fffd5a83cf3 in vips_image_pipelinev () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#9  0x00007fffd5978636 in ?? () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#10 0x00007fffd5a79c09 in vips_object_build () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#11 0x00007fffd5a85998 in vips_cache_operation_buildp () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#12 0x00007fffd5a8c0dd in vips_call_required_optional () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#13 0x00007fffd5a8cbbf in vips_call_split () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#14 0x00007fffd5978979 in vips_copy () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#15 0x00007fffd5a80c8d in vips_image_decode () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#16 0x00007fffd58d04f8 in ?? () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#17 0x00007fffd5a79c09 in vips_object_build () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#18 0x00007fffd5a85998 in vips_cache_operation_buildp () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips.so.42
#19 0x00007ffff42b6ad5 in vips::VImage::call_option_string(char const*, char const*, vips::VOption*) () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips-cpp.so.42
#20 0x00007ffff42c9893 in vips::VImage::resize(double, vips::VOption*) const () from gatsby-issue6291-repro/node_modules/sharp/build/Release/../../vendor/lib/libvips-cpp.so.42
#21 0x00007ffff4507f15 in PipelineWorker::Execute() () from gatsby-issue6291-repro/node_modules/sharp/build/Release/sharp.node
#22 0x00000000012d1c0e in worker (arg=0x0) at ../deps/uv/src/threadpool.c:122
#23 0x00007ffff7c42182 in start_thread (arg=<optimised out>) at pthread_create.c:486
#24 0x00007ffff7b6bb1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I added the following to its package.json to achieve this:

+  "resolutions": {
+    "sharp": "v0.24.0"
+  },

sharp v0.23.4 now available with the changes in commit bb15cd9. I suspect the changes to libvips discussed here will also be required to fully deal with this.

(I expect the future sharp v0.24.0 to provide a prebuilt version of a future libvips with these fixes, either v8.8.4 or v8.9.0.)