visit: Non-screen capture save window in client/server crashes engine

Describe the bug

This is happening frequently for a VIP LLNL user.

I am attaching the full set of debug logs here, but I am also including excerpts that strongly suggest the issue is related to IceT: in the failure case, the engine seems never to return from ConvertIceTImageToAVTImage (or never gets into it).

Excerpt from A.engine_par.000.5.vlog for a WORKING save…

Executing SetWinAnnotAttsRPC 2048x2048
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing PrepareUpdatePlotAttsRPC: Mesh_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing UpdatePlotAttsRPC: Mesh_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing PrepareUpdatePlotAttsRPC: FilledBoundary_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing UpdatePlotAttsRPC: FilledBoundary_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing PrepareUpdatePlotAttsRPC: Subset_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing UpdatePlotAttsRPC: Subset_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing RenderRPC for the following plots
   0, 1, 2,
NetworkManager::RenderSetup: annotMode=2
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
VisWinAnnotations::AddAnnotationObject: New Legend object created. It is called "Plot0000".
VisWinAnnotations::AddAnnotationObject: New Legend object created. It is called "Plot0001".
VisWinAnnotations::AddAnnotationObject: New Legend object created. It is called "Plot0002".
GetSize: 2048, 2048
0: viewport: x=195, y=151, w=1853, h=1453
ConvertIceTImageToAVTImage: w=2048, h=2048, keepZ=0, keepA=0
ConvertIceTImageToAVTImage: IceTImage: rgba_ubyte=1, z=0
ConvertIceTImageToAVTImage: Copying image data.
ConvertIceTImageToAVTImage: Not reading back zbuffer data
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
Engine::GatherData:
  writer->MustMergeParallelStreams()=false
  useCompression=false
  respondWithNull=false
  scalableThreshold=-1
  currentTotalGlobalCellCount=0
  cellCountMultiplier=1
exceeded scalable threshold of -1
sending 12583985 bytes to the viewer 73 from strings.
Number of actual direct writes = 2

Excerpt from A.engine_par.000.5.vlog for a FAILED save…

Executing SetWinAnnotAttsRPC 8192x8192
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing PrepareUpdatePlotAttsRPC: Mesh_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing UpdatePlotAttsRPC: Mesh_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing PrepareUpdatePlotAttsRPC: FilledBoundary_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing UpdatePlotAttsRPC: FilledBoundary_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing PrepareUpdatePlotAttsRPC: Subset_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing UpdatePlotAttsRPC: Subset_1.0
Resetting idle timeout to 480 minutes.
Resetting idle timeout to 480 minutes.
Resetting execution timeout to 30 minutes.
Executing RenderRPC for the following plots
   0, 1, 2,
NetworkManager::RenderSetup: annotMode=2
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
Found matching annotation attributes for legend Plot0000
Found matching annotation attributes for legend Plot0001
Found matching annotation attributes for legend Plot0002
VisWinAnnotations::AddAnnotationObject: New Legend object created. It is called "Plot0000".
VisWinAnnotations::AddAnnotationObject: New Legend object created. It is called "Plot0001".
VisWinAnnotations::AddAnnotationObject: New Legend object created. It is called "Plot0002".
GetSize: 8192, 8192
0: viewport: x=779, y=603, w=7413, h=5813
signalhandler_core: SIGBRT!
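
The comparison between the two excerpts can be automated when sifting through many per-processor logs. The following is a minimal sketch, not part of VisIt: the function name `summarize_render` and the sample strings are hypothetical, and the marker strings are copied from the excerpts above. It reports the requested image size, whether any ConvertIceTImageToAVTImage message was logged after the last RenderRPC, and the last message written before the signal handler fired.

```python
import re

# Marker strings copied from the excerpts above.
ICET_MARKER = "ConvertIceTImageToAVTImage"
SIGNAL_MARKER = "signalhandler_core"


def summarize_render(log_text):
    """Summarize the last RenderRPC found in a debug-log excerpt."""
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    starts = [i for i, ln in enumerate(lines)
              if ln.startswith("Executing RenderRPC")]
    if not starts:
        return None  # no render request in this excerpt
    tail = lines[starts[-1]:]

    # Requested image size, taken from the "GetSize: W, H" message.
    size = None
    for ln in tail:
        m = re.match(r"GetSize:\s*(\d+),\s*(\d+)", ln)
        if m:
            size = (int(m.group(1)), int(m.group(2)))

    # Position of the signal-handler message, if the engine crashed.
    signal_at = next((i for i, ln in enumerate(tail)
                      if SIGNAL_MARKER in ln), None)
    before = tail if signal_at is None else tail[:signal_at]
    return {
        "size": size,
        "crashed": signal_at is not None,
        "reached_icet_convert": any(ICET_MARKER in ln for ln in before),
        "last_line_before_signal":
            before[-1] if signal_at is not None and before else None,
    }


# Shortened copies of the two excerpts above, for illustration only.
WORKING = """Executing RenderRPC for the following plots
GetSize: 2048, 2048
0: viewport: x=195, y=151, w=1853, h=1453
ConvertIceTImageToAVTImage: w=2048, h=2048, keepZ=0, keepA=0
"""
FAILED = """Executing RenderRPC for the following plots
GetSize: 8192, 8192
0: viewport: x=779, y=603, w=7413, h=5813
signalhandler_core: SIGBRT!
"""

print(summarize_render(WORKING))
print(summarize_render(FAILED))
```

For the failed excerpt, the summary shows that the viewport message is the last thing logged before the signal, and that no ConvertIceTImageToAVTImage message appears.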

To Reproduce

Steps to reproduce the behavior:

  1. Run VisIt client/server from macOS or Linux (I haven’t tried Windows) to Pascal
  2. Open /usr/gapps/visit/data/multi_ucd3d.silo
  3. Select parallel pvis with 2 nodes and 36 processors
  4. Put up Mesh Plot, Filled Boundary Plot and Subset plot of domains
  5. Turn off material 1 and draw
  6. Now try a flurry of different saves from the Set save options window, adjusting the width, the aspect ratio, and whether to use screen capture. I suspect it will fail only in non-screen-capture mode. In my tests, I often tried really large widths like 8192 and even 16384 (which saves an all-black image except for the legend). Although those large saves did indeed work, the engine would later crash.
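
The flurry of saves in step 6 could be scripted from VisIt's Python CLI instead of clicked through by hand. The sketch below is an assumption-laden illustration, not the procedure actually used in this report: `save_variants` and `run_sweep` are hypothetical helper names, the aspect ratios and output file names are made up, and `v` stands for whatever object exposes the VisIt CLI functions (for example the `visit` module). It assumes the plots from steps 1 to 5 are already drawn in the active window.

```python
import itertools


def save_variants(widths=(2048, 8192, 16384), aspect_ratios=(1.0, 4.0 / 3.0)):
    """Yield (width, height, screen_capture) combinations to try.

    The widths come from the report; the aspect ratios are assumptions.
    """
    for width, aspect, capture in itertools.product(
            widths, aspect_ratios, (False, True)):
        yield width, int(round(width / aspect)), capture


def run_sweep(v, variants):
    """Save the active window once per variant.

    `v` must expose the VisIt CLI functions SaveWindowAttributes,
    SetSaveWindowAttributes and SaveWindow.
    """
    for i, (width, height, capture) in enumerate(variants):
        atts = v.SaveWindowAttributes()
        atts.fileName = "sweep_%03d" % i        # hypothetical output name
        atts.family = 0                         # do not append a counter
        atts.format = atts.PNG
        atts.screenCapture = int(capture)       # 0 = non-screen-capture save
        atts.resConstraint = atts.NoConstraint  # honor width and height as given
        atts.width = width
        atts.height = height
        v.SetSaveWindowAttributes(atts)
        v.SaveWindow()
```

Inside the CLI this would be invoked as `run_sweep(visit, save_variants())` after `import visit`. Since the report notes that large saves sometimes succeed and the engine crashes later, a sweep like this only narrows down which settings precede the crash.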

Expected behavior

Engine should save a window and not crash.

Attachments

Desktop

  • OS and version: macOS and Linux
  • VisIt Version: 3.2.2 in both cases

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

After experimenting more with it I found that it also crashes when using a 1x2 window layout with a serial engine. That’s a lot easier to debug.

I went to the VIP’s office and she showed me a consistent reproducer. I’ve been able to reproduce with multi_ucd3d.silo running with 36 cores on a node of quartz. This is with VisIt 3.2.2.

  1. Start VisIt (visit -v 3.2.2 -debug 5)
  2. Go to a 2x2 window setup
  3. Set scalable rendering to always
  4. Open /usr/gapps/visit/data/multi_ucd3d.silo
  5. Select a 36 node engine in pdebug
  6. Plot a filled boundary plot
  7. Go to window 2
  8. Click on Draw to draw the same filled boundary plot in the second window
  9. Click on Save window (this saves roughly a 1024x1024 image; the width is 1024, but the height isn’t, since the window isn’t quite square)
  10. Go to window 1
  11. Click on Save window

The engine crashes.
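
The GUI steps above map roughly onto VisIt's Python CLI as follows. This is a sketch under stated assumptions rather than a verified reproducer: the function name `reproduce_two_window_save` is hypothetical, the material variable name `mat1` is an assumption about the contents of multi_ucd3d.silo, and step 5 (launching the parallel engine) is left to the host profile because its arguments are site-specific. `v` stands for an object exposing the VisIt CLI functions, such as the `visit` module.

```python
def reproduce_two_window_save(
        v,
        database="/usr/gapps/visit/data/multi_ucd3d.silo",
        variable="mat1"):  # assumed material variable name
    """Replay the two-window save sequence through the VisIt CLI functions."""
    # Step 2: switch to a 2x2 window layout.
    v.SetWindowLayout(4)

    # Step 3: set scalable rendering to "always".
    rend = v.GetRenderingAttributes()
    rend.scalableActivationMode = rend.Always
    v.SetRenderingAttributes(rend)

    # Steps 4 and 6: open the database and draw a filled boundary plot
    # in window 1.
    v.SetActiveWindow(1)
    v.OpenDatabase(database)
    v.AddPlot("FilledBoundary", variable)
    v.DrawPlots()

    # Steps 7 to 9: draw the same plot in window 2 and save it.
    # Re-opening the database is a defensive choice in case window 2
    # has no active source yet.
    v.SetActiveWindow(2)
    v.OpenDatabase(database)
    v.AddPlot("FilledBoundary", variable)
    v.DrawPlots()
    v.SaveWindow()

    # Steps 10 and 11: go back to window 1 and save; per the report,
    # this is the save during which the engine crashes.
    v.SetActiveWindow(1)
    v.SaveWindow()
```

A scripted sequence may not behave identically to the GUI clicks (for example, the GUI may populate window 2 differently), so a clean run of this script would not by itself rule out the bug.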

A secondary bug is that if you start a new engine after the crash, it fails to save the image because of a pipeline usage error. You actually have to clear the window, draw the plots again, and then save the image.

I have attached a log file from processor 0.

A.engine_par.000.5.vlog.txt