OpenBLAS: Segfault in open_shmem()
I’m experiencing some weird segmentation faults that I can’t explain or reliably reproduce:
#0 0x00007fe7d24df7b8 in open_shmem () at init.c:662
#1 gotoblas_affinity_init () at init.c:810
#2 0x00007fe7d22b3057 in gotoblas_init () at memory.c:1412
#3 0x00007fe7f5e07d93 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#4 0x00007fe7f5e0ccea in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#5 0x00007fe7f434818f in _dl_catch_error () from /lib64/libc.so.6
#6 0x00007fe7f5e0c1f9 in _dl_open () from /lib64/ld-linux-x86-64.so.2
#7 0x00007fe7f4b30f26 in dlopen_doit () from /lib64/libdl.so.2
#8 0x00007fe7f434818f in _dl_catch_error () from /lib64/libc.so.6
#9 0x00007fe7f4b316a5 in _dlerror_run () from /lib64/libdl.so.2
#10 0x00007fe7f4b30fb1 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#11 0x0000557c8679bd7f in internal_load_library (
libname=libname@entry=0x557c8b5825c8 "/usr/lib64/pgsql/rtpostgis-2.3.so") at dfmgr.c:226
#12 0x0000557c8679c663 in load_external_function (filename=<optimized out>,
funcname=funcname@entry=0x557c8b582458 "RASTER_in",
signalNotFound=signalNotFound@entry=1 '\001', filehandle=filehandle@entry=0x7ffe3257dc70)
at dfmgr.c:105
#13 0x0000557c8679c663 in load_external_function (filename=<optimized out>,
funcname=funcname@entry=0x557c8b582458 "RASTER_in",
signalNotFound=signalNotFound@entry=1 '\001', filehandle=filehandle@entry=0x7ffe3257dc70)
at dfmgr.c:105
#14 0x0000557c864d57fc in fmgr_c_validator (fcinfo=<optimized out>) at pg_proc.c:825
#15 0x0000557c8679e7e8 in OidFunctionCall1Coll (functionId=functionId@entry=2247,
collation=collation@entry=0, arg1=arg1@entry=1229779) at fmgr.c:1592
#16 0x0000557c864d50d6 in ProcedureCreate (procedureName=<optimized out>,
procNamespace=procNamespace@entry=2200, replace=<optimized out>,
returnsSet=returnsSet@entry=0 '\000', returnType=returnType@entry=1229778,
proowner=16384, languageObjectId=13, languageValidator=2247,
prosrc=0x557c89dce428 "RASTER_in", probin=0x557c89dce3f8 "$libdir/rtpostgis-2.3",
isAgg=0 '\000', isWindowFunc=0 '\000', security_definer=0 '\000', isLeakProof=0 '\000',
isStrict=1 '\001', volatility=105 'i', parallel=115 's', parameterTypes=0x557c8b5811f0,
allParameterTypes=0, parameterModes=0, parameterNames=0, parameterDefaults=0x0,
trftypes=0, proconfig=0, procost=procost@entry=1, prorows=prorows@entry=0)
at pg_proc.c:728
#17 0x0000557c865456e3 in CreateFunction (stmt=stmt@entry=0x557c89dce808,
queryString=queryString@entry=0x557c890e6998 '\n' <repeats 14 times>, "-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n--\n--\n-- PostGIS - Spatial Types for PostgreSQL\n-- http://postgis.net\n-- Copyright 2001-2003 Refractions Research"...)
at functioncmds.c:1083
#18 0x0000557c8669d52e in ProcessUtilitySlow (parsetree=parsetree@entry=0x557c89dce808,
queryString=queryString@entry=0x557c890e6998 '\n' <repeats 14 times>, "-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n--\n--\n-- PostGIS - Spatial Types for PostgreSQL\n-- http://postgis.net\n-- Copyright 2001-2003 Refractions Research"...,
context=context@entry=PROCESS_UTILITY_QUERY, params=params@entry=0x0,
completionTag=completionTag@entry=0x0, dest=<optimized out>) at utility.c:1376
#19 0x0000557c8669c38c in standard_ProcessUtility (parsetree=0x557c89dce808,
queryString=0x557c890e6998 '\n' <repeats 14 times>, "-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n--\n--\n-- PostGIS - Spatial Types for PostgreSQL\n-- http://postgis.net\n-- Copyright 2001-2003 Refractions Research"...,
context=PROCESS_UTILITY_QUERY, params=0x0, dest=<optimized out>, completionTag=0x0)
at utility.c:907
#20 0x0000557c8653bfa8 in execute_sql_string (
filename=0x557c88ad7d78 "/usr/share/pgsql/extension/postgis--2.3.2.sql",
sql=0x557c890e6998 '\n' <repeats 14 times>, "-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n--\n--\n-- PostGIS - Spatial Types for PostgreSQL\n-- http://postgis.net\n-- Copyright 2001-2003 Refractions Research"...) at extension.c:750
#21 execute_extension_script (extensionOid=extensionOid@entry=1228888,
control=control@entry=0x557c88b760c0, from_version=from_version@entry=0x0,
version=version@entry=0x557c88b76160 "2.3.2", requiredSchemas=requiredSchemas@entry=0x0,
schemaName=schemaName@entry=0x557c88b760a8 "public", schemaOid=2200) at extension.c:910
#22 0x0000557c8653d05b in CreateExtensionInternal (parents=parents@entry=0x0,
stmt=<optimized out>, stmt=<optimized out>) at extension.c:1502
#23 0x0000557c8653d58d in CreateExtension (stmt=stmt@entry=0x557c88b35ec8)
at extension.c:1560
#24 0x0000557c8669d079 in ProcessUtilitySlow (parsetree=parsetree@entry=0x557c88b35ec8,
queryString=queryString@entry=0x557c88b354a8 "CREATE EXTENSION postgis",
context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0,
completionTag=completionTag@entry=0x7ffe3257f220 "", dest=<optimized out>)
at utility.c:1296
#25 0x0000557c8669c38c in standard_ProcessUtility (parsetree=0x557c88b35ec8,
queryString=0x557c88b354a8 "CREATE EXTENSION postgis", context=PROCESS_UTILITY_TOPLEVEL,
params=0x0, dest=<optimized out>, completionTag=0x7ffe3257f220 "") at utility.c:907
#26 0x0000557c8669992a in PortalRunUtility (portal=0x557c88ad19e8,
utilityStmt=0x557c88b35ec8, isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>,
dest=0x557c88b36208, completionTag=0x7ffe3257f220 "") at pquery.c:1193
#27 0x0000557c8669a464 in PortalRunMulti (portal=portal@entry=0x557c88ad19e8,
isTopLevel=isTopLevel@entry=1 '\001', setHoldSnapshot=setHoldSnapshot@entry=0 '\000',
dest=dest@entry=0x557c88b36208, altdest=altdest@entry=0x557c88b36208,
completionTag=completionTag@entry=0x7ffe3257f220 "") at pquery.c:1349
#28 0x0000557c8669b039 in PortalRun (portal=0x557c88ad19e8, count=9223372036854775807,
isTopLevel=<optimized out>, dest=0x557c88b36208, altdest=0x557c88b36208,
completionTag=0x7ffe3257f220 "") at pquery.c:815
#29 0x0000557c866988e0 in PostgresMain (argc=<optimized out>, argv=<optimized out>,
dbname=<optimized out>, username=<optimized out>) at postgres.c:1086
#30 0x0000557c86427ea2 in BackendRun (port=0x557c88ad82a0) at postmaster.c:4294
#31 BackendStartup (port=0x557c88ad82a0) at postmaster.c:3968
#32 ServerLoop () at postmaster.c:1719
#33 0x0000557c86638777 in PostmasterMain (argc=3, argv=0x557c88ab2420) at postmaster.c:1327
#34 0x0000557c864295e9 in main (argc=3, argv=0x557c88ab2420) at main.c:228
All of that is triggered by a simple “CREATE EXTENSION postgis;” in PostgreSQL.
I spent the last three hours trying to isolate this issue, and it seems to be related to shared memory created by a different process (manage.py shell from Django), which creates two shared memory segments. Apart from that, I don’t see how this process could cause the crash, as it doesn’t even connect to the database.
Since the segfault happens inside open_shmem(), I guess something must be wrong there.
I have some trouble reproducing the behavior, as it doesn’t happen every time, but I made sure to dump a core file of the segfaulted PostgreSQL process (275 MB, 9 MB bzipped).
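One way to check which process created the shared memory segments is to read /proc/sysvipc/shm on Linux, which lists the same System V segments that `ipcs -m` shows. The following is only a sketch: it assumes the segments in question are System V segments (the report does not confirm this), and the function name parse_sysv_shm is made up for illustration.

```python
from pathlib import Path


def parse_sysv_shm(table: str) -> list[dict[str, str]]:
    """Parse the text of /proc/sysvipc/shm into one dict per segment.

    The first line is a header (key, shmid, perms, size, cpid, ...);
    every following line describes one segment. Values are kept as
    strings because 'perms' is printed in octal by the kernel.
    """
    lines = table.strip().splitlines()
    if not lines:
        return []
    columns = lines[0].split()
    return [dict(zip(columns, row.split())) for row in lines[1:] if row.strip()]


def main() -> None:
    proc_file = Path("/proc/sysvipc/shm")  # Linux only
    if not proc_file.exists():
        print("no /proc/sysvipc/shm on this system")
        return
    for seg in parse_sysv_shm(proc_file.read_text()):
        # The key is printed as a signed decimal; mask it to show it as hex.
        key = int(seg["key"]) & 0xFFFFFFFF
        print(
            f"key=0x{key:08x} shmid={seg['shmid']} perms={seg['perms']} "
            f"size={seg['size']} creator_pid={seg['cpid']} attached={seg['nattch']}"
        )


if __name__ == "__main__":
    main()
```

Running this before and after starting the other process shows which segments it adds, and the cpid column can be compared against that process's PID.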
My system specs:
$ uname -a
Linux blackhole.hq.terreon.de 4.13.9-200.fc26.x86_64 #1 SMP Mon Oct 23 13:52:45 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: Fedora
Description: Fedora release 26 (Twenty Six)
Release: 26
Codename: TwentySix
About this issue
- State: closed
- Created 7 years ago
- Comments: 22 (7 by maintainers)
I just tested the patch. Now I get the following error message in my PostgreSQL log and it doesn’t crash: