crate: UPDATE query crashes node (java.lang.StackOverflowError: null)
CrateDB version: 2.1.6, 2.3.2
JVM version: openjdk version “1.8.0_151” OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12) OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
OS version / environment description: Ubuntu 16.04.2 LTS (GNU/Linux 4.4.0-47-generic x86_64)
Problem description:
UPDATE query crashes node.
[2018-02-16T23:00:01,485][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [s1] fatal error in thread [elasticsearch[s1][bulk][T#2]], exiting
java.lang.StackOverflowError: null
at java.util.HashMap.hash(HashMap.java:339) ~[?:1.8.0_151]
at java.util.HashMap.get(HashMap.java:557) ~[?:1.8.0_151]
at java.util.Collections$UnmodifiableMap.get(Collections.java:1454) ~[?:1.8.0_151]
at io.crate.operation.scalar.DateTruncFunction.intervalAsUnit(DateTruncFunction.java:192) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.operation.scalar.DateTruncFunction.rounding(DateTruncFunction.java:170) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.operation.scalar.DateTruncFunction.compile(DateTruncFunction.java:116) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.operation.BaseImplementationSymbolVisitor.visitFunction(BaseImplementationSymbolVisitor.java:53) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.operation.BaseImplementationSymbolVisitor.visitFunction(BaseImplementationSymbolVisitor.java:39) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.analyze.symbol.Function.accept(Function.java:62) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.analyze.symbol.SymbolVisitor.process(SymbolVisitor.java:32) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.operation.InputFactory$Context.add(InputFactory.java:166) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.executor.transport.TransportShardUpsertAction.resolveSymbols(TransportShardUpsertAction.java:319) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.executor.transport.TransportShardUpsertAction.processGeneratedColumns(TransportShardUpsertAction.java:573) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.executor.transport.TransportShardUpsertAction.prepareUpdate(TransportShardUpsertAction.java:381) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.executor.transport.TransportShardUpsertAction.indexItem(TransportShardUpsertAction.java:240) ~[crate-app-2.1.6.jar:2.1.6]
at io.crate.executor.transport.TransportShardUpsertAction.indexItem(TransportShardUpsertAction.java:261) ~[crate-app-2.1.6.jar:2.1.6]
[repeated 100+ times]
Steps to reproduce:
SCHEMA:
CREATE TABLE IF NOT EXISTS "myschema"."notification" (
    "company_id" STRING GENERATED ALWAYS AS substr("group_id", 1, 6),
    "created_at" TIMESTAMP NOT NULL,
    "deleted_at" TIMESTAMP,
    "details" OBJECT (DYNAMIC) AS (
        "coordinates" GEO_POINT,
        "device_sn" STRING,
        "document_id" STRING,
        "geofence_id" STRING,
        "group_id" STRING,
        "limit" LONG,
        "seal_id" STRING,
        "speed" LONG,
        "user_id" STRING,
        "vehicle_id" STRING
    ),
    "group_id" STRING,
    "id" STRING NOT NULL,
    "month" TIMESTAMP GENERATED ALWAYS AS date_trunc('month', "created_at"),
    "persistent" BOOLEAN NOT NULL,
    "status" STRING NOT NULL,
    "timestamp" TIMESTAMP,
    "type" STRING NOT NULL,
    "updated_at" TIMESTAMP NOT NULL
)
CLUSTERED BY ("id") INTO 6 SHARDS
PARTITIONED BY ("month")
WITH (
    "allocation.max_retries" = 5,
    "blocks.metadata" = false,
    "blocks.read" = false,
    "blocks.read_only" = false,
    "blocks.write" = false,
    column_policy = 'dynamic',
    "mapping.total_fields.limit" = 1000,
    number_of_replicas = '1',
    "recovery.initial_shards" = 'quorum',
    refresh_interval = 1000,
    "routing.allocation.enable" = 'all',
    "routing.allocation.total_shards_per_node" = -1,
    "translog.durability" = 'REQUEST',
    "translog.flush_threshold_size" = 536870912,
    "translog.sync_interval" = 5000,
    "unassigned.node_left.delayed_timeout" = 60000,
    "warmer.enabled" = true,
    "write.wait_for_active_shards" = 'all'
)
QUERY:
update "myschema"."notification"
set "deleted_at" = CURRENT_TIMESTAMP
where "timestamp" < 1518904800000 and "persistent" = false and "deleted_at" is null;
NOTE: The query only fails if there are rows in the table that match the WHERE conditions.
NOTE 2: Transforming the UPDATE into a DELETE query does not crash the node; it works as expected.
About this issue
- State: closed
- Created 6 years ago
- Comments: 17 (11 by maintainers)
@rps-v Great to hear that this solves your issue. That said, this should not happen even when a node crashes with in-flight inserts. I’m closing this issue for now, but I will also try to investigate how documents could end up in this persistent version state. Thanks for reporting.
@rps-v The document that causes the issue has _id: AWGOgCvq3g58eQ8H13w2. As a quick workaround, take a backup of this document with:
select * from "tracknamic"."notification" where _id='AWGOgCvq3g58eQ8H13w2';
and save the output. Then delete it:
delete from "tracknamic"."notification" where _id='AWGOgCvq3g58eQ8H13w2';
and insert it again:
insert into "tracknamic"."notification" (....) values(....)
This document ended up with _version=-4, which causes the stack overflow…
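To check whether a document is in this state before applying the workaround, its internal version can be read directly. This is a minimal sketch, not part of the original comment: it assumes the `_version` system column can be selected alongside `_id` in the CrateDB version in use, and it reuses the table name and `_id` value quoted above.

```sql
-- Sketch only: inspect the internal version of the affected document.
-- _id and _version are CrateDB system columns; the table name and the
-- _id value are the ones quoted in the workaround above.
select _id, _version
from "tracknamic"."notification"
where _id = 'AWGOgCvq3g58eQ8H13w2';
```

A negative `_version` in the result would match the broken state described above. Since the table defines no primary key, the re-inserted row will presumably receive a newly generated `_id`, so running the same query after the workaround should return no rows.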