logstash-input-jdbc: Input crashes on TZInfo::AmbiguousTime Error when setting jdbc_default_timezone

Hi there! First of all - thanks a lot for this plugin - it’s great!

However, when using my MySQL DB’s timestamps in Elasticsearch/Kibana I noticed they were off by one hour, since the MySQL timestamps are interpreted as UTC but are in fact CET.

So I added: jdbc_default_timezone => "Europe/Berlin"

But now my Logstash pipeline crashes fatally, resulting in an endless loop of trying to read data from MySQL with this error:

A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Jdbc jdbc_connection_string=>"jdbc:mysql://localhost:3306/MYDBNAME?zeroDateTimeBehavior=convertToNull", jdbc_user=>"MYDBUSER", jdbc_password=><password>, jdbc_driver_library=>"/driver/mysql-connector-java-5.1.35-bin.jar", jdbc_driver_class=>"com.mysql.jdbc.Driver", jdbc_default_timezone=>"Europe/Berlin", statement_filepath=>"/config/queries/shop_order_item.sql", type=>"shop_order_item", codec=><LogStash::Codecs::Plain charset=>"UTF-8">, jdbc_paging_enabled=>false, jdbc_page_size=>100000, jdbc_validate_connection=>false, jdbc_validation_timeout=>3600, jdbc_pool_timeout=>5, sql_log_level=>"info", parameters=>{"sql_last_value"=>1970-01-01 00:00:00 UTC}, last_run_metadata_path=>"/var/lib/logstash/.logstash_jdbc_last_run", use_column_value=>false, clean_run=>false, record_last_run=>true, lowercase_column_names=>true>
  Error: TZInfo::AmbiguousTime: 2015-10-25T02:49:45+00:00 is an ambiguous local time. {:level=>:error}

I’m aware that this is probably more of a tzinfo-related problem (https://github.com/tzinfo/tzinfo/issues/32), but I don’t see any other way to make it work correctly.

Exporting a UNIX timestamp from MySQL and using a date filter combined with the "UNIX" timestamp parser is (for whatever reason) painfully slow, resulting in jammed Elasticsearch input queues.
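
For reference, the date-filter part of that workaround looked roughly like this (the column name order_ts_unix is just a placeholder):

filter {
  date {
    # parse the exported UNIX timestamp column into the event timestamp
    match => ["order_ts_unix", "UNIX"]
    target => "@timestamp"
  }
}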

Best - Max

About this issue

  • Original URL
  • State: open
  • Created 8 years ago
  • Reactions: 2
  • Comments: 26 (4 by maintainers)

Most upvoted comments

Since v5.4.0 of this plugin (which is now distributed as a part of the JDBC Integration plugin for Logstash), you can specify the behaviour when encountering an ambiguous timestamp:

While it is common to store local times in SQL’s timestamp column type, many timezones change their offset during the course of a calendar year and therefore cannot be used with SQL’s timestamp type to represent an ordered, continuous timeline. For example in the America/Chicago zone when daylight saving time (DST) ends in the autumn, the clock rolls from 01:59:59 back to 01:00:00, making any timestamp in the 2-hour period between 01:00:00CDT and 02:00:00CST on that day ambiguous.

When encountering an ambiguous timestamp caused by a DST transition, the query will fail unless the timezone specified here includes a square-bracketed instruction for how to handle overlapping periods (such as: America/Chicago[dst_enabled_on_overlap:true] or Australia/Melbourne[dst_enabled_on_overlap:false]).
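
For example, a minimal sketch based on the configuration from the original report (connection string, credentials, driver path, and statement file are placeholders) would be:

input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/MYDBNAME?zeroDateTimeBehavior=convertToNull"
    jdbc_user => "MYDBUSER"
    jdbc_password => "MYDBPASSWORD"
    jdbc_driver_library => "/driver/mysql-connector-java-5.1.35-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    statement_filepath => "/config/queries/shop_order_item.sql"
    # resolve ambiguous local times during the DST fall-back as daylight saving time
    jdbc_default_timezone => "Europe/Berlin[dst_enabled_on_overlap:true]"
  }
}

Using dst_enabled_on_overlap:false instead resolves the overlap to the standard (non-DST) offset.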

Logstash JDBC Input Plugin – jdbc_default_timezone – Ambiguous Timestamps


Versions of this plugin with this feature were first shipped with Logstash v8.6.0.

Logstash 7.6+ and 8.0+ ship with the JDBC Integration plugin, so it can be updated in the usual way:

bin/logstash-plugin update logstash-integration-jdbc

On Logstash < 7.5 you will need to remove the stand-alone JDBC plugins before installing the integration plugin, which will look something like:

bin/logstash-plugin remove logstash-input-jdbc
bin/logstash-plugin remove logstash-filter-jdbc_static
bin/logstash-plugin remove logstash-filter-jdbc_streaming

bin/logstash-plugin install logstash-integration-jdbc

I don’t really understand why this is not labeled as a bug. If I understand this correctly, the jdbc_default_timezone cannot be used in countries using daylight saving time.

Good point - I don’t know either. If one of the contributors of this plugin could confirm or deny that, we could keep this ticket here or move it to the proper Elastic project instead.

I guess in the end there should be a global or plugin-specific setting, leading to one of these calls:

tz.local_to_utc(d) { |periods| periods.first } or tz.local_to_utc(d) { |periods| periods.last }

as mentioned in the linked tzinfo issue.

As far as I understand the issue, logstash-input-jdbc would have to let us configure (as tzinfo does) which fallback to apply (pick the first or the second point in time) in case of an ambiguous timestamp. As long as it doesn’t do this, it’s somewhat “correct” to fail fatally.

Here is my understanding of the whole thing - please correct me if I’m wrong:

  • We have source timestamps in, let’s say, the CE(S)T timezone, which uses daylight saving time.
  • Events with timestamps around the autumn clock change can’t be mapped precisely, because at 3am the clocks are set back to 2am, so every local timestamp between 2am and 3am exists twice.
  • Thus events within this time range can’t be mapped unambiguously and cause a fatal error.
  • As far as I understand, tzinfo has an option to force such timestamps to either the earlier or the later UTC equivalent.
  • If we could configure that in logstash-input-jdbc’s configuration as well and pass it on to tzinfo, we could get rid of the problem at the cost of losing the exact time during the clock change, which can’t be avoided for the logical reasons pointed out before.

I know - this is all evil stuff and everyone should only write UTC timestamps to avoid such awkwardness, but in reality there are a lot of databases out there logging in local timezones with daylight saving time shifts - so support for that would be great 😃