clickhouse-java: Format Timestamp for DateTime64 Parameter with millisecond precision not working

When setting a Timestamp for a DateTime64 parameter with millisecond precision on a prepared statement the milliseconds are lost when inserting.

This issue occurs for clickhouse-jdbc version 0.3.0

Example:

Create the following table

CREATE TABLE IF NOT EXISTS test (
    date DateTime64(3),
    value Decimal64(8)

And then insert a record with a Timestamp with millisecond precision:

  ClickHouseConnection conn = getConnection();
  PreparedStatement stmt = conn.prepareStatement("INSERT INTO test (date, value) VALUES (?, ?);");
  long date = 1617028535604L; // Mon Mar 29 2021 14:35:35.604
  stmt.setTimestamp(4, new Timestamp(date));
  stmt.executeUpdate();
  stmt.close();
  conn.close();

When selecting the inserted record you will see that milliseconds are lost, the DateTime64 returned by ClickHouse is then 2021-03-29 16:35:35.000 but it should be 2021-03-29 16:35:35.604.

I have pinpointed the problem to the formatTimestamp function

You can see with the following simple example:

    public static void main(String[] args) {
        long date = 1617028535604L; // Mon Mar 29 2021 14:35:35.604
        Timestamp ts = new Timestamp(date);
        String formatted = ClickHouseValueFormatter.formatTimestamp(ts, TimeZone.getTimeZone("UTC"));
        System.out.println("SQL Timestamp: " + ts);
        System.out.println("ClickHouse JDBC Formatted Timestamp: " + formatted);
    }

This will output:

SQL Timestamp: 2021-03-29 16:35:35.604
ClickHouse JDBC Formatted Timestamp: 2021-03-29 14:35:35

But the ClickHouse JDBC Formatted Timestamp should be 2021-03-29 14:35:35.604

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 3
  • Comments: 16 (1 by maintainers)

Most upvoted comments

To summarize what we discussed and items on my list regarding this:

  1. client-side prepared statement(using parse handler) aiming to support simple scenarios only For example:

    • client sends query insert into my_table(c1, c3, c5) values(1, ?, ?) along with 2 parameters
    • jdbc driver parses the query to understand its structure, and executes select c3, c5 from my_table where 1=0 Format TabSeparatedWithNamesAndTypes to get column types
    • and then jdbc driver executes substituted query like insert into my_table(c1, c3, c5) values(1, 'a', '2021-08-29 12:12:12.123456') accordingly
  2. batch insert optimization to avoid common mistakes Apparently the parsing and extra query will slow things down, but it can help to improve performance of batch insert. Let’s reuse above example here, if we change the query to below, we no longer have to execute many inserts but only one with streamed data:

    insert into my_table(c1, c3, c5)
    select 1 as c1, c3, c5
    from input('c3 String, c5 DateTime64(6)')
    Format RowBinary
    
  3. named parameter with or without type - try not to be smart and let the caller to decide Above two are workarounds for clickhouse-jdbc, the JDBC driver. Named parameter on the other hand will be applied to both jdbc driver and the upcoming clickhouse-client, successor of extended API, which I believe is a better way to address the issue.

    // localhost:9100
    ClickHouseNode server = ClickHouseNode.builder().withPort(ClickHouseProtocol.GRPC).build();
    // prefer to use grpc implementation
    try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.GRPC)) {
      // reusable request object, an immutable copy will be made when calling its send() method
      Mutation request = client.connect(server).write().format(ClickHouseFormat.RowBinary);
    
      Map<String, ClickHouseValue> namedParams = new HashMap<>();
      namedParams.put("c3", ClickHouseStringValue.of("a"));
      namedParams.put("c5", ClickHouseDateTimeValue.of("2021-08-29 12:12:12.123456789"));
    
      // named parameter without type
      request.sql("insert into my_table(c1, c3, c5) values(1, :c3, :c5)").params(namedParams).send();
      // with type
      request.sql("insert into my_table(c1, c3, c5) values(2, :c3(String), :c5(DateTime64(6))")
        // parameters can also be specified as array / collection
        .params(ClickHouseStringValue.of("a"), ClickHouseDateTimeValue.of(LocalDateTime.now())).send();
      // batch insert using raw data
      request
        .sql("insert into my_table(c1, c3, c5) select 3 as c1, c3, c5 from input('c3 String, c5 DateTime64(6)')")
        .data("/tmp/data.bin").send();
    }
    
  4. parsing errors Two issues:

    1. parsing date time with fraction is not supported(see this) - we probably should try something like below:
    static final DateTimeFormatter DATETIME_FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("yyyy-MM-dd HH:mm:ss").appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true).toFormatter();
    
    1. has problem to distinguish parameter placeholder ? and conditional operator ?:(see this), which is a known issue. Named parameter does not have this issue as of now.

I think it should be implemented in CH https://github.com/ClickHouse/ClickHouse/issues/28300

Thanks @PHameete, I think it’s a good idea of adding ClickHouseDateTime* to distinguish those types. Timestamp can be mapped to DateTime32 to simplify usage. Alternatively, we can convert timestamp to long and let ClickHouse server to figure it out.

Anyway, I’m working on a few PRs to 1) use RowBinary format in query and mutation instead of TabSeparated(text-based); and 2) client-side prepared statement(additional query to get metadata and then start batch insertion etc.). I believe both of them can address the issue but it takes time. I hope I can get back to this next week or so.