SqlClient: SqlDataReader.GetSqlDateTime does not support DATETIME2

Describe the bug

It seems the DATETIME2 SQL Server data type is not fully or correctly supported by SqlDataReader.

Using the GetDateTime method, DATETIME2 is handled correctly.

Using the GetSqlDateTime method, a System.InvalidCastException is thrown:

Exception message: Specified cast is not valid
Stack trace: at Microsoft.Data.SqlClient.SqlBuffer.get_SqlDateTime()

To reproduce

[TestMethod]
public void DateTime2Mapping()
{
    string statement = "SELECT CAST('2020-12-11 09:17:12.123456' AS DATETIME2) AS [MyDate]";

    var builder = new SqlConnectionStringBuilder()
    {
        InitialCatalog = "master",
        DataSource = @"any sql connection",
        IntegratedSecurity = true,
    };

    using (var sqlConnection = new SqlConnection(builder.ConnectionString))
    {
        sqlConnection.Open();
        using (SqlCommand command = new SqlCommand(statement, sqlConnection))
        {
            command.CommandType = CommandType.Text;
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Throws InvalidCastException in SqlBuffer.get_SqlDateTime()
                    var dtColDate = reader.GetSqlDateTime(0);
                    Assert.IsFalse(dtColDate.IsNull);
                    Console.WriteLine(dtColDate.Value);
                }
            }
        }
    }
}

Expected behavior

I expect GetSqlDateTime to handle DATETIME2 and return a SqlDateTime struct.
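
For context on why SqlDateTime alone cannot cover DATETIME2 (a point that comes up later in this thread): the struct mirrors the legacy DATETIME range and precision. A minimal stand-alone sketch, independent of SqlClient (the class name is just for illustration):

using System;
using System.Data.SqlTypes;

class SqlDateTimeRangeDemo
{
    static void Main()
    {
        // SqlDateTime mirrors the legacy DATETIME type: 1753-01-01 through 9999-12-31,
        // with roughly 3.33 ms precision.
        Console.WriteLine(SqlDateTime.MinValue.Value);
        Console.WriteLine(SqlDateTime.MaxValue.Value);

        // DATETIME2 allows dates from 0001-01-01 with 100 ns precision; converting
        // such a value to SqlDateTime overflows.
        try
        {
            SqlDateTime tooEarly = new DateTime(1600, 1, 1);
            Console.WriteLine(tooEarly);
        }
        catch (SqlTypeException ex)
        {
            Console.WriteLine(ex.Message); // "SqlDateTime overflow. Must be between ..."
        }
    }
}

So even if GetSqlDateTime stopped throwing InvalidCastException for DATETIME2, values outside that range would still need a wider type such as the SqlDateTime2 discussed further down.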

Further technical details

Microsoft.Data.SqlClient version: 2.1.0
.NET target: Framework 4.7.2
SQL Server version: 15.0.4079.3 (2019)
Operating system: Windows 10

Additional context

I found this issue because I was profiling our code and saw that SqlDataReader.IsDBNull took up a lot of time: we need to check every column for null, since the GetXX methods do not handle DBNull. That led me to the SqlDataReader.GetSqlXX methods, which do handle DBNull.

Basically, I am trying to get rid of this overhead in SqlDataReader:

public override bool IsDBNull(int i)
{
    CheckHeaderIsReady(columnIndex: i);

    SetTimeout(_defaultTimeoutMilliseconds);

    ReadColumnHeader(i);    // header data only

    return _data[i].IsNull;
}

by switching to the IsNull property on the SqlTypes,

i.e.

  public bool IsNull => !this.m_fNotNull;
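
For column types that already have a SqlTypes counterpart, this single-call pattern is available today through the GetSqlXX readers; DATETIME2 is the gap. A small sketch of the pattern (the INT column at ordinal 0 is just an illustrative assumption):

while (reader.Read())
{
    // Returns SqlInt32.Null for a NULL column instead of throwing,
    // so no separate IsDBNull call is needed.
    SqlInt32 quantity = reader.GetSqlInt32(0);
    int value = quantity.IsNull ? 0 : quantity.Value;
}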

About this issue

  • Original URL
  • State: open
  • Created 4 years ago
  • Reactions: 4
  • Comments: 50 (25 by maintainers)


Most upvoted comments

@bjornbouetsmith First off, thanks for submitting the idea and showing where it has performance benefits. @roji and @Wraith2 are valued contributors here and I appreciate their analysis.

We all have jobs and existing priorities, so it's really up to the requestor to make the case for a change and get others to agree that the change should be prioritized over other people's needs. Your needs are obviously very performance-oriented. Implementing the change yourself would be the quickest way forward. You just need to make sure your changes would be accepted, as @JRahnama said.

It’s really up to you to consolidate input and suggestions into a coordinated feature request. As it is, this thread is getting quite long. I suggest adding a proposal summary in your original comment for this issue based on discussions. I’m not sure what the technical proposal is, either. (See CONTRIBUTING for more tips.)

My opinion: I think it would be difficult to push a new type into System.Data.SqlTypes. If you were successful, it would only go into future versions of .NET. Then you would still have to get SqlClient changed to use it. And that would require SqlClient to add a binary specific to that future target. If there is a solution that is limited to SqlClient that you yourself can implement and that fits into the project’s goals of stability, reliability, backwards compatibility, maintainability, and performance, then that sets you up for a better chance of success.

Regards, David

@bjornbouetsmith

When reading millions of rows from the database where every single column can be null, it is a major performance degradation to use the GetXX methods instead of the GetSqlXX methods, since you need to check every single column on every single row, first with IsDBNull and then doing something depending on the return value.

Is this something you're actually seeing in production, or have you benchmarked it? Because a quick benchmark seems to show that the perf of GetSqlDateTime is the same as IsDBNull + GetDateTime (see below). After all, I'd expect GetSqlDateTime to have to do more or less the same null checking internally as you're doing externally when calling GetDateTime.

Benchmark results:


BenchmarkDotNet=v0.12.1, OS=ubuntu 20.04
Intel Core i7-9750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.101
  [Host]     : .NET Core 3.1.10 (CoreCLR 4.700.20.51601, CoreFX 4.700.20.51901), X64 RyuJIT
  DefaultJob : .NET Core 3.1.10 (CoreCLR 4.700.20.51601, CoreFX 4.700.20.51901), X64 RyuJIT


Method          IsNull  Mean      Error    StdDev
--------------  ------  --------  -------  -------
GetDateTime     False   139.7 us  2.68 us  4.48 us
GetSqlDateTime  False   137.2 us  2.73 us  5.58 us
GetDateTime     True    124.4 us  2.38 us  3.25 us
GetSqlDateTime  True    124.2 us  2.45 us  3.35 us
Benchmark code
using System;
using System.Data.SqlTypes;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.Data.SqlClient;

public class Program
{
    const string ConnectionString = @"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0";

    [Params(true, false)]
    public bool IsNull { get; set; }

    SqlConnection _connection;
    SqlCommand _command;

    [GlobalSetup]
    public void Setup()
    {
        _connection = new SqlConnection(ConnectionString);
        _connection.Open();
        _command = new SqlCommand(IsNull
                ? "SELECT CAST(NULL AS datetime)"
                : "SELECT CAST('2020-01-01' AS datetime)",
            _connection);
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        _connection.Dispose();
        _command.Dispose();
    }

    [Benchmark]
    public DateTime GetDateTime()
    {
        using var reader = _command.ExecuteReader();
        reader.Read();
        return reader.IsDBNull(0) ? default : reader.GetDateTime(0);
    }

    [Benchmark]
    public SqlDateTime GetSqlDateTime()
    {
        using var reader = _command.ExecuteReader();
        reader.Read();
        return reader.GetSqlDateTime(0);
    }

    static void Main(string[] args)
        => BenchmarkRunner.Run<Program>();
}

I think what @Wraith2 is suggesting is that you add the files there, and for example just link from the .NET Framework project, and make tests pass for that, then he will add to .NET Core and add docs links and other quirky stuff.

I'm slowly working through merging the two codebases wherever I can, and eventually we'll get to the point where the MS team will be needed to make policy decisions about each of the differences. Until then I'd prefer if you add any new code to the shared folder and add it identically in both projects.

Start with just one build and then replicate the changes into the other. This is one of the reasons I offered to help, because I'm familiar with the project peculiarities. If you can get SqlDateTime2 written, with tests for all of its methods, I can help get it plumbed in.

There is a difference between legacy and obsolete. Obsolete means you shouldn't use it and there is usually a replacement. Legacy means it's old; it's not obsolete, but it's not likely to be expanded. DataTable and the associated items supporting it in System.Data fit into the legacy category as far as I know: they work, they're supported, but they're unlikely to change. Does that sound right, @roji?

I’m not sure about the exact status of the SqlTypes. Re DataTable/DataSet specifically - which is part of ADO.NET rather than SqlClient - I’d say they’re more “archived” than obsolete/legacy. That is, generally speaking no further improvements are being made to those types, and changes are likely to be restricted to bug/security fixes. But note that this isn’t any official status or anything - just the way I see it.

@Wraith2 I never said my issue was about garbage, only about performance, since SqlClient is missing a SqlDateTime2 or equivalent type to represent DATETIME2.

If you use DATETIME2, you are forced to use SqlDataReader.GetDateTime, and since that method does not gracefully handle null, you are then forced to call SqlDataReader.IsDBNull first.

The code path for IsDBNull does a lot of work, which is required because it needs to check that the state is correct before it can check for null, and it ends up accessing a property on SqlBuffer that is basically just a field accessor saying whether or not the data is null. If there were a SqlDateTime2 struct and a GetSqlDateTime2 method, I would not have to call SqlDataReader.IsDBNull and then GetSqlDateTime; I could just retrieve the SqlDateTime2 struct, check its IsNull property, and if false, return the data contained in the struct.

Using my "test case" from above, I am sure you will be able to see the issue by running the code below. Of course the code below is silly, but imagine it being a 200-column row, with every column allowing null, and 50k rows.

for (int x = 0; x < 10000000; x++)
{
	DateTime dateTime;
	if (!reader.IsDBNull(0))
	{
		// Do something
		dateTime = reader.GetDateTime(0);
	}
	else
	{
		dateTime = default;
	}

}
//vs below code which fails with an InvalidCastException
// so you have to imagine calling a GetSqlDateTime2 which returned a SqlDateTime2
// or imagine that SqlDateTime would be able to handle the extended range of DATETIME2 and not throw exceptions

for (int x = 0; x < 10000000; x++)
{
	DateTime dateTime;
	SqlDateTime sqlDateTime = reader.GetSqlDateTime(0);
	if (!sqlDateTime.IsNull)
	{
		dateTime = sqlDateTime.Value;
	}
	else
	{
		dateTime = default;
	}

}

Can you provide an isolated test case that exhibits the behaviour you see around null checks and direct type access causing more garbage than using the sql types? I've always assumed when working on the internals that the sql types would be more costly to use, but if they're not, I can investigate whether there are ways to equalize the performance of the paths.
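
A sketch of the kind of isolated measurement being asked for here, assuming BenchmarkDotNet with MemoryDiagnoser so allocations are reported next to the timings (the connection string, query, and class name are placeholders):

using System;
using System.Data.SqlTypes;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.Data.SqlClient;

[MemoryDiagnoser] // adds Gen 0/1/2 and Allocated columns to the results
public class NullCheckAllocations
{
    const string ConnectionString = "Server=localhost;Database=test;Integrated Security=true";

    SqlConnection _connection;
    SqlCommand _command;

    [GlobalSetup]
    public void Setup()
    {
        _connection = new SqlConnection(ConnectionString);
        _connection.Open();
        // datetime rather than datetime2, since GetSqlDateTime currently throws for datetime2
        _command = new SqlCommand("SELECT CAST(NULL AS datetime)", _connection);
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        _command.Dispose();
        _connection.Dispose();
    }

    [Benchmark(Baseline = true)]
    public DateTime IsDbNullThenGetDateTime()
    {
        using var reader = _command.ExecuteReader();
        reader.Read();
        return reader.IsDBNull(0) ? default : reader.GetDateTime(0);
    }

    [Benchmark]
    public SqlDateTime GetSqlDateTimeWithIsNull()
    {
        using var reader = _command.ExecuteReader();
        reader.Read();
        return reader.GetSqlDateTime(0); // caller would check .IsNull
    }

    static void Main() => BenchmarkRunner.Run<NullCheckAllocations>();
}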