npgsql: Resultset with many columns in sequential mode triggers exception
The issue
When executing the following query, Npgsql seems to throw a DataException which rolls up to Dapper (the same statement previously/currently works under 3.2.7):
select * from nfl_game_team_stats ts where (("ts"."team_id" = 11939) OR ("ts"."team_id" = 13994)) AND ("ts"."season_id" = 2017);
This is selecting from a view which is defined as:
CREATE OR REPLACE VIEW nfl_game_team_stats AS SELECT ps.*
FROM (nfl_game g
JOIN nfl_game_team_play_stats ps ON (((ps.game_id = g.game_id) AND (ps.play_id = g.last_play_id))));
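The query is issued through Dapper along these lines (a simplified sketch, not our exact code; the real GameTeamStats model maps all of the view's columns):

// Simplified sketch of the Dapper call site; names mirror the log output below,
// but this is illustrative rather than the exact production code.
using System.Collections.Generic;
using System.Threading.Tasks;
using Dapper;
using Npgsql;

public class GameTeamStats
{
    public long GameId { get; set; }
    public long TeamId { get; set; }
    public int SeasonId { get; set; }
    // ...the real model also defines the remaining ~230 columns.
}

public static class GameTeamStatsQueries
{
    public static async Task<IEnumerable<GameTeamStats>> QueryGameTeamStatsAsync(
        string connectionString, long teamId0, long teamId1, int seasonId)
    {
        const string sql =
            "select * from nfl_game_team_stats ts " +
            "where ((ts.team_id = @TeamId_0) OR (ts.team_id = @TeamId_1)) " +
            "AND (ts.season_id = @SeasonId_2)";

        using (var conn = new NpgsqlConnection(connectionString))
        {
            // Dapper opens the connection if needed and maps result columns to
            // properties (snake_case-to-PascalCase mapping is configured elsewhere).
            return await conn.QueryAsync<GameTeamStats>(
                sql, new { TeamId_0 = teamId0, TeamId_1 = teamId1, SeasonId_2 = seasonId });
        }
    }
}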
In our logs, with Npgsql logging set to trace, we’re seeing the following when executing the above statement. Ultimately Dapper throws an exception as well.
[2018-06-20 11:58:40.211] [NLogLogger.Log] [8] [Debug]: Executing statement(s):
select * from nfl_game_team_stats ts where ((("ts"."team_id" = $1) OR ("ts"."team_id" = $2)) AND ("ts"."season_id" = $3))
Exception thrown: 'System.Data.DataException' in System.Private.CoreLib.dll
[2018-06-20 11:58:40.288] [NLogLogger.Log] [7] [Error]: Breaking connector
[2018-06-20 11:58:40.290] [NLogLogger.Log] [7] [Trace]: Cleaning up connector
[2018-06-20 11:58:40.291] [NLogLogger.Log] [7] [Trace]: Closing connection...
[2018-06-20 11:58:40.291] [NLogLogger.Log] [7] [Trace]: Closing connector
[2018-06-20 11:58:40.291] [NLogLogger.Log] [7] [Debug]: Connection closed
[2018-06-20 11:58:40.568] [StatMilk.Lib.UniversalDataService.UdsConnection+<GetListAsync>d__20`1.MoveNext] [7] [Error]: Method [QueryGameTeamStatsAsync] Error [Error parsing column 1 (game_id=562971428257794 - Int64)] SQL = [select * from nfl_game_team_stats ts where ((("ts"."team_id" = @TeamId_0) OR ("ts"."team_id" = @TeamId_1)) AND ("ts"."season_id" = @SeasonId_2))] Params = @TeamId_0=11939, @TeamId_1=13994, @SeasonId_2=2017]
When running the query above manually from the command line, 46 rows are returned, and none of them have that game_id.
What's odd is that our other sports have views defined in exactly the same way, and those are not throwing this exception. A stored procedure generates the views, so there is zero difference in syntax between the offending view and the others. The join columns on the other sports also have matching types: bigint on game_id, play_id, and last_play_id.
The only distinction I can find between the different sports/tables is that this specific table has 234 columns, while the next largest has 141. I'm not sure whether it's a data reader issue with the column layout or some change in the mapping, which seems fairly substantial in 4.0.0.
Further technical details
- Npgsql version: 4.0.0 (also tried with 4.1.0-ci.1065+sha.bb4a1ef4d)
- PostgreSQL version: 9.6.9
- Operating system: OSX High Sierra (also Windows Server 2012 in AWS)
- Dapper: 1.50.5
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 1
- Comments: 23 (14 by maintainers)
Commits related to this issue
- Fix bug with oversize RowDescription + sequential Fixes #2003 (cherry picked from commit 0a087b937a2a6ecf0858cbe9bc2db04c049f4b66) — committed to npgsql/npgsql by roji 6 years ago
Sorry for the delay - just verified that 4.0.1-ci.1071 works within our application(s) from a cursory test. It'd be very apparent if it didn't work. Specifically, our test case involving these large models successfully returns the same data under 3.2.6 and 4.0.1-ci.1071.
@austindrenski pushed the fix, you can take a look. I think this would have been hard to nail without lots of knowledge of the internals and buffering… Looking at this code again I'm not very happy with how it all works; will rethink buffering for 4.1.0 (related also to #1982).
In any case, build 4.0.1-1070 is complete, can you guys please test with this nupkg from the stable feed?
I'm seeing the source of this; it's a pretty subtle bug in which big messages (a RowDescription in this case) trigger use of the oversize buffer, but the sequential data reader still tries to read from the connection's regular buffer.
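Roughly the shape of the problem (an illustrative model only, not the actual Npgsql internals):

// Illustrative model of the failure mode described above — not Npgsql source code.
using System.IO;

class ConnectorBufferModel
{
    readonly byte[] _regularBuffer = new byte[8192];  // the connection's normal read buffer
    byte[] _oversizeBuffer;                           // allocated when a message doesn't fit
    byte[] _currentBuffer;                            // whichever buffer holds the current message

    public void ReceiveMessage(Stream socket, int messageLength)
    {
        _currentBuffer = messageLength > _regularBuffer.Length
            ? (_oversizeBuffer = new byte[messageLength])   // e.g. a 234-column RowDescription
            : _regularBuffer;
        socket.Read(_currentBuffer, 0, messageLength);      // (read-until-full loop omitted)
    }

    // A sequential reader must then parse out of _currentBuffer; if it keeps
    // consuming from _regularBuffer instead, the column offsets point at the
    // wrong bytes, which would explain the bogus game_id in the error above.
    public byte[] CurrentBuffer => _currentBuffer;
}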
@austindrenski thanks for the help minimizing the repro, unless you object I’ll take it from here…
I just updated the repo to remove the Dapper dependency but kept the core issue intact, and it's pretty specific.
What triggers this assertion is calling ExecuteReader with the CommandBehavior.SequentialAccess flag when the size is too big for the buffer, so it has to reallocate a larger one. If you provide a Read Buffer Size parameter larger than the data length (in this case 8598), then the assertion doesn't fail. For instance, I changed the connection string to include Read Buffer Size=9600 and the assertion failure goes away.
Updated the repo to remove those things. I'm happy to help, but I'm not sure of the intricacies of reading from the stream. Any resources you could point me to that might help me out on that front?
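In raw ADO.NET terms the repro boils down to roughly this (a sketch — connection string values are placeholders; the essential pieces are CommandBehavior.SequentialAccess and, for the workaround, the Read Buffer Size parameter):

using System;
using System.Data;
using Npgsql;

class Repro
{
    static void Main()
    {
        // With the default read buffer the assertion trips on the 234-column result set;
        // appending "Read Buffer Size=9600" to the connection string makes it go away.
        const string connString = "Host=localhost;Database=mydb;Username=me;Password=secret";

        using (var conn = new NpgsqlConnection(connString))
        {
            conn.Open();
            using (var cmd = new NpgsqlCommand(
                "select * from nfl_game_team_stats ts " +
                "where ((ts.team_id = 11939) OR (ts.team_id = 13994)) AND (ts.season_id = 2017)",
                conn))
            // SequentialAccess is the key ingredient; a non-sequential read is unaffected.
            using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
            {
                while (reader.Read())
                    Console.WriteLine(reader.GetInt64(0));   // game_id, the first column
            }
        }
    }
}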
In our real code base we actually care about those other columns. I was just reproducing the specific failure I was seeing. The GameTeamStats class actually has the remaining properties defined and our BL uses those for modeling purposes.
It does seem a bit odd that returning specific properties works while the generic case doesn’t.
Any assertion being tripped inside Npgsql is an issue we need to fix, so I’m definitely interested in understanding exactly what’s going on here… But unfortunately I won’t have any time to dive into this in the coming days. @austindrenski if you can investigate that would be great.