efcore: OrderBy is ignored when used with Distinct and then Concatenated

I’m selecting from few different tables, projecting to a common model object and concatenating them (translated as UNION ALL) to one single result collection. I also apply OrderBy to each select. Example:

var result = dbSetA.Where(...).AsModel().OrderBy(x => x.Name)
.Concat(dbSetB.Where(...).AsModel().OrderBy(x => x.Name))
.ToList();

This example works as expected. All subqueries are order by name and then UNION ALL is applied. the result is [Sorted result of dbSetA] + [Sorted result of dbSetB]

However when I apply Distinct (doesn’t matter on what position in the subquery), the OrderBy is ignored (removed from translated query) example:

var result = dbSetA.Where(...).AsModel().Dictinct().OrderBy(x => x.Name)
.Concat(dbSetB.Where(...).AsModel().Dictinct().OrderBy(x => x.Name))
.ToList();

In this case the translated subqueries contain Distinct but no Order By, so the result is [Unsorted distinct result of dbSetA] + [Unsorted distinct result of dbSetB]

I believe this is a bug.

The only workaround I was able to find was to execute subqueries separately and merge using List methods: example:

var result =  dbSetA.Where(...).AsModel().Dictinct().OrderBy(x => x.Name).ToList();
result.AddRange(dbSetB.Where(...).AsModel().Dictinct().OrderBy(x => x.Name).ToList());

About this issue

  • Original URL
  • State: open
  • Created 2 years ago
  • Comments: 18 (15 by maintainers)

Most upvoted comments

Yeah… Their intention wasn’t bad - they don’t allow ORDER BY in subqueries without top/offset precisely to “communicate” that it doesn’t make sense. But then you can just work around that restriction with OFFSET 0, so… 😃

For a bit of an anecdote… EF Core has very extensive “specification tests” which any provider can (and should) implement; these ensure that the provider is behaving correctly etc. Every release there’s a handful of new tests which implicitly depend on ordering which isn’t guaranteed, but which happens to be good on SQL Server. When I run those tests on PostgreSQL they fail, since PostgreSQL tends to not preserve behavior and move things around, when order isn’t explicitly specified. Just a bit of fun I go through.

[…] A quirk of SQL Server

@stevendarby the fact that ordering isn’t preserved for rows coming out of subqueries isn’t a quirk: it’s a fundamental point in the SQL language in general… A good way to see this is the following:

SELECT * FROM (SELECT 1 AS foo ORDER BY foo) AS bar;

Trying to run this errors with: The ORDER BY clause is invalid ... unless TOP, OFFSET or FOR XML is also specified. In other words, SQL Server only allows ORDER BY in subqueries for the purpose of determining which rows come out, and not how they’re ordered.

A way you can get ordering working in sub-queries is to use OFFSET 0 ROWS instead:

If you see ordered rows coming out, that’s not guaranteed; it’s accidental and could change, and programs cannot rely on it.

You’re right that above, the loss of order is related more to the subquery and less to UNION ALL specifically, but it doesn’t change the fact that there’s no way to guarantee that ordering gets preserved in the operands of UNION ALL.