rails: ActiveRecord `pluck(:name)` returning different number of results than without `pluck`
Steps to reproduce
Have a Project model with a has_many
improvements. Improvements and projects have a name
column.
Run these 2 queries:
Project.includes(:improvements).where(improvements: { name: 'one' })
Project.includes(:improvements).where(improvements: { name: 'one' }).pluck(:name)
Expected behavior
I expect each query to return the same number of results (i.e. running #count
on them will return the same number)
Actual behavior
The query with the pluck(:name)
in it returns a different result set with many duplicates.
Without a pluck it runs this query:
SELECT DISTINCT "projects"."id" FROM "projects" LEFT OUTER JOIN "improvements" ON "improvements"."project_id" = "projects"."id" WHERE "improvements"."name" = $1 LIMIT $2
Followed by another long query to get all the actual data. However, thanks to the DISTINCT
restriction, no duplicate data is returned.
With the pluck(:name)
method, I get this query:
SELECT "projects"."name" FROM "projects" LEFT OUTER JOIN "improvements" ON "improvements"."project_id" = "projects"."id" WHERE "improvements"."name" = $1
Without a second query since no other data is required. However, because this is no longer using DISTINCT
on the project ID, I am getting multiple results, even though I only have one project.
In my case I have a single project called “hello” with 5 improvements. 4 of which have the name “one” I expect to get an array equal to ["hello"]
, but instead I get ["hello", "hello", "hello", "hello"]
.
I understand WHY this is happening and it makes sense, but the expected behaviour from pluck(:name)
is that it simply gets a single column from the query results, and does not alter the query itself. In this case it is completely changing what the query does.
Note: A temporary solution I am using is to convert the result set to an array and then plucking the results
System configuration
Rails version: Rails 6.0.3.2 Ruby version: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
Reproducible script
# frozen_string_literal: true
require "bundler/inline"
gemfile(true) do
source "https://rubygems.org"
git_source(:github) { |repo| "https://github.com/#{repo}.git" }
# Activate the gem you are reporting the issue against.
gem "activerecord", "6.0.3"
gem "sqlite3"
end
require "active_record"
require "minitest/autorun"
require "logger"
# This connection will do for database-independent bug reports.
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")
ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Schema.define do
create_table :posts, force: true do |t|
t.string :name
end
create_table :comments, force: true do |t|
t.integer :post_id
t.string :name
end
end
class Post < ActiveRecord::Base
has_many :comments
end
class Comment < ActiveRecord::Base
belongs_to :post
end
class BugTest < Minitest::Test
def test_pluck_does_not_effect_results
post = Post.create!(name: 'hello')
post.comments << Comment.create!(name: 'one')
post.comments << Comment.create!(name: 'one')
post.comments << Comment.create!(name: 'one')
post.comments << Comment.create!(name: 'one')
without_pluck = Post.includes(:comments).where(comments: { name: 'one' }).to_a.count
with_pluck = Post.includes(:comments).where(comments: { name: 'one' }).pluck(:name).to_a.count
assert_equal without_pluck, with_pluck
end
end
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 16 (12 by maintainers)
@tgxworld You’re right - So I’ll reiterate what I’m seeing here: Given a relation with an includes, for example
Post.includes(:comments).where(comments: { name: 'one' })
, should running pluck on that relation ever change the amount of records returned? I’m not 100% sure of the answer. Maybe this is the correct behavior but we can be a bit clearer in the documentation about what’s returning during includes + pluck. It may be happening here: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/calculations.rb#L189I think the key here is that pluck returns the column of the query result. If the query returns 4 rows, pluck will return the specified columns of all 4 rows.
It returns 4 strings because of
.includes(:comments).where(comments: { name: 'one' })
which does aLEFT JOIN
on the comments table which has 4 records that matches the conditions.pluck
simply takes the specified columns of all the rows and returns the array of type casted columns to the user.Also note that ActiveRecord behaves differently when you use
joins
vsincludes
.vs
Thanks @esparta, I just added the script
Will be helpful if you can add a reproducible script for this issue: https://guides.rubyonrails.org/contributing_to_ruby_on_rails.html#create-an-executable-test-case