framework: Wherehas() and Has() drastically slow - Laravel 4.1.*

Few days ago, some of our members complaint that in few pages website is very slow. After investigated the issue, i found that “those pages are using wherehas() and has()” for querying relationship. To see the performance of those queries, i have checked the run time of those queries using the following function:

dd(DB::getQueryLog());

Result is unpleasant. the query takes about 69 sec in my local environment and about 1 sec in the server.

My local Mac:

Processor 2.4 GHz Intel Core i7 - 4 cores Memory 8 GB 1600 MHz DDR3

Server: hosted in a cloud with 8 cores CPU.

Relationship:

public function posts()
{
       return $this->belongsToMany('Post')->withPivot('views')->withTimestamps();
}
    $users = User::remember(5)
          ->has('posts')
          ->with('comments')
          ->orderBy('updated_at', 'DESC')
          ->take(25)
          ->get();

Output:

dd(DB::getQueryLog());
array (size=3)
      'query' => string 'select * from `users` where `users`.`deleted_at` is null and (select count(*) from `posts` inner join `comments` on `posts`.`id` = `comments`.`post_id` where `posts`.`deleted_at` is null and `comments`.`user_id` = `users`.`id` and `posts`.`deleted_at` is null) >= ? order by `updated_at` desc limit 25' (length=333)
      'bindings' => 
        array (size=1)
          0 => int 1
      'time' => float 68973.13

note: for security reason, i have changed the table names. however, i concept is same.

First, I thought that this might be the server or PHP issue. However, when i have run the raw query to the database server using “Sequel Pro”, the result is same.

select * from `users` 
where `users`.`deleted_at` is null 
   and 
  (
       select count(*) from `posts` 
       inner join `comments` on `posts`.`id` = `comments`.`post_id` 
       where `posts`.`deleted_at` is null 
           and `comments`.`user_id` = `users`.`id` 
           and `posts`.`deleted_at` is null
    ) >= 1 
order by `updated_at` desc limit 25;

Time taken : 69.8 s (in my local pc) Time taken: 945ms (in Server)

About this issue

  • Original URL
  • State: closed
  • Created 10 years ago
  • Comments: 22 (14 by maintainers)

Most upvoted comments

@anam-hossain I tried running the same sql on sequel pro on my local machine and it took 291ms to complete.

Posts and Comments table have 160k records. (randomly generated)

select * from `users` 
where `users`.`deleted_at` is null 
   and 
  (
       select count(*) from `posts` 
       inner join `comments` on `posts`.`id` = `comments`.`post_id` 
       where `posts`.`deleted_at` is null 
           and `comments`.`user_id` = `users`.`id` 
           and `posts`.`deleted_at` is null
    ) >= 1 
order by `updated_at` desc limit 25;

But i was not able to run it using Eloquent.

I came across this issue when I analysed my current project for possible slow queries. In short, this issue is actually really bad and I wrote some simple code to get the correct queries in. (See below).

How bad is it?

Note: I’m only referring to Mysql not sure how other databases optimise things.

Look at the task of “Find all users that commented on any post belonging to a certain thread”. (Assuming Post, Comment and User models with the obvious relationships).

User::whereHas('comments', function($q) use ($threadId) {
  $q->whereHas('post', function($q) use ($threadId) {
    $q->where('thread_id', $threadId);
  });
})->get();

The way laravel crafts the has(…) queries using correlated subqueries often results in nested full table scans. That is, a full table scan of users where every row triggers a full table scan of comments where again every row triggers a full table scan of posts. In short, once your database grows, you are dead.

However, the task itself would be much faster if we first get the set of posts, then get the set of comments and then get the set of users. Doing it this way the execution time only depends on the number of records actually involved but not the overall size of the database.

Currently there are only two ways to achieve this:

  1. rewrite the whole thing using join statements and a final distinct.
  2. do it manually in PHP by first pulling out the IDs of the sub-queries manually and passing them on to another query using the whereIn(..., array) method.

As far as Mysql 5.5 is concerned that is actually where the story ends. Mytsql 5.5 turns a where ... in (select ...) statement into a where exists ... correlated subquery which again results in nested table scans.

However, Mysql 5.6 has the “Semi Join” and “Materialization” optimisation strategies for where ... in (select ...) type queries. The former tries to resolve the query as a join type query. If that’s not possible, the latter evaluates the sub-queries into temporary tables and takes it from there.

Thus, using where ... in (select ...) and Mysql 5.6 is the solution.

The patch

I wrote simple code that allows the has(...) method (which gets ultimately called by whereHas, orHas and all the other flavours) to take advantage of relationships that can craft a where ... in (select ...) constraint if the count value is >= 1. I also included the constraint generating code for the HasOneOrMany, BelongsTo and BelongsToMany relationshipts. I didn’t add the morphing ones because I’m not using them. In any case, the code falls back to the old behaviour whenever the new where in facility isn’t available.

I wrote and tested the code for Laravel 4.2, but as far as I can see there haven’t been changes to this section of code in 5.0, so it should apply to both.

I apologise for not presenting a finished pull request. My implementation is by deriving from the relationship classes so I can leave the vendor folder unchanged. So I’m simply pasting the relevant methods here.

\Illuminate\Database\Eloquent\Builder::has()

I added the big if block, the rest is unchanged.

    public function has($relation, $operator = '>=', $count = 1, $boolean = 'and', Closure $callback = null)
    {
        if (strpos($relation, '.') !== false)
        {
            return $this->hasNested($relation, $operator, $count, $boolean, $callback);
        }

        $relation = $this->getHasRelationQuery($relation);

        if ( (($operator == '>=') && ($count == 1)) || (($operator == '>') && ($count == 0)) ) {
            if (method_exists($relation, 'getWhereHasOneConstraints')) {
                $whereHasOneConstraints = $relation->getWhereHasOneConstraints($callback, $this);
                if (is_array($whereHasOneConstraints) && !empty($whereHasOneConstraints)) {
                    return $this->whereRaw($whereHasOneConstraints['sql'], $whereHasOneConstraints['bindings'], $boolean);
                }
            }
        }

        $query = $relation->getRelationCountQuery($relation->getRelated()->newQuery(), $this);

        if ($callback) call_user_func($callback, $query);

        return $this->addHasWhere($query, $relation, $operator, $count, $boolean);
    }

Illuminate\Database\Eloquent\Relations\BelongsTo::getWhereHasOneConstraints()

    public function getWhereHasOneConstraints(Closure $callback, $parent) {
        $parentKey = $this->wrap($this->getQualifiedForeignKey());
        $selectKey = $this->wrap($this->query->getModel()->getTable().'.'.$this->otherKey);

        if ($callback) call_user_func($callback, $this->query);
        $this->query->select(new Expression($selectKey));

        return array(
            'sql' => new Expression($parentKey .' in (' . $this->query->toSql() . ')'),
            'bindings' => $this->query->getBindings(),
        );      
    }

Illuminate\Database\Eloquent\Relations\HasOneOrMany::getWhereHasOneConstraints()

    public function getWhereHasOneConstraints(Closure $callback, $parent) {


        $parentKey = $this->wrap($this->getQualifiedParentKeyName());
        $selectKey = $this->wrap($this->getHasCompareKey());

        if ($callback) call_user_func($callback, $this->query);
        $this->query->select(new Expression($selectKey));

        return array(
            'sql' => new Expression($parentKey .' in (' . $this->query->toSql() . ')'),
            'bindings' => $this->query->getBindings(),
        );  
    }   

Illuminate\Database\Eloquent\Relations\BelongsToMany::getWhereHasOneConstraints()

    public function getWhereHasOneConstraints(Closure $callback, $parent) {

        if ($parent->getQuery()->from == $this->getRelated()->newQuery()->getQuery()->from) {
            // Table aliasing isn't implemented here. Return null to tell the caller to fall back
            // to the count query method.
            return null;
        }

        $parentKey = $this->wrap($this->getQualifiedParentKeyName());
        $selectKey = $this->wrap($this->getHasCompareKey());

        if ($callback) call_user_func($callback, $this->query);
        $this->query->select(new Expression($selectKey));

        return array(
                'sql' => new Expression($parentKey .' in (' . $this->query->toSql() . ')'),
                'bindings' => $this->query->getBindings(),
        );
    }