Pagination performance with Symfony PagerFanta

Pagination performance with Symfony PagerFanta

Recently, I’ve been looking into performance issues on an SQL-based search feature. The website in question paginates with Symfony’s PagerFanta Bundle. It seems that optimizing this bundle for use with large table is a somewhat common issue.

Use IN instead of = xxx OR

This one is simple enough. Using WHERE x IN (a, b, c) is faster than using WHERE x = a OR x = b OR x = c. Symfony makes this simple with the query builder. Let’s say we want to list users that are banned or suspended, we could do something like this:

$statuses = [
  1, // banned
  2, // suspended
];
$qb->andWhere($qb->expr()->in('user.status', $statuses));

In such a case, having an index on user.status helps too (see “Indexes” further down).

When joining, put ORDER BY in subquery

In SQL, at least in MariaDB, there is a performance issue affecting “late row lookups” and can be worked around by putting any ORDER BY clauses inside a subquery. Although it sounds counter-intuitive to me that adding a query would speed things up, it does work.

The idea is to change something like:

SELECT my_user.id, my_cars.id
FROM my_user
JOIN cars ON my_cars.user_id = my_user.id
ORDER BY my_user.id DESC

To something like:

SELECT DISTINCT my_user.id, my_cars.id
FROM
    (
        SELECT ordered_user.id
        FROM my_user ordered_user
        ORDER BY ordered_user.id DESC
        LIMIT 999999999
    ) ordered_user
JOIN my_user ON my_user.id = ordered_user.id
JOIN my_cars ON my_cars.user_id = my_user.id

The LIMIT 999999999 is required because MariaDB will only use ORDER BY in a subquery if a LIMIT is also provided.

Avoid prefix wildcards with LIKE

When running an SQL query with a prefix wildcard, indexes are not used. With this, even if the email column is indexed, the index will not be used. With a large table, this query will be slow:

SELECT * FROM my_user WHERE email LIKE '%foobar@gmail.com%'

Instead, more often than not, there are ways to query non prefix wildcards. For instance, in a CRM, it is often the case that a customer service representative copy-pastes the email address of a customer. This means that emails are exact. And so I can run this query instead:

SELECT * FROM my_user WHERE email = 'foobar@gmail.com'

Or, if I want speed and a bit of flexibility, I can match a prefix, in which case the index is used:

SELECT * FROM my_user WHERE email LIKE 'foobar@%'

This works especially well with data types such as phone numbers. If a searched phone number is valid — which in PHP can be checked with Google’s libphonenumber library — it probably OK to do a WHERE phone = xxx query. If, on the other hand, the phone number is incorrect, that may mean that it is incomplete, in which case I can revert to a slower but more accurate WHERE phone LIKE '%xxx%' query. I find that using data validation can be a good way to build efficient SQL queries on the fly and reduce query time in most cases, reverting to slow queries only when data validation leaves me unsure what to do.

Don’t show page number, unless you need to

PageFanta’s Doctrine ORM adapter is initialized like this, where $qb is a Doctrine query builder:

$pager = new Pagerfanta(new \Pagerfanta\Adapter\DoctrineORMAdapter($qb));

This Pagerfanta adapter counts the number of pages. This helps display something like this in the front-end:

pagination-1

This looks nice. Doesn’t it? It seems like it has become kind of an automatism to display this data. It’s usefulness is not necessarily questioned. But it is useful? In the case I’m working on, it’s useless. What matters more is to display the right data on the first page.

And, fetching the number of pages from the database takes time. Getting rid of this completely may just be an acceptable solution. This how I do that with PagerFanta:

$pager = new Pagerfanta(new \Pagerfanta\Adapter\CallbackAdapter(
    function() use($limit, $offset) {
        // approximation that allows always having one more page so as to get a "Next page" link
        return $limit * 2 + $offset;
    },
    function() use($qb) {
        return $qb->getQuery()->getResult();
    }
));

Setting indexes

Indexes can speed up queries in some situations. In a Symfony website, adding indexes can be setup with the Index annotation on any entity’s class:

use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Table(name="my_user", indexes={
 *   @ORM\Index(name="idx_status", columns={"status"}),
 * })
 * @ORM\Entity(repositoryClass="AppBundle\Repository\UserRepository")
 */
class User
{
    /**
     * @var integer
     *
     * @ORM\Column(type="integer")
     */
    protected $status;
}

Composite indexes can be used to further speed up specific types of queries that use WHERE on multiple fields in a single query.


Feedback or comments

Want to discuss this page? Send me an email or post to Hacker News.