Processing a large result set with a Doctrine iterator

Published on 2019-11-16 • Modified on 2019-11-27

When having to process a large number of rows with Doctrine, it's not a good idea to use the standard getResult function as you run into memory problems very quickly. That's where the iterate function can be useful. It returns an IterableResult object you can loop on without being afraid of memory problems. Here is an example. Of course on my blog, I don't have millions of rows... maybe soon! 😉
I have tested this recently for the migration of a table to a new one, with functional rules. There were half a million rows to process. The batch took about 250mb of memory and five minutes to finish.
PS: Don't forget to turn off the debug mode!


<?php

declare(strict_types=1);

namespace App\Controller\Snippet;

use App\Entity\Article;
use App\Repository\ArticleRepository;
use Doctrine\ORM\EntityManagerInterface;

/**
 * I am using a PHP trait to isolate each snippet in a file.
 * This code should be called from a Symfony controller extending AbstractController (as of Symfony 4.2)
 * or Symfony\Bundle\FrameworkBundle\Controller\Controller (Symfony <= 4.1).
 * Services are injected in the main controller constructor.
 *
 * @property ArticleRepository      $articleRepo
 * @property EntityManagerInterface $entityManager
 */
trait Snippet56Trait
{
    public function snippet56(): void
    {
        $qb = $this->articleRepo->createQueryBuilder('get_all_articles');
        $processed = [];
        $batchSize = 3;
        $flushCount = 0;

        echo 'Memory before: '.round(memory_get_usage() / 1024 / 1024, 2)." mb\n";

        // using getQuery()->iterate() is deprecated. It's better now because we
        // directly get the entity object as the loop item.
        foreach ($qb->getQuery()->toIterable() as $article) {
            // $article = $result[0] ?? null; // with iterate() we had to do this to get the Article object
            if (!$article instanceof Article) {
                throw new \TypeError('Invalid article object found.');
            }

            if ((\count($processed) % $batchSize) === 0) {
                // persist here
                $this->entityManager->flush();
                $this->entityManager->clear(); // call clear, so memory can be freed.
                ++$flushCount;
            }
            $processed[] = $article->getId();
        }
        $this->entityManager->flush(); // for last rows
        $this->entityManager->clear();

        echo 'Memory after: '.round(memory_get_usage() / 1024 / 1024, 2).' mb'.PHP_EOL;
        echo 'Number of flush/clear: '.($flushCount + 1).PHP_EOL;
        echo 'Processed articles with IDs: '.implode(',', $processed);

        // That's it! 😁
    }
}

 Run this snippet  More on Stackoverflow   Read the doc  Random snippet

  Work with me!