Processing a large result set with a Doctrine iterator
Published on 2019-11-16 • Modified on 2019-11-27
When having to process a large number of rows with Doctrine, it's not a good idea to use the standard getResult
function as you run into memory problems very quickly. That's where the iterate
function can be useful. It returns an IterableResult
object you can loop on without being afraid of memory problems. Here is an example. Of course on my blog, I don't have millions of rows... maybe soon! 😉
I have tested this recently for the migration of a table to a new one, with functional rules. There were half a million rows to process. The batch took about 250mb of memory and five minutes to finish.
PS: Don't forget to turn off the debug mode!
<?php
declare(strict_types=1);
namespace App\Controller\Snippet;
use App\Entity\Article;
use App\Repository\ArticleRepository;
use Doctrine\ORM\EntityManagerInterface;
/**
* I am using a PHP trait to isolate each snippet in a file.
* This code should be called from a Symfony controller extending AbstractController (as of Symfony 4.2)
* or Symfony\Bundle\FrameworkBundle\Controller\Controller (Symfony <= 4.1).
* Services are injected in the main controller constructor.
*
* @property ArticleRepository $articleRepo
* @property EntityManagerInterface $entityManager
*/
trait Snippet56Trait
{
public function snippet56(): void
{
$qb = $this->articleRepo->createQueryBuilder('get_all_articles');
$processed = [];
$batchSize = 3;
$flushCount = 0;
echo 'Memory before: '.round(memory_get_usage() / 1024 / 1024, 2)." mb\n";
// using getQuery()->iterate() is deprecated. It's better now because we
// directly get the entity object as the loop item.
foreach ($qb->getQuery()->toIterable() as $article) {
// $article = $result[0] ?? null; // with iterate() we had to do this to get the Article object
if (!$article instanceof Article) {
throw new \TypeError('Invalid article object found.');
}
if ((\count($processed) % $batchSize) === 0) {
// persist here
$this->entityManager->flush();
$this->entityManager->clear(); // call clear, so memory can be freed.
++$flushCount;
}
$processed[] = $article->getId();
}
$this->entityManager->flush(); // for last rows
$this->entityManager->clear();
echo 'Memory after: '.round(memory_get_usage() / 1024 / 1024, 2).' mb'.PHP_EOL;
echo 'Number of flush/clear: '.($flushCount + 1).PHP_EOL;
echo 'Processed articles with IDs: '.implode(',', $processed);
// That's it! 😁
}
}
Run this snippet More on Stackoverflow Read the doc Random snippet