Implementing a search engine with elasticsearch and Symfony (part 3/3)
Published on 2019-11-16 • Modified on 2020-04-18
In this third and last part, we will continue to improve our search engine. First, we will enhance our elasticsearch stack with Kibana. Then, we will implement an autocomplete using an elasticsearch suggester. Let's go!
» Published in "A week of Symfony 672" (10-17 November 2019).
Tutorial
This post is the third and last part of the tutorial "Implementing a search engine with Elasticsearch and Symfony":
- Part 1: Setting the Elasticsearch stack, installing FOSElastica, indexing data, searching and displaying results.
- Part 2: Cleanup and refactoring, using an Elasticsearch alias, creating a custom provider, tuning the search relevance and adding the pagination.
- Part 3: Adding Kibana to the stack, implementing an auto-complete with Elasticsearch.
Prerequisite
The prerequisites are the same as the first two parts. It is, of course, recommended to read them (links above) before continuing with this one.
Configuration
- PHP 8.3
- Symfony 6.4.10
- elasticsearch 6.8
Installing Kibana
First, we will try to improve our Elasticsearch stack. Until now, we used the "head" plugin to manage our cluster. But this development tool is quite old and not maintained anymore. So, let's add Kibana to our Docker setup. Kibana is an open-source data visualization plugin for Elasticsearch. Of course, it will allow us to do all the essential maintenance tasks we used to do with head: delete or close an index, create or delete an alias, check a document, verify the index mappings and much more! The list of what you can do with it is impressive (check out the left menu of the next screenshot). Let's add the following entry to the docker-compose.yaml file:
  kibana:
    container_name: sb-kibana
    image: docker.elastic.co/kibana/kibana:6.8.18
    ports:
      - "5601:5601"
    environment:
      - "ELASTICSEARCH_URL=http://sb-elasticsearch"
    depends_on:
      - elasticsearch
Here is the full docker-compose.yaml file:
# ./docker-compose.yaml
# DEV docker compose file ──────────────────────────────────────────────────────
# Check out: https://docs.docker.com/compose/gettingstarted/
version: '3.7'
services:
  # Database ───────────────────────────────────────────────────────────────────
  # MySQL server database (official image)
  # https://docs.docker.com/samples/library/mysql/
  db:
    container_name: strangebuzz-db-1
    image: mysql:5.7
    platform: linux/x86_64
    command: --default-authentication-plugin=mysql_native_password
    ports:
      - "3309:3306"
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_HOST: '%'
    healthcheck:
      test: ["CMD-SHELL", "mysqladmin ping -P 3306 -proot | grep 'mysqld is alive' || exit 1"]
      interval: 10s
      timeout: 30s
      retries: 10
  # Snippet L21+4 in templates/snippet/code/_133.html.twig
  # ── Elasticsearch ────────────────────────────────────────────────────────────
  # Elasticsearch server (official image)
  # https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html
  # https://hub.docker.com/_/elasticsearch/
  # https://stackoverflow.com/questions/68877644/how-to-run-elasticsearch-6-on-an-apple-silicon-mac
  elasticsearch:
    container_name: strangebuzz-elasticsearch-1
    # image: docker.elastic.co/elasticsearch/elasticsearch:7.17.9
    # image: docker.elastic.co/elasticsearch/elasticsearch:7.1.
    image: webhippie/elasticsearch:6.4
    platform: linux/amd64
    ports:
      - "9209:9200"
      - "9309:9300" # Important if you have multiple es instances running
    environment:
      - "http.port=9200"
      - "discovery.type=single-node"
      - "bootstrap.memory_lock=true"
      - "ES_JAVA_OPTS=-Xms1G -Xmx1G"
      - "xpack.security.enabled=false"
      - "http.cors.enabled=true"
      - "http.cors.allow-origin=*"
      - "cluster.routing.allocation.disk.threshold_enabled=false" # https://www.elastic.co/guide/en/elasticsearch/reference/6.8/disk-allocator.html
    healthcheck:
      test: ["CMD-SHELL", "curl --silent --fail localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s || exit 1"]
      interval: 10s
      timeout: 30s
      retries: 10
  # Cache ──────────────────────────────────────────────────────────────────────
  # Redis (official image)
  # https://hub.docker.com/_/redis
  redis:
    container_name: strangebuzz-redis-1
    image: redis:5.0.13-alpine
    ports:
      - '6389:6379'
    healthcheck:
      test: ["CMD-SHELL", "redis-cli -h 127.0.0.1 ping | grep 'PONG' || exit 1"]
      interval: 10s
      timeout: 30s
      retries: 10
  # Snippet L55+12 in templates/snippet/code/_236.html.twig
  # PHP-FPM ────────────────────────────────────────────────────────────────────
  # https://github.com/dunglas/symfony-docker
  php:
    container_name: strangebuzz-php-1
    build:
      context: .
      target: symfony_php
      args:
        SYMFONY_VERSION: ${SYMFONY_VERSION:-}
        SKELETON: ${SKELETON:-symfony/skeleton}
        STABILITY: ${STABILITY:-stable}
    restart: unless-stopped
    volumes:
      - php_socket:/var/run/php
    healthcheck:
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 30s
    environment:
      # Run "composer require symfony/mercure-bundle" to install and configure the Mercure integration (@todo)
      # MERCURE_URL: ${CADDY_MERCURE_URL:-http://caddy/.well-known/mercure}
      # MERCURE_PUBLIC_URL: https://${SERVER_NAME:-localhost}/.well-known/mercure
      # MERCURE_JWT_SECRET: ${CADDY_MERCURE_JWT_SECRET:-!ChangeMe!}
      # new from legacy project (.env to transfer)
      APP_ENV: prod
      APP_SERVER: prod
      APP_DEBUG: 0
      DATABASE_URL: mysql://root:root@db:3306/strangebuzz?serverVersion=5.7
      REDIS_URL: redis://redis:6379/1
      ES_HOST: elasticsearch
      ES_PORT: 9200
      # Taken from .env
      APP_SECRET: ${APP_SECRET}
      APP_HTML5_VALIDATION: ${APP_HTML5_VALIDATION}
      ADMIN_PASSWORD: ${ADMIN_PASSWORD}
      SLACK_TOKEN: ${SLACK_TOKEN}
      CORS_ALLOW_ORIGIN: ${CORS_ALLOW_ORIGIN}
      JWT_PASSPHRASE: ${JWT_PASSPHRASE}
  # Caddy Web Server ───────────────────────────────────────────────────────────
  # https://github.com/dunglas/symfony-docker
  caddy:
    container_name: strangebuzz-caddy-1
    build:
      context: .
      target: symfony_caddy
    depends_on:
      - php
    environment:
      SERVER_NAME: ${SERVER_NAME:-localhost, caddy:80}
      MERCURE_PUBLISHER_JWT_KEY: ${CADDY_MERCURE_JWT_SECRET:-!ChangeMe!}
      MERCURE_SUBSCRIBER_JWT_KEY: ${CADDY_MERCURE_JWT_SECRET:-!ChangeMe!}
    restart: unless-stopped
    volumes:
      - php_socket:/var/run/php
      - caddy_data:/data
      - caddy_config:/config
    ports:
      # HTTP
      - target: 80
        published: 80
        protocol: tcp
      # HTTPS
      - target: 443
        published: 443
        protocol: tcp
      # HTTP/3
      - target: 443
        published: 443
        protocol: udp
volumes:
  php_socket:
  caddy_data:
  caddy_config:
As you can see, we pass the URL of the Elasticsearch server, whose hostname is the name of the Docker container (sb-elasticsearch). We keep the standard 5601 port. We also use the same image version (6.8.18) as for the Elasticsearch server, so we are sure there are no compatibility problems. Once you restart the Docker containers, you can access the Kibana management page on port 5601 (http://localhost:5601 with the port mapping above).
That's it for Kibana. I will stop here; it would require much more than a full article to introduce all the features. Check out the official website for more information. Kibana is very powerful; it can also be used to view your Symfony logs! Check out this excellent post on the JoliCode blog about this subject.
Adding an autocomplete in the search bar
As you can see, I have put a search bar in the header of this website. It works, but what about trying to autocomplete the user input and suggest terms they can find on this blog? Let's see how we can do this with Elasticsearch; we are going to build a new index that will be dedicated to this task.
Setting the mapping
Until now, we used the default "text" type for all mapping properties. In this case, we will use a particular type: completion. We will add a new "suggest" index configuration just after the "app" we were using in the previous posts:
# config/packages/fos_elastica.yaml
fos_elastica:
    clients:
        default: { host: '%es_host%', port: '%es_port%' }
    indexes:
        app:
            ###
        suggest:
            use_alias: true
            settings:
                index:
                    analysis:
                        analyzer:
                            suggest_analyzer:
                                type: custom
                                tokenizer: standard
                                filter: [lowercase, asciifolding]
            types:
                keyword:
                    properties:
                        locale:
                            type: keyword
                        suggest:
                            type: completion
                            analyzer: suggest_analyzer
                            contexts:
                                - name: locale
                                  type: category
                                  path: locale
Here is the full YAML mapping:
# config/packages/fos_elastica.yaml
fos_elastica:
    clients:
        default: { host: '%es_host%', port: '%es_port%' }
    indexes:
        app:
            use_alias: true
            types:
                articles:
                    properties:
                        type: ~
                        keywordFr: { boost: 4 }
                        keywordEn: { boost: 4 }
                        # i18n
                        titleEn: { boost: 3 }
                        titleFr: { boost: 3 }
                        headlineEn: { boost: 2 }
                        headlineFr: { boost: 2 }
                        ContentEn: ~ # The default boost value is 1
                        ContentFr: ~
                    persistence:
                        driver: orm
                        model: App\Entity\Article
                        provider:
                            service: App\Elasticsearch\Provider\ArticleProvider
                        listener:
                            insert: false
                            update: false
                            delete: false
            settings:
                index:
                    analysis:
                        analyzer:
                            synonym:
                                tokenizer: "standard"
                                filter: [synonym]
                        filter:
                            synonym:
                                type: "synonym"
                                synonyms:
                                    - "elastic,elastica => elasticsearch"
        # L1->L29 snippet in templates/blog/posts/_48.html.twig
        suggest:
            use_alias: true
            settings:
                index:
                    analysis:
                        analyzer:
                            suggest_analyzer:
                                type: custom
                                tokenizer: standard
                                filter: [lowercase, asciifolding]
            types:
                keyword:
                    properties:
                        locale:
                            type: keyword
                        suggest:
                            type: completion
                            analyzer: suggest_analyzer
                            contexts:
                                - name: locale
                                  type: category
                                  path: locale
Some explanations about the new index and its mapping: before declaring the new type, I add a custom analyzer in the settings section. Its asciifolding filter allows us to ignore French accents, so a match is found even when accents are not used in the input. For example, if we type "element", the word "élément" should be suggested.
Then comes the "types" section. Like the main app index, this index uses an alias. The mapping has two properties. The first one, suggest, is of the completion type: we need this particular type to be able to use the completion suggester, as we will see in the next chapter. The second one, locale, will allow us to filter the suggestions depending on the current locale (en or fr). We have added a context to the suggest field, and it is associated with the locale property (path: locale).
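To make the role of the context more concrete, here is a minimal sketch of the raw request body that a completion suggester with a locale filter could send to Elasticsearch. The field name (suggest) and the context name (locale) come from the mapping above; the buildSuggestBody() function and the "completion" suggester name are hypothetical and only serve the illustration.

```php
<?php

/**
 * Build the raw body of a completion-suggester request filtered by locale.
 * Hypothetical helper: only the "suggest" field and the "locale" context
 * come from the mapping of this article.
 *
 * @return array<string, mixed>
 */
function buildSuggestBody(string $prefix, string $locale, int $size = 5): array
{
    return [
        'suggest' => [
            // Arbitrary name used to find this suggester in the response.
            'completion' => [
                'prefix' => $prefix,
                'completion' => [
                    'field' => 'suggest',                  // the property of type "completion"
                    'size' => $size,                       // maximum number of suggestions
                    'contexts' => ['locale' => [$locale]], // keep only this locale
                ],
            ],
        ],
    ];
}

echo json_encode(buildSuggestBody('ela', 'en'), JSON_PRETTY_PRINT), PHP_EOL;
```

Because the context is declared with "path: locale", each indexed document only needs a locale value next to its suggest value; the filter is then applied at query time through the contexts key.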
If we launch the populate command, the new index is created. At this point, we have two indexes in the Elasticsearch cluster: app and suggest.
Populating the suggest index
Now, we must fill the new suggest index. As there is no model mapped to this index, we won't create a provider but a command. The idea is to extract all the words we have already indexed in the app index. Here is the new Symfony command (some insights after the snippet):
<?php
declare(strict_types=1);
// src/Command/PopulateSuggestCommand.php (used by templates/blog/posts/_51.html.twig)
namespace App\Command;
use Doctrine\Inflector\Inflector;
use Doctrine\Inflector\NoopWordInflector;
use Elastica\Document;
use FOS\ElasticaBundle\Elastica\Index;
use FOS\ElasticaBundle\Finder\TransformedFinder;
use FOS\ElasticaBundle\HybridResult;
use FOS\ElasticaBundle\Paginator\FantaPaginatorAdapter;
use Pagerfanta\Pagerfanta;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use function Symfony\Component\String\u;
/**
* Populate the suggest elasticsearch index.
*/
final class PopulateSuggestCommand extends BaseCommand
{
public static $defaultName = self::NAMESPACE.':populate';
protected static $defaultDescription = 'Populate the "suggest" Elasticsearch index';
private Inflector $inflector;
public function __construct(
private TransformedFinder $articlesFinder,
private Index $suggestIndex
) {
parent::__construct();
$this->inflector = new Inflector(new NoopWordInflector(), new NoopWordInflector());
}
protected function configure(): void
{
[$desc, $class] = [self::$defaultDescription, self::class];
$this->setHelp(
<<<EOT
$desc
COMMAND:
<comment>$class</comment>
<info>%command.full_name%</info>
EOT
);
}
protected function execute(InputInterface $input, OutputInterface $output): int
{
$output->writeln((string) self::$defaultDescription);
$pagination = $this->findHybridPaginated($this->articlesFinder);
$nbPages = $pagination->getNbPages();
$keywords = [];
foreach (range(1, $nbPages) as $page) {
$pagination->setCurrentPage($page);
foreach ($pagination->getCurrentPageResults() as $result) {
if ($result instanceof HybridResult) {
foreach ($result->getResult()->getSource() as $property => $text) {
if ($property === 'type') {
continue;
}
$locale = explode('_', $this->inflector->tableize($property))[1] ?? 'en';
$text = strip_tags($text ?? '');
$words = str_word_count($text, 2, 'çéâêîïôûàèùœÇÉÂÊÎÏÔÛÀÈÙŒ'); // FGS do not remove French accents!
$textArray = array_filter($words);
$keywords[$locale] = array_merge($keywords[$locale] ?? [], $textArray);
}
}
}
}
// Index by locale
foreach ($keywords as $locale => $localeKeywords) {
// Remove small words and remaining craps (emojis)
$localeKeywords = array_unique(array_map('mb_strtolower', $localeKeywords));
$localeKeywords = array_filter($localeKeywords, static function ($v): bool {
return u($v)->trim()->length() > 2;
});
$documents = [];
foreach ($localeKeywords as $keyword) {
$documents[] = (new Document())
->setType('keyword')
->set('locale', $locale)
->set('suggest', $keyword);
}
$responseSet = $this->suggestIndex->addDocuments($documents);
$output->writeln(sprintf(' -> TODO: %d -> DONE: <info>%d</info>, "%s" keywords indexed.', \count($documents), $responseSet->count(), $locale));
}
return self::SUCCESS;
}
/**
* @return Pagerfanta<mixed>
*/
private function findHybridPaginated(TransformedFinder $articlesFinder): Pagerfanta
{
$paginatorAdapter = $articlesFinder->createHybridPaginatorAdapter('');
return new Pagerfanta(new FantaPaginatorAdapter($paginatorAdapter));
}
}
Some explanations:
- We run a search with an empty query, which matches all articles, to get the total number of pages.
- We iterate over each page to get the articles.
- For each article, we extract all the keys from the Elasticsearch document.
- For each key, we extract all the words from the text with the PHP str_word_count() function.
- We remove empty, duplicate and too-short words (two characters or fewer).
- For each remaining word, we create an Elasticsearch document with the correct associated locale.
- Finally, we run the indexation process with the addDocuments function.
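The word extraction can be tried outside of Symfony and Elasticsearch. The following is a minimal sketch of these steps in plain PHP, under a few assumptions: the extractKeywords() function is hypothetical, it uses mb_strlen() instead of the Symfony String component, and it returns a flat list instead of grouping words by locale.

```php
<?php

/**
 * Extract unique, lowercased keywords of more than two characters from an
 * HTML text. Hypothetical helper mirroring the steps of the command.
 *
 * @return list<string>
 */
function extractKeywords(string $html): array
{
    // Extra characters that str_word_count() must accept as part of a word.
    $accents = 'çéâêîïôûàèùœÇÉÂÊÎÏÔÛÀÈÙŒ';

    // 1. Remove the HTML tags, then split the text into words.
    $words = str_word_count(strip_tags($html), 1, $accents);

    // 2. Lowercase (multibyte safe) and remove the duplicates.
    $words = array_unique(array_map('mb_strtolower', $words));

    // 3. Keep only the words longer than two characters.
    $words = array_filter($words, static fn (string $w): bool => mb_strlen(trim($w)) > 2);

    return array_values($words);
}

print_r(extractKeywords('<p>Un élément simple. Un autre ELEMENT !</p>'));
```

Note that "élément" and "element" stay two distinct keywords at this stage: the accents are only folded later, by the asciifolding filter of the analyzer.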
At the time of writing, there are about 3500 indexed words. Here is the output of the populate Makefile entry:
/Users/coil/Sites/strangebuzz.com$ make populate
php bin/console fos:elastica:reset
Resetting app
Resetting suggest
Resetting app
53/53 [============================] 100%
Refreshing app
Populate the "suggest" elasticsearch index
-> TODO: 2167 -> DONE: 2167, "fr" keywords indexed.
-> TODO: 1549 -> DONE: 1549, "en" keywords indexed.
Here is its content:
## ── elasticsearch ────────────────────────────────────────────────────────────
populate: ## Reset and populate the Elasticsearch index
#@$(SYMFONY) fos:elastica:reset
#@$(SYMFONY) fos:elastica:populate --index=app
@$(SYMFONY) strangebuzz:index-articles
You can find my full Symfony Makefile in this snippet. So, now that the index is populated, let's see how to use it for the autocomplete feature.
Implementing the autocomplete
The goal here will be to have an action that returns via Ajax the suggestions for the autocomplete field as the user is typing. So let's create a new controller that will handle this:
<?php
declare(strict_types=1);
// src/Controller/SuggestController.php
namespace App\Controller;
use App\Elasticsearch\ElastiCoil;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\Routing\Annotation\Route;
#[Route(path: '/{_locale}', requirements: ['_locale' => '%locales_requirements%'])]
final class SuggestController extends AbstractController
{
#[Route(path: ['en' => '/suggest', 'fr' => '/suggerer'], name: 'suggest')]
public function suggest(Request $request, string $_locale, ElastiCoil $elastiCoil): JsonResponse
{
$q = $request->query->get('q', '');
return $this->json($elastiCoil->getSuggestions($q, $_locale));
}
}
And the related custom Elasticsearch service:
<?php
declare(strict_types=1);
// src/Elasticsearch/ElastiCoil.php
namespace App\Elasticsearch;
use Elastica\Query;
use Elastica\Suggest;
use Elastica\Suggest\Completion;
use Elastica\Util;
use FOS\ElasticaBundle\Elastica\Index;
final class ElastiCoil
{
public const SUGGEST_NAME = 'completion';
public const SUGGEST_FIELD = 'suggest';
public function __construct(
private readonly Index $suggestIndex
) {
}
/**
* Get a suggest object for a keyword and locale.
*/
public function getSuggest(string $q, string $locale): Suggest
{
$completionSuggest = (new Completion(self::SUGGEST_NAME, self::SUGGEST_FIELD))
->setPrefix(Util::escapeTerm($q))
->setParam('context', ['locale' => $locale])
->setSize(5);
return new Suggest($completionSuggest);
}
/**
* Return suggestions for a keyword and locale as a simple array.
*
* @return array<string>
*/
public function getSuggestions(string $q, string $locale): array
{
$suggest = $this->getSuggest($q, $locale);
$query = (new Query())->setSuggest($suggest);
$suggests = $this->suggestIndex->search($query)->getSuggests();
return $suggests[self::SUGGEST_NAME][0]['options'] ?? [];
}
}
Some explanations:
- As in the search action, we get the user input from the "q" GET parameter.
- Then, we create an Elastica Suggest object with the name of the mapping property to use.
- Just below, we add a context that will allow us to filter returned items: in this case, we filter on the current page locale (en or fr).
- Then, we extract the returned options of the Elasticsearch response.
- Finally, we return a JsonResponse with a simple array containing the options to display to the user.
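The service returns the raw options, so each item still carries Elasticsearch metadata next to the suggested word. If only the words are needed, the extraction step could be reduced as in this sketch. The extractSuggestedTexts() function is hypothetical, and the sample data is made up; it only imitates the shape of the "suggest" part of a completion response.

```php
<?php

/**
 * Reduce the "suggest" part of an Elasticsearch response to a list of words.
 * Hypothetical helper, not part of Elastica or FOSElastica.
 *
 * @param array<string, mixed> $suggests
 *
 * @return list<string>
 */
function extractSuggestedTexts(array $suggests, string $suggestName = 'completion'): array
{
    // Same defensive access as in the service: no match means an empty list.
    $options = $suggests[$suggestName][0]['options'] ?? [];

    return array_values(array_column($options, 'text'));
}

// Made-up sample data shaped like a completion suggester response.
$suggests = [
    'completion' => [[
        'text' => 'ela',
        'offset' => 0,
        'length' => 3,
        'options' => [
            ['text' => 'elastica', '_score' => 1.0, '_source' => ['locale' => 'en', 'suggest' => 'elastica']],
            ['text' => 'elasticsearch', '_score' => 1.0, '_source' => ['locale' => 'en', 'suggest' => 'elasticsearch']],
        ],
    ]],
];

print_r(extractSuggestedTexts($suggests));
```

Keeping the full options, as the article does, leaves more freedom to the front-end component; returning only the text makes the JSON payload smaller.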
Displaying the suggestions
Now that the suggest action is done, we can use it with an autocomplete widget. Try it on the form just below. It's the same form we used in the previous articles (only some JavaScript was added to get the suggestions). As you can see, on this page, only English words are returned, but if you try on the French version, you can verify that only French ones are. It's the same action; the filtering is done thanks to the context we added to the Elasticsearch suggester.
To render the widget, I used a Vue.js component. Just a comment about the route we use here: we don't have to specify the locale in {{ path('suggest') }} because the routing component will automatically add it (here it's en). This autocomplete is also on the search results page. Here is the code of the component include:
{# https://github.com/BosNaufal/vue-autocomplete #}
{% set placeholder = placeholder is defined ? placeholder : 'enter_one_or_several_keywords'|trans({}, 'search') %}
<autocomplete
ref="autocomplete"
aria-describedby="qHelp"
url="{{ path('search_suggest') }}" {# current locale is injected #}
anchor="text" {# not used, custom render #}
label="text"
:on-should-render-child="autocompleteRenderChild"
:required="true"
id="post-q"
name="q"
:classes="{ wrapper: 'form-wrapper', input: 'form-control', list: 'data-list', item: 'data-list-item' }"
placeholder="{{ placeholder }}"
init-value="{{ app.request.query.get('q') }}"
:options="[]"
:min="3"
:encode-params="true"
>
</autocomplete>
Conclusion
That was the last part of this Elasticsearch tutorial. It was exciting to write it (but very long!) while I was implementing the features on this website. There is still a lot to do, but I am happy with what I did so far. I am using the search every day to find articles or snippets quickly. There is excellent news: the FOSElastica bundle is being updated to support Elastica 7.0. So, as soon as it's out, I'll modify this tutorial to use the latest Elasticsearch version: 7.6!
That's it! I hope you like it. Check out the links below to have additional information related to the post. As always, feedback, likes and retweets are welcome. (see the box below) See you! COil.
They gave feedback and helped me to fix errors and typos in this article; many thanks to greg0ire, jmsche, Nico.F (Slack Symfony).
Call to action
Did you like this post? You can help me back in several ways (use the tweet on the right to comment or to contact me):
- Report any error/typo.
- Report something that could be improved.
- Like and retweet!
- Follow me on Twitter.
- Subscribe to the RSS feed.
- Click on the More on Stackoverflow buttons to make me win "Announcer" badges.
Thank you for reading! And see you soon on Strangebuzz!