Wednesday, 11 July 2018

Don’t Waste Your Time With MySQL Full-Text Search

Don’t Waste Your Time With MySQL Full-Text Search

I spent precious hours trying to use the InnoDB full-text search feature included in MySQL since version 5.6, to eventually discover the use cases where it works are rather limited
As an illustration, let’s create a very basic table:
SELECT TABLE document (
  id int PRIMARY KEY,
  content longtext NOT NULL,
  FULLTEXT KEY (content)
)
The table is filled with 2 million rows. The average length of content is 2 kilobytes.
Now, let’s run a simple query:
SELECT id 
FROM document 
WHERE MATCH(content) AGAINST ('commercial' IN BOOLEAN MODE) 
LIMIT 50
It’s executed in 50 ms. No problem.
Now, let’s combine filtering on a “normal” column and full-text search in the same query:
SELECT id 
from document 
WHERE MATCH(content) AGAINST ('commercial' IN BOOLEAN MODE)
AND id > 10000000
LIMIT 50
It’s executed in 50 seconds, which makes it unusable…
Please note this is the same query as above, with only one additional condition (id > 10000000). Moreover, the query doesn’t return any rows, because the additional condition excludes all rows (all IDs are below 10,000,000).
Why is it so slow? It’s because the term “commercial” used in my query is common (~300,000 documents contain the term) and because MySQL is not good at merging indexes. Basically, MySQL uses the full-text index to lookup for the term “commercial”, and then it does a kind of nested join to lookup for the 300,000 rows and check the condition on id. The former is very quick, but the latter is extremely slow.
It would be really great for MySQL to be able to combine multiple index using an in memory bitmap. This would make the built-in full-text search a lot more useful. In the meantime, I have to find an alternate solution.
If you have any idea about how to solve this without switching to another database engine, please reach out to me.

0 comments:

Post a Comment