Full Text Searching is a feature that allows to identify natuaral-language documents (textual entities) matching a query. It provides a lingustic support that goes beyond regular expressions by handling the derived forms of words e.g. declension, conjugation, etc. As a result it brings more relevency to search results.
Let’s create an example data set.
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
meta JSONB
);
Each document consists of few text column and a JSONB
column with some metadata inside.
tsvector
is a PostgreSQL data type that provides the optimum performance for full text searching. This column holds a subset of content from other fields in a form adapted for full text searching. It concatenates and normalizes textual content from those other fields.
ALTER TABLE articles ADD COLUMN document TSVECTOR;
CREATE INDEX document_idx ON articles USING gin(document);
Once created it needs to be filled with desired values. In the following example we decided to provide full text searching for the content of title
, content
column along with a tags
field from JSONB
column. Each column can have a weight to specify its importance with A
being the most important and D
being the least important.
UPDATE articles
SET document =
setweight(to_tsvector(coalesce(title,'')), 'A') ||
setweight(to_tsvector(coalesce(content,'')), 'B') ||
setweight(to_tsvector(coalesce(meta->>'tags','')), 'D');
Additionally, there should be a trigger to update the field on INSERT/UPDATE
whenever one of the source columns changes.
CREATE FUNCTION articles_document_trigger() RETURNS trigger AS $$
begin
new.document :=
setweight(to_tsvector(coalesce(title,'')), 'A') ||
setweight(to_tsvector(coalesce(content,'')), 'B') ||
setweight(to_tsvector(coalesce(meta->>'tags','')), 'D');
return new;
end
$$ LANGUAGE plpgsql;
The trigger will be run just before any INSERT
or UPDATE
operation.
CREATE TRIGGER articles_document_update BEFORE INSERT OR UPDATE
ON articles
FOR EACH ROW EXECUTE PROCEDURE articles_document_trigger();
Here’s a query to search for documents matching a specific term i.e. a word or sequence of words
SELECT id, title
FROM articles, plainto_tsquery('postgresql') AS q
WHERE (document @@ q);
Here’s a query to search for documents starting with a specific word:
SELECT id, title
FROM articles, tsquery('simple', 'postgresql:*') AS q
WHERE (document @@ q);
Full text seaching is a powerful feature of PostgreSQL. It provides a convenient and performant technique to query for textual data. There is no need for additional software or libraries to start using it. Everything is stored in a single database. This way it is easier to manage and maintain.