PostgreSQL Full Text Searching
Full Text Searching is a feature that allows to identify natuaral-language documents (textual entities) matching a query. It provides a lingustic support that goes beyond regular expressions by handling the derived forms of words e.g. declension, conjugation, etc. As a result it brings more relevency to search results.
Let’s create an example data set.
CREATE TABLE articles ( id SERIAL PRIMARY KEY, title TEXT NOT NULL, content TEXT NOT NULL, meta JSONB );
Each document consists of few text column and a
JSONB column with some metadata inside.
tsvector is a PostgreSQL data type that provides the optimum performance for full text searching. This column holds a subset of content from other fields in a form adapted for full text searching. It concatenates and normalizes textual content from those other fields.
ALTER TABLE articles ADD COLUMN document TSVECTOR; CREATE INDEX document_idx ON articles USING gin(document);
Once created it needs to be filled with desired values. In the following example we decided to provide full text searching for the content of
content column along with a
tags field from
JSONB column. Each column can have a weight to specify its importance with
A being the most important and
D being the least important.
UPDATE articles SET document = setweight(to_tsvector(coalesce(title,'')), 'A') || setweight(to_tsvector(coalesce(content,'')), 'B') || setweight(to_tsvector(coalesce(meta->>'tags','')), 'D');
Additionally, there should be a trigger to update the field on
INSERT/UPDATE whenever one of the source columns changes.
CREATE FUNCTION articles_document_trigger() RETURNS trigger AS $$ begin new.document := setweight(to_tsvector(coalesce(title,'')), 'A') || setweight(to_tsvector(coalesce(content,'')), 'B') || setweight(to_tsvector(coalesce(meta->>'tags','')), 'D'); return new; end $$ LANGUAGE plpgsql;
The trigger will be run just before any
CREATE TRIGGER articles_document_update BEFORE INSERT OR UPDATE ON articles FOR EACH ROW EXECUTE PROCEDURE articles_document_trigger();
Here’s a query to search for documents matching a specific term i.e. a word or sequence of words
SELECT id, title FROM articles, plainto_tsquery('postgresql') AS q WHERE (document @@ q);
Here’s a query to search for documents starting with a specific word:
SELECT id, title FROM articles, tsquery('simple', 'postgresql:*') AS q WHERE (document @@ q);
Full text seaching is a powerful feature of PostgreSQL. It provides a convenient and performant technique to query for textual data. There is no need for additional software or libraries to start using it. Everything is stored in a single database. This way it is easier to manage and maintain.