FroquizFroquiz
HomeQuizzesSenior ChallengeGet CertifiedBlogAbout
Sign InStart Quiz
Sign InStart Quiz
Froquiz

The most comprehensive quiz platform for software engineers. Test yourself with 10000+ questions and advance your career.

LinkedIn

Platform

  • Start Quizzes
  • Topics
  • Blog
  • My Profile
  • Sign In

About

  • About Us
  • Contact

Legal

  • Privacy Policy
  • Terms of Service

Β© 2026 Froquiz. All rights reserved.Built with passion for technology
Blog & Articles

PostgreSQL Full-Text Search: tsvector, tsquery, Ranking and Multilingual Search

Build powerful full-text search with PostgreSQL. Covers tsvector, tsquery, text search configuration, GIN indexes, ranking with ts_rank, highlighting, and multilingual search.

Yusuf SeyitoğluMarch 17, 20261 views10 min read

PostgreSQL Full-Text Search: tsvector, tsquery, Ranking and Multilingual Search

PostgreSQL has a sophisticated built-in full-text search engine that handles stemming, stop words, ranking, and multiple languages. For many applications, it eliminates the need for an external search service like Elasticsearch. This guide shows you how to build production-quality search using only PostgreSQL.

Core Concepts

tsvector

A tsvector is a preprocessed, sorted list of unique words (lexemes) extracted from a document, with their positions. It handles:

  • Normalization β€” lowercasing, removing punctuation
  • Stemming β€” "running" β†’ "run", "cats" β†’ "cat"
  • Stop words β€” removing "the", "is", "a", etc.
sql
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog'); -- 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2 -- stop words (the, over) removed, words stemmed and sorted

tsquery

A tsquery is a search expression using operators:

  • & β€” AND (both terms must match)
  • | β€” OR (either term matches)
  • ! β€” NOT (term must not match)
  • <-> β€” phrase search (adjacent words)
sql
SELECT to_tsquery('english', 'javascript & (react | vue)'); -- 'javascript' & ( 'react' | 'vue' ) SELECT to_tsquery('english', 'full <-> text <-> search'); -- phrase: "full text search" (words must be adjacent in order) -- plainto_tsquery: natural language input (no operators needed from user) SELECT plainto_tsquery('english', 'full text search postgresql'); -- 'full' & 'text' & 'search' & 'postgresql' -- websearch_to_tsquery: Google-like syntax SELECT websearch_to_tsquery('english', 'react hooks -class "use effect"'); -- 'react' & 'hook' & !'class' & 'use' <-> 'effect'

The @@ operator

Test if a tsvector matches a tsquery:

sql
SELECT to_tsvector('english', 'PostgreSQL full text search') @@ to_tsquery('english', 'postgresql & search'); -- true SELECT to_tsvector('english', 'PostgreSQL full text search') @@ to_tsquery('english', 'mysql'); -- false

Setting Up Full-Text Search on a Table

Adding a tsvector column

sql
ALTER TABLE articles ADD COLUMN search_vector tsvector; -- Populate the column UPDATE articles SET search_vector = to_tsvector( 'english', COALESCE(title, '') || ' ' || COALESCE(body, '') || ' ' || COALESCE(tags, '') ); -- Auto-update with a trigger CREATE OR REPLACE FUNCTION articles_search_trigger() RETURNS TRIGGER AS $$ BEGIN NEW.search_vector := to_tsvector( 'english', COALESCE(NEW.title, '') || ' ' || COALESCE(NEW.body, '') || ' ' || COALESCE(NEW.tags, '') ); RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER articles_search_update BEFORE INSERT OR UPDATE ON articles FOR EACH ROW EXECUTE FUNCTION articles_search_trigger();

GIN Index for fast search

Without an index, full-text search scans every row. A GIN index makes it fast:

sql
CREATE INDEX idx_articles_search ON articles USING GIN(search_vector); -- Now this query uses the index SELECT id, title FROM articles WHERE search_vector @@ to_tsquery('english', 'postgresql & indexing') LIMIT 20;

Weighted Search

Different parts of a document can have different importance. Weight A (highest) β†’ D (lowest):

sql
UPDATE articles SET search_vector = setweight(to_tsvector('english', COALESCE(title, '')), 'A') || setweight(to_tsvector('english', COALESCE(summary, '')), 'B') || setweight(to_tsvector('english', COALESCE(body, '')), 'C') || setweight(to_tsvector('english', COALESCE(tags, '')), 'A');

Matches in the title and tags rank higher than matches in the body.

Ranking Results

sql
SELECT id, title, ts_rank(search_vector, query) AS rank, ts_rank_cd(search_vector, query) AS rank_cd -- cover density variant FROM articles, to_tsquery('english', 'postgresql & performance') query WHERE search_vector @@ query ORDER BY rank DESC LIMIT 10;

ts_rank considers term frequency. ts_rank_cd also considers how close terms appear to each other β€” often better for multi-term queries.

Normalize by document length

sql
-- Divide rank by 1 + log(document length) to avoid long documents dominating SELECT id, title, ts_rank(search_vector, query, 1) AS rank -- normalization option 1 FROM articles, to_tsquery('english', 'database') query WHERE search_vector @@ query ORDER BY rank DESC;

Headline (Excerpt with Highlighting)

Show matching context with highlighted terms:

sql
SELECT title, ts_headline( 'english', body, to_tsquery('english', 'postgresql & search'), 'MaxWords=30, MinWords=15, StartSel=<mark>, StopSel=</mark>' ) AS excerpt FROM articles WHERE search_vector @@ to_tsquery('english', 'postgresql & search');

Result:

code
...using <mark>PostgreSQL</mark>'s built-in <mark>search</mark> capabilities...

Complete Search Function

sql
CREATE OR REPLACE FUNCTION search_articles( query_text TEXT, result_limit INT DEFAULT 10, result_offset INT DEFAULT 0 ) RETURNS TABLE( id INT, title TEXT, excerpt TEXT, rank FLOAT4, total_count BIGINT ) AS $$ DECLARE search_query tsquery; BEGIN search_query := websearch_to_tsquery('english', query_text); RETURN QUERY SELECT a.id, a.title, ts_headline( 'english', a.body, search_query, 'MaxWords=25, MinWords=10, StartSel=<b>, StopSel=</b>' ) AS excerpt, ts_rank_cd(a.search_vector, search_query) AS rank, COUNT(*) OVER() AS total_count FROM articles a WHERE a.search_vector @@ search_query AND a.published = true ORDER BY rank DESC LIMIT result_limit OFFSET result_offset; END; $$ LANGUAGE plpgsql; -- Usage SELECT * FROM search_articles('postgresql full text search', 10, 0);

Multilingual Search

sql
-- Check available text search configurations SELECT cfgname FROM pg_ts_config; -- english, french, german, spanish, portuguese, ... -- Per-row language UPDATE articles SET search_vector = to_tsvector(language::regconfig, body) WHERE language IN ('english', 'french', 'german'); -- Search in a specific language SELECT title FROM articles WHERE to_tsvector('french', body) @@ to_tsquery('french', 'recherche');

Fuzzy Search with pg_trgm

For typo-tolerant search, combine full-text search with trigram similarity:

sql
CREATE EXTENSION pg_trgm; CREATE INDEX idx_articles_title_trgm ON articles USING GIN(title gin_trgm_ops); -- Find titles similar to the query (handles typos) SELECT title, similarity(title, 'postgresq') AS sim FROM articles WHERE title % 'postgresq' -- similarity threshold ORDER BY sim DESC LIMIT 10; -- Combined: full-text + fuzzy fallback WITH fts AS ( SELECT id, title, ts_rank(search_vector, query) AS rank FROM articles, to_tsquery('english', 'postgresql') query WHERE search_vector @@ query ), fuzzy AS ( SELECT id, title, similarity(title, 'postgresql') AS rank FROM articles WHERE title % 'postgresql' AND id NOT IN (SELECT id FROM fts) ) SELECT * FROM fts UNION ALL SELECT * FROM fuzzy ORDER BY rank DESC;

Common Interview Questions

Q: What is the difference between LIKE and full-text search in PostgreSQL?

LIKE '%term%' does a simple substring match β€” it cannot use a B-tree index on the interior, does not understand language (no stemming or stop words), and cannot rank results. Full-text search with tsvector/tsquery supports GIN indexes for fast search, handles stemming (matching "run" finds "running"), removes stop words, ranks results by relevance, and supports phrase and proximity search.

Q: What is a GIN index and why is it used for full-text search?

GIN (Generalized Inverted Index) stores an index mapping each unique lexeme in the corpus to the set of documents containing it β€” the inverse of a document pointing to its words. This makes it extremely fast for looking up "which documents contain this word" queries. B-tree indexes store values in sorted order, which is efficient for range queries but not for searching inside complex types like tsvector.

Q: How do you keep the search_vector column up to date?

The standard approach is a BEFORE INSERT OR UPDATE trigger that recalculates and stores the tsvector whenever a row is inserted or updated. This keeps the index always current at the cost of slightly slower writes. For very write-heavy tables, an alternative is to not store tsvector at all and compute it inline:

code
WHERE to_tsvector('english', body) @@ query
β€” but this prevents GIN index use.

Practice SQL on Froquiz

Full-text search and advanced PostgreSQL features appear in backend and data engineering interviews. Test your SQL knowledge on Froquiz across all difficulty levels.

Summary

  • tsvector is a preprocessed, stemmed word list; tsquery is a boolean search expression
  • @@ matches a tsvector against a tsquery β€” the core search operator
  • GIN index on search_vector makes full-text search fast β€” always add it to production tables
  • setweight assigns A-D importance to different parts (title, body, tags)
  • ts_rank and ts_rank_cd score relevance; sort by rank descending for best results
  • ts_headline generates excerpts with highlighted matching terms
  • websearch_to_tsquery accepts natural language input β€” ideal for user-facing search boxes
  • Add pg_trgm for fuzzy/typo-tolerant matching alongside full-text search

About Author

Yusuf Seyitoğlu

Author β†’

Other Posts

  • Python Testing with pytest: Fixtures, Parametrize, Mocking and Best PracticesMar 17
  • TanStack Query (React Query): Server State Management Done RightMar 17
  • Software Architecture Patterns: MVC, Clean Architecture, Hexagonal and Event-DrivenMar 17
All Blogs