The letter S in a light blue, stylized speech bubble followed by SpeakBits
SpeakBitsThe letter S in a light blue, stylized speech bubble followed by SpeakBits
Trending
Top
New
Controversial
Search
Groups

Enjoying SpeakBits?

Support the development of it by donating to Patreon or Ko-Fi.
About
Rules
Terms
Privacy
EULA
Cookies
Blog
Have feedback? We'd love to hear it!

Postgres as a search engine

anyblockers.com
submitted
8 mos ago
byboredgamertoprogramming

Summary

Use Postgres to create a robust search engine. Build a retrieval system with semantic, full-text, and fuzzy search to be used as a backbone in RAG pipelines.

I’ve replaced the Supabase function with just CTEs and a query, as well as prefixed the params with $. Here’s what it’ll look like:

Here we’re using Reciprocal Fusion (RRF) to merge the results. It ensures that items ranked high in multiple lists are given a high rank in the final list.

The new query uses the new pg_trgm.similarity_threshold operator to filter out text where similarity is larger than 0.3.

Tuning full text search results can be done by adding a body column to a document. We expect the title to be short and the body to be longer and include more details.

The title is given more weight than the body because it is a concise representation of the document's main topic. The body is given a weight of 0.2.

Length normalization: When you want to strongly normalize by document length. Unique word normalization is when you want more diverse vocabulary. Use the bitwise option to specify one or more behaviors.

Use a bi-directional encoder to retrieve the initial N results, then a cross-encoder to rank these against the search query. The bi-encoders are fast, making it great for searching a multitude of documents. The cross- Encoders is slower but more performant, making them great for reranking retrieved results.

Postgres’s reliance on TF-IDF for full-text search can struggle with very long documents and rare terms. Use the tools below to help you with your search.

 web site website internet site site menu cash machine cash dispenser automated teller machine automatic teller machine automated teller automatic teller ATM crossword puzzle crossword-0
14

7 Comments

3
justadev
8 mos ago
I don't think I ever would have thought to join all three together like that but it's pretty brilliant!
2
boredgamerOP
8 mos ago
It's worked out pretty well for me! Like it says though, get too long and it struggles
2
joseph
8 mos ago
What's the performance like for this?
2
boredgamerOP
8 mos ago
100s of milliseconds but nothing too long
2
joseph
8 mos ago
Might be good to implement on the SpeakBits search then. One of the three is already there but the other two should be really helpful.
2
boredgamerOP
8 mos ago
Should give more relevant results then!
2
joseph
8 mos ago
Here's hoping! I feel the current implementation is pretty good though