Leveling up RAG: implementing hybrid search and reranking

Roee Ziv, Ben-Gurion University

Posted 5 months ago

Attachments: DOWNLOAD-DESKTOP...nb

POSTED BY: Roee Ziv

5 Replies

1

Roee Ziv, Ben-Gurion University

Posted 4 months ago

Thank you for this excellent and thorough analysis. This is exactly the kind of community experimentation I was hoping the post would inspire.

Your Japanese Constitution example is a fascinating stress test because it exposes a challenge that my CV corpus does not: a large, homogeneous corpus where many chunks share nearly identical vocabulary and structure. With 103 constitutional articles that all use formal legal language about "members," "Houses," "term of office," and "shall," both BM25 and semantic search face a much harder disambiguation problem than distinguishing between four named candidates.

Your results largely validate the progressive improvement across the pipeline, even if the top-3 target wasn't fully reached and the rerank stage (top 5) did not improve on hybrid search (top 4):

| Stage | Articles 45, 46, 102 within Top-N |
|---|---|
| SemanticSearch | Top 6 |
| bm25Search | Top 7 |
| hybridSearch | Top 4 |
| rerank | Top 5 |

The hybrid fusion step demonstrably compressed the relevant articles from being scattered across the top 6–7 into the top 4. That is the RRF mechanism working as intended.

Your oracle filter idea is very compelling. Using the LLM itself as a post-retrieval relevance judge is essentially what the research community calls an LLM-as-judge reranker, and it is increasingly recognized as one of the most effective final-stage filters. It's a natural fourth stage in the pipeline:

Query → Hybrid Retrieval → Reranking → Oracle Filter → LLM Synthesis

The tradeoff is cost and latency (one LLM call per candidate chunk), but as you demonstrated, it can be the decisive step when the other stages bring relevant chunks close but not quite into the top-k.

Your observation about BM25 and Japanese word segmentation is also very important. BM25 operates on tokens, and whitespace tokenization works naturally for English but breaks down for languages like Japanese that do not use spaces between words.
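To make the whitespace-tokenization problem concrete, here is a minimal Python sketch (not the Wolfram Language code from the post). The function names and both sample sentences are illustrative assumptions. The character-bigram tokenizer is a crude stand-in for a real morphological analyzer, shown only because it needs no external library.

```python
def whitespace_tokenize(text):
    """Lowercase and split on whitespace (adequate for English)."""
    return text.lower().split()


def char_bigram_tokenize(text):
    """Crude fallback for unsegmented scripts: overlapping character bigrams.

    This is NOT morphological analysis; it is only a dependency-free
    stand-in for a segmenter such as MeCab.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


# Illustrative sentences (assumed for this sketch, not taken from the thread).
english = "The term of office of members"
japanese = "衆議院議員の任期は四年とする"

en_tokens = whitespace_tokenize(english)   # one token per English word
ja_ws = whitespace_tokenize(japanese)      # the whole sentence as a single token
ja_bi = char_bigram_tokenize(japanese)     # many short overlapping tokens

# A query term such as "任期" can never match the single whitespace token,
# but it does appear among the bigrams.
query_term = "任期"
matches_whitespace = query_term in ja_ws
matches_bigrams = query_term in ja_bi
```

With whitespace splitting, the Japanese sentence becomes one giant token, so BM25 has no term overlap to score. Any segmentation that yields word-like units restores that overlap.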
For readers who want to apply hybrid search to Japanese, integrating a morphological tokenizer such as MeCab into the `tokenize` function would be an essential adaptation.

Thank you again for sharing this. Your oracle filter extension and the multilingual considerations are genuinely valuable contributions that go beyond what I covered. I may reference your results in a future follow-up.
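The hybrid fusion and oracle filter stages discussed in this reply can be sketched in a few lines of Python (a language-neutral illustration, not the post's Wolfram Language implementation). Everything here is an assumption made for the example: the names `rrf_fuse` and `oracle_filter`, the constant `k = 60` (the value commonly used for RRF), the made-up article IDs and rankings, and the stub `judge`, which stands in for a real LLM call.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def oracle_filter(query: str,
                  candidates: List[str],
                  judge: Callable[[str, str], bool]) -> List[str]:
    """Keep only candidates the judge accepts (one judge call per candidate)."""
    return [c for c in candidates if judge(query, c)]


# Made-up rankings from two retrievers (hypothetical data, not the thread's results).
semantic = ["art45", "art10", "art46", "art7", "art102"]
bm25 = ["art46", "art45", "art9", "art102", "art10"]

fused = rrf_fuse([semantic, bm25])

# Stub judge: a fixed relevance set standing in for an LLM yes/no call.
relevant = {"art45", "art46", "art102"}
filtered = oracle_filter("term of office of members", fused,
                         lambda q, c: c in relevant)
```

Documents ranked well by both retrievers rise to the top of `fused`, and the filter then drops the remaining distractors. In a real pipeline the judge would cost one LLM call per candidate, which is the cost and latency tradeoff mentioned above.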

POSTED BY: Roee Ziv

2

Kotaro Okazaki, Fsas Technologies

Posted 4 months ago

POSTED BY: Kotaro Okazaki

0

Roee Ziv, Ben-Gurion University

Posted 5 months ago

Excited for future contributions :)

POSTED BY: Roee Ziv

0

Roee Ziv, Ben-Gurion University

Posted 5 months ago

Thank you! It’s truly an honor to contribute to the amazing Wolfram community - grateful to be part of such an inspiring group.

POSTED BY: Roee Ziv

1

EDITORIAL BOARD

EDITORIAL BOARD, WOLFRAM

Posted 5 months ago

You have earned a *Featured Contributor Badge*. Your exceptional post has been selected for our editorial column *Staff Picks* (http://wolfr.am/StaffPicks), and your profile is now distinguished by a *Featured Contributor Badge* and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD
