The problem with this project is that it doesn't solve the valuable problem, whi...

yoeven · on Dec 21, 2023

Yeah, I agree to an extent. Using a traditional search engine would be simpler and easier to implement, but wouldn't be able to accurately contextualise the actual content of the video based on the user's question, which is the focus of the tool. However, I do agree that there is a lot of space for growth, and adding a traditional form of full-text fuzzy search, which will help with some of the ranking problems, is part of the plans to mix the best of both worlds :)

Ranking is a huge topic by itself: beyond similarity/text matching, other topics like SEO, popularity, etc. play a huge part, and those are aspects that I'm looking forward to understanding better and seeing how the community can contribute as well!

ZephyrBlu · on Dec 21, 2023

> but wouldn't be able to accurately contextualise the actual content of the video based on the user's question, which is the focus of the tool

Are you sure this is what users want? Semantic search is not appropriate for all situations.

> adding a traditional form of full-text fuzzy search, which will help with some of the ranking problems

Full text fuzzy search is NOT a performant search engine and is NOT related to ranking or relevance. Ranking is an independent process after finding matching results.

Semantic search would make more sense in the context of ranking rather than pure search. E.g. you use traditional search to find matching documents then semantic search on matches to rank them.
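As a rough illustration of that two-stage idea, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the function names are made up, the "traditional search" stage is just an inverted-index lookup, and `toy_embed` is a bag-of-words placeholder standing in for whatever real embedding model you would use for the semantic part.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()


def build_index(docs):
    """Inverted index: map each term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def keyword_candidates(index, query):
    """Stage 1: ids of documents containing at least one query term."""
    hits = set()
    for term in tokenize(query):
        hits |= index.get(term, set())
    return hits


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_embed(text):
    """Placeholder for a real embedding model: plain term counts."""
    return Counter(tokenize(text))


def search(docs, index, query, embed=toy_embed):
    """Stage 2: rank only the stage-1 candidates by embedding similarity."""
    candidates = keyword_candidates(index, query)
    query_vec = embed(query)
    return sorted(
        candidates,
        key=lambda doc_id: cosine(query_vec, embed(docs[doc_id])),
        reverse=True,
    )


docs = {
    "a": "how to train a neural network",
    "b": "train schedules for the city network",
    "c": "baking bread at home",
}
index = build_index(docs)
results = search(docs, index, "train neural network")
print(results)
```

The point of the split is that the expensive similarity computation only runs on the handful of documents that survived the cheap lookup, not on the whole corpus.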

> Ranking is a huge topic by itself: beyond similarity/text matching, other topics like SEO, popularity, etc. play a huge part

Based on my mediocre understanding of ranking, basic ranking is generally not about any of these factors. Presumably because they are too slow and computationally intensive. Maybe there are multiple layers of ranking for these different features though.

My understanding is that basic ranking is/was more about metrics like TF-IDF. I’m sure there are more advanced modern techniques, but also likely more complicated.
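For anyone unfamiliar, a bare-bones TF-IDF scorer looks something like the sketch below. It assumes one common variant (term frequency normalised by document length, IDF as the log of total documents over documents containing the term); real engines use different weightings and smoothing, and `tf_idf_scores` is just a name I picked.

```python
import math
from collections import Counter


def tf_idf_scores(docs, query):
    """Score each document as the sum of tf * idf over the query terms."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            score += tf * idf
        scores[doc_id] = score
    return scores


docs = {
    "a": "the cat sat on the mat",
    "b": "the dog chased the cat",
    "c": "the bird sang",
}
scores = tf_idf_scores(docs, "the cat")
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Note how "the" contributes nothing because it appears in every document (its IDF is log(1) = 0), so only "cat" separates the results. That is the whole intuition: rare terms matter more than common ones.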

Search is a ridiculously big and complex topic. If you want this to be more than a toy project, I think it would be wise to focus on a much smaller sliver and have a much clearer value prop.

You are currently trying to tackle multiple big problems simultaneously.