Hacker News

I think I found a mistake. In the article you write: "We then compare that against our database of vectors and find the one(s) that match the closest, i.e., have the lowest dot product and highest similarity."

We want to maximize the normalized dot product (or cosine similarity) to find semantically similar text chunks.



So this would be the highest dot product. Finding the lowest (closest to zero) would mean the two vectors are closer to orthogonal and thus very _not_ similar in direction (semantic meaning). If the dot product is negative, the vectors are semantically opposite.
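A minimal sketch of the point above, using NumPy and made-up 3-dimensional toy vectors (real embeddings have hundreds or thousands of dimensions): the closest semantic match is the one with the *highest* cosine similarity, while a value near zero means roughly orthogonal (unrelated) and a negative value means opposite in direction.

```python
import numpy as np

def cosine_similarity(a, b):
    # Normalized dot product: ranges from -1 (opposite direction)
    # through 0 (orthogonal) to 1 (same direction).
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical embedding vectors for illustration.
query      = np.array([1.0, 0.0, 0.0])
similar    = np.array([0.9, 0.1, 0.0])   # nearly the same direction
orthogonal = np.array([0.0, 1.0, 0.0])   # unrelated direction
opposite   = np.array([-1.0, 0.0, 0.0])  # opposite direction

for name, v in [("similar", similar), ("orthogonal", orthogonal), ("opposite", opposite)]:
    print(name, round(cosine_similarity(query, v), 3))
```

Running this prints a similarity near 1 for `similar`, 0 for `orthogonal`, and -1 for `opposite`, so "best match" means maximizing, not minimizing, this score.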


Good catch! This is a mistake, fixing now.



