[Day 190] Learning about evaluating vector search engines for RAG apps
Hello :) Today is Day 190!

A quick summary of today: I covered Module 3 (vector search) from the LLM zoomcamp.

All the code from today is on my repo.

The first part of the module was about doing semantic search using dense vectors in Elasticsearch:

1. Loaded Q&A documents from a JSON file
2. Created dense vectors for each document using a pre-trained model
3. Created an index in Elasticsearch
4. Indexed the documents in Elasticsearch
5. Performed a semantic search using the dense vectors
6. Filtered the results using a specific section

(picture is from the course)

Next, I learned about evaluating the retrieval mechanism:

1. Generate unique IDs for each document to distinguish them from each other
2. Generate 5 sample questions for each document using the GPT API
3. Save the results to a file to use for evaluation

First 10 rows from the created dataset:

The ID is needed to connect the generated sample questions to the documents they are related to.

Next, I learned about two evaluation metr...
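
For the unique-ID step in the evaluation workflow above, a minimal sketch could hash a few fields of each document. The field names (`course`, `question`, `text`), the sample documents, and the helper `generate_document_id` are assumptions for illustration and are not taken from the course code.

```python
import hashlib


def generate_document_id(doc: dict) -> str:
    """Build a short, deterministic ID from the document's content.

    The field names used here are assumed; adapt them to the real JSON schema.
    """
    # Combine several fields so that documents with the same question
    # but different answers still get different IDs.
    combined = f"{doc['course']}-{doc['question']}-{doc['text'][:10]}"
    # Keep only the first 8 hex characters of the MD5 digest for readability.
    return hashlib.md5(combined.encode("utf-8")).hexdigest()[:8]


# Hypothetical sample documents, standing in for the loaded Q&A file.
documents = [
    {
        "course": "example-course",
        "question": "Can I still join?",
        "text": "Yes, you can join after the start date.",
    },
    {
        "course": "example-course",
        "question": "Where is the code?",
        "text": "All the code is in the repository.",
    },
]

for doc in documents:
    doc["id"] = generate_document_id(doc)

ids = [doc["id"] for doc in documents]
```

Because the ID is derived from the content, rerunning the script gives the same IDs, so generated questions saved earlier still point to the right documents. Truncating the hash makes collisions possible, so on a real dataset it is worth checking that `len(set(ids)) == len(ids)`.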