Posts

Showing posts from July 9, 2024

[Day 190] Learning about evaluating vector search engines for RAG apps

Hello :) Today is Day 190! A quick summary of today: covered Module 3: vector search from the LLM zoomcamp. All the code from today is on my repo.

The first part of the module was about doing semantic search using dense vectors in Elasticsearch:
1. Loaded Q&A documents from a JSON file
2. Created dense vectors for each document using a pre-trained model
3. Created an index in Elasticsearch
4. Indexed the documents in Elasticsearch
5. Performed a semantic search using the dense vectors
6. Filtered the results using a specific section
(picture is from the course)

Next I learned about evaluating the retrieval mechanism:
1. Generate unique IDs for each document to tell them apart
2. Generate 5 sample questions for each document using the GPT API
3. Save the results to a file to use for evaluation

First 10 rows from the created dataset:

The ID is needed to connect the generated sample questions to the documents they relate to. Next, I learned about two evaluation metr...
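The ID step in the evaluation workflow above could look something like the sketch below. This is only an illustration under assumptions: the field names `course`, `question`, and `text`, the helper `generate_document_id`, the choice of an MD5 hash truncated to 8 characters, and the sample records are all hypothetical and not taken from the post.

```python
import hashlib


def generate_document_id(doc: dict) -> str:
    """Derive a short, deterministic ID from a document's content.

    Assumption: each document has 'course', 'question' and 'text' fields.
    """
    combined = f"{doc['course']}-{doc['question']}-{doc['text'][:10]}"
    # MD5 is used only as a stable fingerprint here, not for security.
    return hashlib.md5(combined.encode("utf-8")).hexdigest()[:8]


# Made-up placeholder records, for illustration only.
documents = [
    {"course": "course-a", "question": "When does it start?", "text": "It starts in June."},
    {"course": "course-a", "question": "Can I join late?", "text": "Yes, you can still join."},
]

for doc in documents:
    doc["id"] = generate_document_id(doc)

# Each generated sample question carries the ID of the document it came from,
# so a retrieved result can later be checked against the expected document.
ground_truth = [
    {"question": "What is the start date?", "document_id": documents[0]["id"]},
    {"question": "Is late enrollment allowed?", "document_id": documents[1]["id"]},
]
```

Because the ID depends only on the document's content, re-running the script gives the same IDs, which keeps a saved evaluation file valid across runs. Very short IDs can collide on large datasets, so it may be worth checking for duplicates after generating them.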

[Day 189] I finished the Car Insurance Fraud MLOps project. Thank you MLOps zoomcamp for teaching me so much!

Hello :) Today is Day 189! A quick summary of today: completed the project and wrote the project description. The whole project and all info is on my repo. Here is a project diagram I created using lucid.app.

Well ~ today I added some final things to my project.

First, I added Terraform code to create GCP buckets for MLflow artifacts and raw data, and also to start a VM. The last one is kind of cool because when we start the VM, MLflow starts automatically as well, since the code to start MLflow is included in the `metadata_startup_script`, which is:

I also added the whole terraform init, plan and apply setup to the project's Makefile for easy setup and startup. So now the make help looks like:

Next, I added pylint. Following the suggestions from MLOps zoomcamp, I added pylint and fixed a lot of whitespace and import-order issues as per pylint's suggestions.

Next, I tried to add some tests using pytest. I added 5 simple tests to some of the prefect flows. The exact test...