RAG "Second Brain" for Technical Docs with Code
Hello all,
I’m building a RAG (Retrieval-Augmented Generation) system as a "second brain" for ~3-4k docs I’ve collected, each with descriptions and code snippets on cloud, OS, and more. Here’s my approach so far:
- Focus: Starting with retrieval and storage, aiming for quick access to relevant docs.
- Structure of each document: Does it make sense to also use an LLM API to create a short, standardised abstract of each doc to help with organization and tagging?
- Storage options: a vector DB for the embeddings, plus a relational DB for the metadata?
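
For concreteness, here is a minimal sketch of one possible layout, assuming each doc is already split into a description and a code snippet. Everything in it is illustrative: `embed` is a toy hashed bag-of-words stand-in for a real embedding model, the in-memory `vectors` list stands in for a vector DB, and the names `build_store`, `search`, and the `docs` table schema are hypothetical.

```python
import hashlib
import math
import re
import sqlite3

DIM = 256  # size of the toy embedding (assumption; a real model fixes this)


def embed(text):
    """Toy stand-in for an embedding model: hashed bag-of-words, unit length."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        idx = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_store(docs):
    """Keep metadata in SQLite and one vector per chunk in a plain list."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, tags TEXT)")
    vectors = []  # entries are (doc_id, chunk_kind, vector)
    for doc in docs:
        cur = db.execute(
            "INSERT INTO docs (title, tags) VALUES (?, ?)",
            (doc["title"], ",".join(doc["tags"])),
        )
        doc_id = cur.lastrowid
        # Embed description and code separately so either one can match a query.
        vectors.append((doc_id, "description", embed(doc["description"])))
        vectors.append((doc_id, "code", embed(doc["code"])))
    db.commit()
    return db, vectors


def search(db, vectors, query, top_k=3):
    """Rank docs by the best cosine similarity among their chunks."""
    q = embed(query)
    best = {}  # doc_id -> (score, chunk_kind)
    for doc_id, kind, vec in vectors:
        score = sum(a * b for a, b in zip(q, vec))  # both vectors are unit length
        if score > best.get(doc_id, (-1.0, ""))[0]:
            best[doc_id] = (score, kind)
    ranked = sorted(best.items(), key=lambda item: item[1][0], reverse=True)[:top_k]
    results = []
    for doc_id, (score, kind) in ranked:
        title, tags = db.execute(
            "SELECT title, tags FROM docs WHERE id = ?", (doc_id,)
        ).fetchone()
        results.append(
            {"title": title, "tags": tags.split(","), "matched": kind, "score": score}
        )
    return results


# Made-up sample docs, only to exercise the sketch.
docs = [
    {"title": "Resize an EBS volume", "tags": ["cloud", "aws"],
     "description": "Steps to grow an EBS volume and extend the filesystem.",
     "code": "aws ec2 modify-volume --volume-id vol-123 --size 200"},
    {"title": "Find large files on Linux", "tags": ["os", "linux"],
     "description": "List the biggest files under a directory.",
     "code": "du -ah /var | sort -rh | head -n 20"},
]
db, vectors = build_store(docs)
hits = search(db, vectors, "grow ebs volume")
```

The point of the split is that the vector side only has to answer "which chunks are similar to this query", while titles, tags, and any LLM-generated abstract live in ordinary tables that can be filtered and updated without re-embedding.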
Questions:
Any tips for structuring docs with code and descriptions for efficient retrieval?
Has LLM summarization/tagging worked well for your projects?
Which VectorDB do you recommend?
Thank you all!