LlamaIndex Prompt Engineering Tutorial (FlowGPT)
Use Cases
● Question-Answering
● Text Generation
● Summarization
● Planning
[Diagram: these use cases sit on top of LLMs plus context]
● How do we best augment LLMs with our own private data?
[Diagram: the same use cases, with LLMs drawing context from vector stores and SQL DBs]
LlamaIndex: A data framework for LLM applications
● Data Management and Query Engine for your LLM application
● Offers components across the data lifecycle: ingest, index, and query over data
Data Ingestion (LlamaHub 🦙)
● Connect your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.)
Data Structures
● Store and index your data for different use cases. Integrate with different DBs (vector DB, graph DB, KV DB).
Queries
● Retrieve and query over data. Includes: QA, Summarization, Agents, and more.
Data Connectors: powered by LlamaHub 🦙
● Easily ingest any kind of data, from anywhere
○ into unified document containers
● Powered by community-driven hub
○ rapidly growing (100+ loaders and counting!)
● Growing support for multimodal documents (e.g. with inline images)
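As a concrete sketch (assuming the download_loader helper and the community SimpleWebPageReader loader from LlamaHub; the URL is just a placeholder):

from llama_index import download_loader

# Fetch a community loader from LlamaHub at runtime
SimpleWebPageReader = download_loader("SimpleWebPageReader")

# Ingest a web page into unified Document containers
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://example.com"]
)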
Storage Abstractions
● Index data as vectors, knowledge graphs, or keywords, on top of pluggable storage backends:
KV Stores:
● In-memory
● MongoDB
● S3
Vector Stores:
● Pinecone
● Weaviate
● Chroma
● Milvus
● Faiss
● Qdrant
● Redis
● DeepLake
● Metal
● DynamoDB
● LanceDB
● OpenSearch
● etc.
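As an illustrative sketch of swapping in one of these backends (assuming the ChromaVectorStore integration and an in-process chromadb client; the "demo" collection name is a placeholder):

import chromadb
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

# Create an in-process Chroma collection ("demo" is a placeholder name)
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("demo")

# Point LlamaIndex at the external vector store via a StorageContext
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Build the index; embeddings now live in Chroma instead of in memory
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)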
RAG Stack for building a QA System
[Diagram: during data ingestion/parsing, a Doc is split into Chunks and loaded into a Vector Database; at query time, chunks are retrieved from the Vector Database and passed to an LLM]
Naive RAG Stack (Ingestion)
Current State:
● Load documents into a text representation (e.g. from LlamaHub)
● Split up document(s) into even chunks (by sentences, or by tokens)
● Load the chunks into a vector database
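A minimal sketch of this ingestion path (assuming the default in-memory vector store, a local "data" directory, and an arbitrary chunk size of 512 tokens):

from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

# Load documents into a text representation
documents = SimpleDirectoryReader("data").load_data()

# Control how documents are split into even chunks (512 tokens is arbitrary)
service_context = ServiceContext.from_defaults(chunk_size=512)

# Chunk, embed, and load into the (default in-memory) vector store
index = VectorStoreIndex.from_documents(documents, service_context=service_context)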
Naive RAG Stack (Querying)
Current State:
● Find the top-k most similar chunks from the vector database collection
● Plug them into an LLM response synthesis module
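Continuing the sketch above (the question string and similarity_top_k=2 are illustrative choices):

# Retrieve the top-k most similar chunks, then synthesize a response with the LLM
query_engine = index.as_query_engine(similarity_top_k=2)
response = query_engine.query("What did the author work on before college?")
print(response)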
Response Synthesis
● Create and refine: go through each retrieved chunk sequentially, refining the answer at each step
● Tree summarize: recursively summarize chunks bottom-up into a final answer
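Both strategies are exposed through the response_mode parameter of a query engine (a minimal sketch):

# Sequentially refine an answer across the retrieved chunks
refine_engine = index.as_query_engine(response_mode="refine")

# Recursively summarize the retrieved chunks bottom-up
tree_engine = index.as_query_engine(response_mode="tree_summarize")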
Let’s Build LLM Response Synthesis!
https://colab.research.google.com/drive/15Qk6cXCj8U5RcvdykGSRdWemn497Dqcv?usp=sharing
Challenge with RAG Stack
● Top-k retrieval can be limiting; it works mostly for questions about specific facts
● What if we wanted to ask summarization questions?
Summary Index: Returns All Context
from llama_index import SummaryIndex, SimpleDirectoryReader

# Load documents and build a summary index over all of them
documents = SimpleDirectoryReader('data').load_data()
index = SummaryIndex.from_documents(documents)

# A summary index returns all context; tree_summarize folds it into one answer
query_engine = index.as_query_engine(response_mode="tree_summarize")
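Running a summarization query might then look like this (the question is illustrative; the answer below suggests the demo corpus is Paul Graham's essay):

response = query_engine.query("Summarize the author's life and career.")
print(response)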
Answer
● The author began writing and programming before college, and studied philosophy in college before switching to AI.
● He realized that AI, as practiced at the time, was a hoax and decided to focus on Lisp hacking instead.
● He wrote a book about Lisp hacking and graduated with a PhD in computer science.
● ….
Building a Unified Query Interface