PWC AI Engineer Interview Assignment Guidelines

The document describes building a question answering application using AWS serverless technologies. It uses Langchain to retrieve relevant text from documents for context when answering questions. An AWS CDK project is created to define the infrastructure. Poetry is used for dependency management. Clinical trial documents are ingested and embedded for retrieval. Users can then ask questions and get answers augmented with relevant context passages.


Serverless GenAI Question Answering Application Using AWS CDK, AWS Lambda, Poetry, LangChain, OpenAI, Pinecone and Gradio

This post demonstrates building a basic document Q&A generative AI application using AWS serverless technologies, LangChain, the AWS Cloud Development Kit (CDK), Poetry, OpenAI, Pinecone and Gradio.
Large Language Models (LLMs) perform well on general knowledge retrieval tasks because they have been trained on a large corpus of openly available internet data. However, in an enterprise environment, or in use cases that involve questions on more specialized topics, a general-purpose LLM will struggle to produce precise responses. LLMs can be applied to more complex, knowledge-intensive tasks through an approach called Retrieval-Augmented Generation (RAG). RAG combines retrieval-based methods with generative language models to improve the quality of generated text, especially in question answering and text generation tasks.
For the purposes of this example, we will build a simple system for question answering over openly available clinical trial protocol documents (https://clinicaltrials.gov/). To set up the knowledge base for the RAG approach, we will build a data pipeline that ingests a clinical protocol document (PDF format), converts the PDF to text, chunks the document into smaller pieces, converts the chunks into embeddings using an embedding model, and stores the embeddings in a vector database.
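The chunking step can be sketched in a few lines of plain Python. This is a minimal illustration only, not the LangChain splitter the pipeline actually uses later; the function name, chunk size and overlap values are illustrative assumptions:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so that context spanning a
    chunk boundary is not lost at retrieval time."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Each chunk would then be passed to an embedding model and the resulting
# vector stored in the vector database alongside the chunk text.
sample = "word " * 100  # 500 characters of dummy protocol text
pieces = chunk_text(sample)
print(len(pieces), len(pieces[0]))  # → 4 200
```

The overlap means the tail of one chunk repeats at the head of the next, which helps retrieval when a relevant sentence straddles a boundary.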
Now, when we ask the LLM a question about the clinical trial, we can provide additional context by including relevant chunks of the document in the prompt. This can be done with a framework such as LangChain, which takes the user question, converts it into a vector representation, performs a semantic search against the knowledge base, retrieves the relevant chunks, ranks them by relevance, and sends them as context in the prompt to the LLM.
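The retrieval flow just described can be illustrated with an in-memory toy example. Here the "embeddings" are hand-rolled bag-of-words vectors rather than a real embedding model, the sample chunks are invented, and the prompt template is an assumption; in the actual application, OpenAI embeddings, Pinecone and LangChain perform these steps:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "Knowledge base": each chunk stored with its vector, as a vector DB would do.
chunks = [
    "The study enrolls 120 adult participants with type 2 diabetes.",
    "The primary endpoint is change in HbA1c at 24 weeks.",
    "Participants receive the study drug or placebo once daily.",
]
index = [(c, embed(c)) for c in chunks]

def retrieve(question: str, k: int = 2) -> list[str]:
    # Embed the question, rank chunks by similarity, return the top k.
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

question = "What is the primary endpoint of the study?"
context = retrieve(question)
# Assemble the augmented prompt that would be sent to the LLM.
prompt = ("Answer using only the context below.\n\nContext:\n"
          + "\n".join(context) + f"\n\nQuestion: {question}")
print(prompt)
```

The most relevant chunk (the one about the primary endpoint) ranks first and lands in the prompt, which is the essence of RAG.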

Architecture
The following diagram shows a high-level architecture of the system:

PwC | GenAI Question-Answering Application Using AWS Serverless Technologies


AWS Services and OSS Frameworks Used
AWS Cloud Development Kit v2 (CDK): Framework for defining cloud infrastructure in code and provisioning it through infrastructure-as-code. It enables developers to use programming languages such as Python, TypeScript and Java to define AWS resources in a higher-level, declarative manner. Benefits include code completion in an IDE instead of wrangling YAML files, sensible out-of-the-box defaults, declarative syntax, reusable code through constructs, and built-in leading practices.
Pinecone: Vector engine to store and search high-dimensional vector embeddings in real time. You could also use AWS alternatives for the vector database. AWS recently announced the preview launch of a vector engine for Amazon OpenSearch Serverless (https://aws.amazon.com/opensearch-service/serverless-vector-engine/). Other options available on AWS include Amazon Aurora PostgreSQL 15.3 with the pgvector extension.
LangChain: LangChain is a framework for developing applications powered by language models. It enables developers to combine LLMs with external data to create custom-knowledge chatbots, question-answering systems, summarization tools, and other applications that interact with users in natural language. We will build an AWS Lambda layer that includes LangChain and other relevant libraries for the backend systems to use as part of the application.
Poetry: Poetry is a dependency management and packaging tool for Python. It aims to simplify and improve the process of managing project dependencies, creating virtual environments, and packaging Python applications and libraries, combining the capabilities of existing tools like pip, virtualenv and setuptools into a single workflow. Developers declare project dependencies in a dedicated pyproject.toml file. Poetry performs dependency resolution to confirm that the specified packages and their versions are compatible with each other, and generates a lock file, poetry.lock, which records the exact versions of the dependencies used in the project.
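For orientation, a pyproject.toml for this project might look like the following sketch. The package names match those added later in the post; the version constraints, project name and author fields are illustrative assumptions:

```toml
[tool.poetry]
name = "genai_blog"
version = "0.1.0"
description = "Serverless GenAI Q&A application"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"
aws-cdk-lib = "^2.91"
constructs = "^10.2"
langchain = "0.0.251"
pinecone-client = "*"
pypdf = "*"
openai = "*"
gradio = "^3.40"
python-dotenv = "*"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```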
Gradio: Gradio is an open-source Python framework that enables developers to create interactive web applications for data science and machine learning projects with minimal effort. It lets users build dynamic, interactive web applications directly from their Python code, without extensive web development expertise.

Backend Data Ingestion Pipeline

• We will start by installing the AWS CDK and creating a template CDK project. Install the AWS CDK Toolkit globally using the following Node Package Manager command.

Unset
npm install -g aws-cdk

Run the following command to verify the tool installed correctly and print the installed version.

Unset
cdk --version

Run the following command to create a sample CDK project in Python.

Unset
cdk init app --language python



• Within the sample CDK project folder, create a sample Poetry project. Copy the pyproject.toml file to the base CDK folder, delete the sample Poetry project folder, and delete the .venv folder that was set up when the CDK project was initialized.
• Run 'poetry shell' to start a new virtual environment. Run 'poetry add aws-cdk-lib', 'poetry add constructs' and 'poetry add langchain pinecone-client pypdf openai gradio python-dotenv' to add packages.

You should see output similar to the following after running the poetry install command.

Unset
(genai-blog-py3.9) [ssm-user@redacted genai_blog]$ poetry install
Updating dependencies
Resolving dependencies... (2.4s)

Package operations: 52 installs, 0 updates, 0 removals

  • Installing attrs (23.1.0)
  • Installing exceptiongroup (1.1.2)
  • Installing packaging (23.1)
  • Installing six (1.16.0)
  • Installing typing-extensions (4.7.1)
  • Installing zipp (3.16.2)
  • Installing cattrs (23.1.2)
  • Installing certifi (2023.7.22)
  • Installing charset-normalizer (3.2.0)
  • Installing frozenlist (1.4.0)
  • Installing idna (3.4)
  • Installing importlib-resources (6.0.1)
  • Installing marshmallow (3.20.1)
  • Installing multidict (6.0.4)
  • Installing mypy-extensions (1.0.0)
  • Installing publication (0.0.3)
  • Installing python-dateutil (2.8.2)
  • Installing typeguard (2.13.3)
  • Installing urllib3 (2.0.4)
  • Installing aiosignal (1.3.1)
  • Installing async-timeout (4.0.3)
  • Installing greenlet (2.0.2)
  • Installing jsii (1.87.0)
  • Installing marshmallow-enum (1.5.1)
  • Installing numpy (1.25.2)
  • Installing pydantic (1.10.12)
  • Installing requests (2.31.0)
  • Installing typing-inspect (0.9.0)
  • Installing yarl (1.9.2)
  • Installing aiohttp (3.8.5)
  • Installing aws-cdk-asset-awscli-v1 (2.2.200)
  • Installing aws-cdk-asset-kubectl-v20 (2.1.2)
  • Installing aws-cdk-asset-node-proxy-agent-v5 (2.0.166)
  • Installing click (8.1.6)
  • Installing constructs (10.2.69)
  • Installing dataclasses-json (0.5.9)
  • Installing langsmith (0.0.22)
  • Installing mccabe (0.7.0)
  • Installing numexpr (2.8.5)
  • Installing openapi-schema-pydantic (1.2.4)
  • Installing pathspec (0.11.2)
  • Installing platformdirs (3.10.0)
  • Installing pycodestyle (2.11.0)
  • Installing pyflakes (3.1.0)
  • Installing pyyaml (6.0.1)
  • Installing sqlalchemy (2.0.19)
  • Installing tenacity (8.2.2)
  • Installing tomli (2.0.1)
  • Installing aws-cdk-lib (2.91.0)
  • Installing black (23.7.0)
  • Installing flake8 (6.1.0)
  • Installing langchain (0.0.251)

Writing lock file

Installing the current project: genai_blog (0.1.0)

• Create an index in Pinecone (https://www.pinecone.io). You could use the free Starter tier, which allows you to create one index and one project. Give the index a name, enter 1536 for the dimension and choose euclidean as the metric.

The reason we use 1536 as the dimension is that we will be using OpenAI embeddings for this application. Please refer to this link for details: https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
Next, create an API key within your Pinecone account. Note the API key value, environment and Pinecone index name.
• We will use the GPT-3.5 model as the LLM for this application. Go to https://platform.openai.com/account/api-keys and create an API key.
• We will save all the keys into a secret in AWS Secrets Manager and retrieve them wherever required in our application using the boto3 library. Create an api_keys.json file, enter your keys and create a secret using the AWS CLI.
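The api_keys.json file and CLI call are not reproduced in the original; a sketch might look like this. The key names match those the Lambda code reads later in the post, and the secret name demo/gb matches the one referenced in the CDK stack; the values are placeholders:

```shell
cat > api_keys.json <<'EOF'
{
  "PINECONE_API_KEY": "your-pinecone-key",
  "PINECONE_API_ENV": "your-pinecone-environment",
  "PINECONE_INDEX_NAME": "your-index-name",
  "OPENAI_API_KEY": "your-openai-key"
}
EOF

# Requires AWS credentials with secretsmanager permissions.
aws secretsmanager create-secret --name demo/gb --secret-string file://api_keys.json
```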



• Create a LangChain layer in AWS Lambda. Create a folder layers/common-layer/python. Create a requirements.txt file under layers/common-layer and add langchain, urllib3, openai and tiktoken as dependencies. The file structure should look similar to the following screenshot.
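The screenshot is not reproduced here; the resulting layout and requirements.txt contents would look roughly like this (the dependency list comes from the step above, the tree shape is an assumption based on the Lambda layer convention):

```
layers/
└── common-layer/
    ├── requirements.txt
    └── python/          # layer packages are built into this folder

# layers/common-layer/requirements.txt
langchain
urllib3
openai
tiktoken
```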

• Create a Makefile at the project root level.
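The Makefile itself appears only as a screenshot in the original. A minimal sketch, assuming its job is to install the layer's requirements into the python/ folder that AWS Lambda Python layers expect, might be (target name is hypothetical):

```makefile
# Hypothetical sketch: install layer dependencies into the folder
# structure required by AWS Lambda Python layers (python/ at layer root).
build-common-layer:
	pip install -r layers/common-layer/requirements.txt \
	    -t layers/common-layer/python
```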



• Add a 'make' command in cdk.json.
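The original shows this as a screenshot. The CDK CLI runs the command in the "build" key of cdk.json before synthesis, so wiring in make might look like the following; the make target name is an assumption, since the Makefile is only shown as a screenshot:

```json
{
  "app": "python3 app.py",
  "build": "make build-common-layer"
}
```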



• The project structure and initial dependencies have now been set up. We can add more libraries via Poetry as we go along building the application. If a library needs to be part of the Lambda layer package, we also need to add it to the requirements.txt file under the layers folder.
• We will create a stack 'genai_blog_stack' using Python. The code will:
– Create an S3 bucket
– Create an SQS queue to receive events when documents land in the "genai-demo" bucket under the 'raw' prefix
– Attach an event to the S3 bucket to send notifications to the SQS queue
– Build a common layer for libraries such as langchain
– Add a Lambda function to poll the queue, read the PDF, extract the text and convert it to embeddings
– Grant the necessary permissions to the Lambda function
– Configure the Lambda function to be triggered by messages from the SQS queue
– Grant the Lambda function permission to access the secret

All this in less than 50 lines of code! The following is the code snippet:

Python
from aws_cdk import RemovalPolicy, Stack, Duration
from aws_cdk import aws_s3 as s3
from aws_cdk import aws_sqs as sqs
from aws_cdk import aws_s3_notifications as s3n
from aws_cdk import aws_lambda as lambda_
from aws_cdk import aws_lambda_event_sources as event_sources
from aws_cdk import aws_secretsmanager as secretsmanager
from constructs import Construct


class GenaiBlogStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Create the S3 bucket
        bucket = s3.Bucket(self, "genai-demo", removal_policy=RemovalPolicy.DESTROY)

        # Create an SQS queue to receive events when documents land in the
        # "genai-demo" bucket under the 'raw' prefix
        queue = sqs.Queue(self, "genai-demo-queue", visibility_timeout=Duration.minutes(15))

        # Pass the queue URL as an environment variable
        env_queue_url = queue.queue_url

        # Attach an event to the S3 bucket to send notifications to the SQS queue
        bucket.add_event_notification(
            s3.EventType.OBJECT_CREATED,
            s3n.SqsDestination(queue),
            s3.NotificationKeyFilter(prefix="raw/", suffix=".pdf"),
        )

        # Build a common layer for libraries such as langchain
        common_lambda_layer = lambda_.LayerVersion(
            self,
            "CommonLambdaLayer",
            code=lambda_.Code.from_asset("./layers/common-layer"),
            compatible_runtimes=[lambda_.Runtime.PYTHON_3_9],
            layer_version_name="common_lambda_layer",
        )

        # Add a Lambda function to poll the queue, read the PDF, extract the
        # text and convert it to embeddings
        pdf_extraction_lambda = lambda_.Function(
            self,
            "GenaiPDFHandler",
            runtime=lambda_.Runtime.PYTHON_3_9,
            timeout=Duration.seconds(300),
            memory_size=2048,
            code=lambda_.Code.from_asset("lambda"),
            handler="genaiblog_pdf_extraction.handler",
            layers=[common_lambda_layer],
            environment={"QUEUE_URL": env_queue_url},
        )

        # Grant necessary permissions to the Lambda function
        queue.grant_send_messages(pdf_extraction_lambda)

        # Configure the Lambda function to be triggered by messages from the SQS queue
        queue_trigger = event_sources.SqsEventSource(queue, batch_size=1)
        pdf_extraction_lambda.add_event_source(queue_trigger)

        # Add S3 read permissions to the Lambda function
        bucket.grant_read(pdf_extraction_lambda)

        # Grant permission to the Lambda function to access the secret
        example_secret = secretsmanager.Secret.from_secret_name_v2(
            scope=self, id="secretExample", secret_name="demo/gb"
        )
        example_secret.grant_read(grantee=pdf_extraction_lambda)

• Write the Lambda function to read the PDF file uploaded to the S3 bucket, extract the text, convert it to embeddings and store them in the vector database.

Python
import json
import os

import boto3
import pinecone
from botocore.exceptions import ClientError
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.pinecone import Pinecone

s3 = boto3.client('s3')


def get_secret():
    secret_name = "demo/gb"
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager', region_name=region_name)

    # Retrieve the secret value
    try:
        get_secret_value_response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        raise e

    # The secret is decrypted using the associated KMS key
    secret = get_secret_value_response['SecretString']
    return json.loads(secret)


def handler(event, context):
    print("request: {}".format(json.dumps(event)))

    # Loop through the records. The queue batch size is set to 1,
    # so only one record will be retrieved per invocation.
    for record in event["Records"]:
        body_json = json.loads(record["body"])

        bucket_name = body_json["Records"][0]["s3"]["bucket"]["name"]
        print(bucket_name)
        key = body_json["Records"][0]["s3"]["object"]["key"]
        print(key)

        # Read the PDF file from S3 into Lambda temporary storage
        file_name = os.path.basename(key)
        local_file_path = '/tmp/' + file_name  # Path to store the downloaded file locally
        s3.download_file(bucket_name, key, local_file_path)

        # Load the PDF using pypdf into an array of documents, where each document
        # contains the page content and metadata with the page number
        loader = PyPDFLoader(local_file_path)
        pages = loader.load_and_split()
        print(pages[0])

        # Retrieve API keys from Secrets Manager
        secret = get_secret()
        pinecone_api_key = secret['PINECONE_API_KEY']
        pinecone_env = secret['PINECONE_API_ENV']
        pinecone_index_name = secret['PINECONE_INDEX_NAME']
        openai_api_key = secret['OPENAI_API_KEY']

        # Generate embeddings for the pages and insert them into the Pinecone
        # vector database
        pinecone.init(api_key=pinecone_api_key, environment=pinecone_env)
        embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
        Pinecone.from_documents(pages, embeddings, index_name=pinecone_index_name)
        print(f'{key} loaded to pinecone vector store...')

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/plain"},
        "body": "Document successfully loaded to vector store",
    }

• Run the following command to synthesize the AWS CloudFormation templates based on the CDK code.

Unset
cdk synth

• Run the following command to deploy the stack.

Unset
cdk deploy



• Wait for the application to be successfully deployed. We will then upload an openly available protocol document to our S3 bucket using the following command. The protocol is available to download here: https://classic.clinicaltrials.gov/ProvidedDocs/78/NCT03078478/Prot_000.pdf

Unset
aws s3 cp NCT03078478.pdf s3://genai-blog-pipeline-genaidemoaxxx-cxxxxcxx/raw/

• Uploading the PDF document to S3 generates an event whose details are stored in the SQS queue. Our 'genaiblog_pdf_extraction' Lambda function receives the details of the document, reads it from S3, converts the PDF to text, splits the document into chunks, calls the OpenAI embeddings model to convert the chunks to embeddings, and stores them in the Pinecone vector database.
• After the Lambda has completed execution, check your Pinecone dashboard to confirm the documents were stored in the vector database. In the enclosed screenshot, we see 249 vectors were stored after the execution of the Lambda.



• This pipeline can now be used to ingest additional PDF documents into the knowledge base in the vector store.

Frontend Chat User Interface

We will use the Gradio framework to build a chatbot interface for our question-answering application.
• Add gradio as a dependency to your project using 'poetry add gradio'.

Unset
(genai-blog-py3.9) [ssm-user@redacted genai_blog]$ poetry add gradio
Using version ^3.40.1 for gradio

Updating dependencies
Resolving dependencies... (15.3s)

Package operations: 43 installs, 0 updates, 0 removals

  • Installing rpds-py (0.9.2)
  • Installing sniffio (1.3.0)
  • Installing anyio (3.7.1)
  • Installing h11 (0.14.0)
  • Installing referencing (0.30.2)
  • Installing filelock (3.12.2)
  • Installing fsspec (2023.6.0)
  • Installing httpcore (0.17.3)
  ...........................

• Our Gradio frontend takes user questions about the protocol documents and uses LangChain to retrieve the relevant chunks of information from the Pinecone vector store, passing the query and the retrieved chunks as context to the LLM. Therefore, we need to provide the relevant API keys to the Gradio application. In this example, we run the frontend application locally; however, it could be containerized, deployed on an Amazon Elastic Kubernetes Service cluster and scaled using Application Load Balancers.
• For our local deployment, we will use the python-dotenv package to provide the API keys to the application. Create a folder 'frontend' in your project directory and add the app.py and .env files.
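The .env file is not shown in the original; it would contain the same keys the frontend reads via os.getenv, with placeholder values:

```
PINECONE_API_KEY=your-pinecone-key
PINECONE_API_ENV=your-pinecone-environment
PINECONE_INDEX_NAME=your-index-name
OPENAI_API_KEY=your-openai-key
```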



Python
import os

import gradio as gr
import pinecone
from dotenv import load_dotenv
from langchain.chains import RetrievalQA
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms.openai import OpenAI
from langchain.vectorstores import Pinecone

load_dotenv()

pinecone_api_key = os.getenv("PINECONE_API_KEY")
pinecone_env = os.getenv("PINECONE_API_ENV")
pinecone_index_name = os.getenv("PINECONE_INDEX_NAME")
openai_api_key = os.getenv("OPENAI_API_KEY")

pinecone.init(api_key=pinecone_api_key, environment=pinecone_env)
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)


def predict(message, history):
    # Initialize the OpenAI module, load and run the Retrieval Q&A chain
    vectordb = Pinecone.from_existing_index(
        index_name=pinecone_index_name, embedding=embeddings
    )
    retriever = vectordb.as_retriever()
    llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
    qa = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=retriever)
    return qa.run(message)


gr.ChatInterface(
    predict,
    title="Clinical Trials Q&A Bot",
    description="Ask questions about Clinical Trial protocol documents...",
).launch()

Run the frontend application using the following command:

Unset
gradio app.py

• This should bring up a chatbot interface at the following local URL: http://127.0.0.1:7860/. Now you can ask very specific questions and have the LLM respond with information from the protocol documents using the Retrieval-Augmented Generation (RAG) approach.



References
• AWS Cloud Development Kit (AWS CDK) v2 (https://docs.aws.amazon.com/cdk/v2/guide/home.html)
• Amazon OpenSearch Service (https://docs.aws.amazon.com/opensearch-service/index.html)
• Pinecone vector database (https://www.pinecone.io/)
• LangChain (https://python.langchain.com/docs/get_started/introduction.html)
• Poetry (https://python-poetry.org/docs/)
• Gradio (https://www.gradio.app)

Contact Us

Thank you

pwc.com