PWC AI Engineer Interview Assignment Guidelines

The document describes building a question answering application using AWS serverless technologies. It uses Langchain to retrieve relevant text from documents for context when answering questions. An AWS CDK project is created to define the infrastructure. Poetry is used for dependency management. Clinical trial documents are ingested and embedded for retrieval. Users can then ask questions and get answers augmented with relevant context passages.


Serverless GenAI Question Answering Application Using AWS CDK, AWS Lambda, Poetry, LangChain, OpenAI, Pinecone and Gradio

This post demonstrates building a basic document Q&A generative AI application using AWS serverless technologies, LangChain, the AWS Cloud Development Kit (CDK), Poetry, OpenAI, Pinecone and Gradio.
Large Language Models (LLMs) perform well on general knowledge retrieval tasks because they have been trained on a large corpus of openly available internet data. However, in an enterprise environment, or in use cases that involve questions on more specialized topics, a general-purpose LLM will struggle to produce precise responses. LLMs can be applied to more complex, knowledge-intensive tasks through an approach called Retrieval-Augmented Generation (RAG). RAG combines retrieval-based methods with generative language models to improve the quality of generated text, especially in question answering and text generation tasks.
For the purposes of this example, we will build a simple system for question answering over openly available clinical trial protocol documents (https://clinicaltrials.gov/). To set up the knowledge base for the RAG approach, we will build a data pipeline that ingests a clinical protocol document (PDF format), converts the PDF to text, chunks the document into smaller pieces, converts the chunks into embeddings using an embedding model, and stores the embeddings in a vector database.
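The chunking step can be sketched in a few lines of plain Python. This is a minimal illustration only, not the LangChain splitter the pipeline actually uses later; the function name, chunk size and overlap values are illustrative assumptions:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so that context spanning a
    chunk boundary is not lost at retrieval time."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Each chunk would then be passed to an embedding model and the resulting
# vector stored in the vector database alongside the chunk text.
sample = "word " * 100  # 500 characters of dummy protocol text
pieces = chunk_text(sample)
print(len(pieces), len(pieces[0]))  # → 4 200
```

The overlap means the tail of one chunk repeats at the head of the next, which helps retrieval when a relevant sentence straddles a boundary.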
Now, when we ask the LLM a question about the clinical trial, we can provide additional context by including relevant chunks of the document in the prompt. This can be done with a framework such as LangChain, which takes the user question, converts it into a vector representation, performs a semantic search against the knowledge base, retrieves the relevant chunks, ranks them by relevance, and sends them as context in the prompt to the LLM.
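The retrieval flow just described can be illustrated with an in-memory toy example. Here the "embeddings" are hand-rolled bag-of-words vectors rather than a real embedding model, the sample chunks are invented, and the prompt template is an assumption; in the actual application, OpenAI embeddings, Pinecone and LangChain perform these steps:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "Knowledge base": each chunk stored with its vector, as a vector DB would do.
chunks = [
    "The study enrolls 120 adult participants with type 2 diabetes.",
    "The primary endpoint is change in HbA1c at 24 weeks.",
    "Participants receive the study drug or placebo once daily.",
]
index = [(c, embed(c)) for c in chunks]

def retrieve(question: str, k: int = 2) -> list[str]:
    # Embed the question, rank chunks by similarity, return the top k.
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

question = "What is the primary endpoint of the study?"
context = retrieve(question)
# Assemble the augmented prompt that would be sent to the LLM.
prompt = ("Answer using only the context below.\n\nContext:\n"
          + "\n".join(context) + f"\n\nQuestion: {question}")
print(prompt)
```

The most relevant chunk (the one about the primary endpoint) ranks first and lands in the prompt, which is the essence of RAG.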

Architecture
The following diagram shows a high-level architecture of the system:

PwC | GenAI Question-Answering Application Using AWS Serverless Technologies


AWS Services and OSS Frameworks Used
AWS Cloud Development Kit v2 (CDK): Framework for defining cloud infrastructure in code and provisioning it through infrastructure-as-code. It enables developers to use programming languages such as Python, TypeScript and Java to define AWS resources in a higher-level, declarative manner. Benefits include code completion in an IDE instead of wrangling YAML files, sensible out-of-the-box defaults, declarative syntax, reusable code through constructs, and built-in leading practices.
Pinecone: Vector engine to store and search high-dimensional vector embeddings in real time. You could also use AWS alternatives for the vector database. AWS recently announced the preview launch of a vector engine for Amazon OpenSearch Serverless (https://aws.amazon.com/opensearch-service/serverless-vector-engine/). Other options available on AWS include Amazon Aurora PostgreSQL 15.3 with the pgvector extension.
LangChain: LangChain is a framework for developing applications powered by language models. It enables developers to combine LLMs with external data to create custom-knowledge chatbots, question-answering systems, summarization tools, and other applications that interact with users in natural language. We will build an AWS Lambda layer that includes LangChain and other relevant libraries for the backend systems to use as part of the application.
Poetry: Poetry is a dependency management and packaging tool for Python. It aims to simplify and improve the process of managing project dependencies, creating virtual environments, and packaging Python applications and libraries, combining the capabilities of existing tools like pip, virtualenv and setuptools into a single workflow. Developers declare project dependencies in a dedicated pyproject.toml file. Poetry performs dependency resolution to confirm that the specified packages and their versions are compatible with each other, and generates a lock file, poetry.lock, which records the exact versions of the dependencies used in the project.
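For orientation, a pyproject.toml for this project might look like the following sketch. The package names match those added later in the post; the version constraints, project name and author fields are illustrative assumptions:

```toml
[tool.poetry]
name = "genai_blog"
version = "0.1.0"
description = "Serverless GenAI Q&A application"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"
aws-cdk-lib = "^2.91"
constructs = "^10.2"
langchain = "0.0.251"
pinecone-client = "*"
pypdf = "*"
openai = "*"
gradio = "^3.40"
python-dotenv = "*"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```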
Gradio: Gradio is an open-source Python framework that enables developers to create interactive web applications for data science and machine learning projects with minimal effort. It lets users build dynamic, interactive web applications directly from their Python code, without extensive web development expertise.

Backend Data Ingestion Pipeline

• We will start by installing the AWS CDK and creating a template CDK project. Install the AWS CDK Toolkit globally using the following Node Package Manager command.

Unset
npm install -g aws-cdk

Run the following command to verify the tool installed correctly and print the installed version.

Unset
cdk --version

Run the following command to create a sample CDK project in Python.

Unset
cdk init app --language python



• Within the sample CDK project folder, create a sample Poetry project. Copy the pyproject.toml file to the base CDK folder, delete the sample Poetry project folder, and delete the .venv folder that was set up when the CDK project was initialized.
• Run 'poetry shell' to start a new virtual environment. Run 'poetry add aws-cdk-lib', 'poetry add constructs' and 'poetry add langchain pinecone-client pypdf openai gradio python-dotenv' to add packages.

You should see output similar to the following after running the poetry install command.

Unset
(genai-blog-py3.9) [ssm-user@redacted genai_blog]$ poetry install
Updating dependencies
Resolving dependencies... (2.4s)

Package operations: 52 installs, 0 updates, 0 removals

  • Installing attrs (23.1.0)
  • Installing exceptiongroup (1.1.2)
  • Installing packaging (23.1)
  • Installing six (1.16.0)
  • Installing typing-extensions (4.7.1)
  • Installing zipp (3.16.2)
  • Installing cattrs (23.1.2)
  • Installing certifi (2023.7.22)
  • Installing charset-normalizer (3.2.0)
  • Installing frozenlist (1.4.0)
  • Installing idna (3.4)
  • Installing importlib-resources (6.0.1)
  • Installing marshmallow (3.20.1)
  • Installing multidict (6.0.4)
  • Installing mypy-extensions (1.0.0)
  • Installing publication (0.0.3)
  • Installing python-dateutil (2.8.2)
  • Installing typeguard (2.13.3)
  • Installing urllib3 (2.0.4)
  • Installing aiosignal (1.3.1)
  • Installing async-timeout (4.0.3)
  • Installing greenlet (2.0.2)
  • Installing jsii (1.87.0)
  • Installing marshmallow-enum (1.5.1)
  • Installing numpy (1.25.2)
  • Installing pydantic (1.10.12)
  • Installing requests (2.31.0)
  • Installing typing-inspect (0.9.0)
  • Installing yarl (1.9.2)
  • Installing aiohttp (3.8.5)
  • Installing aws-cdk-asset-awscli-v1 (2.2.200)
  • Installing aws-cdk-asset-kubectl-v20 (2.1.2)
  • Installing aws-cdk-asset-node-proxy-agent-v5 (2.0.166)
  • Installing click (8.1.6)
  • Installing constructs (10.2.69)
  • Installing dataclasses-json (0.5.9)
  • Installing langsmith (0.0.22)
  • Installing mccabe (0.7.0)
  • Installing numexpr (2.8.5)
  • Installing openapi-schema-pydantic (1.2.4)
  • Installing pathspec (0.11.2)
  • Installing platformdirs (3.10.0)
  • Installing pycodestyle (2.11.0)
  • Installing pyflakes (3.1.0)
  • Installing pyyaml (6.0.1)
  • Installing sqlalchemy (2.0.19)
  • Installing tenacity (8.2.2)
  • Installing tomli (2.0.1)
  • Installing aws-cdk-lib (2.91.0)
  • Installing black (23.7.0)
  • Installing flake8 (6.1.0)
  • Installing langchain (0.0.251)

Writing lock file

Installing the current project: genai_blog (0.1.0)

• Create an index in Pinecone (https://www.pinecone.io). You could use the free Starter tier, which allows you to create one index and one project. Give the index a name, enter 1536 for the dimension and choose euclidean as the metric.

The reason we use 1536 as the dimension is that we will be using OpenAI embeddings for this application. Please refer to this link for details: https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
Next, create an API key within your Pinecone account. Note the API key value, environment and Pinecone index name.
• We will use the GPT-3.5 model as the LLM for this application. Go to https://platform.openai.com/account/api-keys and create an API key.
• We will save all the keys into a secret in AWS Secrets Manager and retrieve them wherever required in our application using the boto3 library. Create an api_keys.json file, enter your keys and create a secret using the AWS CLI.
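The api_keys.json file and CLI call are not reproduced in the original; a sketch might look like this. The key names match those the Lambda code reads later in the post, and the secret name demo/gb matches the one referenced in the CDK stack; the values are placeholders:

```shell
cat > api_keys.json <<'EOF'
{
  "PINECONE_API_KEY": "your-pinecone-key",
  "PINECONE_API_ENV": "your-pinecone-environment",
  "PINECONE_INDEX_NAME": "your-index-name",
  "OPENAI_API_KEY": "your-openai-key"
}
EOF

# Requires AWS credentials with secretsmanager permissions.
aws secretsmanager create-secret --name demo/gb --secret-string file://api_keys.json
```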



• Create a LangChain layer in AWS Lambda. Create a folder layers/common-layer/python. Create a requirements.txt file under layers/common-layer and add langchain, urllib3, openai and tiktoken as dependencies. The file structure should look similar to the following screenshot.
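The screenshot is not reproduced here; the resulting layout and requirements.txt contents would look roughly like this (the dependency list comes from the step above, the tree shape is an assumption based on the Lambda layer convention):

```
layers/
└── common-layer/
    ├── requirements.txt
    └── python/          # layer packages are built into this folder

# layers/common-layer/requirements.txt
langchain
urllib3
openai
tiktoken
```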

• Create a Makefile at the project root level.
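The Makefile itself appears only as a screenshot in the original. A minimal sketch, assuming its job is to install the layer's requirements into the python/ folder that AWS Lambda Python layers expect, might be (target name is hypothetical):

```makefile
# Hypothetical sketch: install layer dependencies into the folder
# structure required by AWS Lambda Python layers (python/ at layer root).
build-common-layer:
	pip install -r layers/common-layer/requirements.txt \
	    -t layers/common-layer/python
```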



• Add a 'make' command in cdk.json.
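The original shows this as a screenshot. The CDK CLI runs the command in the "build" key of cdk.json before synthesis, so wiring in make might look like the following; the make target name is an assumption, since the Makefile is only shown as a screenshot:

```json
{
  "app": "python3 app.py",
  "build": "make build-common-layer"
}
```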



• The project structure and initial dependencies have now been set up. We can add more libraries via Poetry as we go along building the application. If a library needs to be part of the Lambda layer package, we also need to add it to the requirements.txt file under the layers folder.
• We will create a stack 'genai_blog_stack' using Python. The code will:
– Create an S3 bucket
– Create an SQS queue to receive events when documents land in the "genai-demo" bucket under the 'raw' prefix
– Attach an event to the S3 bucket to send notifications to the SQS queue
– Build a common layer for libraries such as langchain
– Add a Lambda function to poll the queue, read the PDF, extract the text and convert it to embeddings
– Grant the necessary permissions to the Lambda function
– Configure the Lambda function to be triggered by messages from the SQS queue
– Grant the Lambda function permission to access the secret

All this in less than 50 lines of code! The following is the code snippet:

Python
from aws_cdk import RemovalPolicy, Stack, Duration
from aws_cdk import aws_s3 as s3
from aws_cdk import aws_sqs as sqs
from aws_cdk import aws_s3_notifications as s3n
from aws_cdk import aws_lambda as lambda_
from aws_cdk import aws_lambda_event_sources as event_sources
from aws_cdk import aws_secretsmanager as secretsmanager
from constructs import Construct


class GenaiBlogStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Create the S3 bucket
        bucket = s3.Bucket(self, "genai-demo", removal_policy=RemovalPolicy.DESTROY)

        # Create an SQS queue to receive events when documents land in the
        # "genai-demo" bucket under the 'raw' prefix
        queue = sqs.Queue(self, "genai-demo-queue", visibility_timeout=Duration.minutes(15))

        # Pass the queue URL as an environment variable
        env_queue_url = queue.queue_url

        # Attach an event to the S3 bucket to send notifications to the SQS queue
        bucket.add_event_notification(
            s3.EventType.OBJECT_CREATED,
            s3n.SqsDestination(queue),
            s3.NotificationKeyFilter(prefix="raw/", suffix=".pdf"),
        )

        # Build a common layer for libraries such as langchain
        common_lambda_layer = lambda_.LayerVersion(
            self,
            "CommonLambdaLayer",
            code=lambda_.Code.from_asset("./layers/common-layer"),
            compatible_runtimes=[lambda_.Runtime.PYTHON_3_9],
            layer_version_name="common_lambda_layer",
        )

        # Add a Lambda function to poll the queue, read the PDF, extract the
        # text and convert it to embeddings
        pdf_extraction_lambda = lambda_.Function(
            self,
            "GenaiPDFHandler",
            runtime=lambda_.Runtime.PYTHON_3_9,
            timeout=Duration.seconds(300),
            memory_size=2048,
            code=lambda_.Code.from_asset("lambda"),
            handler="genaiblog_pdf_extraction.handler",
            layers=[common_lambda_layer],
            environment={"QUEUE_URL": env_queue_url},
        )

        # Grant necessary permissions to the Lambda function
        queue.grant_send_messages(pdf_extraction_lambda)

        # Configure the Lambda function to be triggered by messages from the SQS queue
        queue_trigger = event_sources.SqsEventSource(queue, batch_size=1)
        pdf_extraction_lambda.add_event_source(queue_trigger)

        # Add S3 read permissions to the Lambda function
        bucket.grant_read(pdf_extraction_lambda)

        # Grant permission to the Lambda function to access the secret
        example_secret = secretsmanager.Secret.from_secret_name_v2(
            scope=self, id="secretExample", secret_name="demo/gb"
        )
        example_secret.grant_read(grantee=pdf_extraction_lambda)

• Write the Lambda function to read the PDF file uploaded to the S3 bucket, extract the text, convert it to embeddings and store them in the vector database.

Python
import json
import os

import boto3
import pinecone
from botocore.exceptions import ClientError
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.pinecone import Pinecone

s3 = boto3.client('s3')


def get_secret():
    secret_name = "demo/gb"
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager', region_name=region_name)

    # Retrieve the secret value
    try:
        get_secret_value_response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        raise e

    # The secret is decrypted using the associated KMS key
    secret = get_secret_value_response['SecretString']
    return json.loads(secret)


def handler(event, context):
    print("request: {}".format(json.dumps(event)))

    # Loop through the records. The queue batch size is set to 1,
    # so only one record will be retrieved per invocation.
    for record in event["Records"]:
        body_json = json.loads(record["body"])

        bucket_name = body_json["Records"][0]["s3"]["bucket"]["name"]
        print(bucket_name)
        key = body_json["Records"][0]["s3"]["object"]["key"]
        print(key)

        # Read the PDF file from S3 into Lambda temporary storage
        file_name = os.path.basename(key)
        local_file_path = '/tmp/' + file_name  # Path to store the downloaded file locally
        s3.download_file(bucket_name, key, local_file_path)

        # Load the PDF using pypdf into an array of documents, where each document
        # contains the page content and metadata with the page number
        loader = PyPDFLoader(local_file_path)
        pages = loader.load_and_split()
        print(pages[0])

        # Retrieve API keys from Secrets Manager
        secret = get_secret()
        pinecone_api_key = secret['PINECONE_API_KEY']
        pinecone_env = secret['PINECONE_API_ENV']
        pinecone_index_name = secret['PINECONE_INDEX_NAME']
        openai_api_key = secret['OPENAI_API_KEY']

        # Generate embeddings for the pages and insert them into the Pinecone
        # vector database
        pinecone.init(api_key=pinecone_api_key, environment=pinecone_env)
        embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
        Pinecone.from_documents(pages, embeddings, index_name=pinecone_index_name)
        print(f'{key} loaded to pinecone vector store...')

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/plain"},
        "body": "Document successfully loaded to vector store",
    }

• Run the following command to synthesize the AWS CloudFormation templates based on the CDK code.

Unset
cdk synth

• Run the following command to deploy the stack.

Unset
cdk deploy



• Wait for the application to be successfully deployed. We will then upload an openly available protocol document to our S3 bucket using the following command. The protocol is available to download here: https://classic.clinicaltrials.gov/ProvidedDocs/78/NCT03078478/Prot_000.pdf

Unset
aws s3 cp NCT03078478.pdf s3://genai-blog-pipeline-genaidemoaxxx-cxxxxcxx/raw/

• Uploading the PDF document to S3 generates an event whose details are stored in the SQS queue. Our 'genaiblog_pdf_extraction' Lambda function receives the details of the document, reads it from S3, converts the PDF to text, splits the document into chunks, calls the OpenAI embeddings model to convert the chunks to embeddings, and stores them in the Pinecone vector database.
• After the Lambda has completed execution, check your Pinecone dashboard to confirm the documents were stored in the vector database. In the enclosed screenshot, we see 249 vectors were stored after the execution of the Lambda.



• This pipeline can now be used to ingest additional PDF documents into the knowledge base in the vector store.

Frontend Chat User Interface

We will use the Gradio framework to build a chatbot interface for our question-answering application.
• Add gradio as a dependency to your project using 'poetry add gradio'.

Unset
(genai-blog-py3.9) [ssm-user@redacted genai_blog]$ poetry add gradio
Using version ^3.40.1 for gradio

Updating dependencies
Resolving dependencies... (15.3s)

Package operations: 43 installs, 0 updates, 0 removals

  • Installing rpds-py (0.9.2)
  • Installing sniffio (1.3.0)
  • Installing anyio (3.7.1)
  • Installing h11 (0.14.0)
  • Installing referencing (0.30.2)
  • Installing filelock (3.12.2)
  • Installing fsspec (2023.6.0)
  • Installing httpcore (0.17.3)
  ...........................

• Our Gradio frontend takes user questions about the protocol documents and uses LangChain to retrieve the relevant chunks of information from the Pinecone vector store, passing the query and the retrieved chunks as context to the LLM. Therefore, we need to provide the relevant API keys to the Gradio application. In this example, we run the frontend application locally; however, it could be containerized, deployed on an Amazon Elastic Kubernetes Service cluster and scaled using Application Load Balancers.
• For our local deployment, we will use the python-dotenv package to provide the API keys to the application. Create a folder 'frontend' in your project directory and add the app.py and .env files.
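The .env file is not shown in the original; it would contain the same keys the frontend reads via os.getenv, with placeholder values:

```
PINECONE_API_KEY=your-pinecone-key
PINECONE_API_ENV=your-pinecone-environment
PINECONE_INDEX_NAME=your-index-name
OPENAI_API_KEY=your-openai-key
```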



Python
import os

import gradio as gr
import pinecone
from dotenv import load_dotenv
from langchain.chains import RetrievalQA
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms.openai import OpenAI
from langchain.vectorstores import Pinecone

load_dotenv()

pinecone_api_key = os.getenv("PINECONE_API_KEY")
pinecone_env = os.getenv("PINECONE_API_ENV")
pinecone_index_name = os.getenv("PINECONE_INDEX_NAME")
openai_api_key = os.getenv("OPENAI_API_KEY")

pinecone.init(api_key=pinecone_api_key, environment=pinecone_env)
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)


def predict(message, history):
    # Initialize the OpenAI module, load and run the Retrieval Q&A chain
    vectordb = Pinecone.from_existing_index(
        index_name=pinecone_index_name, embedding=embeddings
    )
    retriever = vectordb.as_retriever()
    llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
    qa = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=retriever)
    return qa.run(message)


gr.ChatInterface(
    predict,
    title="Clinical Trials Q&A Bot",
    description="Ask questions about Clinical Trial protocol documents...",
).launch()

Run the frontend application using the following command:

Unset
gradio app.py

• This should bring up a chatbot interface at the following local URL: http://127.0.0.1:7860/. Now you can ask very specific questions and have the LLM respond with information from the protocol documents using the Retrieval-Augmented Generation (RAG) approach.



References
• AWS Cloud Development Kit (AWS CDK) v2 (https://docs.aws.amazon.com/cdk/v2/guide/home.html)
• Amazon OpenSearch Service (https://docs.aws.amazon.com/opensearch-service/index.html)
• Pinecone vector database (https://www.pinecone.io/)
• LangChain (https://python.langchain.com/docs/get_started/introduction.html)
• Poetry (https://python-poetry.org/docs/)
• Gradio (https://www.gradio.app)

Contact Us

Thank you

pwc.com