readme

Commit 4623966b authored 8 months ago by abir.chebbi
--- a/README.md
+++ b/README.md
@@ -4,6 +4,7 @@
 1. AWS CLI: Ensure the AWS CLI is installed and configured on your laptop (refer to the setup guide provided in Session 1).
 2. Python: Ensure Python 3.8 or higher is installed.
 3. Install the required Python libraries listed in `requirements.txt`:
+
 `pip3 install -r requirements.txt`


@@ -11,18 +12,22 @@

 ### Step 1: Object Storage Creation
 Create an S3 bucket and upload a few PDF files by running: 
+
 `python create-S3-and-put-docs.py --bucket_name [YourBucketName] --local_path [PathToYourPDFFiles]`
+
 Where:
-`--bucket_name`: The name for the new S3 bucket to be created.
-`--local_path`: The local directory path where the PDF files are stored.
+- **--bucket_name**: The name for the new S3 bucket to be created.
+- **--local_path**: The local directory path where the PDF files are stored.


 ### Step 2: Vector Store Creation
 Create a vector database for storing embeddings by running: 
+
 `python create-vector-db.py --collection_name [Name_of_collection] --IAM_user [YourIAM_User]`
+
 Where: 
-`--collection_name`: Name of the collection that you want to create to store embeddings.
-`--IAM_USER` : For example for group 14 the IAM USER = master-group-14
+- **--collection_name**: Name of the collection that you want to create to store embeddings.
+- **--IAM_user**: The name of your IAM user. For example, for group 14 the IAM user is master-group-14.


 This script performs the following actions:
@@ -35,12 +40,14 @@ This script performs the following actions:
 After setting up the S3 bucket and the Vector Store, we can process the PDF files to generate embeddings and store them in the vector database.

 Run: 
+
 `python main.py --bucket_name [YourBucketName] --endpoint [YourVectorDBEndpoint]`

 Where: 
-`--bucket_name`: The name of the S3 bucket containing the PDF files.
-`--endpoint`: Endpoint for the vector database.
-`--index_name`: The index_name where to store the embeddings in the collection.
+
+- **--bucket_name**: The name of the S3 bucket containing the PDF files.
+- **--endpoint**: Endpoint for the vector database.
+- **--index_name**: The name of the index in the collection where the embeddings are stored.

 The main.py script will:
 1. Download PDF files from the S3 bucket.