diff --git a/1.png b/1.png
index f1c014421de317046b43210697e6d9ba87ecdb5b..6c6b36b9308130b9ba854f18a6911b193e44c47c 100644
Binary files a/1.png and b/1.png differ
diff --git a/README.md b/README.md
index de99c5e77cb7c8c13a29c49c5ccb0b829ab6e540..0f1e8c8f36022229c4a5134aebd8abd198074959 100644
--- a/README.md
+++ b/README.md
@@ -2,11 +2,11 @@
 ## GOAL
 
-In this exercise, students will deploy a Kubernetes cluster locally to manage an application that retrieves and stores electrical consumption data, forecasts future consumption, and presents both historical and projected consumption trends.
+In this exercise, students will deploy a Kubernetes cluster locally. This cluster will be used to manage an application that retrieves and stores electrical consumption data, forecasts future consumption, and presents both historical and projected consumption trends.
 
-The electrical consumption is reprsented by a a CSV file stored on S3. This CSV file has 11 columns. The first column is the time stamp. The other ten columns each represent the power measurements (P) of a smart meter's electricity consumption. Measurements are taken every 15 minutes. A row in the CSV file therefore corresponds to the power measurement at a given time t (HH:00, HH:15, HH:30. HH:45) for the 10 smart meters. The measures cover the period 01.01.2021 - 31.05.2022.
+The electrical consumption is represented by a CSV file stored on S3. This CSV file has 11 columns. The first column is the time stamp. The other ten columns each represent the power measurements (P) of a smart meter's electricity consumption. Measurements are taken every 15 minutes. A row in the CSV file therefore corresponds to the power measurement at a given time t (HH:00, HH:15, HH:30, HH:45) for the 10 smart meters. The measurements cover the period 01.01.2021 - 31.05.2022.
 
-The application will be deployed on a local kubernetes cluster created using the [kind] (https://kind.sigs.k8s.io/) tool.
+The application will be deployed on a local Kubernetes cluster created using the [kind](https://kind.sigs.k8s.io/) tool.
 
 ## Kind
 
@@ -24,7 +24,7 @@ Data Retrieval is a Python program that reads the S3 bucket where the CSV file i
 
 ### Forecast
 
-Based on the data provided, forecasts X amounts of days in the future. It utilizes LSTM to forecast based on historical data from Groupe E. It will create as many instances as devices in the database (in this case, it will be capped at 10 instances/devices). Every minute, it will forecast the following day, storing the result in the database as a RedisTimeSeries.
+Based on the data provided, the **Forecast** module forecasts X days into the future. It uses the LSTM machine-learning algorithm. The goal is to create as many instances as there are devices in the database. Every minute, it forecasts the following day and stores the result in the database as a RedisTimeSeries.
 
 ### Redis
 
@@ -32,7 +32,7 @@ Redis is an in-memory database largely used as a cache. In this case, we’ll us
 
 ### Grafana
 
-Grafana is an analytics visualization platform which will be used to visualize the historical and forecasted data being processed. It connects to Redis and displays the RedisTimeSeries as a line graph, showing passed and future power consumption.
+Grafana is an analytics visualization platform used to visualize the historical and forecasted data being processed. It connects to Redis and displays the RedisTimeSeries as a line graph, showing past and future power consumption.
 
 ## Setup
 1. Clone this repository and create an account on [Docker Hub](https://hub.docker.com).
@@ -43,7 +43,7 @@ Grafana is an analytics visualization platform which will be used to visualize t
 
 Kind will create a cluster by deploying a docker container for each node in the cluster. Each container will have a kubernetes runtime, and kind will setup the required networking to make the cluster work.
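+By default, `kind create cluster` starts a single-node cluster. As a sketch, a multi-node cluster can instead be described in a small config file passed via `kind create cluster --config kind-config.yaml`. The file name and node roles below are assumptions for illustration; this exercise may well use the default single-node setup:
+
+```yaml
+# kind-config.yaml (hypothetical): one control-plane node and two workers,
+# each of which kind runs as a separate docker container
+kind: Cluster
+apiVersion: kind.x-k8s.io/v1alpha4
+nodes:
+  - role: control-plane
+  - role: worker
+  - role: worker
+```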
-At this point, you should be able visualize your cluster by running:
+At this point, you should be able to visualize your cluster by running:
 
 ```bash
 docker ps
@@ -98,11 +98,11 @@ Both the `data-retrieval` and `forecast` folders have the following structure:
 ### Task 1: Redis deployment
 
 1. Fill the redis-deployment.yaml
-2. Dploy the redis module using kubectl.
+2. Deploy the redis module using kubectl.
 
 ### Task 2: Data Retrieval deployment
 
-The data-retrieval module must access the S3 object storage to read the CSV file containing the electrical consumption. You must therfore use the AWS Acess key and secret key. We use "secrets" to propagate confidential data through the cluster. Read the data-retrieval-deployment.yaml carefully and spot the name of the secrets used. The secrets must be generated by the following command:
+The data-retrieval module must access the S3 object storage to read the CSV file containing the electrical consumption. You must therefore use the AWS access key and secret key. We use "secrets" to propagate confidential data through the cluster. Read the data-retrieval-deployment.yaml carefully and find out where the secrets should be used. The secrets must be generated with the following command:
 
 ```bash
 kubectl create secret generic <name-of-the-secrets> \
@@ -110,7 +110,7 @@ kubectl create secret generic <name-of-the-secrets> \
 --from-literal=AWS_SECRET_ACCESS_KEY=<your secret key>
 ```
 
 1. Build the Data Retrieval docker and push it to your dockerhub account
-2. complete the file "data-retrieval-deployment.yaml"
+2. Complete the file "data-retrieval-deployment.yaml"
 3. Deploy the data-retrieval module using kubectl
 
 ### Task 3: Forecast deployment
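+As general guidance for the deployment manifests in Tasks 2 and 3, secrets created with `kubectl create secret generic` are typically consumed as container environment variables. A minimal sketch of the relevant container-spec fragment (the field names follow the standard Kubernetes API; the exact fragment for this exercise's yaml files is an assumption, and the secret name placeholder matches the `kubectl create secret` command above):
+
+```yaml
+# Hypothetical fragment inside the container spec of data-retrieval-deployment.yaml
+env:
+  - name: AWS_ACCESS_KEY_ID
+    valueFrom:
+      secretKeyRef:
+        name: <name-of-the-secrets>
+        key: AWS_ACCESS_KEY_ID
+  - name: AWS_SECRET_ACCESS_KEY
+    valueFrom:
+      secretKeyRef:
+        name: <name-of-the-secrets>
+        key: AWS_SECRET_ACCESS_KEY
+```
+
+Once deployed, `kubectl get secret <name-of-the-secrets>` and `kubectl get pods` can be used to confirm that the secret exists and the pods started correctly.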