{"id":4911,"date":"2022-09-16T09:00:00","date_gmt":"2022-09-16T07:00:00","guid":{"rendered":"https:\/\/blog.besharp.it\/?p=4911"},"modified":"2022-09-14T11:40:50","modified_gmt":"2022-09-14T09:40:50","slug":"paas-on-aws-how-to-build-it-the-perfect-way","status":"publish","type":"post","link":"https:\/\/blog.besharp.it\/paas-on-aws-how-to-build-it-the-perfect-way\/","title":{"rendered":"PaaS on AWS: how to build it the perfect way"},"content":{"rendered":"\n

Introduction<\/h1>\n\n\n\n

Thanks to the tools offered by cloud providers, many PaaS products have become popular in recent years: services that let you provision and manage complete computing platforms without having to take on all the complexity of maintaining the underlying infrastructure.<\/p>\n\n\n\n

The complexity of these applications is closely tied to the scalability and speed of the overall solution: we must be able to create and remove resources fluidly and efficiently, in an environment that guarantees a clear separation between the infrastructure of one end user and that of another.<\/p>\n\n\n\n

Another critical point to analyze in the preliminary stages of a PaaS project is the so-called shared responsibility model<\/a>, that is, the definition of which operational tasks fall to the "vendor" and which to the user: who is responsible for keeping the infrastructure secure in terms of access to resources and data encryption? Who must guarantee the high availability of the solution?<\/p>\n\n\n\n

When we decide to move our workload from on-premises to Amazon Web Services, we agree to delegate the management of some architectural aspects to the cloud provider: in this case, we speak of "security of the cloud" (AWS's responsibility) versus "security in the cloud" (the customer's responsibility).<\/p>\n\n\n\n

In this article, we will analyze the key points of a correct implementation of a PaaS product which, in our case, consists of dedicated virtual hosts that can be managed and updated independently through their own repositories.<\/p>\n\n\n\n

Two other articles will follow in which we will see in more detail the technical refinements and the considerations made for the individual architectural components, so stay tuned!<\/p>\n\n\n\n

Needs and requests<\/h2>\n\n\n\n

Our goal, as anticipated, is to create a virtual host vending machine on Amazon EC2, on which our customers can upload software customized to their needs. In addition, the deployment of both the infrastructure and the software has to be automated through CI\/CD pipelines to simplify the management of the resources provided to users.<\/p>\n\n\n\n

We have chosen to use the following technologies and frameworks:<\/p>\n\n\n\n

GitLab<\/h2>\n\n\n\n

The sources of our code and the AMI configuration files are stored on GitLab, a DevOps platform that allows you to perform version control using git. GitLab<\/a> offers different pricing plans, from the free tier, perfect for personal or test projects, to an enterprise version that adds dedicated support, vulnerability management, more CI\/CD minutes, etc.<\/p>\n\n\n\n

This is the only constraint of the project.<\/p>\n\n\n\n

Packer<\/h2>\n\n\n\n

Packer<\/a> is a product designed by HashiCorp that allows you to automate the creation of machine images using templates written in HashiCorp's own configuration language (HCL). It integrates very well with AWS: running the build command creates a new AMI configured according to the parameters indicated in the template, from which preconfigured EC2 instances can then be launched.<\/p>\n\n\n\n

An example of a Packer configuration file (ami-creation.pkr.hcl):<\/p>\n\n\n\n

variable \"region\" {\nType = string\ndefault = eu-west-1\n}\nsource \"amazon-ebs\" \"ami-creation\" {\nAmi_name = \"NAME OF THE FINAL AMI\"\nAmi_description = \"DESCRIPTION OF THE AMI\"\nVpc_id = \"VPC IN WHICH PACKER WILL CREATE THE EC2 THAT WILL GENERATE THE AMI\"\nSubnet_id = \"SUBNET IN WHICH PACKER WILL CREATE THE EC2 THAT WILL GENERATE THE AMI\"\nSecurity_group_id = \"SECURITY GROUP FOR EC2\"\nInstance_type = \"SIZE OF EC2\"\nRegion = \"REGION IN WHICH THE EC2 AND THE A.M.I. ARE CREATED\"\nSource_ami = \"STARTING AMI ID\"\nSsh_username = \"ubuntu\"\nCommunicator = \"ssh\"\nIam_instance_profile = \"EC2 MANAGEMENT PROFILE\"\nlaunch_block_device_mappings {\nDevice_name = \"\/ dev \/ sda1\"\nVolume_size = 16\nVolume_type = \"gp3\"\ndelete_on_termination = true\n}\nrun_tags = {\nName = \"NAME OF EC2 BUILD BY PACKER\"\n}\ntags = {\nTAGS FOR THE RESULTING AMI\n}\n}\nbuild {\nsources = [\"source.amazon-ebs.ami-creation\"] provisioner \"file\" {\ndestination = \".\/install.sh\"\nsource = \"install.sh\"\n}\nprovisioner \"shell\" {\ninline = [\n\"sudo chmod + x preInstall.sh && sudo .\/preInstall.sh\",\n\"sudo -E .\/install.sh\"\n]\n}\n}\n<\/code><\/pre>\n\n\n\n

We chose Packer for its versatility and speed of execution compared to its competitors; the only flaw we found concerns AMI encryption, which significantly lengthens each run (an extra 5 to 8 minutes per execution).<\/p>\n\n\n\n

RDS<\/h2>\n\n\n\n

RDS<\/a> (Relational Database Service) is a fully managed AWS service that allows you to create and manage database servers by selecting one of the various DB engines offered; in our case, we chose MySQL. RDS allows us to scale our data layer much more smoothly than an implementation based on EC2 instances.<\/p>\n\n\n\n

We chose it for its versatility and scalability as the volume of traffic that customers will develop is not always predictable.<\/p>\n\n\n\n
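As a reference, the following is a minimal CDK (TypeScript) sketch of how such a MySQL instance could be defined; the VPC, instance size, storage limits, and backup retention shown here are illustrative assumptions, not the actual settings used in the project.<\/p>\n\n\n\n

import { Stack, StackProps, RemovalPolicy, Duration } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as rds from 'aws-cdk-lib/aws-rds';

export class DataLayerStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Placeholder VPC so the example is self-contained;
    // in the real project this would be the shared VPC.
    const vpc = new ec2.Vpc(this, 'PaasVpc', { maxAzs: 2 });

    // MySQL instance with storage autoscaling to absorb unpredictable traffic
    new rds.DatabaseInstance(this, 'PaasDatabase', {
      engine: rds.DatabaseInstanceEngine.mysql({
        version: rds.MysqlEngineVersion.VER_8_0,
      }),
      vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.MEDIUM),
      multiAz: true,                // high availability across two AZs
      allocatedStorage: 50,
      maxAllocatedStorage: 200,     // storage autoscaling upper bound
      storageEncrypted: true,
      backupRetention: Duration.days(7),
      removalPolicy: RemovalPolicy.SNAPSHOT,
    });
  }
}
<\/code><\/pre>\n\n\n\n

Setting maxAllocatedStorage enables storage autoscaling, one of the features that lets the data layer absorb unpredictable traffic without manual intervention.<\/p>\n\n\n\n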

S3<\/h2>\n\n\n\n

S3<\/a> (Simple Storage Service) is a completely serverless service that allows us to save files in virtually unlimited storage space, at a low cost and accessible via API calls. It offers numerous features, including encryption at rest, storage classes (useful to reduce the cost of storing infrequently accessed data), access management, and cross-region replication to reduce the likelihood of data loss.<\/p>\n\n\n\n
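As a purely illustrative sketch, here is how a bucket using some of these features (encryption at rest, versioning, and a lifecycle transition to an infrequent-access storage class) could be declared in CDK (TypeScript); the bucket's purpose and the 30-day threshold are assumptions made for the example.<\/p>\n\n\n\n

import { Stack, StackProps, Duration } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

export class StorageStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    new s3.Bucket(this, 'ArtifactsBucket', {
      encryption: s3.BucketEncryption.KMS_MANAGED,       // encryption at rest
      versioned: true,
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL, // access management
      lifecycleRules: [
        {
          // Move rarely accessed objects to a cheaper storage class
          transitions: [
            {
              storageClass: s3.StorageClass.INFREQUENT_ACCESS,
              transitionAfter: Duration.days(30),
            },
          ],
        },
      ],
    });
  }
}
<\/code><\/pre>\n\n\n\n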

EC2<\/h2>\n\n\n\n

EC2<\/a> (Elastic Compute Cloud) is a service that allows you to create virtual machines by choosing everything from the "hardware" configuration (more than 500 instance types are offered), to the size and number of attached disks, to the pre-installed operating system. The service is versatile and powerful enough to run even HPC workloads.<\/p>\n\n\n\n

We chose this service for its flexibility in customizing the operating system and the requirements necessary for running the software.<\/p>\n\n\n\n
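For illustration, here is a minimal CDK (TypeScript) sketch of a virtual host launched from a custom AMI, such as one baked by Packer; the VPC, instance size, and AMI ID are placeholders, not the project's real values.<\/p>\n\n\n\n

import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

export class VirtualHostStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Placeholder VPC; in the real project the shared VPC would be imported instead
    const vpc = new ec2.Vpc(this, 'PaasVpc', { maxAzs: 2 });

    // EC2 instance launched from the custom AMI (placeholder AMI ID)
    new ec2.Instance(this, 'CustomerHost', {
      vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.LARGE),
      machineImage: ec2.MachineImage.genericLinux({
        'eu-west-1': 'ami-00000000000000000', // replace with the Packer-built AMI ID
      }),
    });
  }
}
<\/code><\/pre>\n\n\n\n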

CodeBuild, CodeDeploy & CodePipeline<\/h2>\n\n\n\n

We decided to rely on CodeBuild, CodeDeploy, and CodePipeline to build our CI\/CD pipeline. CodeBuild<\/a> allows us to run builds, unit tests, and integration tests in a managed and scalable way, orchestrated by our CodePipeline<\/a> pipeline. Among the numerous features of CodePipeline, we find the possibility of adding Manual Approval steps (i.e., a step that pauses the execution of the pipeline until a human resumes it), direct integration with other services such as CloudFormation, etc.<\/p>\n\n\n\n
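Below is a minimal CDK (TypeScript) sketch of a pipeline with a Manual Approval step between build and deploy; the S3 source (fed, in our scenario, by the GitLab webhook integration described later), the buildspec, and the CodeDeploy tag filter are illustrative assumptions rather than the project's actual configuration.<\/p>\n\n\n\n

import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as codebuild from 'aws-cdk-lib/aws-codebuild';
import * as codedeploy from 'aws-cdk-lib/aws-codedeploy';
import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';
import * as actions from 'aws-cdk-lib/aws-codepipeline-actions';

export class SoftwarePipelineStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Hypothetical bucket where the GitLab webhook handler drops the source archive
    const sourceBucket = new s3.Bucket(this, 'SourceBucket', { versioned: true });

    const sourceOutput = new codepipeline.Artifact('Source');
    const buildOutput = new codepipeline.Artifact('Build');

    const buildProject = new codebuild.PipelineProject(this, 'BuildProject', {
      buildSpec: codebuild.BuildSpec.fromObject({
        version: '0.2',
        phases: { build: { commands: ['./build.sh'] } }, // placeholder build command
        artifacts: { files: ['**/*'] },
      }),
    });

    // Target EC2 hosts selected by a hypothetical tag
    const deploymentGroup = new codedeploy.ServerDeploymentGroup(this, 'HostsGroup', {
      ec2InstanceTags: new codedeploy.InstanceTagSet({ Customer: ['example'] }),
    });

    new codepipeline.Pipeline(this, 'SoftwarePipeline', {
      stages: [
        {
          stageName: 'Source',
          actions: [new actions.S3SourceAction({
            actionName: 'GitLabArchive',
            bucket: sourceBucket,
            bucketKey: 'source.zip',
            output: sourceOutput,
          })],
        },
        {
          stageName: 'Build',
          actions: [new actions.CodeBuildAction({
            actionName: 'Build',
            project: buildProject,
            input: sourceOutput,
            outputs: [buildOutput],
          })],
        },
        {
          stageName: 'Approve',
          actions: [new actions.ManualApprovalAction({ actionName: 'ManualApproval' })],
        },
        {
          stageName: 'Deploy',
          actions: [new actions.CodeDeployServerDeployAction({
            actionName: 'DeployToEc2',
            input: buildOutput,
            deploymentGroup,
          })],
        },
      ],
    });
  }
}
<\/code><\/pre>\n\n\n\n

The Approve stage is what pauses the execution until a human resumes it from the console, as described above.<\/p>\n\n\n\n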

Technologies and project setup<\/h2>\n\n\n\n

To deploy our infrastructure resources, we decided to use the CloudFormation service with the help of AWS CDK.<\/p>\n\n\n\n

CDK (Cloud Development Kit) is an official framework developed by Amazon Web Services to create Infrastructure as Code (IaC). It is compatible with several programming languages (TypeScript, JavaScript, Python, Java, and C#) and allows you, through its constructs, to define the infrastructural resources of our project (S3 buckets, Application Load Balancers, VPCs, databases, etc.) programmatically. Once "compiled", a CDK project generates a CloudFormation template in JSON format that can be deployed through the service of the same name. The main advantage of CDK over other IaC frameworks is that, by writing the infrastructure in a programming language, we can take advantage of language-specific constructs and abstractions (iterations, conditions, objects, functions, etc.) to make our templates easier to read, reducing copy\/paste and therefore the lines of code to manage.<\/p>\n\n\n\n
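To make this advantage concrete, here is a small, purely illustrative CDK (TypeScript) example in which a plain loop creates one stack per environment, something that would require duplicated blocks in a raw CloudFormation template; the environment names and bucket naming convention are assumptions.<\/p>\n\n\n\n

import { App, Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

// A plain TypeScript loop replaces the copy/paste that a raw
// CloudFormation template would require for each environment.
const environments = ['dev', 'staging', 'prod'];

class EnvironmentStack extends Stack {
  constructor(scope: Construct, id: string, envName: string, props?: StackProps) {
    super(scope, id, props);

    new s3.Bucket(this, 'AssetsBucket', {
      bucketName: `paas-assets-${envName}-example`, // illustrative naming convention
      versioned: envName === 'prod',                // conditions expressed in code
    });
  }
}

const app = new App();
for (const envName of environments) {
  new EnvironmentStack(app, `PaasEnvironment-${envName}`, envName);
}
app.synth();
<\/code><\/pre>\n\n\n\n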

As for GitLab, we decided to create an Amazon API Gateway REST API to intercept the webhooks that GitLab sends on every push, and to use the AWS CodePipeline service to manage the software deployment on the dedicated EC2 instances.<\/p>\n\n\n\n
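A minimal sketch of this integration in CDK (TypeScript): a REST API whose POST method invokes a Lambda function that starts the pipeline. The inline handler, the pipeline name, and the absence of validation of the GitLab secret token are simplifications made for illustration.<\/p>\n\n\n\n

import { Stack, StackProps, Duration } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as iam from 'aws-cdk-lib/aws-iam';

export class WebhookStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Lambda that reacts to the GitLab push webhook and starts the pipeline.
    // The inline handler and the pipeline name are illustrative only.
    const webhookHandler = new lambda.Function(this, 'GitLabWebhookHandler', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      timeout: Duration.seconds(10),
      code: lambda.Code.fromInline(`
        const { CodePipelineClient, StartPipelineExecutionCommand } = require('@aws-sdk/client-codepipeline');
        exports.handler = async (event) => {
          const client = new CodePipelineClient({});
          await client.send(new StartPipelineExecutionCommand({ name: 'software-pipeline-example' }));
          return { statusCode: 200, body: 'Pipeline started' };
        };
      `),
    });

    webhookHandler.addToRolePolicy(new iam.PolicyStatement({
      actions: ['codepipeline:StartPipelineExecution'],
      resources: ['*'], // scope this down to the actual pipeline ARN
    }));

    // REST API endpoint that GitLab calls on every push
    const api = new apigateway.RestApi(this, 'GitLabWebhookApi');
    api.root
      .addResource('webhook')
      .addMethod('POST', new apigateway.LambdaIntegration(webhookHandler));
  }
}
<\/code><\/pre>\n\n\n\n

In a real setup, the handler should at least verify the X-Gitlab-Token header before starting an execution.<\/p>\n\n\n\n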

We decided to divide the project into three macro areas, each referring to a specific infrastructural layer. Each macro area corresponds to a git repository with a dedicated CDK stack. This separation allows us to split the areas of expertise and assign each of them to a dedicated team of developers; a sketch of how this split can look in CDK terms follows the list below.<\/p>\n\n\n\n

  1. Global infrastructure<\/li>
  2. Infrastructure pipeline<\/li>
  3. Software pipeline<\/li><\/ol>\n\n\n\n
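As an example of what this split can look like, here is a hedged CDK (TypeScript) sketch of the entry point of the Global Infrastructure repository, which exports the shared resources (network, encryption key, load balancer) that the other two repositories consume; all construct names and export keys are illustrative assumptions.<\/p>\n\n\n\n

import { App, Stack, StackProps, CfnOutput } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as kms from 'aws-cdk-lib/aws-kms';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';

// Entry point of the hypothetical "global-infrastructure" repository.
// The other two repositories (infrastructure pipeline and software pipeline)
// would import these shared resources through the exported names.
class GlobalInfrastructureStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const vpc = new ec2.Vpc(this, 'SharedVpc', { maxAzs: 3 });

    const key = new kms.Key(this, 'SharedKey', { enableKeyRotation: true });

    const alb = new elbv2.ApplicationLoadBalancer(this, 'SharedAlb', {
      vpc,
      internetFacing: true,
    });

    // Cross-stack exports consumed by the other repositories
    new CfnOutput(this, 'SharedVpcId', { value: vpc.vpcId, exportName: 'paas-shared-vpc-id' });
    new CfnOutput(this, 'SharedKeyArn', { value: key.keyArn, exportName: 'paas-shared-key-arn' });
    new CfnOutput(this, 'SharedAlbArn', { value: alb.loadBalancerArn, exportName: 'paas-shared-alb-arn' });
  }
}

const app = new App();
new GlobalInfrastructureStack(app, 'GlobalInfrastructure-dev'); // one stack per environment
app.synth();
<\/code><\/pre>\n\n\n\n

The other two repositories can then resolve these exports (for example with Fn.importValue) without sharing any code.<\/p>\n\n\n\n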

    1. Global infrastructure<\/h3>\n\n\n\n

    The Global Infrastructure<\/strong> repo contains all the stacks that are used to deploy shared infrastructure services divided by the environment in which they are released.<\/p>\n\n\n\n

These are the shared services, managed by the client, that make up the entire first layer, from roles and networking to the encryption keys and the shared load balancer:<\/p>\n\n\n\n