Things to consider about AWS EC2 instance types for horizontal scaling

I’m going to scale an application for 100,000 users. The application is written in Node.js. I have created Docker images for it and am also using an AWS ALB, etc. The application is small; my main concern is the number of users who will hit it. A container takes at most 600 MB of memory. So I used 8 t2.small instances (2 GB RAM each) and hosted 3 containers on each instance (i.e., 8 x 3 = 24 containers). With this architecture, I can serve up to 5,000 users. I can scale this horizontally up to 100,000 users, but my question is: what if I choose m4.large instead of the t2.small that I chose?

Instead of using 8 t2.small machines (8 x 2 GB = 16 GB), we could use 2 m4.large machines (2 x 8 GB = 16 GB) and still host 24 containers on them.

I chose t2.small instances because of the vCPU count. A t2.small has 1 vCPU and an m4.large has 2 vCPUs. So with 2 m4.large machines, there will be 4 vCPUs for these 24 containers, but with 8 t2.small instances we get 8 vCPUs for the same 24 containers.

But are there any other factors that I need to consider? Any advice would be highly appreciated.

Answer

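M-type instances are general purpose (the memory-optimized family is R), but an m4.large comes with more memory per vCPU than a t2.small; if that memory is not required, M instances are probably not the right choice.
It strongly depends on your application and its resource requirements under load.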

Other factors are cost and dynamic scaling, which depend on the situation. If you need high availability, at least two instances are needed. In your scenario of 2 x m4, that implies the resources for 100% load are running at all times. Typically, applications have peak times and times when only a fraction of the resources is needed. Going for 8 x t2 would let you scale down to 25% of the peak resources while keeping high availability. All these considerations affect the cost.
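
The scale-down argument can be checked with a little arithmetic. The following is only a rough sketch: the `Fleet` class and its field names are made up for illustration, the memory figures and container counts are the ones from the question, and 12 containers per m4.large is assumed from "24 containers on 2 machines".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fleet:
    """Hypothetical helper describing one way to provision full load."""
    name: str
    instances: int        # instances needed at 100% load
    mem_gb_each: int      # memory per instance, as quoted in the question
    containers_each: int  # containers hosted per instance

    @property
    def total_mem_gb(self) -> int:
        return self.instances * self.mem_gb_each

    @property
    def total_containers(self) -> int:
        return self.instances * self.containers_each

    def min_fraction(self, ha_instances: int = 2) -> float:
        """Smallest share of full capacity that still keeps ha_instances running."""
        return min(1.0, ha_instances / self.instances)


small = Fleet("8 x t2.small", instances=8, mem_gb_each=2, containers_each=3)
large = Fleet("2 x m4.large", instances=2, mem_gb_each=8, containers_each=12)

for fleet in (small, large):
    print(f"{fleet.name}: {fleet.total_mem_gb} GB, "
          f"{fleet.total_containers} containers, "
          f"scale-down floor {fleet.min_fraction():.0%}")
```

Both fleets offer the same memory and container count, but the floor is 25% for the t2 fleet and 100% for the m4 fleet, because two instances must stay up in either case.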

I suggest that you:

  • determine the baseline number of users (minimum provisioning).
  • divide that by the required high-availability factor (e.g. two).
  • sample typical user requests into a load test (e.g. with JMeter) and design the load density to match the calculated values.
  • fire up different instance types that could suit your needs (do not use a load balancer for these tests).
  • monitor them while the load test runs (e.g. CPU, memory) to determine which type is best suited to your application.
  • design the autoscaling accordingly (use the experience from the load tests as a starting point for scaling triggers).
  • overprovision depending on the demand behaviour of your users.
  • if needed, pre-scale before rush times.
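
As a starting point for the sizing steps above, here is a minimal sketch that turns one load-test measurement into an instance count. It assumes capacity scales linearly with the number of containers, which the load tests should confirm. The function name, the 20% headroom and the 10,000-user baseline are illustrative assumptions; the figure of 24 containers for about 5,000 users comes from the question.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def instances_needed(target_users: int,
                     measured_users: int,
                     measured_containers: int,
                     containers_per_instance: int,
                     headroom_pct: int = 20,
                     min_instances: int = 2) -> int:
    """Instances for target_users, with overprovisioning and an HA floor.

    measured_users / measured_containers is the capacity observed in a
    load test; headroom_pct is the overprovisioning margin in percent.
    """
    containers = ceil_div(
        target_users * (100 + headroom_pct) * measured_containers,
        100 * measured_users,
    )
    instances = ceil_div(containers, containers_per_instance)
    return max(instances, min_instances)


# From the question: 24 containers (3 per t2.small) handled about 5,000 users.
peak = instances_needed(100_000, 5_000, 24, containers_per_instance=3)
# Hypothetical off-peak baseline of 10,000 users.
baseline = instances_needed(10_000, 5_000, 24, containers_per_instance=3)

print(f"peak: {peak} instances, baseline: {baseline} instances")
```

With these assumptions, the peak needs 192 t2.small instances and the baseline 20. The gap between the two is what autoscaling, or pre-scaling before rush times, has to cover.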

Attribution
Source: Link, Question Author: Neron Joseph, Answer Author: hargut
