How to define the architecture for scalable applications

João Pedro São Gregório Silva | June 20, 2022


One of the biggest challenges when developing an application is delivering acceptable performance regardless of the number of users. Scalability is not easy, and it often requires you to make decisions early in the architecture to ensure you will be able to meet expectations. In this article, we are going to investigate three strategies from the 12-Factor App that can help you define an architecture that is truly scalable.

State within process

Imagine you run a bakery. More precisely, a small bakery, where you rarely have more than one customer at a time. It doesn't make sense for you to write the orders down on paper or in a book, since you can remember them with no problem.

After a couple of months, your bakery is a success, and with more customers to serve, you decide to hire three more attendants. The symptoms of the problem quickly become obvious:

  • You have to repeat orders to the other attendants whenever you need to take a quick break.
  • They don't know the customers as well as you do and get confused when a customer says "I'll have the usual."
  • Orders get lost.

Keeping things only in your memory is not scalable. Take sessions for example; one common approach is to store the information inside the process’s memory. While this looks like an innocent decision, it completely breaks the ability to scale the service out. This is where databases come into play: they provide a way for you to persist data outside of your application and let multiple instances access it safely without having to keep a copy of all the data in memory.

To make an application truly scalable, you need to treat each instance as a first-class citizen in the system, able to run independently of the others. This means you must avoid storing information inside a single process; instead, data should be accessible to several processes simultaneously through a shared backing service. This is known as stateless design.
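As a minimal sketch of the difference, consider the two session stores below. The class names and the dict-based "backing service" are illustrative assumptions, not a real library; in production the shared backend would be something like Redis or a database.

```python
import json


class InProcessSessionStore:
    """Sessions live in this process's memory: a second instance
    behind a load balancer cannot see them."""

    def __init__(self):
        self._sessions = {}

    def save(self, session_id, data):
        self._sessions[session_id] = data

    def load(self, session_id):
        return self._sessions.get(session_id)


class SharedSessionStore:
    """Sessions live in a backing service shared by every instance.
    Here the backing service is simulated with a plain dict; in a
    real system it would be Redis, a database, etc. (assumption)."""

    def __init__(self, backend):
        self._backend = backend  # shared key-value service

    def save(self, session_id, data):
        self._backend[f"session:{session_id}"] = json.dumps(data)

    def load(self, session_id):
        raw = self._backend.get(f"session:{session_id}")
        return json.loads(raw) if raw is not None else None


# Two "instances" of the app sharing one backing service:
backing_service = {}
instance_a = SharedSessionStore(backing_service)
instance_b = SharedSessionStore(backing_service)

instance_a.save("u42", {"cart": ["bread"]})
print(instance_b.load("u42"))  # → {'cart': ['bread']}
```

Because the session lives outside the process, any instance can pick up any request, which is exactly what stateless design buys you.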


Concurrency

Now things are going well in your bakery. The orders are centralized, and you don't have to keep everything in your head all the time. But, sure enough, things don't always go the way we want, and this time you notice that certain limitations are causing a bottleneck. Things like:

  • You only have one register, so you can't really hire more people to work as cashiers.
  • In the kitchen, you only have equipment for one person.

A naïve response would be: "Well, let's hire a faster cashier and a faster baker." But that's not going to work well. The amount of work one person can do is almost fixed, and even the fastest worker produces only a fraction more than the average.

In our analogy, this means that if you expect a service to scale up infinitely, you may be disappointed. CPUs and GPUs have limits, motherboards have a limited number of memory slots, and so on. Another problem with vertical scalability is the lack of elasticity: there is no feasible way to swap your machine on the fly without causing interruptions.

Another option is horizontal scaling. Instead of focusing on faster machines, you scale out by adding more instances. In our analogy, this would mean having more than one register, more than one kitchen, and so on. This strategy works best with stateless applications.
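To make the idea concrete, here is a minimal sketch of scaling out behind a round-robin load balancer. The `Instance` and `RoundRobinBalancer` classes are illustrative assumptions, not a real proxy; they only show why statelessness lets any instance serve any request.

```python
import itertools


class Instance:
    """One stateless copy of the application."""

    def __init__(self, name):
        self.name = name
        self.handled = 0

    def handle(self, request):
        self.handled += 1
        return f"{self.name} served {request}"


class RoundRobinBalancer:
    """Spreads requests evenly across instances. Because the
    instances are stateless, any of them can serve any request."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def handle(self, request):
        return next(self._cycle).handle(request)


# Scaling out: add instances instead of buying a bigger machine.
instances = [Instance(f"instance-{i}") for i in range(3)]
balancer = RoundRobinBalancer(instances)

for i in range(6):
    balancer.handle(f"request-{i}")

print([inst.handled for inst in instances])  # → [2, 2, 2]
```

Adding a fourth instance is just one more element in the list, which is the elasticity that vertical scaling lacks.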

Of course, horizontal scalability is not a silver bullet, nor does it exclude vertical scaling. Common problems you may face include more difficult load balancing and the need for more sophisticated monitoring and management.

If cost and performance optimization is your thing, try to find the balance between horizontal and vertical scaling. It might not be trivial, but it will save you time and money in the long run.


Disposability

Mapping the concept of disposability to our analogy results in something like a collaborator taking a vacation, or having to go home due to sickness. Now imagine that every time an event like this occurs, the employee must fill out dozens of pages describing what they were doing so that the business is not affected. This would be a terrible experience and certainly a bottleneck.

In the cloud native world, there are many reasons for a process to be disposed of. Things like "we are scaling down since the current traffic is low" or "this process cannot recover from an error" are common. The opposite can also occur: maybe you want to quickly scale up to meet demand. In many scenarios, these transitions need to happen as fast as possible to avoid downtime or a bad user experience.

A disposable process is one that can be started and stopped quickly without affecting the application. This means it must not hold persistent state, and also that starting a new instance of it is easy. But that's not all: one thing that's often overlooked is the ability to shut down gracefully. When a termination signal is received, the process should stop accepting new work, finish the work in progress, and only then exit.

For example, a web server, where requests are usually short-lived, may start rejecting new connections but continue to serve existing ones until they close. Another scenario where this matters is a process that pulls data from an external API: if the process is told to shut down, you don't want to lose data, so it should finish processing the current batch and then exit.
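A minimal sketch of this shutdown pattern, assuming a single worker draining a queue (the job names and the simulated SIGTERM are illustrative, not a real server):

```python
import queue
import signal
import threading

shutdown = threading.Event()


def handle_sigterm(signum, frame):
    # Stop accepting new work; in-flight work continues below.
    shutdown.set()


signal.signal(signal.SIGTERM, handle_sigterm)


def worker(jobs, done):
    """Drains the work already accepted, then exits cleanly."""
    while True:
        try:
            job = jobs.get(timeout=0.1)
        except queue.Empty:
            if shutdown.is_set():
                return  # queue drained and we were asked to stop
            continue
        done.append(job)  # "process" the job


jobs, done = queue.Queue(), []
t = threading.Thread(target=worker, args=(jobs, done))
t.start()

for i in range(5):
    jobs.put(i)

shutdown.set()  # simulate receiving SIGTERM
t.join()
print(done)  # → [0, 1, 2, 3, 4]  (all accepted work finished first)
```

The key design choice is that the signal handler only flips a flag; the worker itself decides when it is safe to stop, which keeps accepted work from being lost.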

Key takeaways

In conclusion, these are the three key takeaways from this article:

  • If possible, go for stateless design; it's a requirement for a true 12-factor app.
  • Think about concurrency from the beginning, designing smaller services that can be quickly scaled out.
  • Develop disposable processes to both avoid downtime and respond quickly to changes in traffic.


We have seen how the three concepts of statelessness, concurrency, and disposability can help us create an application that is scalable and resilient. These concepts are not limited to cloud native environments, but that is where they are most relevant. It's important to note that a scalability strategy is not a silver bullet: it is no guarantee of infinite performance. But it certainly is a good start and one of the best things you can do for your application.


This article was written by João Pedro São Gregório Silva, Innovation Expert at Encora. Thanks to Isac Sacchi Souza and João Augusto Caleffi for reviews and insights.

About Encora

Fast-growing tech companies partner with Encora to outsource product development and drive growth. Contact us to learn more about our software engineering capabilities.
