<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>The Digital Cat - software architecture</title><link href="https://www.thedigitalcatonline.com/" rel="alternate"></link><link href="https://www.thedigitalcatonline.com/categories/software-architecture/atom.xml" rel="self"></link><id>https://www.thedigitalcatonline.com/</id><updated>2022-08-23T12:00:00+00:00</updated><subtitle>Adventures of a curious cat in the land of programming</subtitle><entry><title>Data Partitioning and Consistent Hashing</title><link href="https://www.thedigitalcatonline.com/blog/2022/08/23/data-partitioning-and-consistent-hashing/" rel="alternate"></link><published>2022-08-23T12:00:00+00:00</published><updated>2022-08-23T12:00:00+00:00</updated><author><name>Leonardo Giordani</name></author><id>tag:www.thedigitalcatonline.com,2022-08-23:/blog/2022/08/23/data-partitioning-and-consistent-hashing/</id><summary type="html"></summary><content type="html">&lt;p&gt;This post is an introduction to partitioning, a technique for distributed storage systems, and to consistent hashing, a specific partitioning algorithm that promotes even distribution of data, while allowing to dynamically change the number of servers and minimising the cost of the transition.&lt;/p&gt;&lt;p&gt;My interest in partitioning dates back to 2015 when I was following courses at the MongoDB university and learned about &lt;em&gt;sharding&lt;/em&gt;, the name MongoDB uses for partitioning. I was fascinated by the topic and discovered the technique known as &lt;em&gt;consistent hashing&lt;/em&gt;; I enjoyed it a lot, so much that I wrote a little demo in Python to understand it better. Later, I focused on other things and forgot the project completely, until recently, when &lt;a href="https://github.com/drocpdp"&gt;David Eynon&lt;/a&gt; sent me a PR on GitHub to replace a deprecated testing library. So, I decided to brush up on my knowledge of consistent hashing and, as I often do on this blog, dump my thoughts in a post.&lt;/p&gt;&lt;p&gt;The topic of distributed storage and data processing is arguably rich and complicated, so while I will try to give a broader context to the concepts that I will introduce, I by no means intend to write a comprehensive guide to the subject matter. The audience of this post is developers who do not know what partitioning and consistent hashing are and want to take their first step into those topics.&lt;/p&gt;&lt;div class="infobox"&gt;&lt;div class="title"&gt;Code syntax&lt;/div&gt;&lt;div&gt;&lt;p&gt;You will find some code examples mentioned in the post, which are written using the Python notation. If you are not familiar with the language, these are the main rules&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;code&gt;x**y&lt;/code&gt; means x&lt;sup&gt;y&lt;/sup&gt;, e.g. &lt;code&gt;2**3 =&amp;gt; 8&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;&lt;code&gt;x//y&lt;/code&gt; means the integer division between &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt;, e.g. &lt;code&gt;11//4 =&amp;gt; 2&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;&lt;code&gt;x%y&lt;/code&gt; means the modulo operation (remainder of integer division), e.g. &lt;code&gt;11%4 =&amp;gt; 3&lt;/code&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2 id="rationale-2d0e"&gt;Rationale&lt;a class="headerlink" href="#rationale-2d0e" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;When we design a system, we might want to scatter data among multiple sources to allow real concurrency of access and a more targeted optimisation.&lt;/p&gt;&lt;p&gt;For example, we might observe that in a given social media application there are two types of queries: some are very infrequent and involve tables related to personal data and the user profile, others are extremely frequent and pretty intensive, and are related to the content shared by the user. In this case, we might decide to store the tables related to the profile and the tables that are related to content in two different systems, A and B (here, the word &lt;em&gt;system&lt;/em&gt; might be freely replaced by &lt;em&gt;computer&lt;/em&gt;, &lt;em&gt;database&lt;/em&gt;, &lt;em&gt;storage system&lt;/em&gt;, or other similar components).&lt;/p&gt;&lt;p&gt;This means that the infrequent queries that fetch personal data will be served by system A, while the more frequent and intensive queries related to content will be served by system B.&lt;/p&gt;&lt;p&gt;Suddenly, we have the chance to deploy system B using more powerful and expensive hardware, or an architecture with better performances, without increasing the cost for tables that won&amp;#x27;t benefit from such an improvement as the ones stored in system A.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/partitioning_rationale.jpg"&gt;&lt;/div&gt;&lt;p&gt;This is a standard approach in system design, and it requires the introduction of an additional layer of control that will route requests to the right source. This layer might be implemented in several places, for example:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;in the code of our application, with conditional structures that query different data sources&lt;/li&gt;&lt;li&gt;in the framework that we are using for the application, for example in a middleware that automatically routes requests according to nature or the query&lt;/li&gt;&lt;li&gt;in a wrapper around the storage that hides the fact that data exists in two different systems&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;In the last case, this technique is usually called &lt;em&gt;partitioning&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;In this post, I will try to show the challenges we face when we partition data and focus on some of the algorithms that can be used to implement it, in particular on consistent hashing. Please note that, while some of these techniques are used by databases to provide internal partitioning, they have a wider range of applications and might come in handy in different contexts.&lt;/p&gt;&lt;h2 id="design-choices-d096"&gt;Design choices&lt;a class="headerlink" href="#design-choices-d096" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Every design choice in a system depends on the requirements, and when it comes to data storage the most important factors are the &lt;em&gt;nature of the data&lt;/em&gt;, its &lt;em&gt;distribution&lt;/em&gt;, and the &lt;em&gt;access patterns&lt;/em&gt;. Consider for example databases and Content Delivery Networks (CDNs): both are meant to store data, and the storage size of both can vary substantially. However, there are important differences between the two that greatly affect the design choices. Let&amp;#x27;s see some simple examples:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;databases are meant to store data in a long-term fashion, while caches are by definition short-lived. This means that an important requirement for databases is data preservation, and we should do everything in our power to avoid losing parts of the database. A cache, conversely, holds data for a short time, either predetermined by the system or forced by a change in the data source. As you can see in this case we not only take data loss as part of the equation, but we get to the point where we trigger it on purpose.&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;applications often make use of range queries, which means that they retrieve sets of results spanning a range of values of one of the keys; for example, you might want to see all employers within a certain range of salaries, or all users that have more than a certain amount of followers. In such cases, it makes little sense to scatter data among different physical sources, thus making the retrieval more complicated and ultimately affecting performances. Databases see very often an access pattern of this type, while caches, being usually implemented as key/value stores, do not need to take this into account.&lt;/li&gt;&lt;/ul&gt;&lt;h2 id="a-practical-example-of-partitioning-1b06"&gt;A practical example of partitioning&lt;a class="headerlink" href="#a-practical-example-of-partitioning-1b06" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Let us consider a simple key/value store, for example a common address book where the key is the name of the contact and the value a rich document with their personal details. If multiple users access the store, chances are that the system will at a certain point struggle to serve all the requests, so we might want to partition the data to allow concurrent access. We can for example sort them alphabetically and split them in two, storing all values with a key that begins with the letters A-M in one server and the rest (keys N-Z) in the second one.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/simple_partitioning_1.jpg"&gt;&lt;/div&gt;&lt;p&gt;This might seem a good idea, but we will soon discover that performances are not great. Unfortunately, our address book doesn&amp;#x27;t contain the same number of people for each letter, as (for example) we know more people whose name starts with A or C than with X or Z.&lt;/p&gt;&lt;p&gt;That poses a problem, as our partitioning doesn&amp;#x27;t achieve the desired outcome, that of splitting requests evenly between the two servers. If we increase the number of partitions, serving smaller groups of letters, we will just worsen the problem, to the point where a partition might be completely empty and thus receive no traffic: since the problem comes from the data distribution, we need to find a way to change that property.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/simple_partitioning_2.jpg"&gt;&lt;/div&gt;&lt;p&gt;One way to deal with the problem is to change the boundaries of the partitions so that we get an almost even distribution of values among them. For example, we might store keys starting with A-B in the first partition, keys starting with C-D in the second, and all the rest in the third one.&lt;/p&gt;&lt;p&gt;The problem with such a strategy is that it is highly dependent on the actual data that we are storing. Not only does this mean the solution has to be customised for each use case (the partitions in the example might be good for one address book and completely wrong for another), but adding data to the storage might change the distribution and invalidate the solution.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/simple_partitioning_3.jpg"&gt;&lt;/div&gt;&lt;h2 id="hash-functions-to-the-rescue-2cc2"&gt;Hash functions to the rescue&lt;a class="headerlink" href="#hash-functions-to-the-rescue-2cc2" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;An interesting solution to the problem of distributing data evenly is represented by hash functions. As I explained in my post &lt;a href="https://www.thedigitalcatonline.com/blog/2018/04/06/introduction-to-hashing/"&gt;Introduction to hashing&lt;/a&gt;, good hash functions produce a highly uniform distribution, which makes them ideal in this case. Please note that hash functions can help with &lt;em&gt;routing queries&lt;/em&gt; and not with &lt;em&gt;storing data&lt;/em&gt;. Hashed values cannot replace the content, as they are not bijective, i.e. given two different inputs the output might be the same (collision), so they can only be used to decide &lt;em&gt;where&lt;/em&gt; to store a piece of information.&lt;/p&gt;&lt;p&gt;We can at this point devise a storage strategy based on hash functions. We can divide the output space of the hash function (codomain) into a certain amount of partitions and be sure that each one of them will contain a similar amount of elements. For example, the hash function might output a 32-bit number, so we know that each hashed value will be between 0 and 2&lt;sup&gt;32&lt;/sup&gt; (4,294,967,295), and from here it&amp;#x27;s pretty straightforward to find partition boundaries. For example, we can create 16 partitions numbered 0 to 15, each one containing 2&lt;sup&gt;28&lt;/sup&gt; hash values (268,435,456).&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/hash_functions.jpg"&gt;&lt;/div&gt;&lt;p&gt;Routing is at this point very simple, as we can mathematically find the partition number given the hash. There are many ways to do this but two simple approaches are&lt;/p&gt;&lt;ul&gt;&lt;li&gt;using the integer division &lt;code&gt;hash(k) // partition_size&lt;/code&gt;, e.g. &lt;code&gt;hash(k) // 2**28&lt;/code&gt;. All keys from &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;268435455&lt;/code&gt; end up in partition 0 (&lt;code&gt;268435455 // 2**28&lt;/code&gt;), keys from &lt;code&gt;268435456&lt;/code&gt; to &lt;code&gt;536870911&lt;/code&gt; end up in partition 1, and so on.&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;using the modulo operator &lt;code&gt;hash(k) % number_of_partitions&lt;/code&gt;, e.g. &lt;code&gt;hash(k) % 16&lt;/code&gt;. This assigns values to partitions in a round robin fashion, where key &lt;code&gt;0&lt;/code&gt; goes to partition 0 (&lt;code&gt;0%16&lt;/code&gt;), key &lt;code&gt;1&lt;/code&gt; to partition 1 (&lt;code&gt;1%16&lt;/code&gt;), key &lt;code&gt;15&lt;/code&gt; to partition 15 (&lt;code&gt;15%16&lt;/code&gt;), and then starts again with key &lt;code&gt;16&lt;/code&gt; which goes to partition 0 (&lt;code&gt;16%16&lt;/code&gt;), and so on.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This architecture has the clear advantage that thanks to the properties of hash functions, data is scattered evenly among the partitions. This means that when we query the system, requests will also be divided evenly, thus giving us a good distribution of the load.&lt;/p&gt;&lt;p&gt;As we will see later, however, this is not a good approach for dynamic systems.&lt;/p&gt;&lt;h2 id="partitioning-use-cases-d7c3"&gt;Partitioning use cases&lt;a class="headerlink" href="#partitioning-use-cases-d7c3" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Hash functions are definitely interesting but they are not the perfect solution in every case. Let&amp;#x27;s have a brief look at three different types of systems that might benefit from partitioning and discuss their specific requirements.&lt;/p&gt;&lt;h3 id="load-balancers-b5f9"&gt;Load balancers&lt;/h3&gt;&lt;p&gt;Pure load balancers solve a simple problem: to spread requests evenly across multiple &lt;em&gt;identical&lt;/em&gt; servers. The key word here is &amp;quot;identical&amp;quot;, as you cannot pick the wrong server, thus no routing can result in an error. However, spreading the load unevenly can result in performance loss, and possibly also service failure. For example, if a server gets overloaded queries might hit a timeout while waiting to be served.&lt;/p&gt;&lt;p&gt;For this reason, when load balancing is not content-aware, for example in a simple HTTP server scenario, round-robin partitioning is a good choice. The system just assigns new requests to servers on a rotation basis, which ensures perfectly even distribution. For example, this algorithm is the default choice for AWS Application Load Balancers.&lt;/p&gt;&lt;p&gt;Clearly, load balancers can be more complicated and feature-rich even without becoming content aware. The aforementioned AWS ALBs, for example, support also the &amp;quot;least outstanding requests&amp;quot; algorithm, which in simple words means choosing the server with the smallest workload.&lt;/p&gt;&lt;h3 id="caches-27ec"&gt;Caches&lt;/h3&gt;&lt;p&gt;Caches are systems that temporarily store data whose retrieval is expensive, either for the user or for the provider. For example, if a system runs a long query on a database caching the result will be beneficial both for the system and the database. For the former, because a repeated run will get the result much faster and for the latter because the load of the new query is zero.&lt;/p&gt;&lt;p&gt;Caches can be found everywhere and vary dramatically in size, but they are one of the best examples of systems that benefit from partitioning. As I mentioned before, their standard usage patterns don&amp;#x27;t include range queries and data loss (flushing) is part of their normal workflow.&lt;/p&gt;&lt;p&gt;A Content Delivery Network (CDN) is a specific type of cache that is distributed geographically. The purpose of the CDN nodes is to store content in a location that is physically near the users, thus increasing the performance of the system. This means that two geographically distinct nodes of a CDN contain the same values (replication), and the routing policy is solely based on the physical position of the user with respect to the node. Internally, each CDN node can be implemented using partitioning, though, which might speed up the performances of that specific node.&lt;/p&gt;&lt;h3 id="databases-14fd"&gt;Databases&lt;/h3&gt;&lt;p&gt;As for databases, I already mentioned that the most important problem is range queries or if you prefer, content-aware partitioning. In general, you can&amp;#x27;t partition a database without taking into account the content, or you will incur severe performance losses. So, when it comes to databases, partitioning has to be the result of a specific design and can&amp;#x27;t be applied regardless of the database schema. &lt;/p&gt;&lt;p&gt;To better understand the challenge, let&amp;#x27;s consider a simple database whose elements are employees with a name and a salary. Now, if we want to partition this database we have to choose a key for the partitioning itself. It might be the primary key, the name, or the salary, as these are the only values available in each record.&lt;/p&gt;&lt;p&gt;Say we use hash functions to partition the database and use the employee salary as a key. Because of the properties of hash functions, employees with the exact same salary will end up being stored in the same partition, but employees with similar salaries might end up in different ones. This depends on the number of partitions, clearly, but the main point is that records that are &amp;quot;near&amp;quot; (according to the selected key) now are potentially very far.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/hash_functions_and_range_queries.jpg"&gt;&lt;/div&gt;&lt;p&gt;In the example above I used MD5 as the routing hash function, and you can reproduce the calculations using the following Python code&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf-8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 57500283691658467528082923406379043196&lt;/span&gt;
&lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 209589555716047624083879134729984902154&lt;/span&gt;
&lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 12&lt;/span&gt;
&lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;

&lt;span class="c1"&gt;# 10&lt;/span&gt;
&lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Things do not go much better if we use the integer division. If we have 16 partitions, each one of them contains 2&lt;sup&gt;124&lt;/sup&gt; values&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="c1"&gt;# 2&lt;/span&gt;
&lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;124&lt;/span&gt;

&lt;span class="c1"&gt;# 9&lt;/span&gt;
&lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;124&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Now, let&amp;#x27;s consider a query that selects all employees within a certain range of salaries. If the database is not partitioned, all records are kept on the same server, and if we optimised the system for such a query, the records will also be physically adjacent (e.g. stored in nearby memory addresses). This makes the query blazing fast, but if the database is partitioned the query has to collect values from multiple partitions which greatly penalises performance.&lt;/p&gt;&lt;p&gt;We can see a real example of this design challenge in the documentation of MongoDB, a non-relational database that supports partitioning (called &lt;em&gt;sharding&lt;/em&gt;). MongoDB supports &lt;a href="https://www.mongodb.com/docs/manual/core/hashed-sharding/"&gt;hashed sharding&lt;/a&gt; and &lt;a href="https://www.mongodb.com/docs/manual/core/ranged-sharding/"&gt;ranged sharding&lt;/a&gt;. In their words&lt;/p&gt;&lt;p&gt;&lt;em&gt;Hashed sharding uses either a single field hashed index or a compound hashed index as the shard key to partition data across your sharded cluster.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Range-based sharding involves dividing data into contiguous ranges determined by the shard key values. In this model, documents with &amp;quot;close&amp;quot; shard key values are likely to be in the same chunk or shard. This allows for efficient queries where reads target documents within a contiguous range. However, both read and write performance may decrease with poor shard key selection.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;I highly recommend reading the two pages I linked above as they will give you a good idea of how a real system uses the concepts I introduced and what design challenges are involved when using partitioning.&lt;/p&gt;&lt;h2 id="caching-and-scaling-strategies-90f4"&gt;Caching and scaling strategies&lt;a class="headerlink" href="#caching-and-scaling-strategies-90f4" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;When we design distributed caches, an interesting problem we might face is that of scaling the system in and out to match the current load without wasting resources.&lt;/p&gt;&lt;p&gt;When the cache is under a light load we might want to run a small number of servers, but as soon as the number of requests increases we need to proportionally increase the number of cache nodes if we want to avoid a performance drop. This is usually not a big problem for partitioned databases, since in that case we change the number of partitions only occasionally to adjust performances or to increase the storage size, but caches like CDNs might need continuous adjustments during a single day.&lt;/p&gt;&lt;p&gt;Increasing or decreasing the number of nodes in a distributed cache might however be a pretty destructive action. Depending on the routing algorithm, if we add nodes (scale out) we might need to move data from existing ones to the newly added ones, and if we remove nodes (scale in) we will certainly lose the data contained in them. Both scenarios result in a (potentially massive) cache invalidation which can&amp;#x27;t be taken lightly.&lt;/p&gt;&lt;p&gt;The hash-based routing method presented in the previous section has terrible performances when it comes to scaling because any change in the number of servers impacts the key boundaries of the existing ones. Let&amp;#x27;s see a practical example of that and calculate the actual figures.&lt;/p&gt;&lt;h3 id="scaling-out-with-hash-partitioning-d6de"&gt;Scaling out with hash partitioning&lt;/h3&gt;&lt;p&gt;Every time you consider a process or an algorithm you should have a look at how it behaves in the worst possible condition, to have a glimpse of what you might run into when you use it. For this reason, the following example considers a scale-out scenario in which all cache nodes are full. The best case is obviously when all nodes are empty, but in that case we don&amp;#x27;t need to scale out at all.&lt;/p&gt;&lt;p&gt;Let&amp;#x27;s consider a 32-bit hash function and 16 partitions numbered 0 to 15. Since the hash function space is 2&lt;sup&gt;32&lt;/sup&gt; (4,294,967,296), each partition will contain 2&lt;sup&gt;28&lt;/sup&gt; hash values (268,435,456). Each node is full, which means that all the possible 2&lt;sup&gt;28&lt;/sup&gt; slots are assigned to a cached item, that is some data stored in the server that corresponds to that partition. The system is using the integer division routing system.&lt;/p&gt;&lt;p&gt;If we scale out to 17 partitions, increasing the pool by just by 1 node, each node will now contain a smaller part of the global data space, as now we split it among more nodes. In particular, each node used to contain 1/16 of the global data (268,435,456), and will now contain 1/17 of it (approx. 252,645,135). Our biggest problem is now managing the transition between the initial setup and the new one.&lt;/p&gt;&lt;p&gt;The first node hosted 1/16 of the data space, the keys from &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;268435455&lt;/code&gt;. It will now contain 1/17 of the data space, the keys from &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;252645134&lt;/code&gt;. To simplify the example it is useful to convert everything into a common unit of measure: the node used to contain 17/272 of the space (1/16) and contains now 16/272 (1/17) of it.&lt;/p&gt;&lt;p&gt;This means that 1/272 of the whole data space has to be moved to the second node, corresponding to the keys from &lt;code&gt;252645135&lt;/code&gt; to &lt;code&gt;268435455&lt;/code&gt;. It is important to note that these keys cannot be moved to the newly added node, but have to be moved to the second node because the algorithm we use maps keys to nodes in order.&lt;/p&gt;&lt;p&gt;This means that the second node will receive 1/272 of the whole data space. Since it originally already contained 17/272 of the whole space it should now theoretically contain 18/272 of it. However, as it happened for the first node, we want to balance the content and reduce it to 16/272, so now we have 2/272 of the whole space that we want to move to the third node.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/ripple_effect.jpg"&gt;&lt;/div&gt;&lt;p&gt;So, we move 1/272 from node 1 to node 2, 2/272 from node 2 to node 3, 3/272 from node 3 to node 4, and going on with the example we end up moving 16/272 (1/17) from the 16th node to the 17th, which fills it with the correct amount of keys. However, in doing so we moved 136/272 (1/272 + 2/272 + 3/272 + ... + 16/272) of the data space between nodes, which is exactly 50% of it.&lt;/p&gt;&lt;p&gt;So, for any initial size and a scale out of 1 single node, we have to move 50 of the data stored in our cache, and it might only get worse by increasing the number of final nodes until we end up having to move almost 100 of it (in an extreme case). A similar effect plagues the scale-in action, where one or more nodes are removed from the pool, and the keys they contain have to be migrated to the remaining nodes, creating a ripple effect to redistribute the keys according to the algorithm.&lt;/p&gt;&lt;p&gt;Using a modulo routing strategy doesn&amp;#x27;t change things: as I mentioned before, the core issue is that the addition of new nodes changes the routing of the whole data space, requiring a massive migration of keys in the entire system.&lt;/p&gt;&lt;h2 id="a-different-approach-be6e"&gt;A different approach&lt;a class="headerlink" href="#a-different-approach-be6e" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;While the idea of using hash functions looked very promising, we quickly found that the trivial implementation has very poor performances in a dynamic setting. As we clearly saw in the previous section, the problem is that upon scaling more than half of the keys have to be moved across nodes, so if we could find a way to avoid this we could still use hash functions to scatter data uniformly across the nodes.&lt;/p&gt;&lt;p&gt;As you might have already figured out, the issue comes from the attempt to keep all nodes perfectly balanced. The modulo and integer division algorithms distribute keys evenly (as long as the hash function has a good diffusion), but this is a double-edged sword. The balance is extremely beneficial in a static environment, but it is also the Achilles heel of this architecture when we change the number of nodes.&lt;/p&gt;&lt;p&gt;When we design a system, requirements are paramount. Everything we add to the final product should be there to satisfy one or more requirements. However, often requirements clash with each other, and trying to implement all of them at once might lead to situations where there is no apparent solution. In such cases, it is useful to temporarily drop one or more requirements and investigate the options we have, and this is exactly what we can do in this case: maintaining balance is an important feature, but let&amp;#x27;s see what would happen if we didn&amp;#x27;t have that requirement.&lt;/p&gt;&lt;p&gt;If we don&amp;#x27;t care about balancing nodes we can solve the problem with a different approach. Instead of using the integer division to find the slot, we can keep a table of the minimum hash served by each slot and route requests according to that. Each row of the table will have a minimum hash and the node that serves them.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/hash_table.jpg"&gt;&lt;/div&gt;&lt;p&gt;This means that when we increase the number of slots we can just drop a new slot anywhere and assign to it all the keys that fall under its domain. This means that the new node will become the owner of keys that belonged to another node as it happened before, but with an important difference. Now all keys come from another single node, and the amount of keys moved is a fraction of those contained in it (which is much less than half of the keys). In the worst case, we need to move all keys contained in a node, which once again is much less than half of the keys.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/hash_table_add_node.jpg"&gt;&lt;/div&gt;&lt;p&gt;As you can see, this relieves the load of one single node. According to what we said before, we are not trying to balance the load of the whole cluster. If we could use this technique to cover multiple spaces with a single added node, though, we could relieve the load of more than one other node. In principle this is simple: we just need to add multiple rows with the same node to the table.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/data-partitioning-and-consistent-hashing/hash_table_add_node_multiple.jpg"&gt;&lt;/div&gt;&lt;p&gt;Pay attention to the fact that we added multiple rows, that is multiple partitions, but they are all served by the same physical node. This has several advantages:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;It fills the new node with keys coming from several different nodes without rippling effects.&lt;/li&gt;&lt;li&gt;The key transfer load is spread among different nodes, noticeably hitting only the new node.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;There is also an interesting turn of events: since keys for the new node are fetched from several different existing nodes, the process will keep the cluster balanced! This is a remarkable outcome: we temporarily dropped a requirement and found a solution that provides that exact requirement in a different way.&lt;/p&gt;&lt;p&gt;The key part of this new process is the idea that multiple partitions can be served by the same node. The only missing part at this point is a way to identify the new partitions (the sets of hashes) served by the new node in a deterministic way.&lt;/p&gt;&lt;h2 id="consistent-hashing-1397"&gt;Consistent hashing&lt;a class="headerlink" href="#consistent-hashing-1397" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Finally, let me introduce consistent hashing as a technique to implement the process described above.&lt;/p&gt;&lt;p&gt;As we discussed in the previous section, the only missing part is an algorithm that produces a deterministic set of hash ranges for a single new node. These hash ranges represent the partitions served by that node and should be scattered across the whole hash space. It is important for them to be spread because this way they will each receive some keys from existing nodes, instead of migrating a bulk of keys from a single one. The more evenly spread, the better the distribution of the load and the more balanced the resulting cluster.&lt;/p&gt;&lt;p&gt;As we saw previously, any time we need to scatter data across a given space in a deterministic way, hash functions are a good choice, and they can be used in this case as well. The idea is simple: &lt;em&gt;each partition of a node is assigned a name and this name is hashed with the same function used to hash the keys stored in the system&lt;/em&gt;. This will produce a deterministic value in the hash space, and &lt;em&gt;that value will be the minimum value served by that partition&lt;/em&gt;. Thanks to diffusion the names of all partitions will generate different hash values that won&amp;#x27;t easily clash, and this is the way we generate the routing table.&lt;/p&gt;&lt;p&gt;Let&amp;#x27;s see an example, bearing in mind that the specific function can change among implementations.&lt;/p&gt;&lt;p&gt;For simplicity&amp;#x27;s sake, I used a custom hash function that outputs 28-bit hashes (7 hexadecimal digits). This makes it possible to compare hashes visually and simplifies the example. To do this I took the first 7 digits of the SHA1 hash with the following Python code&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;hash_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf-8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()[:&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;thus creating a hash function whose values go from &lt;code&gt;0x0000000&lt;/code&gt; to &lt;code&gt;0xfffffff&lt;/code&gt;. At the end of the post you will find the Python code that I used to generate the following routing tables, and you are free to experiment using different settings.&lt;/p&gt;&lt;p&gt;WARNING: this is not a good hash function! SHA1 produces 160 bits hashes, so taking the first 28 bits reduces the hash space to a microscopic fraction of the total, as we go from 2&lt;sup&gt;160&lt;/sup&gt; total hashes to 2&lt;sup&gt;28&lt;/sup&gt;. Please keep in mind that this is done only to simplify the visualisation of the example.&lt;/p&gt;&lt;p&gt;All our nodes are called &lt;code&gt;server-X&lt;/code&gt; with &lt;code&gt;X&lt;/code&gt; being a letter of the English alphabet, thus giving us &lt;code&gt;server-a&lt;/code&gt;, &lt;code&gt;server-b&lt;/code&gt;, and so on. I decided to create 5 partitions per server, numbered from 0 to 4, which are generated appending &lt;code&gt;-Y&lt;/code&gt; to the name, where &lt;code&gt;Y&lt;/code&gt; is the number of the partition. For example:&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;server-a-0 -- hash --&amp;gt; 148456820
server-a-1 -- hash --&amp;gt; 57674441
server-a-2 -- hash --&amp;gt; 216250418
server-a-3 -- hash --&amp;gt; 30595746
server-a-4 -- hash --&amp;gt; 23746828
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;If we do this for two nodes (&lt;code&gt;server-a&lt;/code&gt; and &lt;code&gt;server-b&lt;/code&gt;) and then sort the results we will get a full routing table&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt; 23746828 --&amp;gt; server-a-4 ( 6848918 hashes)
 30595746 --&amp;gt; server-a-3 (27078695 hashes)
 57674441 --&amp;gt; server-a-1 ( 3228787 hashes)
 60903228 --&amp;gt; server-b-2 (17957108 hashes)
 78860336 --&amp;gt; server-b-0 ( 7773725 hashes)
 86634061 --&amp;gt; server-b-4 (61822759 hashes)
148456820 --&amp;gt; server-a-0 (67793598 hashes)
216250418 --&amp;gt; server-a-2 (17304439 hashes)
233554857 --&amp;gt; server-b-3 (29289666 hashes)
262844523 --&amp;gt; server-b-1 ( 5590932 hashes)
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Remember that the hashes in the routing table are the minimum hash served by that partition. For example, the first line tells us that all hashes from &lt;code&gt;23746828&lt;/code&gt; are served by the partition &lt;code&gt;server-a-4&lt;/code&gt;, while hashes from &lt;code&gt;30595746&lt;/code&gt; are served by the partition &lt;code&gt;server-a-3&lt;/code&gt;. This means that the partition &lt;code&gt;server-a-4&lt;/code&gt; serves 6848918 hashes (as you can read in the table). A key whose hash is &lt;code&gt;79249022&lt;/code&gt; will be served by &lt;code&gt;server-b-0&lt;/code&gt;&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt; 60903228 --&amp;gt; server-b-2 (17957108 hashes)
 78860336 --&amp;gt; server-b-0 ( 7773725 hashes)
                     ^
                     |
 79249022 -----------+

 86634061 --&amp;gt; server-b-4 (61822759 hashes)
148456820 --&amp;gt; server-a-0 (67793598 hashes)
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Since partitions are not physically separated, but are just virtual entities belonging to a node, the route table can be simplified to&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt; 23746828 -- &amp;gt; server-a (37156400 hashes)
 60903228 -- &amp;gt; server-b (87553592 hashes)
148456820 -- &amp;gt; server-a (85098037 hashes)
233554857 -- &amp;gt; server-b (34880598 hashes)
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;hr&gt;&lt;p&gt;What we achieved is remarkable, but there are still two problems. Let&amp;#x27;s have a look at a simple routing table for three nodes with 5 partitions each&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;3 nodes with 5 partitions each&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt; 23746828 --&amp;gt; server-a (23267855 hashes)
 47014683 --&amp;gt; server-c (10659758 hashes)
 57674441 --&amp;gt; server-a ( 3228787 hashes)
 60903228 --&amp;gt; server-b (63557309 hashes)
124460537 --&amp;gt; server-c (23996283 hashes)
148456820 --&amp;gt; server-a (31382512 hashes)
179839332 --&amp;gt; server-c (36411086 hashes)
216250418 --&amp;gt; server-a (17304439 hashes)
233554857 --&amp;gt; server-b (15386579 hashes)
248941436 --&amp;gt; server-c (13903087 hashes)
262844523 --&amp;gt; server-b ( 5590932 hashes)
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;First, the lowest value is not 0, which means that there are some hashes (23,746,828 in this case) which are not served by any slot. Second, in general the distribution doesn&amp;#x27;t cover the space evenly, as some nodes receive too many keys compared to others. This second problem isn&amp;#x27;t actually visible in the setups I showed so far, but it becomes noticeable increasing the number of servers. For example, with two nodes we have this situation&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;2 nodes with 5 partitions each&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;server-a: 122254437 hashes
server-b: 146181018 hashes
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;while with 5 nodes it becomes&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;5 nodes with 5 partitions each&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;server-a: 64211359 hashes
server-b: 66179053 hashes
server-c: 57545779 hashes
server-d: 43217324 hashes
server-e: 37281940 hashes
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;As you can see, in the second case the load of &lt;code&gt;server-e&lt;/code&gt; is 56% that of &lt;code&gt;server-b&lt;/code&gt;.&lt;/p&gt;&lt;hr&gt;&lt;p&gt;The first problem is easily solved assigning the initial hashes to the last node, that is considering the hash space mapped on a circle. This means that for 2 nodes with 5 partitions each we have&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;Routing table of 2 nodes with 5 partitions each&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;Full routing table
&lt;span class="hll"&gt;        0 --&amp;gt; server-b-1 (23746828 hashes)
&lt;/span&gt; 23746828 --&amp;gt; server-a-4 (6848918 hashes)
 30595746 --&amp;gt; server-a-3 (27078695 hashes)
 57674441 --&amp;gt; server-a-1 (3228787 hashes)
 60903228 --&amp;gt; server-b-2 (17957108 hashes)
 78860336 --&amp;gt; server-b-0 (7773725 hashes)
 86634061 --&amp;gt; server-b-4 (61822759 hashes)
148456820 --&amp;gt; server-a-0 (67793598 hashes)
216250418 --&amp;gt; server-a-2 (17304439 hashes)
233554857 --&amp;gt; server-b-3 (29289666 hashes)
262844523 --&amp;gt; server-b-1 (5590932 hashes)

Simplified routing table
&lt;span class="hll"&gt;        0 -- &amp;gt; server-b (23746828 hashes)
&lt;/span&gt; 23746828 -- &amp;gt; server-a (37156400 hashes)
 60903228 -- &amp;gt; server-b (87553592 hashes)
148456820 -- &amp;gt; server-a (85098037 hashes)
233554857 -- &amp;gt; server-b (34880598 hashes)
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;where the partition &lt;code&gt;server-b-1&lt;/code&gt; contains the orphaned initial hashes.&lt;/p&gt;&lt;p&gt;The second problem is a matter of statistical approach. The hash function that we use to map the partition name to the key space cannot be controlled, as its diffusion property has been designed to avoid a regular spacing of values. However, if we increase the number of partitions we expect the hash function to spread values across the whole space. At that point, each partition will be assigned just a tiny key space, and the differences between partitions will be less noticeable. In other words, by increasing the number of partitions dramatically we should achieve a better distribution. Let&amp;#x27;s compare the results of 5 nodes with 2 partitions each&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;5 nodes with 2 partitions each&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;server-a 36500586
server-b 76678431
server-c 31738329
server-d 56183426
server-e 67334683
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;with the results of 5 nodes with 3000 partitions each&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;5 nodes with 3000 partitions each&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;server-a 53385222
server-b 53855877
server-c 53755762
server-d 53597662
server-e 53840932
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;There is clearly an upper limit to the number of partitions that we can create. If we create more partitions than the possible number of hashes we will end up having empty ones and incurring routing errors as some of them will clash, but this is a purely theoretical case: using standard real hash functions we generate hashes of at least 160 bits, which means a codomain of 2&lt;sup&gt;160&lt;/sup&gt; possible values (more than 10&lt;sup&gt;48&lt;/sup&gt;). With 10,000 nodes (which is a considerable amount of servers in 2022) the threshold would be greater than 10&lt;sup&gt;44&lt;/sup&gt; partitions per server.&lt;/p&gt;&lt;p&gt;So far, we achieved great results, but we already managed to properly partition the space with simple techniques. The real power of consistent hashing is in the way it behaves in a dynamic setting.&lt;/p&gt;&lt;h2 id="consistent-hashing-and-scaling-649a"&gt;Consistent hashing and scaling&lt;a class="headerlink" href="#consistent-hashing-and-scaling-649a" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;The interesting thing about consistent hashing is its amazing behaviour in a dynamic environment. As you might remember, the problem with hash partitioning was that a change in the number of nodes had ripple effects that resulted in a massive migration of at least half the keys.&lt;/p&gt;&lt;p&gt;With consistent hashing, when we add a new node we need to generate the hash values for that and put them in the routing table, and at that point we need to migrate the keys that fall under the domain of the newly created slots. Let&amp;#x27;s see an example before we discuss the performances.&lt;/p&gt;&lt;p&gt;The initial setup is 2 nodes with 5 partitions each&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;2 nodes with 5 partitions&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;Full routing table
        0 --&amp;gt; server-b-1 (23746828 hashes)
 23746828 --&amp;gt; server-a-4 (6848918 hashes)
 30595746 --&amp;gt; server-a-3 (27078695 hashes)
 57674441 --&amp;gt; server-a-1 (3228787 hashes)
 60903228 --&amp;gt; server-b-2 (17957108 hashes)
 78860336 --&amp;gt; server-b-0 (7773725 hashes)
 86634061 --&amp;gt; server-b-4 (61822759 hashes)
148456820 --&amp;gt; server-a-0 (67793598 hashes)
216250418 --&amp;gt; server-a-2 (17304439 hashes)
233554857 --&amp;gt; server-b-3 (29289666 hashes)
262844523 --&amp;gt; server-b-1 (5590932 hashes)

Simplified routing table
        0 -- &amp;gt; server-b (23746828 hashes)
 23746828 -- &amp;gt; server-a (37156400 hashes)
 60903228 -- &amp;gt; server-b (87553592 hashes)
148456820 -- &amp;gt; server-a (85098037 hashes)
233554857 -- &amp;gt; server-b (34880598 hashes)

Stats
server-a 122254437
server-b 146181018

TOTAL HASHES: 268435455/268435455
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;if we add one node we migrate to this new setup&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;3 nodes with 5 partitions&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;Full routing table
        0 --&amp;gt; server-b-1 (23746828 hashes)
 23746828 --&amp;gt; server-a-4 (6848918 hashes)
 30595746 --&amp;gt; server-a-3 (16418937 hashes)
 47014683 --&amp;gt; server-c-3 (10659758 hashes)
 57674441 --&amp;gt; server-a-1 (3228787 hashes)
 60903228 --&amp;gt; server-b-2 (17957108 hashes)
 78860336 --&amp;gt; server-b-0 (7773725 hashes)
 86634061 --&amp;gt; server-b-4 (37826476 hashes)
124460537 --&amp;gt; server-c-2 (23996283 hashes)
148456820 --&amp;gt; server-a-0 (31382512 hashes)
179839332 --&amp;gt; server-c-1 (25303093 hashes)
205142425 --&amp;gt; server-c-4 (11107993 hashes)
216250418 --&amp;gt; server-a-2 (17304439 hashes)
233554857 --&amp;gt; server-b-3 (15386579 hashes)
248941436 --&amp;gt; server-c-0 (13903087 hashes)
262844523 --&amp;gt; server-b-1 (5590932 hashes)

Simplified routing table
        0 -- &amp;gt; server-b (23746828 hashes)
 23746828 -- &amp;gt; server-a (23267855 hashes)
 47014683 -- &amp;gt; server-c (10659758 hashes)
 57674441 -- &amp;gt; server-a ( 3228787 hashes)
 60903228 -- &amp;gt; server-b (63557309 hashes)
124460537 -- &amp;gt; server-c (23996283 hashes)
148456820 -- &amp;gt; server-a (31382512 hashes)
179839332 -- &amp;gt; server-c (36411086 hashes)
216250418 -- &amp;gt; server-a (17304439 hashes)
233554857 -- &amp;gt; server-b (15386579 hashes)
248941436 -- &amp;gt; server-c (13903087 hashes)
262844523 -- &amp;gt; server-b ( 5590932 hashes)

Stats
server-a 75183593
server-b 108281648
server-c 84970214

TOTAL HASHES: 268435455/268435455
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Let&amp;#x27;s have a closer look to what happens with &lt;code&gt;server-c&lt;/code&gt;&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;Simplified routing table
        0 -- &amp;gt; server-b (23746828 hashes)
 23746828 -- &amp;gt; server-a (23267855 hashes) ----+ 10659758 hashes
                                              | from server-a
 47014683 -- &amp;gt; server-c (10659758 hashes) &amp;lt;---+
 57674441 -- &amp;gt; server-a ( 3228787 hashes)
 60903228 -- &amp;gt; server-b (63557309 hashes) ----+ 23996283 hashes
                                              | from server-b
124460537 -- &amp;gt; server-c (23996283 hashes) &amp;lt;---+
148456820 -- &amp;gt; server-a (31382512 hashes) ----+ 36411086 hashes
                                              | from server-a
179839332 -- &amp;gt; server-c (36411086 hashes) &amp;lt;---+
216250418 -- &amp;gt; server-a (17304439 hashes)
233554857 -- &amp;gt; server-b (15386579 hashes) ----+ 13903087 hashes
                                              | from server-b
248941436 -- &amp;gt; server-c (13903087 hashes) &amp;lt;---+
262844523 -- &amp;gt; server-b ( 5590932 hashes)
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Globally, &lt;code&gt;server-c&lt;/code&gt; receives 47,070,844 hashes from &lt;code&gt;server-a&lt;/code&gt; and 37,899,370 hashes from &lt;code&gt;server-b&lt;/code&gt;, which results in a migration of approximately 30% of the total hashes. As you can see there is no ripple effect here, as the boundaries of the existing partitions do not change.&lt;/p&gt;&lt;p&gt;Let&amp;#x27;s consider the performances in the worst case when we add one single node. If we are terribly unlucky (and we use a hash function with clear issues) each partition of the new node will cover completely a partition of an existing node. Assuming that the initial setup with N nodes created a balanced cluster, each node contains 1/Nth of the total keys, and in the worst case we need to move all of them from an existing node to the newly added one.&lt;/p&gt;&lt;p&gt;So, adding one node to a cluster of N nodes using consistent hashing results, in the worst case, in the migration of 1/Nth of the keys. In the previous example, then, we expected to migrate &lt;em&gt;at most&lt;/em&gt; 50% of the keys (1/2), and we ended up migrating 30$ of them.&lt;/p&gt;&lt;p&gt;This is a terrific result. Not only it&amp;#x27;s much better than the previous one (&lt;em&gt;at least&lt;/em&gt; 50% of the keys), but it gets better increasing the size of the cluster. In a cluster with 100 nodes, adding a node will result (in the worst case!) in the migration of 1/100 of the keys.&lt;/p&gt;&lt;h2 id="source-code-b277"&gt;Source code&lt;a class="headerlink" href="#source-code-b277" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;All routing tables shown in the post have been created with the following Python script. Please bear in mind that this is just demo code, so things haven&amp;#x27;t been optimised or designed particularly well. Feel free to change the hash function and the parameters of the script to experiment and see what consistent hashing can do.&lt;/p&gt;&lt;div class="code"&gt;&lt;div class="title"&gt;&lt;code&gt;consistent_hashing_demo.py&lt;/code&gt;&lt;/div&gt;&lt;div class="content"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;string&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;operator&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;itemgetter&lt;/span&gt;

&lt;span class="n"&gt;NUM_NODES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;NUM_PARTITIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;hash_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;encoded_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf-8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hash_encoded_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoded_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hash_encoded_name&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;create_partitions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;partitions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;partition_hashes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;partition_number&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partitions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;partition_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;partition_number&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
        &lt;span class="n"&gt;partition_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hash_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partition_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;partition_hashes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;partition_hash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;partition_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;partition_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;partition_hashes&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;create_routing_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node_names&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;partitions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;node_names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;create_partitions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;partitions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;NUM_NODES&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ascii_lowercase&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Too many servers&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;server-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ascii_lowercase&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;NUM_NODES&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="n"&gt;routing_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_routing_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NUM_PARTITIONS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;routing_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;partition_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;routing_table&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;partition_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;routing_table&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;routing_table&lt;/span&gt;

&lt;span class="n"&gt;routing_table_shift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;routing_table&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mh"&gt;0xFFFFFFF&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;partition_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;END&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;full_routing_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;routing_table&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;routing_table_shift&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;full_routing_table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;partition_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;partition_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Full routing table&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;full_routing_table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s1"&gt;9&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt; --&amp;gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;partition_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt; hashes)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;grouped_routing_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;itertools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;full_routing_table&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;simplified_routing_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grouped_routing_table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;consecutive_partitions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;simplified_routing_table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;consecutive_partitions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;consecutive_partitions&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Simplified routing table&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;simplified_routing_table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s1"&gt;9&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt; -- &amp;gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s1"&gt;8&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s1"&gt; hashes)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Stats&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;slots&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;simplified_routing_table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;total_hashes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;slots&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;total_hashes&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;node_name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;total_hashes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;served_hashes&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;TOTAL HASHES: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_hashes&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2 id="final-words-9803"&gt;Final words&lt;a class="headerlink" href="#final-words-9803" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I hope this long post was useful to introduce you to the topic of partitioning and in general to system design. As I mentioned, such concepts are currently in use by well-known systems, and still discussed as none of them is perfect, so it is worth understanding the fundamental issues before adopting a specific solution.&lt;/p&gt;&lt;h2 id="resources-edc5"&gt;Resources&lt;a class="headerlink" href="#resources-edc5" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;ul&gt;&lt;li&gt;Martin Kleppmann, &lt;em&gt;Designing Data-Intensive Applications&lt;/em&gt;, Chapter 6 &amp;quot;Partitioning&amp;quot;, O’Reilly 2017 &lt;a href="https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/"&gt;official site&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;The &lt;a href="https://en.wikipedia.org/wiki/Consistent_hashing"&gt;Wikipedia article&lt;/a&gt; about consistent hashing.&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.toptal.com/big-data/consistent-hashing"&gt;A Guide to Consistent Hashing&lt;/a&gt; by Juan Pablo Carzolio.&lt;/li&gt;&lt;li&gt;The &lt;a href="https://www.cs.princeton.edu/courses/archive/fall09/cos518/papers/chash.pdf"&gt;original article&lt;/a&gt; by David Karger et al.: &amp;quot;Consistent Hashing and Random Trees: Distributed Caching protocols for Relieving Hot Spots ont the World Wide Web&amp;quot;.&lt;/li&gt;&lt;li&gt;An &lt;a href="https://arxiv.org/pdf/1406.2294.pdf"&gt;alternative algorithm&lt;/a&gt; by John Lamping and Eric Veach: &amp;quot;A Fast, Minimal Memory, Consistent Hash Algorithm&amp;quot;.&lt;/li&gt;&lt;/ul&gt;&lt;h2 id="feedback-d845"&gt;Feedback&lt;a class="headerlink" href="#feedback-d845" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Feel free to reach me on &lt;a href="https://twitter.com/thedigicat"&gt;Twitter&lt;/a&gt; if you have questions. The &lt;a href="https://github.com/TheDigitalCatOnline/blog_source/issues"&gt;GitHub issues&lt;/a&gt; page is the best place to submit corrections.&lt;/p&gt;&lt;p&gt;Photo by &lt;a href="https://unsplash.com/@alexlvrs?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText"&gt;Alex Lvrs&lt;/a&gt; on &lt;a href="https://unsplash.com/s/photos/cake?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;.&lt;/p&gt;</content><category term="Programming"></category><category term="algorithms"></category><category term="software architecture"></category><category term="big data"></category><category term="cryptography"></category><category term="devops"></category><category term="distributed systems"></category><category term="Python"></category></entry><entry><title>Stop using tools as if they were solutions</title><link href="https://www.thedigitalcatonline.com/blog/2021/05/25/stop-using-tools-as-if-they-were-solutions/" rel="alternate"></link><published>2021-05-25T09:00:00+01:00</published><updated>2021-08-22T09:00:00+00:00</updated><author><name>Leonardo Giordani</name></author><id>tag:www.thedigitalcatonline.com,2021-05-25:/blog/2021/05/25/stop-using-tools-as-if-they-were-solutions/</id><summary type="html"></summary><content type="html">&lt;p&gt;&lt;em&gt;&amp;quot;I will write a class&amp;quot;&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;I can&amp;#x27;t tell you how many times I have heard this sentence from candidates during coding interviews.&lt;/p&gt;&lt;p&gt;What&amp;#x27;s wrong with this sentence? Nothing, out of context, but let me add this little detail: this is usually the first sentence I hear when the candidate tries to tackle a problem.&lt;/p&gt;&lt;p&gt;I know that coding interviews can be very stressful and I also think that leading such interviews requires a lot of effort to avoid transforming them into nitpicking sessions in which the candidate feels every single keystroke is scrutinised and analysed. As if the destiny of the whole company depended on how fast you can code a function that reverses a string!&lt;/p&gt;&lt;p&gt;But even taking into account interview anxiety, I think such an approach reveals something wrong deeper in the way we approach problems as programmers. This is the result of a culture that mistakes tools for solutions, and if I can detect it in senior programmers it means it already propagated into our teams and our companies.&lt;/p&gt;&lt;h2 id="the-problem-solving-challenge-f589"&gt;The problem-solving challenge&lt;a class="headerlink" href="#the-problem-solving-challenge-f589" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;When you face a problem (&lt;em&gt;any&lt;/em&gt; problem) you need to devise a strategy to solve it. You need to have an idea of what to do before doing it, otherwise you are reacting and not acting.&lt;/p&gt;&lt;p&gt;When you practice any type of combat sport you train your body to react to specific inputs (attacks) with automatic reactions (defences, counterattacks), but you usually do it because in a real fight you don&amp;#x27;t have the time to make a conscious decision. Such &amp;quot;perfect&amp;quot; reactions, though, are the result of a constant and very focused effort to transform consciously selected actions into involuntary ones. Without training, a pure reaction is usually an average response at best.&lt;/p&gt;&lt;p&gt;In problem-solving we face the same challenge. Either we devise a strategy, or our approach will be clumsy and ultimately not efficient.&lt;/p&gt;&lt;p&gt;Imagine you were tasked to build a bridge between two sides of a river. Would your first concern be the specific type of hammers that the workers should use? After all, you can have ball-peen hammers, sledgehammers, brick hammers, and many other types. Choosing the wrong one might severely affect the performances of your workers.&lt;/p&gt;&lt;p&gt;That&amp;#x27;s hardly the first thing you should ask yourself. I&amp;#x27;m pretty sure you agree that knowing the distance that the bridge should cover is much more urgent. Also, the type and the amount of traffic that it has to carry (walkers, cars, trucks, trains) is an important factor, and you should be concerned about the budget that you are allowed.&lt;/p&gt;&lt;p&gt;Why are these questions more important than the one about hammers? Because the answers to these questions can heavily influence the whole project. They are pillars of your architecture and not details&lt;sup&gt;[&lt;a id="ref-footnote-1-90e26625" href="#cnt-footnote-1-90e26625"&gt;1&lt;/a&gt;]&lt;/sup&gt;. My colleague Ken Pemberton always reminds me that most of the time we don&amp;#x27;t ask ourselves an even more important question: &amp;quot;What problem are you trying to solve?&amp;quot;. In the example above, a bridge might not be the best solution in the first place.&lt;/p&gt;&lt;p&gt;I think the process, at least when it comes to software projects, can be divided into three connected phases: decomposition, communication, implementation.&lt;/p&gt;&lt;h3 id="decomposition-a9dc"&gt;Decomposition&lt;/h3&gt;&lt;p&gt;Macroscopically, a &lt;em&gt;processing system&lt;/em&gt; is made of an initial state, some transformations or intermediate states, and a final state.&lt;/p&gt;&lt;p&gt;Usually, it&amp;#x27;s simple to identify the initial and final state, while it&amp;#x27;s harder to describe what happens between the two. So, we need to proceed iteratively, describing the system using black boxes, and then opening each one of them, zooming in to describe what happens inside.&lt;/p&gt;&lt;p&gt;At any zoom level, from the 10,000 feet overview down to the description of a single function, you need to identify 4 things: the &lt;strong&gt;input&lt;/strong&gt;, the &lt;strong&gt;output&lt;/strong&gt;, the &lt;strong&gt;actors&lt;/strong&gt; and the &lt;strong&gt;data flow&lt;/strong&gt;.&lt;/p&gt;&lt;p&gt;The input is what enters the black box. It has usually been decided at a higher level of zoom or while discussing a component that provides it as output. So, it is given, and if it turns out to be inadequate we should take a step back in the design and question how we can provide proper input. The same is valid for the output.&lt;/p&gt;&lt;p&gt;The actors must be black boxes that accept data and transform it, and the data flow is how information is exchanged between the actors. This is clearly where it can take a long time to find a good solution, and we might need to go back and forth several times.&lt;/p&gt;&lt;p&gt;Let&amp;#x27;s look at an example. A search engine is a complicated piece of software, and implementing it is not a matter of 1 hour of work. But we can decompose it pretty easily, starting from the fact that the input of the system is a query, and that the output is an ordered set of results. So, my overview of this component is the following: the user inputs a query, the query is processed and the system returns a list of results, ordered by quality.&lt;/p&gt;&lt;p&gt;I didn&amp;#x27;t describe what &amp;quot;quality&amp;quot; is, nor discussed the specific implementation of the system that stores all possible results. Those details are buried down somewhere at a certain level of zoom and are utterly useless at this level.&lt;/p&gt;&lt;h3 id="communication-1fc1"&gt;Communication&lt;/h3&gt;&lt;p&gt;Any level of zoom in the decomposition can be described, and the amount of specific technical knowledge needed to understand the explanation should be directly proportional to the zoom level. You might have heard the quote &amp;quot;You do not really understand something unless you can explain it to your grandmother.&amp;quot; I believe this might be very offensive to grandmothers, but paraphrasing it, I would say that &amp;quot;There should be a zoom level at which the project is understandable by anyone who doesn&amp;#x27;t have a specific knowledge of the field&amp;quot;.&lt;/p&gt;&lt;p&gt;Indeed, the problem of technical communication is that tech-savvy gurus are usually not able to decompose what they are working on into black boxes that are sufficiently abstract to be understandable by any human being. Please note this can happen to anyone, not only to programmers. I had to listen enough times to people working in banking, insurances, or project management (just to name a few different fields) to know that they can be unable to describe their job or specific aspects of it without using 4 obscure words every 5 words, the fifth one probably being a conjunction.&lt;/p&gt;&lt;p&gt;Being a blogger and an author I want to add a consideration about communication. Explaining things is the best way to see if everything is clear in your mind, which is another way to read the previous quote (without involving grandmothers). The very same post that you are reading started as an intuition, a small list of ideas, and so far has been rewritten 6 times. In the process, I understood the topics I am discussing much better than I did when I first felt the need to write them down.&lt;/p&gt;&lt;h3 id="implementation-ef77"&gt;Implementation&lt;/h3&gt;&lt;p&gt;Professor Sidney Morris, in a &lt;a href="https://www.youtube.com/watch?v=T1snRQEQuEk"&gt;very interesting video&lt;/a&gt; about how to write proofs in mathematics, describes the process with these words:&lt;/p&gt;&lt;div class="callout"&gt;&lt;div class="content"&gt;&lt;ul&gt;&lt;li&gt;Step 1: write down what we are given&lt;/li&gt;&lt;li&gt;Step 2: write down the definition of each technical term in what we are given&lt;/li&gt;&lt;li&gt;Step 3: write down what we are required to prove&lt;/li&gt;&lt;li&gt;Step 4: write down the definition of each technical term in what we are required to prove&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;So these 4 steps are quite easy, quite straightforward.&lt;/p&gt;
&lt;p&gt;The next step is not as easy&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Step 5: THINK!&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;While we don&amp;#x27;t need to aim to the same level of formality required to mathematicians who prove theorems, we can surely keep the spirit of the process: write down and define what you have, write down and define what you want to achieve. Then, think.&lt;/p&gt;&lt;p&gt;We tend to take for granted that we can think, after all we do it all day long. But focusing our attention on a specific topic, giving it time, exploring it, considering questions about it, evaluating possible answers, all these things are increasingly unpopular. This is not the place for a critique of our society full of noise, where ideas, products, and works of art are watched for mere seconds before getting a like and passing into oblivion. But it is worth noting that thinking is &lt;em&gt;not&lt;/em&gt; easy.&lt;/p&gt;&lt;p&gt;The implementation of a black box might require a lot of thinking, and we have to accept this. It might require a lot of rewrites, prove unsuccessful only after a certain amount of time, or even require a separate project to be properly managed. There are no shortcuts here.&lt;/p&gt;&lt;h2 id="the-coding-interview-problem-9b56"&gt;The coding interview problem&lt;a class="headerlink" href="#the-coding-interview-problem-9b56" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;What do we do during a coding interview? What are we trying to understand with this excruciating exercise that puts people in a pillory for one hour? &lt;/p&gt;&lt;p&gt;What we should do, in my opinion, is to &lt;em&gt;help the candidate to show how they solve problems&lt;/em&gt;. We should facilitate a discussion along the lines of the three points that I mentioned: decomposition, communication, implementation. As you can see implementation is not avoided, it&amp;#x27;s a coding interview because there should be a part of it in which we write code, but it should be done only after we established a decomposition of the system.&lt;/p&gt;&lt;p&gt;I also believe that the assignment should be purposefully too complex to implement in a single one-hour session, and this should be explicitly communicated. This forces the candidate to design instead of rushing headlong into implementing the first requirement of the exercise without reading the rest. At any point, if the candidate is unable to implement a specific step, we can also move on to other steps and fake the input. This way we get many benefits:&lt;/p&gt;&lt;ol start=1&gt;&lt;li&gt;The candidate won&amp;#x27;t feel stressed by the need of showing how good they are at coding. The design part is a friendly chat, where suggestions can be made and specific technologies/solutions might be discarded if not known to the candidate.&lt;/li&gt;&lt;li&gt;They won&amp;#x27;t perceive the interview as a failure because they couldn&amp;#x27;t implement a single step or because they didn&amp;#x27;t complete the assignment in time.&lt;/li&gt;&lt;li&gt;We can explore the way the candidate communicates, the way they decompose complex processes, how well they understand problems and, eventually, how they write code.&lt;/li&gt;&lt;li&gt;We can adjust the level of difficulty of the interview or explore specific topics in detail just asking the candidate to focus on a specific detail.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;As an interviewer, I value the decomposition phase much more than the part in which you show me how well you remember all the functions of the Python standard library in a stressful situation. The truth is that I look them up very often and I don&amp;#x27;t look down on people because they don&amp;#x27;t remember the name of a method. I have one hour to decide if you are a good addition to the company, if you can be a good teammate for my next project, and if (possibly with some training) you can be given the responsibility for part of the system. In that hour I need to capture the main traits of your approach.&lt;/p&gt;&lt;p&gt;Don&amp;#x27;t get me wrong, I am a terrible nitpicker and probably on the brink of being OCD about some things, such as naming or tidiness of the code. But I try to take my own advice. What is the most important thing about you that I can understand? I think it would be extremely disappointing to discover that I hired someone who knows the standard library by heart but can&amp;#x27;t pick the right technology to complete a project before the deadline.&lt;/p&gt;&lt;p&gt;I understand that when you are interviewed you feel like you are in a position of weakness and that you are sitting there at the mercy of an evil interviewer whose purpose is only to uncover what you don&amp;#x27;t know. I&amp;#x27;m sorry if you had to face such interviewers. I had to, and I understand the frustration. My advice is: always remember that working for a company is a matter of giving your time and your energy in exchange for personal growth. You might be interviewing for your dream job, but if the interviewer is not interested in you and your growth it&amp;#x27;s probably not that useful for you to work with them.&lt;/p&gt;&lt;p&gt;So, as a candidate, you have a responsibility to show the interviewer how you can solve problems. If you show how good you are at coding, you will impress only interviewers that are interested in your coding skills, and this is, in my opinion, a very limited part of what you can do as a programmer. You need to show that you can design, and this is independent of the level you are at.&lt;/p&gt;&lt;p&gt;You need to show that you understand problems, that you can compare solutions, that you can take your risks picking one specific strategy and that if needed you can stop at a certain point and say &amp;quot;This is the wrong approach&amp;quot;.&lt;/p&gt;&lt;h2 id="patterns-e366"&gt;Patterns&lt;a class="headerlink" href="#patterns-e366" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Design patterns are defined by Erich Gamma and his co-authors in their seminal book&lt;sup&gt;[&lt;a id="ref-footnote-2-38f5e29a" href="#cnt-footnote-2-38f5e29a"&gt;2&lt;/a&gt;]&lt;/sup&gt; with these words: &amp;quot;[...] patterns solve specific design problems and make object-oriented designs more flexible, elegant, and ultimately reusable. [...] A designer who is familiar with such patterns can apply them immediately to design problems without having to rediscover them.&amp;quot;&lt;/p&gt;&lt;p&gt;I want to focus on the words &amp;quot;solve specific design problems&amp;quot; because what I notice is that many people apply patterns without having understood the problem they are trying to solve. Even worse, they look at the world through the lens of the patterns they know, twisting the nature of problems to fit the solution they know.&lt;/p&gt;&lt;p&gt;Back to the original sentence. &amp;quot;I will write a class&amp;quot; is considered the go-to solution in OOP languages. What we believe is that, in an OOP language, whatever the problem, the solution is to write a class. So, our first move on the chessboard of the interview is to write a class. This is a dangerous misuse of a pattern such as data encapsulation, and an expert interviewer will checkmate us in one move. I saw candidates facing problems that could be solved in 10 minutes with two functions and a dictionary spending more than 50 minutes swamped in a multitude of classes, trying to figure out which object contained the data they needed at a certain point of the process.&lt;/p&gt;&lt;p&gt;Clearly, classes might be the best solution for some problems, but this should come at the end of your analysis. You write a class because you have data and functions that can be put together, and this is valid for any other technology. Always ask yourself: what is the reason why I use this? What is the problem that I&amp;#x27;m trying to solve?&lt;/p&gt;&lt;h2 id="a-dangerous-culture-3fdd"&gt;A dangerous culture&lt;a class="headerlink" href="#a-dangerous-culture-3fdd" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;We all make the same mistake here: we push (or at least accept) a culture in which we teach and learn tools as go-to solutions without teaching to identify and face problems.&lt;/p&gt;&lt;p&gt;Programming languages, architectural patterns, algorithms. Those are all tools to implement solutions, they are not the solutions. You should learn them, down to the most minute details if you can, but never put them on the table before you understood the problem.&lt;/p&gt;&lt;p&gt;Alexis Carrel said, &amp;quot;A few observation and much reasoning lead to error; many observations and a little reasoning to truth.&amp;quot;&lt;sup&gt;[&lt;a id="ref-footnote-3-d415f73a" href="#cnt-footnote-3-d415f73a"&gt;3&lt;/a&gt;]&lt;/sup&gt; The advice that I take from the French Nobel Prize winner is: what is in front of you has to be observed deeply to find out its real nature. What things are is much more important than what we think they are and how we think we should treat them (&amp;quot;reasoning&amp;quot;). And what things are, if observed properly, will also reveal ways to interact with them, to manipulate them, to solve them.&lt;/p&gt;&lt;p&gt;If you want a clear example of the opposite, observe a programmer (maybe you yourself) looking for help on an error the web framework or the compiler threw at them. Copy and paste the error message into Google, pick the first result (Stack Overflow), scroll down until you find some code, apply. I dare you to call this &amp;quot;engineering&amp;quot;. Many times we don&amp;#x27;t even read the Stack Overflow question, we directly read the answer, not to mention the fact that many times we don&amp;#x27;t even read the error message!&lt;/p&gt;&lt;p&gt;I recommend reading a very interesting article by Joseph Gefroh, &lt;a href="https://medium.com/swlh/why-your-technical-interview-is-broken-and-how-to-fix-it-7004da002aa8"&gt;Why Your Technical Interview Is Broken, and How to Fix It&lt;/a&gt;, where he discusses the various types of skills that you can explore during an interview, and which ones you should be interested in. In particular, I couldn&amp;#x27;t agree more with his point about algorithmic interviews, as I believe they are deeply flawed.&lt;/p&gt;&lt;p&gt;I also recommend having a look at the &lt;a href="https://github.com/guardian/coding-exercises"&gt;Guardian Coding Exercises&lt;/a&gt; and to read the description of the repository. I think they are a good example of tests that allow the candidate and the interviewer to work together, to actually meet and to discuss a solution. There is no &amp;quot;right&amp;quot; way to solve them, and many of them cannot be solved in 45 minutes, which is usually the time given to a candidate after an initial introductory chat.&lt;/p&gt;&lt;h2 id="conclusion-506c"&gt;Conclusion&lt;a class="headerlink" href="#conclusion-506c" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I hope these short considerations helped you to see my point. We should all shift our gaze from the tools we have to the nature of problems and to their solutions. We are missing an important step here, which is ultimately what defines a good engineer and which is the most important thing that you can learn in your career. Observe problems, stop and think, devise a strategy, zoom out and zoom in. Learn to use tools, don&amp;#x27;t be used by them.&lt;/p&gt;&lt;p&gt;We need to push for this approach in our interviews, but also try to promote this culture in our teams and companies.&lt;/p&gt;&lt;hr&gt;&lt;div id="_footnotes"&gt;&lt;div id="cnt-footnote-1-90e26625"&gt;&lt;a href="#ref-footnote-1-90e26625"&gt;1&lt;/a&gt;  &lt;p&gt;See &amp;quot;What is a software architecture?&amp;quot; in &lt;a href="https://www.thedigitalcatbooks.com"&gt;Clean Architectures in Python&lt;/a&gt;.&lt;/p&gt;&lt;/div&gt;&lt;div id="cnt-footnote-2-38f5e29a"&gt;&lt;a href="#ref-footnote-2-38f5e29a"&gt;2&lt;/a&gt;  &lt;p&gt;&lt;em&gt;Design Patterns: Elements of Reusable Object-Oriented Software&lt;/em&gt; by Gamma, Vlissides, Johnson, and Helm&lt;/p&gt;&lt;/div&gt;&lt;div id="cnt-footnote-3-d415f73a"&gt;&lt;a href="#ref-footnote-3-d415f73a"&gt;3&lt;/a&gt;  &lt;p&gt;&lt;em&gt;Réflexions sur la vie&lt;/em&gt;, Paris, 1952&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2 id="feedback-d845"&gt;Feedback&lt;a class="headerlink" href="#feedback-d845" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Feel free to reach me on &lt;a href="https://twitter.com/thedigicat"&gt;Twitter&lt;/a&gt; if you have questions. The &lt;a href="https://github.com/TheDigitalCatOnline/blog_source/issues"&gt;GitHub issues&lt;/a&gt; page is the best place to submit corrections.&lt;/p&gt;</content><category term="Programming"></category><category term="software architecture"></category><category term="design"></category><category term="tools"></category></entry><entry><title>Dissecting a Web stack</title><link href="https://www.thedigitalcatonline.com/blog/2020/02/16/dissecting-a-web-stack/" rel="alternate"></link><published>2020-02-16T15:00:00+00:00</published><updated>2020-10-27T08:30:00+00:00</updated><author><name>Leonardo Giordani</name></author><id>tag:www.thedigitalcatonline.com,2020-02-16:/blog/2020/02/16/dissecting-a-web-stack/</id><summary type="html">&lt;p&gt;A layer-by-layer review of the components of a web stack and the reasons behind them&lt;/p&gt;</summary><content type="html">&lt;blockquote&gt;
&lt;p&gt;It was gross. They wanted me to dissect a frog.&lt;/p&gt;
&lt;p&gt;(Beetlejuice, 1988)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Having recently worked with young web developers who were exposed for the first time to proper production infrastructure, I received many questions about the various components that one can find in the architecture of a "Web service". These questions clearly expressed the confusion (and sometimes the frustration) of developers who understand how to create endpoints in a high-level language such as Node.js or Python, but were never introduced to the complexity of what happens between the user's browser and their framework of choice. Most of the times they don't know why the framework itself is there in the first place.&lt;/p&gt;
&lt;p&gt;The challenge is clear if we just list (in random order), some of the words we use when we discuss (Python) Web development: HTTP, cookies, web server, Websockets, FTP, multi-threaded, reverse proxy, Django, nginx, static files, POST, certificates, framework, Flask, SSL, GET, WSGI, session management, TLS, load balancing, Apache.&lt;/p&gt;
&lt;p&gt;In this post, I want to review all the words mentioned above (and a couple more) trying to build a production-ready web service from the ground up. I hope this might help young developers to get the whole picture and to make sense of these "obscure" names that senior developers like me tend to drop in everyday conversations (sometimes arguably out of turn).&lt;/p&gt;
&lt;p&gt;As the focus of the post is the global architecture and the reasons behind the presence of specific components, the example service I will use will be a basic HTML web page. The reference language will be Python but the overall discussion applies to any language or framework.&lt;/p&gt;
&lt;p&gt;My approach will be that of first stating the rationale and then implementing a possible solution. After this, I will point out missing pieces or unresolved issues and move on with the next layer. At the end of the process, the reader should have a clear picture of why each component has been added to the system.&lt;/p&gt;
&lt;h2 id="the-perfect-architecture"&gt;The perfect architecture&lt;a class="headerlink" href="#the-perfect-architecture" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;A very important underlying concept of system architectures is that there is no &lt;em&gt;perfect solution&lt;/em&gt; devised by some wiser genius, that we just need to apply. Unfortunately, often people mistake design patterns for such a "magic solution". The "Design Patterns" original book, however, states that&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Your design should be specific to the problem at hand but also general enough to address future problems and requirements. You also want to avoid redesign, or at least minimize it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And later&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Design patterns make it easier to reuse successful designs and architectures. [...] Design patterns help you choose design alternatives that make a system reusable and avoid alternatives that compromise reusability.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The authors of the book are discussing Object-oriented Programming, but these sentences can be applied to any architecture. As you can see, we have a "problem at hand" and "design alternatives", which means that the most important thing to understand is the requirements, both the present and future ones. Only with clear requirements in mind, one can effectively design a solution, possibly tapping into the great number of patterns that other designers already devised.&lt;/p&gt;
&lt;p&gt;A very last remark. A web stack is a complex beast, made of several components and software packages developed by different programmers with different goals in mind. It is perfectly understandable, then, that such components have some degree of superposition. While the division line between theoretical layers is usually very clear, in practice the separation is often blurry. Expect this a lot, and you will never be lost in a web stack anymore.&lt;/p&gt;
&lt;h2 id="some-definitions"&gt;Some definitions&lt;a class="headerlink" href="#some-definitions" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Let's briefly review some of the most important concepts involved in a Web stack, the protocols.&lt;/p&gt;
&lt;h3 id="tcpip"&gt;TCP/IP&lt;a class="headerlink" href="#tcpip" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;TCP/IP is a network protocol, that is, a &lt;em&gt;set of established rules&lt;/em&gt; two computers have to follow to get connected over a physical network to exchange messages. TCP/IP is composed of two different protocols covering two different layers of the OSI stack, namely the Transport (TCP) and the Network (IP) ones. TCP/IP can be implemented on top of any physical interface (Data Link and Physical OSI layers), such as Ethernet and Wireless. Actors in a TCP/IP network are identified by a &lt;em&gt;socket&lt;/em&gt;, which is a tuple made of an IP address and a port number.&lt;/p&gt;
&lt;p&gt;As far as we are concerned when developing a Web service, however, we need to be aware that TCP/IP is a &lt;em&gt;reliable&lt;/em&gt; protocol, which in telecommunications means that the protocol itself takes care or retransmissions when packets get lost. In other words, while the speed of the communication is not granted, we can be sure that once a message is sent it will reach its destination without errors.&lt;/p&gt;
&lt;h3 id="http"&gt;HTTP&lt;a class="headerlink" href="#http" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;TCP/IP can guarantee that the raw bytes one computer sends will reach their destination, but this leaves completely untouched the problem of how to send meaningful information. In particular, in 1989 the problem Tim Barners-Lee wanted to solve was how to uniquely name hypertext resources in a network and how to access them.&lt;/p&gt;
&lt;p&gt;HTTP is the protocol that was devised to solve such a problem and has since greatly evolved. With the help of other protocols such as WebSocket, HTTP invaded areas of communication for which it was originally considered unsuitable such as real-time communication or gaming.&lt;/p&gt;
&lt;p&gt;At its core, HTTP is a protocol that states the format of a text request and the possible text responses. The initial version 0.9 published in 1991 defined the concept of URL and allowed only the GET operation that requested a specific resource. HTTP 1.0 and 1.1 added crucial features such as headers, more methods, and important performance optimisations. At the time of writing the adoption of HTTP/2 is around 45% of the websites in the world, and HTTP/3 is still a draft.&lt;/p&gt;
&lt;p&gt;The most important feature of HTTP we need to keep in mind as developers is that it is a &lt;em&gt;stateless&lt;/em&gt; protocol. This means that the protocol doesn't require the server to keep track of the state of the communication between requests, basically leaving session management to the developer of the service itself.&lt;/p&gt;
&lt;p&gt;Session management is crucial nowadays because you usually want to have an authentication layer in front of a service, where a user provides credentials and accesses some private data. It is, however, useful in other contexts such as visual preferences or choices made by the user and re-used in later accesses to the same website. Typical solutions to the session management problem of HTTP involve the use of cookies or session tokens.&lt;/p&gt;
&lt;h3 id="https"&gt;HTTPS&lt;a class="headerlink" href="#https" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Security has become a very important word in recent years, and with a reason. The amount of sensitive data we exchange on the Internet or store on digital devices is increasing exponentially, but unfortunately so is the number of malicious attackers and the level of damage they can cause with their actions. The HTTP protocol is inherently&lt;/p&gt;
&lt;p&gt;HTTP is inherently insecure, being a plain text communication between two servers that usually happens on a completely untrustable network such as the Internet. While security wasn't an issue when the protocol was initially conceived, it is nowadays a problem of paramount importance, as we exchange private information, often vital for people's security or for businesses. We need to be sure we are sending information to the correct server and that the data we send cannot be intercepted.&lt;/p&gt;
&lt;p&gt;HTTPS solves both the problem of tampering and eavesdropping, encrypting HTTP with the Transport Layer Security (TLS) protocol, that also enforces the usage of digital certificates, issued by a trusted authority. At the time of writing, approximately 80% of websites loaded by Firefox use HTTPS by default. When a server receives an HTTPS connection and transforms it into an HTTP one it is usually said that it &lt;em&gt;terminates TLS&lt;/em&gt; (or SSL, the old name of TLS).&lt;/p&gt;
&lt;h3 id="websocket"&gt;WebSocket&lt;a class="headerlink" href="#websocket" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;One great disadvantage of HTTP is that communication is always initiated by the client and that the server can send data only when this is explicitly requested. Polling can be implemented to provide an initial solution, but it cannot guarantee the performances of proper full-duplex communication, where a channel is kept open between server and client and both can send data without being requested. Such a channel is provided by the WebSocket protocol.&lt;/p&gt;
&lt;p&gt;WebSocket is a killer technology for applications like online gaming, real-time feeds like financial tickers or sports news, or multimedia communication like conferencing or remote education.&lt;/p&gt;
&lt;p&gt;It is important to understand that WebSocket is not HTTP, and can exist without it. It is also true that this new protocol was designed to be used on top of an existing HTTP connection, so a WebSocket communication is often found in parts of a Web page, which was originally retrieved using HTTP in the first place.&lt;/p&gt;
&lt;h2 id="implementing-a-service-over-http"&gt;Implementing a service over HTTP&lt;a class="headerlink" href="#implementing-a-service-over-http" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Let's finally start discussing bits and bytes. The starting point for our journey is a service over HTTP, which means there is an HTTP request-response exchange. As an example, let us consider a GET request, the simplest of the HTTP methods.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;GET&lt;/span&gt; &lt;span class="nn"&gt;/&lt;/span&gt; &lt;span class="kr"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Host&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;localhost&lt;/span&gt;
&lt;span class="na"&gt;User-Agent&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;curl/7.65.3&lt;/span&gt;
&lt;span class="na"&gt;Accept&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;*/*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see, the client is sending a pure text message to the server, with the format specified by the HTTP protocol. The first line contains the method name (&lt;code&gt;GET&lt;/code&gt;), the URL (&lt;code&gt;/&lt;/code&gt;) and the protocol we are using, including its version (&lt;code&gt;HTTP/1.1&lt;/code&gt;). The remaining lines are called &lt;em&gt;headers&lt;/em&gt; and contain metadata that can help the server to manage the request. The complete value of the &lt;code&gt;Host&lt;/code&gt; header is in this case &lt;code&gt;localhost:80&lt;/code&gt;, but as the standard port for HTTP services is 80, we don't need to specify it.&lt;/p&gt;
&lt;p&gt;If the server &lt;code&gt;localhost&lt;/code&gt; is &lt;em&gt;serving&lt;/em&gt; HTTP (i.e. running some software that understands HTTP) on port 80 the response we might get is something similar to&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kr"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.0&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt; &lt;span class="ne"&gt;OK&lt;/span&gt;
&lt;span class="na"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;Mon, 10 Feb 2020 08:41:33 GMT&lt;/span&gt;
&lt;span class="na"&gt;Content-type&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;text/html&lt;/span&gt;
&lt;span class="na"&gt;Content-Length&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;26889&lt;/span&gt;
&lt;span class="na"&gt;Last-Modified&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;Mon, 10 Feb 2020 08:41:27 GMT&lt;/span&gt;

&lt;span class="cp"&gt;&amp;lt;!DOCTYPE HTML&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;html&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
...
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;html&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As happened for the request, the response is a text message, formatted according to the standard. The first line mentions the protocol and the status of the request (&lt;code&gt;200&lt;/code&gt; in this case, that means success), while the following lines contain metadata in various headers. Finally, after an empty line, the message contains the resource the client asked for, the source code of the base URL of the website in this case. Since this HTML page probably contains references to other resources like CSS, JS, images, and so on, the browser will send several other requests to gather all the data it needs to properly show the page to the user.&lt;/p&gt;
&lt;p&gt;So, the first problem we have is that of implementing a server that understands this protocol and sends a proper response when it receives an HTTP request. We should try to load the requested resource and return either a success (HTTP 200) if we can find it, or a failure (HTTP 404) if we can't.&lt;/p&gt;
&lt;h2 id="1-sockets-and-parsers"&gt;1 Sockets and parsers&lt;a class="headerlink" href="#1-sockets-and-parsers" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="11-rationale"&gt;1.1 Rationale&lt;a class="headerlink" href="#11-rationale" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;TCP/IP is a network protocol that works with &lt;em&gt;sockets&lt;/em&gt;. A socket is a tuple of an IP address (unique in the network) and a port (unique for a specific IP address) that the computer uses to communicate with others. A socket is a file-like object in an operating system, that can be thus &lt;em&gt;opened&lt;/em&gt; and &lt;em&gt;closed&lt;/em&gt;, and that we can &lt;em&gt;read&lt;/em&gt; from or &lt;em&gt;write&lt;/em&gt; to. Socket programming is a pretty low-level approach to the network, but you need to be aware that every software in your computer that provides network access has ultimately to deal with sockets (most probably through some library, though).&lt;/p&gt;
&lt;p&gt;Since we are building things from the ground up, let's implement a small Python program that opens a socket connection, receives an HTTP request, and sends an HTTP response. As port 80 is a "low port" (a number smaller than 1024), we usually don't have permissions to open sockets there, so I will use port 8080. This is not a problem for now, as HTTP can be served on any port.&lt;/p&gt;
&lt;h3 id="12-implementation"&gt;1.2 Implementation&lt;a class="headerlink" href="#12-implementation" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Create the file &lt;code&gt;server.py&lt;/code&gt; and type this code. Yes, &lt;strong&gt;type it&lt;/strong&gt;, don't just copy and paste, you will not learn anything otherwise.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;socket&lt;/span&gt;

&lt;span class="c1"&gt;## Create a socket instance&lt;/span&gt;
&lt;span class="c1"&gt;## AF_INET: use IP protocol version 4&lt;/span&gt;
&lt;span class="c1"&gt;## SOCK_STREAM: full-duplex byte stream&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AF_INET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SOCK_STREAM&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;## Allow reuse of addresses&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setsockopt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SOL_SOCKET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SO_REUSEADDR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;## Bind the socket to any address, port 8080, and listen&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;## Serve forever&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Accept the connection&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;accept&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Receive data from this socket using a buffer of 1024 bytes&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Print out the data&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf-8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Close the connection&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This little program accepts a connection on port 8080 and prints the received data on the terminal. You can test it executing it and then running &lt;code&gt;curl localhost:8080&lt;/code&gt; in another terminal. You should see something like&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;server.py&lt;span class="w"&gt; &lt;/span&gt;
GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1
Host:&lt;span class="w"&gt; &lt;/span&gt;localhost:8080
User-Agent:&lt;span class="w"&gt; &lt;/span&gt;curl/7.65.3
Accept:&lt;span class="w"&gt; &lt;/span&gt;*/*
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The server keeps running the code in the &lt;code&gt;while&lt;/code&gt; loop, so if you want to terminate it you have to do it with Ctrl+C. So far so good, but this is not an HTTP server yet, as it sends no response; you should actually receive an error message from curl that says &lt;code&gt;curl: (52) Empty reply from server&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Sending back a standard response is very simple, we just need to call &lt;code&gt;conn.sendall&lt;/code&gt; passing the raw bytes. A minimal HTTP response contains the protocol and the status, an empty line, and the actual content, for example&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kr"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt; &lt;span class="ne"&gt;OK&lt;/span&gt;

Hi there!
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Our server becomes then&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;socket&lt;/span&gt;

&lt;span class="c1"&gt;## Create a socket instance&lt;/span&gt;
&lt;span class="c1"&gt;## AF_INET: use IP protocol version 4&lt;/span&gt;
&lt;span class="c1"&gt;## SOCK_STREAM: full-duplex byte stream&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AF_INET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SOCK_STREAM&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;## Allow reuse of addresses&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setsockopt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SOL_SOCKET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SO_REUSEADDR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;## Bind the socket to any address, port 8080, and listen&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;## Serve forever&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Accept the connection&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;accept&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Receive data from this socket using a buffer of 1024 bytes&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Print out the data&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf-8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sendall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;HTTP/1.1 200 OK&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Hi there!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;utf-8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Close the connection&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;At this point, we are not really responding to the user's request, however. Try different curl command lines like &lt;code&gt;curl localhost:8080/index.html&lt;/code&gt; or &lt;code&gt;curl localhost:8080/main.css&lt;/code&gt; and you will always receive the same response. We should try to find the resource the user is asking for and send that back in the response content.&lt;/p&gt;
&lt;p&gt;This version of the HTTP server properly extracts the resource and tries to load it from the current directory, returning either a success of a failure&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;socket&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;re&lt;/span&gt;

&lt;span class="c1"&gt;## Create a socket instance&lt;/span&gt;
&lt;span class="c1"&gt;## AF_INET: use IP protocol version 4&lt;/span&gt;
&lt;span class="c1"&gt;## SOCK_STREAM: full-duplex byte stream&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AF_INET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SOCK_STREAM&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;## Allow reuse of addresses&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setsockopt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SOL_SOCKET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SO_REUSEADDR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;## Bind the socket to any address, port 8080, and listen&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;HEAD_200&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;HTTP/1.1 200 OK&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;HEAD_404&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;HTTP/1.1 404 Not Found&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;## Serve forever&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Accept the connection&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;accept&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Receive data from this socket using a buffer of 1024 bytes&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;utf-8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Print out the data&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;GET /(.*) HTTP&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;r&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;HEAD_200&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Resource &lt;/span&gt;&lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="s1"&gt; correctly served&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;HEAD_404&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Resource /&lt;/span&gt;&lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="s2"&gt; cannot be found&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Resource &lt;/span&gt;&lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="s1"&gt; cannot be loaded&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;--------------------&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sendall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;utf-8&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Close the connection&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see this implementation is extremely simple. If you create a simple local file named &lt;code&gt;index.html&lt;/code&gt; with this content&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;head&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;title&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;This is my page&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;title&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;link&lt;/span&gt; &lt;span class="na"&gt;rel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;main.css&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;head&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;html&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Some random content&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;html&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;and run &lt;code&gt;curl localhost:8080/index.html&lt;/code&gt; you will see the content of the file. At this point, you can even use your browser to open &lt;code&gt;http://localhost:8080/index.html&lt;/code&gt; and you will see the title of the page and the content. A Web browser is a software capable of sending HTTP requests and of interpreting the content of the responses if this is HTML (and many other file types like images or videos), so it can &lt;em&gt;render&lt;/em&gt; the content of the message. The browser is also responsible of retrieving the missing resources needed for the rendering, so when you provide links to style sheets or JS scripts with the &lt;code&gt;&amp;lt;link&amp;gt;&lt;/code&gt; or the &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tags in the HTML code of a page, you are instructing the browser to send an HTTP GET request for those files as well.&lt;/p&gt;
&lt;p&gt;The output of &lt;code&gt;server.py&lt;/code&gt; when I access &lt;code&gt;http://localhost:8080/index.html&lt;/code&gt; is&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;GET&lt;/span&gt; &lt;span class="nn"&gt;/index.html&lt;/span&gt; &lt;span class="kr"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Host&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;localhost:8080&lt;/span&gt;
&lt;span class="na"&gt;User-Agent&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0&lt;/span&gt;
&lt;span class="na"&gt;Accept&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8&lt;/span&gt;
&lt;span class="na"&gt;Accept-Language&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;en-GB,en;q=0.5&lt;/span&gt;
&lt;span class="na"&gt;Accept-Encoding&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;gzip, deflate&lt;/span&gt;
&lt;span class="na"&gt;Connection&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;keep-alive&lt;/span&gt;
&lt;span class="na"&gt;Upgrade-Insecure-Requests&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;1&lt;/span&gt;
&lt;span class="na"&gt;Pragma&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;no-cache&lt;/span&gt;
&lt;span class="na"&gt;Cache-Control&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="l"&gt;no-cache&lt;/span&gt;


Resource index.html correctly served
--------------------
GET /main.css HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
Accept: text/css,*/*;q=0.1
Accept-Language: en-GB,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Referer: http://localhost:8080/index.html
Pragma: no-cache
Cache-Control: no-cache


Resource main.css cannot be loaded
--------------------
GET /favicon.ico HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
Accept: image/webp,*/*
Accept-Language: en-GB,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache


Resource favicon.ico cannot be loaded
--------------------
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see the browser sends rich HTTP requests, with a lot of headers, automatically requesting the CSS file mentioned in the HTML code and automatically trying to retrieve a favicon image.&lt;/p&gt;
&lt;h3 id="13-resources"&gt;1.3 Resources&lt;a class="headerlink" href="#13-resources" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;These resources provide more detailed information on the topics discussed in this section&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.python.org/3/howto/sockets.html"&gt;Python 3 Socket Programming HOWTO&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5"&gt;HTTP/1.1 Request format&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html#sec6"&gt;HTTP/1.1 Response format&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The source code of this example is available &lt;a href="https://github.com/lgiordani/dissecting-a-web-stack-code/tree/master/1_sockets_and_parsers"&gt;here&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="14-issues"&gt;1.4 Issues&lt;a class="headerlink" href="#14-issues" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;It gives a certain dose of satisfaction to build something from scratch and discover that it works smoothly with full-fledged software like the browser you use every day. I also think it is very interesting to discover that technologies like HTTP, that basically run the world nowadays, are at their core very simple.&lt;/p&gt;
&lt;p&gt;That said, there are many features of HTTP that we didn't cover with our simple socket programming. For starters, HTTP/1.0 introduced other methods after GET, such as POST that is of paramount importance for today's websites, where users keep sending information to servers through forms. To implement all 9 HTTP methods we need to properly parse the incoming request and add relevant functions to our code.&lt;/p&gt;
&lt;p&gt;At this point, however, you might notice that we are dealing a lot with low-level details of the protocol, which is usually not the core of our business. When we build a service over HTTP we believe that we have the knowledge to properly implement some code that can simplify a certain process, be it searching for other websites, shopping for books or sharing pictures with friends. We don't want to spend our time understanding the subtleties of the TCP/IP sockets and writing parsers for request-response protocols. It is nice to see how these technologies work, but on a daily basis, we need to focus on something at a higher level.&lt;/p&gt;
&lt;p&gt;The situation of our small HTTP server is possibly worsened by the fact that HTTP is a stateless protocol. The protocol doesn't provide any way to connect two successive requests, thus keeping track of the &lt;em&gt;state&lt;/em&gt; of the communication, which is the cornerstone of modern Internet. Every time we authenticate on a website and we want to visit other pages we need the server to remember who we are, and this implies keeping track of the state of the connection.&lt;/p&gt;
&lt;p&gt;Long story short: to work as a proper HTTP server, our code should at this point implement all HTTP methods and cookies management. We also need to support other protocols like Websockets. These are all but trivial tasks, so we definitely need to add some component to the whole system that lets us focus on the business logic and not on the low-level details of application protocols.&lt;/p&gt;
&lt;h2 id="2-web-framework"&gt;2 Web framework&lt;a class="headerlink" href="#2-web-framework" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="21-rationale"&gt;2.1 Rationale&lt;a class="headerlink" href="#21-rationale" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Enter the Web framework!&lt;/p&gt;
&lt;p&gt;As I discussed many times (see &lt;a href="https://www.thedigitalcatonline.com/blog/2018/12/20/cabook/"&gt;the book on clean architectures&lt;/a&gt; or &lt;a href="https://www.thedigitalcatonline.com/blog/2016/11/14/clean-architectures-in-python-a-step-by-step-example/"&gt;the relative post&lt;/a&gt;) the role of the Web framework is that of &lt;em&gt;converting HTTP requests into function calls&lt;/em&gt;, and function return values into HTTP responses. The framework's true nature is that of a layer that connects a working business logic to the Web, through HTTP and related protocols. The framework takes care of session management for us and maps URLs to functions, allowing us to focus on the application logic.&lt;/p&gt;
&lt;p&gt;In the grand scheme of an HTTP service, this is what the framework is supposed to do. Everything the framework provides out of this scope, like layers to access DBs, template engines, and interfaces to other systems, is an addition that you, as a programmer, may find useful, but is not in principle part of the reason why we added the framework to the system. We add the framework because it acts as a layer between our business logic and HTTP.&lt;/p&gt;
&lt;h3 id="22-implementation"&gt;2.2 Implementation&lt;a class="headerlink" href="#22-implementation" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Thanks to Miguel Gringberg and his &lt;a href="https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-i-hello-world"&gt;amazing Flask mega-tutorial&lt;/a&gt; I can set up Flask in seconds. I will not run through the tutorial here, as you can follow it on Miguel's website. I will only use the content of the first article (out of 23!) to create an extremely simple "Hello, world" application.&lt;/p&gt;
&lt;p&gt;To run the following example you will need a virtual environment and you will have to &lt;code&gt;pip install flask&lt;/code&gt;. Follow Miguel's tutorial if you need more details on this.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;app/__init__.py&lt;/code&gt; file is&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;flask&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;

&lt;span class="n"&gt;application&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;app&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;routes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;and the &lt;code&gt;app/routes.py&lt;/code&gt; file is&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;app&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;application&lt;/span&gt;


&lt;span class="nd"&gt;@application&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nd"&gt;@application&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/index&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Hello, world!&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can already see here the power of a framework in action. We defined an &lt;code&gt;index&lt;/code&gt; function and connected it with two different URLs (&lt;code&gt;/&lt;/code&gt; and &lt;code&gt;/index&lt;/code&gt;) in 3 lines of Python. This leaves us time and energy to properly work on the business logic, that in this case is a revolutionary "Hello, world!". Nobody ever did this before.&lt;/p&gt;
&lt;p&gt;Finally, the &lt;code&gt;service.py&lt;/code&gt; file is&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;app&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;application&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Flask comes with a so-called development web server (do these words ring any bell now?) that we can run on a terminal&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;FLASK_APP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;service.py&lt;span class="w"&gt; &lt;/span&gt;flask&lt;span class="w"&gt; &lt;/span&gt;run
&lt;span class="w"&gt; &lt;/span&gt;*&lt;span class="w"&gt; &lt;/span&gt;Serving&lt;span class="w"&gt; &lt;/span&gt;Flask&lt;span class="w"&gt; &lt;/span&gt;app&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;service.py&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;*&lt;span class="w"&gt; &lt;/span&gt;Environment:&lt;span class="w"&gt; &lt;/span&gt;production
&lt;span class="w"&gt;   &lt;/span&gt;WARNING:&lt;span class="w"&gt; &lt;/span&gt;This&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;development&lt;span class="w"&gt; &lt;/span&gt;server.&lt;span class="w"&gt; &lt;/span&gt;Do&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;use&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;production&lt;span class="w"&gt; &lt;/span&gt;deployment.
&lt;span class="w"&gt;   &lt;/span&gt;Use&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;production&lt;span class="w"&gt; &lt;/span&gt;WSGI&lt;span class="w"&gt; &lt;/span&gt;server&lt;span class="w"&gt; &lt;/span&gt;instead.
&lt;span class="w"&gt; &lt;/span&gt;*&lt;span class="w"&gt; &lt;/span&gt;Debug&lt;span class="w"&gt; &lt;/span&gt;mode:&lt;span class="w"&gt; &lt;/span&gt;off
&lt;span class="w"&gt; &lt;/span&gt;*&lt;span class="w"&gt; &lt;/span&gt;Running&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;http://127.0.0.1:5000/&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;Press&lt;span class="w"&gt; &lt;/span&gt;CTRL+C&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;quit&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can now visit the given URL with your browser and see that everything works properly. Remember that 127.0.0.1 is the special IP address that refers to "this computer"; the name &lt;code&gt;localhost&lt;/code&gt; is usually created by the operating system as an alias for that, so the two are interchangeable. As you can see the standard port for Flask's development server is 5000, so you have to mention it explicitly, otherwise your browser would try to access port 80 (the default HTTP one). When you connect with the browser you will see some log messages about the HTTP requests&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="m"&gt;127&lt;/span&gt;.0.0.1&lt;span class="w"&gt; &lt;/span&gt;-&lt;span class="w"&gt; &lt;/span&gt;-&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;:54:27&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;GET / HTTP/1.1&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-
&lt;span class="m"&gt;127&lt;/span&gt;.0.0.1&lt;span class="w"&gt; &lt;/span&gt;-&lt;span class="w"&gt; &lt;/span&gt;-&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;:54:28&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;GET /favicon.ico HTTP/1.1&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;404&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can recognise both now, as those are the same request we got with our little server in the previous part of the article.&lt;/p&gt;
&lt;h3 id="23-resources"&gt;2.3 Resources&lt;a class="headerlink" href="#23-resources" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;These resources provide more detailed information on the topics discussed in this section&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-i-hello-world"&gt;Miguel Gringberg's amazing Flask mega-tutorial&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Localhost"&gt;What is localhost&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The source code of this example is available &lt;a href="https://github.com/lgiordani/dissecting-a-web-stack-code/tree/master/2_web_framework"&gt;here&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="24-issues"&gt;2.4 Issues&lt;a class="headerlink" href="#24-issues" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Apparently, we solved all our problems, and many programmers just stop here. They learn how to use the framework (which is a big achievement!), but as we will shortly discover, this is not enough for a production system. Let's have a closer look at the output of the Flask server. It clearly says, among other things&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="w"&gt;   &lt;/span&gt;WARNING:&lt;span class="w"&gt; &lt;/span&gt;This&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;development&lt;span class="w"&gt; &lt;/span&gt;server.&lt;span class="w"&gt; &lt;/span&gt;Do&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;use&lt;span class="w"&gt; &lt;/span&gt;it&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;production&lt;span class="w"&gt; &lt;/span&gt;deployment.
&lt;span class="w"&gt;   &lt;/span&gt;Use&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;production&lt;span class="w"&gt; &lt;/span&gt;WSGI&lt;span class="w"&gt; &lt;/span&gt;server&lt;span class="w"&gt; &lt;/span&gt;instead.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The main issue we have when we deal with any production system is represented by performances. Think about what we do with JavaScript when we minimise the code: we consciously obfuscate the code in order to make the file smaller, but this is done for the sole purpose of making the file faster to retrieve.&lt;/p&gt;
&lt;p&gt;For HTTP servers the story is not very different. The Web framework usually provides a development Web server, as Flask does, which properly implements HTTP, but does it in a very inefficient way. For starters, this is a &lt;em&gt;blocking&lt;/em&gt; framework, which means that if our request takes seconds to be served (for example because the endpoint retrieves data from a very slow database), any other request will have to wait to be served in a queue. That ultimately means that the user will see a spinner in the browser's tab and just shake their head thinking that we can't build a modern website. Other performances concerns might be connected with memory management or disk caches, but in general, we are safe to say that this web server cannot handle any production load (i.e. multiple users accessing the web site at the same time and expecting good quality of service).&lt;/p&gt;
&lt;p&gt;This is hardly surprising. After all, we didn't want to deal with TCP/IP connections to focus on our business, so we delegated this to other coders who maintain the framework. The framework's authors, in turn, want to focus on things like middleware, routes, proper handling of HTTP methods, and so on. They don't want to spend time trying to optimise the performances of the "multi-user" experience. This is especially true in the Python world (and somehow less true for Node.js, for example): Python is not heavily concurrency-oriented, and both the style of programming and the performances are not favouring fast, non-blocking applications. This is changing lately, with async and improvements in the interpreter, but I leave this for another post.&lt;/p&gt;
&lt;p&gt;So, now that we have a full-fledged HTTP service, we need to make it so fast that users won't even notice this is not running locally on their computer.&lt;/p&gt;
&lt;h2 id="3-concurrency-and-facades"&gt;3 Concurrency and façades&lt;a class="headerlink" href="#3-concurrency-and-facades" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="31-rationale"&gt;3.1 Rationale&lt;a class="headerlink" href="#31-rationale" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Well, whenever you have performance issues, just go for concurrency. Now you have many problems!
(see &lt;a href="https://twitter.com/davidlohr/status/288786300067270656?lang=en"&gt;here&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;Yes, concurrency solves many problems and it's the source of just as much, so we need to find a way to use it in the safest and less complicated way. We basically might want to add a layer that runs the framework in some concurrent way, without requiring us to change anything in the framework itself.&lt;/p&gt;
&lt;p&gt;And whenever you have to homogenise different things just create a layer of indirection. This solves any problem but one. (see &lt;a href="https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering"&gt;here&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;So we need to create a layer that runs our service in a concurrent way, but we also want to keep it detached from the specific implementation of the service, that is independent of the framework or library that we are using.&lt;/p&gt;
&lt;h3 id="32-implementation"&gt;3.2 Implementation&lt;a class="headerlink" href="#32-implementation" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;In this case, the solution is that of giving a &lt;em&gt;specification&lt;/em&gt; of the API that web frameworks have to expose, in order to be usable by independent third-party components. In the Python world, this set of rules has been named WSGI, the Web Server Gateway Interface, but such interfaces exist for other languages such as Java or Ruby. The "gateway" mentioned here is the part of the system outside the framework, which in this discussion is the part that deals with production performances. Through WSGI we are defining a way for frameworks to expose a common interface, leaving people interested in concurrency free to implement something independently.&lt;/p&gt;
&lt;p&gt;If the framework is compatible with the gateway interface, we can add software that deals with concurrency and uses the framework through the compatibility layer. Such a component is a production-ready HTTP server, and two common choices in the Python world are Gunicorn and uWSGI.&lt;/p&gt;
&lt;p&gt;Production-ready HTTP server means that the software understands HTTP as the development server already did, but at the same time pushes performances in order to sustain a bigger workload, and as we said before this is done through concurrency.&lt;/p&gt;
&lt;p&gt;Flask is compatible with WSGI, so we can make it work with Gunicorn. To install it in our virtual environment run &lt;code&gt;pip install gunicorn&lt;/code&gt; and set it up creating a file names &lt;code&gt;wsgi.py&lt;/code&gt; with the following content&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;app&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;application&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;__main__&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;application&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;To run Gunicorn specify the number of concurrent instances and the external port&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;gunicorn&lt;span class="w"&gt; &lt;/span&gt;--workers&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--bind&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.0.0.0:8000&lt;span class="w"&gt; &lt;/span&gt;wsgi
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-12&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;18&lt;/span&gt;:39:07&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;13393&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Starting&lt;span class="w"&gt; &lt;/span&gt;gunicorn&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;20&lt;/span&gt;.0.4
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-12&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;18&lt;/span&gt;:39:07&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;13393&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Listening&lt;span class="w"&gt; &lt;/span&gt;at:&lt;span class="w"&gt; &lt;/span&gt;http://0.0.0.0:8000&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="m"&gt;13393&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-12&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;18&lt;/span&gt;:39:07&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;13393&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Using&lt;span class="w"&gt; &lt;/span&gt;worker:&lt;span class="w"&gt; &lt;/span&gt;sync
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-12&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;18&lt;/span&gt;:39:07&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;13396&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Booting&lt;span class="w"&gt; &lt;/span&gt;worker&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;pid:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;13396&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-12&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;18&lt;/span&gt;:39:07&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;13397&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Booting&lt;span class="w"&gt; &lt;/span&gt;worker&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;pid:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;13397&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-12&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;18&lt;/span&gt;:39:07&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;13398&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Booting&lt;span class="w"&gt; &lt;/span&gt;worker&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;pid:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;13398&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see, Gunicorn has the concept of &lt;em&gt;workers&lt;/em&gt; which are a generic way to express concurrency. Specifically, Gunicorn implements a pre-fork worker model, which means that it (pre)creates a different Unix process for each worker. You can check this running &lt;code&gt;ps&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;ps&lt;span class="w"&gt; &lt;/span&gt;ax&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;grep&lt;span class="w"&gt; &lt;/span&gt;gunicorn
&lt;span class="m"&gt;14919&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pts/1&lt;span class="w"&gt;    &lt;/span&gt;S+&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;:00&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/python3&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/gunicorn&lt;span class="w"&gt; &lt;/span&gt;--workers&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--bind&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.0.0.0:8000&lt;span class="w"&gt; &lt;/span&gt;wsgi
&lt;span class="m"&gt;14922&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pts/1&lt;span class="w"&gt;    &lt;/span&gt;S+&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;:00&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/python3&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/gunicorn&lt;span class="w"&gt; &lt;/span&gt;--workers&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--bind&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.0.0.0:8000&lt;span class="w"&gt; &lt;/span&gt;wsgi
&lt;span class="m"&gt;14923&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pts/1&lt;span class="w"&gt;    &lt;/span&gt;S+&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;:00&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/python3&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/gunicorn&lt;span class="w"&gt; &lt;/span&gt;--workers&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--bind&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.0.0.0:8000&lt;span class="w"&gt; &lt;/span&gt;wsgi
&lt;span class="m"&gt;14924&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pts/1&lt;span class="w"&gt;    &lt;/span&gt;S+&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;:00&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/python3&lt;span class="w"&gt; &lt;/span&gt;~/venv3/bin/gunicorn&lt;span class="w"&gt; &lt;/span&gt;--workers&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--bind&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.0.0.0:8000&lt;span class="w"&gt; &lt;/span&gt;wsgi
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Using processes is just one of the two ways to implement concurrency in a Unix system, the other being using threads. The benefits and demerits of each solution are outside the scope of this post, however. For the time being just remember that you are dealing with multiple workers that process incoming requests asynchronously, thus implementing a non-blocking server, ready to accept multiple connections.&lt;/p&gt;
&lt;h3 id="33-resources"&gt;3.3 Resources&lt;a class="headerlink" href="#33-resources" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;These resources provide more detailed information on the topics discussed in this section&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;a href="https://wsgi.readthedocs.io/en/latest/index.html"&gt;WSGI official documentation&lt;/a&gt; and the &lt;a href="https://en.wikipedia.org/wiki/Web_Server_Gateway_Interface"&gt;Wikipedia page
&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The homepages of &lt;a href="https://gunicorn.org/"&gt;Gunicorn&lt;/a&gt; and &lt;a href="https://uwsgi-docs.readthedocs.io/en/latest/"&gt;uWSGI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A good entry point for your journey into the crazy world of concurrency: &lt;a href="https://en.wikipedia.org/wiki/Multithreading_(computer_architecture)"&gt;multithreading&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The source code of this example is available &lt;a href="https://github.com/lgiordani/dissecting-a-web-stack-code/tree/master/3_concurrency_and_facades"&gt;here&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="34-issues"&gt;3.4 Issues&lt;a class="headerlink" href="#34-issues" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Using a Gunicorn we have now a production-ready HTTP server, and apparently implemented everything we need. There are still many considerations and missing pieces, though.&lt;/p&gt;
&lt;h4 id="performances-again"&gt;Performances (again)&lt;a class="headerlink" href="#performances-again" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Are 3 workers enough to sustain the load of our new killer mobile application? We expect thousands of visitors per minute, so maybe we should add some. But while we increase the amount of workers, we have to keep in mind that the machine we are using has a finite amount of CPU power and memory. So, once again, we have to focus on performances, and in particular on scalability: how can we keep adding workers without having to stop the application, replace the machine with a more powerful one, and restart the service?&lt;/p&gt;
&lt;h4 id="embrace-change"&gt;Embrace change&lt;a class="headerlink" href="#embrace-change" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;This is not the only problem we have to face in production. An important aspect of technology is that it changes over time, as new and (hopefully) better solutions become widespread. We usually design systems dividing them as much as possible into communicating layers exactly because we want to be free to replace a layer with something else, be it a simpler component or a more advanced one, one with better performances or maybe just a cheaper one. So, once again, we want to be able to evolve the underlying system keeping the same interface, exactly as we did in the case of web frameworks.&lt;/p&gt;
&lt;h4 id="https_1"&gt;HTTPS&lt;a class="headerlink" href="#https_1" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Another missing part of the system is HTTPS. Gunicorn and uWSGI do not understand the HTTPS protocol, so we need something in front of them that will deal with the "S" part of the protocol, leaving the "HTTP" part to the internal layers.&lt;/p&gt;
&lt;h4 id="load-balancers"&gt;Load balancers&lt;a class="headerlink" href="#load-balancers" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;In general, a &lt;em&gt;load balancer&lt;/em&gt; is just a component in a system that distributes work among a pool of workers. Gunicorn is already distributing load among its workers, so this is not a new concept, but we generally want to do it on a bigger level, among machines or among entire systems. Load balancing can be hierarchical and be structured on many levels. We can also assign more importance to some components of the system, flagging them as ready to accept more load (for example because their hardware is better). Load balancers are extremely important in network services, and the definition of load can be extremely different from system to system: generally speaking, in a Web service the number of connections is the standard measure of the load, as we assume that on average all connections bring the same amount of work to the system.&lt;/p&gt;
&lt;h4 id="reverse-proxies"&gt;Reverse proxies&lt;a class="headerlink" href="#reverse-proxies" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Load balancers are forward proxies, as they allow a client to contact any server in a pool. At the same time, a &lt;em&gt;reverse proxy&lt;/em&gt; allows a client to retrieve data produced by several systems through the same entry point. Reverse proxies are a perfect way to route HTTP requests to sub-systems that can be implemented with different technologies. For example, you might want to have part of the system implemented with Python, using Django and Postgres, and another part served by an AWS Lambda function written in Go and connected with a non-relational database such as DynamoDB. Usually, in HTTP services this choice is made according to the URL (for example routing every URL that begins with &lt;code&gt;/api/&lt;/code&gt;).&lt;/p&gt;
&lt;h4 id="logic"&gt;Logic&lt;a class="headerlink" href="#logic" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;We also want a layer that can implement a certain amount of logic, to manage simple rules that are not related to the service we implemented. A typical example is that of HTTP redirections: what happens if a user accesses the service with an &lt;code&gt;http://&lt;/code&gt; prefix instead of &lt;code&gt;https://&lt;/code&gt;? The correct way to deal with this is through an HTTP 301 code, but you don't want such a request to reach your framework, wasting resources for such a simple task.&lt;/p&gt;
&lt;h2 id="4-the-web-server"&gt;4 The Web server&lt;a class="headerlink" href="#4-the-web-server" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="41-rationale"&gt;4.1 Rationale&lt;a class="headerlink" href="#41-rationale" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The general label of &lt;em&gt;Web server&lt;/em&gt; is given to software that performs the tasks we discussed. Two very common choices for this part of the system are nginx and Apache, two open source projects that are currently leading the market. With different technical approaches, they both implement all the features we discussed in the previous section (and many more).&lt;/p&gt;
&lt;h3 id="42-implementation"&gt;4.2 Implementation&lt;a class="headerlink" href="#42-implementation" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;To test nginx without having to fight with the OS and install too many packages we can use Docker. Docker is useful to simulate a multi-machine environment, but it might also be your technology of choice for the actual production environment (AWS ECS works with Docker containers, for example).&lt;/p&gt;
&lt;p&gt;The base configuration that we will run is very simple. One container will contain the Flask code and run the framework with Gunicorn, while the other container will run nginx. Gunicorn will serve HTTP on the internal port 8000, not exposed by Docker and thus not reachable from our browser, while nignx will expose port 80, the traditional HTTP port.&lt;/p&gt;
&lt;p&gt;In the same directory of the file &lt;code&gt;wsgi.py&lt;/code&gt;, create a &lt;code&gt;Dockerfile&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;python:3.6&lt;/span&gt;

&lt;span class="k"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;app&lt;span class="w"&gt; &lt;/span&gt;/app
&lt;span class="k"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;wsgi.py&lt;span class="w"&gt; &lt;/span&gt;/

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;
&lt;span class="k"&gt;RUN&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;flask&lt;span class="w"&gt; &lt;/span&gt;gunicorn
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;8000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This starts from a Python Docker image, adds the &lt;code&gt;app&lt;/code&gt; directory and the &lt;code&gt;wsgi.py&lt;/code&gt; file, and installs Gunicorn. Now create a configuration for nginx in a file called &lt;code&gt;nginx.conf&lt;/code&gt; in the same directory&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;listen&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;server_name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;location&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="kn"&gt;proxy_pass&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;http://application:8000/&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This defines a server that listens on port 80 and that connects all the URL starting with &lt;code&gt;/&lt;/code&gt; with a server called &lt;code&gt;application&lt;/code&gt; on port 8000, which is the container running Gunicorn.&lt;/p&gt;
&lt;p&gt;Last, create a file &lt;code&gt;docker-compose.yml&lt;/code&gt; that will describe the configuration of the containers.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;3.7&amp;quot;&lt;/span&gt;
&lt;span class="nt"&gt;services&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;application&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;build&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;.&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;dockerfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Dockerfile&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;gunicorn --workers 3 --bind 0.0.0.0:8000 wsgi&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;expose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;8000&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;nginx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;nginx&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;volumes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;./nginx.conf:/etc/nginx/conf.d/default.conf&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;8080:80&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;depends_on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;application&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see the name &lt;code&gt;application&lt;/code&gt; that we mentioned in the nginx configuration file is not a magic string, but is the name we assigned to the Gunicorn container in the Docker Compose configuration. Please note that nginx listens on port 80 inside the container, but the port is published as 8080 on the host.&lt;/p&gt;
&lt;p&gt;To create this infrastructure we need to install Docker Compose in our virtual environment through &lt;code&gt;pip install docker-compose&lt;/code&gt;. I also created a file named &lt;code&gt;.env&lt;/code&gt; with the name of the project&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nv"&gt;COMPOSE_PROJECT_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;service
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;At this point you can run Docker Compose with &lt;code&gt;docker-compose up -d&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;up&lt;span class="w"&gt; &lt;/span&gt;-d
Creating&lt;span class="w"&gt; &lt;/span&gt;network&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;service_default&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;default&lt;span class="w"&gt; &lt;/span&gt;driver
Creating&lt;span class="w"&gt; &lt;/span&gt;service_application_1&lt;span class="w"&gt; &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Creating&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1&lt;span class="w"&gt;       &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If everything is working correctly, opening the browser and visiting &lt;code&gt;localhost:8080&lt;/code&gt; should show you the HTML page Flask is serving.&lt;/p&gt;
&lt;p&gt;Through &lt;code&gt;docker-compose logs&lt;/code&gt; we can check what services are doing. We can recognise the output of Gunicorn in the logs of the service named &lt;code&gt;application&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;logs&lt;span class="w"&gt; &lt;/span&gt;application
Attaching&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;service_application_1
application_1&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-14&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;08&lt;/span&gt;:35:42&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Starting&lt;span class="w"&gt; &lt;/span&gt;gunicorn&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;20&lt;/span&gt;.0.4
application_1&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-14&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;08&lt;/span&gt;:35:42&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Listening&lt;span class="w"&gt; &lt;/span&gt;at:&lt;span class="w"&gt; &lt;/span&gt;http://0.0.0.0:8000&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
application_1&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-14&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;08&lt;/span&gt;:35:42&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Using&lt;span class="w"&gt; &lt;/span&gt;worker:&lt;span class="w"&gt; &lt;/span&gt;sync
application_1&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-14&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;08&lt;/span&gt;:35:42&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Booting&lt;span class="w"&gt; &lt;/span&gt;worker&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;pid:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt;
application_1&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-14&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;08&lt;/span&gt;:35:42&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;9&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Booting&lt;span class="w"&gt; &lt;/span&gt;worker&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;pid:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;9&lt;/span&gt;
application_1&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;-02-14&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;08&lt;/span&gt;:35:42&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;INFO&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Booting&lt;span class="w"&gt; &lt;/span&gt;worker&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;pid:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;but the one we are mostly interested with now is the service named &lt;code&gt;nginx&lt;/code&gt;, so let's follow the logs in real-time with &lt;code&gt;docker-compose logs -f nginx&lt;/code&gt;. Refresh the &lt;code&gt;localhost&lt;/code&gt; page you visited with the browser, and the container should output something like&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;logs&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;nginx
Attaching&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1
nginx_1&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.192.1&lt;span class="w"&gt; &lt;/span&gt;-&lt;span class="w"&gt; &lt;/span&gt;-&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:08:42:20&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;GET / HTTP/1.1&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;13&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;-&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;-&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;which is the standard log format of nginx. It shows the IP address of the client (&lt;code&gt;192.168.192.1&lt;/code&gt;), the connection timestamp, the HTTP request and the response status code (200), plus other information on the client itself.&lt;/p&gt;
&lt;p&gt;Let's now increase the number of services, to see the load balancing mechanism in action. To do this, first we need to change the log format of nginx to show the IP address of the machine that served the request. Change the &lt;code&gt;nginx.conf&lt;/code&gt; file adding the &lt;code&gt;log_format&lt;/code&gt; and &lt;code&gt;access_log&lt;/code&gt; options&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;log_format&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;upstreamlog&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#39;[&lt;/span&gt;&lt;span class="nv"&gt;$time_local]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;to:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$upstream_addr:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$status&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;listen&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;server_name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;location&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="kn"&gt;proxy_pass&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;http://application:8000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;access_log&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;/var/log/nginx/access.log&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;upstreamlog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;$upstream_addr&lt;/code&gt; variable is the one that contains the IP address of the server proxied by nginx. Now run &lt;code&gt;docker-compose down&lt;/code&gt; to stop all containers and then &lt;code&gt;docker-compose up -d --scale application=3&lt;/code&gt; to start them again&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;down
Stopping&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1&lt;span class="w"&gt;       &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Stopping&lt;span class="w"&gt; &lt;/span&gt;service_application_1&lt;span class="w"&gt; &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Removing&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1&lt;span class="w"&gt;       &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Removing&lt;span class="w"&gt; &lt;/span&gt;service_application_1&lt;span class="w"&gt; &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Removing&lt;span class="w"&gt; &lt;/span&gt;network&lt;span class="w"&gt; &lt;/span&gt;service_default
$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;up&lt;span class="w"&gt; &lt;/span&gt;-d&lt;span class="w"&gt; &lt;/span&gt;--scale&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;application&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;
Creating&lt;span class="w"&gt; &lt;/span&gt;network&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;service_default&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;default&lt;span class="w"&gt; &lt;/span&gt;driver
Creating&lt;span class="w"&gt; &lt;/span&gt;service_application_1&lt;span class="w"&gt; &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Creating&lt;span class="w"&gt; &lt;/span&gt;service_application_2&lt;span class="w"&gt; &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Creating&lt;span class="w"&gt; &lt;/span&gt;service_application_3&lt;span class="w"&gt; &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
Creating&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1&lt;span class="w"&gt;       &lt;/span&gt;...&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see, Docker Compose runs now 3 containers for the &lt;code&gt;application&lt;/code&gt; service. If you open the logs stream and visit the page in the browser you will now see a slightly different output&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;logs&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;nginx
Attaching&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1
nginx_1&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:09:00:16&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.4:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;where you can spot &lt;code&gt;to: 192.168.240.4:8000&lt;/code&gt; which is the IP address of one of the application containers. Please note that the IP address you see might be different, as it depends on the Docker network settings. If you now visit the page again multiple times you should notice a change in the upstream address, something like&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;logs&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;nginx
Attaching&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1
nginx_1&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:09:00:16&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.4:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:09:00:17&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:09:00:17&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.3:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:09:00:17&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.4:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:09:00:17&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This shows that nginx is performing load balancing, but to tell the truth this is happening through Docker's DNS, and not by an explicit action performed by the web server. We can verify this accessing the nginx container and running &lt;code&gt;dig application&lt;/code&gt; (you need to run &lt;code&gt;apt update&lt;/code&gt; and &lt;code&gt;apt install dnsutils&lt;/code&gt; to install &lt;code&gt;dig&lt;/code&gt;)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;exec&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;nginx&lt;span class="w"&gt; &lt;/span&gt;/bin/bash
root@99c2f348140e:/#&lt;span class="w"&gt; &lt;/span&gt;apt&lt;span class="w"&gt; &lt;/span&gt;update
root@99c2f348140e:/#&lt;span class="w"&gt; &lt;/span&gt;apt&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;-y&lt;span class="w"&gt; &lt;/span&gt;dnsutils
root@99c2f348140e:/#&lt;span class="w"&gt; &lt;/span&gt;dig&lt;span class="w"&gt; &lt;/span&gt;application

&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;&amp;lt;&amp;gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;DiG&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;9&lt;/span&gt;.11.5-P4-5.1-Debian&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;&amp;lt;&amp;gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;application
&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;global&lt;span class="w"&gt; &lt;/span&gt;options:&lt;span class="w"&gt; &lt;/span&gt;+cmd
&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Got&lt;span class="w"&gt; &lt;/span&gt;answer:
&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-&amp;gt;&amp;gt;HEADER&lt;span class="s"&gt;&amp;lt;&amp;lt;- opco&lt;/span&gt;de:&lt;span class="w"&gt; &lt;/span&gt;QUERY,&lt;span class="w"&gt; &lt;/span&gt;status:&lt;span class="w"&gt; &lt;/span&gt;NOERROR,&lt;span class="w"&gt; &lt;/span&gt;id:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;7221&lt;/span&gt;
&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;flags:&lt;span class="w"&gt; &lt;/span&gt;qr&lt;span class="w"&gt; &lt;/span&gt;rd&lt;span class="w"&gt; &lt;/span&gt;ra&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;QUERY:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;ANSWER:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;AUTHORITY:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;ADDITIONAL:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;

&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;QUESTION&lt;span class="w"&gt; &lt;/span&gt;SECTION:
&lt;span class="p"&gt;;&lt;/span&gt;application.&lt;span class="w"&gt;                   &lt;/span&gt;IN&lt;span class="w"&gt;      &lt;/span&gt;A

&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;ANSWER&lt;span class="w"&gt; &lt;/span&gt;SECTION:
application.&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;IN&lt;span class="w"&gt;      &lt;/span&gt;A&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.2
application.&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;IN&lt;span class="w"&gt;      &lt;/span&gt;A&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.4
application.&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;IN&lt;span class="w"&gt;      &lt;/span&gt;A&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="m"&gt;192&lt;/span&gt;.168.240.3

&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Query&lt;span class="w"&gt; &lt;/span&gt;time:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;msec
&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;SERVER:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;127&lt;/span&gt;.0.0.11#53&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="m"&gt;127&lt;/span&gt;.0.0.11&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;WHEN:&lt;span class="w"&gt; &lt;/span&gt;Fri&lt;span class="w"&gt; &lt;/span&gt;Feb&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;09&lt;/span&gt;:57:24&lt;span class="w"&gt; &lt;/span&gt;UTC&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2020&lt;/span&gt;
&lt;span class="p"&gt;;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;MSG&lt;span class="w"&gt; &lt;/span&gt;SIZE&lt;span class="w"&gt;  &lt;/span&gt;rcvd:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;110&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;To see load balancing performed by nginx we can explicitly define two services and assign them different weights. Run &lt;code&gt;docker-compose down&lt;/code&gt; and change the nginx configuration to&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;upstream&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;app&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;application1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;weight=3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;application2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;log_format&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;upstreamlog&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#39;[&lt;/span&gt;&lt;span class="nv"&gt;$time_local]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;to:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$upstream_addr:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$status&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;listen&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;server_name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;location&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="kn"&gt;proxy_pass&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;http://app&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kn"&gt;access_log&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;/var/log/nginx/access.log&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;upstreamlog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We defined here an &lt;code&gt;upstream&lt;/code&gt; structure that lists two different services, &lt;code&gt;application1&lt;/code&gt; and &lt;code&gt;application2&lt;/code&gt;, giving to the first one a weight of 3. This mean that each 4 requests, 3 will be routed to the first service, and one to the second service. Now nginx is not just relying on the DNS, but consciously choosing between two different services.&lt;/p&gt;
&lt;p&gt;Let's define the services accordingly in the Docker Compose configuration file&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;version: &amp;quot;3&amp;quot;
services:
  application1:
    build:
      context: .
      dockerfile: Dockerfile
    command: gunicorn --workers 6 --bind 0.0.0.0:8000 wsgi
    expose:
        &lt;span class="k"&gt;-&lt;/span&gt; 8000

  application2:
    build:
      context: .
      dockerfile: Dockerfile
    command: gunicorn --workers 3 --bind 0.0.0.0:8000 wsgi
    expose:
        &lt;span class="k"&gt;-&lt;/span&gt; 8000

  nginx:
    image: nginx
    volumes:
     &lt;span class="k"&gt;-&lt;/span&gt; ./nginx.conf:/etc/nginx/conf.d/default.conf
    ports:
      &lt;span class="k"&gt;-&lt;/span&gt; 80:80
    depends_on:
      &lt;span class="k"&gt;-&lt;/span&gt; application1
      &lt;span class="k"&gt;-&lt;/span&gt; application2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;I basically duplicated the definition of &lt;code&gt;application&lt;/code&gt;, but the first service is running now 6 workers, just for the sake of showing a possible difference between the two. Now run &lt;code&gt;docker-compose up -d&lt;/code&gt; and &lt;code&gt;docker-compose logs -f nginx&lt;/code&gt;. If you refresh the page on the browser multiple times you will see something like&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;docker-compose&lt;span class="w"&gt; &lt;/span&gt;logs&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;nginx
Attaching&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;service_nginx_1
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:25&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:25&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/favicon.ico&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;404&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:30&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.3:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:31&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:32&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:33&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:33&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.3:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:34&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:34&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:35&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.2:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
nginx_1&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;14&lt;/span&gt;/Feb/2020:11:03:35&lt;span class="w"&gt; &lt;/span&gt;+0000&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;localhost&lt;span class="w"&gt; &lt;/span&gt;to:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;172&lt;/span&gt;.18.0.3:8000:&lt;span class="w"&gt; &lt;/span&gt;GET&lt;span class="w"&gt; &lt;/span&gt;/&lt;span class="w"&gt; &lt;/span&gt;HTTP/1.1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;where you can clearly notice the load balancing between &lt;code&gt;172.18.0.2&lt;/code&gt; (&lt;code&gt;application1&lt;/code&gt;) and &lt;code&gt;172.18.0.3&lt;/code&gt; (&lt;code&gt;application2&lt;/code&gt;) in action.&lt;/p&gt;
&lt;p&gt;I will not show here an example of reverse proxy or HTTPS to prevent this post to become too long. You can find resources on those topics in the next section.&lt;/p&gt;
&lt;h3 id="43-resources"&gt;4.3 Resources&lt;a class="headerlink" href="#43-resources" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;These resources provide more detailed information on the topics discussed in this section&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Docker Compose &lt;a href="https://docs.docker.com/compose/"&gt;official documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;nginx &lt;a href="http://nginx.org/en/docs/"&gt;documentation&lt;/a&gt;: in particular the sections about &lt;a href="http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format"&gt;log_format&lt;/a&gt; and &lt;a href="http://nginx.org/en/docs/http/ngx_http_upstream_module.html#upstream"&gt;upstream&lt;/a&gt; directives&lt;/li&gt;
&lt;li&gt;How to &lt;a href="https://docs.nginx.com/nginx/admin-guide/monitoring/logging/"&gt;configure logging&lt;/a&gt; in nginx&lt;/li&gt;
&lt;li&gt;How to &lt;a href="https://docs.nginx.com/nginx/admin-guide/load-balancer/http-load-balancer/"&gt;configure load balancing&lt;/a&gt; in nginx&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.nginx.com/nginx/admin-guide/security-controls/terminating-ssl-http/"&gt;Setting up an HTTPS Server&lt;/a&gt; with nginx and &lt;a href="https://www.humankode.com/ssl/create-a-selfsigned-certificate-for-nginx-in-5-minutes"&gt;how to created self-signed certificates&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;How to &lt;a href="https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/"&gt;create a reverse proxy&lt;/a&gt; with nginx, the documentation of the &lt;a href="http://nginx.org/en/docs/http/ngx_http_core_module.html#location"&gt;&lt;code&gt;location&lt;/code&gt;&lt;/a&gt; directive and &lt;a href="https://www.digitalocean.com/community/tutorials/understanding-nginx-server-and-location-block-selection-algorithms"&gt;some insights&lt;/a&gt; on the location choosing algorithms (one of the most complex parts of nginx)&lt;/li&gt;
&lt;li&gt;The source code of this example is available &lt;a href="https://github.com/lgiordani/dissecting-a-web-stack-code/tree/master/4_the_web_server"&gt;here&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="44-issues"&gt;4.4 Issues&lt;a class="headerlink" href="#44-issues" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Well, finally we can say that the job is done. Now we have a production-ready web server in front of our multi-threaded web framework and we can focus on writing Python code instead of dealing with HTTP headers.&lt;/p&gt;
&lt;p&gt;Using a web server allows us to scale the infrastructure just adding new instances behind it, without interrupting the service. The HTTP concurrent server runs multiple instances of our framework, and the framework itself abstracts HTTP, mapping it to our high-level language.&lt;/p&gt;
&lt;h2 id="bonus-cloud-infrastructures"&gt;Bonus: cloud infrastructures&lt;a class="headerlink" href="#bonus-cloud-infrastructures" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Back in the early years of the Internet, companies used to have their own servers on-premise, and system administrators used to run the whole stack directly on the bare operating system. Needless to say, this was complicated, expensive, and failure-prone.&lt;/p&gt;
&lt;p&gt;Nowadays "the cloud" is the way to go, so I want to briefly mention some components that can help you run such a web stack on AWS, which is the platform I know the most and the most widespread cloud provider in the world at the time of writing.&lt;/p&gt;
&lt;h3 id="elastic-beanstalk"&gt;Elastic Beanstalk&lt;a class="headerlink" href="#elastic-beanstalk" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This is the entry-level solution for simple applications, being a managed infrastructure that provides load balancing, auto-scaling, and monitoring. You can use several programming languages (among which Python and Node.js) and choose between different web servers like for example Apache or nginx. The components of an EB service are not hidden, but you don't have direct access to them, and you have to rely on configuration files to change the way they work. It's a good solution for simple services, but you will probably soon need more control.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://aws.amazon.com/elasticbeanstalk"&gt;Go to Elastic Beanstalk&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="elastic-container-service-ecs"&gt;Elastic Container Service (ECS)&lt;a class="headerlink" href="#elastic-container-service-ecs" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;With ECS you can run Docker containers grouping them in clusters and setting up auto-scale policies connected with metrics coming from CloudWatch. You have the choice of running them on EC2 instances (virtual machines) managed by you or on a serverless infrastructure called Fargate. ECS will run your Docker containers, but you still have to create DNS entries and load balancers on your own. You also have the choice of running your containers on Kubernetes using EKS (Elastic Kubernetes Service).&lt;/p&gt;
&lt;p&gt;&lt;a href="https://aws.amazon.com/ecs/"&gt;Go to Elastic Container Service&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="elastic-compute-cloud-ec2"&gt;Elastic Compute Cloud (EC2)&lt;a class="headerlink" href="#elastic-compute-cloud-ec2" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This is the bare metal of AWS, where you spin up stand-alone virtual machines or auto-scaling group of them. You can SSH into these instances and provide scripts to install and configure software. You can install here your application, web servers, databases, whatever you want. While this used to be the way to go at the very beginning of the cloud computing age I don't think you should go for it. There is so much a cloud provider can give you in terms of associated services like logs or monitoring, and in terms of performances, that it doesn't make sense to avoid using them. EC2 is still there, anyway, and if you run ECS on top of it you need to know what you can and what you can't do.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://aws.amazon.com/ec2/"&gt;Go to Elastic Compute Cloud&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="elastic-load-balancing"&gt;Elastic Load Balancing&lt;a class="headerlink" href="#elastic-load-balancing" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;While Network Load Balancers (NLB) manage pure TCP/IP connections, Application Load Balancers are dedicated to HTTP, and they can perform many of the services we need. They can reverse proxy through rules (that were recently improved) and they can terminate TLS, using certificates created in ACM (AWS Certificate Manager). As you can see, ALBs are a good replacement for a web server, even though they clearly lack the extreme configurability of a software. You can, however, use them as the first layer of load balancing, still using nginx or Apache behind them if you need some of the features they provide.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://aws.amazon.com/elasticloadbalancing/"&gt;Go to Elastic Load Balancing&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="cloudfront"&gt;CloudFront&lt;a class="headerlink" href="#cloudfront" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;CloudFront is a Content Delivery Network, that is a geographically-distributed cache that provides faster access to your content. While CDNs are not part of the stack that I discussed in this post I think it is worth mentioning CF as it can speed-up any static content, and also terminate TLS in connection with AWS Certificate Manager.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://aws.amazon.com/cloudfront/"&gt;Go to CloudFront&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="conclusion"&gt;Conclusion&lt;a class="headerlink" href="#conclusion" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;As you can see a web stack is a pretty rich set of components, and the reason behind them is often related to performances. There are a lot of technologies that we take for granted, and that fortunately have become easier to deploy, but I still believe a full-stack engineer should be aware not only of the existence of such layers, but also of their purpose and at least their basic configuration.&lt;/p&gt;
&lt;h2 id="feedback"&gt;Feedback&lt;a class="headerlink" href="#feedback" title="Permanent link"&gt;&amp;para;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Feel free to reach me on &lt;a href="https://twitter.com/thedigicat"&gt;Twitter&lt;/a&gt; if you have questions. The &lt;a href="https://github.com/TheDigitalCatOnline/blog_source/issues"&gt;GitHub issues&lt;/a&gt; page is the best place to submit corrections.&lt;/p&gt;</content><category term="Programming"></category><category term="software architecture"></category><category term="concurrent programming"></category><category term="cryptography"></category><category term="devops"></category><category term="infrastructure"></category><category term="Flask"></category><category term="Django"></category><category term="Python"></category><category term="SSL"></category><category term="HTTP"></category><category term="WWW"></category><category term="AWS"></category></entry><entry><title>Clean Architectures in Python: the book</title><link href="https://www.thedigitalcatonline.com/blog/2018/12/20/cabook/" rel="alternate"></link><published>2018-12-20T08:00:00+01:00</published><updated>2021-08-22T08:00:00+00:00</updated><author><name>Leonardo Giordani</name></author><id>tag:www.thedigitalcatonline.com,2018-12-20:/blog/2018/12/20/cabook/</id><summary type="html">&lt;p&gt;A practical approach to better software design&lt;/p&gt;</summary><content type="html">&lt;p&gt;The second edition of the book "Clean Architectures in Python. A practical approach to better software design" is out!&lt;/p&gt;
&lt;div class="center-image"&gt;
&lt;img src="/images/cabook/cover.jpg" alt="Cover" /&gt;
&lt;/div&gt;

&lt;p&gt;You can read the book online on the new website &lt;a href="https://www.thedigitalcatbooks.com/"&gt;The Digital Cat Books&lt;/a&gt; or download a PDF version &lt;a href="https://leanpub.com/clean-architectures-in-python"&gt;from Leanpub&lt;/a&gt;. If you enjoy it, please tweet about it with the &lt;code&gt;#pycabook&lt;/code&gt; hashtag.&lt;/p&gt;
&lt;p&gt;So far more than 16,000 readers downloaded the book. Thank you all!&lt;/p&gt;</content><category term="Projects"></category><category term="OOP"></category><category term="Python"></category><category term="Python2"></category><category term="Python3"></category><category term="TDD"></category><category term="testing"></category><category term="software architecture"></category></entry><entry><title>Clean architectures in Python: a step-by-step example</title><link href="https://www.thedigitalcatonline.com/blog/2016/11/14/clean-architectures-in-python-a-step-by-step-example/" rel="alternate"></link><published>2016-11-14T19:00:00+00:00</published><updated>2021-09-24T12:00:00+00:00</updated><author><name>Leonardo Giordani</name></author><id>tag:www.thedigitalcatonline.com,2016-11-14:/blog/2016/11/14/clean-architectures-in-python-a-step-by-step-example/</id><summary type="html">&lt;p&gt;How to create software that can be easily changed and adapted, following the clean architecture principles&lt;/p&gt;</summary><content type="html">&lt;p&gt;In 2015 I was introduced by my friend &lt;a href="https://github.com/gekorob"&gt;Roberto Ciatti&lt;/a&gt; to the concept of Clean Architecture, as it is called by Robert Martin. The well-known Uncle Bob talks a lot about this concept at conferences and wrote some very interesting posts about it. What he calls &amp;quot;Clean Architecture&amp;quot; is a way of structuring a software system, a set of consideration (more than strict rules) about the different layers and the role of the actors in it.&lt;/p&gt;&lt;p&gt;As he clearly states in a post aptly titled &lt;a href="https://blog.8thlight.com/uncle-bob/2012/08/13/the-clean-architecture.html"&gt;The Clean Architecture&lt;/a&gt;, the idea behind this design is not new. As a matter of fact, it is a set of concepts that have been pushed by many software engineers over the last 3 decades. One of the first implementations may be found in the Boundary-Control-Entity model proposed by Ivar Jacobson in his masterpiece &lt;a href="https://www.ivarjacobson.com/publications/books/object-oriented-software-engineering-1992"&gt;Object-Oriented Software Engineering: A Use Case Driven Approach&lt;/a&gt; published in 1992, but Martin lists other more recent versions of this architecture.&lt;/p&gt;&lt;p&gt;I will not repeat here what he had already explained better than I can do, so I will just point out some resources you may check to start exploring these concepts:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="https://blog.8thlight.com/uncle-bob/2012/08/13/the-clean-architecture.html"&gt;The Clean Architecture&lt;/a&gt; a post by Robert Martin that concisely describes the goals of the architecture. It also lists resources that describe similar architectures.&lt;/li&gt;&lt;li&gt;&lt;a href="https://blog.8thlight.com/uncle-bob/2014/05/12/TheOpenClosedPrinciple.html"&gt;The Open Closed Principle&lt;/a&gt; a post by Robert Martin not strictly correlated with the Clean Architecture concept but important for the separation concept.&lt;/li&gt;&lt;li&gt;Hakka Labs: Robert &amp;quot;Uncle Bob&amp;quot; Martin - &lt;a href="https://www.youtube.com/watch?v=HhNIttd87xs"&gt;Architecture: The Lost Years&lt;/a&gt; a video of Robert Martin from Hakka Labs.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.taimila.com/blog/ddd-and-testing-strategy/"&gt;DDD &amp;amp; Testing Strategy&lt;/a&gt; by Lauri Taimila&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.amazon.co.uk/Clean-Architecture-Robert-C-Martin-x/dp/0134494164/ref=la_B000APG87E_1_3?s=books&amp;ie=UTF8&amp;qid=1479146201&amp;sr=1-3"&gt;Clean Architecture&lt;/a&gt; by Robert Martin, published by Prentice Hall.&lt;/li&gt;&lt;/ul&gt;&lt;h2 id="a-day-in-the-life-of-a-clean-system-7792"&gt;A day in the life of a clean system&lt;a class="headerlink" href="#a-day-in-the-life-of-a-clean-system-7792" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I will introduce here a (very simple) system designed with a clean architecture. The purpose of this section is to familiarise with main concepts like &lt;em&gt;separation of concerns&lt;/em&gt; and &lt;em&gt;inversion of control&lt;/em&gt;, which are paramount in system design. While I describe how data flows in the system, I will purposefully omit details, so that we can focus on the global idea and not worry too much about the implementation.&lt;/p&gt;&lt;h3 id="the-data-flow-a31a"&gt;The data flow&lt;/h3&gt;&lt;p&gt;In the rest of the book, we will design together part of a simple web application that provides a room renting system. So, let&amp;#x27;s consider that our &amp;quot;Rent-o-Matic&amp;quot; application (inspired by the Sludge-O-Matic™ from Day of the Tentacle) is running at &lt;a href="https://www.rentomatic.com"&gt;https://www.rentomatic.com&lt;/a&gt;, and that a user wants to see the available rooms. They open the browser and type the address, then clicking on menus and buttons they reach the page with the list of all the rooms that our company rents.&lt;/p&gt;&lt;p&gt;Let&amp;#x27;s assume that this URL is &lt;code&gt;/rooms?status=available&lt;/code&gt;. When the user&amp;#x27;s browser accesses that URL, an HTTP request reaches our system, where there is a component that is waiting for HTTP connections. Let&amp;#x27;s call this component &amp;quot;web framework&amp;quot;.&lt;/p&gt;&lt;p&gt;The purpose of the web framework is to understand the HTTP request and to retrieve the data that we need to provide a response. In this simple case there are two important parts of the request, namely the endpoint itself (&lt;code&gt;/rooms&lt;/code&gt;), and a single query string parameter, &lt;code&gt;status=available&lt;/code&gt;. Endpoints are like commands for our system, so when a user accesses one of them, they signal to the system that a specific service has been requested, which in this case is the list of all the rooms that are available for rent.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure01.svg" alt="The web framework serving HTTP"&gt;&lt;div class="title"&gt;The web framework serving HTTP&lt;/div&gt;&lt;/div&gt;&lt;p&gt;The domain in which the web framework operates is that of the HTTP protocol, so when the web framework has decoded the request it should pass the relevant information to another component that will process it. This other component is called &lt;em&gt;use case&lt;/em&gt;, and it is the crucial and most important component of the whole clean system as it implements the &lt;em&gt;business logic&lt;/em&gt;.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure02.svg" alt="The business logic"&gt;&lt;div class="title"&gt;The business logic&lt;/div&gt;&lt;/div&gt;&lt;p&gt;The business logic is an important concept in system design. You are creating a system because you have some knowledge that you think might be useful to the world, or at the very least marketable. This knowledge is, at the end of the day, a way to process data, a way to extract or present data that maybe others don&amp;#x27;t have. A search engine can find all the web pages that are related to the terms in a query, a social network shows you the posts of people you follow and sorts them according to a specific algorithm, a travel company finds the best options for your journey between two locations, and so on. All these are good examples of business logic.&lt;/p&gt;&lt;div class="admonition tip"&gt;&lt;i class="fa fa-lightbulb"&gt;&lt;/i&gt;&lt;div class="content"&gt;&lt;div class="title"&gt;Business logic&lt;/div&gt;&lt;div&gt;&lt;p&gt;Business logic is the specific algorithm or process that you want to implement, the way you transform data to provide a service. It is the most important part of the system.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;The use case implements a very specific part of the whole business logic. In this case we have a use case to search for rooms with a given value of the parameter &lt;code&gt;status&lt;/code&gt;. This means that the use case has to extract all the rooms that are managed by our company and filter them to show only the ones that are available.&lt;/p&gt;&lt;p&gt;Why can&amp;#x27;t the web framework do it? Well, the main purpose of a good system architecture is to &lt;em&gt;separate concerns&lt;/em&gt;, that is to keep different responsibilities and domains separated. The web framework is there to process the HTTP protocol, and is maintained by programmers that are concerned with that specific part of the system, and adding the business logic to it mixes two very different fields.&lt;/p&gt;&lt;div class="admonition tip"&gt;&lt;i class="fa fa-lightbulb"&gt;&lt;/i&gt;&lt;div class="content"&gt;&lt;div class="title"&gt;Separation of concerns&lt;/div&gt;&lt;div&gt;&lt;p&gt;Different parts a system should manage different parts of the process. Whenever two separate parts of a system work on the same data or the same part of a process they are &lt;em&gt;coupled&lt;/em&gt;. While coupling is unavoidable, the higher the coupling between two components the harder is to change one without affecting the other.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;As we will see, separating layers allows us to maintain the system with less effort, making single parts of it more testable and easily replaceable.&lt;/p&gt;&lt;p&gt;In the example that we are discussing here, the use case needs to fetch all the rooms that are in an available state, extracting them from a source of data. This is the business logic, and in this case it is very straightforward, as it will probably consist of a simple filtering on the value of an attribute. This might however not be the case. An example of a more advanced business logic might be an ordering based on a recommendation system, which might require the use case to connect with more components than just the data source.&lt;/p&gt;&lt;p&gt;So, the information that the use case wants to process is stored somewhere. Let&amp;#x27;s call this component &lt;em&gt;storage system&lt;/em&gt;. Many of you probably already pictured a database in your mind, maybe a relational one, but that is just one of the possible data sources. The abstraction represented by the storage system is: anything that the use case can access and that can provide data is a source. It might be a file, a database (either relational or not), a network endpoint, or a remote sensor.&lt;/p&gt;&lt;div class="admonition tip"&gt;&lt;i class="fa fa-lightbulb"&gt;&lt;/i&gt;&lt;div class="content"&gt;&lt;div class="title"&gt;Abstraction&lt;/div&gt;&lt;div&gt;&lt;p&gt;When designing a system, it is paramount to think in terms of abstractions, or building blocks. A component has a role in the system, regardless of the specific implementation of that component. The higher the level of the abstraction, the less detailed are the components. Clearly, high-level abstractions don&amp;#x27;t consider practical problems, which is why the abstract design has to be then implemented using specific solutions or technologies.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;For simplicity&amp;#x27;s sake, let&amp;#x27;s use a relational database like Postgres in this example, as it is likely to be familiar to the majority of readers, but keep in mind the more generic case.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure03.svg" alt="The storage"&gt;&lt;div class="title"&gt;The storage&lt;/div&gt;&lt;/div&gt;&lt;p&gt;How does the use case connect with the storage system? Clearly, if we hard code into the use case the calls to a specific system (e.g. using SQL) the two components will be &lt;em&gt;strongly coupled&lt;/em&gt;, which is something we try to avoid in system design. Coupled components are not independent, they are tightly connected, and changes occurring in one of the two force changes in the second one (and vice versa). This also means that testing components is more difficult, as one component cannot live without the other, and when the second component is a complex system like a database this can severely slow down development.&lt;/p&gt;&lt;p&gt;For example, let&amp;#x27;s assume the use case called directly a specific Python library to access PostgreSQL such as &lt;a href="https://www.psycopg.org/"&gt;psycopg&lt;/a&gt;. This would couple the use case with that specific source, and a change of database would result in a change of its code. This is far from being ideal, as the use case contains the business logic, which has not changed moving from one database system to the other. Parts of the system that do not contain the business logic should be treated like implementation details.&lt;/p&gt;&lt;div class="admonition tip"&gt;&lt;i class="fa fa-lightbulb"&gt;&lt;/i&gt;&lt;div class="content"&gt;&lt;div class="title"&gt;Implementation detail&lt;/div&gt;&lt;div&gt;&lt;p&gt;A specific solution or technology is called a &lt;em&gt;detail&lt;/em&gt; when it is not central to the design as a whole. The word doesn&amp;#x27;t refer to the inherent complexity of the subject, which might be greater than that of more central parts.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;A relational database is hundred of times richer and more complex than an HTTP endpoint, and this in turn is more complex than ordering a list of objects, but the core of the application is the use case, not the way we store data or the way we provide access to that. Usually, implementation details are mostly connected with performances or usability, while the core parts implement the pure business logic.&lt;/p&gt;&lt;p&gt;How can we avoid strong coupling? A simple solution is called &lt;em&gt;inversion of control&lt;/em&gt;, and I will briefly sketch it here, and show a proper implementation in a later section of the book, when we will implement this very example.&lt;/p&gt;&lt;p&gt;Inversion of control happens in two phases. First, the called object (the database in this case) is wrapped with a standard interface. This is a set of functionalities shared by every implementation of the target, and each interface translates the functionalities to calls to the specific language&lt;sup&gt;[&lt;a id="ref-footnote-1-df544db1" href="#cnt-footnote-1-df544db1"&gt;1&lt;/a&gt;]&lt;/sup&gt; of the wrapped implementation.&lt;/p&gt;&lt;div class="admonition tip"&gt;&lt;i class="fa fa-lightbulb"&gt;&lt;/i&gt;&lt;div class="content"&gt;&lt;div class="title"&gt;Inversion of control&lt;/div&gt;&lt;div&gt;&lt;p&gt;A technique used to avoid strong coupling between components of a system, that involves wrapping them so that they expose a certain interface. A component expecting that interface can then connect to them without knowing the details of the specific implementation, and thus being strongly coupled to the interface instead of the specific implementation.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;A real world example of this is that of power plugs: electric appliances are designed to be connected not with specific power plugs, but to any power plug that is build according to the specification (size, number of poles, etc). When you buy a TV in the UK, you expect it to come with a UK plug (BS 1363). If it doesn&amp;#x27;t, you need an &lt;em&gt;adapter&lt;/em&gt; that allows you to plug electronic devices into sockets of a foreign nation. In this case, we need to connect the use case (TV) to a database (power system) that has not been designed to match a common interface.&lt;/p&gt;&lt;p&gt;In the example we are discussing, the use case needs to extract all rooms with a given status, so the database wrapper needs to provide a single entry point that we might call &lt;code&gt;list_rooms_with_status&lt;/code&gt;.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure04.svg" alt="The storage interface"&gt;&lt;div class="title"&gt;The storage interface&lt;/div&gt;&lt;/div&gt;&lt;p&gt;In the second phase of inversion of control the caller (the use case) is modified to avoid hard coding the call to the specific implementation, as this would again couple the two. The use case accepts an incoming object as a parameter of its constructor, and receives a concrete instance of the adapter at creation time. The specific technique used to implement this depends greatly on the programming language we use. Python doesn&amp;#x27;t have an explicit syntax for interfaces, so we will just assume the object we pass implements the required methods.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure05.svg" alt="Inversion of control on the storage interface"&gt;&lt;div class="title"&gt;Inversion of control on the storage interface&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Now the use case is connected with the adapter and knows the interface, and it can call the entry point &lt;code&gt;list_rooms_with_status&lt;/code&gt; passing the status &lt;code&gt;available&lt;/code&gt;. The adapter knows the details of the storage system, so it converts the method call and the parameter in a specific call (or set of calls) that extract the requested data, and then converts them in the format expected by the use case. For example, it might return a Python list of dictionaries that represent rooms.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure06.svg" alt="The business logic extracts data from the storage"&gt;&lt;div class="title"&gt;The business logic extracts data from the storage&lt;/div&gt;&lt;/div&gt;&lt;p&gt;At this point, the use case has to apply the rest of the business logic, if needed, and return the result to the web framework.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure07.svg" alt="The business logic returns processed data to the web framework"&gt;&lt;div class="title"&gt;The business logic returns processed data to the web framework&lt;/div&gt;&lt;/div&gt;&lt;p&gt;The web framework converts the data received from the use case into an HTTP response. In this case, as we are considering an endpoint that is supposed to be reached explicitly by the user of the website, the web framework will return an HTML page in the body of the response, but if this was an internal endpoint, for example called by some asynchronous JavaScript code in the front-end, the body of the response would probably just be a JSON structure.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure08.svg" alt="The web framework returns the data in an HTTP response"&gt;&lt;div class="title"&gt;The web framework returns the data in an HTTP response&lt;/div&gt;&lt;/div&gt;&lt;h3 id="advantages-of-a-layered-architecture-a271"&gt;Advantages of a layered architecture&lt;/h3&gt;&lt;p&gt;As you can see, the stages of this process are clearly separated, and there is a great deal of data transformation between them. Using common data formats is one of the way we achieve independence, or loose coupling, between components of a computer system.&lt;/p&gt;&lt;p&gt;To better understand what loose coupling means for a programmer, let&amp;#x27;s consider the last picture. In the previous paragraphs I gave an example of a system that uses a web framework for the user interface and a relational database for the data source, but what would change if the front-end part was a command-line interface? And what would change if, instead of a relational database, there was another type of data source, for example a set of text files?&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure09.svg" alt="The web framework replaced by a CLI"&gt;&lt;div class="title"&gt;The web framework replaced by a CLI&lt;/div&gt;&lt;/div&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure10.svg" alt="A database replaced by a more trivial file-based storage"&gt;&lt;div class="title"&gt;A database replaced by a more trivial file-based storage&lt;/div&gt;&lt;/div&gt;&lt;p&gt;As you can see, both changes would require the replacement of some components. After all, we need different code to manage a command line instead of a web page. But the external shape of the system doesn&amp;#x27;t change, neither does the way data flows. We created a system in which the user interface (web framework, command-line interface) and the data source (relational database, text files) are details of the implementation, and not core parts of it.&lt;/p&gt;&lt;p&gt;The main immediate advantage of a layered architecture, however, is testability. When you clearly separate components you clearly establish the data each of them has to receive and produce, so you can ideally disconnect a single component and test it in isolation. Let&amp;#x27;s take the Web framework component that we added and consider it for a moment forgetting the rest of the architecture. We can ideally connect a tester to its inputs and outputs as you can see in the figure&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure11.svg" alt="Testing the web layer in isolation"&gt;&lt;div class="title"&gt;Testing the web layer in isolation&lt;/div&gt;&lt;/div&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/figure12.svg" alt="Detailed setup of the web layer testing, width=80%"&gt;&lt;div class="title"&gt;Detailed setup of the web layer testing&lt;/div&gt;&lt;/div&gt;&lt;p&gt;We know that the Web framework receives an HTTP request &lt;span class="callout"&gt;1&lt;/span&gt; with a specific target and a specific query string, and that it has to call &lt;span class="callout"&gt;2&lt;/span&gt; a method on the use case passing specific parameters. When the use case returns data &lt;span class="callout"&gt;3&lt;/span&gt;, the Web framework has to convert that into an HTTP response &lt;span class="callout"&gt;4&lt;/span&gt;. Since this is a test we can have a fake use case, that is an object that just mimics what the use case does without really implementing the business logic. We will then test that the Web framework calls the method &lt;span class="callout"&gt;2&lt;/span&gt; with the correct parameters, and that the HTTP response &lt;span class="callout"&gt;4&lt;/span&gt; contains the correct data in the proper format, and all this will happen without involving any other part of the system.&lt;/p&gt;&lt;h2 id="clean-architectures-in-python-the-book-c4fa"&gt;Clean Architectures in Python: the book&lt;a class="headerlink" href="#clean-architectures-in-python-the-book-c4fa" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I hope you found this introduction useful. What you read so far was the first chapter of the book &amp;quot;Clean Architectures in Python&amp;quot; that you can read online at &lt;a href="https://www.thedigitalcatbooks.com/pycabook-introduction/"&gt;The Digital Cat Books&lt;/a&gt;. The book is available as PDF and ebook &lt;a href="https://leanpub.com/clean-architectures-in-python"&gt;on Leanpub&lt;/a&gt;.&lt;/p&gt;&lt;div class="imageblock"&gt;&lt;img src="/images/cabook/cover.jpg" alt="The book cover"&gt;&lt;/div&gt;&lt;p&gt;Chapter 2 of the book briefly discusses the &lt;strong&gt;components&lt;/strong&gt; and the ideas behind this software architecture. Chapter 3 runs through &lt;strong&gt;a concrete example&lt;/strong&gt; of clean architecture and chapter 4 expands the example adding a &lt;strong&gt;web application&lt;/strong&gt; on top of it. Chapter 5 discusses &lt;strong&gt;error management&lt;/strong&gt; and improvements to the Python code developed in the previous chapters. Chapters 6 and 7 show how to plug &lt;strong&gt;different database systems&lt;/strong&gt; to the web service created previously, and chapter 8 wraps up the example showing how to run the application with a &lt;strong&gt;production-ready configuration&lt;/strong&gt;.&lt;/p&gt;&lt;div id="_footnotes"&gt;&lt;div id="cnt-footnote-1-df544db1"&gt;&lt;a href="#ref-footnote-1-df544db1"&gt;1&lt;/a&gt;  &lt;p&gt;The word &lt;em&gt;language&lt;/em&gt;, here, is meant in its broader sense. It might be a programming language, but also an API, a data format, or a protocol.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2 id="feedback-d845"&gt;Feedback&lt;a class="headerlink" href="#feedback-d845" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Feel free to reach me on &lt;a href="https://twitter.com/thedigicat"&gt;Twitter&lt;/a&gt; if you have questions. The &lt;a href="https://github.com/TheDigitalCatOnline/blog_source/issues"&gt;GitHub issues&lt;/a&gt; page is the best place to submit corrections.&lt;/p&gt;</content><category term="Programming"></category><category term="OOP"></category><category term="pytest"></category><category term="Python"></category><category term="Python2"></category><category term="Python3"></category><category term="TDD"></category><category term="software architecture"></category></entry></feed>