Consistent Backends and UX: Why Should You Care?
Article series
- Why care about consistency?
- What problems may occur?
- What are the barriers to adopting a consistent database?
- How does the new algorithm help?
Today, more and more new products aim to have an impact on the global market, and user experience is quickly becoming a key factor in determining whether they succeed or fail. The following aspects can significantly affect the user experience of an application:
- Performance and low latency
- The application runs as expected
- Security
- Features and UI
Let’s start our journey of pursuing the perfect user experience!
1) Performance and low latency
As others have said before: performance is user experience (1, 2). Even a slight increase in latency can cost you the attention of potential visitors.
2) The application runs as expected
What exactly does "run as expected" mean? It means that if I change my name to "Robert" in the app and then reload the app, my name will be Robert and not Brecht. It seems important that an app delivers these guarantees, right?
Whether an application can deliver on these guarantees depends on the database. In the pursuit of low latency and high performance, we end up in the world of distributed databases, where only a few of the newer databases provide these guarantees. In that world, unless we choose a strongly consistent (as opposed to eventually consistent) database, various problems may be lurking. In this series, we will go into detail on what this means, which databases offer this feature called strong consistency, and how it can help you build very fast applications with minimal effort.
3) Security
Security doesn't seem to affect the user experience at first. However, once a user notices a security breach, the relationship can be damaged beyond repair.
4) Features and UI
Impressive features and a great UI have a significant impact on users, both consciously and unconsciously. Often, people only want a particular product after they have experienced how it looks and feels.
If the database saves us setup and configuration time, the rest of our effort can go into delivering impressive features and a great UI. The good news is that today there are databases that meet all of the above requirements, need no configuration or server provisioning, and provide easy-to-use APIs such as GraphQL out of the box.
What makes this new breed of databases so different? Let's take a step back and see how the ongoing pursuit of lower latency and better user experience, together with advances in database research, eventually led to a new type of database that is the ideal building block for modern applications.
The quest for distribution
I. Content delivery networks
As mentioned earlier, performance has a significant impact on user experience. There are several ways to improve latency, the most obvious being to optimize your application code. Once your application code is fully optimized, network latency and the read and write performance of the database are often still bottlenecks. To achieve low latency, we need to make sure our data is as close to the client as possible by distributing it globally. We can address the second bottleneck (read and write performance) by having multiple machines work together, in other words, by replicating data.
Distribution brings better performance and, consequently, a better user experience. We have already seen widespread use of a distributed solution that speeds up the delivery of static data: the Content Delivery Network (CDN). The Jamstack community values CDNs highly for reducing the latency of their applications. They typically use frameworks and tools such as Next.js/Now, Gatsby, and Netlify to pre-assemble front-end React/Angular/Vue code into a static website so that they can serve it from a CDN.
Unfortunately, CDNs do not fit every use case, because we cannot rely on statically generated HTML pages for all applications. There are many types of highly dynamic applications where you cannot generate everything statically in advance. For example:
- Applications that require real-time updates to communicate instantly between users (e.g., chat applications, collaborative drawing or writing, games).
- Applications that present data in so many different ways, by filtering, aggregating, sorting, and otherwise manipulating it, that you can't generate everything in advance.
II. Distributed Database
Typically, highly dynamic applications require a distributed database to improve performance. Like a CDN, a distributed database aims to be a global network rather than a single node. Essentially, we want to go from a scenario with a single database node...
...to a scenario in which the database becomes a network. When a user connects from a specific continent, they are automatically redirected to the nearest database node. This results in lower latency and happier end users.
If databases were employees waiting by a phone, the database employee would tell you that there is an employee closer by and forward the call. Fortunately, distributed databases automatically route us to the nearest database employee, so we never have to bother the database employee on another continent.
Distributed databases are multi-region, and you are always redirected to the nearest node.
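The routing idea can be sketched in a few lines. This is only an illustration: the region names and round-trip times are made up, and real distributed databases typically handle this routing for you (for example through DNS or the client driver) rather than asking the application to pick a node.

```python
def nearest_node(latencies_ms: dict[str, float]) -> str:
    """Return the name of the node with the lowest measured round-trip time."""
    if not latencies_ms:
        raise ValueError("no database nodes available")
    return min(latencies_ms, key=latencies_ms.get)


# Hypothetical measurements taken by a client located in Europe.
measured = {"us-east": 95.0, "eu-west": 18.0, "ap-south": 160.0}
print(nearest_node(measured))  # eu-west
```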
Besides latency, distributed databases offer a second and a third advantage. The second is redundancy, which means that if one of the database locations in the network were completely obliterated by a Godzilla attack, your data would not be lost, because other nodes still hold copies of it.
Last but not least, the third advantage of a distributed database is scalability. A database that runs on a single server can quickly become the bottleneck of your application. A distributed database, in contrast, replicates data over multiple servers and can scale up and down automatically according to the demands of the application. In some advanced distributed databases, this aspect is taken care of for you entirely. These databases are called "serverless", which means you don't even have to configure when the database should scale up and down; you only pay for what your application uses, and that's it.
Distributing dynamic data brings us into the realm of distributed databases. As mentioned earlier, various problems may be lurking there. Compared to CDN content, the data is highly dynamic: it can change rapidly and can be filtered and sorted, which brings additional complexity. The database world has explored different approaches to achieve this. Early approaches had to make sacrifices to reach the desired performance and scalability. Let's see how the quest for distribution evolved.
Traditional databases' approach to distribution
A logical choice was to build on traditional databases (MySQL, PostgreSQL, SQL Server), since so much effort had already been invested in them. However, traditional databases were not built to be distributed and therefore took a rather simple approach to distribution. The typical way to scale reads is to use read replicas. A read replica is just a copy of your data that you can read from but not write to. Such a copy (or replica) offloads queries from the node that contains the original data. The mechanism is very simple: data is copied to the replicas incrementally as it comes in.
Because of this relatively simple approach, a replica's data is always slightly behind the original. If you read from a replica node at a specific point in time, you may get an older value than you would by reading from the primary node. This is called a "stale read". Programmers using traditional databases must be aware of this possibility and program with this limitation in mind. Remember the example from the beginning, where we write a value and then read it back? When working with traditional database replicas, you cannot expect to read what you write.
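To make the stale read concrete, here is a minimal in-memory sketch. The `Primary` and `ReadReplica` classes and the `replicate` function are hypothetical stand-ins, not the replication protocol of any real database; replication is triggered by hand so that the lag is visible.

```python
class Primary:
    """Accepts writes and remembers which changes still need to be shipped."""

    def __init__(self):
        self.data = {}
        self.pending = []  # changes not yet copied to the replica

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class ReadReplica:
    """Read-only copy that only knows what has been replicated so far."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


def replicate(primary, replica):
    """Ship all pending changes to the replica, in the order they were written."""
    for key, value in primary.pending:
        replica.data[key] = value
    primary.pending.clear()


primary, replica = Primary(), ReadReplica()
primary.write("name", "Brecht")
replicate(primary, replica)

primary.write("name", "Robert")
stale = replica.read("name")  # replication has not run yet: still the old name
replicate(primary, replica)
fresh = replica.read("name")  # after replication the replica has caught up
print(stale, fresh)  # Brecht Robert
```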
You can improve the user experience slightly by optimistically applying the result of a write in the front end before all replicas are aware of it. However, if the update has not yet reached the replica, reloading the page may set the UI back to its previous state. The user would then think that their changes were never saved.
The first generation of distributed databases
In the traditional replication approach, the obvious bottleneck is that all writes go to the same node. That machine can be scaled up, but it will eventually hit a ceiling. As your application becomes more popular and the number of writes grows, the database will no longer be fast enough to accept new data. To scale both reads and writes horizontally, distributed databases were invented. A distributed database also holds multiple copies of the data, but you can write to each of these copies. Since you update data through any node, all nodes have to communicate with each other and inform each other about new data. In other words, replication is no longer one-directional as it is in the traditional approach.
However, these databases can still suffer from the stale reads mentioned above and introduce many other potential problems related to writes. Whether they suffer from these problems depends on the choices they made regarding availability and consistency.
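One example of such a write-related problem is a silently lost update. The sketch below assumes a "last write wins" rule for resolving conflicts, which is a common strategy but not one this article attributes to any particular database; the `Node` class, the keys, and the timestamps are invented for illustration.

```python
class Node:
    """A writable copy of the data that resolves conflicts by newest timestamp."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        self.store[key] = (timestamp, value)

    def merge_from(self, other):
        """Adopt the other node's entry whenever its timestamp is newer."""
        for key, (ts, value) in other.store.items():
            if key not in self.store or ts > self.store[key][0]:
                self.store[key] = (ts, value)

    def read(self, key):
        return self.store[key][1]


europe, america = Node("europe"), Node("america")

# Two users move the same person at almost the same time, on different nodes.
europe.write("team:alice", "red", timestamp=100)
america.write("team:alice", "blue", timestamp=101)

# The nodes inform each other of new data.
europe.merge_from(america)
america.merge_from(europe)

# Both nodes now agree, but the write of "red" has vanished without any error.
print(europe.read("team:alice"), america.read("team:alice"))
```

Both nodes converge on the same value, which is what eventual consistency promises, yet one user's change is gone and nobody was told.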
This first generation of distributed databases is often referred to as the "NoSQL movement", a name shaped by databases such as MongoDB and Neo4j, which also offered alternative languages to SQL and different modeling strategies (documents or graphs instead of tables). NoSQL databases often lacked typical traditional database features such as constraints and joins. Over time, the name turned out to be a poor one, because many databases considered NoSQL did provide some form of SQL. Multiple interpretations arose, claiming that NoSQL databases:
- Do not provide SQL as a query language.
- Do not only provide SQL (NoSQL = Not Only SQL).
- Do not provide typical traditional features such as joins, constraints, and ACID guarantees.
- Model their data differently (graph, document, or temporal model).
Some of the newer databases that were non-relational yet offered SQL were then called "NewSQL" to avoid confusion.
Wrong interpretations of the CAP theorem
The first generation of databases was strongly inspired by the CAP theorem, which states that you cannot have both consistency and availability during a network partition. A network partition essentially means that something happens that prevents two nodes from informing each other about new data, and it can arise for a variety of reasons (for example, sharks have reportedly bitten into Google’s cables). Consistency means that the data in your database is always correct, but not necessarily available to your application. Availability means that your database is always online and your application can always access the data, but without a guarantee that the data is correct or the same across nodes. We usually speak of high availability, since 100% availability does not exist. Availability is expressed as a number of nines (for example, 99.9999% availability), because there is always some series of events that can cause downtime.
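To get a feel for what a number of nines means in practice, the arithmetic can be written out as a small helper. The function name is ours, and a year is taken as 365 days for simplicity.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year that still meets the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% available -> at most {minutes:.1f} minutes of downtime per year")
```

Three nines (99.9%) still allows about 525.6 minutes, or roughly 8.8 hours, of downtime per year; each additional nine divides that budget by ten.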
But what happens when there is no network partition? Database vendors took the CAP theorem a bit too generally and either chose to accept potential data loss or chose to be available, whether or not there is a network partition. While the CAP theorem is a good start, it does not emphasize that high availability and consistency can both be achieved when there is no network partition. Most of the time there is no network partition, so it makes sense to describe that case as well by extending the CAP theorem into the PACELC theorem. The key difference lies in the last three letters (ELC), which stand for Else Latency Consistency. The theorem states that when there is no network partition, the database has to balance latency against consistency.
In simple terms: when there is no network partition, latency goes up as the consistency guarantees go up. But we will see that reality is more subtle than that.
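A toy model shows where that extra latency comes from. It assumes, purely for illustration, that a write is acknowledged to the client once a chosen number of replicas have confirmed it and that replicas are contacted in parallel; the function and all numbers are hypothetical and ignore everything else a real database does.

```python
def write_latency_ms(local_ms, replica_rtts_ms, acks_required):
    """Time until `acks_required` replicas have confirmed a write.

    acks_required=0 models asynchronous replication: the write is acknowledged
    as soon as the local node has it, and replicas catch up later.
    """
    if not 0 <= acks_required <= len(replica_rtts_ms):
        raise ValueError("acks_required must be between 0 and the number of replicas")
    if acks_required == 0:
        return local_ms
    # Replicas are contacted in parallel, so we wait for the k-th fastest reply.
    return local_ms + sorted(replica_rtts_ms)[acks_required - 1]


rtts = [20.0, 80.0, 150.0]  # hypothetical round trips to three other regions

print(write_latency_ms(2.0, rtts, acks_required=0))  # weakest guarantee
print(write_latency_ms(2.0, rtts, acks_required=2))  # wait for two replicas
print(write_latency_ms(2.0, rtts, acks_required=3))  # wait for every replica
```

The more replicas must agree before a write counts as done, the longer the client waits, which is the Else-Latency-Consistency half of PACELC in miniature.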
What does this have to do with user experience?
Let's look at an example of how giving up consistency can affect the user experience. Consider an application that gives you a friendly interface for composing teams of people; you can drag and drop people into different teams.
Once you drag and drop a person onto a team, an update is triggered to update that team. If the database does not guarantee that your application can read the result of this update immediately, the UI has to apply the change optimistically. In that case, bad things can happen:
- The user refreshes the page, no longer sees their update, and thinks it is gone. When they refresh again, it suddenly reappears.
- The database did not store the update successfully because of a conflict with another update. In this case, the update may be canceled without the user ever knowing. They may only notice that their change is gone the next time they reload.
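The following sketch shows what an optimistic update with explicit failure handling might look like for the team example. Everything here is hypothetical: `TeamBoard`, `ConflictError`, and the `save` callback stand in for a real UI and a real back-end call. It covers only the second failure case (a rejected update) and does nothing about stale reads after a reload.

```python
class ConflictError(Exception):
    """Raised by the (hypothetical) back end when an update loses a conflict."""


class TeamBoard:
    """UI state for the drag-and-drop team editor."""

    def __init__(self, save):
        self.save = save    # callable(person, team); may raise ConflictError
        self.teams = {}     # person -> team currently shown in the UI
        self.notices = []   # messages shown to the user

    def move(self, person, team):
        previous = self.teams.get(person)
        self.teams[person] = team  # optimistic: show the change right away
        try:
            self.save(person, team)
        except ConflictError:
            # Roll the UI back and tell the user instead of failing silently.
            if previous is None:
                del self.teams[person]
            else:
                self.teams[person] = previous
            self.notices.append(f"Could not move {person} to {team}")


def rejecting_save(person, team):
    raise ConflictError()


board = TeamBoard(save=lambda person, team: None)  # back end accepts the update
board.move("alice", "red")

board.save = rejecting_save                        # back end now reports a conflict
board.move("alice", "blue")

print(board.teams)    # alice is back on the red team
print(board.notices)  # and the user has been told why
```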
This trade-off between consistency and latency has sparked many heated discussions between front-end and back-end developers. The first group wants a great user experience in which users receive feedback when they perform an action and can be 100% sure that, once they receive that feedback and act on it, the results of their actions are saved consistently. The second group wants to build a scalable and performant back end and sees no other way to do so than to sacrifice those user experience requirements.
Both groups have valid points, but there was no golden solution that satisfied both. When transactions increased and the database became the bottleneck, their only options were traditional database replication or a distributed database that sacrifices strong consistency for so-called "eventual consistency". With eventual consistency, an update to the database will eventually be applied on all machines, but there is no guarantee that the next transaction can read the updated value. In other words, if I update my name to "Robert", there is no guarantee that I will actually receive "Robert" if I query my name immediately after the update.
Consistency Tax
To deal with eventual consistency, developers need to understand the limitations of such a database and do a lot of extra work. Programmers often resort to user experience tricks to hide the database's limitations, and back-end developers have to write many additional layers of code to accommodate various failure scenarios. Finding and building creative solutions around these limitations has profoundly affected the way both front-end and back-end developers do their jobs, significantly increasing technical complexity while still not delivering an ideal user experience.
We can think of this extra work required to ensure data correctness as a “tax” that application developers must pay to deliver a good user experience. It is the tax of using a software system that does not offer consistency guarantees that hold up in today's web-scale, concurrent environments. We call it the consistency tax.
Thankfully, a new generation of databases has emerged that does not require you to pay the consistency tax and can scale without sacrificing consistency!
The second generation of distributed databases
A second generation of distributed databases has emerged that provides strong (rather than eventual) consistency. These databases scale well without losing data or returning stale data. In other words, they do what you expect, and you no longer need to learn about limitations or pay the consistency tax. If you update a value, the next read always reflects the updated value, and different updates are applied in the same temporal order in which they were written. At the time of writing, FaunaDB, Spanner, and FoundationDB are the only databases that offer strong consistency without limitations (also known as strict serializability).
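The guarantee itself (every replica applies updates in one agreed order, and a read never returns less than what has already been written) can be illustrated with a shared ordered log. This is a deliberately naive sketch and not how FaunaDB, Spanner, or FoundationDB are implemented; `OrderedLog` and `Replica` are hypothetical, and the cost of agreeing on the log order across regions is left out entirely.

```python
class OrderedLog:
    """A single, agreed-upon sequence of updates."""

    def __init__(self):
        self.entries = []

    def append(self, key, value):
        self.entries.append((key, value))
        return len(self.entries)  # position of the new entry in the log


class Replica:
    """Applies log entries strictly in log order."""

    def __init__(self, log):
        self.log = log
        self.applied = 0  # number of log entries applied so far
        self.data = {}

    def catch_up(self):
        for key, value in self.log.entries[self.applied:]:
            self.data[key] = value
        self.applied = len(self.log.entries)

    def read(self, key):
        # Never answer from a state older than the current end of the log.
        self.catch_up()
        return self.data.get(key)


log = OrderedLog()
replica_a, replica_b = Replica(log), Replica(log)

log.append("name", "Brecht")
log.append("name", "Robert")

# Whichever replica we ask, we read what was written last.
print(replica_a.read("name"), replica_b.read("name"))
```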
The PACELC theorem revisited
The second generation of distributed databases has achieved something that was previously considered impossible: these databases favor consistency and still deliver low latencies. This became possible thanks to intelligent synchronization mechanisms such as Calvin, Spanner, and Percolator, which we will discuss in detail in article 4 of this series. While older databases still struggle to deliver strong consistency guarantees at low latency, databases built on these new intelligent algorithms suffer from no such limitations.
Database design greatly influences the latency that is attainable at high consistency.
Since these new algorithms allow databases to provide both strong consistency and low latency, there is usually no good reason to give up consistency (at least in the absence of a network partition). The only time you would do so is when extremely low write latency is the only thing that truly matters and you are willing to lose data to achieve it.
Are these databases still NoSQL?
It is no longer trivial to categorize this new generation of distributed databases. Many efforts are still being made (1, 2) to explain what NoSQL means, but none of them fully make sense anymore, because NoSQL and SQL databases are growing towards each other. New distributed databases borrow from different data models (document, graph, relational, temporal), and some of them deliver ACID guarantees or even support SQL. They still have one thing in common with NoSQL: they are built to solve the limitations of traditional databases. One word will never be able to describe how a database behaves. In the future, it will make more sense to describe distributed databases by answering these questions:
- Is it strongly consistent?
- Does the distribution rely on read replicas, or is it truly distributed?
- What data models does it borrow from?
- How expressive is the query language, and what are its limitations?
Conclusion
We explained how applications can now benefit from a new generation of globally distributed databases that serve dynamic data from the nearest location, much like a CDN does. We briefly went through the history of distributed databases and saw that it was not a smooth ride. Many first-generation databases were developed, and their consistency choices (mainly driven by the CAP theorem) required us to write more code while still diminishing the user experience. Only recently has the database community developed algorithms that allow distributed databases to combine low latency with strong consistency. A new era is upon us, one in which we no longer have to make trade-offs between data access and consistency!
At this point, you probably want to see concrete examples of the potential pitfalls of eventually consistent databases. We will cover exactly that in the next article of this series. Stay tuned for the upcoming articles:
Article series
- Why care about consistency?
- What problems may occur?
- What are the barriers to adopting a consistent database?
- How does the new algorithm help?