
Persistence Best Practices for Java Applications
A comprehensive understanding of databases is impossible without delving into humanity's history. The desire to preserve knowledge through time made writing one of our most enduring technologies, and its earliest records, found in temples and caves, can be recognized as humankind's first non-computational databases.
Today, the industry depends on accurate, well-recorded information. As more people gain access to technology and join the global information network, the volume of data grows accordingly; research estimates that the amount of data doubles roughly every two years.
The history of modern databases began in the 1960s, when Charles Bachman designed the first computer database, the Integrated Data Store (IDS), a predecessor to IBM's Information Management System (IMS).
A decade later, around 1970, one of the most significant events in the history of databases occurred when E. F. Codd published his paper A Relational Model of Data for Large Shared Data Banks, coining the term relational database.
Finally came what is probably the most recent breakthrough in data storage: NoSQL, an umbrella term for any non-relational database. Some say NoSQL stands for Non-SQL; others say it stands for Not Only SQL.
NoSQL databases power some of the most popular online applications.
The evolution of database systems has been marked by key milestones over the decades. In the early days, when storage was expensive, the challenge was finding ways to reduce wasted information; trimming even one million dollars' worth of stored data was a significant achievement.
Did you know?
At the dawn of the database era, a megabyte of storage cost around $5 million!
https://ourworldindata.org/grapher/historical-cost-of-computer-memory-and-storage
Today, the cost per megabyte is no longer the challenge; storage now runs at roughly $0.001/MB. As time passed and storage became cheaper, the methods used to reduce duplicate data started to negatively impact an application's response time: normalization and other deduplication techniques, combined with multiple join queries over massive amounts of data, no longer helped as much.
It's no surprise that challenges to this model would eventually emerge. As the authors of the book Fundamentals of Software Architecture (https://www.amazon.com/dp/1492043451/) note, definitive solutions don't exist; instead, we are presented with many solutions, each accompanied by its own set of benefits and drawbacks.
Obviously, the same applies to databases.
There is no one-size-fits-all solution when it comes to data storage solutions.
In the 2000s, new storage solutions, such as NoSQL databases, began to gain popularity, and architects had more options to choose from. This doesn't mean that SQL stopped being relevant, but rather that architects must now navigate the complexities of choosing the right paradigm for each problem.
As the database landscape went through these phases, application scenarios changed as well. Discussions moved toward the motivations and challenges of adopting a microservices architecture style, bringing us back to the multiple persistence strategies available. Traditionally, architectures included relational database solutions with one or two instances (given their high cost). Now, as new storage solutions mature, architectures start to include persistence based on NoSQL databases, scaling up to multiple running instances. The possibility of storing data in multiple ways, across the different services that compose a single broader solution, creates a good environment for polyglot persistence.
Polyglot persistence is the idea that an application can use different database types to take advantage of the fact that different database engines are better equipped to handle different problems. Complex applications often involve different types of problems, so choosing the right tool for each job can be more productive than trying to solve every aspect of the problem with a single solution.
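The idea can be sketched in a few lines of Java. The two stores below are hypothetical in-memory stand-ins (not real drivers or APIs from any library): one mimics a relational table holding durable, structured order rows, while the other mimics a key-value store such as one holding short-lived session blobs. The point is only the shape of the design: each part of the application talks to the store that fits its data.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of polyglot persistence: one application, two storage
// paradigms, each chosen for the data it handles best. Both stores are
// illustrative in-memory stand-ins, not real database clients.
public class PolyglotPersistenceSketch {

    // Relational-style store: structured rows keyed by a typed id,
    // standing in for a JDBC/JPA-backed "orders" table.
    static class RelationalOrderStore {
        private final Map<Long, String> rows = new HashMap<>();
        void save(long orderId, String orderRow) { rows.put(orderId, orderRow); }
        String findById(long orderId) { return rows.get(orderId); }
    }

    // Key-value store: schemaless payloads keyed by string, standing in
    // for something like a cache holding volatile session data.
    static class KeyValueSessionStore {
        private final Map<String, String> entries = new HashMap<>();
        void put(String sessionId, String payload) { entries.put(sessionId, payload); }
        String get(String sessionId) { return entries.get(sessionId); }
    }

    public static void main(String[] args) {
        RelationalOrderStore orders = new RelationalOrderStore();
        KeyValueSessionStore sessions = new KeyValueSessionStore();

        // Durable, joinable business data goes to the relational store;
        // short-lived session state goes to the key-value store.
        orders.save(42L, "order(42, 'keyboard', 2)");
        sessions.put("sess-abc", "{\"user\":42,\"cart\":[\"keyboard\"]}");

        System.out.println(orders.findById(42L));
        System.out.println(sessions.get("sess-abc"));
    }
}
```

In a real system each store would sit behind its own service or repository interface, so that swapping the underlying engine does not ripple through the rest of the application.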
When evaluating solutions today, we developers and architects are confronted with the complexity of choice. How do we handle data in a scenario that involves multiple data types? To be clear, we're talking about mixing and matching hundreds of possible solutions. The best path is to prepare by learning persistence fundamentals, best practices, and paradigms, and to accept that no matter how much we desire a fast, scalable, highly available, precise, and consistent solution, the CAP theorem (a concept discussed later in this chapter) tells us that may be impossible.
Next, we’ll narrow down our focus specifically to persistence within the context of Java applications.