What are computer clusters

Data cluster

The term "cluster" is used in many fields and means something like "accumulation" or "pile". In astronomy, for example, adjacent stars can form a star cluster. In IT, a computer cluster is a group of computers that work closely together, either to solve a problem jointly (high-performance computing cluster) or to increase availability (failover cluster). In general, the term "cluster" describes things that belong together.

Computer science also knows another kind of cluster: the data cluster. Data clusters arrange data so that related data can be accessed as efficiently as possible. They are therefore an important means of optimizing databases. Computer clusters, however, are also common in connection with databases. The phrase "use a cluster to improve database performance" could therefore refer to either a computer cluster or a data cluster. To avoid this confusion, note that in this chapter "cluster" always refers to a data cluster.

The simplest data clusters in relational databases are the rows of a table. Whenever possible, each row is stored in a single data block. Only if a row does not fit into one data block is it split across several. This happens mainly with LOB columns.
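The row-to-block placement described above can be pictured with a small toy model. This is only a sketch under simplifying assumptions: the function `place_rows`, the 8192-byte `BLOCK_SIZE`, and the packing strategy are hypothetical and chosen for illustration. Real storage engines use their own block layouts, headers, and overflow mechanisms.

```python
BLOCK_SIZE = 8192  # assumed block size in bytes; real systems vary


def place_rows(row_sizes, block_size=BLOCK_SIZE):
    """Toy model: return, for each row, the list of block numbers it occupies.

    A row is kept in one block whenever it fits into a block at all.
    Only a row larger than a whole block is split across several blocks.
    """
    placements = []
    block = 0          # number of the block currently being filled
    free = block_size  # free bytes left in that block
    for size in row_sizes:
        if size <= block_size:
            # The row fits into a single block; open a new one if needed.
            if size > free:
                block += 1
                free = block_size
            placements.append([block])
            free -= size
        else:
            # Oversized row (e.g. a large LOB value): start on an empty
            # block and spill over into as many blocks as necessary.
            if free < block_size:
                block += 1
            needed = -(-size // block_size)  # ceiling division
            placements.append(list(range(block, block + needed)))
            block += needed - 1
            free = block_size - (size - (needed - 1) * block_size)
    return placements


# Three small rows and one row that is larger than a block.
print(place_rows([3000, 3000, 3000, 20000]))
```

In this model, the small rows each occupy exactly one block, while the 20000-byte row is the only one spread over several blocks.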

Indices make it possible to cluster data deliberately. The underlying concept was already introduced in Chapter 1, "Anatomy of a SQL Index": an index is an ordered representation of the indexed data. Similar values are stored next to each other, so an index forms a cluster of rows with similar values. This ability to cluster data is so important that I call it the "second power of indexing".
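The clustering effect of an ordered index can be illustrated with a minimal sketch. It is not how a database implements an index: the sample rows, the column name, and the `lookup` helper are made up for this example, and a sorted Python list stands in for the index's ordered leaf entries.

```python
import bisect

# Hypothetical table rows in arbitrary storage order: (row_id, last_name).
rows = [
    (0, "Winand"),
    (1, "Adams"),
    (2, "Baker"),
    (3, "Winand"),
    (4, "Clark"),
    (5, "Baker"),
]

# Toy "index" on last_name: (key, row_id) pairs kept in sorted order.
index = sorted((name, row_id) for row_id, name in rows)
keys = [key for key, _ in index]


def lookup(key):
    """Return (start, end, entries) for all index entries equal to key.

    Because the index is ordered, matching entries always form one
    contiguous slice: equal values are clustered next to each other.
    """
    start = bisect.bisect_left(keys, key)
    end = bisect.bisect_right(keys, key)
    return start, end, index[start:end]


start, end, entries = lookup("Baker")
print(start, end, entries)
```

In the table, the two "Baker" rows are separated by other rows. In the sorted index, their entries are adjacent, so a single contiguous range covers them.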


Traversing the tree is the primary power of indexing.

Clustering is the second power of indexing.

The following sections explain how to use this power to improve query speed.