PostgreSQL's journey of exploration-LINUX-php.cn

Home

System Tutorial

LINUX

PostgreSQL's journey of exploration

王林

Jan 17, 2024 am 08:15 AM

linux linux tutorial Red Hat linux system linux command linux certification red hat linux linux video

PostgreSQLs journey of exploration

Postgres has several index types, and every new version seems to add some new index types. Each index type is useful, but which type to use depends on (1) the type of data, sometimes (2) the underlying data in the table and (3) the type of lookup performed. In the following content, we will introduce the index types you can use in Postgres, and when you should use which index type. Before we get started, here's a list of index types we'll walk you through:

B-Tree
Generalized Inverted Index (GIN)
Generalized Inverted Seach Tree (GiST)
Space partitioned GiST (SP-GiST)
Block Range Index (BRIN)
Hash
Now let’s start with indexing.

In Postgres, B-Tree indexes are the most common indexes you use

If you have a computer science degree, then B-Tree indexing may be the first index you learn. B-tree indexes create a tree that always maintains its balance. When it looks for something based on the index, it walks the tree to find the key and returns the data you're looking for. Using an index is significantly faster than a sequential scan because it only needs to read a few pages (when you are returning only a few records) as opposed to sequentially scanning thousands of records.

If you run a standard CREATE INDEX statement, it will create a B-tree index for you. B-tree indexes are valuable on most data types, such as text, numbers, and timestamps. If you're just starting to use indexes in your database, and don't use too many of Postgres' advanced features on your database, using a standard B-Tree index is probably your best option.

GIN index for multi-valued columns

Generalized Inverted Index, generally called GIN, is mostly suitable for data types when a single column contains multiple values.

According to Postgres documentation:

"GIN is designed to handle situations where the entry being indexed is a compound value, and the query processed by the index needs to search for values that occur in the compound entry. For example, this entry might be a document, and the query can search for the value contained in the document. Specify characters."

The most common data types included in this range are:

hStore
Array
Range
JSONB
One of the most satisfying things about GIN indexes is their ability to understand data stored in composite values. However, because a GIN index requires specific knowledge of the data structure for each individual type being added, not all data types are supported by the GIN index.

GiST index, for rows with overlapping values

Generalized Inverted Seach Tree (GiST) index is mostly suitable when your data overlaps with other rows of data in the same column. The best use of GiST indexes is if you declare a geometry data type and you want to know whether two polygons contain some points. In one case a particular point may be contained in a box, while at the same time, other points only exist in a polygon. Common data types indexed using GiST are:

Geometry type
Text type requiring full-text search
There are many fixed limits on the size of GiST indexes, otherwise, GiST indexes may become extremely large. At the cost of this, GiST indexes are lossy (inexact).

According to official documentation:

"GiST indexes are lossy, which means that the index may produce false matches, so it is necessary to check the real table rows to eliminate false matches. (PostgreSQL will automatically perform this action when necessary)"

This does not mean that you will get a false result, it just means that Postgres will do a small extra work to filter these false results before returning the data to you.

Special note: GIN and GiST indexes can often be used on the same data type. Usually one has good performance but takes up a lot of disk space, and vice versa. When it comes to GIN vs. GiST, there is no one-size-fits-all solution that will work in every situation, but the above rules should apply to most common situations.

SP-GiST index for larger data

The spatially partitioned GiST (SP-GiST) index adopts the spatially partitioned tree from Purdue Research. SP-GiST indexes are often used when your data has a natural clustering factor and is not a balanced tree. Phone numbers are a great example (at least US phone numbers are). They have the following format:

3-digit area code
3-digit prefix number (related to old telephone exchanges)
4-digit line number
This means that there is a natural clustering factor at the first three digits of the first set, followed by the second set of three digits, and then the numbers are evenly distributed. However, in some area codes of phone numbers, there is a higher saturation state than in others. The result can be a very unbalanced tree. Because there is a natural aggregation factor at the front and the data is not equally distributed, data like phone numbers might be a good case for SP-GiST.

BRIN index, for larger data

Block range indexes (BRIN) focus on some situations like SP-GiST, they are best used when the data has some natural ordering, and the data volume is often large. If you have a billion records in chronological order, BRIN may come in handy. If you are querying a large set of data that is naturally grouped, such as several zip codes, BRIN can help you ensure that similar zip codes are stored in close locations on disk.

When you have a very large database sorted by date or zip code, the BRIN index allows you to skip or exclude some unnecessary data very quickly. Additionally, BRIN indexes are relatively small compared to the overall data size, so when you have a large data set, BRIN indexes can perform better.

Hash index, finally not afraid of crash

Hash indexes have been present in Postgres for many years, however, until Postgres 10 was released, there was a huge caveat about their use, it was not WAL-logged. This means that if your server crashes and you can't failover to a standby or restore from an archive using something like wal-g, then you will lose that index until you rebuild it. With the release of Postgres 10, they are now WAL-logged, so you may consider using them again, but the real question is, should you?

Hash indexes sometimes provide faster lookups than B-Tree indexes, and are also fast to create. The biggest problem is that they are restricted to only "equality" comparison operations, so you can only use them for exact match lookups. This makes hash indexes much less flexible than commonly used B-Tree indexes, and you shouldn't think of them as a replacement, but as an index for special cases.

Which one should you use?

We have just introduced a lot, and it is normal if you are a little scared. If you know this before, CREATE INDEX will always create an index for you using a B-Tree, and the good news is that Postgres performs very well or very well for most databases. :) If you're considering using more Postgres features, here's a cheat sheet when you use other Postgres index types:

B-Tree - suitable for most data types and queries
GIN - for JSONB/hstore/arrays
GiST - suitable for full-text search and geometric data types
SP-GiST - suitable for large data sets that have natural aggregation factors but are unevenly distributed
BRIN - suitable for really large data sets with sequential order
Hash - good for equality operations, but usually a B-Tree index is still all you need.
If you have any questions or feedback about this article, feel free to join our slack channel.

The above is the detailed content of PostgreSQL's journey of exploration. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

R.E.P.O. Best Graphic Settings

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Assassin's Creed Shadows: Seashell Riddle Solution

2 weeks ago By DDD

R.E.P.O. How to Fix Audio if You Can't Hear Anyone

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

R.E.P.O. Chat Commands and How to Use Them

4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7519

CakePHP Tutorial

1378

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers

Related knowledge

How to start nginx in Linux Apr 14, 2025 pm 12:51 PM

Steps to start Nginx in Linux: Check whether Nginx is installed. Use systemctl start nginx to start the Nginx service. Use systemctl enable nginx to enable automatic startup of Nginx at system startup. Use systemctl status nginx to verify that the startup is successful. Visit http://localhost in a web browser to view the default welcome page.

How to check whether nginx is started Apr 14, 2025 pm 01:03 PM

How to confirm whether Nginx is started: 1. Use the command line: systemctl status nginx (Linux/Unix), netstat -ano | findstr 80 (Windows); 2. Check whether port 80 is open; 3. Check the Nginx startup message in the system log; 4. Use third-party tools, such as Nagios, Zabbix, and Icinga.

How to start nginx server Apr 14, 2025 pm 12:27 PM

Starting an Nginx server requires different steps according to different operating systems: Linux/Unix system: Install the Nginx package (for example, using apt-get or yum). Use systemctl to start an Nginx service (for example, sudo systemctl start nginx). Windows system: Download and install Windows binary files. Start Nginx using the nginx.exe executable (for example, nginx.exe -c conf\nginx.conf). No matter which operating system you use, you can access the server IP

How to solve nginx304 error Apr 14, 2025 pm 12:45 PM

Answer to the question: 304 Not Modified error indicates that the browser has cached the latest resource version of the client request. Solution: 1. Clear the browser cache; 2. Disable the browser cache; 3. Configure Nginx to allow client cache; 4. Check file permissions; 5. Check file hash; 6. Disable CDN or reverse proxy cache; 7. Restart Nginx.

How to check whether nginx is started? Apr 14, 2025 pm 12:48 PM

In Linux, use the following command to check whether Nginx is started: systemctl status nginx judges based on the command output: If "Active: active (running)" is displayed, Nginx is started. If "Active: inactive (dead)" is displayed, Nginx is stopped.

How to solve nginx403 error Apr 14, 2025 pm 12:54 PM

The server does not have permission to access the requested resource, resulting in a nginx 403 error. Solutions include: Check file permissions. Check the .htaccess configuration. Check nginx configuration. Configure SELinux permissions. Check the firewall rules. Troubleshoot other causes such as browser problems, server failures, or other possible errors.

How to clean nginx error log Apr 14, 2025 pm 12:21 PM

The error log is located in /var/log/nginx (Linux) or /usr/local/var/log/nginx (macOS). Use the command line to clean up the steps: 1. Back up the original log; 2. Create an empty file as a new log; 3. Restart the Nginx service. Automatic cleaning can also be used with third-party tools such as logrotate or configured.

Difference between centos and ubuntu Apr 14, 2025 pm 09:09 PM

The key differences between CentOS and Ubuntu are: origin (CentOS originates from Red Hat, for enterprises; Ubuntu originates from Debian, for individuals), package management (CentOS uses yum, focusing on stability; Ubuntu uses apt, for high update frequency), support cycle (CentOS provides 10 years of support, Ubuntu provides 5 years of LTS support), community support (CentOS focuses on stability, Ubuntu provides a wide range of tutorials and documents), uses (CentOS is biased towards servers, Ubuntu is suitable for servers and desktops), other differences include installation simplicity (CentOS is thin)

See all articles