We started with problems related to isolation, made a digression about low-level data structure, then discussed row versions and observed how data snapshots are obtained from row versions.

Last time we talked about HOT updates and in-page vacuuming, and today we'll proceed to the well-known vacuum vulgaris. So much has already been written about it that I can hardly add anything new, but a complete picture requires sacrifice. So please be patient.

Vacuum

What does vacuum do?

In-page vacuum works fast, but frees only part of the space. It works within one table page and does not touch indexes.

The basic, «normal» vacuum is done using the VACUUM command, and we will call it just «vacuum» (leaving «autovacuum» for a separate discussion).

So, vacuum processes the entire table. It vacuums away not only dead tuples, but also references to them from all indexes.

Vacuuming is concurrent with other activities in the system. The table and indexes can be used in a regular way both for reads and updates (however, concurrent execution of commands such as CREATE INDEX, ALTER TABLE and some others is impossible).

Vacuum looks only at those table pages where some activity has taken place. To detect them, it uses the visibility map (as a reminder, the map tracks pages that contain only sufficiently old tuples, which are certainly visible in all data snapshots). Only pages not tracked by the visibility map are processed, and the map itself gets updated.

The free space map also gets updated in the process to reflect the extra free space in the pages.

As usual, let's create a table:

=> CREATE TABLE vac(
  id serial,
  s char(100)
) WITH (autovacuum_enabled = off);
=> CREATE INDEX vac_s ON vac(s);
=> INSERT INTO vac(s) VALUES ('A');
=> UPDATE vac SET s = 'B';
=> UPDATE vac SET s = 'C';

We use the autovacuum_enabled parameter to turn the autovacuum process off. We will discuss it next time, and now it is critical for our experiments that we manually control vacuuming.
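
To confirm that the storage parameter took effect, one option is to look at the table's entry in pg_class, where per-table settings are kept in the reloptions column. This is a minimal sketch, assuming the vac table created above:

```sql
-- Per-table storage parameters are stored in pg_class.reloptions
SELECT relname, reloptions
FROM pg_class
WHERE oid = 'vac'::regclass;

-- Once the experiments are over, the per-table override can be dropped
-- so that the table follows the server-wide autovacuum setting again:
-- ALTER TABLE vac RESET (autovacuum_enabled);
```

The RESET statement is left commented out on purpose: the experiments in this article rely on autovacuum staying off for this table.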

The table now has three tuples, each of which is referenced from the index:

=> SELECT * FROM heap_page('vac',0);
 ctid  | state  |   xmin   |   xmax   | hhu | hot | t_ctid
-------+--------+----------+----------+-----+-----+--------
 (0,1) | normal | 4000 (c) | 4001 (c) |     |     | (0,2)
 (0,2) | normal | 4001 (c) | 4002     |     |     | (0,3)
 (0,3) | normal | 4002     | 0 (a)    |     |     | (0,3)
(3 rows)
=> SELECT * FROM index_page('vac_s',1);
 itemoffset | ctid
------------+-------
          1 | (0,1)
          2 | (0,2)
          3 | (0,3)
(3 rows)

After vacuuming, dead tuples get vacuumed away, and only one, live, tuple remains. And only one reference remains in the index:

=> VACUUM vac;
=> SELECT * FROM heap_page('vac',0);
 ctid  | state  |   xmin   | xmax  | hhu | hot | t_ctid
-------+--------+----------+-------+-----+-----+--------
 (0,1) | unused |          |       |     |     |
 (0,2) | unused |          |       |     |     |
 (0,3) | normal | 4002 (c) | 0 (a) |     |     | (0,3)
(3 rows)
=> SELECT * FROM index_page('vac_s',1);
 itemoffset | ctid
------------+-------
          1 | (0,3)
(1 row)

Note that the first two pointers acquired the status «unused» instead of «dead», which they would acquire with in-page vacuum.
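
The effect of VACUUM on the two maps mentioned earlier can be inspected as well. The sketch below assumes that the contrib extensions pg_visibility and pg_freespacemap are available on the server (they are not used elsewhere in this article); it shows the visibility map bits and the free space recorded for page 0 of vac.

```sql
CREATE EXTENSION IF NOT EXISTS pg_visibility;
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;

-- Visibility map bits for block 0 of the table
SELECT all_visible, all_frozen
FROM pg_visibility_map('vac', 0);

-- Free space (in bytes) recorded in the free space map for block 0
SELECT pg_freespace('vac', 0) AS avail_bytes;
```

Running the same two queries before and after VACUUM is a simple way to see that both maps are maintained by the vacuum process.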

About the transaction horizon once again

How does PostgreSQL determine which tuples can be considered dead? We already touched upon the concept of the transaction horizon when discussing data snapshots, but it won't hurt to reiterate such an important matter.

Let's start the previous experiment again.

=> TRUNCATE vac;
=> INSERT INTO vac(s) VALUES ('A');
=> UPDATE vac SET s = 'B';

But before updating the row once again, let one more transaction start (but not end). In this example, it will use the Read Committed level, but it must get a real (not virtual) transaction number. For example, the transaction can change or even just lock certain rows in any table, not necessarily vac:

|  => BEGIN;
|  => SELECT s FROM t FOR UPDATE;
|    s
|  -----
|   FOO
|   BAR
|  (2 rows)
=> UPDATE vac SET s = 'C';

There are three row versions in the table and three references in the index now. What will happen after vacuuming?

=> VACUUM vac;
=> SELECT * FROM heap_page('vac',0);
 ctid  | state  |   xmin   |   xmax   | hhu | hot | t_ctid
-------+--------+----------+----------+-----+-----+--------
 (0,1) | unused |          |          |     |     |
 (0,2) | normal | 4005 (c) | 4007 (c) |     |     | (0,3)
 (0,3) | normal | 4007 (c) | 0 (a)    |     |     | (0,3)
(3 rows)
=> SELECT * FROM index_page('vac_s',1);
 itemoffset | ctid
------------+-------
          1 | (0,2)
          2 | (0,3)
(2 rows)

Two tuples remain in the table: VACUUM decided that the (0,2) tuple cannot be vacuumed yet. The reason, of course, is the transaction horizon of the database, which in this example is determined by the uncompleted transaction:

|  => SELECT backend_xmin FROM pg_stat_activity WHERE pid = pg_backend_pid();
|   backend_xmin
|  --------------
|           4006
|  (1 row)

We can ask VACUUM to report what is happening:

=> VACUUM VERBOSE vac;
INFO:  vacuuming "public.vac"
INFO:  index "vac_s" now contains 2 row versions in 2 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "vac": found 0 removable, 2 nonremovable row versions in 1 out of 1 pages
DETAIL:  1 dead row versions cannot be removed yet, oldest xmin: 4006
There were 1 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Note that:

  • 2 nonremovable row versions: two tuples that cannot be deleted are found in the table.

  • 1 dead row versions cannot be removed yet: one of them is dead.

  • oldest xmin shows the current horizon.

Let's reiterate the conclusion: if a database has long-lived transactions (ones left uncompleted or simply running for a very long time), this can cause table bloat regardless of how often vacuuming happens. Therefore, OLTP- and OLAP-type workloads coexist poorly in one PostgreSQL database: reports running for hours will not let updated tables be duly vacuumed. Creating a separate replica for reporting purposes may be a possible solution.
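
To find the sessions that hold the horizon back, one possible starting point is pg_stat_activity. The query below is only a sketch: it looks at both the transaction number a session has been assigned (backend_xid) and the horizon of its snapshot (backend_xmin), and it does not cover other things that can hold the horizon, such as prepared transactions or replication slots. The column alias held_back_by is our own name.

```sql
-- Sessions ordered by how far back they keep the transaction horizon
SELECT pid,
       state,
       xact_start,
       backend_xid,
       backend_xmin,
       greatest(age(backend_xid), age(backend_xmin)) AS held_back_by
FROM pg_stat_activity
WHERE backend_xid IS NOT NULL
   OR backend_xmin IS NOT NULL
ORDER BY held_back_by DESC;
```

A session in the «idle in transaction» state with an old xact_start at the top of this list is a typical culprit, like the open transaction in our experiment.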

Once the open transaction completes, the horizon moves forward and the situation is resolved:

|  => COMMIT;
=> VACUUM VERBOSE vac;
INFO:  vacuuming "public.vac"
INFO:  scanned index "vac_s" to remove 1 row versions
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  "vac": removed 1 row versions in 1 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  index "vac_s" now contains 1 row versions in 2 pages
DETAIL:  1 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "vac": found 1 removable, 1 nonremovable row versions in 1 out of 1 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4008
There were 1 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Now only the latest, live version of the row is left in the page:

=> SELECT * FROM heap_page('vac',0);
 ctid  | state  |   xmin   | xmax  | hhu | hot | t_ctid
-------+--------+----------+-------+-----+-----+--------
 (0,1) | unused |          |       |     |     |
 (0,2) | unused |          |       |     |     |
 (0,3) | normal | 4007 (c) | 0 (a) |     |     | (0,3)
(3 rows)

The index also has only one row:

=> SELECT * FROM index_page('vac_s',1);
 itemoffset | ctid
------------+-------
          1 | (0,3)
(1 row)

What happens inside?

Vacuuming must process the table and indexes at the same time, and do so without blocking other processes. How can it do that?

Everything starts with the scanning heap phase (taking the visibility map into account, as already mentioned). In the pages read, dead tuples are detected, and their tids are written to a specialized array. The array is stored in the local memory of the vacuum process, where maintenance_work_mem bytes of memory are allocated for it. The default value of this parameter is 64 MB. Note that the full amount of memory is allocated at once, rather than as the need arises. However, if the table is not large, a smaller amount of memory is allocated.

Then we either reach the end of the table or run out of the memory allocated for the array. In either case, the vacuuming indexes phase starts. In this phase, each index created on the table is fully scanned in search of the rows that reference the remembered tuples. The rows found are vacuumed away from index pages.

At this point we face the following situation: the indexes no longer have references to dead tuples, while the table still has the tuples themselves. This does not break anything: when executing a query, we either do not reach the dead tuples (with index access) or reject them at the visibility check (when scanning the table).

After that, the vacuuming heap phase starts. The table is scanned again to read the appropriate pages, vacuum the remembered tuples away from them and release the pointers. We can do this since there are no references from the indexes anymore.

If the table was not entirely read during the first cycle, the array is cleared and everything is repeated from the point we reached.

In summary:

  • The table is always scanned twice.
  • If vacuuming deletes so many tuples that they do not all fit in memory of size maintenance_work_mem, all the indexes will be scanned as many times as needed.

For large tables, this can require a lot of time and add considerable system workload. Of course, queries will not be blocked, but extra input/output is definitely undesirable.

To speed up the process, it makes sense to either call VACUUM more often (so that not too many tuples are vacuumed away each time) or allocate more memory.
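
Allocating more memory does not have to mean changing the server configuration: maintenance_work_mem can be raised for a single session just before a manual vacuum. A minimal sketch, where the 256MB figure is an arbitrary illustration rather than a recommendation:

```sql
-- Raise the limit for the current session only, vacuum, then restore it
SET maintenance_work_mem = '256MB';
VACUUM VERBOSE vac;
RESET maintenance_work_mem;
```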

As an aside, starting with version 11, PostgreSQL can skip index scans unless there is a compelling need for them. This should make life easier for owners of large tables where rows are only added (but not changed).

Monitoring

How can we figure out that VACUUM cannot do its job in one cycle?

We've already seen the first way: to call the VACUUM command with the VERBOSE option. In this case, information about the phases of the process will be output to the console.

Second, starting with version 9.6, the pg_stat_progress_vacuum view is available, which also provides all the necessary information.

(The third way is also available: to output the information to the message log, but this works only for autovacuum, which will be discussed next time.)

Let's insert quite a few rows into the table so that the vacuum process lasts long enough, and let's update all of them so that VACUUM has something to do.

=> TRUNCATE vac;
=> INSERT INTO vac(s) SELECT 'A' FROM generate_series(1,500000);
=> UPDATE vac SET s  = 'B';

Let's reduce the memory size allocated for the array of identifiers:

=> ALTER SYSTEM SET maintenance_work_mem = '1MB';
=> SELECT pg_reload_conf();

Let's start VACUUM and while it is working, let's access the pg_stat_progress_vacuum view several times:

=> VACUUM VERBOSE vac;
|  => SELECT * FROM pg_stat_progress_vacuum \gx
|  -[ RECORD 1 ]------+------------------
|  pid                | 6715
|  datid              | 41493
|  datname            | test
|  relid              | 57383
|  phase              | vacuuming indexes
|  heap_blks_total    | 16667
|  heap_blks_scanned  | 2908
|  heap_blks_vacuumed | 0
|  index_vacuum_count | 0
|  max_dead_tuples    | 174762
|  num_dead_tuples    | 174480
|  => SELECT * FROM pg_stat_progress_vacuum \gx
|  -[ RECORD 1 ]------+------------------
|  pid                | 6715
|  datid              | 41493
|  datname            | test
|  relid              | 57383
|  phase              | vacuuming indexes
|  heap_blks_total    | 16667
|  heap_blks_scanned  | 5816
|  heap_blks_vacuumed | 2907
|  index_vacuum_count | 1
|  max_dead_tuples    | 174762
|  num_dead_tuples    | 174480

Here we can see, in particular:

  • The name of the current phase: we discussed three main phases, but there are more of them in general.

  • The total number of table pages (heap_blks_total).

  • The number of scanned pages (heap_blks_scanned).

  • The number of already vacuumed pages (heap_blks_vacuumed).

  • The number of index vacuum cycles (index_vacuum_count).

The general progress is determined by the ratio of heap_blks_vacuumed to heap_blks_total, but we should take into account that this value changes in large increments rather than smoothly because of scanning the indexes. The main attention, however, should be given to the number of vacuum cycles: a value greater than 1 means that the memory allocated was not enough to complete vacuuming in one cycle.
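
The ratios described above can be computed directly in the query instead of by eye. The following is a sketch that uses only columns shown in the output above; the aliases scanned_pct and vacuumed_pct are our own names.

```sql
-- Progress of vacuum processes running in the current database
SELECT p.pid,
       p.relid::regclass AS table_name,
       p.phase,
       round(100.0 * p.heap_blks_scanned  / nullif(p.heap_blks_total, 0), 1) AS scanned_pct,
       round(100.0 * p.heap_blks_vacuumed / nullif(p.heap_blks_total, 0), 1) AS vacuumed_pct,
       p.index_vacuum_count
FROM pg_stat_progress_vacuum p
WHERE p.datname = current_database();
```

The filter on datname is there because the view lists vacuum processes in all databases, while the cast of relid to regclass resolves names only within the current one.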

The output of the VACUUM VERBOSE command, already completed by that time, will show the general picture:

INFO:  vacuuming "public.vac"
INFO:  scanned index "vac_s" to remove 174480 row versions
DETAIL:  CPU: user: 0.50 s, system: 0.07 s, elapsed: 1.36 s
INFO:  "vac": removed 174480 row versions in 2908 pages
DETAIL:  CPU: user: 0.02 s, system: 0.02 s, elapsed: 0.13 s
INFO:  scanned index "vac_s" to remove 174480 row versions
DETAIL:  CPU: user: 0.26 s, system: 0.07 s, elapsed: 0.81 s
INFO:  "vac": removed 174480 row versions in 2908 pages
DETAIL:  CPU: user: 0.01 s, system: 0.02 s, elapsed: 0.10 s
INFO:  scanned index "vac_s" to remove 151040 row versions
DETAIL:  CPU: user: 0.13 s, system: 0.04 s, elapsed: 0.47 s
INFO:  "vac": removed 151040 row versions in 2518 pages
DETAIL:  CPU: user: 0.01 s, system: 0.02 s, elapsed: 0.08 s
INFO:  index "vac_s" now contains 500000 row versions in 17821 pages
DETAIL:  500000 index row versions were removed.
8778 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "vac": found 500000 removable, 500000 nonremovable row versions in 16667 out of 16667 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4011
There were 0 unused item pointers.
0 pages are entirely empty.
CPU: user: 1.10 s, system: 0.37 s, elapsed: 3.71 s.
VACUUM

We can see here that three cycles over the indexes were done. In each of the first two cycles, 174480 pointers to dead tuples were vacuumed away, and the remaining 151040 were vacuumed away in the third. Why exactly 174480? One tid occupies 6 bytes, and 1024*1024/6 = 174762 (rounded down), which is the number that we see in pg_stat_progress_vacuum.max_dead_tuples. In reality, slightly fewer may be used: this ensures that when the next page is read, all pointers to dead tuples will certainly fit in memory.
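
The same arithmetic can be used to guess, before running VACUUM, how many passes over the indexes to expect. The query below is a rough sketch under stated assumptions: it relies on the 6-bytes-per-tid array described in this article (newer PostgreSQL releases may store dead tids differently), it takes n_dead_tup from the cumulative statistics, which is only an estimate, and it ignores the fact that autovacuum may be governed by a different memory setting. The aliases tids_per_pass and estimated_index_passes are our own names.

```sql
-- Estimated number of index vacuum cycles for a manual VACUUM of "vac"
SELECT s.relname,
       s.n_dead_tup,
       m.bytes / 6 AS tids_per_pass,
       ceil(s.n_dead_tup::numeric / (m.bytes / 6)) AS estimated_index_passes
FROM pg_stat_user_tables s,
     (SELECT pg_size_bytes(current_setting('maintenance_work_mem')) AS bytes) m
WHERE s.relname = 'vac';

-- After the experiment, the server-wide setting changed earlier
-- with ALTER SYSTEM can be returned to its default:
-- ALTER SYSTEM RESET maintenance_work_mem;
-- SELECT pg_reload_conf();
```

With maintenance_work_mem set to 1MB, tids_per_pass comes out as 174762, and about 500000 dead tuples divided by that figure rounds up to 3, which agrees with the three cycles seen in the VACUUM VERBOSE output.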

Analysis

Analysis, or, in other words, collecting statistics for the query planner, is formally unrelated to vacuuming. Nevertheless, we can perform the analysis not only with the ANALYZE command, but also by combining vacuuming and analysis in VACUUM ANALYZE. In that case the vacuum is done first and then the analysis, so the combination gives no performance gain.
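
For reference, the combined command and a way to check when each operation last ran on the table might look as follows (a sketch using the vac table from this article):

```sql
-- Vacuum first, then collect planner statistics, in one command
VACUUM (VERBOSE, ANALYZE) vac;

-- When each operation was last run manually on the table
SELECT relname, last_vacuum, last_analyze
FROM pg_stat_user_tables
WHERE relname = 'vac';
```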

But as we will see later, autovacuum and automatic analysis are done in one process and are controlled in a similar way.

VACUUM FULL

As noted above, vacuum frees more space than in-page vacuum, but still it does not entirely solve the problem.

If for some reason the size of a table or an index has increased a lot, VACUUM will free space inside the existing pages: «holes» will appear there, which will then be used for insertion of new tuples. But the number of pages won't change, and therefore, from the viewpoint of the operating system, the files will occupy exactly the same space as before the vacuum. And this is no good because:

  • Full scan of the table (or index) slows down.
  • A larger buffer cache may be required (since it is pages that are stored there, and the density of useful information decreases).
  • An extra level can appear in the index tree, which will slow down index access.
  • The files occupy extra space on disk and in backup copies.

(The only exception is fully vacuumed pages, located at the end of the file. These pages are trimmed from the file and returned to the operating system.)

If the share of useful information in the files falls below some reasonable limit, the administrator can do VACUUM FULL of the table. In this case, the table and all its indexes are rebuilt from scratch and the data are packed as compactly as possible (taking the fillfactor parameter into account, of course). During the rebuild, PostgreSQL first rebuilds the table and then each of its indexes one by one. For each object, new files are created, and old files are removed at the end of rebuilding. We should take into account that extra disk space will be needed in the process.
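
Since the old files are kept until the rebuild ends, it is worth checking the current sizes before starting. As a rough and pessimistic guide (an assumption on our part, since the rebuilt copy is normally smaller), the current total size can be treated as the amount of extra disk space that may be needed:

```sql
SELECT pg_size_pretty(pg_table_size('vac'))          AS table_size,
       pg_size_pretty(pg_indexes_size('vac'))        AS index_size,
       pg_size_pretty(pg_total_relation_size('vac')) AS total_size;
```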

To illustrate this, let's again insert a certain number of rows into the table:

=> TRUNCATE vac;
=> INSERT INTO vac(s) SELECT 'A' FROM generate_series(1,500000);

How can we estimate the information density? To do this, it's convenient to use a specialized extension:

=> CREATE EXTENSION pgstattuple;
=> SELECT * FROM pgstattuple('vac') \gx
-[ RECORD 1 ]------+---------
table_len          | 68272128
tuple_count        | 500000
tuple_len          | 64500000
tuple_percent      | 94.47
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 38776
free_percent       | 0.06

The function reads the entire table and shows statistics: how much space each kind of data occupies in the files. The field of main interest to us now is tuple_percent: the percentage of useful data. It is less than 100 because of the inevitable overhead inside a page, but is still pretty high.

For the index, different information is output, but the avg_leaf_density field has the same meaning: the percentage of useful information (in leaf pages).

=> SELECT * FROM pgstatindex('vac_s') \gx
-[ RECORD 1 ]------+---------
version            | 3
tree_level         | 3
index_size         | 72802304
root_block_no      | 2722
internal_pages     | 241
leaf_pages         | 8645
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 83.77
leaf_fragmentation | 64.25

And these are the sizes of the table and indexes:

=> SELECT pg_size_pretty(pg_table_size('vac')) table_size,
  pg_size_pretty(pg_indexes_size('vac')) index_size;
 table_size | index_size
------------+------------
 65 MB      | 69 MB
(1 row)

Now let's delete 90% of all rows. We choose the rows to delete at random, so that at least one row is highly likely to remain in each page:

=> DELETE FROM vac WHERE random() < 0.9;
DELETE 450189

What size will the objects have after VACUUM?

=> VACUUM vac;
=> SELECT pg_size_pretty(pg_table_size('vac')) table_size,
  pg_size_pretty(pg_indexes_size('vac')) index_size;
 table_size | index_size
------------+------------
 65 MB      | 69 MB
(1 row)

We can see that the size did not change: VACUUM did nothing to reduce the size of the files, even though the information density dropped roughly tenfold:

=> SELECT vac.tuple_percent, vac_s.avg_leaf_density
FROM pgstattuple('vac') vac, pgstatindex('vac_s') vac_s;
 tuple_percent | avg_leaf_density
---------------+------------------
          9.41 |             9.73
(1 row)

Now let's check what we get after VACUUM FULL. At the moment, the table and index use the following files:

=> SELECT pg_relation_filepath('vac'), pg_relation_filepath('vac_s');
 pg_relation_filepath | pg_relation_filepath
----------------------+----------------------
 base/41493/57392     | base/41493/57393
(1 row)
=> VACUUM FULL vac;
=> SELECT pg_relation_filepath('vac'), pg_relation_filepath('vac_s');
 pg_relation_filepath | pg_relation_filepath
----------------------+----------------------
 base/41493/57404     | base/41493/57407
(1 row)

The files are replaced with new ones now. The sizes of the table and indexes considerably decreased, while the information density increased accordingly:

=> SELECT pg_size_pretty(pg_table_size('vac')) table_size,
  pg_size_pretty(pg_indexes_size('vac')) index_size;
 table_size | index_size
------------+------------
 6648 kB    | 6480 kB
(1 row)
=> SELECT vac.tuple_percent, vac_s.avg_leaf_density
FROM pgstattuple('vac') vac, pgstatindex('vac_s') vac_s;
 tuple_percent | avg_leaf_density
---------------+------------------
         94.39 |            91.08
(1 row)

Note that the information density in the index is even greater than the original one. It is more advantageous to build an index (B-tree) from scratch on the data available than to insert the data into an existing index row by row.

The functions of the pgstattuple extension that we used read the entire table. But this is inconvenient if the table is large, so the extension has the pgstattuple_approx function, which skips the pages marked in the visibility map and shows approximate figures.
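
A call to the approximate function might look like this (a sketch; only some of the columns it returns are selected here):

```sql
-- Approximate statistics: pages marked in the visibility map are skipped
SELECT table_len,
       scanned_percent,
       approx_tuple_percent,
       dead_tuple_percent,
       approx_free_percent
FROM pgstattuple_approx('vac');
```

The scanned_percent column shows what share of the table actually had to be read, which gives an idea of how much work was saved.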

One more way, though even less accurate, is to use the system catalog to roughly estimate the ratio of the data size to the file size. You can find examples of such queries in the PostgreSQL wiki.
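
To give an idea of what such an estimate involves, here is a deliberately simplified sketch of our own (it is not one of the wiki queries). It assumes a fixed 24-byte tuple header, ignores alignment, null bitmaps and page overhead, and depends on up-to-date statistics, so the result should be treated as a rough indication only. The aliases file_bytes and approx_tuple_bytes are our own names.

```sql
-- Refresh reltuples, relpages and pg_stats for the table first
ANALYZE vac;

SELECT c.relname,
       c.relpages::bigint * current_setting('block_size')::bigint AS file_bytes,
       (c.reltuples::float8 * (24 + w.row_width))::bigint          AS approx_tuple_bytes
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
CROSS JOIN LATERAL (
    -- Sum of the average column widths gathered by ANALYZE
    SELECT sum(s.avg_width) AS row_width
    FROM pg_stats s
    WHERE s.schemaname = n.nspname
      AND s.tablename  = c.relname
) w
WHERE c.oid = 'vac'::regclass;
```

Dividing approx_tuple_bytes by file_bytes gives a figure comparable in spirit to tuple_percent, without reading the table itself.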

VACUUM FULL is not intended for regular use since it blocks any work with the table (querying included) for the whole duration of the process. Clearly, for a heavily used system this may be unacceptable. Locks will be discussed separately, and for now we'll only mention the pg_repack extension, which locks the table for only a short period of time at the end of the work.
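
The blocking can be observed directly. VACUUM FULL takes an ACCESS EXCLUSIVE lock on the table, which shows up in pg_locks. In the sketch below, the 5-second lock_timeout is an arbitrary illustration of how to avoid waiting indefinitely for that lock (and making other sessions queue behind the waiting command):

```sql
-- Session 1: give up if the lock cannot be acquired within 5 seconds
SET lock_timeout = '5s';
VACUUM FULL vac;

-- Session 2, while the rebuild is running: locks held on the table
SELECT pid, mode, granted
FROM pg_locks
WHERE locktype = 'relation'
  AND relation = 'vac'::regclass
  AND database = (SELECT oid FROM pg_database
                  WHERE datname = current_database());
```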

Similar commands

There are a few commands that also fully rebuild tables and indexes and therefore resemble VACUUM FULL. All of them fully block any work with the table, and all of them remove old data files and create new ones.

The CLUSTER command is similar to VACUUM FULL in every respect, but it also physically orders tuples according to one of the available indexes. This enables the planner to use index access more efficiently in some cases. But we should bear in mind that clustering is not maintained: the physical order of tuples will be broken by subsequent changes to the table.
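
A minimal sketch with the objects from this article: the table is rewritten in the order of the vac_s index, and the planner statistics are then refreshed to see how well the physical order matches the order of column s.

```sql
-- Rewrite the table in the order of the vac_s index
CLUSTER vac USING vac_s;

-- Refresh statistics; a correlation close to 1 or -1 means that the
-- physical order of rows follows the order of the column values
ANALYZE vac;
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'vac' AND attname = 's';
```

Watching the correlation value drift away from 1 as the table is modified is one way to decide when the table is due for another CLUSTER.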

The REINDEX command rebuilds a separate index on the table. VACUUM FULL and CLUSTER actually use this command to rebuild indexes.
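
For example (a sketch; note that the second form is available only in PostgreSQL 12 and later, which is newer than the version used for the output in this article):

```sql
-- Rebuild one index; writes to the table are blocked while it runs
REINDEX INDEX vac_s;

-- PostgreSQL 12 or later: rebuild the index without blocking writes
REINDEX INDEX CONCURRENTLY vac_s;
```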

The logic of the TRUNCATE command is similar to that of DELETE: it deletes all table rows. But DELETE, as was already mentioned, only marks tuples as deleted, and this requires further vacuuming. TRUNCATE just creates a new, clean file instead. As a rule, this works faster, but we should keep in mind that TRUNCATE will block any work with the table until the end of the transaction.
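
The file replacement is easy to observe with the same pg_relation_filepath function used above. In this sketch the transaction is rolled back, so the table keeps its rows:

```sql
BEGIN;
SELECT pg_relation_filepath('vac');  -- file in use before TRUNCATE
TRUNCATE vac;
SELECT pg_relation_filepath('vac');  -- a new file inside the transaction
ROLLBACK;                            -- the original file and rows are back
SELECT pg_relation_filepath('vac');
```

Between TRUNCATE and ROLLBACK, any other session that tries to read or change the table has to wait, which is the blocking mentioned above.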

Read on.

Translated from: https://habr.com/en/company/postgrespro/blog/484106/
