Case study: analyzing occasionally slow INSERT VALUES statements

2024-05-24 08:36


The application's INSERT VALUES statements occasionally became slow, but by the time I went to look at the active sessions the slow-write problem had already eased.
It later turned out that each slow-write episode lasted less than half a minute, with INSERT VALUES taking 1-2 s. A script that periodically captured active sessions was still able to catch the session information:

     wait_event      | count
---------------------+-------
 [null]              |    11
 WALRead             |     1
 DataFileRead        |     2
 BgWriterMain        |     1
 WALWrite            |    40
 AutoVacuumMain      |     1
 ClientRead          |   385
 LogicalLauncherMain |     1
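The capture script itself is not included here; a minimal sketch of the kind of query that could produce a summary like the one above (sampled repeatedly, e.g. once per second) might be:

-- Sketch only: count sessions per wait event from pg_stat_activity.
SELECT wait_event, count(*)
FROM pg_stat_activity
GROUP BY wait_event
ORDER BY count(*) DESC;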

The most anomalous wait event is WALWrite, with 40 sessions.
Two of the sessions waiting on WALWrite look like this:

  pid   |  usename  |          xact_start           |         state_change          |  wait_event   | wait_event_type | state  |     partofquery
--------+-----------+-------------------------------+-------------------------------+---------------+-----------------+--------+----------------------
 144955 | lzluser11 | 2024-05-23 07:58:27.516574+08 | 2024-05-23 07:58:27.516588+08 | WALWrite      | LWLock          | active | insert into table1(
 179869 | lzluser11 | 2024-05-23 07:58:28.116371+08 | 2024-05-23 07:58:28.116386+08 | WALWrite      | IO              | active | insert into table1(

Searching the PostgreSQL source directly for content related to WALWrite:

 * WALWriteLock: must be held to write WAL buffers to disk (XLogWrite or
 * XLogFlush).
/*
 * LWLockAcquireOrWait - Acquire lock, or wait until it's free
 *
 * The semantics of this function are a bit funky.  If the lock is currently
 * free, it is acquired in the given mode, and the function returns true.  If
 * the lock isn't immediately free, the function waits until it is released
 * and returns false, but does not acquire the lock.
 *
 * This is currently used for WALWriteLock: when a backend flushes the WAL,
 * holding WALWriteLock, it can flush the commit records of many other
 * backends as a side-effect.  Those other backends need to wait until the
 * flush finishes, but don't need to acquire the lock anymore.  They can just
 * wake up, observe that their records have already been flushed, and return.
 */

WALWriteLock must be held when WAL is written from the WAL buffers to disk.
When a backend flushes WAL while holding WALWriteLock, it can also flush other backends' commit records as a side effect. Those other backends have to wait for that flush to finish, but they no longer need to acquire the lock themselves: once their WAL has been flushed, they can simply return (instead of flushing it again).

XLogFlush is extremely important; its key code is in the for loop:

/*
 * Ensure that all XLOG data through the given position is flushed to disk.
 *
 * NOTE: this differs from XLogWrite mainly in that the WALWriteLock is not
 * already held, and we try to avoid acquiring it if possible.
 */
void
XLogFlush(XLogRecPtr record)
{
...

    /*
     * Now wait until we get the write lock, or someone else does the flush
     * for us.
     */
    for (;;)
    {
        XLogRecPtr  insertpos;

        /* read LogwrtResult and update local state */
        SpinLockAcquire(&XLogCtl->info_lck);
        if (WriteRqstPtr < XLogCtl->LogwrtRqst.Write)
            WriteRqstPtr = XLogCtl->LogwrtRqst.Write;
        LogwrtResult = XLogCtl->LogwrtResult;
        SpinLockRelease(&XLogCtl->info_lck);

        /* done already? */
        if (record <= LogwrtResult.Flush)
            break;

        /*
         * Before actually performing the write, wait for all in-flight
         * insertions to the pages we're about to write to finish.
         */
        insertpos = WaitXLogInsertionsToFinish(WriteRqstPtr);

        /*
         * Try to get the write lock. If we can't get it immediately, wait
         * until it's released, and recheck if we still need to do the flush
         * or if the backend that held the lock did it for us already. This
         * helps to maintain a good rate of group committing when the system
         * is bottlenecked by the speed of fsyncing.
         */
        if (!LWLockAcquireOrWait(WALWriteLock, LW_EXCLUSIVE))
        {
            /*
             * The lock is now free, but we didn't acquire it yet. Before we
             * do, loop back to check if someone else flushed the record for
             * us already.
             */
            continue;
        }

        /* Got the lock; recheck whether request is satisfied */
        LogwrtResult = XLogCtl->LogwrtResult;
        if (record <= LogwrtResult.Flush)
        {
            LWLockRelease(WALWriteLock);
            break;
        }

        /*
         * Sleep before flush! By adding a delay here, we may give further
         * backends the opportunity to join the backlog of group commit
         * followers; this can significantly improve transaction throughput,
         * at the risk of increasing transaction latency.
         *
         * We do not sleep if enableFsync is not turned on, nor if there are
         * fewer than CommitSiblings other backends with active transactions.
         */
        if (CommitDelay > 0 && enableFsync &&
            MinimumActiveBackends(CommitSiblings))
        {
            pg_usleep(CommitDelay);

            /*
             * Re-check how far we can now flush the WAL. It's generally not
             * safe to call WaitXLogInsertionsToFinish while holding
             * WALWriteLock, because an in-progress insertion might need to
             * also grab WALWriteLock to make progress. But we know that all
             * the insertions up to insertpos have already finished, because
             * that's what the earlier WaitXLogInsertionsToFinish() returned.
             * We're only calling it again to allow insertpos to be moved
             * further forward, not to actually wait for anyone.
             */
            insertpos = WaitXLogInsertionsToFinish(insertpos);
        }

        /* try to write/flush later additions to XLOG as well */
        WriteRqst.Write = insertpos;
        WriteRqst.Flush = insertpos;

        XLogWrite(WriteRqst, false);

        LWLockRelease(WALWriteLock);
        /* done */
        break;
    }
...
} 

XLogFlush is the main function for flushing dirty WAL:

  1. Check whether the dirty WAL that needs flushing has already been flushed by someone else; if so, return immediately.
  2. Try to acquire WALWriteLock in exclusive mode, retrying until the lock is obtained.
  3. Once the lock is held, check again whether the dirty WAL has already been flushed by someone else; if so, release WALWriteLock and return (while waiting for the lock, the dirty WAL may have been flushed by another backend, in which case there is nothing left to do).
  4. If commit_delay is set and at least commit_siblings other backends have active transactions, sleep for commit_delay microseconds and then advance the WAL write position, forming a group commit. This branch is never taken at the moment, because CommitDelay defaults to 0, which effectively means group commit is disabled.
  5. Call XLogWrite to write the log, and release WALWriteLock when the write is done.

When flushing dirty WAL, XLogFlush first checks whether the requested dirty WAL has already been written; if not, it holds WALWriteLock until XLogWrite has finished writing the log. XLogWrite is the function that performs the actual WAL write, e.g. deciding which page and which position to write to.

Back to the wait events of the active sessions from earlier. IO:WALWrite (waiting on I/O) is easy to understand, but how do we confirm whether LWLock:WALWrite is a problem?
From the logic of XLogFlush we know that WALWriteLock is the exclusive LWLock PostgreSQL acquires when writing dirty WAL (which is easy to understand: WAL commit records are written sequentially, so they can only be written under an exclusive lock rather than by whoever writes fastest, otherwise the data could easily be corrupted), meaning WAL commit records are written serially.
With this logic in mind, looking back at pg_stat_activity we find only 1 session in IO:WALWrite but dozens in LWLock:WALWrite.
Although we cannot see the LWLock blocking chain directly, the source code tells us that the LWLock:WALWrite sessions are waiting on the IO:WALWrite session.
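One way to see this split directly (a sketch, not the original capture script) is to group the waiting sessions by wait_event_type as well:

-- Sketch only: distinguish the LWLock vs IO flavours of the WALWrite wait.
SELECT wait_event_type, count(*)
FROM pg_stat_activity
WHERE wait_event = 'WALWrite'
GROUP BY wait_event_type;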

The official documentation has a passage on XLogFlush and tuning the WAL buffers:

Normally, WAL buffers should be written and flushed by an XLogFlush request, which is made, for the most part, at transaction commit time to ensure that transaction records are flushed to permanent storage. On systems with high WAL output, XLogFlush requests might not occur often enough to prevent XLogInsertRecord from having to do writes. On such systems one should increase the number of WAL buffers by modifying the wal_buffers parameter. When full_page_writes is set and the system is very busy, setting wal_buffers higher will help smooth response times during the period immediately following each checkpoint.

Normally the WAL buffers are flushed by XLogFlush, e.g. when a transaction commits and its WAL is written to disk. If the WAL volume is high but XLogFlush is not triggered often enough (i.e. the workload is all large transactions), XLogInsertRecord has to write out not-yet-committed WAL records itself, which means wal_buffers is too small; in that case, increasing wal_buffers helps smooth the system's response times a little.

There are two commonly used internal WAL functions: XLogInsertRecord and XLogFlush. XLogInsertRecord is used to place a new record into the WAL buffers in shared memory. If there is no space for the new record, XLogInsertRecord will have to write (move to kernel cache) a few filled WAL buffers

Combined with a passage from the comments in the XLogInsertRecord function:

	 * We have now done all the preparatory work we can without holding a
	 * lock or modifying shared state. From here on, inserting the new WAL
	 * record to the shared WAL buffer cache is a two-step process:
	 *
	 * 1. Reserve the right amount of space from the WAL. The current head of
	 *    reserved space is kept in Insert->CurrBytePos, and is protected by
	 *    insertpos_lck.
	 *
	 * 2. Copy the record to the reserved WAL space. This involves finding the
	 *    correct WAL buffer containing the reserved space, and copying the
	 *    record in place. This can be done concurrently in multiple processes.

The XLogInsertRecord function is used to place a new WAL record into the WAL buffers:

  1. Reserve the required amount of space in the WAL.
  2. Copy the WAL record into the reserved WAL space (i.e. the reserved space in the WAL buffers). Multiple processes can do this concurrently.

Copying WAL records into the WAL buffers can be done in parallel, and since it is an in-memory copy done concurrently, it is hard to call it a bottleneck.
XLogFlush is a different story: while writing, it holds an exclusive LWLock the whole time. So in this high-concurrency, small-transaction scenario, increasing the WAL buffers is unlikely to help much in theory.
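One way to sanity-check this (a sketch, assuming PostgreSQL 14 or later, where the pg_stat_wal view exists) is to look at how often WAL had to be written because the buffers filled up:

-- wal_buffers_full counts writes forced by full WAL buffers; if it barely
-- grows under load, a larger wal_buffers is unlikely to help (PG 14+).
SHOW wal_buffers;
SELECT wal_buffers_full, wal_write, wal_sync FROM pg_stat_wal;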

At this point we can rule out adjusting the wal_buffers memory and focus on I/O instead. Looking again at the counts of IO-type waits in pg_stat_activity:

DataFileRead	4
DataFileExtend	1
WALWrite		1
WALRead			1

The slow INSERT VALUES lasted less than a minute, and there is nothing abnormal the rest of the time. But in the routine session samples, the IO-type WALWrite wait is almost always present:

  pid   |  usename  |          xact_start           |         state_change          |  wait_event   | wait_event_type | state  |     partofquery
--------+-----------+-------------------------------+-------------------------------+---------------+-----------------+--------+----------------------
  72668 | lzluser11 | 2024-05-23 09:32:20.828394+08 | 2024-05-23 09:32:20.82841+08  | WALWrite      | IO              | active | insert into table1(
  77215 | lzluser11 | 2024-05-23 09:33:01.342541+08 | 2024-05-23 09:33:01.342552+08 | WALWrite      | IO              | active | insert into table1
  94904 | lzluser11 | 2024-05-23 09:34:32.442309+08 | 2024-05-23 09:34:32.442323+08 | WALWrite      | IO              | active | insert into table1
  88024 | lzluser11 | 2024-05-23 09:36:28.779086+08 | 2024-05-23 09:36:28.779311+08 | WALWrite      | IO              | active | UPDATE table2 SET
 103236 | lzluser11 | 2024-05-23 09:37:04.144283+08 | 2024-05-23 09:37:04.144302+08 | WALWrite      | IO              | active | insert into table1
  47342 | lzluser11 | 2024-05-23 09:37:09.192683+08 | 2024-05-23 09:37:09.192699+08 | WALWrite      | IO              | active | insert into table1
  75399 | lzluser11 | 2024-05-23 09:45:30.743023+08 | 2024-05-23 09:45:30.743024+08 | WALWrite      | IO              | active | update table1
 221993 | lzluser11 | 2024-05-23 09:46:16.184532+08 | 2024-05-23 09:46:16.184541+08 | WALWrite      | IO              | active | insert into table1

However, checking the I/O performance at that time, writes were only about 15 MB/s, which is not high (in fact rather low compared with other periods), and w_await was likewise very low:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
dm-322            0.00     0.00  187.00 1515.00  3572.00 15344.00    22.23     2.05    1.20    9.39    0.18   0.15  25.70

There is no strong evidence pointing to a storage performance problem.

For now, the cause looks like a momentary burst of concurrent small INSERT VALUES transactions contending for the lock taken when flushing WAL. The following options can be ruled out:

  1. The workload is concurrent small transactions, so there is no need to adjust the WAL buffers.
  2. The WAL volume is not large, so there is no need to enable WAL compression.
  3. There are not many FPIs (full-page images), so there is no need to tune checkpoints.
  4. I/O pressure is low, so there is no need to improve I/O performance.

At the very least, the following optimizations are available:

  1. Enable group commit in the database (can be left alone if you are worried about taking the blame; it needs further testing first; see the sketch after this list).
  2. Have the application merge individual INSERT VALUES statements and commit them in batches, reducing WALWriteLock contention.
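A rough sketch of both changes (the parameter values and the column names c1/c2 are illustrative assumptions, not taken from the actual case):

-- Group commit (item 1): commit_delay is in microseconds; values must be
-- validated against the real workload before use.
ALTER SYSTEM SET commit_delay = 100;
ALTER SYSTEM SET commit_siblings = 5;
SELECT pg_reload_conf();

-- Batching (item 2): replace many single-row statements, each paying for its
-- own WAL flush at commit, with one multi-row insert in one transaction.
insert into table1 (c1, c2) values
    (1, 'a'),
    (2, 'b'),
    (3, 'c');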


