A look at pgpool-II's health check

In pgpool-II, two configuration file parameters relate to the health check:

health_check_period

health_check_timeout

First, here is how the documentation on the official site explains them:

http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html

health_check_period

This parameter specifies the interval between the health checks in seconds.

Default is 0, which means health check is disabled. You need to reload pgpool.conf if you change health_check_period.


health_check_timeout

pgpool-II periodically tries to connect to the backends to detect any error on the servers or networks. This error check procedure is called "health check".

If an error is detected, pgpool-II tries to perform failover or degeneration.

This parameter serves to prevent the health check from waiting for a long time in a case such as an unplugged network cable. The timeout value is in seconds. Default value is 20.

0 disables timeout (waits until TCP/IP timeout).

This health check requires one extra connection to each backend, so max_connections in the postgresql.conf needs to be incremented as needed. You need to reload pgpool.conf if you change this value.


So what actually happens? Take pgpool-II 3.1 as an example (some unimportant code has been removed to make it easier to read):


/*
 * pgpool main program
 */
int main(int argc, char **argv)
{
    ……

    /*
     * This is the main loop
     */
    for (;;)
    {
        CHECK_REQUEST;

        /* do we need health checking for PostgreSQL? */
        if (pool_config->health_check_period > 0)
        {
            ……

            if (pool_config->health_check_timeout > 0)
            {
                /*
                 * set health checker timeout. we want to detect
                 * communication path failure much earlier before
                 * TCP/IP stack detects it.
                 */
                pool_signal(SIGALRM, health_check_timer_handler);
                alarm(pool_config->health_check_timeout);
            }

            /*
             * do actual health check. trying to connect to the backend
             */
            errno = 0;
            health_check_timer_expired = 0;
            POOL_SETMASK(&UnBlockSig);
            sts = health_check();
            POOL_SETMASK(&BlockSig);

            if (pool_config->parallel_mode || pool_config->enable_query_cache)
                sys_sts = system_db_health_check();

            if ((sts > 0 || sys_sts < 0)
                && (errno != EINTR || (errno == EINTR && health_check_timer_expired)))
            {
                if (sts > 0)
                {
                    sts--;

                    if (!pool_config->parallel_mode)
                    {
                        if (POOL_DISALLOW_TO_FAILOVER(BACKEND_INFO(sts).flag))
                        {
                            pool_log("health_check: %d failover is canceld because failover is disallowed", sts);
                        }
                        else
                        {
                            pool_log("set %d th backend down status", sts);
                            Req_info->kind = NODE_DOWN_REQUEST;
                            Req_info->node_id[0] = sts;
                            failover();
                            /* need to distribute this info to children */
                        }
                    }
                    else
                    {
                        retrycnt++;
                        pool_signal(SIGALRM, SIG_IGN); /* Cancel timer */

                        if (retrycnt > NUM_BACKENDS)
                        {
                            /* retry count over */
                            pool_log("set %d th backend down status", sts);
                            Req_info->kind = NODE_DOWN_REQUEST;
                            Req_info->node_id[0] = sts;
                            failover();
                            retrycnt = 0;
                        }
                        else
                        {
                            /* continue to retry */
                            sleep_time = pool_config->health_check_period / NUM_BACKENDS;
                            pool_debug("retry sleep time: %d seconds", sleep_time);
                            pool_sleep(sleep_time);
                            continue;
                        }
                    }
                }

                ……
            }

            if (pool_config->health_check_timeout > 0)
            {
                /* seems ok. cancel health check timer */
                pool_signal(SIGALRM, SIG_IGN);
            }

            sleep_time = pool_config->health_check_period;
            pool_sleep(sleep_time);
        }
        else
        {
            for (;;)
            {
                int r;
                struct timeval t = {3, 0};

                POOL_SETMASK(&UnBlockSig);
                r = pool_pause(&t);
                POOL_SETMASK(&BlockSig);

                if (r > 0)
                    break;
            }
        }
    }

    pool_shmem_exit(0);
}
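
The condition that decides whether a check counts as a failure is dense, so here is a minimal sketch of it as a standalone predicate. The helper names check_failed and failed_backend_index are hypothetical and not part of pgpool-II; the meaning given to sts (a value above zero is the failed backend's number plus one) is only inferred from the sts-- in the excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the condition in the excerpt.
 *   sts            > 0 is read here as "backend (sts - 1) failed the check"
 *   sys_sts        < 0 is read here as "the system DB check failed"
 *   err            value of errno after health_check()
 *   timer_expired  non-zero if the SIGALRM handler ran
 */
bool check_failed(int sts, int sys_sts, int err, int timer_expired)
{
    bool something_failed = (sts > 0 || sys_sts < 0);

    /* An EINTR that did not come from the health check timer is ignored. */
    bool not_spurious = (err != EINTR || timer_expired);

    return something_failed && not_spurious;
}

/* Backend index as it is used after "sts--" in the excerpt. */
int failed_backend_index(int sts)
{
    assert(sts > 0);
    return sts - 1;
}
```

Read this way, an interrupted system call only leads to failover handling when the health check timer itself caused the interruption.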


The picture is now fairly clear:

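First, health_check_period. Health checks run only when it is non-zero. As far as switching the check on goes, every non-zero value behaves the same; in the excerpt the value itself is only used as the sleep time between passes (and, divided by NUM_BACKENDS, as the sleep time before a retry).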
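
The two places where the excerpt uses the value of health_check_period can be written out as a small sketch. The function names normal_sleep and retry_sleep are hypothetical; only the arithmetic is taken from the excerpt, and the retry case applies to the parallel-mode branch.

```c
#include <assert.h>

/* Sleep between two ordinary passes of the loop: the period itself. */
unsigned int normal_sleep(int health_check_period)
{
    assert(health_check_period > 0);
    return (unsigned int) health_check_period;
}

/* Sleep before a retry: integer division by the number of backends. */
unsigned int retry_sleep(int health_check_period, int num_backends)
{
    assert(health_check_period > 0 && num_backends > 0);
    return (unsigned int) (health_check_period / num_backends);
}
```

Because the division is an integer division, a period smaller than the number of backends yields a retry sleep of zero seconds: retry_sleep(2, 3) is 0, while retry_sleep(10, 3) is 3.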

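Second, health_check_timeout. If it is greater than 0, a timer is armed with alarm() just before health_check() is called; if the timer expires while health_check() is still running, health_check_timer_handler is invoked.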
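
The timer mechanism can be tried in isolation. What follows is a minimal sketch and not pgpool-II code. It assumes the handler is installed without SA_RESTART, so that a blocked system call fails with EINTR; the excerpt does not show how pool_signal installs it. A read() on an empty pipe stands in for a connection attempt that hangs, and the timer is cancelled with alarm(0), whereas the excerpt sets SIGALRM to SIG_IGN. The names on_alarm, timer_expired and blocking_call_with_timeout are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the SIGALRM handler; plays the role of health_check_timer_expired. */
static volatile sig_atomic_t timer_expired = 0;

static void on_alarm(int signo)
{
    (void) signo;
    timer_expired = 1;
}

/*
 * Block in read() on a pipe that never receives data, guarded by alarm().
 * Returns 1 if the call was cut short by the timer, 0 if read() returned
 * for another reason, -1 if the setup failed.
 */
int blocking_call_with_timeout(unsigned int timeout_sec)
{
    int fds[2];
    char buf;
    ssize_t n;
    int saved_errno;
    struct sigaction sa, old_sa;

    assert(timeout_sec > 0);    /* alarm(0) would never fire */

    if (pipe(fds) != 0)
        return -1;

    /* No SA_RESTART: the interrupted read() must fail with EINTR. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    timer_expired = 0;
    errno = 0;
    alarm(timeout_sec);             /* arm the timer */
    n = read(fds[0], &buf, 1);      /* stands in for a connect that hangs */
    saved_errno = errno;
    alarm(0);                       /* cancel the timer */

    sigaction(SIGALRM, &old_sa, NULL);
    close(fds[0]);
    close(fds[1]);

    return (n < 0 && saved_errno == EINTR && timer_expired) ? 1 : 0;
}
```

Calling blocking_call_with_timeout(1) returns after about one second instead of blocking forever, which is the effect the comment in the excerpt describes: noticing a dead communication path earlier than the TCP/IP stack would.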

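Third, and this is the most infuriating part:

In the main loop, as long as health_check_period is non-zero, health_check() is called again on every pass through the loop.

How often that happens is set by health_check_period (the pool_sleep() at the end of each pass), not by health_check_timeout, so with a small period the checks come far more often than the default health_check_timeout of 20 seconds.

If you start pgpool with the -d option you can watch this happen: pgpool-II keeps calling health_check() to check the state of each node.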
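
To get a feel for how often the loop ends up calling health_check(), here is a small simulation on a fake clock. It is a sketch under simplifying assumptions: every check succeeds, one check takes a fixed check_cost seconds, and each pass ends with a sleep of period seconds, as in the non-retry path of the excerpt. The function count_checks and its parameters are hypothetical.

```c
#include <assert.h>

/*
 * Count how many times a loop of the form
 *     for (;;) { health_check(); sleep(period); }
 * would call health_check() within `window` seconds on a simulated clock.
 * `check_cost` is the assumed duration of one check, in seconds.
 */
int count_checks(int period, int check_cost, int window)
{
    int now = 0;
    int calls = 0;

    assert(check_cost >= 0 && window >= 0);

    /* A period of 0 means health checking is disabled. */
    if (period <= 0)
        return 0;

    while (now < window)
    {
        calls++;            /* health_check() */
        now += check_cost;
        now += period;      /* pool_sleep(health_check_period) */
    }
    return calls;
}
```

With instantaneous checks, a period of 1 gives 60 calls per minute and a period of 10 gives 6; in this model the timeout value does not appear at all.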

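One could say that, with health_check() driven from the main loop like this, health_check_timeout is a dead letter as far as the checking interval goes: in the excerpt it only limits how long a single health_check() call may block.

I do not know in which version this behavior appeared. One could also say that the pgpool-II developers were careless and did not keep the code and the documentation in step. Perhaps that is a common ailment of many open source projects.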

This article is reposted from the cnblogs blog 健哥的数据花园. To repost it, please contact the original author. Original link: http://www.cnblogs.com/gaojian/archive/2012/07/27/2611935.html
