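Note: the queries were tested in a Python notebook with pandas, so every literal % is written as %%.
You can append a FORMAT clause to the end of a SQL statement to choose the output format, for example: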

  • FORMAT CSVWithNames
  • FORMAT TabSeparatedWithNamesAndTypes
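The two conventions above (doubling % and appending a FORMAT clause) can be applied programmatically before a statement is sent. The sketch below is only an illustration: `escape_percent` and `with_format` are hypothetical helper names, and it assumes the driver in use treats a single % as a parameter marker, which is presumably why the notebook needs %%.

```python
def escape_percent(sql: str) -> str:
    """Double every % so a driver using %-style parameters leaves it alone."""
    return sql.replace("%", "%%")


def with_format(sql: str, fmt: str) -> str:
    """Append a FORMAT clause, dropping any trailing semicolon first."""
    return f"{sql.rstrip().rstrip(';').rstrip()} FORMAT {fmt}"


raw = "SELECT formatDateTime(now(), '%Y%m%d %T');"
print(with_format(escape_percent(raw), "CSVWithNames"))
```

Apply `escape_percent` only to SQL written with single % signs; the statements in this article are already escaped.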

1. Running SQL queries

1.1. View currently running queries

-- Query
SELECT query_id, user, address, elapsed, query
FROM system.processes
ORDER BY query_id ASC;

-- Kill a slow query
KILL QUERY WHERE query_id='query_id';
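When several queries are stuck, the KILL statements can be generated from the rows returned by system.processes. This is a minimal sketch, not a recommended policy: `kill_statements` is a hypothetical helper, the 60-second default is an arbitrary assumption, and the rows are assumed to be dictionaries with `query_id` and `elapsed` keys.

```python
def kill_statements(processes, max_elapsed_s=60.0):
    """Return one KILL QUERY statement per process running longer than the limit."""
    statements = []
    for proc in processes:
        if proc["elapsed"] > max_elapsed_s:
            # Escape backslashes and single quotes so the id stays inside the string literal.
            safe_id = proc["query_id"].replace("\\", "\\\\").replace("'", "\\'")
            statements.append(f"KILL QUERY WHERE query_id = '{safe_id}';")
    return statements


# Made-up sample rows shaped like the system.processes query result.
sample = [
    {"query_id": "a1", "elapsed": 3.2},
    {"query_id": "b2", "elapsed": 125.0},
]
for stmt in kill_statements(sample):
    print(stmt)
```

Reviewing the generated statements before running them is safer than executing them automatically.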

1.2. View currently running mutations

-- Query
SELECT database,table,mutation_id,command,create_time,parts_to_do_names,parts_to_do,latest_fail_reason
FROM system.mutations
where is_done<>1;

-- Kill a slow mutation
KILL MUTATION WHERE mutation_id = 'mutation_id';

1.3. Top 10 slowest queries today

SELECT user,
formatDateTime(query_start_time, '%%Y%%m%%d %%T') AS start_time,
query_duration_ms / 1000 AS query_duration_s,
formatReadableSize(memory_usage ) AS memory_usage,
result_rows ,
formatReadableSize(result_bytes) AS result_bytes,
read_rows ,
formatReadableSize(read_bytes) AS read_bytes,
written_rows ,
formatReadableSize(written_bytes) AS written_bytes,
query
FROM system.query_log WHERE type = 2 and query_start_time>=today()
ORDER BY query_duration_s DESC LIMIT 10;

-- Query the log directly
select type,concat(substr(query,1,100),'...') as query,read_rows,query_duration_ms,memory_usage,read_bytes,written_bytes from system.query_log limit 10;

-- Count the most frequently executed SQL
select concat(substr(query,1,100),'...') as sql,count(*) as total from system.query_log
where event_time>'2021-12-01 00:00:00' and event_time<'2021-12-02 00:00:00' and is_initial_query=1 and lower(query) like '%%select%%'
group by sql order by total desc
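The frequency query groups statements by their first 100 characters. The same grouping can be reproduced on the client side, for example on query text already exported from query_log. The sketch below is an illustration only: `top_queries` is a hypothetical helper, and it slices by characters, whereas `substr` in ClickHouse counts bytes, so results may differ for non-ASCII SQL.

```python
from collections import Counter


def top_queries(queries, prefix_len=100, limit=10):
    """Count SELECT-like statements by their truncated text, most frequent first."""
    counts = Counter(
        q[:prefix_len] + "..."
        for q in queries
        if "select" in q.lower()  # mirrors lower(query) like '%select%'
    )
    return counts.most_common(limit)


# Made-up sample of query texts.
log = ["SELECT 1", "select 1", "SELECT 1", "INSERT INTO t VALUES (1)"]
print(top_queries(log))
```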

How to enable the query log
query_log records every query that has been executed on the ClickHouse server.

     <!-- Global definition -->
     <!-- Query log. Used only for queries with setting log_queries = 1. -->
     <query_log>
         <database>system</database>
         <table>query_log</table>
         <partition_by>toYYYYMM(event_date)</partition_by>
         <!-- Interval of flushing data. -->
         <flush_interval_milliseconds>7500</flush_interval_milliseconds>
     </query_log>

     <!-- To enable query_log only for certain users, set this in a profile in users.xml -->
     <log_queries>1</log_queries>

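1.4. Cluster-wide slow-query statistics with remote

remote('addresses_expr', db, table[, 'user'[, 'password']]) lets you query a remote server without creating a Distributed table.

  • Count slow queries per node per day
  • List slow queries that ran longer than a given threshold
  • Count running queries per node
  • List running queries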
-- Count slow queries per node per day
select * from
(
select 'node1' as server,address,toStartOfDay(event_time) as event_day,count(1) as selectCount
from remote('x.x.x.x1','system','query_log')
where event_time>'2021-12-01 10:00:00' and event_time<'2021-12-01 11:00:00' and query_duration_ms>10000 and is_initial_query=1 and lower(query) like '%%select%%'
group by address,toStartOfDay(event_time)
union all
select 'node2' as server,address,toStartOfDay(event_time) as event_day,count(1) as selectCount
from remote('x.x.x.x2','system','query_log')
where event_time>'2021-12-01 10:00:00' and event_time<'2021-12-01 11:00:00' and query_duration_ms>10000 and is_initial_query=1 and lower(query) like '%%select%%'
group by address,toStartOfDay(event_time)
union all
select 'node3' as server,address,toStartOfDay(event_time) as event_day,count(1) as selectCount
from remote('x.x.x.x3','system','query_log')
where event_time>'2021-12-01 10:00:00' and event_time<'2021-12-01 11:00:00' and query_duration_ms>10000 and is_initial_query=1 and lower(query) like '%%select%%'
group by address,toStartOfDay(event_time)
) t order by server,event_day;

-- List slow queries that ran longer than 10 seconds
select 'node1' as server,address,event_time,type,query_duration_ms,query
from remote('x.x.x.x1','system','query_log','default')
where event_time>'2021-12-01 10:00:00' and event_time<'2021-12-01 11:00:00' and query_duration_ms>10000 and is_initial_query=1
union all
select 'node2' as server,address,event_time,type,query_duration_ms,query
from remote('x.x.x.x2','system','query_log','default')
where event_time>'2021-12-01 10:00:00' and event_time<'2021-12-01 11:00:00' and query_duration_ms>10000 and is_initial_query=1
union all
select 'node3' as server,address,event_time,type,query_duration_ms,query
from remote('x.x.x.x3','system','query_log','default')
where event_time>'2021-12-01 10:00:00' and event_time<'2021-12-01 11:00:00' and query_duration_ms>10000 and is_initial_query=1;

-- Count running queries per node
select * from
(select 'node1' as server,count(1) as cc from remote('x.x.x.x1','system','processes')
union all select 'node2' as server,count(1) as cc from remote('x.x.x.x2','system','processes')
union all select 'node3' as server,count(1) as cc from remote('x.x.x.x3','system','processes')
) t order by server;

-- List running queries
select * from
(select 'node1' as server,query_id, user, address, elapsed, query from remote('x.x.x.x1','system','processes')
union all select 'node2' as server,query_id, user, address, elapsed, query from remote('x.x.x.x2','system','processes')
union all select 'node3' as server,query_id, user, address, elapsed, query from remote('x.x.x.x3','system','processes')
) t order by server;
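Each of these statements repeats the same SELECT once per node, so the text can be generated from a node list instead of being edited by hand when the cluster changes. The following is a rough sketch under stated assumptions: `per_node_union` is a hypothetical helper, the node names and addresses are the placeholders used above, and the inputs are trusted (nothing is escaped).

```python
def per_node_union(nodes, table, columns, where=""):
    """Build one SELECT per node over remote() and join them with UNION ALL."""
    parts = []
    for name, address in nodes.items():
        sql = (
            f"select '{name}' as server, {columns} "
            f"from remote('{address}', 'system', '{table}')"
        )
        if where:
            sql += f" where {where}"
        parts.append(sql)
    return "select * from (\n" + "\nunion all ".join(parts) + "\n) t order by server"


# Placeholder node names and addresses, as in the statements above.
nodes = {"node1": "x.x.x.x1", "node2": "x.x.x.x2", "node3": "x.x.x.x3"}
print(per_node_union(nodes, "processes", "count(1) as cc"))
```

Passing a `where` string such as `query_duration_ms>10000 and is_initial_query=1` with `table="query_log"` produces the slow-query variant.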

2. Table information

2.1. Disk space used by a table

SELECT table,partition,formatReadableSize(sum(data_compressed_bytes)) AS compressed_size ,
formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_bytes
FROM system.parts
WHERE active AND (table LIKE 'vehicle_warning_%%')
GROUP BY table,partition
order by partition desc

2.2. Disk space used by each column

select column as colName,any(type) as colType,
sum(column_data_compressed_bytes) compressed_size ,
sum(column_data_uncompressed_bytes) uncompressed_bytes,
sum(rows) as rowNum
from system.parts_columns
where active AND table like 'vehicle_warning_LOCAL'
GROUP BY column
ORDER BY uncompressed_bytes desc ;
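Unlike the table-level query, this one returns raw byte counts. If the result is post-processed on the client, a small formatter and a compression ratio make the numbers easier to compare. This is a sketch with hypothetical helper names (`human_size`, `compression_ratio`); it uses binary units in the spirit of formatReadableSize but is not guaranteed to match its output exactly.

```python
def human_size(num_bytes):
    """Format a byte count with binary units (B, KiB, MiB, GiB, TiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.2f} {unit}"
        size /= 1024


def compression_ratio(compressed, uncompressed):
    """Uncompressed size divided by compressed size; 0.0 for an empty column."""
    return uncompressed / compressed if compressed else 0.0


print(human_size(1536), compression_ratio(250, 1000))
```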

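2.3. Partition statistics for a table

For a given time range, list the table's partitions, the number of parts in each partition, and the disk space they use.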

SELECT partition, count() AS number_of_parts, formatReadableSize(sum(bytes)) AS sum_size
FROM system.parts
WHERE active AND (table = 'vehicle_warning_new_LOCAL') and partition between '2021-11-01 00:00:00' and '2021-11-30 00:00:00'
GROUP BY partition
ORDER BY partition ASC

2.4. Replica status of a table

SELECT database, table, is_leader, total_replicas, active_replicas, zookeeper_exception,
is_session_expired,future_parts, parts_to_check,queue_size,inserts_in_queue,log_max_index,log_pointer
FROM system.replicas where table = 'vehicle_warning_LOCAL';

-- Find unhealthy replicas; adjust each alert threshold to suit your environment.
SELECT database, table, is_leader, total_replicas, active_replicas, zookeeper_exception
FROM system.replicas
WHERE is_readonly OR is_session_expired OR future_parts > 20 OR parts_to_check > 10 OR queue_size > 20
OR inserts_in_queue > 10 OR log_max_index - log_pointer > 10 OR total_replicas < 2 OR active_replicas < total_replicas
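Because the thresholds are meant to be tuned per environment, it can help to keep them in one place and generate the WHERE clause from them. The sketch below is one possible way to do that: `DEFAULT_THRESHOLDS` and `unhealthy_replicas_sql` are hypothetical names, and the default values are simply the ones used in the query above.

```python
# Defaults copied from the query above; each key is a SQL expression over system.replicas.
DEFAULT_THRESHOLDS = {
    "future_parts": 20,
    "parts_to_check": 10,
    "queue_size": 20,
    "inserts_in_queue": 10,
    "log_max_index - log_pointer": 10,
}


def unhealthy_replicas_sql(thresholds=None, min_replicas=2):
    """Build the unhealthy-replica query, overriding any default thresholds."""
    limits = {**DEFAULT_THRESHOLDS, **(thresholds or {})}
    conditions = ["is_readonly", "is_session_expired"]
    conditions += [f"{expr} > {limit}" for expr, limit in limits.items()]
    conditions += [f"total_replicas < {min_replicas}", "active_replicas < total_replicas"]
    return (
        "SELECT database, table, is_leader, total_replicas, active_replicas, zookeeper_exception\n"
        "FROM system.replicas\n"
        "WHERE " + " OR ".join(conditions)
    )


print(unhealthy_replicas_sql({"queue_size": 50}))
```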

3. Miscellaneous

3.1. Total number of connections

SELECT * FROM system.metrics WHERE metric LIKE '%%Connection';

3.2. Disk space

SELECT name,path,formatReadableSize(free_space) AS free_space,
formatReadableSize(total_space) AS total_space, type
FROM system.disks

3.3. Cluster information

select cluster,shard_num,shard_weight,replica_num,host_name,host_address,port,user,errors_count,estimated_recovery_time  from system.clusters where replica_num=1

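3.4. Merges and part mutations in progress on MergeTree-family tables

Show the merges and part mutations currently being processed for tables in the MergeTree family.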

select database ,table,elapsed ,progress,num_parts ,result_part_name ,is_mutation ,total_size_bytes_compressed ,rows_read ,rows_written
from system.merges

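Appendix

ClickHouse system tables

See the official ClickHouse documentation for an introduction to the system tables.

Helper functions used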

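import time
import pandas as pd

# `engine` is assumed to be a database engine/connection created earlier in the notebook.

def yieldDF(df):
    # Yield (value, column name) for every cell, column by column.
    # DataFrame.items() replaces iteritems(), which was removed in pandas 2.0.
    for index, row in df.items():
        for i in range(len(row)):
            yield row.iloc[i], index

def printDF(df, pos=slice(1,2)):
    # Print the selected fields of each row (field 0 is the DataFrame index).
    for row in df.itertuples():
        print(row[pos])

def executeSQL(sql):
    # perf_counter measures wall-clock time, including time spent waiting on the server.
    start = time.perf_counter()
    df = pd.read_sql(sql, con=engine)
    end = time.perf_counter()
    print('Running time: %s Seconds' % (end - start))
    return df

sql = '''select cluster,shard_num,shard_weight,replica_num,host_name,host_address,port,user,errors_count,estimated_recovery_time
from system.clusters where replica_num=1
'''
executeSQL(sql)
for d, i in yieldDF(executeSQL(sql).head(1)):
    print(i, '==>', d)
printDF(executeSQL(sql), slice(1,3))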
