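Contents

I. GlusterFS
- 1. Introduction to GlusterFS
- 2. Common volume modes
- 3. GlusterFS cluster
  - 1) Environment preparation
  - 2) Experiment steps
  - 3) Experiment procedure
- 4. Testing the replica volume
- 5. Deleting a volume
- 6. Stripe mode
- 7. Distributed mode
- 8. Distributed-replica mode
- 9. Dispersed mode
- 10. Online shrinking and online expansion
- GlusterFS summary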

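I. GlusterFS

1. Introduction to GlusterFS

GlusterFS is a free, open-source distributed file system (it belongs to the file-storage category).
GlusterFS official website

For background on file systems, see this article:
Introduction to RAID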

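2. Common volume modes

Volume mode	Description
Replicated	Replicated volume, similar to RAID 1
Striped (newer versions drop this mode and the combined modes built on it)	Striped volume, similar to RAID 0
Distributed	Distributed volume
Distribute Replicated	Combination of distributed and replicated
Dispersed	Erasure-coded volume, similar to RAID 5 and RAID 6

GlusterFS can be thought of as pooling the storage space of multiple servers and then carving out file-storage volumes of different types for clients (the importing side) to use.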
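
To make the trade-offs in the table concrete, here is a minimal sketch that estimates usable capacity per mode. It assumes every brick has the same size; the function name usable_capacity and its argument order are made up for this illustration and are not part of GlusterFS.

```shell
# usable_capacity MODE BRICKS BRICK_SIZE_GB [COUNT]
#   MODE:  distributed | replicated | distributed-replicated | dispersed
#   COUNT: replica count (replicated modes) or redundancy (dispersed)
# Assumption: all bricks are the same size.
usable_capacity() {
    local mode="$1" bricks="$2" size="$3" count="${4:-1}"
    case "$mode" in
        distributed)            echo $(( bricks * size )) ;;
        replicated)             echo "$size" ;;
        distributed-replicated) echo $(( bricks / count * size )) ;;
        dispersed)              echo $(( (bricks - count) * size )) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

For example, `usable_capacity dispersed 4 100 1` covers the 4-brick, redundancy-1 layout used later in this article.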

Official documentation diagrams (images not preserved here): Replicated volume, Striped volume, Distributed volume, Distribute Replicated volume.

3. GlusterFS cluster

1) Environment preparation

Hostname	IP
client	192.168.44.100
storage1	192.168.44.110
storage2	192.168.44.120
storage3	192.168.44.130
storage4	192.168.44.140

Step 1: **All nodes (including client)** use static IPs (NAT network, with Internet access).

Step 2: **All nodes (including client)** set their hostname and bind every node's hostname in /etc/hosts (this time I also added short aliases for convenience).

192.168.44.100 test.cluster.com client
192.168.44.110 test1.cluster.com storage1
192.168.44.120 test2.cluster.com storage2
192.168.44.130 test3.cluster.com storage3
192.168.44.140 test4.cluster.com storage4
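
Since the same five entries have to be added on every node, a small idempotent helper can save some typing. This is only a sketch: add_host_entry and populate_hosts are hypothetical names, and the file path is a parameter so the functions can be tried on a scratch file before touching /etc/hosts.

```shell
# add_host_entry HOSTS_FILE IP FQDN ALIAS
# Appends "IP FQDN ALIAS" unless that exact line is already present.
add_host_entry() {
    local file="$1" line="$2 $3 $4"
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# populate_hosts HOSTS_FILE
# Adds the five nodes used in this article.
populate_hosts() {
    local file="$1"
    add_host_entry "$file" 192.168.44.100 test.cluster.com client
    add_host_entry "$file" 192.168.44.110 test1.cluster.com storage1
    add_host_entry "$file" 192.168.44.120 test2.cluster.com storage2
    add_host_entry "$file" 192.168.44.130 test3.cluster.com storage3
    add_host_entry "$file" 192.168.44.140 test4.cluster.com storage4
}
```

Running `populate_hosts /etc/hosts` as root on each node should be safe to repeat, because existing lines are skipped.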

Step 3: **All nodes (including client)** turn off the firewall and SELinux.

# systemctl stop firewalld
# systemctl disable firewalld
# iptables -F
# setenforce 0
# vim /etc/selinux/config
SELINUX=disabled
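
The vim edit above can also be scripted, which helps when repeating it on five machines. A minimal sketch, assuming GNU sed (as shipped with CentOS 7); disable_selinux_config is a hypothetical helper name.

```shell
# disable_selinux_config [CONFIG_FILE]
# Rewrites the SELINUX= line to "disabled". Lines such as SELINUXTYPE=
# and commented lines are left alone because the pattern is anchored.
# Assumption: GNU sed, which supports in-place editing with -i.
disable_selinux_config() {
    local cfg="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}
```

The change in the config file takes effect at the next boot; setenforce 0 covers the running system until then.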

Step 4: **All nodes (including client)** synchronize time.

# ntpdate 106.75.185.63

Step 5: **All nodes (including client)** configure yum (the official GlusterFS yum repository must be added).

Back up the stock yum repository file
# cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
Fetch the Tencent mirror repository file
# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.cloud.tencent.com/repo/centos7_base.repo
Configure the official GlusterFS repository
# vim /etc/yum.repos.d/glusterfs.repo
[glusterfs]
name=glusterfs
baseurl=https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-6/
enabled=1
gpgcheck=0
Clear the yum cache
# yum clean all
Rebuild the cache
# yum makecache
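
If you would rather not open an editor on every node, the repository file can be written with a here-document. This sketch reproduces exactly the five lines shown above; write_gluster_repo is a hypothetical helper, and the target path is a parameter so it can be tested on a scratch file.

```shell
# write_gluster_repo [REPO_FILE]
# Writes the GlusterFS repository definition used in this article.
write_gluster_repo() {
    local repo="${1:-/etc/yum.repos.d/glusterfs.repo}"
    cat > "$repo" <<'EOF'
[glusterfs]
name=glusterfs
baseurl=https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-6/
enabled=1
gpgcheck=0
EOF
}
```

After calling it, run yum clean all and yum makecache as above.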

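2) Experiment steps

Install the required packages on all storage servers and start the service
Connect all storage servers so that they form a cluster
Prepare a storage directory on all storage servers
Create the storage volume
Start the storage volume
Install the mount software on the client
Mount and use the volume on the client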

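3) Experiment procedure

Step 1: On all storage servers (not the client), install the glusterfs-server package and start the service.

# yum install glusterfs-server -y
# systemctl start glusterd
# systemctl enable glusterd
Created symlink from /etc/systemd/system/multi-user.target.wants/glusterd.service to /usr/lib/systemd/system/glusterd.service.
Check the service status
# systemctl status glusterd

Distributed clusters generally follow one of two architectures:

With a central node: the central node usually means a management node; most distributed cluster architectures covered later are of this kind.
Without a central node: every node both manages and does the work; GlusterFS is of this kind.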

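Step 2: Connect all storage servers so that they form a cluster.

The 4 storage servers do not need to be connected pairwise; pick any one of them and probe each of the other 3 once. Here this is done on storage1 (an IP, a hostname, or a hostname alias all work).
[root@storage1 ~]# gluster peer probe storage2
[root@storage1 ~]# gluster peer probe storage3
[root@storage1 ~]# gluster peer probe storage4
Afterwards, the following command can be run on any storage server to verify the result
# gluster peer status

Note:
If this step fails, the cause is usually the network connection, the firewall, SELinux, or the hostname binding.

To redo this step, use gluster peer detach [hostname/ip] to disconnect, then probe again.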
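
A quick way to check the probe result from a script is to count the peers that report as connected. This sketch assumes that gluster peer status prints one "State: Peer in Cluster (Connected)" line per healthy peer, which is the format I have seen but may differ between versions; connected_peers is a hypothetical helper.

```shell
# connected_peers
# Reads `gluster peer status` output on stdin and prints how many peers
# are in the cluster and connected.
# Assumption: healthy peers show "State: Peer in Cluster (Connected)".
connected_peers() {
    grep -cF 'State: Peer in Cluster (Connected)'
}
```

On storage1 in this setup, `gluster peer status | connected_peers` would be expected to print 3, since a node does not list itself.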

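Step 3: Prepare a storage directory on all storage servers (preferably not on the root partition).
Because adding another disk to a virtual machine is a bit of a hassle, the directory here is created directly on the root partition.

# mkdir -p /data/gv0

Step 4: Create the storage volume (on any one storage server).

Note: operations that change state (create, delete, start, stop, and so on) only need to be run on any one storage server; read-only operations (such as info) can be run on every storage server.

The command below is run on storage1.
Because the bricks are on the root partition, the force argument is required.
replica 4 means replication across 4 servers (similar to RAID 1).
[root@storage1 ~]# gluster volume create gv0 replica 4 storage1:/data/gv0/ storage2:/data/gv0/ storage3:/data/gv0/ storage4:/data/gv0/ force
volume create: gv0: success: please start the volume to access data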
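
The brick list in the create command follows a regular host:directory pattern, so it can be generated instead of typed. A minimal sketch; brick_list is a hypothetical helper, not a gluster subcommand.

```shell
# brick_list DIR HOST...
# Prints "HOST:DIR" for every host, separated by single spaces.
brick_list() {
    local dir="$1" out="" h
    shift
    for h in "$@"; do
        out="${out:+$out }$h:$dir"
    done
    echo "$out"
}
```

It could then be used as `gluster volume create gv0 replica 4 $(brick_list /data/gv0 storage1 storage2 storage3 storage4) force`, which expands to the same bricks as the command above (without the trailing slashes).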

The volume can be inspected from any storage server.

# gluster volume info gv0
Volume Name: gv0
Type: Replicate          (the mode is replicate)
Volume ID: b3fd3727-0c78-418d-a5b2-cd594bfc66aa
Status: Created          (just created and not yet started; it must be started before use)
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: storage1:/data/gv0
Brick2: storage2:/data/gv0
Brick3: storage3:/data/gv0
Brick4: storage4:/data/gv0
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Step 5: Start the storage volume.

It can be started from any storage server.
[root@storage1 ~]# gluster volume start gv0
volume start: gv0: success

# gluster volume info gv0
Volume Name: gv0
Type: Replicate
Volume ID: b3fd3727-0c78-418d-a5b2-cd594bfc66aa
Status: Started          (the status is now started, so clients can mount and use the volume)
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: storage1:/data/gv0
Brick2: storage2:/data/gv0
Brick3: storage3:/data/gv0
Brick4: storage4:/data/gv0
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
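
When scripting the setup, it can be handy to pull just the Status field out of the info output rather than reading it by eye. A minimal sketch based on the output format shown above; volume_status is a hypothetical helper.

```shell
# volume_status
# Reads `gluster volume info VOLUME` output on stdin and prints the value
# of the first "Status:" line (for example Created or Started).
volume_status() {
    awk -F': ' '$1 == "Status" { print $2; exit }'
}
```

For instance, `gluster volume info gv0 | volume_status` could be compared against Started before attempting a client mount.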

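Step 6: Install the software on the client.

[root@client ~]# yum install glusterfs glusterfs-fuse -y

FUSE (Filesystem in Userspace): a user-space file system; it is the module the client uses to mount remote file storage.

Step 7: Mount and use the volume on the client.
Note: the client also needs the storage nodes' hostnames bound in /etc/hosts before it can mount (because the earlier steps used names).

[root@client ~]# mkdir /test0
[root@client ~]# mount -t glusterfs storage1:gv0 /test0
Here the client mounts via storage1, but any of storage2, storage3, or storage4 would work as well. (In other words, these 4 storage servers are both boss and worker. This is a characteristic of GlusterFS; most other distributed storage software has a dedicated management server.)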
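
The mount above does not survive a reboot. One common approach is an /etc/fstab entry; the sketch below only prints such a line so it can be reviewed first. The _netdev option and the backup-volfile-servers mount option are my additions rather than something this article uses, so treat them as assumptions and check the mount.glusterfs documentation for your version; fstab_entry is a hypothetical helper.

```shell
# fstab_entry VOLUME MOUNTPOINT PRIMARY [BACKUP...]
# Prints an fstab line for a GlusterFS FUSE mount. Extra hosts are passed
# as backup-volfile-servers so the volume file can be fetched from them
# if PRIMARY is down at mount time (assumed option name, see lead-in).
fstab_entry() {
    local vol="$1" mnt="$2" primary="$3"
    shift 3
    local opts="defaults,_netdev"
    if [ "$#" -gt 0 ]; then
        local IFS=:
        opts="$opts,backup-volfile-servers=$*"
    fi
    echo "$primary:/$vol $mnt glusterfs $opts 0 0"
}
```

For example, `fstab_entry gv0 /test0 storage1 storage2 storage3` prints a line that could be appended to /etc/fstab after review.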

4. Testing the replica volume

Read/write test method

On a client, use dd to write files into the mount directory, then check how they are distributed on the storage servers.

Clone the client machine to set up a second client, client2
Set its IP to 192.168.44.90

Change the hostname
# hostnamectl set-hostname client2
Edit the hosts file
# vim /etc/hosts
192.168.44.90 test0.cluster.com client2
192.168.44.110 test1.cluster.com storage1
192.168.44.120 test2.cluster.com storage2
192.168.44.130 test3.cluster.com storage3
192.168.44.140 test4.cluster.com storage4

On the client host, create a 200M file in the mount directory

[root@client ~]# dd if=/dev/zero of=/test0/file1 bs=1M count=200

Check the brick directory on the 4 storage machines

# ll -h /data/gv0/
total 200M
-rw-r--r--. 2 root root 200M Sep  4 16:36 file1

Next, create a file of about 2G on client2 (with gv0 mounted at /test0 there as well)

[root@client2 ~]# dd if=/dev/zero of=/test0/file2 bs=1M count=2000

Check the mount directory on client

[root@client ~]# ll -h /test0
total 2.2G
-rw-r--r--. 1 root root 200M Sep  4 16:48 file1
-rw-r--r--. 1 root root 2.0G Sep  4 16:48 file2

Check on another storage machine

[root@storage3 ~]# ll -h /data/gv0/
total 2.2G
-rw-r--r--. 2 root root 200M Sep  4 16:48 file1
-rw-r--r--. 2 root root 2.0G Sep  4 16:48 file2

This shows that every storage machine holds a full copy of the data, similar to RAID 1.

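Next, test the following scenarios

Shut down one of the storage servers

For example, shut down storage4
# init 0
On the client, commands need a wait of ten-odd seconds before the mount is usable again
# ll -h /test0
Wait...
Write some data into the mount directory from either client
[root@client ~]# touch /test0/demo.txt
[root@client ~]# echo "hello" > /test0/demo.txt
Power storage4 back on; the data has been synchronized to it automatically
[root@storage4 ~]# cat /data/gv0/demo.txt
hello

Bring down the network interface of one storage node

Bring down storage2's network

[root@storage2 ~]# systemctl stop network
The client has to wait ten-odd seconds before it can continue normally; once the network is started again, the data is synchronized as usual

Kill the glusterfs-related processes on one storage node

# killall glusterfs
# ps -ef | grep glusterfs
The client continues working with no wait, but writes are not synchronized to the failed storage node; once its processes are started again, the data is synchronized
# systemctl start glusterd
# systemctl enable glusterd
# systemctl status glusterd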
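
To confirm that a recovered node has caught up, the command gluster volume heal gv0 info lists the entries that still need healing on each brick. The sketch below totals them. It assumes each brick section contains a "Number of entries: N" line, which matches the output I am familiar with but is not shown in this article; pending_heal_entries is a hypothetical helper.

```shell
# pending_heal_entries
# Reads `gluster volume heal VOLUME info` output on stdin and prints the
# sum of all "Number of entries:" values. Non-numeric values (such as "-"
# for an unreachable brick) count as 0.
# Assumption: one "Number of entries: N" line per brick.
pending_heal_entries() {
    awk -F': ' '$1 == "Number of entries" { total += $2 } END { print total + 0 }'
}
```

A result of 0 from `gluster volume heal gv0 info | pending_heal_entries` would suggest the replicas are back in sync.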

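5. Deleting a volume

Step 1: On the client, delete the earlier test data, then unmount.

[root@client ~]# rm -rf /test0/*
[root@client ~]# umount /test0

Step 2: On any storage server, stop gv0 and delete it with the commands below; here this is done on storage4.

[root@storage4 ~]# gluster volume stop gv0
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: gv0: success
[root@storage4 ~]# gluster volume delete gv0
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: gv0: success

Step 3: Check from any storage server; there is no longer any information about gv0, which shows the volume has been deleted.

[root@storage1 ~]# gluster volume info gv0
Volume gv0 does not exist

Can I create another volume named gv1 without deleting gv0?
Of course; just use a different directory when creating it.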

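6. Stripe mode

Step 1: Recreate the volume in stripe mode (the key is the stripe 4 argument in the command). Run on any storage server; here on storage1.

# gluster volume create gv0 stripe 4 storage1:/data/gv0/ storage2:/data/gv0/ storage3:/data/gv0/ storage4:/data/gv0/ force

Step 2: Start gv0 (on any storage server; here on storage1).

# gluster volume start gv0

Step 3: Read/write test (after mounting the volume on the client as before)

Read/write test result: files that are too small are not spread evenly across the storage nodes; files of a certain size are. This is similar to RAID 0.

Disk utilization is 100% (provided all nodes offer the same amount of space; if the sizes differ, striping follows the smallest)
Large files are spread evenly across the storage nodes (load balancing)
No HA: if one storage node goes down, the striped volume can no longer be accessed by clients

Note: this applies to 4.* versions; later versions retire this mode.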

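7. Distributed mode

Step 1: Prepare a new storage directory (on all storage servers).

# mkdir -p /data/gv1

Step 2: Create the distributed volume gv1 (when neither replica nor stripe is specified, the default is Distributed mode; run on any storage server, here on storage1).

# gluster volume create gv1 storage1:/data/gv1/ storage2:/data/gv1/ storage3:/data/gv1/ storage4:/data/gv1/ force

Step 3: Start gv1 (on any storage server; here on storage1).

# gluster volume start gv1

Step 4: Mount on the client

client# mkdir /test1
client# mount -t glusterfs storage1:gv1 /test1

Step 5: Read/write test (same method as for the replica volume)

Read/write test result: each file is written whole to one of the storage nodes, seemingly at random, until all of them are full.

Utilization is 100%
Easy to expand
No data safety guarantee (if a node goes down, after roughly 1 minute it is dropped from the volume, and the data on the dropped node is lost to clients)
No I/O performance gain either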
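
The placement looks random, but GlusterFS's distribute layer actually picks a brick by hashing the file name, so the same name always maps to the same brick. The sketch below illustrates that idea with a CRC from cksum. It is a toy stand-in, not the hash GlusterFS uses, and toy_brick_for is a hypothetical name.

```shell
# toy_brick_for FILENAME BRICK_COUNT
# Prints a brick index in the range 0..BRICK_COUNT-1 derived from a CRC
# of the file name. Toy illustration only: GlusterFS uses its own hash
# function and per-directory hash ranges.
toy_brick_for() {
    local name="$1" bricks="$2" crc
    crc=$(printf '%s' "$name" | cksum | cut -d' ' -f1)
    echo $(( crc % bricks ))
}
```

Calling `toy_brick_for file1 4` repeatedly gives the same index each time, while different names spread across the four indexes.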

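8. Distributed-replica mode

Step 1: Prepare a new storage directory (on all storage servers).

# mkdir -p /data/gv2

Step 2: Create the distributed-replica volume gv2 (on any storage server; here on storage1).

storage1# gluster volume create gv2 replica 2 storage1:/data/gv2/ storage2:/data/gv2/ storage3:/data/gv2/ storage4:/data/gv2/ force

Step 3: Start gv2 (on any storage server; here on storage1).

storage1# gluster volume start gv2

Step 4: Mount on the client

client# mkdir /test2
client# mount -t glusterfs storage1:gv2 /test2

Step 5: Read/write test

Read/write test result: the 4 storage nodes are split into two groups. Files are spread across the two groups as in distributed mode, while the two storage nodes inside a group mirror each other as in replica mode.

Characteristics:

Combines the advantages of distributed and replica: it can be expanded and also has HA.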
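
Which bricks end up mirroring each other depends on the order they are listed in the create command: consecutive bricks form one replica set. A minimal sketch that prints the grouping for a given replica count; replica_sets is a hypothetical helper.

```shell
# replica_sets REPLICA BRICK...
# Groups the bricks, in the order given, into sets of REPLICA bricks and
# prints one line per set. Trailing bricks that do not fill a set are
# not printed.
replica_sets() {
    local replica="$1" i=0 n=1 line="" brick
    shift
    for brick in "$@"; do
        line="$line $brick"
        i=$(( i + 1 ))
        if [ $(( i % replica )) -eq 0 ]; then
            echo "set$n:$line"
            n=$(( n + 1 ))
            line=""
        fi
    done
}
```

With the gv2 command above, `replica_sets 2 storage1:/data/gv2 storage2:/data/gv2 storage3:/data/gv2 storage4:/data/gv2` shows storage1 and storage2 in one set and storage3 and storage4 in the other, which matches the two groups observed in the test.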

9. Dispersed mode

Step 1: Prepare a new storage directory (on all storage servers).

# mkdir -p /data/gv3

Step 2: Create the volume gv3 (on any storage server; here on storage1).

[root@storage1 data]# gluster volume create gv3 disperse 4 storage1:/data/gv3/ storage2:/data/gv3/ storage3:/data/gv3/ storage4:/data/gv3/ force
There isn't an optimal redundancy value for this configuration. Do you want to create the volume with redundancy 1 ? (y/n) y
volume create: gv3: success: please start the volume to access data
Note: no redundancy value was specified, so it defaults to 1; press y to confirm.

Step 3: Start gv3 (on any storage server; here on storage1).

[root@storage1 data]# gluster volume start gv3
volume start: gv3: success
[root@storage1 data]# gluster volume info gv3
Volume Name: gv3
Type: Disperse
Volume ID: 2eee2dfb-58b7-4fb1-8329-de8a447392a4
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (3 + 1) = 4
Transport-type: tcp
Bricks:
Brick1: storage1:/data/gv3
Brick2: storage2:/data/gv3
Brick3: storage3:/data/gv3
Brick4: storage4:/data/gv3
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

Step 4: Mount on the client

client# mkdir /test3
client# mount -t glusterfs storage1:gv3 /test3

Step 5: Read/write test (same method as for the replica volume; see the course video for the detailed procedure)

Read/write test result: after writing 200M, each storage server holds about 67M, because 1 of the 4 storage nodes' worth of space goes to redundancy (as with RAID 5), so 200M / 3 ≈ 67M per node.

# dd if=/dev/zero of=/test3/file1 bs=1M count=200
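
The 67M figure follows from dividing the file across the data bricks only: with 4 bricks and redundancy 1 there are 3 data fragments, and every brick stores a fragment of that size. A small sketch of the arithmetic; fragment_size_mb is a hypothetical helper, and real on-disk sizes will differ slightly.

```shell
# fragment_size_mb FILE_MB BRICKS REDUNDANCY
# Prints the approximate per-brick size, in MB rounded to the nearest
# integer, of a file stored on a dispersed volume.
fragment_size_mb() {
    awk -v f="$1" -v n="$2" -v r="$3" 'BEGIN { printf "%.0f\n", f / (n - r) }'
}
```

For the test above, `fragment_size_mb 200 4 1` prints 67.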

10. Online shrinking and online expansion

Whether a volume can be shrunk online depends on its mode; stripe mode, for example, does not allow it. Below I use the distributed volume to demonstrate shrinking and expansion.

Online shrinking (make sure the brick being removed holds no data)

[root@storage1 data]# gluster volume remove-brick gv1 storage4:/data/gv1 force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume.
Do you want to continue? (y/n) y
volume remove-brick commit force: success

Online expansion

# gluster volume add-brick gv1 storage4:/data/gv1 force
volume add-brick: success
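
If the brick to be removed does hold data, gluster also offers a start / status / commit sequence for remove-brick that migrates files off the brick first, and a rebalance command to spread existing files onto a newly added brick. The sketch below only prints those command sequences for review and does not run them; shrink_plan and expand_plan are hypothetical helpers, and in this article's root-partition setup add-brick would still need the trailing force.

```shell
# shrink_plan VOLUME BRICK
# Prints the commands for a data-migrating brick removal:
# start the migration, poll its status, then commit once it completes.
shrink_plan() {
    local vol="$1" brick="$2" action
    for action in start status commit; do
        echo "gluster volume remove-brick $vol $brick $action"
    done
}

# expand_plan VOLUME BRICK
# Prints the commands to add a brick and rebalance existing files onto it.
expand_plan() {
    local vol="$1" brick="$2"
    echo "gluster volume add-brick $vol $brick"
    echo "gluster volume rebalance $vol start"
    echo "gluster volume rebalance $vol status"
}
```

For example, `shrink_plan gv1 storage4:/data/gv1` lists the three remove-brick steps; commit should only be run after status reports the migration as completed.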

GlusterFS summary

GlusterFS belongs to the file-storage category.
Advantage: data can be shared
Disadvantage: relatively low speed
