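For annotations of the functions that run before aio_write in the I/O submission flow, see "Block Storage: Annotated AIO Direct Read Flow" (块存储:AIO的直接读流程注释).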

For annotations of the functions that set up the iter iterator, see "Block Storage: Annotated AIO Direct Read Flow".

blkdev_write_iter calls __generic_file_write_iter to start the direct write. Its annotated source:

For a direct write, __generic_file_write_iter first calls generic_file_direct_write, annotated as follows:

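For annotations of the block device direct I/O function blkdev_direct_IO and the call chain beneath it, see "Block Storage: Annotated AIO Direct Read Flow".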
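From user space, a direct write is requested by opening the target with O_DIRECT. The following is a minimal sketch under several assumptions, not the code path annotated in this article: it uses a synchronous pwrite instead of AIO, it writes to a regular file as a stand-in for a block device node (so the filesystem's write path runs rather than blkdev_write_iter), and it assumes 4096-byte alignment is sufficient. The helper name direct_write_block and the BLOCK constant are hypothetical.

```c
#define _GNU_SOURCE           /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK 4096            /* assumed alignment; real devices may differ */

/*
 * Write one aligned block to `path` at offset 0, then fsync it.
 * Tries O_DIRECT first and falls back to a buffered write if the
 * filesystem rejects the flag. Returns bytes written, or -1 on error.
 */
ssize_t direct_write_block(const char *path, unsigned char fill)
{
    void *buf = NULL;
    ssize_t n = -1;
    int fd;

    assert(path != NULL);

    /* O_DIRECT needs the buffer, length and offset suitably aligned. */
    if (posix_memalign(&buf, BLOCK, BLOCK) != 0)
        return -1;
    memset(buf, fill, BLOCK);

    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd < 0)                       /* e.g. EINVAL: O_DIRECT unsupported */
        fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd >= 0) {
        n = pwrite(fd, buf, BLOCK, 0);
        /* fsync() asks the kernel to make the written data durable. */
        if (n == BLOCK && fsync(fd) != 0)
            n = -1;
        close(fd);
    }
    free(buf);
    return n;
}
```

Pointing the same helper at a block device node would exercise the block device write path, but that needs root and overwrites the first block of the device, so it is only safe on a scratch device.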

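Finally, blkdev_write_iter calls generic_write_sync to make sure the data is safely persisted. generic_write_sync only does this work for synchronous writes (for example, a file opened with O_SYNC or O_DSYNC). In that case it ends up in the block device's fsync function, blkdev_fsync, which sends a FLUSH command to the device so that the device's own cache is written to the media: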

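In addition, for a regular filesystem, how fsync() behaves depends on that filesystem's implementation. In most cases it also uses REQ_PREFLUSH to flush data to the storage media.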

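The blkdev_fsync function above first calls file_write_and_wait_range to write back the data held in the page cache. The OS, however, does not know whether the disk has its own write cache. If it does, the writeback triggered by file_write_and_wait_range may only reach the disk's cache and not the non-volatile media, which is why the FLUSH command below has to be issued afterwards.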
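The two steps can be pictured with a toy user-space model. This is only an illustrative sketch: the struct and the toy_* functions are hypothetical stand-ins, not kernel code. Data moves from the page cache to the disk's write cache (the role of file_write_and_wait_range) and then from the disk's cache to the media (the role of the FLUSH command).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLK 8   /* toy device size in blocks */

/* Three places a block can live, from most to least volatile. */
struct toy_stack {
    int  page_cache[NBLK];  bool pc_dirty[NBLK];  /* host RAM            */
    int  disk_cache[NBLK];  bool dc_dirty[NBLK];  /* device write cache  */
    int  media[NBLK];                             /* non-volatile media  */
};

/* Buffered write: lands in the page cache only. */
void toy_write(struct toy_stack *s, int blk, int val)
{
    assert(blk >= 0 && blk < NBLK);
    s->page_cache[blk] = val;
    s->pc_dirty[blk] = true;
}

/* Stand-in for file_write_and_wait_range(): page cache -> device cache. */
void toy_writeback(struct toy_stack *s)
{
    for (int i = 0; i < NBLK; i++) {
        if (s->pc_dirty[i]) {
            s->disk_cache[i] = s->page_cache[i];
            s->dc_dirty[i] = true;
            s->pc_dirty[i] = false;
        }
    }
}

/* Stand-in for the FLUSH command: device cache -> media. */
void toy_flush(struct toy_stack *s)
{
    for (int i = 0; i < NBLK; i++) {
        if (s->dc_dirty[i]) {
            s->media[i] = s->disk_cache[i];
            s->dc_dirty[i] = false;
        }
    }
}

/* Stand-in for blkdev_fsync(): both steps, in order. */
void toy_fsync(struct toy_stack *s)
{
    toy_writeback(s);
    toy_flush(s);
}

/* Power loss: both caches are wiped, only the media survives. */
void toy_power_loss(struct toy_stack *s)
{
    memset(s->page_cache, 0, sizeof s->page_cache);
    memset(s->pc_dirty, 0, sizeof s->pc_dirty);
    memset(s->disk_cache, 0, sizeof s->disk_cache);
    memset(s->dc_dirty, 0, sizeof s->dc_dirty);
}
```

In this model, calling toy_writeback alone followed by toy_power_loss leaves the media unchanged, while toy_fsync followed by toy_power_loss keeps the written value.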

Illustration of what the FLUSH command does:

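About REQ_PREFLUSH
REQ_PREFLUSH is a request flag on a bio. It means that, before this I/O starts, every I/O that completed before it must already have been written to non-volatile storage.
REQ_PREFLUSH can also be set on an empty bio, in which case it only flushes the data held in the disk's volatile write cache.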

Explicit cache flushes (Documentation/block/writeback_cache_control.txt)
The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from the filesystem and will make sure the volatile cache of the storage device has been flushed before the actual I/O operation is started. This explicitly guarantees that previously completed write requests are on non-volatile storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be set on an otherwise empty bio structure, which causes only an explicit cache flush without any dependent I/O. It is recommended to use the blkdev_issue_flush() helper for a pure cache flush.

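REQ_FLUSH (renamed REQ_PREFLUSH in later kernels): flushes the data in the disk cache to the disk media so that it is not lost on power failure. REQ_FUA (force unit access): bypasses the disk cache and writes the data directly to the disk media.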
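The difference between the two flags can be sketched with a toy disk model. Everything here is hypothetical and for illustration only: TOY_PREFLUSH and TOY_FUA merely mimic REQ_PREFLUSH and REQ_FUA, and a negative block number stands for an empty bio that carries flags but no data.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLK 8
#define TOY_PREFLUSH 0x1u   /* models REQ_PREFLUSH */
#define TOY_FUA      0x2u   /* models REQ_FUA      */

struct toy_disk {
    int  cache[NBLK];   /* volatile write cache */
    bool dirty[NBLK];   /* cached but not yet on media */
    int  media[NBLK];   /* non-volatile media */
};

/* Push every dirty cached block to the media. */
void toy_disk_flush(struct toy_disk *d)
{
    for (int i = 0; i < NBLK; i++) {
        if (d->dirty[i]) {
            d->media[i] = d->cache[i];
            d->dirty[i] = false;
        }
    }
}

/* Submit a write; blk < 0 means an empty bio carrying only flags. */
void toy_submit_write(struct toy_disk *d, int blk, int val, unsigned flags)
{
    assert(blk < NBLK);

    /* PREFLUSH: earlier completed writes reach the media first. */
    if (flags & TOY_PREFLUSH)
        toy_disk_flush(d);

    if (blk < 0)
        return;                     /* pure cache flush, no payload */

    d->cache[blk] = val;
    if (flags & TOY_FUA) {
        d->media[blk] = val;        /* FUA: this write is on media at completion */
        d->dirty[blk] = false;
    } else {
        d->dirty[blk] = true;       /* ordinary write may sit in the cache */
    }
}
```

Note that in this model FUA only makes the flagged write durable, while PREFLUSH only covers writes that completed before the flagged one, which is why the two flags are often combined.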

Documentation/block/writeback_cache_control.txt:

==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches.  That means the devices signal I/O completion to the
operating system before data actually has hit the non-volatile storage.  This
behavior obviously speeds up various workloads, but it means the operating
system needs to force data out to the non-volatile storage when it performs
a data integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device.  These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.

Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.  This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O.  It is recommended to use
the blkdev_issue_flush() helper for a pure cache flush.

Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.

Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry if the underlying devices need any explicit cache flushing and how
the Forced Unit Access is implemented.  The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.

Implementation details for make_request_fn based block drivers
--------------------------------------------------------------

These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface.  For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_PREFLUSH bit set.  For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work.  Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.

Implementation details for request_fn based block drivers
---------------------------------------------------------

For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.  For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing::

	blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn.  Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer.  For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using::

	blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn.  If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
