arm64 Page Tables and Mapping Analysis
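- 1 Memory configuration of the Linux 6.10 Xilinx kernel
- 2 arm64 page tables for different page granules
- 2.1 Page table layout with a 4KB granule
- 2.2 Page table layout with a 16KB granule
- 2.3 Page table layout with a 64KB granule
- 3 Page table descriptors
- 3.1 Invalid descriptors
- 3.2 L0~L2 descriptors
- 3.3 L3 descriptors
- 4 Linux arm64 page table mapping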
- 4.1 __create_pgd_mapping
- 4.2 __create_pgd_mapping_locked
- 4.3 alloc_init_pud
- 4.4 alloc_init_cont_pmd
- 4.5 init_pmd
- 4.6 alloc_init_cont_pte
- 4.7 init_pte
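1 Memory configuration of the Linux 6.10 Xilinx kernel
The .config file contains the following settings.
The virtual and physical address widths are both configured as 48 bits, and the page granule is 4KB.
- The page size is set to 4KB
CONFIG_ARM64_4K_PAGES=y
- The virtual address width is 48 bits
CONFIG_ARM64_VA_BITS_48=y
CONFIG_ARM64_VA_BITS=48
- The physical address width is 48 bits
CONFIG_ARM64_PA_BITS_48=y
CONFIG_ARM64_PA_BITS=48
- Page table levels: configured as 4-level page tables
CONFIG_PGTABLE_LEVELS=4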
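As a rough cross-check of CONFIG_PGTABLE_LEVELS, the number of levels follows from the VA width and the page size: a table occupies one page of 8-byte descriptors, so each level resolves PAGE_SHIFT - 3 bits. The sketch below is a user-space illustration only; the helper name pgtable_levels is hypothetical and is not a kernel API.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel API): number of translation levels
 * needed to resolve va_bits of virtual address for a given page shift.
 * Each table is one page of 8-byte descriptors, so one level resolves
 * (page_shift - 3) bits; the low page_shift bits are the page offset.
 */
static inline unsigned int pgtable_levels(unsigned int va_bits,
                                          unsigned int page_shift)
{
    unsigned int bits_per_level, bits_to_resolve;

    assert(page_shift > 3 && va_bits > page_shift);
    bits_per_level = page_shift - 3;
    bits_to_resolve = va_bits - page_shift;

    /* Round up: a partially used top-level table still counts as a level. */
    return (bits_to_resolve + bits_per_level - 1) / bits_per_level;
}
```

With the configuration above, pgtable_levels(48, 12) evaluates to 4, which agrees with CONFIG_PGTABLE_LEVELS=4; the same formula gives 3 for 48-bit addresses with 64KB pages.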
2 arm64 page tables for different page granules
Three page granules are available: 4KB, 16KB, and 64KB.
2.1 Page table layout with a 4KB granule
When you use a 4kB granule size, the hardware can use a 4-level look up process. The 48-bit address has nine address bits per level translated, that is 512 entries each, with the final 12 bits selecting a byte within the 4kB coming directly from the original address.
Bits 47:39 of the Virtual Address index into the 512 entry L0 table. Each of these table entries spans a 512 GB range and points to an L1 table. Within that 512 entry L1 table, bits 38:30 are used as index to select an entry and each entry points to either a 1GB block or an L2 table. Bits 29:21 index into a 512 entry L2 table and each entry points to a 2MB block or next table level. At the last level, bits 20:12 index into a 512 entry L3 table and each entry points to a 4kB block.
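The bit ranges above can be sketched as a small user-space decoder. This is an illustration only: the struct va_4k type and the split_va_4k and join_va_4k helpers are hypothetical names, and only the low 48 bits of the address are considered.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of a 48-bit VA with a 4KB granule, as described above. */
struct va_4k {
    unsigned int l0;      /* bits 47:39, index into the L0 table */
    unsigned int l1;      /* bits 38:30, index into the L1 table */
    unsigned int l2;      /* bits 29:21, index into the L2 table */
    unsigned int l3;      /* bits 20:12, index into the L3 table */
    unsigned int offset;  /* bits 11:0, byte offset inside the 4KB page */
};

/* Split a virtual address into its four 9-bit indexes and page offset. */
static inline struct va_4k split_va_4k(uint64_t va)
{
    struct va_4k f;

    f.l0 = (va >> 39) & 0x1ff;
    f.l1 = (va >> 30) & 0x1ff;
    f.l2 = (va >> 21) & 0x1ff;
    f.l3 = (va >> 12) & 0x1ff;
    f.offset = va & 0xfff;
    return f;
}

/* Rebuild the low 48 bits of the VA; each index must fit in 9 bits. */
static inline uint64_t join_va_4k(struct va_4k f)
{
    assert(f.l0 < 512 && f.l1 < 512 && f.l2 < 512 && f.l3 < 512);
    assert(f.offset < 4096);
    return ((uint64_t)f.l0 << 39) | ((uint64_t)f.l1 << 30) |
           ((uint64_t)f.l2 << 21) | ((uint64_t)f.l3 << 12) | f.offset;
}
```

For example, the address 0x8080604567 splits into L0 index 1, L1 index 2, L2 index 3, L3 index 4, and page offset 0x567.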
2.2 Page table layout with a 16KB granule
When you use a 16kB granule size, the hardware can use a 4-level look up process. The 48-bit address has 11 address bits per level translated, that is 2048 entries each, with the final 14 bits selecting a byte within the 16kB coming directly from the original address. The level 0 table contains only two entries. Bit 47 of the Virtual Address selects a descriptor from the two entry L0 table. Each of these table entries spans a 128 TB range and points to an L1 table. Within that 2048 entry L1 table, bits 46:36 are used as an index to select an entry and each entry points to an L2 table. Bits 35:25 index into a 2048 entry L2 table and each entry points to a 32 MB block or next table level. At the final translation stage, bits 24:14 index into a 2048 entry L3 table and each entry points to a 16kB block.
2.3 Page table layout with a 64KB granule
When you use a 64kB granule size, the hardware can use a 3-level look up process. The level 1 table contains only 64 entries.
Bits 47:42 of the Virtual Address select a descriptor from the 64 entry L1 table. Each of these table entries spans a 4TB range and points to an L2 table. Within that 8192 entry L2 table, bits 41:29 are used as index to select an entry and each entry points to either a 512 MB block or an L3 table. At the final translation stage, bits 28:16 index into an 8192 entry L3 table and each entry points to a 64kB block.
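The three layouts follow one pattern, which the sketch below tries to capture for any of the granules. It is a user-space illustration under the assumption that level 3 is always the last level and that descriptors are 8 bytes; level_shift, level_entries, and level_index are hypothetical names rather than kernel or architecture APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Lowest VA bit resolved at a level. This is also log2 of the address
 * range that a single entry at that level spans.
 */
static inline unsigned int level_shift(unsigned int page_shift,
                                       unsigned int level)
{
    assert(level <= 3 && page_shift > 3);
    return (page_shift - 3) * (3 - level) + page_shift;
}

/* Number of entries used in the table at a level for a given VA width. */
static inline uint64_t level_entries(unsigned int va_bits,
                                     unsigned int page_shift,
                                     unsigned int level)
{
    unsigned int lo = level_shift(page_shift, level);
    unsigned int width = page_shift - 3;

    assert(va_bits > lo);  /* otherwise this level is not used at all */
    if (va_bits - lo < width)
        width = va_bits - lo;  /* the top-level table is only partly used */
    return (uint64_t)1 << width;
}

/* Table index that va selects at a level. */
static inline uint64_t level_index(uint64_t va, unsigned int va_bits,
                                   unsigned int page_shift,
                                   unsigned int level)
{
    uint64_t entries = level_entries(va_bits, page_shift, level);

    return (va >> level_shift(page_shift, level)) & (entries - 1);
}
```

For example, level_entries(48, 14, 0) is 2 and level_entries(48, 16, 1) is 64, matching the two-entry L0 table of the 16KB granule and the 64-entry L1 table of the 64KB granule.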
3 Page table descriptors
In the VMSAv8-64 translation table format, the difference in the formats of the level 0, level 1 and level 2 descriptors is:
- Whether a Block descriptor is permitted.
- If a Block descriptor is permitted, the size of the memory region described by that entry.
- The maximum OA size, depending on whether ARMv8.2-LPA is implemented.
These differences depend on the translation granule, as follows:
- 4KB granule
  - Level 0 translation tables do not support Block descriptors.
  - A Block descriptor:
    - In a level 1 table describes the mapping of the associated 1GB input address range.
    - In a level 2 table describes the mapping of the associated 2MB input address range.
  - The maximum OA size of a lookup is 48 bits.
- 16KB granule
  - Level 0 and level 1 translation tables do not support Block descriptors.
  - A Block descriptor in a level 2 table describes the mapping of the associated 32MB input address range.
  - The maximum OA size of a lookup is 48 bits.
- 64KB granule
  - Level 0 lookup is not supported.
  - If LPA is implemented (noted here as the default in ARMv8.7):
    - A Block descriptor:
      - In a level 1 table describes the mapping of the associated 4TB input address range.
      - In a level 2 table describes the mapping of the associated 512MB input address range.
    - The maximum OA size of a lookup is 52 bits.
  - If LPA is not implemented:
    - Level 1 translation tables do not support Block descriptors.
    - A Block descriptor in a level 2 table describes the mapping of the associated 512MB input address range.
    - The maximum OA size of a lookup is 48 bits.
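The per-granule rules above can be condensed into a small lookup. The sketch below is an illustration, not architecture-defined code: enum granule and block_shift are hypothetical names, it assumes that a 64KB level 1 Block descriptor is available only when LPA is implemented, and it ignores any later architecture extensions.

```c
#include <assert.h>
#include <stdbool.h>

/* Granules identified by their page shift (log2 of the page size). */
enum granule { GRANULE_4K = 12, GRANULE_16K = 14, GRANULE_64K = 16 };

/*
 * Returns log2 of the region that a Block descriptor maps at the given
 * level, or 0 if Block descriptors are not permitted at that level.
 * has_lpa selects the LPA case, which only matters for the 64KB granule.
 */
static inline unsigned int block_shift(enum granule g, unsigned int level,
                                       bool has_lpa)
{
    assert(level <= 2);  /* level 3 holds Page descriptors, not blocks */

    switch (g) {
    case GRANULE_4K:
        if (level == 1)
            return 30;              /* 1GB block */
        return level == 2 ? 21 : 0; /* 2MB block, none at level 0 */
    case GRANULE_16K:
        return level == 2 ? 25 : 0; /* 32MB block at level 2 only */
    case GRANULE_64K:
        if (level == 1)
            return has_lpa ? 42 : 0; /* 4TB block, assumed LPA only */
        return level == 2 ? 29 : 0;  /* 512MB block */
    }
    return 0;
}
```

A return value of 30 for block_shift(GRANULE_4K, 1, false) corresponds to the 1GB block that Linux tries to use at the PUD level with 4KB pages.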
3.1 Invalid descriptors
If bit 0 of a descriptor is 0, the descriptor is invalid. This applies to descriptors at every level, L0 through L3.
3.2 L0~L2 descriptors
For L0 through L2 descriptors, bit 1 determines whether the entry outputs a block address or the address of a next-level table:
- 0: a Block descriptor, which outputs a block address.
  - The descriptor gives the base address of a block of memory, and the attributes for that memory region.
- 1: a Table descriptor, which points to the next-level table.
  - The descriptor gives the address of the next level of translation table, and for a stage 1 translation, some attributes for that translation.
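Combining the valid bit with the type bit gives a three-way classification for L0 through L2 entries. The following is a minimal sketch with hypothetical names (enum desc_kind, classify_l0_l2); it only inspects bits [1:0] and does not check whether a Block descriptor is actually permitted at a given level and granule.

```c
#include <stdint.h>

#define DESC_VALID_BIT  ((uint64_t)1 << 0)  /* bit 0: entry is valid */
#define DESC_TABLE_BIT  ((uint64_t)1 << 1)  /* bit 1: table (1) or block (0) */

enum desc_kind { DESC_INVALID, DESC_BLOCK, DESC_TABLE };

/* Classify a level 0, 1 or 2 descriptor from its two low bits. */
static inline enum desc_kind classify_l0_l2(uint64_t desc)
{
    if (!(desc & DESC_VALID_BIT))
        return DESC_INVALID;  /* bit 0 clear: invalid regardless of bit 1 */
    return (desc & DESC_TABLE_BIT) ? DESC_TABLE : DESC_BLOCK;
}
```

In this encoding, a descriptor whose low two bits are 0b01 is a block entry, 0b11 is a table entry, and 0b00 or 0b10 is invalid.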
3.3 L3 descriptors
The format of L3 descriptors differs slightly depending on the configured page size:
- For the 4KB granule size, each entry in a level 3 table describes the mapping of the associated 4KB input address range.
- For the 16KB granule size, each entry in a level 3 table describes the mapping of the associated 16KB input address range.
- For the 64KB granule size, each entry in a level 3 table describes the mapping of the associated 64KB input address range.
Descriptor bit[1] identifies the descriptor type, and is encoded as:
- 0, Reserved, invalid
  - Behaves identically to encodings with bit[0] set to 0.
  - This encoding must not be used in level 3 translation tables.
- 1, Page
  - Gives the address and attributes of a 4KB, 16KB, or 64KB page of memory.
At this level, the only valid format is the Page descriptor. The Page descriptor gives the output address of a page of memory, as follows:
- 4KB translation granule
  - Bits[47:12] are bits[47:12] of the output address for a page of memory.
- 16KB translation granule
  - Bits[47:14] are bits[47:14] of the output address for a page of memory.
- 64KB translation granule
  - Bits[47:16] are bits[47:16] of the output address for a page of memory.
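To make the L3 encoding concrete, the sketch below checks for a Page descriptor and extracts its output address. It is a user-space illustration that assumes a 48-bit output address; l3_desc_is_page and l3_output_addr are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A level 3 entry is a valid Page descriptor only if bits [1:0] are 0b11. */
static inline bool l3_desc_is_page(uint64_t desc)
{
    return (desc & 0x3) == 0x3;
}

/*
 * Output address of a level 3 Page descriptor: bits [47:page_shift] of
 * the descriptor, where page_shift is 12, 14 or 16 for the 4KB, 16KB
 * and 64KB granules. Attribute bits above and below are masked off.
 */
static inline uint64_t l3_output_addr(uint64_t desc, unsigned int page_shift)
{
    uint64_t mask;

    assert(page_shift == 12 || page_shift == 14 || page_shift == 16);
    assert(l3_desc_is_page(desc));

    mask = (((uint64_t)1 << 48) - 1) & ~(((uint64_t)1 << page_shift) - 1);
    return desc & mask;
}
```

For the example descriptor value 0x0060000089ABC713, the output address is 0x89ABC000 with a 4KB granule and 0x89AB0000 with a 64KB granule.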
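4 Linux arm64 page table mapping
In Linux, arm64 page table mappings are created by the __create_pgd_mapping function. Linux names the page table levels PGD, PUD, PMD, and PTE, which correspond to arm64's L0, L1, L2, and L3 respectively.
The analysis below uses 4KB pages with 4-level page tables as the example.
4.1 __create_pgd_mapping
__create_pgd_mapping takes fixmap_lock and calls __create_pgd_mapping_locked, which does the actual page table mapping work.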
static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
                                 unsigned long virt, phys_addr_t size,
                                 pgprot_t prot,
                                 phys_addr_t (*pgtable_alloc)(int),
                                 int flags)
{
    mutex_lock(&fixmap_lock);
    __create_pgd_mapping_locked(pgdir, phys, virt, size, prot,
                                pgtable_alloc, flags);
    mutex_unlock(&fixmap_lock);
}
4.2 __create_pgd_mapping_locked
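- pgd_t *pgdir is the base address of the PGD table.
- pgd_t *pgdp = pgd_offset_pgd(pgdir, virt); gets a pointer to the PGD entry that covers virt.
- next = pgd_addr_end(addr, end); gets the end of the range covered by the current PGD entry, capped at end. Each PGD entry covers 512GB (2^39).
- alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc, flags); allocates and populates the PUD-level mappings under the current PGD entry.
- phys += next - addr; advances the physical address by the size just mapped, so that it matches the range of the next PGD entry.
- In while (pgdp++, addr = next, addr != end), addr = next moves to the start address of the range covered by the next PGD entry.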
static void __create_pgd_mapping_locked(pgd_t *pgdir, phys_addr_t phys,
                                        unsigned long virt, phys_addr_t size,
                                        pgprot_t prot,
                                        phys_addr_t (*pgtable_alloc)(int),
                                        int flags)
{
    unsigned long addr, end, next;
    pgd_t *pgdp = pgd_offset_pgd(pgdir, virt);

    /*
     * If the virtual and physical address don't have the same offset
     * within a page, we cannot map the region as the caller expects.
     */
    if (WARN_ON((phys ^ virt) & ~PAGE_MASK))
        return;

    phys &= PAGE_MASK;
    addr = virt & PAGE_MASK;
    end = PAGE_ALIGN(virt + size);

    do {
        next = pgd_addr_end(addr, end);
        alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc,
                       flags);
        phys += next - addr;
    } while (pgdp++, addr = next, addr != end);
}
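The way the do/while loop splits a range at PGD boundaries can be tried out in user space. The sketch below is a stand-in under stated assumptions: PGDIR_SHIFT is fixed at 39 (4KB pages, 4 levels), pgd_addr_end_sketch imitates the boundary computation of the kernel's pgd_addr_end, and count_pgd_steps is a hypothetical helper that only counts iterations instead of calling alloc_init_pud.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT 39  /* assumed: 4KB pages with 4 levels */
#define PGDIR_SIZE  ((uint64_t)1 << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))

/* Next PGD boundary after addr, capped at end. */
static inline uint64_t pgd_addr_end_sketch(uint64_t addr, uint64_t end)
{
    uint64_t boundary = (addr + PGDIR_SIZE) & PGDIR_MASK;

    /* The "- 1" keeps the comparison correct if a value wraps to 0. */
    return (boundary - 1 < end - 1) ? boundary : end;
}

/* Count how many PGD entries a loop of the same shape would visit. */
static inline unsigned int count_pgd_steps(uint64_t addr, uint64_t end)
{
    unsigned int steps = 0;
    uint64_t next;

    assert(addr != end);  /* the do/while shape needs a non-empty range */
    do {
        next = pgd_addr_end_sketch(addr, end);
        steps++;  /* alloc_init_pud() would handle [addr, next) here */
    } while (addr = next, addr != end);

    return steps;
}
```

A range from 511GB (0x7FC0000000) to 513GB (0x8040000000) crosses one 512GB boundary, so count_pgd_steps returns 2: one step ends at 0x8000000000 and the second ends at end.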
4.3 alloc_init_pud
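alloc_init_pud sets up the page table entries at the PUD (L1) level.
- p4d_none(p4d) checks whether the current PGD entry is empty (in a 4-level configuration the P4D level is folded into the PGD). If it is empty, the entry must be populated.
- pud_phys = pgtable_alloc(PUD_SHIFT); allocates a PUD table.
- __p4d_populate(p4dp, pud_phys, p4dval); writes the address of the newly allocated PUD table into the PGD entry.
- pudp = pud_set_fixmap_offset(p4dp, addr); gets the address of the PUD entry for addr.
- next = pud_addr_end(addr, end); gets the end of the range covered by the current PUD entry. Each PUD entry covers 1GB (2^30).
- pud_set_huge(pudp, phys, prot); if block mappings are supported and allowed, and addr, next, and phys are all 1GB aligned, the entry is written as a Block descriptor that maps a 1GB huge block.
- alloc_init_cont_pmd(pudp, addr, next, phys, prot, pgtable_alloc, flags); otherwise the range cannot be mapped as a 1GB block, so it is mapped through the next-level PMD table.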
static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
                           phys_addr_t phys, pgprot_t prot,
                           phys_addr_t (*pgtable_alloc)(int),
                           int flags)
{
    unsigned long next;
    pud_t *pudp;
    p4d_t *p4dp = p4d_offset(pgdp, addr);
    p4d_t p4d = READ_ONCE(*p4dp);

    if (p4d_none(p4d)) {
        p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_UXN;
        phys_addr_t pud_phys;

        if (flags & NO_EXEC_MAPPINGS)
            p4dval |= P4D_TABLE_PXN;
        BUG_ON(!pgtable_alloc);
        pud_phys = pgtable_alloc(PUD_SHIFT);
        __p4d_populate(p4dp, pud_phys, p4dval);
        p4d = READ_ONCE(*p4dp);
    }
    BUG_ON(p4d_bad(p4d));

    pudp = pud_set_fixmap_offset(p4dp, addr);
    do {
        pud_t old_pud = READ_ONCE(*pudp);

        next = pud_addr_end(addr, end);

        /*
         * For 4K granule only, attempt to put down a 1GB block
         */
        if (pud_sect_supported() &&
            ((addr | next | phys) & ~PUD_MASK) == 0 &&
            (flags & NO_BLOCK_MAPPINGS) == 0) {
            pud_set_huge(pudp, phys, prot);

            /*
             * After the PUD entry has been populated once, we
             * only allow updates to the permission attributes.
             */
            BUG_ON(!pgattr_change_is_safe(pud_val(old_pud),
                                          READ_ONCE(pud_val(*pudp))));
        } else {
            alloc_init_cont_pmd(pudp, addr, next, phys, prot,
                                pgtable_alloc, flags);

            BUG_ON(pud_val(old_pud) != 0 &&
                   pud_val(old_pud) != READ_ONCE(pud_val(*pudp)));
        }
        phys += next - addr;
    } while (pudp++, addr = next, addr != end);

    pud_clear_fixmap();
}
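The alignment test ((addr | next | phys) & ~PUD_MASK) == 0 decides whether a 1GB block can be used. The sketch below isolates that test in user space; block_aligned is a hypothetical helper, the shift values assume 4KB pages, and the pud_sect_supported() and NO_BLOCK_MAPPINGS conditions are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PUD_SHIFT 30  /* assumed: 1GB per PUD entry with 4KB pages */
#define PMD_SHIFT 21  /* assumed: 2MB per PMD entry with 4KB pages */

/*
 * OR-ing the three values together means a low bit set in any one of
 * them fails the test, so the virtual start, the virtual end and the
 * physical start must all be aligned to the block size (1 << shift).
 */
static inline bool block_aligned(uint64_t addr, uint64_t next, uint64_t phys,
                                 unsigned int shift)
{
    uint64_t low_bits;

    assert(shift < 64);
    low_bits = ((uint64_t)1 << shift) - 1;  /* same bits as ~PUD_MASK */
    return ((addr | next | phys) & low_bits) == 0;
}
```

Because next is either the next block boundary or the end of the request, an aligned addr together with an aligned next means the range covers exactly one whole block. A physical address that is only 2MB aligned fails the PUD_SHIFT test but can still pass the PMD_SHIFT test one level down.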
4.4 alloc_init_cont_pmd
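alloc_init_cont_pmd sets up the PMD table.
- pud_none(pud) checks whether the current PUD entry is empty. If it is, a PMD table is allocated and its address is written into the PUD entry.
- next = pmd_cont_addr_end(addr, end); gets the end of the current contiguous PMD range. The step here is CONT_PMD_SIZE, a group of contiguous PMD entries (16 x 2MB = 32MB with the default 4KB-page configuration), not a single 2MB PMD entry.
- init_pmd(pudp, addr, next, phys, __prot, pgtable_alloc, flags); maps the PMD entries for this range. If the range is aligned to CONT_PMD_SIZE and contiguous mappings are allowed, PTE_CONT is added to the protection bits first.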
static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
                                unsigned long end, phys_addr_t phys,
                                pgprot_t prot,
                                phys_addr_t (*pgtable_alloc)(int), int flags)
{
    unsigned long next;
    pud_t pud = READ_ONCE(*pudp);

    /*
     * Check for initial section mappings in the pgd/pud.
     */
    BUG_ON(pud_sect(pud));
    if (pud_none(pud)) {
        pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_UXN;
        phys_addr_t pmd_phys;

        if (flags & NO_EXEC_MAPPINGS)
            pudval |= PUD_TABLE_PXN;
        BUG_ON(!pgtable_alloc);
        pmd_phys = pgtable_alloc(PMD_SHIFT);
        __pud_populate(pudp, pmd_phys, pudval);
        pud = READ_ONCE(*pudp);
    }
    BUG_ON(pud_bad(pud));

    do {
        pgprot_t __prot = prot;

        next = pmd_cont_addr_end(addr, end);

        /* use a contiguous mapping if the range is suitably aligned */
        if ((((addr | next | phys) & ~CONT_PMD_MASK) == 0) &&
            (flags & NO_CONT_MAPPINGS) == 0)
            __prot = __pgprot(pgprot_val(prot) | PTE_CONT);

        init_pmd(pudp, addr, next, phys, __prot, pgtable_alloc, flags);

        phys += next - addr;
    } while (addr = next, addr != end);
}
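The chunking done by pmd_cont_addr_end can be simulated in user space. The sketch below assumes 4KB pages with a contiguous PMD group of 16 entries (CONT_PMD_SHIFT = PMD_SHIFT + 4, that is 32MB); this value is an assumption about the default configuration and should be checked against the actual kernel config. The helper names pmd_cont_addr_end_sketch and walk_cont_pmd are hypothetical, and the NO_CONT_MAPPINGS flag is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define PMD_SHIFT       21                   /* assumed: 4KB pages */
#define CONT_PMD_SHIFT  (PMD_SHIFT + 4)      /* assumed: 16 entries, 32MB */
#define CONT_PMD_SIZE   ((uint64_t)1 << CONT_PMD_SHIFT)
#define CONT_PMD_MASK   (~(CONT_PMD_SIZE - 1))

struct cont_stats {
    unsigned int chunks;       /* calls that init_pmd() would receive */
    unsigned int cont_chunks;  /* chunks that would get PTE_CONT */
};

/* Next CONT_PMD_SIZE boundary after addr, capped at end. */
static inline uint64_t pmd_cont_addr_end_sketch(uint64_t addr, uint64_t end)
{
    uint64_t boundary = (addr + CONT_PMD_SIZE) & CONT_PMD_MASK;

    return (boundary - 1 < end - 1) ? boundary : end;
}

/* Walk [addr, end) in the same shape as the loop in alloc_init_cont_pmd. */
static inline struct cont_stats walk_cont_pmd(uint64_t addr, uint64_t end,
                                              uint64_t phys)
{
    struct cont_stats s = {0, 0};
    uint64_t next;

    assert(addr != end);
    do {
        next = pmd_cont_addr_end_sketch(addr, end);
        /* contiguous hint only if the whole chunk is 32MB aligned */
        if (((addr | next | phys) & ~CONT_PMD_MASK) == 0)
            s.cont_chunks++;
        s.chunks++;
        phys += next - addr;
    } while (addr = next, addr != end);

    return s;
}
```

Mapping 16MB to 80MB with identical physical addresses produces three chunks, [16MB, 32MB), [32MB, 64MB) and [64MB, 80MB), of which only the middle one is fully aligned and would receive the contiguous hint.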
4.5 init_pmd
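init_pmd populates the PMD entries.
- pmdp = pmd_set_fixmap_offset(pudp, addr); gets the address of the PMD entry for addr.
- next = pmd_addr_end(addr, end); gets the end of the range covered by the current PMD entry. The granularity is 2MB (2^21).
- alloc_init_cont_pte(pmdp, addr, next, phys, prot, pgtable_alloc, flags); when the range cannot be mapped as a 2MB section with pmd_set_huge, allocates and maps the PTE table.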
static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
                     phys_addr_t phys, pgprot_t prot,
                     phys_addr_t (*pgtable_alloc)(int), int flags)
{
    unsigned long next;
    pmd_t *pmdp;

    pmdp = pmd_set_fixmap_offset(pudp, addr);
    do {
        pmd_t old_pmd = READ_ONCE(*pmdp);

        next = pmd_addr_end(addr, end);

        /* try section mapping first */
        if (((addr | next | phys) & ~PMD_MASK) == 0 &&
            (flags & NO_BLOCK_MAPPINGS) == 0) {
            pmd_set_huge(pmdp, phys, prot);

            /*
             * After the PMD entry has been populated once, we
             * only allow updates to the permission attributes.
             */
            BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd),
                                          READ_ONCE(pmd_val(*pmdp))));
        } else {
            alloc_init_cont_pte(pmdp, addr, next, phys, prot,
                                pgtable_alloc, flags);

            BUG_ON(pmd_val(old_pmd) != 0 &&
                   pmd_val(old_pmd) != READ_ONCE(pmd_val(*pmdp)));
        }
        phys += next - addr;
    } while (pmdp++, addr = next, addr != end);

    pmd_clear_fixmap();
}
4.6 alloc_init_cont_pte
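alloc_init_cont_pte sets up the mappings in the PTE table.
- pmd_none(pmd) checks whether the current PMD entry is empty. If it is, a PTE table is allocated and its address is written into the PMD entry.
- init_pte(pmdp, addr, next, phys, __prot) populates the PTE entries for the range.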
static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
                                unsigned long end, phys_addr_t phys,
                                pgprot_t prot,
                                phys_addr_t (*pgtable_alloc)(int),
                                int flags)
{
    unsigned long next;
    pmd_t pmd = READ_ONCE(*pmdp);

    BUG_ON(pmd_sect(pmd));
    if (pmd_none(pmd)) {
        pmdval_t pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN;
        phys_addr_t pte_phys;

        if (flags & NO_EXEC_MAPPINGS)
            pmdval |= PMD_TABLE_PXN;
        BUG_ON(!pgtable_alloc);
        pte_phys = pgtable_alloc(PAGE_SHIFT);
        __pmd_populate(pmdp, pte_phys, pmdval);
        pmd = READ_ONCE(*pmdp);
    }
    BUG_ON(pmd_bad(pmd));

    do {
        pgprot_t __prot = prot;

        next = pte_cont_addr_end(addr, end);

        /* use a contiguous mapping if the range is suitably aligned */
        if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) &&
            (flags & NO_CONT_MAPPINGS) == 0)
            __prot = __pgprot(pgprot_val(prot) | PTE_CONT);

        init_pte(pmdp, addr, next, phys, __prot);

        phys += next - addr;
    } while (addr = next, addr != end);
}
4.7 init_pte
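init_pte populates the individual PTE entries.
- ptep = pte_set_fixmap_offset(pmdp, addr); gets the address of the PTE entry for addr.
- set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot)); writes the PTE entry.
- phys += PAGE_SIZE; advances by one page on each iteration, moving on to the next memory page.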
static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
                     phys_addr_t phys, pgprot_t prot)
{
    pte_t *ptep;

    ptep = pte_set_fixmap_offset(pmdp, addr);
    do {
        pte_t old_pte = READ_ONCE(*ptep);

        set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));

        /*
         * After the PTE entry has been populated once, we
         * only allow updates to the permission attributes.
         */
        BUG_ON(!pgattr_change_is_safe(pte_val(old_pte),
                                      READ_ONCE(pte_val(*ptep))));

        phys += PAGE_SIZE;
    } while (ptep++, addr += PAGE_SIZE, addr != end);

    pte_clear_fixmap();
}
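Putting the levels together, the following user-space sketch estimates how a page-aligned request would be decomposed into 1GB blocks, 2MB blocks, and 4KB pages. It is a simplified model under stated assumptions, not the kernel implementation: it assumes 4KB pages, ignores the NO_BLOCK_MAPPINGS and NO_CONT_MAPPINGS flags, contiguous hints, and table allocation, and all names (struct map_stats, addr_end, aligned, simulate_mapping) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumed: 4KB pages */
#define PMD_SHIFT  21
#define PUD_SHIFT  30

struct map_stats {
    uint64_t pud_blocks;  /* 1GB blocks, where pud_set_huge() would run */
    uint64_t pmd_blocks;  /* 2MB blocks, where pmd_set_huge() would run */
    uint64_t ptes;        /* 4KB pages, written by init_pte() */
};

/* Next (1 << shift) boundary after addr, capped at end. */
static inline uint64_t addr_end(uint64_t addr, uint64_t end, unsigned int shift)
{
    uint64_t size = (uint64_t)1 << shift;
    uint64_t boundary = (addr + size) & ~(size - 1);

    return (boundary - 1 < end - 1) ? boundary : end;
}

static inline int aligned(uint64_t addr, uint64_t next, uint64_t phys,
                          unsigned int shift)
{
    return ((addr | next | phys) & (((uint64_t)1 << shift) - 1)) == 0;
}

/* virt, phys and size must be page aligned in this simplified model. */
static inline struct map_stats simulate_mapping(uint64_t virt, uint64_t phys,
                                                uint64_t size)
{
    struct map_stats s = {0, 0, 0};
    uint64_t addr = virt, end = virt + size;

    assert(size != 0);
    assert(((virt | phys | size) & ((1u << PAGE_SHIFT) - 1)) == 0);

    while (addr != end) {
        uint64_t pud_next = addr_end(addr, end, PUD_SHIFT);

        if (aligned(addr, pud_next, phys, PUD_SHIFT)) {
            s.pud_blocks++;
            phys += pud_next - addr;
            addr = pud_next;
            continue;
        }
        /* No 1GB block possible: descend to the PMD level. */
        while (addr != pud_next) {
            uint64_t pmd_next = addr_end(addr, pud_next, PMD_SHIFT);

            if (aligned(addr, pmd_next, phys, PMD_SHIFT))
                s.pmd_blocks++;
            else
                s.ptes += (pmd_next - addr) >> PAGE_SHIFT;
            phys += pmd_next - addr;
            addr = pmd_next;
        }
    }
    return s;
}
```

For an identity-style request starting at 0x3FE00000 with size 0x40402000, the model yields one 1GB block, two 2MB blocks, and two 4KB pages. Shifting the physical address by a single page (0x3FE01000) defeats every block alignment test, and the same range then needs 0x40402 individual 4KB pages.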