IL2CPP Optimizations: Avoid Boxing

In this final episode of our IL2CPP micro-optimization miniseries, we’ll explore the high cost of something called “boxing”, and we’ll see how IL2CPP can avoid it when it is done unnecessarily.

Heap allocations are slow

Like many programming languages, C# allows the memory for objects to be allocated on the stack (a small, “fast”, scope-specific, block of memory) or the heap (a large, “slow”, global block of memory). Usually allocating space for an object on the heap is much more expensive than allocating space on the stack. It also involves tracking that allocated memory in the garbage collector, which has an additional cost. So we try to avoid heap allocations where possible.

C# lets us do this by separating types into value types (which can be allocated on the stack) and reference types (which must be allocated on the heap). Types like int and float are value types; string and object are reference types. User-defined value types use the struct keyword. User-defined reference types use the class keyword. Note that a value type can never hold the value null. In C#, the null value can only be assigned to reference types. Keep this distinction in mind as we continue.
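
The difference between the two kinds of types shows up most clearly on assignment. The following is a minimal sketch; the names PointValue, PointReference, and ValueVersusReference are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical types, used only to illustrate the distinction.
struct PointValue { public int X; }      // value type: assignment copies the data
class PointReference { public int X; }   // reference type: assignment copies the reference

static class ValueVersusReference
{
    // Modifies a copy of a struct and returns the X of the original.
    public static int MutateStructCopy()
    {
        var original = new PointValue { X = 1 };
        var copy = original;   // an independent copy of the data
        copy.X = 42;
        return original.X;     // the original is unchanged
    }

    // Modifies a class instance through a second reference and returns the X of the original.
    public static int MutateClassAlias()
    {
        var original = new PointReference { X = 1 };
        var alias = original;  // both variables refer to the same heap object
        alias.X = 42;
        return original.X;     // the change is visible through both references
    }

    static void Main()
    {
        Console.WriteLine(MutateStructCopy());  // prints 1
        Console.WriteLine(MutateClassAlias());  // prints 42
        // PointValue p = null;  // does not compile: a struct can never be null
    }
}
```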

Being good performance citizens, we try to avoid heap allocations unless they are necessary. But sometimes we need to convert a value type on the stack into a reference type on the heap. This process is called boxing. Boxing:

  1. Allocates space on the heap

  2. Informs the garbage collector about the new object

  3. Copies the data from the value type object into the new reference type object

Ugh, let’s add boxing to our list of things to avoid!
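
The three steps listed above can be observed from C# itself. This is a small sketch, with the hypothetical class name BoxingDemo: converting an int to object boxes it, each conversion produces a separate heap object, and each box holds its own copy of the data.

```csharp
using System;

static class BoxingDemo
{
    // Boxing the same value twice yields two distinct heap objects.
    public static bool BoxesAreDistinctObjects()
    {
        int value = 7;
        object first = value;   // box: allocate on the heap and copy the int in
        object second = value;  // a second, separate box
        return !object.ReferenceEquals(first, second);
    }

    // The box holds a copy, so later changes to the original do not affect it.
    public static int BoxHoldsACopy()
    {
        int value = 7;
        object boxed = value;   // copies 7 into the new heap object
        value = 8;              // changes only the stack variable
        return (int)boxed;      // unboxing copies the stored value back out
    }

    static void Main()
    {
        Console.WriteLine(BoxesAreDistinctObjects());  // prints True
        Console.WriteLine(BoxHoldsACopy());            // prints 7
    }
}
```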

That pesky compiler

Suppose we are happily writing code, avoiding unnecessary heap allocations and boxing. Maybe we have some trees for our world, and each has a size which scales with its age:

interface HasSize {
    int CalculateSize();
}

struct Tree : HasSize {
    private int years;
    public Tree(int age) {
        years = age;
    }
    public int CalculateSize() {
        return years * 3;
    }
}

Elsewhere in our code, we have this convenient method to sum up the size of many things (including possibly Tree objects):

public static int TotalSize<T>(params T[] things) where T : HasSize
{
    var total = 0;
    for (var i = 0; i < things.Length; ++i)
        if (things[i] != null)
            total += things[i].CalculateSize();
    return total;
}

This looks safe enough, but let’s peer into a little bit of the Intermediate Language (IL) code that the C# compiler generates:

// This is the start of the for loop
// Load the array
IL_0009: ldarg.0
// Load the current index
IL_000a: ldloc.1
// Load the element at the current index
IL_000b: ldelem.any !!T
// What is this box call doing in here?!?
// (Hint: see the null check in the C# code)
IL_0010: box !!T
IL_0015: brfalse IL_002f
// Set up the arguments for the method and call it
IL_001a: ldloc.0
IL_001b: ldarg.0
IL_001c: ldloc.1
IL_001d: ldelema !!T
IL_0022: constrained. !!T
IL_0028: callvirt instance int32 Unity.IL2CPP.IntegrationTests.Tests.ValueTypeTests.ValueTypeTests/IHasSize::CalculateSize()
IL_002f: // Do the next loop iteration...

The C# compiler has implemented the if (things[i] != null) check using boxing! If the type T is already a reference type, then the box opcode is pretty cheap: it just returns the existing reference stored in the array element. But if type T is a value type (like our Tree type), then that box opcode is very costly. Of course, value types can never be null, so why do we need to implement the check in the first place? And what if we need to compute the size of one hundred Tree objects, or maybe one thousand Tree objects? That unnecessary boxing will quickly become very expensive.
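
One way to keep the null check out of the source altogether is to constrain T to value types. The sketch below is an assumption for illustration and not part of the original code: TotalSizeOfValues and the SizeMath class are hypothetical names. With a struct constraint there is nothing to compare against null, so the compiler has no reason to emit a box opcode for the loop body.

```csharp
using System;

interface HasSize { int CalculateSize(); }

struct Tree : HasSize
{
    private int years;
    public Tree(int age) { years = age; }
    public int CalculateSize() { return years * 3; }
}

static class SizeMath
{
    // Hypothetical variant of TotalSize restricted to value types.
    // No element can be null, so no null check (and no boxing for it) is needed.
    public static int TotalSizeOfValues<T>(params T[] things) where T : struct, HasSize
    {
        var total = 0;
        for (var i = 0; i < things.Length; ++i)
            total += things[i].CalculateSize();
        return total;
    }

    static void Main()
    {
        // Ages 1, 2 and 3 give sizes 3, 6 and 9.
        Console.WriteLine(TotalSizeOfValues(new Tree(1), new Tree(2), new Tree(3)));  // prints 18
    }
}
```

The trade-off is that this variant no longer accepts reference types, so it only fits code paths that are known to work with structs.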

The fastest code is anything you don’t execute

The C# compiler needs to provide a general implementation that works for any possible type T, so it is stuck with this slower code. But a compiler like IL2CPP can be a bit more aggressive: it generates the code that will be executed, and it doesn’t generate the code that won’t!

IL2CPP will create an implementation of the TotalSize<T> method specifically for the case where T is a Tree. The IL code above looks like this in the generated C++ code:

IL_0009:
// Load the array
TreeU5BU5D_t4162282477* L_0 = ___things0;
// Load the current index
int32_t L_1 = V_1;
NullCheck(L_0);
IL2CPP_ARRAY_BOUNDS_CHECK(L_0, L_1);
int32_t L_2 = L_1;
// Load the element at the current index
Tree_t1533456772 L_3 = (L_0)->GetAt(static_cast<il2cpp_array_size_t>(L_2));
// Look Ma, no box and no branch!
// Set up the arguments for the method and call it
int32_t L_4 = V_0;
TreeU5BU5D_t4162282477* L_5 = ___things0;
int32_t L_6 = V_1;
NullCheck(L_5);
IL2CPP_ARRAY_BOUNDS_CHECK(L_5, L_6);
int32_t L_7 = Tree_CalculateSize_m1657788316((Tree_t1533456772 *)(
    (L_5)->GetAddressAt(static_cast<il2cpp_array_size_t>(L_6))), /*hidden argument*/NULL);
// Do the next loop iteration...

IL2CPP recognized that the box opcode is unnecessary for a value type, because we can prove ahead of time that a value type object will never be null. In a tight loop, this removal of an unnecessary allocation and copy of data can have a significant positive impact on performance.
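
Conceptually, the specialized code behaves as if the method had been written by hand for Tree, with the null check dropped. The following sketch expresses that idea in C#; it is an illustration only, not actual IL2CPP output, and TotalSizeForTrees and the Specialization class are hypothetical names.

```csharp
using System;

interface HasSize { int CalculateSize(); }

struct Tree : HasSize
{
    private int years;
    public Tree(int age) { years = age; }
    public int CalculateSize() { return years * 3; }
}

static class Specialization
{
    // The generic method from the article, null check included.
    public static int TotalSize<T>(params T[] things) where T : HasSize
    {
        var total = 0;
        for (var i = 0; i < things.Length; ++i)
            if (things[i] != null)
                total += things[i].CalculateSize();
        return total;
    }

    // Hypothetical hand-written equivalent for T == Tree.
    // A Tree can never be null, so the check (and its box) disappears.
    public static int TotalSizeForTrees(params Tree[] things)
    {
        var total = 0;
        for (var i = 0; i < things.Length; ++i)
            total += things[i].CalculateSize();
        return total;
    }

    static void Main()
    {
        var trees = new[] { new Tree(1), new Tree(2), new Tree(3) };
        Console.WriteLine(TotalSize(trees));          // prints 18
        Console.WriteLine(TotalSizeForTrees(trees));  // prints 18
    }
}
```

Both methods return the same result for any array of Tree values; only the work done per element differs.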

Wrapping up

As with the other micro-optimizations discussed in this series, this one is a common optimization for .NET code generators. All of the scripting backends used by Unity currently perform this optimization for you, so you can get back to writing your code.

We hope you have enjoyed this miniseries about micro-optimizations. As we continue to improve the code generators and runtimes used by Unity, we’ll offer more insight into the micro-optimizations that go on behind the scenes.

Translated from: https://blogs.unity3d.com/2016/08/11/il2cpp-optimizations-avoid-boxing/
