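TNN's affine-transform implementation sits between OpenCV's and ncnn's. Its processing flow resembles OpenCV's, with some optimizations; the main difference is that it processes data 4 elements at a time, which is comparatively narrow. Performance is middling, and on small images it trails ncnn.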

MatUtils::WarpAffine

Call graph: MatUtils::WarpAffine → CheckSrcAndDstMat, ArmMatConverterAcc::WarpAffine

CheckSrcAndDstMat checks the device, data type, and size of the input and output.
ArmMatConverterAcc::WarpAffine is the implementation for ARM devices.

    auto ret = CheckSrcAndDstMat(src, dst, true, true, true);
    if (ret != TNN_OK) {
        return ret;
    }

    if (dst.GetData() == nullptr) {
        // set dst size to src size
        dst = Mat(dst.GetDeviceType(), dst.GetMatType(), src.GetDims());
    }

    MAT_CONVERTER_PREPARATION(src.GetDeviceType());
    return converter->WarpAffine(src, dst, param, command_queue);

ArmMatConverterAcc::WarpAffine

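CheckMatConverterParams checks that the input and output are non-empty and reside on the same device.
AFFINE_CHECK_RUN validates the parameters; currently only constant border filling is supported.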

    Status ret = TNN_OK;

    ret = CheckMatConverterParams(src, dst, true);
    if (ret != TNN_OK)
        return ret;

    int dst_width  = dst.GetWidth();
    int dst_height = dst.GetHeight();
    if (dst_width == 0 || dst_height == 0) {
        return Status(TNNERR_INVALID_INPUT, "dst size is zero");
    }

    if (src.GetMatType() == NGRAY) {
        AFFINE_CHECK_RUN(WarpAffineBilinearC1, WarpAffineNearestC1);
    } else if (src.GetMatType() == N8UC3) {
        AFFINE_CHECK_RUN(WarpAffineBilinearC3, WarpAffineNearestC3);
    } else if (src.GetMatType() == N8UC4) {
        AFFINE_CHECK_RUN(WarpAffineBilinearC4, WarpAffineNearestC4);
    } else if (src.GetMatType() == NNV21 || src.GetMatType() == NNV12) {
        AFFINE_CHECK_RUN(WarpAffineBilinearYUV420sp, WarpAffineNearestYUV420sp);
    } else {
        return Status(TNNERR_PARAM_ERR, "ArmMatConverterAcc::WarpAffine, convert type not support yet");
    }

    return ret;
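The macro body of AFFINE_CHECK_RUN is not shown here, so the following is only a sketch of what such a check-then-dispatch step might look like, written as a plain function rather than a macro. The names `Interp`, `Border`, `WarpParams`, and `CheckAndRun` are hypothetical stand-ins, not TNN identifiers; the one behaviour taken from the text above is that anything other than constant border filling is rejected.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-ins for the library's parameter types (illustrative only).
enum class Interp { Nearest, Linear };
enum class Border { Constant, Reflect, Edge };

struct WarpParams {
    Interp interp = Interp::Linear;
    Border border = Border::Constant;
};

// Validates the parameters, then runs the matching kernel.
// Returns an empty string on success, or an error message on failure.
std::string CheckAndRun(const WarpParams& p,
                        const std::function<void()>& bilinear,
                        const std::function<void()>& nearest) {
    assert(bilinear && nearest);  // both kernels must be callable

    // Only constant border filling is accepted, as described above.
    if (p.border != Border::Constant) {
        return "border type not supported yet";
    }
    switch (p.interp) {
        case Interp::Linear:
            bilinear();
            return "";
        case Interp::Nearest:
            nearest();
            return "";
    }
    return "interp type not supported yet";
}
```

Passing the two kernels as callables mirrors how the macro receives a bilinear and a nearest function per pixel format, so one validation path can serve C1, C3, C4, and YUV420sp alike.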

WarpAffineBilinearC1

Call graph: WarpAffineBilinearC1 → WarpAffineBilinear

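From this point on, the code no longer depends on TNN-defined structs, which makes it easy to port.
It calls the template function WarpAffineBilinear.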

    WarpAffineBilinear<1>(src, batch, src_w, src_h, dst, dst_w, dst_h, transform, border_val);
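To make the role of the channel-count template parameter concrete, here is a minimal scalar reference for a constant-border bilinear warp. It is a sketch, not TNN's kernel: the function name `WarpAffineBilinearRef` is hypothetical, it handles a single image (no batch), it uses plain floating-point arithmetic, and it assumes the six coefficients in `inv` already map a destination pixel to source coordinates. The actual implementation is vectorized and may round differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Scalar reference: constant-border bilinear warp over C interleaved channels.
// ASSUMPTION: `inv` maps a destination pixel (x, y) to source coordinates:
//   sx = inv[0]*x + inv[1]*y + inv[2]
//   sy = inv[3]*x + inv[4]*y + inv[5]
template <int C>
void WarpAffineBilinearRef(const uint8_t* src, int src_w, int src_h,
                           uint8_t* dst, int dst_w, int dst_h,
                           const float inv[6], uint8_t border_val) {
    assert(src != nullptr && dst != nullptr);

    // Reads one channel, substituting the border value outside the source.
    auto at = [&](int x, int y, int c) -> float {
        if (x < 0 || x >= src_w || y < 0 || y >= src_h) {
            return static_cast<float>(border_val);
        }
        return static_cast<float>(src[(y * src_w + x) * C + c]);
    };

    for (int y = 0; y < dst_h; ++y) {
        for (int x = 0; x < dst_w; ++x) {
            const float sx = inv[0] * x + inv[1] * y + inv[2];
            const float sy = inv[3] * x + inv[4] * y + inv[5];
            const int x0 = static_cast<int>(std::floor(sx));
            const int y0 = static_cast<int>(std::floor(sy));
            const float fx = sx - x0;  // horizontal weight of the right neighbours
            const float fy = sy - y0;  // vertical weight of the bottom neighbours

            for (int c = 0; c < C; ++c) {
                const float top = at(x0, y0, c) * (1 - fx) + at(x0 + 1, y0, c) * fx;
                const float bot = at(x0, y0 + 1, c) * (1 - fx) + at(x0 + 1, y0 + 1, c) * fx;
                const float v = top * (1 - fy) + bot * fy;
                dst[(y * dst_w + x) * C + c] = static_cast<uint8_t>(std::lround(v));
            }
        }
    }
}
```

Instantiating it as `WarpAffineBilinearRef<1>` corresponds to the single-channel (gray) case; `<3>` and `<4>` would cover interleaved 3- and 4-channel images with the same loop body.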

WarpAffineBilinear

text{fill:#9370db;fill:#333;stroke:none;font-size:10px}#mermaid-svg-fOBg9bYftcPVXKsR g.statediagram-cluster .cluster-label text{fill:#333}#mermaid-svg-fOBg9bYftcPVXKsR g.stateGroup .state-title{font-weight:bolder;fill:#000}#mermaid-svg-fOBg9bYftcPVXKsR g.stateGroup rect{fill:#ECECFF;stroke:#9370db}#mermaid-svg-fOBg9bYftcPVXKsR g.stateGroup line{stroke:#9370db;stroke-width:1}#mermaid-svg-fOBg9bYftcPVXKsR .transition{stroke:#9370db;stroke-width:1;fill:none}#mermaid-svg-fOBg9bYftcPVXKsR .stateGroup .composit{fill:white;border-bottom:1px}#mermaid-svg-fOBg9bYftcPVXKsR .stateGroup .alt-composit{fill:#e0e0e0;border-bottom:1px}#mermaid-svg-fOBg9bYftcPVXKsR .state-note{stroke:#aa3;fill:#fff5ad}#mermaid-svg-fOBg9bYftcPVXKsR .state-note text{fill:black;stroke:none;font-size:10px}#mermaid-svg-fOBg9bYftcPVXKsR .stateLabel .box{stroke:none;stroke-width:0;fill:#ECECFF;opacity:0.7}#mermaid-svg-fOBg9bYftcPVXKsR .edgeLabel text{fill:#333}#mermaid-svg-fOBg9bYftcPVXKsR .stateLabel text{fill:#000;font-size:10px;font-weight:bold;font-family:'trebuchet ms', verdana, arial;font-family:var(--mermaid-font-family)}#mermaid-svg-fOBg9bYftcPVXKsR .node circle.state-start{fill:black;stroke:black}#mermaid-svg-fOBg9bYftcPVXKsR .node circle.state-end{fill:black;stroke:white;stroke-width:1.5}#mermaid-svg-fOBg9bYftcPVXKsR #statediagram-barbEnd{fill:#9370db}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-cluster rect{fill:#ECECFF;stroke:#9370db;stroke-width:1px}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-cluster rect.outer{rx:5px;ry:5px}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-state .divider{stroke:#9370db}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-state .title-state{rx:5px;ry:5px}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-cluster.statediagram-cluster .inner{fill:white}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-cluster.statediagram-cluster-alt .inner{fill:#e0e0e0}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-cluster .inner{rx:0;ry:0}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-state 
rect.basic{rx:5px;ry:5px}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-state rect.divider{stroke-dasharray:10,10;fill:#efefef}#mermaid-svg-fOBg9bYftcPVXKsR .note-edge{stroke-dasharray:5}#mermaid-svg-fOBg9bYftcPVXKsR .statediagram-note rect{fill:#fff5ad;stroke:#aa3;stroke-width:1px;rx:0;ry:0}:root{--mermaid-font-family: '"trebuchet ms", verdana, arial';--mermaid-font-family: "Comic Sans MS", "Comic Sans", cursive}#mermaid-svg-fOBg9bYftcPVXKsR .error-icon{fill:#522}#mermaid-svg-fOBg9bYftcPVXKsR .error-text{fill:#522;stroke:#522}#mermaid-svg-fOBg9bYftcPVXKsR .edge-thickness-normal{stroke-width:2px}#mermaid-svg-fOBg9bYftcPVXKsR .edge-thickness-thick{stroke-width:3.5px}#mermaid-svg-fOBg9bYftcPVXKsR .edge-pattern-solid{stroke-dasharray:0}#mermaid-svg-fOBg9bYftcPVXKsR .edge-pattern-dashed{stroke-dasharray:3}#mermaid-svg-fOBg9bYftcPVXKsR .edge-pattern-dotted{stroke-dasharray:2}#mermaid-svg-fOBg9bYftcPVXKsR .marker{fill:#333}#mermaid-svg-fOBg9bYftcPVXKsR .marker.cross{stroke:#333}:root { --mermaid-font-family: "trebuchet ms", verdana, arial;}#mermaid-svg-fOBg9bYftcPVXKsR {color: rgba(0, 0, 0, 0.75);font: ;}

WarpAffineBilinear

Call graph: WarpAffineBilinear calls WarpAffineInit, WarpAffinePrepareOneRow, and WarpAffineCalculateOneRow.

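WarpAffineInit allocates buffer and precomputes adelta and bdelta.
buf_loc and tab_loc are per-thread scratch arrays: buf_loc holds source offsets and tab_loc holds interpolation-table indices.
src2 points to the second row of the source image.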

    int src_plane = src_h * src_w * schannel;

    int* buffer = nullptr;
    WarpAffineInit(dst, batch, dst_w, dst_h, schannel, border_val, transform, &buffer);
    int* adelta = buffer;
    int* bdelta = buffer + dst_w * 2;

    int max_num_threads = OMP_MAX_THREADS_NUM_;
    int* buf_loc        = new int[dst_w * max_num_threads];
    short* tab_loc      = new short[dst_w * max_num_threads];

    const unsigned char* src2 = src + src_w * schannel;

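dst_loc_base is the offset of the current row in dst.
buf_loc_t and tab_loc_t point to the scratch memory reserved for the current thread.
WarpAffinePrepareOneRow computes the mapping, i.e. it generates the map.
WarpAffineCalculateOneRow produces the result from the source pixels.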

    OMP_PARALLEL_FOR_
    for (int y = 0; y < dst_h * batch; ++y) {
        int thread_id    = OMP_TID_;
        int x_count      = 0;
        int end_x        = 0;
        int dst_loc_base = y * dst_w * schannel;
        int* buf_loc_t   = buf_loc + thread_id * dst_w;
        short* tab_loc_t = tab_loc + thread_id * dst_w;

        WarpAffinePrepareOneRow(buf_loc_t, tab_loc_t, adelta, bdelta, schannel, src, src_w, src_h,
                                dst + dst_loc_base, dst_w, y % dst_h, (y / dst_h) * src_plane, x_count, end_x, border_val);
        WarpAffineCalculateOneRow(end_x - x_count + 1, end_x, schannel, dst_loc_base, buf_loc_t, tab_loc_t, src, src2, dst);
    }

    delete[] buf_loc;
    delete[] tab_loc;

    free(buffer);
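
The loop index y above runs over the rows of every image in the batch, so each iteration has to recover which image and which row it is working on. A minimal sketch of that decomposition follows; RowTask and MakeRowTask are hypothetical names used for illustration and are not part of TNN.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of TNN: describes the work of one loop iteration.
struct RowTask {
    int row;                 // y % dst_h: row inside the current destination image
    std::size_t src_offset;  // (y / dst_h) * src_plane: start of the matching source image
    std::size_t dst_offset;  // y * dst_w * channel: start of the destination row
};

// Splits the flattened index y (0 <= y < dst_h * batch) into per-image values.
RowTask MakeRowTask(int y, int dst_w, int dst_h, int src_w, int src_h, int channel) {
    assert(y >= 0 && dst_w > 0 && dst_h > 0 && src_w > 0 && src_h > 0 && channel > 0);
    const std::size_t src_plane = static_cast<std::size_t>(src_h) * src_w * channel;
    RowTask task;
    task.row        = y % dst_h;
    task.src_offset = static_cast<std::size_t>(y / dst_h) * src_plane;
    task.dst_offset = static_cast<std::size_t>(y) * dst_w * channel;
    return task;
}
```

With dst_h = 3, the index y = 5 is row 2 of the second image, so the source offset advances by exactly one source plane.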

WarpAffineInit

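The whole destination is first filled with the border value, which is a fairly brute-force approach.
InitInterTab2D builds the interpolation table.
WarpAffineMatrixInverse inverts the transform matrix.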

    uint8_t border_ival = (uint8_t)border_val;
    memset(dst, border_ival, batch * dst_h * dst_w * channel);

    // Init LookUp Table
    InitInterTab2D();

    double m[6];
    WarpAffineMatrixInverse(transform, m);

The per-column and per-row terms of the transform are precomputed, each scaled by 1024: adelta holds $M_{11}x$ and $M_{21}x$, and bdelta holds $M_{12}y + M_{13}$ and $M_{22}y + M_{23}$.

    *buffer = reinterpret_cast<int*>(armMalloc((dst_w + dst_h) * 2 * sizeof(int)));

    int* adelta = *buffer;
    int* bdelta = *buffer + dst_w * 2;

    for (int x = 0; x < dst_w; x++) {
        *adelta++ = SATURATE_CAST_INT(m[0] * x * 1024);
        *adelta++ = SATURATE_CAST_INT(m[3] * x * 1024);
    }

    for (int y = 0; y < dst_h; y++) {
        *bdelta++ = SATURATE_CAST_INT((m[1] * y + m[2]) * 1024);
        *bdelta++ = SATURATE_CAST_INT((m[4] * y + m[5]) * 1024);
    }
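
To see why the split into adelta and bdelta works, the following sketch rebuilds both tables and recovers a source coordinate with one addition and one shift. It is an illustration under assumptions: DeltaTables, BuildDeltaTables, SourceX, and SourceY are hypothetical names, the tables live in two vectors rather than one buffer, and std::lrint stands in for SATURATE_CAST_INT.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical container; TNN keeps both tables in a single int buffer.
struct DeltaTables {
    std::vector<int> adelta;  // per column x: {m[0]*x, m[3]*x}, scaled by 1024
    std::vector<int> bdelta;  // per row y: {m[1]*y + m[2], m[4]*y + m[5]}, scaled by 1024
};

// m is the inverse affine matrix, row-major 2x3.
DeltaTables BuildDeltaTables(const double m[6], int dst_w, int dst_h) {
    assert(dst_w > 0 && dst_h > 0);
    DeltaTables t;
    t.adelta.reserve(2 * dst_w);
    t.bdelta.reserve(2 * dst_h);
    for (int x = 0; x < dst_w; ++x) {
        t.adelta.push_back(static_cast<int>(std::lrint(m[0] * x * 1024)));
        t.adelta.push_back(static_cast<int>(std::lrint(m[3] * x * 1024)));
    }
    for (int y = 0; y < dst_h; ++y) {
        t.bdelta.push_back(static_cast<int>(std::lrint((m[1] * y + m[2]) * 1024)));
        t.bdelta.push_back(static_cast<int>(std::lrint((m[4] * y + m[5]) * 1024)));
    }
    return t;
}

// Integer source x for destination pixel (x, y); 16 is the rounding offset used later.
int SourceX(const DeltaTables& t, int x, int y) {
    return (t.adelta[2 * x] + t.bdelta[2 * y] + 16) >> 10;
}

// Integer source y for destination pixel (x, y).
int SourceY(const DeltaTables& t, int x, int y) {
    return (t.adelta[2 * x + 1] + t.bdelta[2 * y + 1] + 16) >> 10;
}
```

For a pure translation by 2.5 pixels in x, destination column 3 maps to source x = 5 (5.5 truncated), and the table cost is dst_w + dst_h pairs instead of one entry per pixel.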

InitInterTab2D

    static bool inited = false;
    if (inited) {
        return;
    }

    short* itab = BilinearTab_i[0][0];
    int ksize = KSIZE;

    float* _tab = new float[2 * INTER_TAB_SIZE];
    int i, j, k1, k2;
    InitInterTab1D(_tab, INTER_TAB_SIZE);
    for (i = 0; i < INTER_TAB_SIZE; i++) {
        for (j = 0; j < INTER_TAB_SIZE; j++, itab += ksize * ksize) {
            int isum = 0;

            for (k1 = 0; k1 < ksize; k1++) {
                float vy = _tab[i * ksize + k1];
                for (k2 = 0; k2 < ksize; k2++) {
                    float v                       = vy * _tab[j * ksize + k2];
                    isum += itab[k1 * ksize + k2] = SATURATE_CAST_SHORT(v * INTER_REMAP_COEF_SCALE);
                }
            }

            if (isum != INTER_REMAP_COEF_SCALE) {
                int diff   = isum - INTER_REMAP_COEF_SCALE;
                int ksize2 = ksize / 2, Mk1 = ksize2, Mk2 = ksize2, mk1 = ksize2, mk2 = ksize2;
                for (k1 = ksize2; k1 < ksize2 + 2; k1++)
                    for (k2 = ksize2; k2 < ksize2 + 2; k2++) {
                        if (itab[k1 * ksize + k2] < itab[mk1 * ksize + mk2])
                            mk1 = k1, mk2 = k2;
                        else if (itab[k1 * ksize + k2] > itab[Mk1 * ksize + Mk2])
                            Mk1 = k1, Mk2 = k2;
                    }
                if (diff < 0)
                    itab[Mk1 * ksize + Mk2] = (short)(itab[Mk1 * ksize + Mk2] - diff);
                else
                    itab[mk1 * ksize + mk2] = (short)(itab[mk1 * ksize + mk2] - diff);
            }
        }
    }

    delete[] _tab;
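
The idea behind the table can be shown with a simplified sketch: for every pair of fractional offsets it stores four integer weights whose sum is a fixed power of two. The constants 32 and 1 << 15 are assumptions inferred from the masks and shifts used later in this article, all names are hypothetical, and the residual is simply folded into the last weight, which is a simplification of the correction step above rather than a copy of it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

constexpr int kTabSize   = 32;       // assumed value of INTER_TAB_SIZE
constexpr int kCoefScale = 1 << 15;  // assumed value of INTER_REMAP_COEF_SCALE

using WeightEntry = std::array<short, 4>;  // top-left, top-right, bottom-left, bottom-right
using WeightTable = std::array<WeightEntry, kTabSize * kTabSize>;

short SaturateToShort(long v) {
    return static_cast<short>(std::min(32767L, std::max(-32768L, v)));
}

int EntrySum(const WeightEntry& e) {
    return e[0] + e[1] + e[2] + e[3];
}

// Entry index is fx + fy * 32, where fx and fy are the fractions in units of 1/32.
WeightTable BuildBilinearTable() {
    WeightTable table{};
    for (int fy = 0; fy < kTabSize; ++fy) {
        for (int fx = 0; fx < kTabSize; ++fx) {
            const float wx = static_cast<float>(fx) / kTabSize;
            const float wy = static_cast<float>(fy) / kTabSize;
            const float w[4] = {(1 - wy) * (1 - wx), (1 - wy) * wx, wy * (1 - wx), wy * wx};
            WeightEntry& entry = table[fy * kTabSize + fx];
            int sum = 0;
            for (int k = 0; k < 4; ++k) {
                entry[k] = SaturateToShort(std::lrint(w[k] * kCoefScale));
                sum += entry[k];
            }
            // Simplified correction: fold the residual into the last weight.
            entry[3] = static_cast<short>(entry[3] + (kCoefScale - sum));
            assert(EntrySum(entry) == kCoefScale);
        }
    }
    return table;
}
```

Only the entry for zero fractions needs a correction in this sketch, because a weight of exactly 1.0 scales to 32768 and saturates to 32767; every other weight is an exact multiple of 32.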

WarpAffineMatrixInverse

$$
M = \begin{bmatrix} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ m_6 & m_7 & m_8 \end{bmatrix}
= \begin{bmatrix} \begin{array}{c c|c} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ \hline m_6 & m_7 & m_8 \end{array} \end{bmatrix}
= \begin{bmatrix} A & B \\ O & D \end{bmatrix}
$$
This is a block upper-triangular matrix, and for an affine transform the last row is $[0\ 0\ 1]$, so $O = 0$ and $D = I$:
$$
M^{-1} = \begin{bmatrix} A^{-1} & -A^{-1}BD^{-1} \\ O & D^{-1} \end{bmatrix}
= \begin{bmatrix} A^{-1} & -A^{-1}B \\ O & I \end{bmatrix}
$$
where, with $A_{ij}$ denoting the cofactor of element $(i, j)$ of $A$,
$$
\begin{aligned}
A^{-1} &= \frac{1}{\mathrm{det}(A)}\mathrm{adj}(A)\\
&= \frac{1}{m_0 m_4 - m_1 m_3}\begin{bmatrix} A_{11} & A_{21} \\ A_{12} & A_{22} \end{bmatrix}\\
&= \frac{1}{m_0 m_4 - m_1 m_3}\begin{bmatrix} m_4 & -m_1 \\ -m_3 & m_0 \end{bmatrix}
\end{aligned}
$$

    double M[6];
    M[0] = transform[0][0];
    M[1] = transform[0][1];
    M[2] = transform[0][2];
    M[3] = transform[1][0];
    M[4] = transform[1][1];
    M[5] = transform[1][2];

    // Inverse transform matrix
    double D   = M[0] * M[4] - M[1] * M[3];
    D          = D != 0 ? 1. / D : 0;
    double A11 = M[4] * D, A22 = M[0] * D;
    inverse[0]      = A11;
    inverse[1]      = M[1] * (-D);
    inverse[3]      = M[3] * (-D);
    inverse[4]      = A22;
    double b1 = -A11        * M[2] - inverse[1] * M[5];
    double b2 = -inverse[3] * M[2] - A22        * M[5];
    inverse[2]      = b1;
    inverse[5]      = b2;
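
The block formula can be checked numerically with a small standalone sketch. InvertAffine and ApplyAffine are hypothetical names; unlike the code above, this version reports a singular matrix by returning false instead of continuing with D = 0.

```cpp
#include <cassert>
#include <cmath>

// Inverts a 2x3 affine matrix stored row-major as {m0, m1, m2, m3, m4, m5}.
// Returns false and leaves inv untouched when the 2x2 part A is singular.
bool InvertAffine(const double m[6], double inv[6]) {
    assert(m != nullptr && inv != nullptr);
    const double det = m[0] * m[4] - m[1] * m[3];
    if (det == 0.0) {
        return false;
    }
    const double d = 1.0 / det;
    // A^{-1} = adj(A) / det(A)
    inv[0] = m[4] * d;
    inv[1] = -m[1] * d;
    inv[3] = -m[3] * d;
    inv[4] = m[0] * d;
    // Translation part: -A^{-1} * B
    inv[2] = -(inv[0] * m[2] + inv[1] * m[5]);
    inv[5] = -(inv[3] * m[2] + inv[4] * m[5]);
    return true;
}

// Applies a 2x3 affine matrix to the point (x, y).
void ApplyAffine(const double m[6], double x, double y, double* out_x, double* out_y) {
    *out_x = m[0] * x + m[1] * y + m[2];
    *out_y = m[3] * x + m[4] * y + m[5];
}
```

Mapping a point with the forward matrix and then with the inverse should return the original point, which is a quick way to validate the signs of the translation terms.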

WarpAffinePrepareOneRow

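src2 is not passed in; it is computed again here.
vld1q_s32 loads 4 elements from memory into a register.
_bdelta0 loads the current row's $M_{12}y + M_{13}$ and $M_{22}y + M_{23}$; _bdelta is two copies of it concatenated.
_src_w holds the stride along each dimension: 1 for x and src_w for y.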

    const unsigned char* src2 = src + src_w * channel;

    short xy_loc_buf[dst_w * 2];
    short tb_loc_buf[dst_w];
    int   sc_loc_buf[dst_w];
    int*   adelta_p     = adelta;
    short* xy_loc_buf_p = xy_loc_buf;
    short* tb_loc_buf_p = tb_loc_buf;
    int*   sc_loc_buf_p = sc_loc_buf;
    int x = 0;
#ifdef TNN_USE_NEON
    int32x2_t _bdelta0 = vld1_s32(bdelta + 2 * y);
    int32x4_t _bdelta  = vcombine_s32(_bdelta0, _bdelta0);
    int32x4_t _offset  = vdupq_n_s32(16);
    int16x8_t _mask    = vdupq_n_s16(31);
    int16x8_t _coeff   = {1,32,1,32,1,32,1,32};
    int32x4_t _channel = vdupq_n_s32(channel);
    int32x4_t _soffset = vdupq_n_s32(src_offset);
    int16x4_t _src_w   = {1, (short)src_w,1,(short)src_w};

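Each iteration handles 4 destination pixels.
vaddq_s32 adds four 32-bit integers lane by lane.
_xyxy0 holds the source coordinates $(M_{11}x + M_{12}y + M_{13},\ M_{21}x + M_{22}y + M_{23})$ of the first two destination pixels, and _xyxy1 those of the last two. In ncnn, x and y are kept separate.
After a right shift by 10 bits to undo the scaling, _xyxy0s and _xyxy1s hold the actual integer coordinates.
vcombine_s16 joins two 64-bit vectors into one 128-bit vector.
_xyxy01s holds the 4 coordinate pairs.
vst1q_s16 stores 8 elements to memory.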

    for (; x < dst_w>>2<<2; x += 4) {
        int32x4_t _xyxy0   = vaddq_s32(vld1q_s32(adelta_p), _offset);
        int32x4_t _xyxy1   = vaddq_s32(vld1q_s32(adelta_p + 4), _offset);
        _xyxy0             = vaddq_s32(_xyxy0, _bdelta);
        _xyxy1             = vaddq_s32(_xyxy1, _bdelta);
        int16x4_t _xyxy0s  = vshrn_n_s32(_xyxy0, 10);
        //int16x8_t _xyxy01s = vshrn_high_n_s32(_xyxy0s, _xyxy1, 10);
        int16x4_t _xyxy1s  = vshrn_n_s32(_xyxy1, 10);
        int16x8_t _xyxy01s = vcombine_s16(_xyxy0s, _xyxy1s);
        vst1q_s16(xy_loc_buf_p, _xyxy01s);

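vmull_s16 is a widening vector multiply: 16-bit lanes produce 32-bit products.
_src_0 holds the two offset components of each pixel, x and y*w.
vmulq_s32 multiplies 32-bit lanes; here it scales the summed offsets by the channel count.
vst1_s16 stores the result to memory.
sc_loc_buf_p receives the offsets into the source image for the 4 destination pixels.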

        int32x4_t _src_0   = vmull_s16(_xyxy0s, _src_w);
        int32x4_t _src_1   = vmull_s16(vget_high_s16(_xyxy01s), _src_w);
        vst1q_s32(sc_loc_buf_p, vaddq_s32(vmulq_s32(_channel, VPADDQ_S32(_src_0, _src_1)), _soffset));

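Shifting right by fewer bits keeps part of the fraction: a shift by 5 instead of 10 leaves 5 fractional bits, i.e. a resolution of 1/32.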

        _xyxy0s            = vshrn_n_s32(_xyxy0, 5);
        //_xyxy01s           = vshrn_high_n_s32(_xyxy0s, _xyxy1, 5);
        _xyxy1s            = vshrn_n_s32(_xyxy1, 5);
        _xyxy01s           = vcombine_s16(_xyxy0s, _xyxy1s);
        int16x8_t _tab_xys = vmulq_s16(vandq_s16(_xyxy01s, _mask), _coeff);
        vst1_s16(tb_loc_buf_p, vpadd_s16(vget_low_s16(_tab_xys), vget_high_s16(_tab_xys)));
        adelta_p     += 8;
        xy_loc_buf_p += 8;
        tb_loc_buf_p += 4;
        sc_loc_buf_p += 4;
    }
    if (dst_w % 4) {
        x -= 4;
    }
#endif

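The scalar loop handles the unaligned tail of the row.
new_x is $M_{11}x + M_{12}y + M_{13}$ and new_y is $M_{21}x + M_{22}y + M_{23}$.
Because adelta and bdelta were multiplied by 1024, new_x_loc and new_y_loc after the right shift are the corresponding source-image coordinates.
tb_loc_buf stores the fractional parts of x and y packed into one value.
sc_loc_buf records the address in flattened, one-dimensional form.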

    for (; x < dst_w; ++x) {
        int new_x     = adelta[2 * x] + bdelta[2 * y] + 16;
        int new_y     = adelta[2 * x + 1] + bdelta[2 * y + 1] + 16;
        int new_x_loc = new_x >> 10;
        int new_y_loc = new_y >> 10;
        xy_loc_buf[2 * x]     = new_x_loc;
        xy_loc_buf[2 * x + 1] = new_y_loc;
        tb_loc_buf[x] = ((new_x >> 5) & 31) + ((new_y >> 5) & 31) * 32;
        sc_loc_buf[x] = (new_x_loc + new_y_loc * src_w) * channel + src_offset;
    }
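
The bit manipulation in the scalar loop can be isolated into a small sketch that splits one fixed-point coordinate pair into its integer part and its table index. FixedCoord and SplitFixedCoord are hypothetical names; the constants are the ones used in the loop above.

```cpp
#include <cassert>

struct FixedCoord {
    int x_loc;      // integer source x
    int y_loc;      // integer source y
    int tab_index;  // fx + fy * 32, with fx and fy in units of 1/32
};

// scaled_x and scaled_y are source coordinates multiplied by 1024,
// before the rounding offset is added.
FixedCoord SplitFixedCoord(int scaled_x, int scaled_y) {
    const int new_x = scaled_x + 16;  // 16/1024 = half of 1/32: round to the nearest 1/32
    const int new_y = scaled_y + 16;
    FixedCoord c;
    c.x_loc = new_x >> 10;  // drop all 10 fractional bits
    c.y_loc = new_y >> 10;
    // Keep the top 5 fractional bits of each axis and pack them into one index.
    c.tab_index = ((new_x >> 5) & 31) + ((new_y >> 5) & 31) * 32;
    assert(c.tab_index >= 0 && c.tab_index < 32 * 32);
    return c;
}
```

For the source position (2.5, 1.25) the scaled inputs are 2560 and 1280, which give integer coordinates (2, 1) and fractions 16/32 and 8/32, i.e. table index 16 + 8 * 32 = 272.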

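CheckDataIsOnBoundary checks whether the data lies on the boundary.
If the source coordinate is inside the bounds, the offset is saved to buf_loc and the interpolation-table index to tab_loc.
Otherwise, if it lies on the boundary, the interpolated result is computed right away.
wtab is the table of weight products.
mask0, mask1, mask2, and mask3 indicate whether each of the 4 interpolation points is inside the image.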

    for (x = 0; x < dst_w; ++x) {
        short new_x_loc    = xy_loc_buf[2 * x];
        short new_y_loc    = xy_loc_buf[2 * x + 1];
        short new_xy_float = tb_loc_buf[x];
        int   src_loc      = sc_loc_buf[x];

        if ((unsigned)new_x_loc < (src_w - 1) && (unsigned)new_y_loc < (src_h - 1)) {
            buf_loc[x] = src_loc;
            tab_loc[x] = new_xy_float;
            x_count++;
            end_x = x;
        } else if (CheckDataIsOnBoundary(new_x_loc, new_y_loc, src_w, src_h)) {
            short* wtab = BilinearTab_i[new_xy_float][0];
            int dsc_loc = x * channel;

            int mask0 = new_x_loc >= 0 && new_y_loc >= 0;
            int mask1 = new_x_loc <= (src_w - 2) && new_y_loc >= 0;
            int mask2 = new_x_loc >= 0 && new_y_loc <= (src_h - 2);
            int mask3 = new_x_loc <= (src_w - 2) && new_y_loc <= (src_h - 2);

            for (int c = 0; c < channel; ++c) {
                int val_xy = 0;
                val_xy += wtab[0] * (mask0 ? src[src_loc + c] : border_val);
                val_xy += wtab[1] * (mask1 ? src[src_loc + channel + c] : border_val);
                val_xy += wtab[2] * (mask2 ? src2[src_loc + c] : border_val);
                val_xy += wtab[3] * (mask3 ? src2[src_loc + channel + c] : border_val);
                dst[dsc_loc + c] = SATURATE_CAST_UCHAR((val_xy + (1 << 14)) >> 15);
            }
        }
    }
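
The three-way decision in this loop can be sketched as a small classifier. The names are hypothetical, and the boundary condition is an assumption: it treats a pixel as being on the boundary when its 2x2 footprint still overlaps the image, which may differ from what CheckDataIsOnBoundary actually tests.

```cpp
#include <cassert>

enum class PixelClass { kInside, kBoundary, kOutside };

// x, y: integer source coordinates of the top-left interpolation point.
PixelClass Classify(int x, int y, int src_w, int src_h) {
    assert(src_w >= 2 && src_h >= 2);
    // A single unsigned comparison covers both 0 <= x and x < src_w - 1,
    // because a negative x wraps around to a very large unsigned value.
    if (static_cast<unsigned>(x) < static_cast<unsigned>(src_w - 1) &&
        static_cast<unsigned>(y) < static_cast<unsigned>(src_h - 1)) {
        return PixelClass::kInside;
    }
    // Assumed boundary rule: at least one of the 4 points can still be inside.
    if (x >= -1 && x <= src_w - 1 && y >= -1 && y <= src_h - 1) {
        return PixelClass::kBoundary;
    }
    return PixelClass::kOutside;
}
```

Inside pixels are deferred to the vectorized row pass, boundary pixels are blended immediately with border_val substituted for missing neighbours, and outside pixels keep the border value written during initialization.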

WarpAffineCalculateOneRow

buf_loc_p corresponds to _map1 in OpenCV, but here it is one-dimensional.

    const int* buf_loc_p   = buf_loc + begin_x;
    const short* tab_loc_p = tab_loc + begin_x;
    const short* tab_p     = BilinearTab_i[0][0];
    int x                  = begin_x;

#ifdef TNN_USE_NEON
#define MAKE_CAL(n)                                                                 \
    _val0    = vmull_s16(_tab0, vget_low_s16(_src16_0##n));                         \
    _val1    = vmull_s16(_tab1, vget_high_s16(_src16_0##n));                        \
    _val2    = vmull_s16(_tab2, vget_low_s16(_src16_1##n));                         \
    _val3    = vmull_s16(_tab3, vget_high_s16(_src16_1##n));                        \
    _res0123 = VPADDQ_S32(VPADDQ_S32(_val0, _val1), VPADDQ_S32(_val2, _val3));      \
    _res0123 = vaddq_s32(_res0123, _offset);                                        \
    _res16   = vshrn_n_s32(_res0123, 15);                                           \
    _resu8.val[n] = vqmovun_s16(vcombine_s16(_res16, _res16));                      \

#define CAL_C0() MAKE_CAL(0)
#define CAL_C1() MAKE_CAL(1)
#define CAL_C2() MAKE_CAL(2)
#define CAL_C3() MAKE_CAL(3)
#endif

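For single-channel images, the 4 neighbours around each of 4 source positions are loaded.
dst_p is the start address in the current destination row.
_src_loc holds the offsets of the 4 source pixels. Because the interpolation neighbourhood is local, combining these offsets with src1 and src2 reaches every element the 4 destination pixels need.
_src01 stores the elements for the first two destination pixels, and _src23 those for the last two.
vmovl_u8 widens each uint8 lane to uint16, so _src16_0 and _src16_1 have the same lane width as the weights they are multiplied with.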

    if (channel == 1) {
#ifdef TNN_USE_NEON
        uint8_t* dst_p         = dst +  dst_loc_base + begin_x * 1;
        int32x4_t _offset      = vdupq_n_s32(1 << 14);
        int simd_loop          = 0;
        for (; x <= end_x - 3; x += 4) {
            int32x4_t _src_loc = vld1q_s32(buf_loc_p);

            uint8x8_t _src01   = uint8x8_t();
            _src01 = vld1_lane_u8(src1 + _src_loc[0], _src01, 0);
            _src01 = vld1_lane_u8(src1 + _src_loc[0] + 1, _src01, 1);
            _src01 = vld1_lane_u8(src2 + _src_loc[0], _src01, 2);
            _src01 = vld1_lane_u8(src2 + _src_loc[0] + 1, _src01, 3);
            _src01 = vld1_lane_u8(src1 + _src_loc[1], _src01, 4);
            _src01 = vld1_lane_u8(src1 + _src_loc[1] + 1, _src01, 5);
            _src01 = vld1_lane_u8(src2 + _src_loc[1], _src01, 6);
            _src01 = vld1_lane_u8(src2 + _src_loc[1] + 1, _src01, 7);
            int16x8_t _src16_0 = vreinterpretq_s16_u16(vmovl_u8(_src01));

            uint8x8_t _src23   = uint8x8_t();
            _src23 = vld1_lane_u8(src1 + _src_loc[2], _src23, 0);
            _src23 = vld1_lane_u8(src1 + _src_loc[2] + 1, _src23, 1);
            _src23 = vld1_lane_u8(src2 + _src_loc[2], _src23, 2);
            _src23 = vld1_lane_u8(src2 + _src_loc[2] + 1, _src23, 3);
            _src23 = vld1_lane_u8(src1 + _src_loc[3], _src23, 4);
            _src23 = vld1_lane_u8(src1 + _src_loc[3] + 1, _src23, 5);
            _src23 = vld1_lane_u8(src2 + _src_loc[3], _src23, 6);
            _src23 = vld1_lane_u8(src2 + _src_loc[3] + 1, _src23, 7);
            int16x8_t _src16_1 = vreinterpretq_s16_u16(vmovl_u8(_src23));

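_tab0, _tab1, _tab2, and _tab3 are the weight products for the 4 destination pixels.
_val0 holds the four partial products of the first destination pixel.
VPADDQ_S32 calls vpaddq_s32, which adds adjacent elements.
_res0123 holds the 4 computed destination pixel values.
_offset is the rounding term. The right shift then yields _res16.
vqmovun_s16 saturates each signed integer to an unsigned integer of half the original width.
Because vqmovun_s16 processes 8 values at a time, half of its work is wasted here.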

            int16x4_t _tab_loc = vld1_s16(tab_loc_p);
            int16x4_t _tab0    = vld1_s16(tab_p + _tab_loc[0] * 4);
            int16x4_t _tab1    = vld1_s16(tab_p + _tab_loc[1] * 4);
            int16x4_t _tab2    = vld1_s16(tab_p + _tab_loc[2] * 4);
            int16x4_t _tab3    = vld1_s16(tab_p + _tab_loc[3] * 4);

            int32x4_t _val0    = vmull_s16(_tab0, vget_low_s16(_src16_0));
            int32x4_t _val1    = vmull_s16(_tab1, vget_high_s16(_src16_0));
            int32x4_t _val2    = vmull_s16(_tab2, vget_low_s16(_src16_1));
            int32x4_t _val3    = vmull_s16(_tab3, vget_high_s16(_src16_1));

            int32x4_t _res0123 = VPADDQ_S32(VPADDQ_S32(_val0, _val1), VPADDQ_S32(_val2, _val3));
            _res0123           = vaddq_s32(_res0123, _offset);
            int16x4_t _res16   = vshrn_n_s32(_res0123, 15);
            uint8x8_t _resu8   = vqmovun_s16(vcombine_s16(_res16, _res16));

Store the 4 results.

            vst1_lane_u8(dst_p, _resu8, 0);
            vst1_lane_u8(dst_p + 1, _resu8, 1);
            vst1_lane_u8(dst_p + 2, _resu8, 2);
            vst1_lane_u8(dst_p + 3, _resu8, 3);

            buf_loc_p += 4;
            tab_loc_p += 4;
            dst_p     += 4;
            ++simd_loop;
        }
        x = begin_x + (simd_loop << 2);
#endif
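
The two nested VPADDQ_S32 calls in the vector loop are easier to follow with a scalar model of pairwise addition. The sketch below uses plain arrays instead of NEON registers; Vec4, PairwiseAdd, and ReduceFour are hypothetical names.

```cpp
#include <array>
#include <cstdint>

using Vec4 = std::array<std::int32_t, 4>;

// Scalar model of vpaddq_s32(a, b): sums adjacent pairs of a, then of b.
Vec4 PairwiseAdd(const Vec4& a, const Vec4& b) {
    return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Two levels of pairwise addition turn four 4-lane product vectors
// (one per destination pixel) into one vector with a full sum per lane.
Vec4 ReduceFour(const Vec4& v0, const Vec4& v1, const Vec4& v2, const Vec4& v3) {
    return PairwiseAdd(PairwiseAdd(v0, v1), PairwiseAdd(v2, v3));
}
```

Lane i of the result is the sum of all four lanes of the i-th input, which is the weighted sum of the 4 neighbours for destination pixel i.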

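dst_loc is the destination address, and src_loc is the corresponding address in the source image.
point0, point1, point2, and point3 are the 4 surrounding source pixels, and val_xy0 is the bilinear interpolation result.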

        for (; x <= end_x; x++) {
            int dst_loc = dst_loc_base + x * 1;
            int src_loc = buf_loc[x];
            short* wtab = BilinearTab_i[tab_loc[x]][0];

            int point0 = src1[src_loc];
            int point1 = src1[src_loc + 1];
            int point2 = src2[src_loc];
            int point3 = src2[src_loc + 1];

            int val_xy0  = wtab[0] * point0 + wtab[1] * point1 + wtab[2] * point2 + wtab[3] * point3;

            dst[dst_loc] = SATURATE_CAST_UCHAR((val_xy0 + (1 << 14)) >> 15);
        }
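
The arithmetic of the scalar fallback can be tried in isolation with the sketch below. BlendFixedPoint is a hypothetical name, and the clamp to 0..255 stands in for SATURATE_CAST_UCHAR; the weights are assumed to sum to 1 << 15, as the shift by 15 in the loop above implies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Blends a 2x2 neighbourhood with integer weights that sum to 1 << 15.
// wtab order: top-left, top-right, bottom-left, bottom-right.
std::uint8_t BlendFixedPoint(const short wtab[4], int p0, int p1, int p2, int p3) {
    assert(wtab[0] + wtab[1] + wtab[2] + wtab[3] == (1 << 15));
    const int acc = wtab[0] * p0 + wtab[1] * p1 + wtab[2] * p2 + wtab[3] * p3;
    // Adding half of the scale before the shift rounds to nearest.
    const int rounded = (acc + (1 << 14)) >> 15;
    return static_cast<std::uint8_t>(std::min(std::max(rounded, 0), 255));
}
```

With four equal weights of 8192 the result is the average of the neighbours, and with weights {16384, 16384, 0, 0} the pixels 1 and 2 blend to 2 because 1.5 rounds up.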

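For 2-channel images, the same steps apply; the vld2 and vst2 lane intrinsics de-interleave and re-interleave the two channels: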

    } else if (channel == 2) {
#ifdef TNN_USE_NEON
        uint8_t* dst_p         = dst +  dst_loc_base + begin_x * 2;
        int32x4_t _offset      = vdupq_n_s32(1 << 14);
        int simd_loop          = 0;
        for (; x <= end_x - 3; x += 4) {
            int32x4_t _src_loc = vld1q_s32(buf_loc_p);

            uint8x8x2_t _src01 = uint8x8x2_t();
            _src01 = vld2_lane_u8(src1 + _src_loc[0], _src01, 0);
            _src01 = vld2_lane_u8(src1 + _src_loc[0] + 2, _src01, 1);
            _src01 = vld2_lane_u8(src2 + _src_loc[0], _src01, 2);
            _src01 = vld2_lane_u8(src2 + _src_loc[0] + 2, _src01, 3);
            _src01 = vld2_lane_u8(src1 + _src_loc[1], _src01, 4);
            _src01 = vld2_lane_u8(src1 + _src_loc[1] + 2, _src01, 5);
            _src01 = vld2_lane_u8(src2 + _src_loc[1], _src01, 6);
            _src01 = vld2_lane_u8(src2 + _src_loc[1] + 2, _src01, 7);
            int16x8_t _src16_00 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[0]));
            int16x8_t _src16_01 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[1]));

            uint8x8x2_t _src23  = uint8x8x2_t();
            _src23 = vld2_lane_u8(src1 + _src_loc[2], _src23, 0);
            _src23 = vld2_lane_u8(src1 + _src_loc[2] + 2, _src23, 1);
            _src23 = vld2_lane_u8(src2 + _src_loc[2], _src23, 2);
            _src23 = vld2_lane_u8(src2 + _src_loc[2] + 2, _src23, 3);
            _src23 = vld2_lane_u8(src1 + _src_loc[3], _src23, 4);
            _src23 = vld2_lane_u8(src1 + _src_loc[3] + 2, _src23, 5);
            _src23 = vld2_lane_u8(src2 + _src_loc[3], _src23, 6);
            _src23 = vld2_lane_u8(src2 + _src_loc[3] + 2, _src23, 7);
            int16x8_t _src16_10 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[0]));
            int16x8_t _src16_11 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[1]));

            int16x4_t _tab_loc = vld1_s16(tab_loc_p);
            int16x4_t _tab0    = vld1_s16(tab_p + _tab_loc[0] * 4);
            int16x4_t _tab1    = vld1_s16(tab_p + _tab_loc[1] * 4);
            int16x4_t _tab2    = vld1_s16(tab_p + _tab_loc[2] * 4);
            int16x4_t _tab3    = vld1_s16(tab_p + _tab_loc[3] * 4);

            int32x4_t _val0, _val1, _val2, _val3, _res0123;
            int16x4_t _res16;
            uint8x8x2_t _resu8;

            CAL_C0();
            CAL_C1();

            vst2_lane_u8(dst_p, _resu8, 0);
            vst2_lane_u8(dst_p + 2, _resu8, 1);
            vst2_lane_u8(dst_p + 4, _resu8, 2);
            vst2_lane_u8(dst_p + 6, _resu8, 3);

            buf_loc_p += 4;
            tab_loc_p += 4;
            dst_p     += 8;
            ++simd_loop;
        }
        x = begin_x + (simd_loop << 2);
#endif
        for (; x <= end_x; x++) {
            int dst_loc = dst_loc_base + x * 2;
            int src_loc = buf_loc[x];
            short* wtab = BilinearTab_i[tab_loc[x]][0];

            int point00 = src1[src_loc];
            int point01 = src1[src_loc + 1];
            int point02 = src1[src_loc + 2];
            int point03 = src1[src_loc + 3];
            int point10 = src2[src_loc];
            int point11 = src2[src_loc + 1];
            int point12 = src2[src_loc + 2];
            int point13 = src2[src_loc + 3];

            int val_xy0      = wtab[0] * point00 + wtab[1] * point02 + wtab[2] * point10 + wtab[3] * point12;
            int val_xy1      = wtab[0] * point01 + wtab[1] * point03 + wtab[2] * point11 + wtab[3] * point13;

            dst[dst_loc]     = SATURATE_CAST_UCHAR((val_xy0 + (1 << 14)) >> 15);
            dst[dst_loc + 1] = SATURATE_CAST_UCHAR((val_xy1 + (1 << 14)) >> 15);
        }

For 3-channel data:

    } else if (channel == 3) {
#ifdef TNN_USE_NEON
        uint8_t* dst_p         = dst +  dst_loc_base + begin_x * 3;
        int32x4_t _offset      = vdupq_n_s32(1 << 14);
        int simd_loop          = 0;
        for (; x <= end_x - 3; x += 4) {
            int32x4_t _src_loc = vld1q_s32(buf_loc_p);

            uint8x8x3_t _src01 = uint8x8x3_t();
            _src01 = vld3_lane_u8(src1 + _src_loc[0], _src01, 0);
            _src01 = vld3_lane_u8(src1 + _src_loc[0] + 3, _src01, 1);
            _src01 = vld3_lane_u8(src2 + _src_loc[0], _src01, 2);
            _src01 = vld3_lane_u8(src2 + _src_loc[0] + 3, _src01, 3);
            _src01 = vld3_lane_u8(src1 + _src_loc[1], _src01, 4);
            _src01 = vld3_lane_u8(src1 + _src_loc[1] + 3, _src01, 5);
            _src01 = vld3_lane_u8(src2 + _src_loc[1], _src01, 6);
            _src01 = vld3_lane_u8(src2 + _src_loc[1] + 3, _src01, 7);
            int16x8_t _src16_00 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[0]));
            int16x8_t _src16_01 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[1]));
            int16x8_t _src16_02 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[2]));

            uint8x8x3_t _src23  = uint8x8x3_t();
            _src23 = vld3_lane_u8(src1 + _src_loc[2], _src23, 0);
            _src23 = vld3_lane_u8(src1 + _src_loc[2] + 3, _src23, 1);
            _src23 = vld3_lane_u8(src2 + _src_loc[2], _src23, 2);
            _src23 = vld3_lane_u8(src2 + _src_loc[2] + 3, _src23, 3);
            _src23 = vld3_lane_u8(src1 + _src_loc[3], _src23, 4);
            _src23 = vld3_lane_u8(src1 + _src_loc[3] + 3, _src23, 5);
            _src23 = vld3_lane_u8(src2 + _src_loc[3], _src23, 6);
            _src23 = vld3_lane_u8(src2 + _src_loc[3] + 3, _src23, 7);
            int16x8_t _src16_10 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[0]));
            int16x8_t _src16_11 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[1]));
            int16x8_t _src16_12 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[2]));

            int16x4_t _tab_loc = vld1_s16(tab_loc_p);
            int16x4_t _tab0    = vld1_s16(tab_p + _tab_loc[0] * 4);
            int16x4_t _tab1    = vld1_s16(tab_p + _tab_loc[1] * 4);
            int16x4_t _tab2    = vld1_s16(tab_p + _tab_loc[2] * 4);
            int16x4_t _tab3    = vld1_s16(tab_p + _tab_loc[3] * 4);

            int32x4_t _val0, _val1, _val2, _val3, _res0123;
            int16x4_t _res16;
            uint8x8x3_t _resu8;

            CAL_C0();
            CAL_C1();
            CAL_C2();

            vst3_lane_u8(dst_p, _resu8, 0);
            vst3_lane_u8(dst_p + 3, _resu8, 1);
            vst3_lane_u8(dst_p + 6, _resu8, 2);
            vst3_lane_u8(dst_p + 9, _resu8, 3);

            buf_loc_p += 4;
            tab_loc_p += 4;
            dst_p     += 12;
            ++simd_loop;
        }
        x = begin_x + (simd_loop << 2);
#endif
        for (; x <= end_x; x++) {
            int dst_loc = dst_loc_base + x * 3;
            int src_loc = buf_loc[x];
            short* wtab = BilinearTab_i[tab_loc[x]][0];

            int point00 = src1[src_loc];
            int point01 = src1[src_loc + 1];
            int point02 = src1[src_loc + 2];
            int point03 = src1[src_loc + 3];
            int point04 = src1[src_loc + 4];
            int point05 = src1[src_loc + 5];
            int point10 = src2[src_loc];
            int point11 = src2[src_loc + 1];
            int point12 = src2[src_loc + 2];
            int point13 = src2[src_loc + 3];
            int point14 = src2[src_loc + 4];
            int point15 = src2[src_loc + 5];

            int val_xy0      = wtab[0] * point00 + wtab[1] * point03 + wtab[2] * point10 + wtab[3] * point13;
            int val_xy1      = wtab[0] * point01 + wtab[1] * point04 + wtab[2] * point11 + wtab[3] * point14;
            int val_xy2      = wtab[0] * point02 + wtab[1] * point05 + wtab[2] * point12 + wtab[3] * point15;

            dst[dst_loc]     = SATURATE_CAST_UCHAR((val_xy0 + (1 << 14)) >> 15);
            dst[dst_loc + 1] = SATURATE_CAST_UCHAR((val_xy1 + (1 << 14)) >> 15);
            dst[dst_loc + 2] = SATURATE_CAST_UCHAR((val_xy2 + (1 << 14)) >> 15);
        }
    } else if (channel == 4) {
#ifdef TNN_USE_NEON
        uint8_t* dst_p         = dst +  dst_loc_base + begin_x * 4;
        int32x4_t _offset      = vdupq_n_s32(1 << 14);
        int simd_loop          = 0;
        for (; x <= end_x - 3; x += 4) {
            int32x4_t _src_loc = vld1q_s32(buf_loc_p);

            uint8x8x4_t _src01 = uint8x8x4_t();
            _src01 = vld4_lane_u8(src1 + _src_loc[0], _src01, 0);
            _src01 = vld4_lane_u8(src1 + _src_loc[0] + 4, _src01, 1);
            _src01 = vld4_lane_u8(src2 + _src_loc[0], _src01, 2);
            _src01 = vld4_lane_u8(src2 + _src_loc[0] + 4, _src01, 3);
            _src01 = vld4_lane_u8(src1 + _src_loc[1], _src01, 4);
            _src01 = vld4_lane_u8(src1 + _src_loc[1] + 4, _src01, 5);
            _src01 = vld4_lane_u8(src2 + _src_loc[1], _src01, 6);
            _src01 = vld4_lane_u8(src2 + _src_loc[1] + 4, _src01, 7);
            int16x8_t _src16_00 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[0]));
            int16x8_t _src16_01 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[1]));
            int16x8_t _src16_02 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[2]));
            int16x8_t _src16_03 = vreinterpretq_s16_u16(vmovl_u8(_src01.val[3]));

            uint8x8x4_t _src23  = uint8x8x4_t();
            _src23 = vld4_lane_u8(src1 + _src_loc[2], _src23, 0);
            _src23 = vld4_lane_u8(src1 + _src_loc[2] + 4, _src23, 1);
            _src23 = vld4_lane_u8(src2 + _src_loc[2], _src23, 2);
            _src23 = vld4_lane_u8(src2 + _src_loc[2] + 4, _src23, 3);
            _src23 = vld4_lane_u8(src1 + _src_loc[3], _src23, 4);
            _src23 = vld4_lane_u8(src1 + _src_loc[3] + 4, _src23, 5);
            _src23 = vld4_lane_u8(src2 + _src_loc[3], _src23, 6);
            _src23 = vld4_lane_u8(src2 + _src_loc[3] + 4, _src23, 7);
            int16x8_t _src16_10 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[0]));
            int16x8_t _src16_11 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[1]));
            int16x8_t _src16_12 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[2]));
            int16x8_t _src16_13 = vreinterpretq_s16_u16(vmovl_u8(_src23.val[3]));

            int16x4_t _tab_loc = vld1_s16(tab_loc_p);
            int16x4_t _tab0    = vld1_s16(tab_p + _tab_loc[0] * 4);
            int16x4_t _tab1    = vld1_s16(tab_p + _tab_loc[1] * 4);
            int16x4_t _tab2    = vld1_s16(tab_p + _tab_loc[2] * 4);
            int16x4_t _tab3    = vld1_s16(tab_p + _tab_loc[3] * 4);

            int32x4_t _val0, _val1, _val2, _val3, _res0123;
            int16x4_t _res16;
            uint8x8x4_t _resu8;

            CAL_C0();
            CAL_C1();
            CAL_C2();
            CAL_C3();

            vst4_lane_u8(dst_p, _resu8, 0);
            vst4_lane_u8(dst_p + 4, _resu8, 1);
            vst4_lane_u8(dst_p + 8, _resu8, 2);
            vst4_lane_u8(dst_p + 12, _resu8, 3);

            buf_loc_p += 4;
            tab_loc_p += 4;
            dst_p     += 16;
            ++simd_loop;
        }
        x = begin_x + (simd_loop << 2);
#endif
        for (; x <= end_x; x++) {
            int dst_loc = dst_loc_base + x * 4;
            int src_loc = buf_loc[x];
            short* wtab = BilinearTab_i[tab_loc[x]][0];

            int point00 = src1[src_loc];
            int point01 = src1[src_loc + 1];
            int point02 = src1[src_loc + 2];
            int point03 = src1[src_loc + 3];
            int point04 = src1[src_loc + 4];
            int point05 = src1[src_loc + 5];
            int point06 = src1[src_loc + 6];
            int point07 = src1[src_loc + 7];
            int point10 = src2[src_loc];
            int point11 = src2[src_loc + 1];
            int point12 = src2[src_loc + 2];
            int point13 = src2[src_loc + 3];
            int point14 = src2[src_loc + 4];
            int point15 = src2[src_loc + 5];
            int point16 = src2[src_loc + 6];
            int point17 = src2[src_loc + 7];

            int val_xy0      = wtab[0] * point00 + wtab[1] * point04 + wtab[2] * point10 + wtab[3] * point14;
            int val_xy1      = wtab[0] * point01 + wtab[1] * point05 + wtab[2] * point11 + wtab[3] * point15;
            int val_xy2      = wtab[0] * point02 + wtab[1] * point06 + wtab[2] * point12 + wtab[3] * point16;
            int val_xy3      = wtab[0] * point03 + wtab[1] * point07 + wtab[2] * point13 + wtab[3] * point17;

            dst[dst_loc]     = SATURATE_CAST_UCHAR((val_xy0 + (1 << 14)) >> 15);
            dst[dst_loc + 1] = SATURATE_CAST_UCHAR((val_xy1 + (1 << 14)) >> 15);
            dst[dst_loc + 2] = SATURATE_CAST_UCHAR((val_xy2 + (1 << 14)) >> 15);
            dst[dst_loc + 3] = SATURATE_CAST_UCHAR((val_xy3 + (1 << 14)) >> 15);
        }
    }

#ifdef TNN_USE_NEON
#undef MAKE_CAL
#undef CAL_C0
#undef CAL_C1
#undef CAL_C2
#undef CAL_C3
#endif

