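There are many open-source C++ libraries for parsing XML, and I won't list them all here. Today I mainly want to talk about RapidXml. I haven't used this library very much, so if you spot any mistakes, please point them out. Thanks.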

Links:

Official site: http://rapidxml.sourceforge.net/

Official manual: http://rapidxml.sourceforge.net/manual.html

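I ran into a pitfall the last time I used it, but I was pressed for time and never tracked it down. Now that I'm using the library again, I don't want to fall into the same pit twice, so I decided to get to the bottom of it.

Two examples first:

Creating the XML: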

#include <fstream>
#include <iterator>
#include <string>
#include "rapidxml.hpp"
#include "rapidxml_print.hpp"

void CreateXml()
{
    rapidxml::xml_document<> doc;

    auto nodeDecl = doc.allocate_node(rapidxml::node_declaration);
    nodeDecl->append_attribute(doc.allocate_attribute("version", "1.0"));
    nodeDecl->append_attribute(doc.allocate_attribute("encoding", "UTF-8"));
    doc.append_node(nodeDecl); // add the XML declaration

    auto nodeRoot = doc.allocate_node(rapidxml::node_element, "Root"); // create the Root node
    // Add a comment to Root. A comment has no name, so the second argument is null.
    // The comment text "编程语言" means "programming languages".
    nodeRoot->append_node(doc.allocate_node(rapidxml::node_comment, nullptr, "编程语言"));
    auto nodeLanguage = doc.allocate_node(rapidxml::node_element, "language", "This is C language"); // create a language node
    nodeLanguage->append_attribute(doc.allocate_attribute("name", "C")); // add a name attribute to language
    nodeRoot->append_node(nodeLanguage); // add the language node to Root
    nodeLanguage = doc.allocate_node(rapidxml::node_element, "language", "This is C++ language"); // create another language node
    nodeLanguage->append_attribute(doc.allocate_attribute("name", "C++")); // add a name attribute to language
    nodeRoot->append_node(nodeLanguage); // add the language node to Root

    doc.append_node(nodeRoot); // add Root to the document
    std::string buffer;
    rapidxml::print(std::back_inserter(buffer), doc, 0);
    std::ofstream outFile("language.xml");
    outFile << buffer;
    outFile.close();
}

Result:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <!--编程语言-->
    <language name="C">This is C language</language>
    <language name="C++">This is C++ language</language>
</Root>

Modifying the XML:

#include <fstream>
#include <iterator>
#include <string>
#include "rapidxml.hpp"
#include "rapidxml_print.hpp"
#include "rapidxml_utils.hpp" // rapidxml::file

void ModifyXml()
{
    rapidxml::file<> requestFile("language.xml"); // load the XML from a file
    rapidxml::xml_document<> doc;
    doc.parse<0>(requestFile.data()); // parse the XML

    auto nodeRoot = doc.first_node(); // get the first node, which is the Root node
    auto nodeLanguage = nodeRoot->first_node("language"); // get the first language node under Root
    nodeLanguage->first_attribute("name")->value("Motify C"); // change the name attribute of language to "Motify C"
    std::string buffer;
    rapidxml::print(std::back_inserter(buffer), doc, 0);
    std::ofstream outFile("MotifyLanguage.xml");
    outFile << buffer;
    outFile.close();
}

Result:

<Root>
    <language name="Motify C">This is C language</language>
    <language name="C++">This is C++ language</language>
</Root>

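From the second result we can see:

The name attribute of the first language element was indeed changed to the value we expected, but it is not hard to notice that the XML declaration and the comment have both disappeared. What happened? This puzzled me for a while. Since the library is open source, let's step through it and see what it does. Looking at the code, there are two suspicious places: print and parse. Both take a flags argument. What do those flags actually do? The official tutorial passes 0 for both. Since print is what runs last, let's start debugging from print.

First, the print entry point: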

template<class OutIt, class Ch>
inline OutIt print(OutIt out, const xml_node<Ch> &node, int flags = 0)
{
    return internal::print_node(out, &node, flags, 0);
}

Stepping in further:

// Print node
template<class OutIt, class Ch>
inline OutIt print_node(OutIt out, const xml_node<Ch> *node, int flags, int indent)
{
    // Print proper node type
    switch (node->type())
    {

    // Document
    case node_document:
        out = print_children(out, node, flags, indent);
        break;

    // Element
    case node_element:
        out = print_element_node(out, node, flags, indent);
        break;

    // Data
    case node_data:
        out = print_data_node(out, node, flags, indent);
        break;

    // CDATA
    case node_cdata:
        out = print_cdata_node(out, node, flags, indent);
        break;

    // Declaration
    case node_declaration:
        out = print_declaration_node(out, node, flags, indent);
        break;

    // Comment
    case node_comment:
        out = print_comment_node(out, node, flags, indent);
        break;

    // Doctype
    case node_doctype:
        out = print_doctype_node(out, node, flags, indent);
        break;

    // Pi
    case node_pi:
        out = print_pi_node(out, node, flags, indent);
        break;

    // Unknown
    default:
        assert(0);
        break;
    }

    // If indenting not disabled, add line break after node
    if (!(flags & print_no_indenting))
        *out = Ch('\n'), ++out;

    // Return modified iterator
    return out;
}

Stepping into print_children shows that this is really a recursion: it calls print_node again for each child. Let's keep going:

// Print element node
template<class OutIt, class Ch>
inline OutIt print_element_node(OutIt out, const xml_node<Ch> *node, int flags, int indent)
{
    assert(node->type() == node_element);

    // Print element name and attributes, if any
    if (!(flags & print_no_indenting))
    ... // part of the code omitted

    return out;
}

Notice the & test in the `if (!(flags & print_no_indenting))` check. Here is the definition of print_no_indenting:

// Printing flags

const int print_no_indenting = 0x1;   //!< Printer flag instructing the printer to suppress indenting of XML. See print() function.
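To make the effect of that check concrete, here is a minimal toy sketch of a printer that honors such a flag. This is not RapidXml's code: `emit_line` is a hypothetical helper, and using a tab per indent level is my own arbitrary choice. Only the flag value 0x1 comes from the excerpt above.

```cpp
#include <string>

namespace print_sketch {

// Same value as in the excerpt above; redefined so the sketch is self-contained.
const int print_no_indenting = 0x1;

// Hypothetical helper: emits one line of output, adding indentation and a
// trailing line break only when print_no_indenting is NOT set.
std::string emit_line(const std::string& text, int indent, int flags)
{
    std::string out;
    if (!(flags & print_no_indenting))
        out.append(static_cast<std::string::size_type>(indent), '\t');
    out += text;
    if (!(flags & print_no_indenting))
        out += '\n';
    return out;
}

} // namespace print_sketch
```

With flags set to 0 the bit test is false, so indentation and line breaks stay on. That matches the idea that 0 means "default behaviour" and each flag switches one thing away from the default.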

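From this we can reason as follows: if the library follows a consistent style, parse should have flag definitions of the same kind.

(Analysis of the parse flow omitted.)

I also checked the official documentation, and it was indeed as I expected. Below are the descriptions of these flags from the header file; see the official documentation for details.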

// Parsing flags

//! Parse flag instructing the parser to not create data nodes.
//! Text of first data node will still be placed in value of parent element, unless rapidxml::parse_no_element_values flag is also specified.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_data_nodes = 0x1;

//! Parse flag instructing the parser to not use text of first data node as a value of parent element.
//! Can be combined with other flags by use of | operator.
//! Note that child data nodes of element node take precendence over its value when printing.
//! That is, if element has one or more child data nodes <em>and</em> a value, the value will be ignored.
//! Use rapidxml::parse_no_data_nodes flag to prevent creation of data nodes if you want to manipulate data using values of elements.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_element_values = 0x2;

//! Parse flag instructing the parser to not place zero terminators after strings in the source text.
//! By default zero terminators are placed, modifying source text.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_string_terminators = 0x4;

//! Parse flag instructing the parser to not translate entities in the source text.
//! By default entities are translated, modifying source text.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_entity_translation = 0x8;

//! Parse flag instructing the parser to disable UTF-8 handling and assume plain 8 bit characters.
//! By default, UTF-8 handling is enabled.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_utf8 = 0x10;

//! Parse flag instructing the parser to create XML declaration node.
//! By default, declaration node is not created.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_declaration_node = 0x20;

//! Parse flag instructing the parser to create comments nodes.
//! By default, comment nodes are not created.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_comment_nodes = 0x40;

//! Parse flag instructing the parser to create DOCTYPE node.
//! By default, doctype node is not created.
//! Although W3C specification allows at most one DOCTYPE node, RapidXml will silently accept documents with more than one.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_doctype_node = 0x80;

//! Parse flag instructing the parser to create PI nodes.
//! By default, PI nodes are not created.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_pi_nodes = 0x100;

//! Parse flag instructing the parser to validate closing tag names.
//! If not set, name inside closing tag is irrelevant to the parser.
//! By default, closing tags are not validated.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_validate_closing_tags = 0x200;

//! Parse flag instructing the parser to trim all leading and trailing whitespace of data nodes.
//! By default, whitespace is not trimmed.
//! This flag does not cause the parser to modify source text.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_trim_whitespace = 0x400;

//! Parse flag instructing the parser to condense all whitespace runs of data nodes to a single space character.
//! Trimming of leading and trailing whitespace of data is controlled by rapidxml::parse_trim_whitespace flag.
//! By default, whitespace is not normalized.
//! If this flag is specified, source text will be modified.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_normalize_whitespace = 0x800;

// Compound flags

//! Parse flags which represent default behaviour of the parser.
//! This is always equal to 0, so that all other flags can be simply ored together.
//! Normally there is no need to inconveniently disable flags by anding with their negated (~) values.
//! This also means that meaning of each flag is a <i>negation</i> of the default setting.
//! For example, if flag name is rapidxml::parse_no_utf8, it means that utf-8 is <i>enabled</i> by default,
//! and using the flag will disable it.
//! <br><br>
//! See xml_document::parse() function.
const int parse_default = 0;

//! A combination of parse flags that forbids any modifications of the source text.
//! This also results in faster parsing. However, note that the following will occur:
//! <ul>
//! <li>names and values of nodes will not be zero terminated, you have to use xml_base::name_size() and xml_base::value_size() functions to determine where name and value ends</li>
//! <li>entities will not be translated</li>
//! <li>whitespace will not be normalized</li>
//! </ul>
//! See xml_document::parse() function.
const int parse_non_destructive = parse_no_string_terminators | parse_no_entity_translation;

//! A combination of parse flags resulting in fastest possible parsing, without sacrificing important data.
//! <br><br>
//! See xml_document::parse() function.
const int parse_fastest = parse_non_destructive | parse_no_data_nodes;

//! A combination of parse flags resulting in largest amount of data being extracted.
//! This usually results in slowest parsing.
//! <br><br>
//! See xml_document::parse() function.
const int parse_full = parse_declaration_node | parse_comment_nodes | parse_doctype_node | parse_pi_nodes | parse_validate_closing_tags;
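To see why parsing with 0 drops the declaration and the comment, it helps to test the bits by hand. The sketch below is only an illustration: the constants are copied from the header excerpt above into a separate namespace so that it compiles without RapidXml, and `kept_node_kinds` is a hypothetical helper of mine, not part of the library.

```cpp
#include <string>
#include <vector>

namespace flag_sketch {

// Values copied from the header excerpt above.
const int parse_declaration_node      = 0x20;
const int parse_comment_nodes         = 0x40;
const int parse_doctype_node          = 0x80;
const int parse_pi_nodes              = 0x100;
const int parse_validate_closing_tags = 0x200;
const int parse_default               = 0;
const int parse_full = parse_declaration_node | parse_comment_nodes |
                       parse_doctype_node | parse_pi_nodes |
                       parse_validate_closing_tags;

// Hypothetical helper: lists the optional node kinds that a given flag
// combination asks the parser to create.
std::vector<std::string> kept_node_kinds(int flags)
{
    std::vector<std::string> kept;
    if (flags & parse_declaration_node) kept.push_back("declaration");
    if (flags & parse_comment_nodes)    kept.push_back("comment");
    if (flags & parse_doctype_node)     kept.push_back("doctype");
    if (flags & parse_pi_nodes)         kept.push_back("pi");
    return kept;
}

} // namespace flag_sketch
```

Calling `kept_node_kinds(flag_sketch::parse_default)` yields an empty list, which is exactly the situation in the modify example: none of the "create this node" bits are set, so those nodes never make it into the tree and print has nothing to write out for them.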

Based on the information above, let's change the earlier code from:

doc.parse<0>(requestFile.data()); // parse the XML
auto nodeRoot = doc.first_node(); // get the first node, which is the Root node

to:

doc.parse<rapidxml::parse_declaration_node | rapidxml::parse_comment_nodes | rapidxml::parse_non_destructive>(requestFile.data()); // parse the XML
auto nodeRoot = doc.first_node("Root"); // look up the Root node by name

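A quick explanation. Three flags are now passed to parse: parse_declaration_node tells the parser to create the declaration node, parse_comment_nodes tells it to create comment nodes, and parse_non_destructive asks it not to modify the buffer passed in (per the header comments above, this also means names and values are not zero-terminated and entities are not translated). The second line changes because once the XML declaration is parsed, first_node() with no arguments returns the declaration node rather than the Root node we expect, so we find the node we need by passing its name.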
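The lookup rule described above can be modelled in a few lines. This is a simplified stand-in, not RapidXml's implementation: `Kind`, `Child` and this free function `first_node` are names I made up for the illustration, and the only behaviour assumed is what the paragraph above states (no name means "first child of any type", a name means "first child with that name").

```cpp
#include <string>
#include <vector>

namespace lookup_sketch {

// Hypothetical, simplified node description.
enum class Kind { declaration, comment, element };

struct Child {
    Kind kind;
    std::string name;
};

// With no name, return the first child whatever its type;
// with a name, return the first child whose name matches.
const Child* first_node(const std::vector<Child>& children, const char* name = nullptr)
{
    for (const Child& child : children) {
        if (name == nullptr || child.name == name)
            return &child;
    }
    return nullptr;
}

} // namespace lookup_sketch
```

When the document was parsed with 0, the child list is just the Root element, so the unnamed lookup happens to return it. Once the declaration node is kept, it comes first in the list, and only the lookup by name still lands on Root.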

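Notes:

1. When appending, this library does not check whether the item being added (node, attribute, etc.) already exists.

2. Modifying items (nodes, attributes, etc.) while looping over them can invalidate the iteration.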
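Both notes can be worked around with the same habits you would use for any linked sibling list. The sketch below uses a simplified stand-in `Node` type of my own, not RapidXml's classes, and assumes that a removed node can no longer be asked for its next sibling. It shows looking a child up before appending, and fetching the next sibling before unlinking the current one.

```cpp
#include <cstddef>
#include <string>

namespace list_sketch {

// Hypothetical, simplified node with a linked list of children.
struct Node {
    std::string name;
    Node* parent = nullptr;
    Node* first_child = nullptr;
    Node* next_sibling = nullptr;
};

// Appends without any duplicate check, as note 1 describes.
void append_child(Node& parent, Node& child)
{
    child.parent = &parent;
    child.next_sibling = nullptr;
    if (parent.first_child == nullptr) {
        parent.first_child = &child;
        return;
    }
    Node* last = parent.first_child;
    while (last->next_sibling != nullptr)
        last = last->next_sibling;
    last->next_sibling = &child;
}

// Returns the first child with the given name, or nullptr.
// Call this first if duplicates are not wanted.
Node* find_child(Node& parent, const std::string& name)
{
    for (Node* cur = parent.first_child; cur != nullptr; cur = cur->next_sibling)
        if (cur->name == name)
            return cur;
    return nullptr;
}

// Unlinks a child; afterwards the child no longer knows its siblings.
void remove_child(Node& parent, Node& child)
{
    Node** link = &parent.first_child;
    while (*link != nullptr && *link != &child)
        link = &(*link)->next_sibling;
    if (*link != nullptr)
        *link = child.next_sibling;
    child.parent = nullptr;
    child.next_sibling = nullptr;
}

// Removes every child with the given name. The next sibling is saved
// BEFORE the current node is unlinked, so the loop stays valid.
std::size_t remove_children_named(Node& parent, const std::string& name)
{
    std::size_t removed = 0;
    for (Node* cur = parent.first_child; cur != nullptr; ) {
        Node* next = cur->next_sibling;
        if (cur->name == name) {
            remove_child(parent, *cur);
            ++removed;
        }
        cur = next;
    }
    return removed;
}

} // namespace list_sketch
```

The same pattern should carry over to real RapidXml code: check with first_node(name) or first_attribute(name) before appending, and read next_sibling() into a local variable before removing the node you are standing on.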

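Summary: when you use a library someone else wrote, there will always be a few surprises. These are the only problems I have run into so far; if you know of others, feel free to add them. I should also explain that a "pitfall" does not necessarily mean the open-source library is at fault. More often it means we have not yet learned to use the tool well.

Thanks to the author of RapidXml for giving us such an efficient and convenient tool.

Reposted from: https://www.cnblogs.com/pene/p/6873592.html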
