# fasterxml **Repository Path**: calvinwilliams/fasterxml ## Basic Information - **Project Name**: fasterxml - **Description**: fasterxml是一个SAX模式的XML解析器。它直接解析XML文本,调用注册的事件函数,快速访问XML中感兴趣的内容。 - **Primary Language**: C - **License**: LGPL-2.1 - **Default Branch**: release - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 22 - **Forks**: 16 - **Created**: 2014-10-05 - **Last Updated**: 2025-04-22 ## Categories & Tags **Categories**: utils **Tags**: None ## README fasterxml - 最快的XML解析器 1.概述 2.编译安装 3.基本使用 4.性能 5.最后 ------------------------------------------------------------ 1.概述 fasterxml是一个快速SAX模式的XML解析器。它直接解析XML文本,调用注册的事件函数,快速访问XML中感兴趣的内容。 fasterxml主要用于只读方式挖掘XML数据到应用对象中,如XML配置文件解析、大型XML报文解包。因为它不像DOM模式需要先构造一个完整的文档树,所以速度非常快,几乎接近strlen的性能。 fasterxml非常小巧,整个源代码由一对fasterxml.h,fasterxml.c组成,500行,13KB,不依赖于任何其它库,容易嵌入到您的项目中。 2.编译安装 以Linux操作系统为例,每个目录里的构造脚本均是makefile.Linux,外层目录makefile.Linux递归下级子目录 [code] $ cd src $ make -f makefile.Linux install gcc -g -fPIC -O2 -Wall -Werror -fno-strict-aliasing -I. -c fasterxml.c gcc -g -fPIC -O2 -Wall -Werror -fno-strict-aliasing -o libfasterxml.so fasterxml.o -shared -L. cp -f libfasterxml.so /usr/lib/exlib/ cp -f fasterxml.h /usr/include/fasterxml/ [/code] 3.基本使用 fasterxml使用接口函数只有三个,下面依次说明之 [code] typedef int funcCallbackOnXmlNode( int type , char *xpath , int xpath_len , int xpath_size , char *node , int node_len , char *properties , int properties_len , char *content , int content_len , void *p ); int TravelXmlBuffer( char *xml_buffer , char *xpath , int xpath_size , funcCallbackOnXmlNode *pfuncCallbackOnXmlNode , void *p ); int TravelXmlBuffer4( char *xml_buffer , char *xpath , int xpath_size , funcCallbackOnXmlNode *pfuncCallbackOnXmlNode , funcCallbackOnXmlNode *pfuncCallbackOnEnterXmlNode , funcCallbackOnXmlNode *pfuncCallbackOnLeaveXmlNode , funcCallbackOnXmlNode *pfuncCallbackOnXmlLeaf , void *p ); [/code] 函数TravelXmlBuffer读入XML文本并快速解析之,当遇到XML树枝或XML树叶时调用pfuncCallbackOnXmlNode呼叫应用访问,应用可根据type确定读到了树枝事件或树叶事件等,以及xpath,筛选访问感兴趣的XML节点。 函数TravelXmlBuffer4是TravelXmlBuffer事件细分版本。 还可根据需要,快速解析访问当前节点的属性集合,函数原型如下 [code] typedef int funcCallbackOnXmlProperty( char *xpath , int xpath_len , int xpath_size , char *propname , int propname_len , char *propvalue , int propvalue_len , void *p ); _WINDLL_FUNC int TravelXmlPropertiesBuffer( char *properties , int properties_len , char *xpath , int xpath_len , int xpath_size , funcCallbackOnXmlProperty *pfuncCallbackOnXmlProperty , void *p ); [/code] 下面是一个简单的示例 [code] funcCallbackOnXmlProperty CallbackOnXmlProperty ; int CallbackOnXmlProperty( char *xpath , int xpath_len , int xpath_size , char *propname , int propname_len , char *propvalue , int propvalue_len , void *p ) { printf( " PROPERTY p[%s] xpath[%s] propname[%.*s] propvalue[%.*s]\n" , (char*)p , xpath , propname_len , propname , propvalue_len , propvalue ); return 0; } funcCallbackOnXmlNode CallbackOnXmlNode ; int CallbackOnXmlNode( int type , char *xpath , int xpath_len , int xpath_size , char *nodename , int nodename_len , char *properties , int properties_len , char *content , int content_len , void *p ) { int nret = 0 ; if( type & FASTXML_NODE_BRANCH ) { if( type & FASTXML_NODE_ENTER ) { printf( "ENTER-BRANCH p[%s] xpath[%s] nodename[%.*s] properties[%.*s]\n" , (char*)p , xpath , nodename_len , nodename , properties_len , properties ); } else if( type & FASTXML_NODE_LEAVE ) { printf( "LEAVE-BRANCH p[%s] xpath[%s] nodename[%.*s] properties[%.*s]\n" , (char*)p , xpath , nodename_len , nodename , properties_len , properties ); } else { printf( "BRANCH p[%s] xpath[%s] nodename[%.*s] properties[%.*s]\n" , (char*)p , xpath , nodename_len , nodename , properties_len , properties ); } } else if( type & FASTXML_NODE_LEAF ) { printf( "LEAF p[%s] xpath[%s] nodename[%.*s] properties[%.*s] content[%.*s]\n" , (char*)p , xpath , nodename_len , nodename , properties_len , properties , content_len , content ); } if( properties && properties[0] ) { nret = TravelXmlPropertiesBuffer( properties , properties_len , xpath , xpath_len , xpath_size , & CallbackOnXmlProperty , p ) ; if ( nret ) return nret; } return 0; } ... char xpath[ 1024 + 1 ] ; char *xml_buffer = NULL ; char *p = "hello world" ; memset( xpath , 0x00 , sizeof(xpath) ); nret = TravelXmlBuffer( xml_buffer , xpath , sizeof(xpath) , & CallbackOnXmlNode , p ) ; free( xml_buffer ); if( nret && nret != FASTXML_INFO_END_OF_BUFFER ) { printf( "TravelXmlTree failed[%d]\n" , nret ); return nret; } ... [/code] 执行之,输出为 [code] $ cat test_basic.xml content sub_content content $ test_fasterxml test_basic.xml BRANCH p[hello world] xpath[/xml] nodename[xml] properties[ version="1.0" encoding="GBK"?] PROPERTY p[hello world] xpath[/xml.version] propname[version] propvalue[1.0] PROPERTY p[hello world] xpath[/xml.encoding] propname[encoding] propvalue[GBK] ENTER-BRANCH p[hello world] xpath[/root] nodename[root] properties[] LEAF p[hello world] xpath[/root/leaf] nodename[leaf] properties[] content[content] ENTER-BRANCH p[hello world] xpath[/root/sub_branch] nodename[sub_branch] properties[] LEAF p[hello world] xpath[/root/sub_branch/sub_leaf] nodename[sub_leaf] properties[] content[sub_content] LEAVE-BRANCH p[hello world] xpath[/root/sub_branch] nodename[sub_branch] properties[] BRANCH p[hello world] xpath[/root/self_branch] nodename[self_branch] properties[ /] BRANCH p[hello world] xpath[/root/self_branch2] nodename[self_branch2] properties[/] ENTER-BRANCH p[hello world] xpath[/root/branch_has_property] nodename[branch_has_property] properties[ propname1=propvalue1] PROPERTY p[hello world] xpath[/root/branch_has_property.propname1] propname[propname1] propvalue[propvalue1] LEAF p[hello world] xpath[/root/branch_has_property/leaf_has_property] nodename[leaf_has_property] properties[ propname1="prop value1" propname2="prop value2" ] content[content] PROPERTY p[hello world] xpath[/root/branch_has_property/leaf_has_property.propname1] propname[propname1] propvalue[prop value1] PROPERTY p[hello world] xpath[/root/branch_has_property/leaf_has_property.propname2] propname[propname2] propvalue[prop value2] LEAVE-BRANCH p[hello world] xpath[/root/branch_has_property] nodename[branch_has_property] properties[] LEAVE-BRANCH p[hello world] xpath[/root] nodename[root] properties[] [/code] 客户可以改造事件回调函数,实现快速遍历访问XML,比如根据XML某些XPATH数据填充一个C结构体相关成员。 4.性能 fasterxml的性能非常高,下面是fasterxml与目前宣称最快XML解析器rapidxml的压测比较 压测用XML文件test_big_and_lang.xml约622KB(17870行),里面包含大量节点、重复节点、各种编码以及少量注释。 [code] ~/exsrc/fasterxml-1.0.0/test $ ls -l test_big_and_lang.xml -rw-r--r-- 1 calvin calvin 622511 Sep 21 10:51 test_big_and_lang.xml ~/exsrc/fasterxml-1.0.0/test $ wc test_big_and_lang.xml 17870 26332 622511 test_big_and_lang.xml [/code] 首先是fasterxml开跑10次,取最好成绩 [code] ~/exsrc/fasterxml-1.0.0/test $ time ./press_fasterxml test_big_and_lang.xml 1000 Elapse 5 seconds real 0m5.130s user 0m3.402s sys 0m0.007s [/code] 遍历测试XML文件所有树枝、树叶、属性1000次,耗时约5秒。 然后是rapidxml开跑10次,取最好成绩 [code] ~/exsrc/fasterxml-1.0.0/test_rapidxml $ time ./press_rapidxml ../test/test_big_and_lang.xml 1000 Elapse 19 seconds real 0m19.169s user 0m0.923s sys 0m18.265s [/code] 遍历测试XML文件所有树枝、树叶、属性1000次,耗时约19秒。 可见fasterxml比rapidxml快了约3倍,fasterxml夺取了XML(SAX)性能桂冠。 5.最后 欢迎使用fasterxml,如果你使用中碰到了问题请告诉我,谢谢 ^_^ 首页传送门 : [url]http://git.oschina.net/calvinwilliams/fasterxml[/url] 作者邮箱 : calvinwilliams.c@gmail.com