Parsing XML Documents Easily with PHP 5.0

Published: 2007-05-25 | Tags: php, xml, 5.0, parsing
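  With the SAX approach you have to write three callback functions yourself and return your data directly from those three functions, which demands fairly careful logic. When the XML has a different structure, the three functions have to be rewritten all over again. Tedious!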
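
  To make the "three functions" concrete, here is a minimal sketch of raw SAX parsing with PHP's xml extension. The NameCollector class and the inline XML string are made up for illustration; they are not part of the class presented later. It collects the text of every <name> element, and a different document layout would need different handler logic.

```php
<?php
// Sketch only: NameCollector is a hypothetical helper, not the article's class.
class NameCollector
{
    public $names = array();
    private $current = null;   // tag opened most recently, or null
    private $buffer = '';      // text collected for that tag

    // Callback 1: an element starts.
    public function onStart($parser, $tag, $attrs)
    {
        $this->current = $tag;
        $this->buffer = '';
    }

    // Callback 2: character data arrives (possibly in several chunks).
    public function onText($parser, $data)
    {
        if ($this->current !== null) {
            $this->buffer .= $data;
        }
    }

    // Callback 3: an element ends.
    public function onEnd($parser, $tag)
    {
        if ($tag === 'name' && $this->current === 'name') {
            $this->names[] = trim($this->buffer);
        }
        $this->current = null;
    }
}

$xml = '<shop><name>demo</name><cat id="food"><goods id="food11">'
     . '<name>food11</name><price>12.90</price></goods></cat></shop>';

$collector = new NameCollector();
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag case
xml_set_element_handler($parser, array($collector, 'onStart'), array($collector, 'onEnd'));
xml_set_character_data_handler($parser, array($collector, 'onText'));
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}
print_r($collector->names);
```

  All of the state tracking lives in the callbacks, which is exactly the part that has to be rethought for each new document structure.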

  The DOM approach is somewhat better, but it treats everything as a node, so even simple operations take a lot of code. Also tedious!
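
  For comparison, here is a small sketch of the same kind of task with PHP 5's DOM extension. The inline XML and the variable names are invented for the example. Reading two child values per item already requires walking child nodes and filtering by node type.

```php
<?php
// Sketch only: the XML string and variable names are illustrative.
$xml = '<shop><cat id="food"><goods id="food11">'
     . '<name>food11</name><price>12.90</price></goods></cat></shop>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$prices = array();
foreach ($doc->getElementsByTagName('goods') as $goods) {
    $name = null;
    $price = null;
    // Every child is a node; skip anything that is not an element.
    foreach ($goods->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        if ($child->nodeName === 'name') {
            $name = $child->nodeValue;
        } elseif ($child->nodeName === 'price') {
            $price = $child->nodeValue;
        }
    }
    $prices[$name] = $price;
}
print_r($prices);
```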

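  There are plenty of open-source XML parsing libraries on the web. I have looked at a few, but I never felt comfortable with them; it always felt like trailing along behind someone else.

  I have been working in Java these past few days and it has been tiring, so I decided to clear my head by writing some PHP. To keep XML parsing from tripping me up again, I spent a day writing the XML parsing class below.

  It works by wrapping the results of SAX-style parsing. Overall I find it quite practical, the performance is acceptable, and it covers most processing needs.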

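  Features:

  1. Query, add, modify and delete nodes of a basic XML file.

  2. Export all the data of an XML file into an array.

  3. The design is object-oriented throughout, and working with the result set feels similar to DOM.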

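  Drawbacks:

  1. Every node should preferably carry an id (see the example below). Each "node name" has the form "tag_id". If no id is set, the program generates one automatically: the node's position within its parent node, counting from 0.

  2. To look up a node, join "node names" with the "|" character. The names are those of the node's ancestors, written in order from the top down.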
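
  The naming rule and the "|" path syntax can be sketched independently of the class. The helpers nodeKey() and findByPath() below are hypothetical and operate on a plain nested array; they only illustrate the convention and are not the implementation used in the source files.

```php
<?php
// Sketch only: nodeKey() and findByPath() are hypothetical helpers.

// Build a node name: "tag_id" if an id attribute exists, else "tag_position".
function nodeKey($tag, array $attrs, $position)
{
    return isset($attrs['id']) ? $tag . '_' . $attrs['id'] : $tag . '_' . $position;
}

// Follow a "|"-separated path of node names through a nested array.
function findByPath(array $tree, $path)
{
    $node = $tree;
    foreach (explode('|', $path) as $step) {
        if (!is_array($node) || !isset($node[$step])) {
            return null; // some step of the path does not exist
        }
        $node = $node[$step];
    }
    return $node;
}

$withId    = nodeKey('cat', array('id' => 'food'), 0); // id present
$withoutId = nodeKey('cat', array(), 1);               // falls back to position

$tree = array(
    'cat_food' => array(
        'goods_food11' => array('name' => 'food11', 'price' => '12.90'),
    ),
);
$found   = findByPath($tree, 'cat_food|goods_food11');
$missing = findByPath($tree, 'cat_food|goods_nope');

var_dump($withId, $withoutId, $found, $missing);
```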

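  Usage:

  Run the example below; the result page shows how the functions are used.

  The code is written for PHP 5 and will not run correctly under PHP 4.

  I have only just finished it, so there is no documentation yet. The example below demonstrates only part of the functionality. The code is not hard to follow, so study the source if you want to know more.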

  Directory structure:
      test.php
      test.xml
      xml/SimpleDocumentBase.php
      xml/SimpleDocumentNode.php
      xml/SimpleDocumentRoot.php
      xml/SimpleDocumentParser.php

  File: test.xml

<?xml version="1.0" encoding="GB2312"?>
<shop>
<name>华联</name>
<address>北京长安街-9999号</address>
<desc>连锁超市</desc>
<cat id="food">
<goods id="food11">
<name>food11</name>
<price>12.90</price>
</goods>
<goods id="food12">
<name>food12</name>
<price>22.10</price>
<desc creator="hahawen">好东西推荐</desc>
</goods>
</cat>
<cat>
<goods id="tel21">
<name>tel21</name>
<price>1290</price>
</goods>
</cat>
<cat id="coat">
<goods id="coat31">
<name>coat31</name>
<price>112</price>
</goods>
<goods id="coat32">
<name>coat32</name>
<price>45</price>
</goods>
</cat>
<special id="hot">
<goods>
<name>hot41</name>
<price>99</price>
</goods>
</special>
</shop>

  File: test.php

<?php
require_once "xml/SimpleDocumentParser.php";
require_once "xml/SimpleDocumentBase.php";
require_once "xml/SimpleDocumentRoot.php";
require_once "xml/SimpleDocumentNode.php";

// Parse the file and get the root of the document tree.
$test = new SimpleDocumentParser();
$test->parse("test.xml");
$dom = $test->getSimpleDocument();

echo "<pre>";

echo "<hr><font color=red>";
echo "Below is the array holding all of the XML data, returned by getSaveData()";
echo "</font><hr>";
print_r($dom->getSaveData());

echo "<hr><font color=red>";
echo "Below, setValue() adds information to the root node; the resulting XML is shown afterwards";
echo "</font><hr>";
$dom->setValue("telphone", "123456789");
echo htmlspecialchars($dom->getSaveXml());

echo "<hr><font color=red>";
echo "Below, getNode() returns the information for all items in one category";
echo "</font><hr>";
$obj = $dom->getNode("cat_food");
$nodeList = $obj->getNode();
foreach ($nodeList as $node) {
    $data = $node->getValue();
    echo "<font color=red>Item name: " . $data['name'] . "</font><br>";
    print_r($data);
    print_r($node->getAttribute());
}

echo "<hr><font color=red>";
echo "Below, findNodeByPath() returns the information for a single item";
echo "</font><hr>";
$obj = $dom->findNodeByPath("cat_food|goods_food11");
if (!is_object($obj)) {
    echo "No such item";
} else {
    $data = $obj->getValue();
    echo "<font color=red>Item name: " . $data['name'] . "</font><br>";
    print_r($data);
    print_r($obj->getAttribute());
}

echo "<hr><font color=red>";
echo "Below, setValue() adds a value with attributes to the item \"food11\"; the result is shown afterwards";
echo "</font><hr>";
$obj = $dom->findNodeByPath("cat_food|goods_food11");
$obj->setValue("leaveword", array(
    "value" => "这个商品不错",
    "attrs" => array("author" => "hahawen", "date" => date('Y-m-d')),
));
echo htmlspecialchars($dom->getSaveXml());

echo "<hr><font color=red>";
echo "Below, setValue()/removeValue() change and delete values of the item \"food12\"; the result is shown afterwards";
echo "</font><hr>";
$obj = $dom->findNodeByPath("cat_food|goods_food12");
$obj->setValue("name", "new food12");
$obj->removeValue("desc");
echo htmlspecialchars($dom->getSaveXml());

echo "<hr><font color=red>";
echo "Below, createNode() adds an item; the result is shown afterwards";
echo "</font><hr>";
$obj = $dom->findNodeByPath("cat_food");
$newObj = $obj->createNode("goods", array("id" => "food13"));
$newObj->setValue("name", "food13");
$newObj->setValue("price", 100);
echo htmlspecialchars($dom->getSaveXml());

echo "<hr><font color=red>";
echo "Below, removeNode() deletes an item; the result is shown afterwards";
echo "</font><hr>";
$obj = $dom->findNodeByPath("cat_food");
$obj->removeNode("goods_food12");
echo htmlspecialchars($dom->getSaveXml());
?>



  File: SimpleDocumentParser.php

<?php
/**
 *=========================================================
 *
 * @author     hahawen(大龄青年)  
 * @since      2004-12-04
 * @copyright  Copyright (c) 2004, NxCoder Group
 *
 *=========================================================
 */
/**
 * class SimpleDocumentParser
 * Parses an XML file with SAX and builds a SimpleDocument object tree.
 * Everything in this package works on XML files, with methods that behave like DOM.
 *
 * @package SmartWeb.common.xml
 * @version 1.0
 */
class SimpleDocumentParser
{
 private $domRootObject = null;
 private $currentNO = null;
 private $currentName = null;
 private $currentValue = null;
 private $currentAttribute = null;
 public function getSimpleDocument()
 {
     return $this->domRootObject;
 }
 public function parse($file)
 {
     $xmlParser = xml_parser_create();
     xml_parser_set_option($xmlParser, XML_OPTION_CASE_FOLDING, 0);
     xml_parser_set_option($xmlParser, XML_OPTION_SKIP_WHITE, 1);
     xml_parser_set_option($xmlParser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
     xml_set_object($xmlParser, $this);
     xml_set_element_handler($xmlParser, "startElement", "endElement");
     xml_set_character_data_handler($xmlParser, "characterData");
     // The whole file is passed in one call, so mark it as the final chunk.
     if (!xml_parse($xmlParser, file_get_contents($file), true))
     {
         die(sprintf("XML error: %s at line %d",
             xml_error_string(xml_get_error_code($xmlParser)),
             xml_get_current_line_number($xmlParser)));
     }
     xml_parser_free($xmlParser);
 }
 private function startElement($parser, $name, $attrs)
 {
     $this->currentName = $name;
     $this->currentAttribute = $attrs;
     // Start with empty text so an empty element does not inherit the previous value.
     $this->currentValue = '';
     if($this->currentNO == null)
     {
         $this->domRootObject = new SimpleDocumentRoot($name);
         $this->currentNO = $this->domRootObject;
     }
     else
     {
         $this->currentNO = $this->currentNO->createNode($name, $attrs);
     }
 }
 private function endElement($parser, $name)
 {
     if($this->currentName==$name)
     {
         // Leaf element: store it as a value on the parent and drop the temporary node.
         $tag = $this->currentNO->getSeq();
         $this->currentNO = $this->currentNO->getPNodeObject();
         if($this->currentAttribute!=null && sizeof($this->currentAttribute)>0)
             $this->currentNO->setValue($name, array('value'=>$this->currentValue,
                 'attrs'=>$this->currentAttribute));
         else
             $this->currentNO->setValue($name, $this->currentValue);
         $this->currentNO->removeNode($tag);
     }
     else
     {
         // Container element: step back up to its parent.
         $this->currentNO = ($this->currentNO instanceof SimpleDocumentRoot)? null:
             $this->currentNO->getPNodeObject();
     }
 }
 private function characterData($parser, $data)
 {
     // Text may arrive in several chunks, so append instead of overwriting.
     $this->currentValue .= iconv('UTF-8', 'GB2312', $data);
 }
 function __destruct()
 {
     unset($this->domRootObject);
 }
}
?>

  File: SimpleDocumentBase.php

<?php
/**
 *=========================================================
 *
 * @author     hahawen(大龄青年)  
 * @since      2004-12-04
 * @copyright  Copyright (c) 2004, NxCoder Group
 *
 *=========================================================
 */
/**
 * abstract class SimpleDocumentBase
 * Base class for XML file parsing.
 * Everything in this package works on XML files, with methods that behave like DOM.
 *
 * 1. add/update/remove data of an XML file.
 * 2. export data to an array.
 * 3. rebuild the XML file.
 *
 * @package SmartWeb.common.xml
 * @abstract
 * @version 1.0
 */
abstract class SimpleDocumentBase
{
 private $nodeTag = null;
 private $attributes = array();
 private $values = array();
 private $nodes = array();
    function __construct($nodeTag)
    {
        $this->nodeTag = $nodeTag;
    }
    public function getNodeTag()
    {
     return $this->nodeTag;
    }
    public function setValues($values){
     $this->values = $values;
    }
    public function setValue($name, $value)
    {
     $this->values[$name] = $value;
    }
    public function getValue($name=null)
    {
     return $name==null? $this->values: $this->values[$name];
    }
    public function removeValue($name)
    {
     unset($this->values["$name"]);
    }
    public function setAttributes($attributes){
        $this->attributes = $attributes;
    }
    public function setAttribute($name, $value)
    {
     $this->attributes[$name] = $value;
    }
    public function getAttribute($name=null)
    {
        return $name==null? $this->attributes: $this->attributes[$name];
    }
    public function removeAttribute($name)
    {
     unset($this->attributes["$name"]);
    }
    public function getNodesSize()
    {
        return sizeof($this->nodes);
    }
    protected function setNode($name, $nodeId)
    {
     $this->nodes[$name] = $nodeId;
    }
    public abstract function createNode($name, $attributes);
    public abstract function removeNode($name);
    public abstract function getNode($name=null);
    protected function getNodeId($name=null)
    {
     return $name==null? $this->nodes: $this->nodes[$name];
    }
    protected function createNodeByName($rootNodeObj, $name, $attributes, $pId)
    {
        $tmpObject = $rootNodeObj->createNodeObject($pId, $name, $attributes);
        // Node name is "tag_id", or "tag_position" when no id attribute is given.
        $key = isset($attributes['id'])? $name.'_'.$attributes['id']: $name.'_'.$this->getNodesSize();
        $this->setNode($key, $tmpObject->getSeq());
     return $tmpObject;
    }
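    protected function removeNodeByName($rootNodeObj, $name)
    {
        // $name is normally a node name such as "goods_food12", but the parser
        // passes a sequence id, so map an id back to its node name first.
        if(!isset($this->nodes[$name]))
        {
            $key = array_search($name, $this->nodes, true);
            if($key===false)
                return;
            $name = $key;
        }
        $rootNodeObj->removeNodeById($this->nodes[$name]);
        unset($this->nodes[$name]);
    }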
    protected function getNodeByName($rootNodeObj, $name=null)
    {
     if($name==null)
     {
            $tmpList = array();
            $tmpIds = $this->getNodeId();
            foreach($tmpIds as $key=>$id)
             $tmpList[$key] = $rootNodeObj->getNodeById($id);
            return $tmpList;
     }
     else
     {
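      // Try an exact match on the node name first.
      $id = isset($this->nodes[$name])? $this->nodes[$name]: null;
      if($id===null)
      {
             // Otherwise fall back to the first node whose name starts with $name.
             foreach($this->nodes as $tkey=>$tid)
             {
              if(strpos($tkey, $name)===0)
              {
               $id = $tid;
               break;
              }
             }
      }
      return $id===null? null: $rootNodeObj->getNodeById($id);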
     }
    }
    public function findNodeByPath($path)
    {
     $pos = strpos($path, '|');
     if($pos<=0)
     {
      return $this->getNode($path);
        }
        else
        {
         $tmpObj = $this->getNode(substr($path, 0, $pos));
         return is_object($tmpObj)? $tmpObj->findNodeByPath(substr($path, $pos+1)): null;
        }
    }
    public function getSaveData()
    {
        $data = $this->values;
        if(sizeof($this->attributes)>0)
         $data['attrs'] = $this->attributes;
        $nodeList = $this->getNode();
        if($nodeList==null)
         return $data;
        foreach($nodeList as $key=>$node)
        {
         $data[$key] = $node->getSaveData();
        }
     return $data;
    }
    public function getSaveXml($level=0)
    {
     $prefixSpace = str_pad("", $level, "\t");
     $str = "$prefixSpace<$this->nodeTag";
        foreach($this->attributes as $key=>$value)
         $str .= " $key=\"$value\"";
        $str .= ">\r\n";
        foreach($this->values as $key=>$value){
            if(is_array($value))
            {
                $str .= "$prefixSpace\t<$key";
                foreach($value['attrs'] as $attkey=>$attvalue)
                 $str .= " $attkey=\"$attvalue\"";
                $tmpStr = $value['value'];
            }
            else
            {
             $str .= "$prefixSpace\t<$key";
             $tmpStr = $value;
            }
            $tmpStr = trim(trim($tmpStr, "\r\n"));
            $str .= ($tmpStr===null || $tmpStr==="")? " />\r\n": ">$tmpStr</$key>\r\n";
        }
        foreach($this->getNode() as $node)
         $str .= $node->getSaveXml($level+1)."\r\n";
     $str .= "$prefixSpace</$this->nodeTag>";
     return $str;
    }
    function __destruct()
    {
     unset($this->nodes, $this->attributes, $this->values);
    }
}
?>
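
  The way getSaveXml() turns one entry of the values array into an element can be shown in isolation. The function renderValue() below is a hypothetical stand-alone sketch of that rule (a plain string becomes element text; an array supplies 'value' and 'attrs'). Unlike the class above, this sketch also escapes special characters with htmlspecialchars(), which is an addition of mine.

```php
<?php
// Sketch only: renderValue() is a hypothetical helper, not a method of the class.
function renderValue($key, $value)
{
    $attrs = '';
    if (is_array($value)) {
        // Array form: array('value' => text, 'attrs' => array(name => value)).
        foreach ($value['attrs'] as $attName => $attValue) {
            $attrs .= ' ' . $attName . '="' . htmlspecialchars($attValue) . '"';
        }
        $value = $value['value'];
    }
    $text = trim((string) $value);
    // Empty text is written as a self-closing element.
    return $text === ''
        ? "<$key$attrs />"
        : "<$key$attrs>" . htmlspecialchars($text) . "</$key>";
}

$plain = renderValue('price', '12.90');
$rich  = renderValue('desc', array(
    'value' => 'good',
    'attrs' => array('creator' => 'hahawen'),
));
$empty = renderValue('note', '');

echo $plain, "\n", $rich, "\n", $empty, "\n";
```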
      


  File: SimpleDocumentRoot.php

<?php
/**
 *=========================================================
 *
 * @author     hahawen(大龄青年)  
 * @since      2004-12-04
 * @copyright  Copyright (c) 2004, NxCoder Group
 *
 *=========================================================
 */
/**
 * class SimpleDocumentRoot
 * XML root class; holds values, attributes and sub-nodes.
 * Everything in this package works on XML files, with methods that behave like DOM.
 *
 * @package SmartWeb.common.xml
 * @version 1.0
 */
class SimpleDocumentRoot extends SimpleDocumentBase
{
 // The parser converts text to GB2312, so the output declares that encoding.
 private $prefixStr = '<?xml version="1.0" encoding="GB2312" ?>';
 private $nodeLists = array();
 // Next sequence id; ids are never reused, so they stay unique after removals.
 private $nextSeq = 0;
 function __construct($nodeTag)
 {
        parent::__construct($nodeTag);
 }
    public function createNodeObject($pNodeId, $name, $attributes)
    {
     $seq = $this->nextSeq++;
     $tmpObject = new SimpleDocumentNode($this, $pNodeId, $name, $seq);
     $tmpObject->setAttributes($attributes);
     $this->nodeLists[$seq] = $tmpObject;
     return $tmpObject;
    }
    public function removeNodeById($id)
    {
        if(sizeof($this->nodeLists)==1)
            $this->nodeLists = array();
     else
      unset($this->nodeLists[$id]);
    }
    public function getNodeById($id)
    {
     return $this->nodeLists[$id];
    }
    public function createNode($name, $attributes)
    {
        return $this->createNodeByName($this, $name, $attributes, -1);
    }
    public function removeNode($name)
    {
        return $this->removeNodeByName($this, $name);
    }
    public function getNode($name=null)
    {
        return $this->getNodeByName($this, $name);
    }
    // Same signature as the parent's getSaveXml($level=0); the root always starts at level 0.
    public function getSaveXml($level=0)
    {
     $str = $this->prefixStr."\r\n";
     return $str.parent::getSaveXml(0);
    }
}
?>
      
  
  File: SimpleDocumentNode.php

<?php
/**
 *=========================================================
 *
 * @author     hahawen(大龄青年)  
 * @since      2004-12-04
 * @copyright  Copyright (c) 2004, NxCoder Group
 *
 *=========================================================
 */
/**
 * class SimpleDocumentNode
 * XML node class; holds values, attributes and sub-nodes.
 * Everything in this package works on XML files, with methods that behave like DOM.
 *
 * @package SmartWeb.common.xml
 * @version 1.0
 */
class SimpleDocumentNode extends SimpleDocumentBase
{
 private $seq = null;
 private $rootObject = null;
    private $pNodeId = null;
    function __construct($rootObject, $pNodeId, $nodeTag, $seq)
    {
     parent::__construct($nodeTag);
        $this->rootObject = $rootObject;
        $this->pNodeId = $pNodeId;
        $this->seq = $seq;
    }
    public function getPNodeObject()
    {
     return ($this->pNodeId==-1)? $this->rootObject: $this->rootObject->getNodeById($this->pNodeId);
    }
    public function getSeq(){
     return $this->seq;
    }
    public function createNode($name, $attributes)
    {
        return $this->createNodeByName($this->rootObject, $name, $attributes, $this->getSeq());
    }
    public function removeNode($name)
    {
        return $this->removeNodeByName($this->rootObject, $name);
    }
    public function getNode($name=null)
    {
        return $this->getNodeByName($this->rootObject, $name);
    }
}
?>
      


Here is the output of the example:

  Below is the array holding all of the XML data, returned by getSaveData()

Array
(
    [name] => 华联
    [address] => 北京长安街-9999号
    [desc] => 连锁超市
    [cat_food] => Array
        (
            [attrs] => Array
                (
                    [id] => food
                )
            [goods_food11] => Array
                (
                    [name] => food11
                    [price] => 12.90
                    [attrs] => Array
                        (
                            [id] => food11
                        )
                )
            [goods_food12] => Array
                (
                    [name] => food12
                    [price] => 22.10
                    [desc] => Array
                        (
                            [value] => 好东西推荐
                            [attrs] => Array
                                (
                                    [creator] => hahawen
                                )
                        )
                    [attrs] => Array
                        (
                            [id] => food12
                        )
                )
        )
    [cat_1] => Array
        (
            [goods_tel21] => Array
                (
                    [name] => tel21
                    [price] => 1290
                    [attrs] => Array
                        (
                            [id] => tel21
                        )
                )
        )
    [cat_coat] => Array
        (
            [attrs] => Array
                (
                    [id] => coat
                )
            [goods_coat31] => Array
                (
                    [name] => coat31
                    [price] => 112
                    [attrs] => Array
                        (
                            [id] => coat31
                        )
                )
            [goods_coat32] => Array
                (
                    [name] => coat32
                    [price] => 45
                    [attrs] => Array
                        (
                            [id] => coat32
                        )
                )
        )
    [special_hot] => Array
        (
            [attrs] => Array
                (
                    [id] => hot
                )
            [goods_0] => Array
                (
                    [name] => hot41
                    [price] => 99
                )
        )
)
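
  Because the exported data is an ordinary nested array, it can be processed with plain array code. The sketch below uses a hand-written array with the same shape as the output above (only a subset, with a placeholder shop name) and a hypothetical helper, collectPrices(), that gathers the price of every "goods_" entry.

```php
<?php
// Sketch only: collectPrices() and the literal $data array are illustrative.
function collectPrices(array $data, array &$out)
{
    foreach ($data as $key => $value) {
        if (!is_array($value) || $key === 'attrs') {
            continue; // plain values and attribute lists hold no items
        }
        if (strpos((string) $key, 'goods_') === 0 && isset($value['price'])) {
            $out[$value['name']] = $value['price'];
        } else {
            collectPrices($value, $out); // descend into categories
        }
    }
}

$data = array(
    'name' => 'demo shop',
    'cat_food' => array(
        'attrs' => array('id' => 'food'),
        'goods_food11' => array(
            'name' => 'food11',
            'price' => '12.90',
            'attrs' => array('id' => 'food11'),
        ),
    ),
    'cat_1' => array(
        'goods_tel21' => array(
            'name' => 'tel21',
            'price' => '1290',
            'attrs' => array('id' => 'tel21'),
        ),
    ),
);

$prices = array();
collectPrices($data, $prices);
print_r($prices);
```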

  Below, setValue() adds information to the root node; the resulting XML is shown afterwards

<?xml version="1.0" encoding="GB2312" ?>
<shop>
 <name>华联</name>
 <address>北京长安街-9999号</address>
 <desc>连锁超市</desc>
 <telphone>123456789</telphone>
 <cat id="food">
  <goods id="food11">
   <name>food11</name>
   <price>12.90</price>
  </goods>
  <goods id="food12">
   <name>food12</name>
   <price>22.10</price>
   <desc creator="hahawen">好东西推荐</desc>
  </goods>
 </cat>
 <cat>
  <goods id="tel21">
   <name>tel21</name>
   <price>1290</price>
  </goods>
 </cat>
 <cat id="coat">
  <goods id="coat31">
   <name>coat31</name>
   <price>112</price>
  </goods>
  <goods id="coat32">
   <name>coat32</name>
   <price>45</price>
  </goods>
 </cat>
 <special id="hot">
  <goods>
   <name>hot41</name>
   <price>99</price>
  </goods>
 </special>
</shop>


  Below, getNode() returns the information for all items in one category

Item name: food11
Array ( [name] => food11 [price] => 12.90 ) Array ( [id] => food11 )
Item name: food12
Array ( [name] => food12 [price] => 22.10 [desc] => Array ( [value] => 好东西推荐 [attrs] => Array ( [creator] => hahawen ) ) ) Array ( [id] => food12 )

  Below, findNodeByPath() returns the information for a single item

Item name: food11
Array ( [name] => food11 [price] => 12.90 ) Array ( [id] => food11 )

  Below, setValue() adds a value with attributes to the item "food11"; the result is shown afterwards

<?xml version="1.0" encoding="GB2312" ?>
<shop>
 <name>华联</name>
 <address>北京长安街-9999号</address>
 <desc>连锁超市</desc>
 <telphone>123456789</telphone>
 <cat id="food">
  <goods id="food11">
   <name>food11</name>
   <price>12.90</price>
   <leaveword author="hahawen" date="2004-12-05">这个商品不错</leaveword>
  </goods>
  <goods id="food12">
   <name>food12</name>
   <price>22.10</price>
   <desc creator="hahawen">好东西推荐</desc>
  </goods>
 </cat>
 <cat>
  <goods id="tel21">
   <name>tel21</name>
   <price>1290</price>
  </goods>
 </cat>
 <cat id="coat">
  <goods id="coat31">
   <name>coat31</name>
   <price>112</price>
  </goods>
  <goods id="coat32">
   <name>coat32</name>
   <price>45</price>
  </goods>
 </cat>
 <special id="hot">
  <goods>
   <name>hot41</name>
   <price>99</price>
  </goods>
 </special>
</shop>

  Below, setValue()/removeValue() change and delete values of the item "food12"; the result is shown afterwards

<?xml version="1.0" encoding="GB2312" ?>
<shop>
 <name>华联</name>
 <address>北京长安街-9999号</address>
 <desc>连锁超市</desc>
 <telphone>123456789</telphone>
 <cat id="food">
  <goods id="food11">
   <name>food11</name>
   <price>12.90</price>
   <leaveword author="hahawen" date="2004-12-05">这个商品不错</leaveword>
  </goods>
  <goods id="food12">
   <name>new food12</name>
   <price>22.10</price>
  </goods>
 </cat>
 <cat>
  <goods id="tel21">
   <name>tel21</name>
   <price>1290</price>
  </goods>
 </cat>
 <cat id="coat">
  <goods id="coat31">
   <name>coat31</name>
   <price>112</price>
  </goods>
  <goods id="coat32">
   <name>coat32</name>
   <price>45</price>
  </goods>
 </cat>
 <special id="hot">
  <goods>
   <name>hot41</name>
   <price>99</price>
  </goods>
 </special>
</shop>


  Below, createNode() adds an item; the result is shown afterwards

<?xml version="1.0" encoding="GB2312" ?>
<shop>
 <name>华联</name>
 <address>北京长安街-9999号</address>
 <desc>连锁超市</desc>
 <telphone>123456789</telphone>
 <cat id="food">
  <goods id="food11">
   <name>food11</name>
   <price>12.90</price>
   <leaveword author="hahawen" date="2004-12-05">这个商品不错</leaveword>
  </goods>
  <goods id="food12">
   <name>new food12</name>
   <price>22.10</price>
  </goods>
  <goods id="food13">
   <name>food13</name>
   <price>100</price>
  </goods>
 </cat>
 <cat>
  <goods id="tel21">
   <name>tel21</name>
   <price>1290</price>
  </goods>
 </cat>
 <cat id="coat">
  <goods id="coat31">
   <name>coat31</name>
   <price>112</price>
  </goods>
  <goods id="coat32">
   <name>coat32</name>
   <price>45</price>
  </goods>
 </cat>
 <special id="hot">
  <goods>
   <name>hot41</name>
   <price>99</price>
  </goods>
 </special>
</shop>

  Below, removeNode() deletes an item; the result is shown afterwards

<?xml version="1.0" encoding="GB2312" ?>
<shop>
 <name>华联</name>
 <address>北京长安街-9999号</address>
 <desc>连锁超市</desc>
 <telphone>123456789</telphone>
 <cat id="food">
  <goods id="food11">
   <name>food11</name>
   <price>12.90</price>
   <leaveword author="hahawen" date="2004-12-05">这个商品不错</leaveword>
  </goods>
  <goods id="food13">
   <name>food13</name>
   <price>100</price>
  </goods>
 </cat>
 <cat>
  <goods id="tel21">
   <name>tel21</name>
   <price>1290</price>
  </goods>
 </cat>
 <cat id="coat">
  <goods id="coat31">
   <name>coat31</name>
   <price>112</price>
  </goods>
  <goods id="coat32">
   <name>coat32</name>
   <price>45</price>
  </goods>
 </cat>
 <special id="hot">
  <goods>
   <name>hot41</name>
   <price>99</price>
  </goods>
 </special>
</shop>

Original source: http://www.ltesting.net