PHP Basic Series – Playing with XML

XML (Extensible Markup Language) is one of the most flexible file types that you can use in your application. It makes it easy for 2 different applications to communicate, and it can save data, store configuration information and much more.

PHP has 2 great libraries to work with XML:

  • DOM (Document Object Model): A great library for heavy manipulation of XML at all levels
  • SimpleXML: A “simple” XML library that helps you easily manipulate XML files up to a certain level (for instance, SimpleXML has no dedicated method to delete an item from an XML document).

Even though SimpleXML does not handle everything, it is by far the best library to work with. It is very light, it performs well, and it covers the essentials: creating, reading, adding and, with a little trick, deleting nodes. The best feature of the SimpleXML library is XPath. XPath is a small query language that does for XML roughly what “regex” does for strings: it lets you easily get the contents and attributes of a given node, parent, etc.

[Note:] SimpleXML only accepts well-formed XML files / strings. This means that simply loading a document already gives you a quick well-formedness check: if the XML is malformed, the load functions return false.
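As a minimal sketch of the check described in the note, the block below wraps the load call in a small helper. The function name `isWellFormedXml` is hypothetical and only used for illustration; `libxml_use_internal_errors()` is used so that malformed input does not print PHP warnings.

```php
<?php
// Hypothetical helper (not part of SimpleXML): true when the string parses as XML.
function isWellFormedXml(string $xmlString): bool
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($xmlString);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // simplexml_load_string() returns false when the input cannot be parsed.
    return $xml !== false;
}

// Quoted attribute value: well-formed.
var_dump(isWellFormedXml('<states><state abbreviation="IN">Indiana</state></states>')); // bool(true)
// Unquoted attribute value: not well-formed.
var_dump(isWellFormedXml('<states><state abbreviation=IN>Indiana</state></states>'));   // bool(false)
```

Note that this only checks well-formedness; it does not validate the document against a schema or DTD.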

First let’s work with a small XML file:

<?xml version="1.0" encoding="utf-8" ?>
<states>
<!--US States and Territories-->
 <state abbreviation="AL">Alabama</state>
 <state abbreviation="AK">Alaska</state>
 <state abbreviation="AZ">Arizona</state>
 <state abbreviation="AR">Arkansas</state>
 <state abbreviation="FL">Florida</state>
 <state abbreviation="IN">Indiana</state>
 <!-- list continues -->

</states>

Copy this code, paste it into a text file and name it as you please (states-provinces.xml / states-provinces.txt / etc). Since SimpleXML can also read XML from a string, this post shows both approaches.

First let’s load the created file into the SimpleXML object.

<?php
// reading a xml file with simple xml
$xml = simplexml_load_file('states-provinces.xml');
?>

Or, if you want to load it from a string:

<?php
// reading a xml string with simple xml
$xmlString = '
<states>
<!--US States and Territories-->
 <state abbreviation="AL">Alabama</state>
 <state abbreviation="AK">Alaska</state>
 <state abbreviation="AZ">Arizona</state>
 <state abbreviation="AR">Arkansas</state>
 <state abbreviation="FL">Florida</state>
 <state abbreviation="IN">Indiana</state>
 <!-- list continues -->

</states>';
$xml = simplexml_load_string($xmlString);
?>

After this point, all the remaining SimpleXML procedures are the same, and so is the structure.

Once created, the SimpleXML object holds the whole XML document as a tree of SimpleXMLElement objects that you can access much like arrays.
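To make the loaded structure a little more concrete, here is a small sketch that inspects the object returned by the load functions. The two-state XML string is a shortened version of the sample above, used only for illustration.

```php
<?php
$xml = simplexml_load_string(
    '<states>
       <state abbreviation="AL">Alabama</state>
       <state abbreviation="AK">Alaska</state>
     </states>'
);

// The root of the document is a SimpleXMLElement object.
echo get_class($xml), "\n";    // SimpleXMLElement
// getName() gives the tag name of the element the object points at.
echo $xml->getName(), "\n";    // states
// Child elements with the same name can be counted like an array.
echo count($xml->state), "\n"; // 2
```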

As with any other kind of data handling, manipulation comes down to three operations:

  • read
  • write
  • delete

SimpleXML does read and write pretty well, but delete is a bit tricky.

Reading:

Reading attributes and values from a given item is pretty simple, and the library offers you 2 ways to access the element node.

  1. Accessing it array style
  2. Accessing it using XPATH

To access it array style, all you need to do is use the position of the element in the array (if you know it) or call the children method inside a loop.

<?php
foreach ($xml->children() as $node) {
 // to read the state name value all you need is to convert the node to string
 $name = (string) $node;
 // to read the state abbreviation is even simpler, call the attribute using the array feature
 $abbreviation = (string) $node['abbreviation'];
 // just to print
 echo "name: {$name} abbrv: {$abbreviation}\n";
}
?>
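When the position is already known, the loop can be skipped. The sketch below assumes a shortened three-state version of the sample XML and reads the third state directly by its zero-based index.

```php
<?php
$xml = simplexml_load_string(
    '<states>
       <state abbreviation="AL">Alabama</state>
       <state abbreviation="AK">Alaska</state>
       <state abbreviation="AZ">Arizona</state>
     </states>'
);

// Positions are zero-based, so index 2 is the third <state> element.
$third = $xml->state[2];
echo "name: {$third} abbrv: {$third['abbreviation']}\n"; // name: Arizona abbrv: AZ

// isset() guards against positions that do not exist.
var_dump(isset($xml->state[10])); // bool(false)
```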

It seems simple, but when you are working with XML files that have 3 or 4 levels of child nodes, it becomes more complicated and more expensive to get to the results. In cases like this we use XPATH to reach the elements of an XML document quickly.

As mentioned before, XPATH works like a small REGEX for XML: one expression is enough to get to the element and load its features.

<?php
// xpath() returns an array with every matching node, so Indiana is at $nodes[0]
$nodes = $xml->xpath('//state[@abbreviation="IN"]');
?>

As simple as it looks, with XPATH you can easily access the Indiana node value without having to loop until you reach it. XPATH will also be very useful when we consider deleting a given node.
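Since the xpath method returns an array of matches rather than a single node, a short sketch of how the result might be consumed can help. The two-state XML string below is a shortened stand-in for the sample file.

```php
<?php
$xml = simplexml_load_string(
    '<states>
       <state abbreviation="FL">Florida</state>
       <state abbreviation="IN">Indiana</state>
     </states>'
);

// xpath() returns an array of matching SimpleXMLElement objects.
$matches = $xml->xpath('//state[@abbreviation="IN"]');

// The array is empty when nothing matches, so check before reading.
if (!empty($matches)) {
    echo (string) $matches[0], "\n"; // Indiana
}

// Attributes can be selected directly with the @ syntax.
foreach ($xml->xpath('//state/@abbreviation') as $abbreviation) {
    echo (string) $abbreviation, "\n"; // FL, then IN
}
```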

Writing:

SimpleXML is a simply structured XML library, so there aren’t many methods to write into the file. However, the existing methods are more than enough to perform the operation. The basics are values and attributes, so all that is needed is to add the child and its attributes.

Consider the insertion of the state of Texas into our small XML file:

<?php
// adding a new state child
$state = $xml->addChild('state', 'Texas');
// adding the abbreviation attribute on the newly created state
$state->addAttribute('abbreviation', 'TX');
// saving the xml file (without a file name, asXML() only returns the XML as a string)
$xml->asXML('states-provinces.xml');

foreach ($xml->children() as $node) {
    // to read the state name value all you need is to convert the node to string
    $name = (string) $node;
    // to read the state abbreviation is even simpler, call the attribute using the array feature
    $abbreviation = (string) $node['abbreviation'];
    // just to print
    echo "name: {$name} abbrv: {$abbreviation}\n";
}
?>

Just like reading, if it is necessary to add a child inside a given node, all that is needed is to reach the node and then add the child. To exemplify, consider adding a city child inside the recently created Texas state.

<?php
$state = $xml->addChild('state', 'Texas');
$state->addAttribute('abbreviation', 'TX');

$city = $state->addChild('city', 'Austin');
$city->addAttribute('capital', 'yes');
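// saving the xml file (without a file name, asXML() only returns the XML as a string)
$xml->asXML('states-provinces.xml');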
?>
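The difference between the two ways of calling asXML is easy to miss, so here is a small sketch of both, followed by reading the nested city back. It starts from an empty `<states/>` document and writes to a temporary file, both of which are assumptions made only to keep the example self-contained.

```php
<?php
$xml = simplexml_load_string('<states/>');

$state = $xml->addChild('state', 'Texas');
$state->addAttribute('abbreviation', 'TX');
$city = $state->addChild('city', 'Austin');
$city->addAttribute('capital', 'yes');

// Without an argument, asXML() returns the document as a string.
$asString = $xml->asXML();

// With a path, asXML() writes the file and returns true on success.
$path = tempnam(sys_get_temp_dir(), 'states'); // temporary target, for illustration only
$saved = $xml->asXML($path);

// Reading the nested node back from the saved file.
$reloaded = simplexml_load_file($path);
echo $reloaded->state->city, "\n";            // Austin
echo $reloaded->state->city['capital'], "\n"; // yes

unlink($path);
```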

Because the SimpleXML library is a simpler take on XML handling, it does not offer dedicated functions such as DELETE and UPDATE. However, there is a way around it.

Deleting:

SimpleXML does not support direct deletion of a node. However, since it is treated as an array, you can, just like in an array, unset a node to delete it.

<?php
    $counter = -1;
    $itemToUnset = null;
    // look for the position of the state that has the TX abbreviation
    foreach ($xml->state as $node) {
        ++$counter;
        if ((string) $node['abbreviation'] === 'TX') {
            $itemToUnset = $counter;
            break;
        }
    }

    // only unset when the node was actually found
    if ($itemToUnset !== null) {
        unset($xml->state[$itemToUnset]);
    }
    $newXmlText = $xml->asXML();
?>

For XML files with multiple tier levels, you can use XPATH to get to the parent node and then loop inside it to get the index and “delete” the node.
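A minimal sketch of that multi-tier case could look like the block below. The `<countries>` / `<country>` wrapper and its `code` attribute are hypothetical and were made up only to have one more level above the states.

```php
<?php
$xml = simplexml_load_string(
    '<countries>
       <country code="US">
         <state abbreviation="TX">Texas</state>
         <state abbreviation="IN">Indiana</state>
       </country>
     </countries>'
);

// Use XPATH to jump straight to the parent node.
$parents = $xml->xpath('//country[@code="US"]');

if (!empty($parents)) {
    $country = $parents[0];
    $index = null;
    $position = 0;

    // Loop inside the parent to find the index of the node to remove.
    foreach ($country->state as $state) {
        if ((string) $state['abbreviation'] === 'TX') {
            $index = $position;
            break;
        }
        $position++;
    }

    // Unset only when the node was found.
    if ($index !== null) {
        unset($country->state[$index]);
    }
}

echo count($xml->country->state), "\n";             // 1
echo $xml->country->state[0]['abbreviation'], "\n"; // IN
```

The node returned by xpath still belongs to the original document, which is why the unset is visible through `$xml` afterwards.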

Updating:

Updating is not one of the easier tasks with SimpleXML and, in a case like this, using the DOM XML object is recommended. However, just like with delete, there is a way around it to update an XML file using SimpleXML.

First, get all the information that is necessary to re-create the node. Update the pieces that need updating, delete the node, and then re-insert it with the addChild method.
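The steps above can be sketched roughly as follows. The misspelled “Texsa” value is invented for the example so that there is something to correct.

```php
<?php
$xml = simplexml_load_string(
    '<states>
       <state abbreviation="AL">Alabama</state>
       <state abbreviation="TX">Texsa</state>
     </states>'
);

// 1. Find the node and collect what is needed to re-create it.
$index = null;
$position = 0;
$attributes = [];
foreach ($xml->state as $state) {
    if ((string) $state['abbreviation'] === 'TX') {
        $index = $position;
        foreach ($state->attributes() as $key => $value) {
            $attributes[$key] = (string) $value;
        }
        break;
    }
    $position++;
}

if ($index !== null) {
    // 2. Delete the old node.
    unset($xml->state[$index]);

    // 3. Re-insert it with the corrected value and the saved attributes.
    $updated = $xml->addChild('state', 'Texas');
    foreach ($attributes as $key => $value) {
        $updated->addAttribute($key, $value);
    }
}

echo $xml->state[1], "\n";                 // Texas
echo $xml->state[1]['abbreviation'], "\n"; // TX
```

Keep in mind that the re-inserted node is appended at the end of its parent, so the original order is not preserved. For a simple text or attribute change, direct assignment such as `$xml->state[1] = 'Texas';` also works and keeps the node in place.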

Even though SimpleXML does not offer strong support for delete and update, it is fast and convenient for reading. Most of the time XML manipulation will be used to read values from an AJAX request, read a config file for a bootstrap, simply store a new entry in an XML file, etc.

With the support of XPATH, SimpleXML becomes a strong library that can handle the basics of XML manipulation with PHP in a structured and easy-to-understand way.

Have fun.

About mcloide

Making things simpler, just check: http://www.mcloide.com
