Tip: Variable substitution in XML documents


XML was originally created simply to allow documents to be authored (marked up) in a
variety of formats. Because XML is primarily a language for representing static data,
variables, value substitution, and other dynamic data representations were not
considered (at least, not much!). As a result, XML documents often end up with redundant
data, inconsistent data, and a variety of other problems that result from purely static
data formats.

XML does have a limited facility for dynamic data, and it turns out that this
facility can empower XML authors greatly. That facility, of course, is entity references.
In this tip, I examine entity references in detail, explaining how they are used
and what they offer you, the XML document author.

I realize that this is one of those topics in XML that can seem a little
mysterious and obscure. However, you can simply think of an entity reference as a variable
in XML; that variable has a value that’s declared somewhere, and every time the variable occurs,
the parser substitutes that value in the output. More specifically and accurately,
an entity reference is like a static, final variable in the Java language. It cannot change
from its initial value, which is defined in a DTD somewhere.
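
To make the "static, final" comparison concrete, here is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser. The element name note and the entity name greeting are hypothetical, and the declarations sit inside the document itself only so the snippet stands alone (the declaration syntax is covered next). The XML 1.0 specification says that when the same entity is declared more than once, the first declaration is binding, so the second declaration here should have no effect.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "note" and "greeting" are illustrative names only.
# The entity is declared twice; per XML 1.0, the first declaration is binding.
DOCUMENT = """<!DOCTYPE note [
  <!ENTITY greeting "first value">
  <!ENTITY greeting "second value">
]>
<note>&greeting;</note>"""

# The parser expands &greeting; while building the tree.
value = ET.fromstring(DOCUMENT).text
print(value)
```

In other words, once the parser has seen a value for an entity, nothing later in the document can reassign it.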

While an entity often refers to an external resource (by way of a system identifier
such as a URL), it can also have a value defined for it directly in a DTD, as in Listing 1.

<!ENTITY variableName "variable value">

Instead of typing “variable value” several times in your XML document
(and possibly introducing typos and user error), you can just refer to
the value through the reference, as shown in Listing 2.

<content>The variable's value is &variableName;.</content>
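
To see the substitution happen, here is a small sketch in Python, assuming the standard library's xml.etree.ElementTree parser. The declaration from Listing 1 is placed in an internal DTD subset purely so the snippet is self-contained; that placement is an assumption of this example, not a requirement.

```python
import xml.etree.ElementTree as ET

# Listing 1 (the declaration) and Listing 2 (the reference) combined into one
# document. Putting the declaration in the DOCTYPE is an assumption made so
# that the example parses on its own.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE content [
  <!ENTITY variableName "variable value">
]>
<content>The variable's value is &variableName;.</content>"""

root = ET.fromstring(DOCUMENT)

# By the time the tree is built, the reference has been replaced by its value.
expanded_text = root.text
print(expanded_text)
```

The application code never sees &variableName; at all; it only sees the text after the parser has expanded the reference.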

Of course, this seems pretty trivial, so let’s look at a more realistic example.
Listing 3 shows a simple XML document fragment that’s intended for display on a Web page.

  <title>Simplify with entity references</title>
  <content type="html">
    <center><h1>Simplify with entity references</h1></center>

      This tip, <i>Simplify with entity references</i>, details an underused facet of
      XML document authoring. So on and so on, ad nauseam, ad infinitum.
  </content>

Notice that the title “Simplify with entity references” is repeated three times.
Not only does this introduce room for error, it also makes it a pain to change all
occurrences (of which there may be a dozen or more in this and related documents
in the future). This makes the document a good candidate for an entity reference.
First, add the definition of the entity to your DTD, as in Listing 4.

<!ENTITY articleTitle "Simplify with entity references">

Then, change the XML to look like that in Listing 5.


  <title>&articleTitle;</title>
  <content type="html">
    <center><h1>&articleTitle;</h1></center>

      This tip, <i>&articleTitle;</i>, details an underused facet of
      XML document authoring. So on and so on, ad nauseam, ad infinitum.
  </content>


Now, by simply changing the entity’s value in its declaration, you can change every
reference in the XML document to the new value.
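
The following Python sketch illustrates that single point of change, assuming the standard library's xml.etree.ElementTree parser. The page root element, the internal DTD subset, and the helper name expanded_titles are assumptions made for this example so that it can run on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical template: the <page> root and the internal DTD subset are
# assumptions made so the snippet parses without an external DTD file.
# The title must not contain a double quote, "&", "%", or braces.
TEMPLATE = """<!DOCTYPE page [
  <!ENTITY articleTitle "{title}">
]>
<page>
  <title>&articleTitle;</title>
  <content type="html">
    <center><h1>&articleTitle;</h1></center>
    This tip, <i>&articleTitle;</i>, details an underused facet of
    XML document authoring.
  </content>
</page>"""


def expanded_titles(title):
    """Parse the template with the given entity value and return the text of
    the three elements that reference &articleTitle;."""
    root = ET.fromstring(TEMPLATE.format(title=title))
    paths = ("title", "content/center/h1", "content/i")
    return [root.find(path).text for path in paths]


# Only the declaration changes between the two calls.
print(expanded_titles("Simplify with entity references"))
print(expanded_titles("A new title"))
```

Editing the one declaration is enough to update all three places where the title appears.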

In addition, you can move the entity definition from the DTD into the XML document
itself, as seen in Listing 6.

<?xml version="1.0"?>
<!DOCTYPE page [
  <!ENTITY articleTitle "Simplify with entity references">
]>
<page>
  <title>&articleTitle;</title>
  <content type="html">
    <center><h1>&articleTitle;</h1></center>

      This tip, <i>&articleTitle;</i>, details an underused facet of
      XML document authoring. So on and so on, ad nauseam, ad infinitum.
  </content>
</page>

While this won’t necessarily improve the performance of your document
parsing, it certainly is a better organizational approach, and makes maintenance
significantly easier. The only time you might want to move the entity declaration outside
of the document is when multiple documents share data and all use a shared entity.

