Comparing XML Files

Comparing XML files is similar to comparing text files: open a file comparison window and load the files, or paste the text, that you want to compare.

XML files are, by definition, text files, and comparing them as such is often sufficient. There are times, however, when a text comparison of XML files does not yield the desired results because of insignificant differences between the two files, such as formatting or attribute ordering, neither of which affects the content of the XML. Although designed to be human-readable, XML data is often computer-generated, and many XML generators attempt to minimize the number of characters in the resulting file, often producing a format that is unsuitable for human interpretation.

To achieve the desired result, DeltaWalker employs a technique referred to as XML canonicalization. The canonicalization process often changes the original XML document in one or more of the following ways:

  • Encoding of the document in UTF-8.
  • Normalization of line breaks to #xA.
  • Normalization of the attribute values.
  • Substitution of character and parsed entities.
  • Removal of the XML declaration.
  • Removal of the document type declaration.
  • Addition of default attributes.

All of these changes make the canonical form of XML documents particularly useful for comparison: they eliminate unimportant aspects such as formatting, narrowing the focus strictly to content differences. DeltaWalker's Text and Structure views give you the two most important views on your XML data.
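
The effect of canonicalization on a comparison can be illustrated outside DeltaWalker with a minimal sketch. It uses the canonicalize function from Python's standard xml.etree.ElementTree module (Python 3.8 or later), which implements the W3C C14N 2.0 variant; it is not DeltaWalker's own implementation, and its output may differ in detail from what DeltaWalker produces. The two sample documents and the same_content helper are hypothetical, chosen only to show differences in the XML declaration, attribute order, quote style, empty-element syntax, and a character reference.

```python
from xml.etree.ElementTree import canonicalize  # standard library, Python 3.8+

# Two hypothetical documents with the same content but a different surface
# form: XML declaration, attribute order, quote style, empty-element syntax,
# and a character reference (&#65; is the letter A).
LEFT = '<?xml version="1.0"?><root b="2" a="1"><item>&#65;</item><empty/></root>'
RIGHT = "<root a='1' b='2'><item>A</item><empty></empty></root>"


def same_content(left: str, right: str) -> bool:
    """Return True if both documents have the same canonical form."""
    return canonicalize(left) == canonicalize(right)


print(LEFT == RIGHT)              # plain text comparison of the raw strings
print(same_content(LEFT, RIGHT))  # comparison of the canonical forms
```

The raw strings differ, while their canonical forms match. Note that this sketch does not remove indentation between elements; in Python that requires passing the additional strip_text=True option to canonicalize, which goes beyond what canonicalization alone prescribes.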