Mastering XML: A Complete Guide to Structuring, Storing, and Exchanging Data


📘 Chapter 3: Validation Using DTD and XSD

🧠 What is XML Validation?

XML validation ensures that your XML document conforms to a specific structure and set of rules. This helps catch errors early and maintain data consistency, especially in applications that exchange information between systems.

The two main schema languages used for validation are:

  • DTD (Document Type Definition)
  • XSD (XML Schema Definition)
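Validation builds on a more basic check: before a DTD or XSD can be applied, the document has to be well-formed (properly nested and closed tags). The following is a minimal sketch of that first step using Python's standard library, which checks well-formedness but does not validate against a DTD or XSD. The helper name `is_well_formed` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML (well-formedness only, no schema check)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<book><title>XML Basics</title></book>"
bad = "<book><title>XML Basics</book>"  # <title> is never closed

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A document can pass this check and still be invalid, for example if a required element is missing. Catching that is the job of the DTD or XSD.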

🧾 1. Why Validate XML?

| Reason | Benefit |
| --- | --- |
| Enforce data rules | Ensure correct data types and formats |
| Interoperability | Make XML portable across systems |
| Catch errors early | Reject invalid data before it reaches other systems |
| Improve documentation | Define the expected structure clearly |


📑 2. DTD – Document Type Definition

🔹 What is DTD?

A DTD defines the structure of an XML document: which elements and attributes are legal, and how elements may nest. It can be supplied in one of two ways:

  • Internal DTD (embedded in the XML file)
  • External DTD (referenced via URL or path)

Internal DTD Example

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title, author, price)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
]>
<book>
  <title>XML Basics</title>
  <author>Jane Doe</author>
  <price>29.99</price>
</book>
```


External DTD Example

book.dtd:

```dtd
<!ELEMENT book (title, author, price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA)>
```

book.xml:

```xml
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
  <title>XML Basics</title>
  <author>Jane Doe</author>
  <price>29.99</price>
</book>
```


📋 DTD Declarations

| Declaration Type | Syntax Example |
| --- | --- |
| Element | `<!ELEMENT title (#PCDATA)>` |
| Attribute List | `<!ATTLIST book genre CDATA #IMPLIED>` |
| Sequence | `<!ELEMENT book (title, author)>` |
| Choice | `<!ELEMENT book (title \| …)>` |
| Optional Element | `<!ELEMENT book (title?, author)>` |
| Repeating Element | `<!ELEMENT book (title+)>` |
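The content models in the table above behave much like regular expressions over child element names. The sketch below illustrates that idea in Python. It is not a DTD validator: the regex constants are hand-translated from the DTD syntax, and the helper `matches_content_model` is a hypothetical name introduced for this example.

```python
import re
import xml.etree.ElementTree as ET

def matches_content_model(element, model_regex):
    """Check an element's sequence of child tags against a regex over tag names."""
    child_tags = "".join(f"<{child.tag}>" for child in element)
    return re.fullmatch(model_regex, child_tags) is not None

# Hand-translated equivalents of three DTD content models (an assumption of this sketch).
SEQUENCE = r"<title><author>"       # (title, author)
OPTIONAL = r"(<title>)?<author>"    # (title?, author)
REPEATING = r"(<title>)+"           # (title+)

book = ET.fromstring(
    "<book><title>XML Basics</title><author>Jane Doe</author></book>"
)
no_title = ET.fromstring("<book><author>Jane Doe</author></book>")

print(matches_content_model(book, SEQUENCE))      # title then author: accepted
print(matches_content_model(no_title, SEQUENCE))  # title missing: rejected
print(matches_content_model(no_title, OPTIONAL))  # title is optional: accepted
print(matches_content_model(book, REPEATING))     # author not allowed: rejected
```

A real validating parser performs this kind of check for every declared element, and additionally verifies attributes and text content.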


🧬 3. XSD – XML Schema Definition

🔹 What is XSD?

XSD is more expressive than DTD: it adds data typing and namespace support on top of structural rules. A schema can define:

  • Element types
  • Data types (string, integer, date, etc.)
  • Attribute types
  • Sequence, choice, and constraints
  • Custom complex types

XSD is written in XML syntax, making it extensible and tool-friendly.


Basic XSD Example

books.xsd:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="price" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

book.xml:

```xml
<?xml version="1.0"?>
<book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="books.xsd">
  <title>XML Mastery</title>
  <author>Alan Turing</author>
  <price>49.99</price>
</book>
```


📋 XSD vs DTD

| Feature | DTD | XSD |
| --- | --- | --- |
| Syntax | Not XML | Written in XML |
| Data types | Limited (mostly text; no numeric or date types) | Rich (string, date, decimal) |
| Namespaces | Not supported | Supported |
| Custom types | No | Yes (simple and complex) |
| Usage today | Legacy and simple use cases | Modern and enterprise-grade |


🔧 4. Data Types in XSD

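| XSD Type | Description |
| --- | --- |
| xs:string | Any string of characters |
| xs:integer | Whole numbers (positive, negative, or zero) |
| xs:decimal | Decimal numbers |
| xs:date | Calendar date (YYYY-MM-DD) |
| xs:boolean | true or false (1 and 0 are also accepted) |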
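To get a feel for what these types mean to an application, here is a rough sketch of how the text content of a typed element might be converted in Python once a document has been validated. The mapping is an approximation: Python's `Decimal` and `date.fromisoformat` accept some inputs that XSD rejects and vice versa. The sample values and the helper `parse_xs_boolean` are invented for this example.

```python
from datetime import date
from decimal import Decimal

def parse_xs_boolean(text: str) -> bool:
    """Convert the lexical forms of xs:boolean (true, false, 1, 0) to a Python bool."""
    value = text.strip()  # surrounding whitespace is ignored for this type
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"not a valid xs:boolean: {text!r}")

price = Decimal("29.99")                      # xs:decimal
published = date.fromisoformat("2024-01-15")  # xs:date without a timezone suffix
in_stock = parse_xs_boolean("true")           # xs:boolean

print(price, published, in_stock)
```

Using `Decimal` rather than `float` keeps values such as prices exact, which matches the intent of xs:decimal.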


Restricting Values

```xml
<xs:element name="status">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="active"/>
      <xs:enumeration value="inactive"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```


Pattern Matching (RegEx)

```xml
<xs:element name="zipCode">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="\d{5}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```
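The enumeration and pattern restrictions can be mirrored in application code, which helps when you want to reason about what a schema will accept. The sketch below is a plain Python approximation and not a schema validator. The function names are made up for this example. One detail worth noting: an XSD pattern must match the entire value, so the Python equivalent is `fullmatch`, not `search`.

```python
import re

ALLOWED_STATUS = {"active", "inactive"}  # mirrors the xs:enumeration values
ZIP_PATTERN = re.compile(r"\d{5}")       # mirrors xs:pattern value="\d{5}"

def is_valid_status(value: str) -> bool:
    """Accept only the enumerated values; the comparison is case-sensitive."""
    return value in ALLOWED_STATUS

def is_valid_zip(value: str) -> bool:
    """Accept exactly five digits; the whole value must match the pattern."""
    return ZIP_PATTERN.fullmatch(value) is not None

print(is_valid_status("active"), is_valid_status("Active"))
print(is_valid_zip("12345"), is_valid_zip("123456"))
```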


🧱 5. Complex Types

Complex Type with Attributes

```xml
<xs:element name="product">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="price" type="xs:decimal"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:integer" use="required"/>
  </xs:complexType>
</xs:element>
```


📎 6. Optional and Repeating Elements

| XSD Notation | Meaning |
| --- | --- |
| minOccurs="0" | Optional (the default is 1) |
| maxOccurs="1" | Appears at most once (the default) |
| maxOccurs="unbounded" | Can repeat any number of times |

```xml
<xs:element name="tag" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
```
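Occurrence constraints come down to counting child elements. The following sketch shows the idea in Python under a few assumptions: the parent element name `post` and the helper `occurs_within` are hypothetical, and `max_occurs=None` stands in for "unbounded".

```python
import xml.etree.ElementTree as ET

def occurs_within(parent, tag, min_occurs=1, max_occurs=1):
    """Count direct children named `tag` and compare against the allowed range.

    The defaults (1 and 1) match XSD's defaults; max_occurs=None means unbounded.
    """
    count = len(parent.findall(tag))
    if count < min_occurs:
        return False
    return max_occurs is None or count <= max_occurs

post = ET.fromstring("<post><tag>xml</tag><tag>xsd</tag></post>")
untagged = ET.fromstring("<post/>")

# minOccurs="0" maxOccurs="unbounded": any number of <tag> children is fine.
print(occurs_within(post, "tag", min_occurs=0, max_occurs=None))
print(occurs_within(untagged, "tag", min_occurs=0, max_occurs=None))

# With the defaults (exactly one occurrence), two <tag> children are too many.
print(occurs_within(post, "tag"))
```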


📐 7. XSD Validation Tools

| Tool Name | Description |
| --- | --- |
| XML Validator | Online web tool to check structure |
| Visual Studio | IDE with schema-aware XML editing |
| Oxygen XML Editor | Professional tool for validation |
| XMLSpy | Enterprise XML modeling tool |


📁 8. Real-World Use Cases

| Industry | XML Schema Application |
| --- | --- |
| Healthcare | HL7 Clinical Documents |
| Banking | ISO 20022 Payment Messages |
| Publishing | EPUB files (chapters, metadata) |
| Government | UBL (Invoices, Purchase Orders) |
| Android | Layout and Manifest validation |


🧪 9. Practice Example

XML

```xml
<employee id="1001">
  <name>John Doe</name>
  <position>Manager</position>
  <salary>5000</salary>
</employee>
```

XSD

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="employee">
    <xs:complexType>
      <!-- xs:sequence: the children must appear in this order -->
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="position" type="xs:string"/>
        <xs:element name="salary" type="xs:decimal"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:integer" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
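To see what this schema asks of a document, it can help to spell out the same rules by hand. The sketch below does that in Python with the standard library only. It is not a substitute for a real XSD validator, and Python's `int` and `Decimal` are slightly more lenient than xs:integer and xs:decimal. The function `check_employee` and its messages are invented for this exercise.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

EMPLOYEE_XML = """
<employee id="1001">
  <name>John Doe</name>
  <position>Manager</position>
  <salary>5000</salary>
</employee>
"""

def check_employee(xml_text):
    """Return a list of problems; an empty list means every hand-written check passed."""
    problems = []
    root = ET.fromstring(xml_text)

    # xs:sequence: the children must be exactly these, in this order.
    if [child.tag for child in root] != ["name", "position", "salary"]:
        problems.append("children must be name, position, salary in that order")

    # xs:attribute use="required" with type xs:integer.
    emp_id = root.get("id")
    if emp_id is None:
        problems.append("missing required attribute 'id'")
    else:
        try:
            int(emp_id)
        except ValueError:
            problems.append("attribute 'id' is not an integer")

    # type="xs:decimal" on <salary>.
    salary = root.findtext("salary")
    if salary is not None:
        try:
            Decimal(salary)
        except InvalidOperation:
            problems.append("salary is not a decimal number")

    return problems

print(check_employee(EMPLOYEE_XML))
```

Try removing the `id` attribute or swapping the order of `<name>` and `<position>` to see which rule is reported.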


📚 Summary Table


| Feature | DTD | XSD |
| --- | --- | --- |
| Syntax | Non-XML | XML-based |
| Validation Type | Structural | Structural + Data types |
| Readability | Shorter, less strict | Verbose but flexible |
| Used In | Legacy projects, simple XML | Modern apps, industry standards |


FAQs


1. Q: What does XML stand for?

A: XML stands for eXtensible Markup Language.

2. Q: Is XML case-sensitive?

A: Yes, <Tag> and <tag> are treated as different elements.

3. Q: Can I define my own tags in XML?

A: Absolutely. That's why it's called "extensible."

4. Q: What’s the difference between XML and HTML?

A: XML stores and structures data, while HTML displays it.

5. Q: Why is XML used in configuration files?

A: Its structured format and readability make it ideal for settings/configs.

6. Q: Can XML be used for data transfer in APIs?

A: Yes. Many enterprise and legacy APIs use SOAP, which is XML-based.

7. Q: Is XML outdated?

A: Not at all. While JSON is preferred for web APIs, XML is widely used in enterprise, publishing, and government systems.

8. Q: How do I check if my XML is valid?

A: Validate it against a DTD or XSD using a validating parser, a schema-aware editor, or an online XML validator.

9. Q: Can XML store binary data?

A: Not directly. Binary data must first be encoded as text, typically with Base64 (XSD provides the xs:base64Binary type for this).

10. Q: What tools can I use to edit XML?

A: Notepad++, VS Code, XMLSpy, Eclipse, and Oxygen XML Editor are popular.
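As a follow-up to FAQ 9, here is a small sketch of carrying binary data inside XML by Base64-encoding it first. It uses only Python's standard library. The element name `attachment` and the sample bytes are made up for this illustration.

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes([0, 255, 16, 128])  # arbitrary binary data that is not valid text

# Encode the bytes as Base64 text and store them as element content.
attachment = ET.Element("attachment")
attachment.text = base64.b64encode(payload).decode("ascii")
xml_text = ET.tostring(attachment, encoding="unicode")
print(xml_text)

# The receiving side parses the XML and decodes the text back into bytes.
decoded = base64.b64decode(ET.fromstring(xml_text).text)
print(decoded == payload)
```

Base64 increases the size of the data by roughly a third, so large binary payloads are often sent alongside the XML and only referenced from it.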