Overview of XML

XML permits document authors to create markup (that is, text-based notation for describing data) for virtually any type of information. This enables them to create entirely new markup languages for describing any type of data, such as mathematical formulas, software configuration instructions, chemical molecular structures, music, recipes and financial reports.
XML describes data in a way that both human beings and computers can understand. XML is not a replacement for HTML: HTML is about displaying information, while XML is about carrying information. XML uses tags to structure data. The tags are not predefined; every developer is expected to define his or her own tags. XML is designed to be self-descriptive. A tag is a markup construct that begins with "<" and ends with ">". Tags come in three flavours: start-tags, for example <section>; end-tags, for example </section>; and empty-element tags, for example <line-break />. An element’s start and end tags enclose text that represents a piece of data.
Every XML document must have exactly one root element that contains all the other elements. XML documents may begin by declaring some information about themselves, as in the following example.
<?xml version="1.0" encoding="ISO-8859-1"?>
Now consider this simple XML code:
Example 1:
<?xml version="1.0" encoding="ISO-8859-1"?>
<MyPersonalDetails>
   <FullName>...</FullName>
   <BirthDate>...</BirthDate>
</MyPersonalDetails>
As Example 1 shows, XML by itself does nothing at all. It is just information wrapped in tags, and someone must write a piece of software to send, receive or display it. The first line of code states the XML version and the character encoding used by the document. The second line is the root element, whose name indicates what kind of information the document holds. An XML application that uses the code in Example 1 will look first at the root (or parent) tag in the XML document.
Here, it is <MyPersonalDetails>, which is not defined by XML. XML allows authors to create their own tags for use in each document. Like other markup languages, XML allows an element to have two or more child tags, commonly known as nested tags. The <FullName> tag, for example, has three child tags.
Also, XML tags are case sensitive. This means that we cannot pair a <MyPersonalDetails> opening tag with a </myPersonalDetails> closing tag. Opening and closing tags must be written with the same case, as in <MyPersonalDetails> and </MyPersonalDetails>.
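The points above about author-defined root tags, nested child tags and case sensitivity can be sketched in Python with the standard library's xml.etree.ElementTree module. This is only an illustration: the child names under FullName and all text values are placeholders assumed here, since the text names only the MyPersonalDetails and FullName tags.

```python
import xml.etree.ElementTree as ET

# Placeholder document: the child names under <FullName> and the text
# values are assumptions made for this sketch.
DOCUMENT = """<MyPersonalDetails>
  <FullName>
    <FirstName>A</FirstName>
    <MiddleName>B</MiddleName>
    <LastName>C</LastName>
  </FullName>
</MyPersonalDetails>"""

root = ET.fromstring(DOCUMENT)
root_tag = root.tag  # the author-defined root tag
full_name_children = [child.tag for child in root.find("FullName")]

# The opening and closing tags below differ only in case.
MISMATCHED = "<MyPersonalDetails></myPersonalDetails>"
try:
    ET.fromstring(MISMATCHED)
    case_mismatch_accepted = True
except ET.ParseError:
    case_mismatch_accepted = False

print(root_tag, full_name_children, case_mismatch_accepted)
```

The parser rejects the second document because it treats MyPersonalDetails and myPersonalDetails as two different names.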
Creating and Modifying XML Documents
XML allows one to describe data precisely in a well-formed format. XML documents are highly portable. Any text editor, such as Notepad, or any other software that supports ASCII/Unicode characters can open XML documents for viewing and editing. An XML document is created by typing XML code into a text editor and then saving the document with a filename that has a .xml extension. Most Web browsers can display XML documents in a formatted manner that shows the XML’s structure.
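Besides typing XML into a text editor, a document can also be produced by a program. The sketch below is one possible way to do this in Python with xml.etree.ElementTree; the text values and the temporary directory are assumptions for illustration, while the element names and the detail2.xml filename follow the text.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Build the element tree in memory; all text values are placeholders.
root = ET.Element("MyPersonalDetails")
birth_date = ET.SubElement(root, "BirthDate")
for tag, text in (("Month", "1"), ("Date", "1"), ("Year", "2000")):
    ET.SubElement(birth_date, tag).text = text

# Save with a .xml extension; a temporary directory keeps the sketch tidy.
path = os.path.join(tempfile.mkdtemp(), "detail2.xml")
ET.ElementTree(root).write(path, encoding="ISO-8859-1", xml_declaration=True)

# The saved file is plain text that any editor can open.
with open(path, encoding="ISO-8859-1") as handle:
    saved_text = handle.read()
print(saved_text)
```

Reading the file back as ordinary text shows that the result is the same kind of document one would type by hand.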
Processing XML Documents
To process an XML document, you need an XML parser (or XML processor). A parser is software that checks that the document follows the syntax rules specified by the W3C’s XML Recommendation and makes the document’s data available to applications. A parser would, for example, check an XML document to ensure that there is a single root element, a start tag and an end tag for each element, and properly nested tags (that is, the end tag for a nested element must appear before the end tag of the enclosing element).
Furthermore, XML is case sensitive, so the proper capitalisation must be used in elements, as in Example 1. A document that conforms to this syntax is said to be a well-formed XML document and is syntactically correct. If an XML parser can process an XML document successfully, that XML document is well formed. Parsers can provide access to XML-encoded data in well-formed documents only. XML parsers are often built into software or available for download over the Internet. Examples of parsers include Microsoft XML Core Services (MSXML), Xerces and Expat.
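The well-formedness checks described above can be tried out with a small Python sketch. It relies on the parser behind xml.etree.ElementTree, which raises ParseError for input that is not well formed; the helper name is_well_formed and the sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples, one per rule mentioned in the text.
samples = {
    "single root, proper nesting": "<a><b>text</b></a>",
    "two root elements": "<a></a><b></b>",
    "missing end tag": "<a><b>text</a>",
    "improper nesting": "<a><b><c></b></c></a>",
}
results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first sample is accepted; each of the others breaks one of the rules a parser checks.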
Validating XML Documents
In addition to being well formed, an XML document may be valid. This means that it contains a reference to a Document Type Definition (DTD) and that its elements and attributes are declared in that DTD and follow the grammatical rules that the DTD specifies for them. A DTD is an example of a schema or grammar.
Since the initial publication of XML 1.0, there has been substantial work in the area of schema languages for XML. Such schema languages typically constrain the set of elements that may be used in a document, which attributes may be applied to them, the order in which they may appear, and the allowable parent/child relationships. When an XML document references a DTD or a schema, some parsers (called validating parsers) can read the DTD/schema and check that the XML document follows the structure it defines. If the XML conforms to the DTD/schema, the XML document is valid. For example, if the document in Figure 2.1 referenced a DTD that specifies that a BirthDate element must have Month, Date and Year elements, then omitting the Year element would invalidate the XML document detail2.xml.
However, the XML document would still be well formed, because it follows proper XML syntax (that is, it has one root element, each element has a start tag and an end tag, and the elements are nested properly). By definition, a valid XML document is well formed.
Parsers that cannot check for document conformity against DTDs/schemas are non-validating parsers: they determine only whether an XML document is well formed, not whether it is valid. Schemas written in the W3C XML Schema language are XML documents themselves, whereas DTDs are not. XML processors are thus classified as validating or non-validating depending on whether or not they check XML documents for validity. A processor that discovers a validity error must be able to report it, but may continue normal processing.
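The difference between a well-formed document and a valid one can be illustrated with a rough Python sketch. Python's standard library parser is non-validating, so the code below does not read a DTD; instead, a hand-written rule (an assumption of this sketch) stands in for a DTD declaration that BirthDate must contain Month, Date and Year. The function name and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a DTD rule; this is not real DTD validation.
REQUIRED_BIRTHDATE_CHILDREN = ["Month", "Date", "Year"]

def check_birth_dates(text):
    """Return a list of validity errors; raises ET.ParseError if not well formed."""
    root = ET.fromstring(text)
    errors = []
    for birth_date in root.iter("BirthDate"):
        for tag in REQUIRED_BIRTHDATE_CHILDREN:
            if birth_date.find(tag) is None:
                errors.append(f"BirthDate is missing required element {tag}")
    return errors

# Placeholder documents; the text values are assumptions.
complete = (
    "<MyPersonalDetails><BirthDate>"
    "<Month>1</Month><Date>1</Date><Year>2000</Year>"
    "</BirthDate></MyPersonalDetails>"
)
missing_year = (
    "<MyPersonalDetails><BirthDate>"
    "<Month>1</Month><Date>1</Date>"
    "</BirthDate></MyPersonalDetails>"
)

complete_errors = check_birth_dates(complete)
missing_year_errors = check_birth_dates(missing_year)
print(complete_errors)
print(missing_year_errors)
```

Both documents parse without error, so both are well formed, but only the first satisfies the rule. A real validating parser would apply the rules from the referenced DTD or schema rather than a hard-coded list.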
Formatting and Manipulating XML Documents
XML documents can be manipulated to appear differently on different devices. For example, the way an XML document renders on a Personal Digital Assistant (PDA) is different from the way it renders on a desktop computer. Most XML documents contain only data. They do not include formatting instructions, so applications that process XML documents must decide how to process, manipulate or display the data.
Extensible Stylesheet Language (XSL) can be used to specify rendering instructions for different platforms. XML-processing programs can also search, sort and manipulate XML data using XSL. Other popular XML-related technologies are: XML Path Language (XPath), which is used for accessing parts of an XML document; XSL Formatting Objects (XSL-FO), an XML vocabulary used to describe document formatting; and XSL Transformations (XSLT), a language used for transforming XML documents into other documents.
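As a small illustration of accessing parts of a document with path expressions, the sketch below uses the limited XPath subset supported by Python's xml.etree.ElementTree (it is not a full XPath or XSL implementation). The document content is a placeholder, and the child names under FullName are assumed.

```python
import xml.etree.ElementTree as ET

# Placeholder document; names under <FullName> and all values are assumed.
DOCUMENT = """<MyPersonalDetails>
  <FullName>
    <FirstName>A</FirstName>
    <MiddleName>B</MiddleName>
    <LastName>C</LastName>
  </FullName>
  <BirthDate>
    <Month>1</Month>
    <Date>1</Date>
    <Year>2000</Year>
  </BirthDate>
</MyPersonalDetails>"""

root = ET.fromstring(DOCUMENT)

# Select one element by its path from the root.
year = root.findtext("./BirthDate/Year")

# Select every child of FullName with a wildcard step.
name_parts = [element.text for element in root.findall("./FullName/*")]

# Search all descendants, keep the leaves, and sort their tag names.
leaf_tags = sorted(e.tag for e in root.findall(".//*") if len(e) == 0)

print(year, name_parts, leaf_tags)
```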
Viewing an XML Document in Web Browser
Example 1 shows a simple listing of the text file detail2.xml. This document does not contain formatting information, because XML is a tool for describing the structure of data and for storing and transferring it across disparate formats and sources. Formatting and displaying data from an XML document is achieved in different ways on different application platforms. For instance, when the user loads detail2.xml in Internet Explorer (which uses MSXML, Microsoft XML Core Services) or Firefox, the browser parses the document and displays its data. Each browser has a built-in style sheet to format the data.

The XML document will be displayed with colour-coded root and child elements. A plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure. To view the raw XML source (without the + and - signs), select “View Page Source” or “View Source” from the browser menu.
Although these symbols are not part of the XML document, both browsers place them next to every container element. A minus sign indicates that the browser is displaying the container element’s child elements. Clicking the minus sign next to an element collapses that element (that is, it causes the browser to hide the container element’s children and replace the minus sign with a plus sign).
Conversely, clicking the plus sign next to an element expands that element (that is, it causes the browser to display the container element’s children and replace the plus sign with a minus sign).
This behaviour is similar to viewing the directory structure on one’s system in Windows Explorer or another similar directory viewer. In fact, a directory structure is often modelled as a tree structure in which the root of the tree represents a disk drive (for instance, C:) and the nodes in the tree represent directories. Parsers often store XML data as tree structures to facilitate efficient manipulation.
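The tree view described above can be approximated in a few lines of Python. This is only a sketch of the idea, not how any browser implements it: the parsed document is walked recursively, container elements are marked with a minus sign as in an expanded browser view, and leaves get no marker. The outline function and the sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return indented lines: '-' marks a container element, a space marks a leaf."""
    marker = "-" if len(element) else " "
    lines = [f"{'  ' * depth}{marker} {element.tag}"]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

# Placeholder document; the FirstName child and the values are assumed.
DOCUMENT = (
    "<MyPersonalDetails>"
    "<FullName><FirstName>A</FirstName></FullName>"
    "<BirthDate><Year>2000</Year></BirthDate>"
    "</MyPersonalDetails>"
)

tree_lines = outline(ET.fromstring(DOCUMENT))
print("\n".join(tree_lines))
```

Because the parser has already stored the data as a tree, the traversal needs nothing more than a loop over each element's children.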
In the two decades since its introduction, XML has been used to create hundreds of languages, including XHTML; WSDL for describing available web services; WML, the markup language used with WAP on handheld devices; RSS for news feeds; RDF and OWL for describing resources and ontologies; and SMIL for describing multimedia for the web. In addition, XML-based formats have become the default for most office-productivity tools, including Microsoft Office (Office Open XML) and Apple's iWork.
XML describes data in a way that both human beings and computers can understand. It enhances the storage and exchange of data amongst disparate computer systems. So far you have learnt how to create, modify, process, validate and format XML documents, and how to view them in a browser.