
XML Schema, SAX, XQuery

XML Schema

XML Schema is an XML-based alternative to DTD. An XML schema describes the structure of an XML document. The XML Schema language is also referred to as XML Schema Definition (XSD).

What is an XML Schema?


The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. An XML Schema:
defines elements that can appear in a document
defines attributes that can appear in a document
defines which elements are child elements
defines the order of child elements
defines the number of child elements
defines whether an element is empty or can include text
defines data types for elements and attributes
defines default and fixed values for elements and attributes

A Simple XML Document note.xml


<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

A DTD File note.dtd


<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

An XML Schema note.xsd


<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

A Reference to a DTD
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

A Reference to an XML Schema


<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

The <schema> Element


The <schema> element is the root element of every XML Schema:
<?xml version="1.0"?>
<xs:schema>
  ...
  ...
</xs:schema>

The <schema> element may contain some attributes.

A schema declaration often looks something like this:


<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  ...
  ...
</xs:schema>

xmlns:xs="http://www.w3.org/2001/XMLSchema"
indicates that the elements and data types used in the schema come from the "http://www.w3.org/2001/XMLSchema" namespace. It also specifies that elements and data types from that namespace should be prefixed with xs:

xmlns="http://www.w3schools.com"
indicates that the default namespace is "http://www.w3schools.com".

elementFormDefault="qualified"
indicates that any elements used by the XML instance document which were declared in this schema must be namespace qualified.
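
As a rough illustration of what "namespace qualified" means for the instance document, the sketch below parses a trimmed copy of the note with Python's standard xml.etree.ElementTree. The variable names are illustrative only, and ElementTree does not validate against the schema; it only shows that each element name carries the default namespace.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the instance document (two child elements only).
NOTE_XML = """<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com">
  <to>Tove</to>
  <from>Jani</from>
</note>"""

NS = "http://www.w3schools.com"

root = ET.fromstring(NOTE_XML)

# The default namespace applies to <note> and to every child element.
print(root.tag)  # {http://www.w3schools.com}note

# Looking up the bare name finds nothing, because the element is qualified.
unqualified_to = root.find("to")

# The lookup succeeds once the namespace is part of the name.
qualified_to = root.find(f"{{{NS}}}to")
print(unqualified_to, qualified_to.text)
```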

XSD Simple Elements


XML Schemas define the elements of your XML files. Elements of a simple type can only have text content directly between the element's opening and closing tags. They cannot have attributes or child elements.

Defining a Simple Element


The syntax for defining a simple element is:
<xs:element name="xxx" type="yyyy"/>

where xxx is the name of the element and yyyy is the data type of the element. XML Schema has a lot of built-in data types. The most common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time

Example
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>

And here are the corresponding simple element definitions:


<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
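
To make the effect of the type attribute concrete, here is a minimal Python sketch that converts element text according to a declared type. The CONVERTERS table and parse_simple function are hypothetical helpers, not part of any XML library, and Python's converters are only an approximation of the XSD lexical rules.

```python
from datetime import date
from decimal import Decimal

# Hypothetical mapping from a few built-in XSD type names to Python converters.
# These are approximations: for example, int() accepts some strings that
# xs:integer would reject.
CONVERTERS = {
    "xs:string": str,
    "xs:integer": int,
    "xs:decimal": Decimal,
    "xs:date": date.fromisoformat,
}

def parse_simple(text, xsd_type):
    """Convert the text of a simple element using its declared type."""
    return CONVERTERS[xsd_type](text)

lastname = parse_simple("Refsnes", "xs:string")
age = parse_simple("36", "xs:integer")
dateborn = parse_simple("1970-03-27", "xs:date")
print(lastname, age, dateborn)
```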

Default and Fixed Values for Simple Elements


Simple elements may have a default value OR a fixed value specified. A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>

A fixed value is also automatically assigned to the element, and you cannot specify another value. In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>
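
The difference between default and fixed can be sketched as a small Python function. The name effective_value is an assumption made for illustration; a real validator applies these rules internally.

```python
def effective_value(text, default=None, fixed=None):
    """Return the value a simple element ends up with.

    text is the element's content ("" or None when the element is empty).
    """
    if default is not None and fixed is not None:
        raise ValueError("an element may declare default OR fixed, not both")
    empty = text is None or text == ""
    if fixed is not None:
        # A fixed value is assigned automatically; any other value is an error.
        if empty or text == fixed:
            return fixed
        raise ValueError(f"value must be {fixed!r}, got {text!r}")
    if empty and default is not None:
        # A default value is used only when no other value is specified.
        return default
    return text

print(effective_value("", default="red"))      # red
print(effective_value("blue", default="red"))  # blue
print(effective_value("", fixed="red"))        # red
```

Calling effective_value("blue", fixed="red") raises ValueError, which mirrors the rule that a fixed value cannot be overridden.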

XSD Complex Elements


A complex element is an XML element that contains other elements and/or attributes. There are four kinds of complex elements:
empty elements
elements that contain only other elements
elements that contain only text
elements that contain both other elements and text

Note: Each of these elements may contain attributes as well!

Examples of Complex Elements


A complex XML element, "product", which is empty:
<product pid="1345"/>

A complex XML element, "employee", which contains only other elements:


<employee>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>

A complex XML element, "food", which contains only text:


<food type="dessert">Ice cream</food>

A complex XML element, "description", which contains both elements and text:
<description> It happened on <date lang="norwegian">03.03.99</date> .... </description>

How to Define a Complex Element


Look at this complex XML element, "employee", which contains only other elements:
<employee>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>

We can define a complex element in an XML Schema in two different ways:

1. The "employee" element can be declared directly by naming the element, like this:
<xs:element name="employee">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

If you use the method described above, only the "employee" element can use the specified complex type.

2. The "employee" element can have a type attribute that refers to the name of the complex type to use:
<xs:element name="employee" type="personinfo"/>

<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

If you use the method described above, several elements can refer to the same complex type, like this:

<xs:element name="employee" type="personinfo"/>
<xs:element name="student" type="personinfo"/>
<xs:element name="member" type="personinfo"/>

<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
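
Because a schema is itself an XML document, the sharing of a named complex type can be inspected with an ordinary XML parser. The following Python sketch wraps the declarations above in an xs:schema root (an assumption, so that the fragment parses on its own) and lists which elements refer to "personinfo".

```python
import xml.etree.ElementTree as ET

# Prefix map used in the path expressions below.
ns = {"xs": "http://www.w3.org/2001/XMLSchema"}

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="employee" type="personinfo"/>
  <xs:element name="student" type="personinfo"/>
  <xs:element name="member" type="personinfo"/>
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

root = ET.fromstring(SCHEMA)

# Top-level element declarations that refer to the named type.
users = [el.get("name")
         for el in root.findall("xs:element", ns)
         if el.get("type") == "personinfo"]

# Child elements declared inside the named complex type.
ctype = root.find("xs:complexType[@name='personinfo']", ns)
fields = [el.get("name") for el in ctype.findall(".//xs:element", ns)]

print(users)   # ['employee', 'student', 'member']
print(fields)  # ['firstname', 'lastname']
```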

SAX

XML Parsers
What is an XML parser?
Software that reads and parses XML
Passes data to the invoking application
The application does something useful with the data

Why is this a good thing?
Since XML is a standard, we can write generic programs to parse XML data
Frees the programmer from writing a new parser each time a new data format comes along

Two types of parser
SAX (Simple API for XML)
Event-driven API
Sends events to the application as the document is read

DOM (Document Object Model)
Reads the entire document into memory in a tree structure

SAX and DOM


SAX and DOM are standards for XML parsers: program APIs to read and interpret XML files
DOM is a W3C standard
SAX is an ad-hoc (but very popular) standard
SAX was developed by David Megginson and is open source

There are various implementations available
Java implementations are provided as part of JAXP (Java API for XML Processing)
JAXP is included as a package in Java 1.4
JAXP is available separately for Java 1.3

Unlike many XML technologies, SAX and DOM are relatively easy

Difference between SAX and DOM


DOM reads the entire XML document into memory and stores it as a tree data structure
SAX reads the XML document and calls one of your methods for each element or block of text that it encounters
Consequences:
DOM provides random access into the XML document
SAX provides only sequential access to the XML document
DOM is comparatively slow and holds the whole document in memory, so it is a poor fit for very large XML documents
SAX is fast and requires very little memory, so it can be used for huge documents (or large numbers of documents)
This makes SAX a popular choice for web sites

Some DOM implementations have methods for changing the XML document in memory; SAX implementations do not
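
The random access and in-memory modification described above can be sketched with Python's standard xml.dom.minidom, used here as one example of a DOM implementation. The replacement recipient "Ola" is made up for illustration.

```python
from xml.dom.minidom import parseString

NOTE_XML = ("<note><to>Tove</to><from>Jani</from>"
            "<heading>Reminder</heading>"
            "<body>Don't forget me this weekend!</body></note>")

# The whole document is parsed into an in-memory tree first.
doc = parseString(NOTE_XML)

# Random access: jump straight to <body> without visiting earlier elements.
body_text = doc.getElementsByTagName("body")[0].firstChild.data

# In-memory modification: change the text of <to>, then serialize the tree.
doc.getElementsByTagName("to")[0].firstChild.data = "Ola"
updated = doc.documentElement.toxml()

print(body_text)
print(updated)
```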

Callbacks
SAX works through callbacks: you call the parser, it calls methods that you supply
Your program: main(...) calls parse(...) on the SAX parser
The SAX parser calls back into your program as it reads the document:
startDocument(...)
startElement(...)
characters(...)
endElement(...)
endDocument(...)
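
A minimal sketch of the callback flow, using Python's standard xml.sax module, whose ContentHandler exposes the same method names. The EventLogger class and the trimmed note document are illustrative assumptions.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Records each callback the parser makes, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append(f"startElement {name}")

    def characters(self, content):
        # The parser may deliver text in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append(f"characters {content.strip()!r}")

    def endElement(self, name):
        self.events.append(f"endElement {name}")

    def endDocument(self):
        self.events.append("endDocument")

NOTE_XML = b"<note><to>Tove</to><from>Jani</from></note>"

handler = EventLogger()
# Your program calls the parser; the parser calls the handler's methods.
xml.sax.parseString(NOTE_XML, handler)

for event in handler.events:
    print(event)
```

The handler never holds the whole document; it only sees one event at a time, which is why SAX access is sequential.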

XQuery

XQuery is to XML what SQL is to database tables. XQuery was designed to query XML data.
Example
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title

The XML Example Document


Example

How to Select Nodes From "books.xml"?


Functions
XQuery uses functions to extract data from XML documents. The doc() function is used to open the "books.xml" file:
doc("books.xml")

Path Expressions
XQuery uses path expressions to navigate through elements in an XML document. The following path expression is used to select all the title elements in the "books.xml" file:
doc("books.xml")/bookstore/book/title
(/bookstore selects the bookstore element, /book selects all the book elements under the bookstore element, and /title selects all the title elements under each book element)

The XQuery above will extract the following:


<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML</title>
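
For comparison, the same selection can be sketched in Python with the limited path syntax of xml.etree.ElementTree. This is not XQuery, only an analogy. The inline bookstore is a stand-in for "books.xml": the titles and the 29.99 price come from the results shown in these notes, the other prices are assumptions chosen to agree with those results, and the remaining child elements are omitted.

```python
import xml.etree.ElementTree as ET

# Stand-in for books.xml (prices other than 29.99 are assumed).
BOOKS_XML = """<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>"""

# fromstring returns the root element, here <bookstore>.
bookstore = ET.fromstring(BOOKS_XML)

# "book/title" plays the role of /bookstore/book/title, relative to the root.
titles = [t.text for t in bookstore.findall("book/title")]
print(titles)
```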

Predicates
XQuery uses predicates to limit the extracted data from XML documents. The following predicate is used to select all the book elements under the bookstore element that have a price element with a value that is less than 30:

doc("books.xml")/bookstore/book[price<30]

The XQuery above will extract the following:


<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
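
A predicate acts as a filter on the selected nodes. ElementTree's path syntax has no numeric comparisons, so the Python sketch below applies the price test in ordinary code instead. The inline data is an assumed stand-in for "books.xml", with prices chosen to match the result above.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Stand-in for books.xml (prices other than 29.99 are assumed).
BOOKS_XML = """<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>"""

bookstore = ET.fromstring(BOOKS_XML)

# Equivalent of book[price<30]: keep only books whose price is below 30.
cheap_books = [book for book in bookstore.findall("book")
               if Decimal(book.findtext("price")) < 30]

cheap_titles = [book.findtext("title") for book in cheap_books]
print(cheap_titles)
```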

How to Select Nodes From "books.xml" With FLWOR


FLWOR is an acronym for "For, Let, Where, Order by, Return".
Example
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title

The for clause selects all book elements under the bookstore element into a variable called $x.
The where clause selects only book elements with a price element with a value greater than 30.
The order by clause defines the sort order. Here the results are sorted by the title element.
The return clause specifies what should be returned. Here it returns the title elements.

The result of the XQuery expression above will be:


<title lang="en">Learning XML</title>
<title lang="en">XQuery Kick Start</title>
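
Each FLWOR clause has a rough counterpart in plain Python, sketched below clause by clause. This is an analogy rather than an XQuery engine, and the inline data is an assumed stand-in for "books.xml" with prices chosen to match the result above.

```python
import xml.etree.ElementTree as ET

# Stand-in for books.xml (prices other than 29.99 are assumed).
BOOKS_XML = """<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>"""

bookstore = ET.fromstring(BOOKS_XML)

# for $x in doc("books.xml")/bookstore/book
books = bookstore.findall("book")

# where $x/price>30
expensive = [x for x in books if float(x.findtext("price")) > 30]

# order by $x/title
expensive.sort(key=lambda x: x.findtext("title"))

# return $x/title
result = [x.findtext("title") for x in expensive]
print(result)  # ['Learning XML', 'XQuery Kick Start']
```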
