What are Schemas?
“Schemas” is the plural of “schema.” To understand the term fully, the discussion is divided into the following subsections:
Schema and subschema
A schema is the conceptual organization of the entire database as viewed by its designers.
A subschema, by contrast, is the conceptual organization of the database as “seen” by the application program accessing it (1).
The word schema derives from the Greek word σχημα, meaning form or shape. It was first popularized in the Western world by Immanuel Kant in the late 1700s. According to the 1933 edition of the Oxford English Dictionary, Kant used the word schema to mean, "Any one of certain forms or rules of the ‘productive imagination’ through which the understanding is able to apply its ‘categories’ to the manifold of sense-perception in the process of realizing knowledge or experience."
The original Greek plural is σχηματα, schemata in Latin transliteration, and this is the form Kant originally used. Over time the plural changed to something that sounds more natural to an Anglophone ear, and schemata became “schemas.”
The word schema entered computer science probably through database theory. Here, schema originally meant any document that described the permissible content of a database. More specifically, a schema was a description of all the tables in a database and the fields in each table. A schema also described what type of data each field could contain: CHAR, INT, CHAR[32], BLOB, DATE, and so on (2).
The word schema has grown from that source definition to a more
generic meaning of any document that describes the permissible contents of
other documents, especially if data typing is involved. Thus, there are
different kinds of schemas from different technologies, including vocabulary
schemas, RDF schemas, organizational schemas, X.500 schemas and, of course, XML
schemas.
Comparison between Schema and Document Type Definition (DTD) language
Schemas are a logical development over DTDs. They avoid the problems that DTDs have: DTDs have no data typing, they use a non-XML syntax, they are only marginally extensible and do not scale very well, and they cannot enforce the order or number of child elements in mixed content. Hence, schemas are a better choice than DTDs.
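To make the comparison concrete, here is a minimal sketch of a DTD for a hypothetical one-element document (the GREETING element is used purely for illustration). The declaration inside the DOCTYPE is written in DTD syntax rather than as XML elements, and #PCDATA can only say "text"; a DTD has no way to state that the content must be, say, an integer or a date.

```xml
<?xml version="1.0"?>
<!-- Internal DTD subset: note the non-XML declaration syntax -->
<!DOCTYPE GREETING [
  <!-- #PCDATA means "any text"; no data type can be attached -->
  <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>
Hello XML!
</GREETING>
```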
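Schemas are an attempt to solve these problems of DTDs by defining a new XML-based syntax for describing the permissible contents of XML documents that includes:
i) Powerful data typing, including range checking
ii) Namespace-aware validation based on namespace URIs rather than on prefixes
iii) Extensibility and scalability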
Schema languages and their scopes
Schemas are written using specific languages. Since “schema” is such a generic term, there is more than one schema language for XML. In fact there are many, each with its own advantages and disadvantages; further detail is available in the references cited for each. They include Murata Makoto's Relax (3), Rick Jelliffe's Schematron (4), James Clark's TREX (Tree Regular Expressions for XML) (5), the Document Definition Markup Language (DDML, also known as XSchema; 6), and the W3C's misleadingly generic-sounding XML Schema language. In addition, traditional XML DTDs can be considered yet another schema language. W3C schemas are complex. Relax is a simpler language that still offers extensible data typing. Relax adopts the less controversial data types half of the W3C XML Schema recommendation, but replaces the much more complex and much less popular structures half with a much simpler language. Relax also has the advantage of being an official JIS and ISO standard.
Schema code example
The ‘greeting schema’ example: first, write the XML document.
File greeting.xml
<?xml version="1.0"?>
<GREETING>
Hello XML!
</GREETING>
Now write the schema. By convention, a schema file is saved with the three-letter extension .xsd, for example greeting.xsd (see below). A schema can be written and saved in any text editor that knows how to save Unicode files. Schema documents are XML documents and have all the privileges and responsibilities of other XML documents. They can even have DTDs, DOCTYPE declarations, and style sheets.
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="GREETING" type="xsd:string"/>
</xsd:schema>
The root element of this and all other schemas is schema. This must be in the http://www.w3.org/2001/XMLSchema namespace. Normally, this namespace is bound to the prefix xsd or xs, although this can change as long as the URI stays the same.
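As a sketch of that last point, the greeting schema could equally be written with the xs prefix. A schema processor should treat it the same as the xsd version, since only the namespace URI matters; the GREETING declaration is simply reused from the example above.

```xml
<?xml version="1.0"?>
<!-- Same namespace URI, different prefix: xs instead of xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="GREETING" type="xs:string"/>
</xs:schema>
```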
Elements are declared using xsd:element elements. The example above includes a single such element, which declares the GREETING element. The name attribute specifies which element is being declared, GREETING in this example. This xsd:element element also has a type attribute whose value is the data type of the element. In this case the type is xsd:string, a standard type for elements that can contain any amount of text in any form but not child elements (7).
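To illustrate the limit of xsd:string, the following variation on greeting.xml would be expected to fail validation against greeting.xsd, because GREETING now contains a child element. The EM element is a hypothetical name introduced only for this illustration.

```xml
<?xml version="1.0"?>
<!-- Expected to be invalid: xsd:string content may hold text only,
     so the EM child element is not allowed here -->
<GREETING>
Hello <EM>XML</EM>!
</GREETING>
```

The document is still well-formed XML; it is only the schema's type constraint that it violates.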
A document is checked for correctness by validating it against a schema. The schema specification deliberately allows a variety of different means for associating documents with schemas. For instance, one possibility is that both the name of the document to validate and the name of the schema to validate it against are passed to the validator program on the command line, like this:
C:\>validator greeting.xml greeting.xsd
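To attach a schema to a document, add an xsi:noNamespaceSchemaLocation attribute to the document's root element. The xsi prefix must be bound to the http://www.w3.org/2001/XMLSchema-instance namespace, and the attribute's value gives the location of the schema.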
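A minimal sketch of greeting.xml with the schema attached this way might look as follows. It assumes that greeting.xsd sits in the same directory as the document, so a relative URL is enough; whether a given validator actually honors the hint depends on the tool.

```xml
<?xml version="1.0"?>
<!-- xsi is bound to the XML Schema instance namespace;
     the attribute value is a (here relative) URL for the schema -->
<GREETING xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="greeting.xsd">
Hello XML!
</GREETING>
```

This attribute is the right choice here because the greeting schema declares its elements in no namespace.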
W3C Schema language
The W3C XML Schema language was created by the W3C XML Schema Working Group. W3C XML Schemas express shared vocabularies and allow machines to carry out rules made by people. They provide a means for defining the structure, content and semantics of XML documents. The language is a very large specification designed to handle a broad range of use cases. It is an open standard, free to be implemented by any interested party (8).
The W3C XML Schema language divides elements into complex and simple types. A simple type element is one like GREETING that can only contain text and does not have any attributes. It cannot contain any child elements. It may, however, be more limited in the kind of text it can contain. For instance, a schema can say that a simple element contains an integer, a date, or a decimal value between 3.76 and 98.24. Complex elements can have attributes and can have child elements. Most documents need a mix of both complex and simple elements.
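The distinction can be sketched in a small schema. All names here (BoundedDecimal, READING, LABEL, VALUE, taken) are hypothetical and chosen only for illustration, and the range is assumed to include both end points. The simple type restricts xsd:decimal to the range mentioned above, while READING is a complex element with two child elements and one attribute.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Simple type: text only, limited to decimals from 3.76 to 98.24
       (inclusive bounds are an assumption) -->
  <xsd:simpleType name="BoundedDecimal">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="3.76"/>
      <xsd:maxInclusive value="98.24"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Complex element: has child elements and an attribute -->
  <xsd:element name="READING">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="LABEL" type="xsd:string"/>
        <xsd:element name="VALUE" type="BoundedDecimal"/>
      </xsd:sequence>
      <xsd:attribute name="taken" type="xsd:date"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Under this sketch, a READING whose VALUE is 42.5 would be expected to validate, while a VALUE of 120 or of non-numeric text would not.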
Answer the following questions:
1. What is the definition of a schema?
Answer: A schema is the conceptual organization of the entire database as viewed by its designers.
2. What is a subschema?
Answer: A subschema is the conceptual organization of the database as “seen” by the application program accessing it.
3. Which of the following statements is true?
a) DTDs have extensibility and scalability
b) Schemas have extensibility and scalability
c) Both of them have extensibility and scalability
d) Neither of them has extensibility and scalability
Answer: b)
4. What are the characteristics of schemas?
Answer:
i) Powerful data typing, including range checking
ii) Namespace-aware validation based on namespace URIs rather than on prefixes
iii) Extensibility and scalability
5. What is the W3C XML Schema language? Explain briefly.
Answer: The W3C XML Schema language was created by the W3C XML Schema Working Group. W3C XML Schemas express shared vocabularies and allow machines to carry out rules made by people. They provide a means for defining the structure, content and semantics of XML documents. The language is a very large specification designed to handle a broad range of use cases. It is an open standard, free to be implemented by any interested party.
Hungry Minds. http://www.ibiblio.org/xml/books/bible2/chapters/ch24.html
http://www.xml.gr.jp/relax/
http://www.ascc.net/xml/resource/schematron/schematron.html
http://www.thaiopensource.com/trex/
http://purl.oclc.org/NET/ddml