Recall from the previous post that an electronic publication conforming to the OPF standard must provide a package document. This must be an XML document with a root element of <package> which includes elements called <metadata>, <manifest>, and <spine>. Figure 1. shows an overview of the package document that is delivered with our sample epub ebook.

Figure 1. Package Overview
You can see this document has the correct set of package elements. Before looking at each in detail, there are two things I'd like to point out on this screenshot. First, the unique-identifier attribute in the <package>:
unique-identifier="EPB-UUID"
This attribute tells the software of the reading device to look out for a metadata item of type <dc:identifier> that has an 'id' attribute with the value 'EPB-UUID'. The value of this element is an identifier that is globally unique. Towards the bottom of the metadata, when we look at it, you will find the following element:
<dc:identifier id="EPB-UUID">
urn:uuid:CBC56AFC-6C29-1014-8672-92A1DF1F0AF1
</dc:identifier>
That 32-digit hexadecimal value is the GUID (Globally Unique Identifier) generated by epubBooks to identify this particular publication. Of course, an ISBN is another globally unique identifier, and if you are in the publishing business and routinely buy ISBNs by the dozen, you would probably insert the ISBN here.
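To make the lookup concrete, here is a minimal sketch of how a reading system might resolve the unique identifier, using Python's standard xml.etree.ElementTree module. The trimmed-down package string and the function name find_unique_identifier are my own illustrations, not part of OPF or of epubBooks' tooling, and I'm assuming the package element sits in the OPF namespace.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
# Dublin Core element namespace (note the trailing slash).
DC_NS = "http://purl.org/dc/elements/1.1/"

# Trimmed-down package for illustration only; a real package also
# needs <manifest> and <spine> elements.
SAMPLE_PACKAGE = f"""<package xmlns="{OPF_NS}" unique-identifier="EPB-UUID">
  <metadata xmlns:dc="{DC_NS}">
    <dc:identifier id="EPB-UUID">
      urn:uuid:CBC56AFC-6C29-1014-8672-92A1DF1F0AF1
    </dc:identifier>
  </metadata>
</package>"""


def find_unique_identifier(package_xml):
    """Return the text of the dc:identifier named by unique-identifier."""
    root = ET.fromstring(package_xml)
    wanted_id = root.get("unique-identifier")
    for identifier in root.iter(f"{{{DC_NS}}}identifier"):
        if identifier.get("id") == wanted_id:
            # Strip the whitespace that surrounds the value in the file.
            return (identifier.text or "").strip()
    return None  # no matching dc:identifier was found


print(find_unique_identifier(SAMPLE_PACKAGE))
```

Matching on the 'id' attribute rather than taking the first <dc:identifier> matters because a package may carry several identifiers.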
Second, the two namespace declarations:
xmlns:opf="http://www.idpf.org/2007/opf"
xmlns:dc="http://purl.org/dc/elements/1.1/"
indicate that the prefix 'dc' refers to the Dublin Core element definitions (see below for more detail), and that the 'opf' prefix refers to OPF extensions to the Dublin Core. For instance, look at the two date elements in the metadata:
<dc:date opf:event="original-publication">1922</dc:date>
<dc:date opf:event="epub-publication">2009-09-24</dc:date>
The 'dc' prefix on the 'date' elements identifies them as publication dates that follow the Dublin Core specification. The 'opf' prefix on the 'event' attribute identifies 'event' as belonging to the OPF specification.
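If you want to see how the two prefixes play out in practice, the following sketch reads both dates with Python's standard xml.etree.ElementTree. The function name dates_by_event and the stand-alone metadata fragment are hypothetical, made up for this illustration; the point is only that the element is looked up in the Dublin Core namespace and the attribute in the OPF namespace.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Stand-alone fragment for illustration; in a real package the
# <metadata> element lives inside <package>.
SAMPLE_METADATA = f"""<metadata xmlns:dc="{DC_NS}" xmlns:opf="{OPF_NS}">
  <dc:date opf:event="original-publication">1922</dc:date>
  <dc:date opf:event="epub-publication">2009-09-24</dc:date>
</metadata>"""


def dates_by_event(metadata_xml):
    """Map each opf:event value to the text of its dc:date element."""
    root = ET.fromstring(metadata_xml)
    dates = {}
    for date in root.findall(f"{{{DC_NS}}}date"):
        # Namespaced attributes are addressed as "{namespace-uri}name".
        event = date.get(f"{{{OPF_NS}}}event")
        dates[event] = date.text
    return dates


print(dates_by_event(SAMPLE_METADATA))
```

The prefixes themselves ('dc', 'opf') never appear in the lookups; only the namespace URIs they stand for do.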
Unfortunately, looking at the OPF specification, it seems the publisher is free to give the event attribute whatever value they like:
"The set of values for event are not defined by this specification; possible values may include: creation, publication, and modification."
Now, let's look in more detail at the package metadata.
Package <metadata>
The <metadata> element of the package can contain wide ranging information about the publication. To keep OPF as open as possible, the metadata of an OPF package makes use of another open standard, namely the Dublin Core Metadata standard.
The Dublin Core is an initiative working towards standard ways of describing resources. It actively promotes standardised ways of sharing information, thereby increasing interoperability between organisations - let's all agree to call a spade a spade and not a shovel or a digger.
The Dublin Core has a wider scope than just ebooks. However, there is a rich set of attributes that can be applied to electronic publications. Figure 2. shows the metadata that epubBooks placed in the package of The Curious Case of Benjamin Button.

Figure 2. Package Metadata
For convenience, I'll list the metadata elements again here. The first three are mandatory; the rest are optional:
- title
- language
- identifier
- creator
- date(s)
- publisher
- subject
- source
- rights
Title
In fact the schema says there must be 'One Or More' of the mandatory elements. In other words, there can be more than one title, more than one language, and more than one identifier. The standard does not specify which title should be displayed, only that a reading device should choose 'the most appropriate title' for display, perhaps based on available fonts or language.
Identifier
There can also be more than one identifier element in the metadata. We've seen above how the unique identifier is handled. If you wanted, you could publish an ebook with several identifiers: your internally generated identifier, a GUID, and your ISBN. You then have to say which is to be considered the globally unique identifier.
Language
The specification says there must be at least one <language> metadata element, but there may be more than one. I suppose if you were publishing an English-Mandarin dictionary or were writing a learned text about the Rosetta Stone you might have a reason to specify more than one language.
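The 'one or more' rule for the three mandatory elements is easy to check mechanically. Below is a minimal sketch, again with Python's xml.etree.ElementTree; the function name missing_required, the two sample fragments, and the language code 'en' are my own assumptions for illustration, not values taken from the sample ebook.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
REQUIRED_ELEMENTS = ("title", "language", "identifier")

# Hypothetical fragments; the language code "en" is a placeholder.
COMPLETE = f"""<metadata xmlns:dc="{DC_NS}">
  <dc:title>The Curious Case of Benjamin Button</dc:title>
  <dc:language>en</dc:language>
  <dc:identifier id="EPB-UUID">urn:uuid:CBC56AFC-6C29-1014-8672-92A1DF1F0AF1</dc:identifier>
</metadata>"""

INCOMPLETE = f"""<metadata xmlns:dc="{DC_NS}">
  <dc:title>The Curious Case of Benjamin Button</dc:title>
  <dc:identifier id="EPB-UUID">urn:uuid:CBC56AFC-6C29-1014-8672-92A1DF1F0AF1</dc:identifier>
</metadata>"""


def missing_required(metadata_xml):
    """Return the names of mandatory elements that do not occur at all."""
    root = ET.fromstring(metadata_xml)
    return [
        name
        for name in REQUIRED_ELEMENTS
        # find() returns None when there is no such child element.
        if root.find(f"{{{DC_NS}}}{name}") is None
    ]


print(missing_required(COMPLETE))
print(missing_required(INCOMPLETE))
```

The check only asks whether each element occurs at least once, so a package with two titles or three identifiers passes just as well.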
Full list of metadata elements
The following table summarises the full set of metadata elements that can appear in the <metadata> section of a <package>:
| Element | Occurrences | Description |
| title | One or more | The title of the publication. As we've seen, there can be more than one, but there must be at least one title. |
| creator | Zero or more | The primary creators or authors of the publication. Each element is recommended to hold one name and is recommended to be in the form it should be presented to the reader. When there's more than one creator, it's expected they would be displayed in the order in which the elements appear in the metadata. Other contributors should be identified in <contributor> elements. |
| subject | Zero or more | The subject matter of the publication. There is no standardisation here. The optional text could be a sentence, a list of keywords, or one keyword per element. |
| description | Zero or more | The description(s) of the publication. |
| publisher | Zero or more | The publisher(s) of the publication. |
| contributor | Zero or more | The person(s) making contributions to the publication in a manner that is secondary to the role of creator. OPF defines nearly 30 different roles as contributor and specifies the syntax for their identification. |
| date | Zero or more | The publication date(s) for the publication. We've already seen that OPF extends the Dublin Core definition of this element, allowing different 'event' dates to be recognised. |
| type | Zero or more | The type(s) that describe the publication. This is relatively free-form although the specification recommends using words from controlled vocabularies i.e. selecting from a restricted set of words. Terms relating to genre e.g. Young Adult, Fantasy, Literary, might be used as well as terms like Fiction, Non-Fiction etc. |
| format | Zero or more | The media-types of the publication. The recommendation is to use a MIME type. |
| identifier | One or more | One or more identifiers for the publication, one of which must be defined as a unique identifier. See the discussion above. |
| source | Zero or more | Identification of any other documents or publications from which the current publication is derived. |
| language | One or more | One or more language identifiers. |
| relation | Zero or more | Identifier(s) of resources to which the current publication is related. |
| coverage | Zero or more | The scope of the publication. OPF recommends following the Dublin Core specification of coverage and using a controlled vocabulary for geographical, temporal, and juridical descriptions. |
| rights | Zero or more | An assertion of the rights of the publisher/creator with respect to this publication. |
Package <manifest>
Figure 3. shows the <manifest> element of the OPF package for our sample epub ebook, The Curious Case of Benjamin Button.

Figure 3. Package Manifest
The package manifest identifies all of the resources that are needed to display the ebook fully and correctly. Each entry in the manifest consists of an <item> element, as in:
<item id="chapter-001" href="chapter-001.xml" media-type="application/xhtml+xml"/>
Each <item> element has an 'id' attribute which identifies this resource uniquely within the publication. It has an 'href' attribute which points to the content document; in the example above it's an XML document called 'chapter-001.xml'. The 'media-type' attribute in this example shows that the resource should be handled as an XHTML document.
You can see that the manifest lists 13 content documents: a title page, a page of information about the publisher (epubBooks), and the 11 chapters of the story.
Each content document includes a 'link' element that refers to the CSS stylesheet 'book.css'. Therefore, the manifest includes an <item> for it:
<item id="main-css" href="css/book.css" media-type="text/css"/>
The publisher information document, epubbooksinfo.xml, includes an image which is the company logo. Therefore, the manifest includes an <item> for it:
<item id="epubbooks-logo" href="images/epubbooks-logo.png" media-type="image/png"/>
The rule is: if the content documents use it, it must be in the manifest.
There are some aspects of the manifest that will be reserved for a future post so they don't clutter up this presentation. These cover media-types that are not part of the OPS Core, Out-Of-Line XML Islands, and the use of fallback documents to support these non-standard documents.
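The 'if the content documents use it, it must be in the manifest' rule lends itself to a simple check. The sketch below is only an illustration: the function name undeclared_resources and the file 'images/cover.png' are hypothetical, and I'm assuming the hrefs used by the content documents have already been collected and are written exactly as they appear in the manifest (no path normalisation is attempted).

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# A cut-down manifest holding only the items quoted above.
SAMPLE_MANIFEST = f"""<manifest xmlns="{OPF_NS}">
  <item id="chapter-001" href="chapter-001.xml" media-type="application/xhtml+xml"/>
  <item id="main-css" href="css/book.css" media-type="text/css"/>
  <item id="epubbooks-logo" href="images/epubbooks-logo.png" media-type="image/png"/>
</manifest>"""


def undeclared_resources(manifest_xml, referenced_hrefs):
    """Return the referenced hrefs that have no <item> in the manifest."""
    root = ET.fromstring(manifest_xml)
    declared = {item.get("href") for item in root.iter(f"{{{OPF_NS}}}item")}
    return sorted(set(referenced_hrefs) - declared)


# Resources the content documents are assumed to use.
used = ["css/book.css", "images/epubbooks-logo.png", "images/cover.png"]
print(undeclared_resources(SAMPLE_MANIFEST, used))
```

Anything the function returns is a resource that a content document relies on but that the package never declared.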
There's one more item in the manifest, and it's quite an important one:
<item id="ncx" href="epb.ncx" media-type="application/x-dtbncx+xml"/>
The 'id' attribute is set to 'ncx', the 'href' points to a file called 'epb.ncx', and the 'media-type' indicates that the resource should be handled as an NCX document. NCX is a standard way of declaring a Table of Contents. It's another open standard, this time maintained by the DAISY Consortium.
This leads us nicely into the description of the third mandatory element of an OPF package - the <spine> element.
Package <spine>
Figure 4. shows the expanded <spine> element of our sample package.

Figure 4. Package Spine
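The <spine> starts off like this:
<spine toc="ncx">
<itemref idref="titlepage" linear="yes"/>
<itemref idref="epubbooksinfo" linear="yes"/>
<itemref idref="chapter-001" linear="yes"/>
<itemref idref="chapter-002" linear="yes"/>
The first thing to notice is the 'toc' attribute on the <spine>: its value, 'ncx', is the id of the manifest item that points to the NCX document.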
The next thing to notice about the <spine> is that it contains a list of <itemref> elements. Each <itemref> has an attribute called 'idref', and the value of an idref is the id of an item in the manifest.
For example, the first idref has value 'titlepage'. Look back at the manifest screenshot and you'll see that the first content document in the manifest has id="titlepage", and that item points to the content document itself (titlepage.xml).
The spine is a list of content documents and the important thing about the list is that it specifies the linear order in which the content documents should be displayed: title page, followed by the publisher's information page, followed by chapter 1, etc.
The <itemref> element has an optional attribute called 'linear'. This attribute takes a yes/no value and is used to indicate whether the referenced document is primary or auxiliary. This can be used by reading devices to show auxiliary information in a different way from the main flow of the primary information. In our case, the values are all set to 'yes' which is the default.
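Putting the manifest and the spine together, here is a rough sketch of how a reading system might work out the reading order and find the table of contents. It uses Python's xml.etree.ElementTree; the function names reading_order and toc_href are hypothetical, the package is cut down to a few items, and the media-types of the title page and publisher page are my assumption.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
NS = {"opf": OPF_NS}

# Cut-down package for illustration; metadata is omitted entirely.
SAMPLE_PACKAGE = f"""<package xmlns="{OPF_NS}">
  <manifest>
    <item id="titlepage" href="titlepage.xml" media-type="application/xhtml+xml"/>
    <item id="epubbooksinfo" href="epubbooksinfo.xml" media-type="application/xhtml+xml"/>
    <item id="chapter-001" href="chapter-001.xml" media-type="application/xhtml+xml"/>
    <item id="ncx" href="epb.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="titlepage" linear="yes"/>
    <itemref idref="epubbooksinfo" linear="yes"/>
    <itemref idref="chapter-001"/>
  </spine>
</package>"""


def manifest_hrefs(root):
    """Map each manifest item id to its href."""
    return {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }


def reading_order(package_xml):
    """Return (href, is_linear) pairs in the order given by the spine."""
    root = ET.fromstring(package_xml)
    hrefs = manifest_hrefs(root)
    order = []
    for itemref in root.findall("opf:spine/opf:itemref", NS):
        # 'linear' is optional; a missing attribute counts as "yes".
        is_linear = itemref.get("linear", "yes") == "yes"
        order.append((hrefs[itemref.get("idref")], is_linear))
    return order


def toc_href(package_xml):
    """Return the href of the manifest item named by the spine's toc."""
    root = ET.fromstring(package_xml)
    toc_id = root.find("opf:spine", NS).get("toc")
    return manifest_hrefs(root).get(toc_id)


print(reading_order(SAMPLE_PACKAGE))
print(toc_href(SAMPLE_PACKAGE))
```

The last <itemref> in the sample deliberately omits 'linear' to show the default being applied.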