<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6014512293401911267</id><updated>2011-12-03T10:06:35.022Z</updated><category term='Benjamin Button'/><category term='Open Publication Structure'/><category term='META-INF'/><category term='navMap'/><category term='GUID'/><category term='OCF'/><category term='NISO'/><category term='Sandcastle'/><category term='mimetype'/><category term='ebook'/><category term='switch'/><category term='Inline'/><category term='GUI'/><category term='epub'/><category term='Open Container Format'/><category term='XML documentation'/><category term='OPF'/><category term='Open Packaging Format'/><category term='OPS'/><category term='W3Schools'/><category term='Smashwords'/><category term='class'/><category term='Out-Of-Line'/><category term='XML Island'/><category term='ISBN'/><category term='Inside Epub'/><category term='XHTML'/><category term='spine'/><category term='fallback'/><category term='Preferred Vocabulary'/><category term='wysiwyg'/><category term='wysiwyg epub editor'/><category term='Zip'/><category term='IDPF'/><category term='MusicXML'/><category term='XSL'/><category term='CSS'/><category term='PDF'/><category term='navPoint'/><category term='MCE'/><category term='manifest'/><category term='Threepress'/><category term='NCX'/><category term='XML'/><category term='WinZip'/><category term='Dublin Core'/><category term='website'/><category term='Adobe Digital Editions'/><category term='case'/><category term='online'/><category term='C#'/><category term='Package'/><category term='editor'/><category term='Zen Garden'/><category term='epubBooks'/><category term='epubcheck'/><category 
term='Container'/><category term='ASP.Net'/><category term='DAISY'/><category term='metadata'/><category term='DotNetZip'/><category term='tiny MCE'/><title type='text'>Inside epub</title><subtitle type='html'>Explore the epub standard from the International Digital Publishing Forum (IDPF). Look inside ebooks written in the epub standard. Develop code to unpack and display epub books. Learn how to save books in epub format.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://netkingcol.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>15</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-7777722579832235427</id><published>2010-02-18T17:40:00.005Z</published><updated>2010-02-18T17:54:07.665Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Inside Epub'/><category scheme='http://www.blogger.com/atom/ns#' term='XHTML'/><category scheme='http://www.blogger.com/atom/ns#' term='editor'/><category scheme='http://www.blogger.com/atom/ns#' term='Benjamin Button'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='CSS'/><category 
scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='epubBooks'/><category scheme='http://www.blogger.com/atom/ns#' term='wysiwyg epub editor'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><title type='text'></title><content type='html'>The purpose of this post is to review progress on the project to develop an online wysiwyg epub editor. For a number of reasons it's a good time to pause and reflect on what's been achieved and to identify what remains to be done.&lt;br /&gt;&lt;br /&gt;I've built up a basic library of epub classes to handle the functionality of epub editing - classes to handle the container, the package with its metadata, manifest, and spine, and the NCX.&lt;br /&gt;&lt;br /&gt;At the front end, the user can perform the following:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Create a new epublication (a term I prefer to 'epub publication').&lt;/li&gt;&lt;li&gt;Insert a basic set of metadata items for the new publication.&lt;/li&gt;&lt;li&gt;Limited editing of metadata for an existing publication.&lt;/li&gt;&lt;li&gt;Create and manipulate content documents.&lt;/li&gt;&lt;li&gt;Add images to the manifest and insert them (sort of) in a content document.&lt;/li&gt;&lt;li&gt;Edit the CSS files of the publication.&lt;/li&gt;&lt;/ul&gt;The screenshots shown below illustrate this functionality.&lt;br /&gt;&lt;br /&gt;However, there are aspects of an online system that can no longer be ignored. For instance, the application needs to be multi-user. To date I've worked with a single server folder for the epub library whereas each user needs a folder to hold their own library and to act as a work area for writing. The need to handle many users immediately suggests some kind of database would be useful; so far I've held data in constants, in the web.config and, embarrassingly, as hard-code. 
It's time to organise this better.&lt;br /&gt;&lt;br /&gt;This review will step through the existing functionality and will identify where and how the application should be enhanced.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Library (Books View)&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The screenshot in figure 1. shows the opening page of the application. It presents a list of books and a button for creating a new epub.&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S30KwzNsAcI/AAAAAAAAARE/2ZWQWXb0z7Y/s800/InsideEpub0046.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S30KwzNsAcI/AAAAAAAAARE/2ZWQWXb0z7Y/s288/InsideEpub0046.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. Books View&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The user clicks on the book they want to open or clicks on the 'New epub' button to start a new publication. Currently, the list of books is fetched by extracting a folder path from web.config. All files with the extension .epub are shown in the list.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06; font-size: x-small;"&gt;Design Improvements&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The following enhancements are deemed to be either essential or desirable:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The application should be multi-user. User registration and login should be handled by the Membership Service using the range of built-in Login controls. This should use a SQL-Express database in the App_Data folder.&lt;/li&gt;&lt;li&gt;Additional user information should be held; in particular a folder on the web server should be assigned for each user's work. 
This root folder would be the place to hold the user's library and it is from this folder that the Books list would be populated.&lt;/li&gt;&lt;li&gt;It would be friendlier to display the title of the book rather than the filename for each book, for instance 'The Curious Case of Benjamin Button' rather than 'fitzgerald-curious-case-of-benjamin-button.epub'.&lt;/li&gt;&lt;/ol&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Metadata (Book Information View)&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;In figure 2. the Book Information view is shown. This displays the &amp;lt;metadata&amp;gt; from the publication's package document.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S30LbiZ7B6I/AAAAAAAAARI/sBDjBysxglI/s800/InsideEpub0047.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S30LbiZ7B6I/AAAAAAAAARI/sBDjBysxglI/s288/InsideEpub0047.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2. Book Information View&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The user can modify any items on the screen and click the Save button to store the changes in the epub.&lt;br /&gt;&lt;div&gt;A review of the Open Packaging Format Schema shows, however, that the handling of metadata needs to be more sophisticated. Figure 3. 
is an extract from the schema showing the definition of metadata-content.&lt;/div&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;&amp;lt;define name="OPF20.metadata-content"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;lt;choice&amp;gt; &lt;br /&gt;&amp;nbsp; &amp;lt;interleave&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="OPF20.dc-metadata-element"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;optional&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="OPF20.x-metadata-element"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/optional&amp;gt;&lt;br /&gt;&amp;nbsp; &amp;lt;/interleave&amp;gt;&lt;br /&gt;&amp;nbsp; &amp;lt;interleave&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;strong&gt;oneOrMore&lt;/strong&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="&lt;strong&gt;DC.title-element&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/oneOrMore&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;strong&gt;oneOrMore&lt;/strong&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="&lt;strong&gt;DC.language-element&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/oneOrMore&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;strong&gt;oneOrMore&lt;/strong&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="&lt;strong&gt;DC.identifier-element&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/oneOrMore&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;strong&gt;zeroOrMore&lt;/strong&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="&lt;strong&gt;DC.optional-metadata-element&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/zeroOrMore&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; 
&amp;lt;&lt;strong&gt;zeroOrMore&lt;/strong&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="&lt;strong&gt;OPF20.meta-element&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/zeroOrMore&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;strong&gt;zeroOrMore&lt;/strong&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;ref name="&lt;strong&gt;OPF20.any-other-element&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/zeroOrMore&amp;gt;&lt;br /&gt;&amp;nbsp; &amp;lt;/interleave&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;lt;/choice&amp;gt;&lt;br /&gt;&amp;lt;/define&amp;gt;&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;em&gt;Figure 3. OPF20.metadata-content&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The &lt;strong&gt;&amp;lt;oneOrMore&amp;gt;&lt;/strong&gt; wrapped around the title, language, and identifier elements indicates that there must be at least one of each of these elements in the metadata, but there may be more. The &lt;strong&gt;&amp;lt;zeroOrMore&amp;gt;&lt;/strong&gt; identifies elements that are optional. However, the &lt;strong&gt;OrMore&lt;/strong&gt; part of the definition means there can be many of these elements too. We've already seen in earlier posts that there can be a range of dates, creators, contributors, and descriptions. 
In fact the schema says there can be any number of these items.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06; font-size: x-small;"&gt;Design Improvements&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The following enhancements are essential for handling the metadata of an epublication.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Handle multiple instances of any metadata element.&lt;/li&gt;&lt;li&gt;It should be possible to add, modify, and delete metadata elements.&lt;/li&gt;&lt;li&gt;All metadata elements can be deleted except for one each of title, language, and id.&lt;/li&gt;&lt;/ol&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Book Content&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 4. shows the latest incarnation of the Book Content view.&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S30LuJmW2OI/AAAAAAAAARM/EBur2MvEoa0/s800/InsideEpub0048.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S30LuJmW2OI/AAAAAAAAARM/EBur2MvEoa0/s288/InsideEpub0048.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 4. Book Content View&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The functionality of this screen has been changed since it was last presented. The most significant differences are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Drag and drop functionality to move content documents within the publication. This activity is enabled/disabled by the 'Organise' checkbox. 
The Move Up/Move Down options were removed from the Action dropdown as they are no longer needed.&lt;/li&gt;&lt;li&gt;The tinyMCE editor, which is where the content documents are displayed, has been configured with a default font size that's easier to read and a drag handle has been provided to allow the user to change the height of the text area.&lt;/li&gt;&lt;li&gt;The new document details - Contents Entry and Document Heading - were moved above the editing area to allow the resize facility just mentioned.&lt;/li&gt;&lt;li&gt;The ability to put a heading at the top of a new content document was made optional.&lt;/li&gt;&lt;li&gt;Not shown in the screenshot, a 'boiler-plate' copyright document is inserted after the title page when a new epublication is created. It uses text like the following, where names and dates are taken from the metadata and inserted into the text at fixed locations. An author or publisher could change the text to that of their choice.&lt;/li&gt;&lt;/ul&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid; margin-left: 50px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;strong&gt;Copyright © Colin Hazlehurst, 2010&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Colin Hazlehurst asserts the moral right to be identified as the author of this work.&lt;br /&gt;&lt;br /&gt;No part of this publication may be reproduced, stored or introduced into a retrieval system, or transmitted, in any form or by any means, without the prior written permission of both the copyright owner and the publisher of this work.&lt;/blockquote&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06; font-size: x-small;"&gt;Design Improvements&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;There are still useful enhancements that could be made to this view:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The 
ability to promote and demote content documents and reflect the changes in the &amp;lt;navMap&amp;gt; of the NCX document.&lt;/li&gt;&lt;li&gt;The code should start with the Save button disabled. It should detect when either the Table of Contents or the text of the currently displayed content document is changed. It should then enable the Save button. The Save button should be disabled again after any changes have been saved.&lt;/li&gt;&lt;li&gt;There is a particular challenge with respect to handling images, which is the subject of a separate note below. The problem is that image paths inside an epub are relative: the href in the manifest is relative to the package document, and the &lt;em&gt;src&lt;/em&gt; in a content document is relative to that document. When displaying the image in a browser, the URL in the image's &lt;em&gt;src&lt;/em&gt; attribute&amp;nbsp;needs to specify a path on the server relative to the root of the application.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Media View&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 5. shows the view when the user clicks on the Media tab. The application reads the manifest and finds all files that have a media-type beginning with the text 'image/'. For each file it finds, an Image control is added to the view and the source is set to the href for the manifest item.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S30MYcdYcNI/AAAAAAAAARQ/2-H_Pwt2qks/s800/InsideEpub0049.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S30MYcdYcNI/AAAAAAAAARQ/2-H_Pwt2qks/s288/InsideEpub0049.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 5. 
Media View&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;A FileUpload control works in conjunction with an Upload button to allow the user to upload a new image for inclusion in the publication.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06; font-size: x-small;"&gt;Design Improvements&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The following enhancements would greatly improve the handling of media by the application.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Media types other than images should be handled.&lt;/li&gt;&lt;li&gt;Thumbnails should be generated in a way that keeps the correct aspect ratio for each image.&lt;/li&gt;&lt;li&gt;It should be possible to select images for deletion; currently it is not.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Styles View&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Remember that the value of XHTML is that it gives structure to the content of a document while separating that content from its presentation. The most widely used tool for presenting content is CSS, and an epublication can incorporate any number of CSS stylesheets to help present its content.&lt;br /&gt;&lt;br /&gt;Figure 6. shows the Styles tab in the epub editor project.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S30MqfA8tKI/AAAAAAAAARU/RXC85NXxvbY/s800/InsideEpub0050.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S30MqfA8tKI/AAAAAAAAARU/RXC85NXxvbY/s288/InsideEpub0050.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 6. Styles View&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;When the user clicks on the Styles tab, the application reads the manifest and finds all files that have the media-type 'text/css'. It constructs a list of these files and allows the user to select one by clicking on it. 
In the example shown, main.css was selected and the stylesheet is displayed.&lt;br /&gt;&lt;br /&gt;The 'Action' dropdown list gives the user the ability to add and remove CSS files. The Save button is used to save any changes the user makes to the currently displayed stylesheet.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Image Handling Issue&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;It was mentioned above that there is an issue with the handling of images that is particular to the online environment. Inside an epub, image paths are relative: the href in the manifest is relative to the package document, and the &lt;em&gt;src&lt;/em&gt; in a content document is relative to that document. On the web, the &lt;em&gt;src&lt;/em&gt; attribute of the image control is a URL relative to the root of the web application.&lt;br /&gt;&lt;br /&gt;In the Benjamin Button example, the ePubBooks logo is referenced on the &lt;em&gt;epubbooksinfo&lt;/em&gt; page in the OPS folder. The href is set to 'images/epubbooks-logo.png', and the logo is held in folder OPS/images. A web page with a root folder called epub displaying the &lt;em&gt;epubbooksinfo&lt;/em&gt; page in a tinyMCE editor would expect to find the logo in folder .../epub/images. &lt;br /&gt;&lt;br /&gt;In a multi-user system it would not be possible to put the images for all users' publications in one folder. Each user must have their own work area, which means they must have a separate folder on the server. 
If F. Scott Fitzgerald were using this application (and who, given the weirdness of Benjamin Button, can say that he can't?), then he might be saving his content in a folder like:&lt;br /&gt;&lt;blockquote&gt;epub/fsfitzgerald/benjaminbutton/OPS&lt;/blockquote&gt;Therefore, to view the image in the browser, its &lt;em&gt;src&lt;/em&gt; would need to be something like:&lt;br /&gt;&lt;blockquote&gt;epub/fsfitzgerald/benjaminbutton/OPS/images/myimage.svg&lt;/blockquote&gt;The epub must reference the image simply as images/myimage.svg.&lt;br /&gt;&lt;br /&gt;This difference in addressing must be handled by the application. The obvious choice is to use a pair of&amp;nbsp;XSL transforms. The first builds a web-relative URL for each image in a content document and replaces the image's &lt;em&gt;src&lt;/em&gt; attribute with this value. This transform runs when the user selects a&amp;nbsp;document from the table of contents to display it in the editing area.&lt;br /&gt;&lt;br /&gt;The other transform runs when the user clicks the Save button after making changes to a content document. It strips out the web path and replaces it with a document-relative path. The content documents saved to the filesystem, and thus in the .epub, must always use the document-relative path - that's the only way the epub can be shipped.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;How do you get to Carnegie Hall?&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;So that's the job facing me on this project. It's interesting, but time-consuming. 
To sum&amp;nbsp;up, the way I feel is like this:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;Tourist&lt;/strong&gt;&lt;/span&gt;: How do you get to Carnegie Hall?&lt;br /&gt;&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;Yokel&lt;/strong&gt;&lt;/span&gt;: Well, you wouldn't want to start from here.&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-7777722579832235427?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/7777722579832235427'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/7777722579832235427'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/02/purpose-of-this-post-is-to-review.html' title=''/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh5.ggpht.com/_cvaF-9-3DHs/S30KwzNsAcI/AAAAAAAAARE/2ZWQWXb0z7Y/s72-c/InsideEpub0046.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-7626148589622412277</id><published>2010-02-16T15:47:00.006Z</published><updated>2010-02-16T16:05:52.138Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='wysiwyg'/><category scheme='http://www.blogger.com/atom/ns#' term='class'/><category scheme='http://www.blogger.com/atom/ns#' term='editor'/><category scheme='http://www.blogger.com/atom/ns#' term='OPS'/><category scheme='http://www.blogger.com/atom/ns#' 
term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='Inline'/><category scheme='http://www.blogger.com/atom/ns#' term='OPF'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><category scheme='http://www.blogger.com/atom/ns#' term='fallback'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Packaging Format'/><category scheme='http://www.blogger.com/atom/ns#' term='XHTML'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Publication Structure'/><category scheme='http://www.blogger.com/atom/ns#' term='XML Island'/><category scheme='http://www.blogger.com/atom/ns#' term='manifest'/><category scheme='http://www.blogger.com/atom/ns#' term='Package'/><category scheme='http://www.blogger.com/atom/ns#' term='spine'/><title type='text'>Manifest and spine management in C#</title><content type='html'>This post presents the requirements for C# classes that manage the manifest and the spine - both of which are elements in an epub package. The manifest identifies all of the files that are part of a publication while the spine specifies the linear reading order of its content documents. &lt;br /&gt;&lt;br /&gt;A reading system needs only to read and parse these elements; it doesn't&amp;nbsp;modify them in any way, with one exception. However,&amp;nbsp;an online wysiwyg epub editor needs the ability to insert, remove, and rearrange files in both the manifest and the spine.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Manifest Items&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The &amp;lt;manifest&amp;gt; element of an epub &amp;lt;package&amp;gt; contains &amp;lt;item&amp;gt; elements, one item&amp;nbsp;for each file that is referenced&amp;nbsp;from anywhere in the publication. 
A manifest item has the attributes shown in Table 1.&lt;br /&gt;&lt;br /&gt;&lt;table border="1" cellpadding="2" cellspacing="0"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="width: 140px;"&gt;&lt;strong&gt;Attribute Name&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Attribute Description&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;id&lt;/td&gt;&lt;td&gt;Mandatory, unique identifier of the file within the manifest.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;href&lt;/td&gt;&lt;td&gt;Mandatory, URI of the file for this item.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;media-type&lt;/td&gt;&lt;td&gt;Mandatory, MIME media-type for this item.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;fallback&lt;/td&gt;&lt;td&gt;id of the manifest item to which a reading system should fall back if it is unable to process the namespace of the current item. Mandatory when the current document is an Out-Of-Line XML Island.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;fallback-style&lt;/td&gt;&lt;td&gt;id of the manifest item holding a CSS stylesheet with which the contents of the current item may be rendered.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;required-namespace&lt;/td&gt;&lt;td&gt;When the current document is an Out-Of-Line XML Island, this attribute must be present and&amp;nbsp;it should be set to&amp;nbsp;the namespace of the document.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;required-modules&lt;/td&gt;&lt;td&gt;A comma-separated list of Extended Modules, which might belong to XHTML or to the namespace of an Out-Of-Line XML Island. This list of modules helps the reading system decide whether it has the capabilities to process the current item.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;em&gt;Table 1. Manifest item attributes&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;In the context of a C# class designed to read and write manifest items, these attributes are simply strings to be accessed through the methods and properties of the class. 
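&lt;br /&gt;&lt;br /&gt;As a rough illustration of Table 1, a small manifest might look like the fragment below. The ids and file names are invented for this example rather than taken from a real publication, and each item carries only the three mandatory attributes.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;&amp;lt;manifest&amp;gt;&lt;br /&gt;&amp;nbsp; &amp;lt;item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/&amp;gt;&lt;br /&gt;&amp;nbsp; &amp;lt;item id="css" href="css/main.css" media-type="text/css"/&amp;gt;&lt;br /&gt;&amp;nbsp; &amp;lt;item id="title" href="title.xml" media-type="application/xhtml+xml"/&amp;gt;&lt;br /&gt;&amp;nbsp; &amp;lt;item id="logo" href="images/logo.png" media-type="image/png"/&amp;gt;&lt;br /&gt;&amp;lt;/manifest&amp;gt;&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;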
&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Attribute Handling Methods&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Extracting the attributes of an XML node is a common activity in epub code. The most succinct code to access an attribute is:&lt;br /&gt;&lt;blockquote&gt;XmlNode targetAttribute = node.Attributes.GetNamedItem(attributeName);&lt;/blockquote&gt;However, many attributes are optional, and the variable targetAttribute will be set to null if the attribute is not present. Therefore, I prefer to&amp;nbsp;wrap this statement up with some defensive programming which checks for a missing attribute and also distinguishes the case where the attribute is present but is set to an empty string. I use an overloaded TryGetAttribute method which offers a few ways of handling these situations. One example of the method is shown below.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;public static bool TryGetAttribute(XmlNode node &lt;br /&gt;&amp;nbsp;&amp;nbsp; ,string attributeName&lt;br /&gt;&amp;nbsp;&amp;nbsp; ,out string attributeValue) {&lt;br /&gt;&lt;br /&gt;&amp;nbsp; // initialise the results&lt;br /&gt;&amp;nbsp; bool result = false;&lt;br /&gt;&amp;nbsp; attributeValue = string.Empty;&lt;br /&gt;&lt;br /&gt;&amp;nbsp; // try to get the named attribute&lt;br /&gt;&amp;nbsp; XmlNode targetAttribute = node.Attributes.GetNamedItem(attributeName);&lt;br /&gt;&lt;br /&gt;&amp;nbsp; // if the attribute was found&lt;br /&gt;&amp;nbsp; if (targetAttribute != null) {&lt;br /&gt;&amp;nbsp; &amp;nbsp; // extract the value and set the result to true&lt;br /&gt;&amp;nbsp; &amp;nbsp; attributeValue = targetAttribute.InnerText;&lt;br /&gt;&amp;nbsp; &amp;nbsp; result = true;&lt;br /&gt;&amp;nbsp; }&lt;br /&gt;&amp;nbsp; return result;&lt;br 
/&gt;}//TryGetAttribute&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;The converse&amp;nbsp;of reading a potentially missing attribute occurs when we want to set the value of an attribute that may or may not be present in the target XmlNode. Again, this happens often enough to make it worth creating a method to handle it. I call this SetOrAddAttribute and a listing is shown below.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody valign="top"&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;public static void SetOrAddAttribute(XmlNode node,&lt;br /&gt;&amp;nbsp;&amp;nbsp; string attributeName, string attributeValue){&lt;br /&gt;&amp;nbsp; // try to get the attribute; the indexer returns null if it is absent&lt;br /&gt;&amp;nbsp; XmlAttribute targetAttribute = node.Attributes[attributeName];&lt;br /&gt;&amp;nbsp; // if the attribute is not present in the given node&lt;br /&gt;&amp;nbsp; if (targetAttribute == null){&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; // create and add an empty attribute&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; targetAttribute = node.OwnerDocument.CreateAttribute(attributeName);&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; node.Attributes.Append(targetAttribute);&lt;br /&gt;&amp;nbsp; }&lt;br /&gt;&amp;nbsp; // set the attribute value&lt;br /&gt;&amp;nbsp; targetAttribute.InnerText = attributeValue;&lt;br /&gt;}&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;The manifestitem class&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;With attribute handling in place, it's straightforward to create a manifestitem class in C#. 
The constructor is given a reference to an XmlNode which it stores in a private variable:&lt;br /&gt;&lt;br /&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;private XmlNode _node;&lt;br /&gt;&lt;br /&gt;public manifestitem(XmlNode node){&lt;br /&gt;&amp;nbsp; _node = node;&lt;br /&gt;}&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;Each attribute of the manifest item is then provided with a property which can be used to get and set the attribute value. For example, the following snippet handles the href attribute:&lt;br /&gt;&lt;br /&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;public string href {&lt;br /&gt;&amp;nbsp; get { &lt;br /&gt;&amp;nbsp; &amp;nbsp; string _href;&lt;br /&gt;&amp;nbsp; &amp;nbsp; utilities.TryGetAttribute(_node,"href", out _href);&lt;br /&gt;&amp;nbsp; &amp;nbsp; return _href; &lt;br /&gt;&amp;nbsp; }&lt;br /&gt;&amp;nbsp; set {&lt;br /&gt;&amp;nbsp; &amp;nbsp; utilities.SetOrAddAttribute(_node, "href", value);&lt;br /&gt;&amp;nbsp; }&lt;br /&gt;}&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;The get method returns the attribute value, if it is present, or an empty string. 
The set method replaces the value of any existing href&amp;nbsp;attribute or adds an href attribute with the given value if the attribute is not present in the item node.&lt;br /&gt;&lt;br /&gt;This pattern is repeated for each attribute.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;The manifest class&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;A C# class to handle an epub's manifest is concerned with the manifest's &amp;lt;item&amp;gt; elements. It needs to find them, add them, and remove them. To that end, the methods in Table 2 make up the manifest class, which is part of the project to develop an online wysiwyg epub editor.&lt;br /&gt;&lt;br /&gt;&lt;table border="1" cellpadding="2" cellspacing="0" valign="top"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="width: 200px;"&gt;&lt;strong&gt;Method&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;manifest(XmlDocument package)&lt;/td&gt;&lt;td&gt;Constructor which receives the epub package as an XmlDocument.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;ManifestNode()&lt;/td&gt;&lt;td&gt;A method which returns the &amp;lt;manifest&amp;gt; as an XmlNode.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;ManifestItems()&lt;/td&gt;&lt;td&gt;Method returning the manifest item elements as an XmlNodeList.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Add(manifestitem item)&lt;/td&gt;&lt;td&gt;Method to add the given instance of a manifestitem to the manifest.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Add(string id, string href, string media_type)&lt;/td&gt;&lt;td&gt;Method to add an item to the manifest, assigning it the given mandatory values for id, href, and media-type.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Remove(string id, string packagePath)&lt;/td&gt;&lt;td&gt;Remove the item with the given id from the manifest. 
Also, delete the file from the file system using the physical path in the packagePath argument.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;GetManifestItemById(string id)&lt;/td&gt;&lt;td&gt;Return the item element with the given id as a manifestitem instance.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;CreateManifestItem()&lt;/td&gt;&lt;td&gt;Create a new manifestitem instance which can be adorned with attribute values and inserted in the manifest using the Add method.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;em&gt;Table 2. properties and methods of the manifest class&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Note that the node order in the manifest is not important, unlike in the spine. Therefore, the Add methods simply append new items at the end of the manifest.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Spine&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;In some ways the &amp;lt;spine&amp;gt; is easier to handle than the &amp;lt;manifest&amp;gt;; there are fewer attributes to work with;&amp;nbsp;but it does have a few complications. Firstly, the &amp;lt;spine&amp;gt; element includes the &lt;em&gt;toc&lt;/em&gt; attribute which holds the id of the manifest item that holds the NCX document for the publication. That attribute has to be accessible so the reading software can find and open the NCX.&lt;br /&gt;&lt;br /&gt;Secondly, the spine provides the reading system with the linear reading order of the content documents. Therefore, the order of the nodes in the spine is important.&lt;br /&gt;&lt;br /&gt;Spine nodes are called &amp;lt;itemref&amp;gt; because they refer to items in the manifest; the &lt;em&gt;idref&lt;/em&gt; attribute of each itemref element is the id of a manifest &amp;lt;item&amp;gt;. Each item &lt;em&gt;id&lt;/em&gt; must only appear once in the spine. &lt;br /&gt;&lt;br /&gt;The only other attribute that the Open Packaging Format schema allows is the &lt;em&gt;linear&lt;/em&gt; attribute. 
This distinguishes primary content documents (value="yes") from auxiliary content&amp;nbsp; documents (value="no"). "yes" is the default, so this attribute can be omitted.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Useful Enumerations&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Before presenting the spine class, it's worth introducing two enumerations that&amp;nbsp;support the code. The first of these describes the position where a new spine itemref should be inserted. The &lt;strong&gt;InsertPosition&lt;/strong&gt; enumeration is shown below.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;public enum&lt;/span&gt; &lt;span style="color: #3d85c6;"&gt;InsertPosition&lt;/span&gt; &lt;span style="color: black;"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;&amp;nbsp;&amp;nbsp; after&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;&amp;nbsp;&amp;nbsp; ,before&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;&amp;nbsp;&amp;nbsp; ,bottom&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;&amp;nbsp;&amp;nbsp; ,top&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;}&lt;/span&gt;&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;This provides options to insert a new itemref at the top or bottom of the reading order, or to insert it before or after a given other itemref node.&lt;br /&gt;&lt;br /&gt;The second enumeration allows the code to specify the value of the &lt;em&gt;linear&lt;/em&gt; attribute without passing a string. 
The &lt;strong&gt;Linear&lt;/strong&gt; enumeration is shown below.&lt;br /&gt;&lt;br /&gt;&lt;table style="border-bottom: thin solid; border-left: thin solid; border-right: thin solid; border-top: thin solid;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="background-color: #eeeeff;"&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;public enum&lt;/span&gt; &lt;span style="color: #3d85c6;"&gt;Linear &lt;/span&gt;&lt;span style="color: black;"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;&amp;nbsp;&amp;nbsp; yes&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;&amp;nbsp;&amp;nbsp; ,no&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;}&lt;/span&gt;&lt;/blockquote&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;The spine class&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;A C# class to provide basic handling for the &amp;lt;spine&amp;gt; element could have the methods and properties shown in Table 3.&lt;br /&gt;&lt;br /&gt;&lt;table border="1" cellpadding="2" cellspacing="0" valign="top"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="width: 150px;"&gt;&lt;strong&gt;Method&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;spine(XmlDocument package)&lt;/td&gt;&lt;td&gt;Constructor which receives the epub package as an XmlDocument.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;tocId&lt;/td&gt;&lt;td&gt;Return the id of the NCX manifest item from the&amp;nbsp;&lt;em&gt;toc&lt;/em&gt; attribute&amp;nbsp;of the&amp;nbsp;spine element.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;itemrefs&lt;/td&gt;&lt;td&gt;Method returning the itemref elements as an XmlNodeList.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Add(string idref, InsertPosition ip, string refNodeId, Linear linear)&lt;/td&gt;&lt;td&gt;Add&amp;nbsp;an itemref&amp;nbsp;instance&amp;nbsp;to the spine. 
The new itemref will have the given idref and linear values, and the position will be determined by the InsertPosition value relative to the itemref element which has the idref value in the refNodeId argument.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Remove(string id)&lt;/td&gt;&lt;td&gt;Remove the itemref with the given id from the spine.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;em&gt;Table 3. properties and methods of the spine class&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Earlier I mentioned that, with one exception, a reading system does not modify the manifest or the spine. The Open Packaging Format says that any part of the publication that can be referenced during processing of an epub &lt;strong&gt;must &lt;/strong&gt;be included in the spine. However, if the reading system encounters content that is not present in the spine:&lt;br /&gt;&lt;blockquote&gt;&lt;em&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;the Reading System &lt;strong&gt;should&lt;/strong&gt; add it to the spine (the placement at the discretion of the Reading System) and assign a value of 'no' to the linear attribute.&lt;/span&gt;&lt;/em&gt;&lt;/blockquote&gt;So, a reading system can add itemrefs to the spine. 
I interpret this to mean that the in-memory representation of the spine is modified and not the package file in the file system nor the compressed version of the package held in the .epub file.&amp;nbsp;Please contradict me if you know this to be false.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/7626148589622412277'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/7626148589622412277'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/02/manifest-and-spine-management-in-c.html' title='Manifest and spine management in C#'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-9101163618768303870</id><published>2010-02-15T14:38:00.004Z</published><updated>2010-02-16T09:42:10.085Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='case'/><category scheme='http://www.blogger.com/atom/ns#' term='NCX'/><category scheme='http://www.blogger.com/atom/ns#' term='OPS'/><category scheme='http://www.blogger.com/atom/ns#' term='CSS'/><category scheme='http://www.blogger.com/atom/ns#' term='MusicXML'/><category scheme='http://www.blogger.com/atom/ns#' term='switch'/><category scheme='http://www.blogger.com/atom/ns#' term='Preferred Vocabulary'/><category scheme='http://www.blogger.com/atom/ns#' term='Inline'/><category scheme='http://www.blogger.com/atom/ns#' 
term='fallback'/><category scheme='http://www.blogger.com/atom/ns#' term='XHTML'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Publication Structure'/><category scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='XML Island'/><category scheme='http://www.blogger.com/atom/ns#' term='Out-Of-Line'/><title type='text'>XML Islands in epub publications</title><content type='html'>&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Preferred Vocabulary&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;An earlier post -&amp;nbsp;&lt;a href="http://netkingcol.blogspot.com/2010/01/how-standards-work-together.html"&gt;How the standards work together&lt;/a&gt;&amp;nbsp;- showed that the Open Publication Structure (OPS)&amp;nbsp;specifies the&amp;nbsp;XHTML tags, grouped into modules, that should be used to create an epub&amp;nbsp;publication. These&amp;nbsp;modules&amp;nbsp;constitute what is called a &lt;em&gt;Preferred Vocabulary&lt;/em&gt;. In other words, the OPS&amp;nbsp;specifies&amp;nbsp;the tags that &lt;em&gt;should&lt;/em&gt; be used to define the structure of a work - a &amp;lt;div&amp;gt; here, a &amp;lt;table&amp;gt; there, and so on. &lt;br /&gt;&lt;br /&gt;All conforming reading systems must recognise and be able to render documents written using the preferred vocabulary. The term&amp;nbsp;'baseline' reading system indicates this minimal ability.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Beyond the&amp;nbsp;Preferred Vocabulary&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;You can achieve a great deal using only the preferred vocabulary. A very high proportion of existing printed matter, including its illustrations,&amp;nbsp;could be transferred to epub format using only the standard OPS modules - with layout and formatting support from&amp;nbsp;CSS. 
However, that adjective 'preferred' hints that epub&amp;nbsp;productions do not have to be written exclusively in that vocabulary.&amp;nbsp;OPS recognises that there are other tags in&amp;nbsp;XHTML 1.1 that an author might&amp;nbsp;&lt;em&gt;want&lt;/em&gt; to use and that there are&amp;nbsp;other XML vocabularies that a publisher might &lt;em&gt;need&lt;/em&gt; to include. &lt;br /&gt;&lt;br /&gt;XML is a widely used technology. If you enter the search term 'XML vocabulary for' into Google, followed by any topic in which you have an interest, there's a good chance someone has designed, or is actively designing,&amp;nbsp;an XML&amp;nbsp;vocabulary for that topic. I found the following; you will find many more:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;ceXML - civil engineering XML.&lt;/li&gt;&lt;li&gt;genXML -&amp;nbsp;for the exchange of genealogical data.&lt;/li&gt;&lt;li&gt;MathML - XML for sharing mathematical expressions.&lt;/li&gt;&lt;li&gt;MusicXML - XML for capturing musical notation.&lt;/li&gt;&lt;/ul&gt;The users of these diverse vocabularies are very likely to want to&amp;nbsp;include content written using them in their epub publications. It might be&amp;nbsp;a fragment of non-preferred XML embedded, as an example,&amp;nbsp;within an otherwise conforming XHTML content document, or it might be&amp;nbsp;an entire document conveying the essential content of the publication.&lt;br /&gt;&lt;br /&gt;The Open Publication Structure offers an approach that allows non-preferred content to be included in a publication while&amp;nbsp;ensuring&amp;nbsp;that the content is available, in some form, to all consumers. 
The trick is to allow reading systems that have been designed to handle the non-standard content to exploit it to the full while insisting that the publisher provide the information in a form that is accessible to baseline readers.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;XML Islands&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;If I were to embed snippets of foreign languages into this post by suggesting &lt;em&gt;per se&lt;/em&gt; that&amp;nbsp;it is &lt;em&gt;de rigueur &lt;/em&gt;to introduce &lt;em&gt;de novo&lt;/em&gt; some concept of the &lt;em&gt;deus ex machina&lt;/em&gt;, you would rightly accuse me of poor writing style.&amp;nbsp;However,&amp;nbsp;those French and Latin phrases are examples of content taken from another language, or non-preferred vocabulary, embedded in a stream of preferred vocabulary; in this case&amp;nbsp;it's embedded in English but it could be Greek embedded in Spanish or Mandarin embedded in Tagalog. &lt;br /&gt;&lt;br /&gt;It is poor writing style to sprinkle foreign phrases about&amp;nbsp;like this because the reader whose preferred vocabulary is English will not necessarily understand Latin. Likewise, a chunk of XML written using a non-preferred vocabulary&amp;nbsp;will make no sense to a baseline reading system because&amp;nbsp;it is not required to process such content. I'm not saying it's a bad idea to insert 'foreign' XML into a content document, simply that special handling is required when it is used.&lt;br /&gt;&lt;br /&gt;Chunks of XML, written in a foreign language and embedded in a stream of preferred vocabulary,&amp;nbsp;are called &lt;strong&gt;Inline XML Islands&lt;/strong&gt;. 
Islands in the stream, that is what they are.&lt;br /&gt;&lt;br /&gt;It is possible for entire content documents to be written in a non-preferred vocabulary.&amp;nbsp;In this case, and because they are not embedded in a stream of preferred vocabulary,&amp;nbsp;these documents&amp;nbsp;are called &lt;strong&gt;Out-Of-Line XML Islands&lt;/strong&gt;,&lt;strong&gt; &lt;/strong&gt;though they are more like continents than islands -&amp;nbsp;entirely inaccessible and&amp;nbsp;incomprehensible to a baseline reading system.&lt;br /&gt;&lt;br /&gt;The&amp;nbsp;Open Publication Structure and Open Packaging Format standards&amp;nbsp;define between them what&amp;nbsp;XML Islands are and then state the requirements to be met by publishers when creating them&amp;nbsp;and the guidelines to be followed by reading systems when encountering them in a publication.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Publisher's Responsibilities: Out-Of-Line XML Islands&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;If a publisher wants to include&amp;nbsp;Out-Of-Line XML Islands in a&amp;nbsp;work, they must meet the following requirements.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The XML Island must be a complete&amp;nbsp;XML document that conforms to its own schema (the schema defines the vocabulary).&lt;/li&gt;&lt;li&gt;The manifest item&amp;nbsp;for an Out-Of-Line XML Island&amp;nbsp;must identify the namespace of the document using the &lt;em&gt;required-namespace&lt;/em&gt; attribute.&lt;/li&gt;&lt;li&gt;For each&amp;nbsp;Out-Of-Line XML Island, the publisher must provide a &lt;strong&gt;fallback&lt;/strong&gt; document which can be processed directly. 
The manifest must include fallback documents as well as the XML Islands they support.&lt;/li&gt;&lt;li&gt;The manifest item for the XML Island&amp;nbsp;must include a &lt;em&gt;fallback&lt;/em&gt; attribute, and that attribute should give the &lt;em&gt;id&lt;/em&gt; of the fallback document.&lt;/li&gt;&lt;li&gt;If necessary, a fallback item may itself have a specified fallback, creating a fallback chain.&lt;/li&gt;&lt;li&gt;Fallback chains must not form a loop.&lt;/li&gt;&lt;li&gt;As an alternative to a fallback item, the publisher may provide a stylesheet which can be used for the presentation of the non-standard content. In this case the &lt;em&gt;fallback-style&lt;/em&gt; attribute should be specified and the target stylesheet should be identified.&lt;/li&gt;&lt;li&gt;An Out-Of-Line XML Island may specify both a fallback item and a fallback-style.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Reading System Guidelines: Out-Of-Line XML Islands&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;A reading system is some combination of hardware and software. In an open market, reading systems will have a range of abilities, including the ability to handle one or more content types that fall outside the OPS preferred vocabulary.&lt;br /&gt;&lt;br /&gt;When a reading system processes an item in the manifest which specifies a fallback item it should follow these guidelines:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Starting from an initial content document, identified in the spine or NCX, the reading system must follow the fallback chain until it finds a document it knows how to display. 
At the end of every fallback chain the reading system&amp;nbsp;should find a document that it can render.&lt;/li&gt;&lt;li&gt;A reading system may display any item that it is capable of processing; it doesn't have to be the first one it finds.&lt;/li&gt;&lt;li&gt;If an Out-Of-Line XML Island specifies both a fallback item and a fallback stylesheet, a reading system may choose which one to use.&lt;/li&gt;&lt;li&gt;When a reading system is designed to have special capabilities, it may do more than the minimum with the content of an XML Island.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Inline XML Islands&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;When a fragment of 'foreign'&amp;nbsp;XML is to be embedded in a stream of content which is&amp;nbsp;written using the preferred vocabulary, the publisher should provide an inline&amp;nbsp;mechanism for handling it. &lt;br /&gt;&lt;br /&gt;We saw that fallback documents were used for Out-Of-Line XML Islands. The equivalent inline technique is the &lt;em&gt;switch&lt;/em&gt; element, which presents zero or more &lt;em&gt;case&lt;/em&gt; elements, each of which wraps XML markup inside a &lt;em&gt;required-namespace&lt;/em&gt; declaration. The syntax takes the form:&lt;br /&gt;&lt;blockquote&gt;&amp;lt;ops:switch id="switch_id"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;ops:case required-namespace="namespace"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp; ... XML content in the named vocabulary&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/ops:case&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;ops:default&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp; ... fallback OPS-compliant content&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/ops:default&amp;gt;&lt;br /&gt;&amp;lt;/ops:switch&amp;gt;&lt;/blockquote&gt;A reading system should examine the&amp;nbsp;&lt;em&gt;required-namespace&lt;/em&gt; of&amp;nbsp;each case element and determine whether it can handle that namespace. 
It should process the first such case that it finds, although it doesn't have to. If the reading system either cannot or chooses not to process any of the cases, it must process the default element. The default must always contain content that would be valid in any OPS content document.&lt;br /&gt;&lt;br /&gt;The example below shows how a fragment of MusicXML might be presented in a content document.&lt;br /&gt;&lt;blockquote&gt;&amp;lt;ops:switch id="musicXML_Example"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;ops:case required-namespace="http://www.recordare.com/"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;score-partwise version="2.0"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;part-list&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;score-part id="P1"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;part-name&amp;gt;Music&amp;lt;/part-name&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/score-part&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/part-list&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;part id="P1"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;measure number="1"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;attributes&amp;gt;&lt;br 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;divisions&amp;gt;1&amp;lt;/divisions&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;key&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;fifths&amp;gt;0&amp;lt;/fifths&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/key&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;time&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;beats&amp;gt;4&amp;lt;/beats&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;beat-type&amp;gt;4&amp;lt;/beat-type&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/time&amp;gt;&lt;br 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;clef&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;sign&amp;gt;G&amp;lt;/sign&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;line&amp;gt;2&amp;lt;/line&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/clef&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/attributes&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;note&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;pitch&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;step&amp;gt;C&amp;lt;/step&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;octave&amp;gt;4&amp;lt;/octave&amp;gt;&lt;br 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/pitch&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;duration&amp;gt;4&amp;lt;/duration&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;type&amp;gt;whole&amp;lt;/type&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/note&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/measure&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/part&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/score-partwise&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/ops:case&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;ops:default&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;img src="images/Cnatural.png" alt="C natural" /&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/ops:default&amp;gt;&lt;br /&gt;&amp;lt;/ops:switch&amp;gt;&lt;/blockquote&gt;A reading system that understands MusicXML would probably choose to process the XML contained within the case element. 
A baseline reader would process the default case and render the image shown here:&lt;br /&gt;&lt;img alt="CNatural" border="0" src="http://lh6.ggpht.com/_cvaF-9-3DHs/S3kb3dl_8rI/AAAAAAAAAQo/b0hhLW52jPk/s800/MusicXML_Cnatural.png" /&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/9101163618768303870'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/9101163618768303870'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/02/xml-islands-in-epub-publications.html' title='XML Islands in epub publications'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh6.ggpht.com/_cvaF-9-3DHs/S3kb3dl_8rI/AAAAAAAAAQo/b0hhLW52jPk/s72-c/MusicXML_Cnatural.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-9011904564165329894</id><published>2010-01-29T11:42:00.012Z</published><updated>2010-01-31T11:40:23.336Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Inside Epub'/><category scheme='http://www.blogger.com/atom/ns#' term='NCX'/><category scheme='http://www.blogger.com/atom/ns#' term='class'/><category scheme='http://www.blogger.com/atom/ns#' term='editor'/><category scheme='http://www.blogger.com/atom/ns#' term='navMap'/><category scheme='http://www.blogger.com/atom/ns#' term='tiny MCE'/><category 
scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><category scheme='http://www.blogger.com/atom/ns#' term='navPoint'/><category scheme='http://www.blogger.com/atom/ns#' term='XHTML'/><category scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='ASP.Net'/><category scheme='http://www.blogger.com/atom/ns#' term='manifest'/><category scheme='http://www.blogger.com/atom/ns#' term='spine'/><title type='text'>Design of an NCX handler class in C#</title><content type='html'>Figure 1. shows a revised design for the Book Content screen of the online wysiwyg epub editor I'm developing.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S2KSAsCDdmI/AAAAAAAAAO8/RimsICYtKwc/s800/InsideEpub0043.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S2KSAsCDdmI/AAAAAAAAAO8/RimsICYtKwc/s288/InsideEpub0043.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. Book content and editing screen&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The main changes to the design are as follows:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The TreeView display of the NCX (Table of Contents) is replaced by a Repeater control. This allows more flexibility in the management of the NCX. This post explores the functionality required by a C# class that maintains the NCX document of an epub publication; access to that functionality is achieved through the Repeater control in a way that isn't so easy with a TreeView. Each document now has a checkbox beside it, allowing the document to be selected for action.&lt;/li&gt;&lt;li&gt;A dropdown list of actions that can be applied to one or more selected content documents. 
This is the key feature that gives access to NCX management functionality and includes the following actions: &lt;br /&gt;&lt;ul&gt;&lt;li&gt;Remove one or more content documents from the publication.&lt;/li&gt;&lt;li&gt;Move a content document up or down in the reading order.&lt;/li&gt;&lt;li&gt;Insert a new content document before or after a given document.&lt;/li&gt;&lt;li&gt;Change the text that appears in the table of contents for a given document.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;The TinyMCE editor configuration has been enriched so the author has more control over the initial styling of the text (but remember, features like selected font, font-size, and text/background colouring should be left to the reader who makes personal presentation choices when using a reading device).&amp;nbsp;The new editor configuration&amp;nbsp;also allows the more technical author to view the XHTML for the current content document.&lt;/li&gt;&lt;li&gt;When the user wants to add a content document, they now have the ability to specify the following items: &lt;/li&gt;&lt;ul&gt;&lt;li&gt;The name of the file where the new content should be written.&lt;/li&gt;&lt;li&gt;The document heading that should appear at the start of the content.&lt;/li&gt;&lt;li&gt;The text that should appear in the table of contents.&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;In Figure 2. the Actions dropdown list is shown expanded. This list&amp;nbsp;provides the structure for much of the remainder of this post, with the addition of the following methods and properties:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A constructor that, given an NCX document, will instantiate the class.&lt;/li&gt;&lt;li&gt;A method for saving changes to the NCX back to disk. 
In the current design, changes are only written to the&amp;nbsp;.epub file when the entire publication is saved.&lt;/li&gt;&lt;li&gt;Properties that allow the following items in the NCX to be set programmatically: &lt;br /&gt;&lt;ul&gt;&lt;li&gt;The title of the publication in the &lt;em&gt;docTitle/text&lt;/em&gt; element.&lt;/li&gt;&lt;li&gt;The author of the publication in the &lt;em&gt;docAuthor/text&lt;/em&gt; element.&lt;/li&gt;&lt;li&gt;The unique identifier of the publication in the &lt;em&gt;meta&lt;/em&gt; element which has a &lt;em&gt;name&lt;/em&gt; attribute of &lt;strong&gt;dtb:uid&lt;/strong&gt;&lt;/li&gt;&lt;li&gt;The document publisher in the &lt;em&gt;meta&lt;/em&gt; element which has a &lt;em&gt;name&lt;/em&gt; attribute of &lt;strong&gt;epub-creator&lt;/strong&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S2Kr7SAd5iI/AAAAAAAAAPE/kp3EFBiellg/s800/InsideEpub0045.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S2Kr7SAd5iI/AAAAAAAAAPE/kp3EFBiellg/s288/InsideEpub0045.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2.&amp;nbsp;Actions on content documents&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;This editor&amp;nbsp;application contains a class called 'ncx' which we will now examine.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Constructor&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;An instance of the ncx class is created by its constructor. An NCX is an XML document, so much of the functionality of NCX management is concerned with XML operations - finding individual nodes, getting and setting attribute values, managing NameSpaces etc.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The ncx constructor is given the filepath to the ncx document. This was&amp;nbsp;extracted from the .epub file when the current book was selected. 
The &amp;lt;spine&amp;gt; element of the &amp;lt;package&amp;gt; has a &lt;em&gt;toc&lt;/em&gt; attribute. This holds the identifier of the &amp;lt;manifest&amp;gt; item which gives the&amp;nbsp;&lt;em&gt;href&lt;/em&gt; of the&amp;nbsp;NCX file. The path to this file is passed to the constructor.&lt;br /&gt;&lt;br /&gt;The ncx document is loaded and parsed, then key nodes are extracted as follows:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&amp;lt;docTitle&amp;gt;&amp;lt;text&amp;gt;Book Title&amp;lt;/text&amp;gt;&amp;lt;/docTitle&amp;gt; - allowing the book's title to be easily read and written using the Title property.&lt;/li&gt;&lt;li&gt;&amp;lt;docAuthor&amp;gt;&amp;lt;text&amp;gt;Author name&amp;lt;/text&amp;gt;&amp;lt;/docAuthor&amp;gt; - allowing the book's author to be easily read and written using the Author property.&lt;/li&gt;&lt;li&gt;&amp;lt;head&amp;gt;&amp;lt;meta name='dtb:uid'/&amp;gt; - allowing the&amp;nbsp;unique identifier of the publication&amp;nbsp;to be easily read and written using the Identifier property.&lt;/li&gt;&lt;li&gt;&amp;lt;head&amp;gt;&amp;lt;meta name='epub-creator'/&amp;gt; - allowing the publisher to be easily read and written using the Publisher property.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Save&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The Save method is the converse of the constructor: where the constructor reads and parses the NCX, Save writes the updated NCX back to disk. The new NCX is then written to the .epub file the next time the publication is saved.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Remove&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;What does it mean to remove a content document from an epub publication? 
Complete removal means performing the following actions:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Delete the &amp;lt;item&amp;gt; from the &amp;lt;manifest&amp;gt;.&lt;/li&gt;&lt;li&gt;Delete the &amp;lt;itemref&amp;gt; from the &amp;lt;spine&amp;gt;.&lt;/li&gt;&lt;li&gt;Delete the &amp;lt;navPoint&amp;gt; from the &amp;lt;navMap&amp;gt; in the NCX.&lt;/li&gt;&lt;li&gt;Delete the content document from the file system.&lt;/li&gt;&lt;li&gt;Rebuild the container (.epub file).&lt;/li&gt;&lt;/ul&gt;The ncx class is concerned only with managing the NCX document, so the Remove method deletes the &amp;lt;navPoint&amp;gt; from the &amp;lt;navMap&amp;gt;. To keep the &amp;lt;navMap&amp;gt; tidy, the method also renumbers the &lt;em&gt;playOrder&lt;/em&gt; attribute of all &amp;lt;navPoint&amp;gt; elements that came after the removed document.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Insert After and Insert Before&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Authors and publishers commonly want to reorganise and rearrange content. This might involve changing the order in which content documents are presented and that is the topic of the next section. It might also involve the insertion of new content into an existing publication. As discussed above with the Remove method, adding a content document requires work elsewhere in the &amp;lt;package&amp;gt; - the &amp;lt;manifest&amp;gt; and the &amp;lt;spine&amp;gt; must also be updated.&lt;br /&gt;&lt;br /&gt;The ncx class methods InsertAfter and InsertBefore make the required changes to the NCX document. The required behaviour is as follows:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;In both cases a new &amp;lt;navPoint&amp;gt; is added to the &amp;lt;navMap&amp;gt;. The &amp;lt;navLabel&amp;gt;&amp;lt;text&amp;gt; element of the &amp;lt;navPoint&amp;gt; is set to the given text. 
The &lt;em&gt;src&lt;/em&gt; attribute of the &amp;lt;content&amp;gt; element is set to the given filename.&lt;/li&gt;&lt;li&gt;The value given to the &lt;em&gt;playOrder&lt;/em&gt; attribute of the &amp;lt;navPoint&amp;gt; will depend on whether the action is to insert the new document before or after a selected document: &lt;br /&gt;&lt;ul&gt;&lt;li&gt;When the new document is inserted &lt;strong&gt;before&lt;/strong&gt; a selected document, the new document takes the &lt;em&gt;playOrder&lt;/em&gt; of the selected document. The selected document and all following &amp;lt;navPoint&amp;gt; elements have their &lt;em&gt;playOrder&lt;/em&gt; incremented to move them down the reading order.&lt;/li&gt;&lt;li&gt;When the new document is inserted &lt;strong&gt;after&lt;/strong&gt; a selected document, the new document takes the &lt;em&gt;playOrder&lt;/em&gt; of the next document. That document and all following documents have their &lt;em&gt;playOrder&lt;/em&gt; incremented, again moving them down the reading order.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Move Up and Move Down&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;To move a content document up or down in the reading order is relatively straightforward. The only requirement is to swap the &lt;em&gt;playOrder&lt;/em&gt; values of two adjacent &amp;lt;navPoint&amp;gt; elements. &lt;br /&gt;&lt;br /&gt;When a given document is to be moved up the reading order i.e. to be presented to the reader earlier, the &lt;em&gt;playOrder&lt;/em&gt; of the given document is swapped with that of the previous document. When a document is to be moved down the reading order, it will swap &lt;em&gt;playOrder&lt;/em&gt; values with the content document that follows it in the current reading order.&lt;br /&gt;&lt;br /&gt;Of course, the &amp;lt;navMap&amp;gt; nodes must be sorted by &lt;em&gt;playOrder&lt;/em&gt; before they are bound to the user interface controls. 
Only then will the table of contents look as expected.&lt;br /&gt;&lt;br /&gt;The &amp;lt;spine&amp;gt; element of the &amp;lt;package&amp;gt; document also holds the linear reading order of the publication. Any changes made to the NCX should also be reflected there, but those actions are outside the scope of the ncx class.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Rename&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The Rename method allows the content provider to change the text that describes a content document as seen in the table of contents. The action of this method is to store the given string in the &amp;lt;navLabel&amp;gt;&amp;lt;text&amp;gt; element of the given &amp;lt;navPoint&amp;gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Properties&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The following properties provide convenient access to the elements and attributes of the NCX document that aren't part of the document navigation (navMap).&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Title&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Gets or sets the publication title, held in the &amp;lt;docTitle&amp;gt;&amp;lt;text&amp;gt; element.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Author&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Gets or sets the publication author, held in the &amp;lt;docAuthor&amp;gt;&amp;lt;text&amp;gt; element.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Identifier&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Gets or sets the unique identifier of the publication which is held in the &lt;em&gt;content&lt;/em&gt; attribute of the &amp;lt;head&amp;gt;&amp;lt;meta&amp;gt; element with its &lt;em&gt;name&lt;/em&gt; attribute set to &lt;strong&gt;dtb:uid&lt;/strong&gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Publisher&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Gets or sets the publisher name which is 
held in the &lt;em&gt;content&lt;/em&gt; attribute of the &amp;lt;head&amp;gt;&amp;lt;meta&amp;gt; element with its &lt;em&gt;name&lt;/em&gt; attribute set to &lt;strong&gt;epub-creator&lt;/strong&gt;.&lt;br /&gt;&lt;hr /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Article Navigation&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Developing an epub editor &lt;a href="http://netkingcol.blogspot.com/2010/01/user-interface-for-online-epub-editor.html"&gt;&amp;lt;&amp;lt; Previous&lt;/a&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;Next &amp;gt;&amp;gt;&lt;br /&gt;Exploring epub standards: &lt;a href="http://netkingcol.blogspot.com/2009/12/introduction-to-epub.html"&gt;Introduction&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-9011904564165329894?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/9011904564165329894'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/9011904564165329894'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/design-of-ncx-handler-class-in-c.html' title='Design of an NCX handler class in C#'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh4.ggpht.com/_cvaF-9-3DHs/S2KSAsCDdmI/AAAAAAAAAO8/RimsICYtKwc/s72-c/InsideEpub0043.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-8550012949740626144</id><published>2010-01-21T13:52:00.056Z</published><updated>2010-02-01T19:07:00.201Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='wysiwyg'/><category scheme='http://www.blogger.com/atom/ns#' term='Inside Epub'/><category scheme='http://www.blogger.com/atom/ns#' term='navMap'/><category scheme='http://www.blogger.com/atom/ns#' term='XSL'/><category scheme='http://www.blogger.com/atom/ns#' term='META-INF'/><category scheme='http://www.blogger.com/atom/ns#' term='website'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='Zip'/><category scheme='http://www.blogger.com/atom/ns#' term='mimetype'/><category scheme='http://www.blogger.com/atom/ns#' term='Sandcastle'/><category scheme='http://www.blogger.com/atom/ns#' term='navPoint'/><category scheme='http://www.blogger.com/atom/ns#' term='XML documentation'/><category scheme='http://www.blogger.com/atom/ns#' term='epubcheck'/><category scheme='http://www.blogger.com/atom/ns#' term='OCF'/><category scheme='http://www.blogger.com/atom/ns#' term='Threepress'/><category scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Container Format'/><title type='text'>Online Epub Editor Project: Technical Notes</title><content type='html'>This post adds some technical notes to the Inside Epub articles which explore the epub standards and show how to write an online wysiwyg epub editor. 
The following topics are covered:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Using the DotNetZip library to open and save an ebook&lt;/li&gt;&lt;li&gt;Checking the validity of your .epub files&lt;/li&gt;&lt;li&gt;Using the Sandcastle Help Builder to document code&lt;/li&gt;&lt;li&gt;Sorting an NCX navMap by playOrder&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Using the DotNetZip library to open and save an ebook&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;I chose the DotNetZip library for this project because it looked well documented and seemed to offer the functionality I wanted. All was fine until I submitted an epub for validation using the facility provided by &lt;a href="http://threepress.org/document/epub-validate"&gt;Threepress&lt;/a&gt;&amp;nbsp;(see Checking the validity of your .epub files, below).&lt;br /&gt;&lt;br /&gt;There were a few simple problems to resolve but the most persistent was a report that the mimetype file did not contain the expected information (application/epub+zip). The file clearly did have the correct information in it so I suspected the zipping wasn't working correctly.&lt;br /&gt;&lt;br /&gt;The rules for the ZIP Container in the &lt;a href="http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm"&gt;Open Container Format Specification&lt;/a&gt; state:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;The first file in the ZIP Container MUST be a file by the ASCII name of ‘mimetype’ which holds the MIME type for the ZIP Container (i.e., “application/epub+zip” as an ASCII string; no padding, white-space or case change). The file MUST be neither compressed nor encrypted and there MUST NOT be an extra field in its ZIP header. 
If this is done, then the ZIP Container offers convenient “magic number” support as described in RFC 2048 and the following will hold true:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;The bytes “PK” will be at the beginning of the file &lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;The bytes “mimetype” will be at position 30 &lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;The actual MIME type (i.e., the ASCII string “application/epub+zip”) will begin at position 38&lt;/span&gt;&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;/blockquote&gt;I made a couple of mistakes when building the epub the first few (many)&amp;nbsp;times. Figure&amp;nbsp;1. shows one of my earlier attempts which I opened in Notepad to view the characters in the zipped file (remember it's in Zip format so you can't expect to see normal text). You can see that, although the mimetype is the first file in the archive, the file shouldn't have the long folder path 'eBookTemp/Inside epub/' and the 'application/epub+zip' doesn't start at position 38.&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S1g3vT8uWfI/AAAAAAAAAM8/kMlpMeqfCbw/s800/InsideEpub0040.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S1g3vT8uWfI/AAAAAAAAAM8/kMlpMeqfCbw/s288/InsideEpub0040.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. 
Badly constructed epub&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;It was clear that I couldn't simply write:&lt;br /&gt;&lt;blockquote&gt;using (ZipFile zippedBook = new ZipFile())&lt;br /&gt;{&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; zippedBook.AddDirectory(ebookPath);&lt;br /&gt;}&lt;/blockquote&gt;where ebookPath was the folder holding the epub files. I had to make sure that the mimetype was uncompressed and started at the right position in the file. I evolved the following code to achieve this:&lt;br /&gt;&lt;blockquote&gt;zippedBook.ForceNoCompression = true;&lt;br /&gt;zippedBook.AddEntry("mimetype","","application/epub+zip");&lt;br /&gt;&lt;br /&gt;zippedBook.ForceNoCompression = false;&lt;br /&gt;zippedBook.AddDirectory(_fileSystemPath + "META-INF","META-INF");&lt;br /&gt;zippedBook.AddDirectory(&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _fileSystemPath + _package.PackagePath,_package.PackagePath);&lt;br /&gt;&lt;br /&gt;zippedBook.Save(_ebookPath);&lt;/blockquote&gt;The 'ForceNoCompression=true;' ensures that mimetype is not compressed. This is followed by 'ForceNoCompression=false;' to compress the rest of the epub. With this code, the application produced an epub file that looked like Figure&amp;nbsp;2. when opened in Notepad.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S1g3u68b7LI/AAAAAAAAAM4/afFQ7ela0zE/s800/InsideEpub0039.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" height="108" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S1g3u68b7LI/AAAAAAAAAM4/afFQ7ela0zE/s320/InsideEpub0039.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2.&amp;nbsp;Well constructed epub archive&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;This file passed the epub validation check discussed below.&lt;br /&gt;&lt;br /&gt;The other side of the coin with respect to epub files is how to expand an epub archive so you can work with its files. 
I developed the following code to do this:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;foreach (ZipEntry ze in zippedBook.Entries){&lt;br /&gt;&amp;nbsp;&amp;nbsp; ze.ExtractExistingFile = ExtractExistingFileAction.OverwriteSilently;&lt;br /&gt;&amp;nbsp;&amp;nbsp; if (ze.IsDirectory){ &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;Directory.CreateDirectory(fileSystemPath + Path.GetDirectoryName(ze.FileName));&lt;br /&gt;&amp;nbsp; }&lt;br /&gt;&amp;nbsp; else{&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;ze.Extract(fileSystemPath);&lt;br /&gt;&amp;nbsp; }&lt;br /&gt;}&lt;/blockquote&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Checking the validity of your .epub files&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;There are some&amp;nbsp;important tests to perform to be sure&amp;nbsp;the online epub editor is working correctly. Perhaps the most obvious is whether its output&amp;nbsp;can be loaded into a reading device. That could be deceptive, though, because the reading software on one reading device might be more rigorous and compliant than on another.&lt;br /&gt;&lt;br /&gt;Another way is to use a tool that validates epub publications. I found one of these at Threepress, as shown in Figure 3. The underlying tool is a software project called &lt;a href="http://code.google.com/p/epubcheck/"&gt;epubcheck&lt;/a&gt;. &lt;br /&gt;&lt;a href="http://lh6.ggpht.com/_cvaF-9-3DHs/S1g3uNHt-wI/AAAAAAAAAMw/81c66UD0W9I/s800/InsideEpub0037.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh6.ggpht.com/_cvaF-9-3DHs/S1g3uNHt-wI/AAAAAAAAAMw/81c66UD0W9I/s288/InsideEpub0037.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 3. ePub validation of Inside Epub.epub&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;You could argue that this is also software and it may have bugs in it, and you'd be right. 
Given the wide and&amp;nbsp;rapidly growing&amp;nbsp;range of Zip tools, epub conversion tools, and reading software around, some confirmation of compliance with standards is required and tools like epubcheck will become an important way of ensuring readers will not be disappointed.&lt;br /&gt;&lt;br /&gt;It was a great relief, therefore, that I overcame my initial difficulties with building a conforming OCF epub file and received the big green tick from epubcheck.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Using the Sandcastle Help Builder to document code&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;Microsoft and others&amp;nbsp;provide several tools and techniques for documenting code in the MSDN style, as in this example of the &lt;a href="http://msdn.microsoft.com/en-us/library/system.aspx"&gt;System Class Library&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;When using Visual Studio, you enable&amp;nbsp;XML comments on the Build tab of Project | Properties. Then&amp;nbsp;you can use the triple slash (///) notation to add XML-style documentation to your source code. See Figure&amp;nbsp;4. for a sample. The comments are needed for every Namespace and for every public Class, including all its Public members - methods, properties etc. 
As soon as you type /// before one of these items, the Visual Studio IDE creates the comment structure with an initial &amp;lt;summary&amp;gt; and, if it's a method, with &amp;lt;param&amp;gt; elements appropriate to the method's signature.&lt;/li&gt;&lt;li&gt;Tools are available to extract the comments&amp;nbsp;by reflection on the binary files of your application and to build&amp;nbsp;navigable documentation like the MSDN example above.&amp;nbsp;I've used NDoc before, so this time I thought I would try &lt;a href="http://www.codeplex.com/SHFB"&gt;Sandcastle Help File Builder&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;I've made a preliminary pass through the code adding XML decoration, as it's called. Figure&amp;nbsp;4. shows an example of this. There are guidelines for &lt;a href="http://www.dynicity.com/Products/XMLDocComments.aspx"&gt;XML documentation comments&lt;/a&gt;&amp;nbsp;which list and explain the appropriate XML grammar.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S1g3wDhOt5I/AAAAAAAAANE/17MyZ5hN7Z4/s800/InsideEpub0042.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S1g3wDhOt5I/AAAAAAAAANE/17MyZ5hN7Z4/s288/InsideEpub0042.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 4. XML decoration of source code&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The Sandcastle Help File Builder needed a little configuration. I set values for the following in the Project Properties:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;HelpFileFormat, set to 'Website'&lt;/li&gt;&lt;li&gt;OutputPath, set to&amp;nbsp;the folder where the output should be written.&lt;/li&gt;&lt;li&gt;WorkingPath, set to a folder where work files&amp;nbsp;can be created.&lt;/li&gt;&lt;/ul&gt;You click on the Build icon and wait five minutes while Sandcastle works its magic.&amp;nbsp;You can see the results in Figure 5. 
and also at &lt;a href="http://www.hazelhurst.net/InsideEpub/"&gt;Inside Epub Class Library&lt;/a&gt;.&lt;br /&gt;&lt;a href="http://lh6.ggpht.com/_cvaF-9-3DHs/S1g3tvAjHMI/AAAAAAAAAMs/0hl6RRButro/s800/InsideEpub0036.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh6.ggpht.com/_cvaF-9-3DHs/S1g3tvAjHMI/AAAAAAAAAMs/0hl6RRButro/s288/InsideEpub0036.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 5. Sandcastle&amp;nbsp;'Website' output&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Sorting an NCX navMap by playOrder&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The NCX document of an epub publication holds information, in XML format,&amp;nbsp;about the structure of the publication. Its most important feature is a hierarchical list of 'navigation points'&amp;nbsp;showing how the publication is subdivided. This list can be used by software in a reading device to provide access points allowing the reader to select directly what they want to view.&lt;br /&gt;&lt;br /&gt;The root of the navigation data is the &amp;lt;navMap&amp;gt; element. This contains a hierarchical set of &amp;lt;navPoint&amp;gt; elements. Figure 6. shows a sample &amp;lt;navMap&amp;gt;.&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S0M3hBeI5xI/AAAAAAAAAF4/-y_eoMQX1Nk/s800/InsideEpub0010.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S0M3hBeI5xI/AAAAAAAAAF4/-y_eoMQX1Nk/s288/InsideEpub0010.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 6. Sample NCX showing its navMap&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;A &amp;lt;navPoint&amp;gt; element holds the text to be displayed to the reader in the&amp;nbsp;&amp;lt;navLabel&amp;gt; element. 
It also tells the reading software where to go for the content document; this is held in the &lt;em&gt;src&lt;/em&gt; attribute of the &amp;lt;content&amp;gt; element.&lt;br /&gt;&lt;br /&gt;The topic of this Technical Note is the &lt;em&gt;playOrder&lt;/em&gt; attribute of the &amp;lt;navPoint&amp;gt;. The playOrder attribute holds a number specifying the sequence in which the &amp;lt;navPoint&amp;gt; details should be presented to the reader. &lt;br /&gt;&lt;br /&gt;It's important to distinguish the playOrder sequence from the sequence in which the &amp;lt;navPoint&amp;gt; nodes are held within the &amp;lt;navMap&amp;gt;; they don't have to be the same. In general, you can't assume anything about the way a content provider has built the &amp;lt;navMap&amp;gt;. &lt;br /&gt;&lt;br /&gt;For instance, suppose a publisher is working on a 'flat' document, i.e. one with only one level in the &amp;lt;navPoint&amp;gt; hierarchy. If a new content document needs to be inserted between two documents with playOrder values of 5 and 6, the publisher might add the new&amp;nbsp;&amp;lt;navPoint&amp;gt; as the last node in the &amp;lt;navMap&amp;gt;; they would give the new &amp;lt;navPoint&amp;gt; a playOrder&amp;nbsp;value of 6, and increment the playOrder of all &amp;lt;navPoint&amp;gt; elements that need to follow it in reading order.&lt;br /&gt;&lt;br /&gt;What this means is that before presenting the navigation details to the reader, the &amp;lt;navPoint&amp;gt; nodes need to be sorted by their playOrder attribute. There are several approaches that could be taken, but the one described here is to use an XSL transform. 
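&lt;br /&gt;&lt;br /&gt;To make the flat-document example concrete, here is a minimal sketch of how such a &amp;lt;navMap&amp;gt; might look after the insertion. The &lt;em&gt;id&lt;/em&gt; values, label text and file names are hypothetical, chosen only for illustration. The new &amp;lt;navPoint&amp;gt; is the last node in the document but has a playOrder of 6, while the document that previously had playOrder 6 now has 7.&lt;br /&gt;&lt;blockquote&gt;&amp;lt;navMap&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;!-- navPoints with playOrder 1 to 4 omitted --&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;navPoint id="chapter5" playOrder="5"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;navLabel&amp;gt;&amp;lt;text&amp;gt;Chapter 5&amp;lt;/text&amp;gt;&amp;lt;/navLabel&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;content src="chapter5.xhtml"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/navPoint&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;navPoint id="chapter6" playOrder="7"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;navLabel&amp;gt;&amp;lt;text&amp;gt;Chapter 6&amp;lt;/text&amp;gt;&amp;lt;/navLabel&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;content src="chapter6.xhtml"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/navPoint&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;!-- new document: last in the file, sixth in reading order --&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;navPoint id="newchapter" playOrder="6"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;navLabel&amp;gt;&amp;lt;text&amp;gt;New chapter&amp;lt;/text&amp;gt;&amp;lt;/navLabel&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;content src="newchapter.xhtml"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/navPoint&amp;gt;&lt;br /&gt;&amp;lt;/navMap&amp;gt;&lt;/blockquote&gt;Sorted by playOrder, these three entries would be presented as Chapter 5, New chapter, Chapter 6.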
&lt;br /&gt;&lt;br /&gt;The following&amp;nbsp;listing shows one possible transform.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;1&amp;nbsp;&amp;nbsp; &amp;lt;?xml version="1.0" encoding="utf-8"?&amp;gt;&lt;br /&gt;2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;&lt;br /&gt;3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;lt;xsl:output method="xml" indent="yes" omit-xml-declaration="no" encoding="utf-8"/&amp;gt;&lt;br /&gt;4&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;lt;xsl:template match="node()[local-name()='navMap']"&amp;gt;&lt;br /&gt;5&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;xsl:element name="navMap"&amp;gt;&lt;br /&gt;6&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;xsl:apply-templates select="node()[local-name()='navPoint']"&amp;gt;&lt;br /&gt;7&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;xsl:sort select="number(@playOrder)" data-type="number"/&amp;gt;&lt;br /&gt;8&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/xsl:apply-templates&amp;gt;&lt;br /&gt;9&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/xsl:element&amp;gt;&lt;br /&gt;10&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/xsl:template&amp;gt;&lt;br /&gt;11&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;xsl:template match="node()[local-name()='navPoint']"&amp;gt;&lt;br /&gt;12&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;xsl:copy-of select="."/&amp;gt;&lt;br 
/&gt;13&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/xsl:template&amp;gt;&lt;br /&gt;14 &amp;lt;/xsl:stylesheet&amp;gt;&lt;/blockquote&gt;At line 3, the &amp;lt;xsl:output&amp;gt; statement declares that the output is to be an XML document which includes an &amp;lt;?xml ?&amp;gt; declaration and is encoded in UTF-8.&lt;br /&gt;&lt;br /&gt;Lines 4-10 of&amp;nbsp;the transform find and process&amp;nbsp;the &amp;lt;navMap&amp;gt; element. &lt;br /&gt;&lt;br /&gt;At line 5 a new &amp;lt;navMap&amp;gt; is generated in the output.&lt;br /&gt;&lt;br /&gt;At line 6, the transform handles the &amp;lt;navPoint&amp;gt; elements. The corresponding template at lines 11-13 returns a copy of each &amp;lt;navPoint&amp;gt;.&lt;br /&gt;&lt;br /&gt;Line 7 is the key to the whole transform. It sorts the returned &amp;lt;navPoint&amp;gt; elements using the playOrder attribute.&lt;br /&gt;&lt;br /&gt;The end result is a &amp;lt;navMap&amp;gt; in which the &amp;lt;navPoint&amp;gt; elements are in the order in which they should be presented to the reader.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Calling the transform in C#&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The calling sequence for using this transform in C# can be something like the following:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: #38761d;"&gt;// load the transform into an XslCompiledTransform instance&lt;/span&gt;&lt;br /&gt;XslCompiledTransform xslt = new XslCompiledTransform();&lt;br /&gt;xslt.Load(MapPath(".") + "/TransformNavMap.xsl");&lt;br /&gt;&lt;br /&gt;&lt;span style="color: #38761d;"&gt;// prepare an XmlWriter to write the transform output&lt;/span&gt;&lt;br /&gt;StringBuilder&amp;nbsp;newNavMap = new StringBuilder();&lt;br /&gt;XmlWriter writer = XmlWriter.Create(newNavMap , xslt.OutputSettings);&lt;br /&gt;&lt;br /&gt;&lt;span style="color: #38761d;"&gt;//&amp;nbsp;sort the book's&amp;nbsp;navMap using the transform&lt;/span&gt;&lt;br 
/&gt;xslt.Transform(book.container.package.ncx.NavMap, writer);&lt;br /&gt;writer.Close();&lt;/blockquote&gt;Now the new &amp;lt;navMap&amp;gt; can be retrieved as a string using newNavMap.ToString() and, if required, loaded into an XmlDocument with its LoadXml method.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-8550012949740626144?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/8550012949740626144'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/8550012949740626144'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/online-epub-editor-project-technical.html' title='Online Epub Editor Project: Technical Notes'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh4.ggpht.com/_cvaF-9-3DHs/S1g3vT8uWfI/AAAAAAAAAM8/kMlpMeqfCbw/s72-c/InsideEpub0040.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-870042653534272843</id><published>2010-01-19T17:38:00.002Z</published><updated>2010-01-29T11:46:32.185Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='wysiwyg'/><category scheme='http://www.blogger.com/atom/ns#' term='NCX'/><category scheme='http://www.blogger.com/atom/ns#' term='IDPF'/><category scheme='http://www.blogger.com/atom/ns#' term='navMap'/><category scheme='http://www.blogger.com/atom/ns#' term='editor'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category 
scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='GUI'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><title type='text'>User interface for an online epub editor</title><content type='html'>This article looks at the minimum requirements for an online epub editor and how they have been implemented in the sample application. The scope is limited to handling the editor's functionality - how to create, save, edit, and view .epub publications. It&amp;nbsp;excludes the supporting features like user login, disk space management, uploading and referencing images. Those are important topics but they are common to many online applications, so I don't want to reinvent them here.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Requirements Summary&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;To refine further the scope of this project, an&amp;nbsp;epub editor should include the following abilities:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Select an existing epub book&amp;nbsp;for viewing or amendment.&lt;/li&gt;&lt;li&gt;Open a selected epub&amp;nbsp;book and view its contents - including its metadata.&lt;/li&gt;&lt;li&gt;Navigate around a book using its NCX table of contents.&lt;/li&gt;&lt;li&gt;Edit a section and save its contents.&lt;/li&gt;&lt;li&gt;Add a new section to the book.&lt;/li&gt;&lt;li&gt;Remove a section from the book.&lt;/li&gt;&lt;li&gt;Change the reading order of the content documents.&lt;/li&gt;&lt;li&gt;View the book's metadata.&lt;/li&gt;&lt;li&gt;Edit and save the book's metadata.&lt;/li&gt;&lt;li&gt;Save the book to its .epub file.&lt;/li&gt;&lt;li&gt;Create a new epub book.&lt;/li&gt;&lt;/ul&gt;Not included, for this first version of the editor, are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Adding multiple instances of metadata items, such as &lt;em&gt;title&lt;/em&gt;, &lt;em&gt;identifier, language, creator, 
&lt;/em&gt;etc.&lt;/li&gt;&lt;li&gt;Implementing &lt;em&gt;Save As&lt;/em&gt; to an .epub with a different name.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Application Overview&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;Figure 1. shows a screenshot of the online editor. A menu gives access to different views of the application. It has options called:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="color: #660000;"&gt;Books&lt;/span&gt; - showing a list of epub books in the user's 'library'.&lt;/li&gt;&lt;li&gt;&lt;span style="color: #660000;"&gt;Book Information&lt;/span&gt; - displaying the metadata for the selected book.&lt;/li&gt;&lt;li&gt;&lt;span style="color: #660000;"&gt;Book Content&lt;/span&gt; - displaying the table of contents for the selected book.&lt;/li&gt;&lt;li&gt;&lt;span style="color: #660000;"&gt;Media &lt;/span&gt;- listing the non-text media items (e.g. images) that are part of the selected book.&lt;/li&gt;&lt;li&gt;&lt;span style="color: #660000;"&gt;Styles &lt;/span&gt;- listing the CSS files that are associated with the selected book.&lt;/li&gt;&lt;/ul&gt;&lt;a href="http://lh6.ggpht.com/_cvaF-9-3DHs/S1SXDku7rdI/AAAAAAAAAJQ/NsOH9UHBPwY/s800/InsideEpub0027.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh6.ggpht.com/_cvaF-9-3DHs/S1SXDku7rdI/AAAAAAAAAJQ/NsOH9UHBPwY/s288/InsideEpub0027.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. Viewing a list of books&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Select an existing epub book for viewing or amendment&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 1. shows the application positioned on the 'Books' view. 
The screen below the menu comprises a ListBox and a button showing the text 'New epub'.&lt;br /&gt;&lt;br /&gt;The ListBox displays the .epub files that are in the user's library, where 'library' is defined as a folder on the web server that hosts the application. In this example, I've added an item to the &amp;lt;appSettings&amp;gt; in the&amp;nbsp;web.config file of the application which provides the path to the library folder. This is shown in Figure 2.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S1Sa2m_qE7I/AAAAAAAAAJU/2wI2aQ0jZWs/s640/InsideEpub0028.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S1Sa2m_qE7I/AAAAAAAAAJU/2wI2aQ0jZWs/s400/InsideEpub0028.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2. epub library configuration&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;When the user selects a book, by clicking on it in the list,&amp;nbsp;the application reads the .epub file using the ZipFile class of the DotNetZip library. Using the ExtractAll method of that class, the contents of the .epub are written to the file system. All subsequent operations, like changing a document or adding a document,&amp;nbsp;are carried out on files rather than on the in-memory zipped data. However, the .epub file&amp;nbsp;is kept up to date after each saved change.&lt;br /&gt;&lt;br /&gt;In a multi-user application a more sophisticated approach would be required to keep each user's library separate from all others and under the protection of an access control mechanism. There are similarities with blogging sites and Content Management Systems.&lt;br /&gt;&lt;br /&gt;The 'New epub' button, unsurprisingly, is where the user will start creating a new epub publication from scratch.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Viewing and changing metadata&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 3. 
is a screenshot after the user selects a book and then clicks on the 'Book Information' tab. &lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S1Sf4C1o-yI/AAAAAAAAAJY/Gv8yLtuzXKs/s800/InsideEpub0029.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S1Sf4C1o-yI/AAAAAAAAAJY/Gv8yLtuzXKs/s288/InsideEpub0029.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 3. epub metadata&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The application displays the metadata items that are found in the &amp;lt;metadata&amp;gt; section of the book's package document. The display is created using an ASP.Net Repeater control which is bound to the &amp;lt;metadata&amp;gt; node, so it displays all of the items found there and only the items found there. Further, it displays them in the order in which they appear, so the two &amp;lt;date&amp;gt; elements for instance are not together.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Edit and save a book's metadata&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The information is displayed in TextBox controls, so it can be changed. The 'Save' button is used to update the metadata in the package document. We'll see later how the application saves the package data in the .epub file.&lt;br /&gt;&lt;br /&gt;Some enhancements are needed to this model for a more flexible editor. For instance:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The ability to mark an entire .epub as read-only is desirable. If you don't own the intellectual property to a work, it's not appropriate to change it.&lt;/li&gt;&lt;li&gt;Some metadata items&amp;nbsp;can occur more than once. 
Examples are: &lt;em&gt;title, identifier, language, creator, contributor.&lt;/em&gt; It would be useful to be able to add and remove such items in the metadata.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Navigate around a book using its NCX table of contents&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 4. shows a screenshot similar to one in an earlier post and is the result of clicking on the 'Book Content' tab while a book is selected.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S1Sf4U1x0eI/AAAAAAAAAJc/VhkBGyEMjew/s800/InsideEpub0030.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S1Sf4U1x0eI/AAAAAAAAAJc/VhkBGyEMjew/s288/InsideEpub0030.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 4. Navigating a book's contents&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;This screen uses a TreeView control to show a hierarchical table of contents. The TreeView is bound to a version of the navMap extracted from the book's NCX information. To the right of the table of contents is the area used to view and edit the text. This is a TextBox control associated with the tiny MCE Javascript, converting it into an editor control.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Edit and save content and&amp;nbsp;add a new section&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Beneath the reading/writing area in&amp;nbsp;Figure 4.&amp;nbsp;you can see the following controls:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A button marked 'Save'. This is used to save the current contents of the edit area to the underlying content document in the file system.&amp;nbsp;&lt;/li&gt;&lt;li&gt;A button marked 'Add' accompanied by two TextBoxes labelled 'Name' and 'Title'. The user enters the name for the content document in the Name box and the content heading in the Title box and clicks the Add button.&lt;/li&gt;&lt;/ul&gt;Figure 5. 
shows the results when Name is set to 'chapter01' and Title is set to 'Chapter 1'.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S1Vj0IxSP7I/AAAAAAAAAJg/p-DAZhD3p3U/s800/InsideEpub0031.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S1Vj0IxSP7I/AAAAAAAAAJg/p-DAZhD3p3U/s288/InsideEpub0031.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 5. Navigating a book's contents&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;An empty content document called 'chapter01.xml' is created and the text 'Chapter 1' is inserted into&amp;nbsp;the document heading.&amp;nbsp;The application adds chapter01 to the package, meaning the content document is&amp;nbsp;added to the &amp;lt;manifest&amp;gt;, the &amp;lt;spine&amp;gt;, and the &amp;lt;navMap&amp;gt; in the NCX document. The application navigates to the new document ready for the user to start typing. In this case,&amp;nbsp;the user has typed some text into the editing area and pressed 'Save'.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Creating a new epub publication&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;It was shown above that there is a 'New epub' button on the 'Books' tab. Figure 6. shows the result of clicking that button.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S1VwXIMbDlI/AAAAAAAAAJk/72J7WqbDm2s/s800/InsideEpub0032.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S1VwXIMbDlI/AAAAAAAAAJk/72J7WqbDm2s/s288/InsideEpub0032.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 6. Creating a new epub&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The user is presented with a set of TextBoxes inviting input of basic metadata for the new book. The first three items - title, identifier, and language - are mandatory. The only item already set to a value is the identifier. 
In case you don't have your own ISBN or other globally unique identifier, the application generates a GUID (Globally Unique IDentifier). You can use this or overwrite it with your own identifier.&lt;br /&gt;&lt;br /&gt;Figure 7. shows the screen after each metadata item has been given a value and just&amp;nbsp;before the 'Create' button is clicked.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S1V1VhkZ9zI/AAAAAAAAAJo/VSrsLwueA3E/s800/InsideEpub0033.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S1V1VhkZ9zI/AAAAAAAAAJo/VSrsLwueA3E/s288/InsideEpub0033.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 7. metadata for the new epub&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Figure 8. shows the results of clicking the 'Create' button. A new epub is created and the application shows the 'Book Content' view. The user has entered and saved some text. Notice this is an almost empty publication, having only one content document.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S1V1VvN8jvI/AAAAAAAAAJs/_SHSe4xwA6U/s800/InsideEpub0034.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S1V1VvN8jvI/AAAAAAAAAJs/_SHSe4xwA6U/s288/InsideEpub0034.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 8. The new epub waiting to be written&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Figure 9. shows the 'Books' view after a new book is created. The list of books is updated to show the new title. Clicking on the new book opens it for viewing and editing just like any other book in the library.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S1V3lcQl4sI/AAAAAAAAAJw/7XIgC0kBHa8/s800/InsideEpub0035.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S1V3lcQl4sI/AAAAAAAAAJw/7XIgC0kBHa8/s288/InsideEpub0035.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 9. 
New epub showing in the 'Books' list&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Summary&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;This article has given a view of how an online wysiwyg epub editor might look. Details of the code will shortly be available&amp;nbsp;in the reference section for the code project.&lt;br /&gt;&lt;br /&gt;Undoubtedly, this is rough and ready. Many refinements would be required to make using this application a useful and pleasant experience. Among these are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A much richer configuration of the tiny MCE editor to provide better formatting of the text.&lt;/li&gt;&lt;li&gt;Autosaving contents at short intervals.&lt;/li&gt;&lt;/ul&gt;&lt;hr /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Article Navigation&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Developing an epub editor &lt;a href="http://netkingcol.blogspot.com/2010/01/develop-your-own-epub-editor.html"&gt;&amp;lt;&amp;lt; Previous&lt;/a&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="http://netkingcol.blogspot.com/2010/01/design-of-ncx-handler-class-in-c.html"&gt;Next&lt;/a&gt; &amp;gt;&amp;gt;&lt;br /&gt;Exploring epub standards: &lt;a href="http://netkingcol.blogspot.com/2009/12/introduction-to-epub.html"&gt;Introduction&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-870042653534272843?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/870042653534272843'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/870042653534272843'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/user-interface-for-online-epub-editor.html' title='User interface for an online epub 
editor'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh6.ggpht.com/_cvaF-9-3DHs/S1SXDku7rdI/AAAAAAAAAJQ/NsOH9UHBPwY/s72-c/InsideEpub0027.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-2323692444558541569</id><published>2010-01-14T15:05:00.005Z</published><updated>2010-01-19T17:45:17.830Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='wysiwyg'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Packaging Format'/><category scheme='http://www.blogger.com/atom/ns#' term='online'/><category scheme='http://www.blogger.com/atom/ns#' term='editor'/><category scheme='http://www.blogger.com/atom/ns#' term='Zen Garden'/><category scheme='http://www.blogger.com/atom/ns#' term='tiny MCE'/><category scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='DotNetZip'/><category scheme='http://www.blogger.com/atom/ns#' term='Package'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><title type='text'>Develop your own epub editor</title><content type='html'>The previous posts in this series explored the structure of epub books and demonstrated how the epub standards work together to describe and package content documents. If you found those articles highly technical you might want to look away now - this post is the first of&amp;nbsp;a series that shows how to create your own web-based, wysiwyg editor that you can use to create epub documents. The programming strand of Inside Epub&amp;nbsp;starts here. 
This post is an introduction to the subject and gives an idea of the development environment, the software tools, and the documentation that will be available as the project unfolds.&lt;br /&gt;&lt;br /&gt;There are still topics to cover in the previous strand, like embedding XML islands in your content and handling fallback from non-standard to standard document types - and I will cover those in future. To help you choose the posts that match your interests, I've created a separate list of links for the code development and will join the posts in each strand with 'next' and 'previous' links.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Objectives - mine and yours&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The aim of the programming strand of this blog is to create a web-based, wysiwyg epub editor. This will work only with epub documents and will be suitable for the creation of new ebooks. There's plenty of effort going into conversion of existing texts between formats so I don't feel the need to do that. If you believe that epub will become the format of choice for ebooks&amp;nbsp;then why not create your book in epub directly? I don't claim that this application&amp;nbsp;will be as powerful as Microsoft Word® with a 'Save As epub' add-in; it definitely won't be. &lt;em&gt;My&lt;/em&gt; aim is to show you how &lt;em&gt;you&lt;/em&gt; might set about creating a robust, online, multi-user&amp;nbsp;epub editor.&lt;br /&gt;&lt;br /&gt;To show you where we'll end up, Figure 1. shows a screenshot of the editor after opening the sample epub book &lt;em&gt;The Curious Case of Benjamin Button&lt;/em&gt;.&amp;nbsp;It has a menu across the top and the body of the page shows the result of selecting a book on the 'Books' tab and&amp;nbsp;clicking on the 'Book Content' tab. 
This manages to look like Adobe Digital Editions but, being online, doesn't&amp;nbsp;require you to download and install&amp;nbsp;the reader.&lt;br /&gt;&lt;br /&gt;At the moment we're not interested in style, only in functionality. If you want to see what might be possible by adding CSS styling, look at &lt;a href="http://epubzengarden.com/#/static/middlemarch/OEBPS/chapter1.html"&gt;epub Zen Garden&lt;/a&gt;.&lt;br /&gt;&lt;a href="http://lh6.ggpht.com/_cvaF-9-3DHs/S07YbA3bUXI/AAAAAAAAAIk/nSLIN3fy3ks/s800/InsideEpub0017.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh6.ggpht.com/_cvaF-9-3DHs/S07YbA3bUXI/AAAAAAAAAIk/nSLIN3fy3ks/s288/InsideEpub0017.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. Web-based, wysiwyg epub editor&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The application was written&amp;nbsp;for the&amp;nbsp;ASP.Net&amp;nbsp;platform using Microsoft's Visual Web Developer 2008 Express Edition®. The programming language is&amp;nbsp;C#.&lt;br /&gt;&lt;br /&gt;On the left of the screenshot, a Treeview control is bound to the navMap of the NCX document in order to display&amp;nbsp;the Table of Contents.&amp;nbsp;To the right, an editor control shows the content of the Title Page.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Software Prerequisites&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;We've seen that epub publications are held&amp;nbsp;as Zip archives and&amp;nbsp;that epub packaging is achieved using a range of XML documents. Content documents&amp;nbsp;are held in XHTML. To start developing an epub editor that runs on the .Net platform I chose the following range of free and/or open software tools:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.microsoft.com/express/vwd/"&gt;Visual Web Developer 2008 Express Edition&lt;/a&gt;&amp;nbsp;is used as the main development platform. 
There are more recent versions of this toolkit available, in beta testing at&amp;nbsp;the time of writing,&amp;nbsp;if you must have the latest. The tool enables the development of ASP.Net websites and includes extensive XML handling. The NCX file requires transformation before it is bound to the Table of Contents control and Visual Web Developer has facilities for using XSLT transforms.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.codeplex.com/DotNetZip"&gt;DotNetZip library&lt;/a&gt;&amp;nbsp;is used to manipulate Zip files in the C# .Net environment. Follow the straightforward installation instructions. The ZipFile class of this library is used to open .epub files and to extract the contents to a folder on disk. It is also used to refresh the archive when a content document is changed or when documents are added or removed.&lt;/li&gt;&lt;li&gt;&lt;a href="http://tinymce.moxiecode.com/"&gt;tiny MCE&lt;/a&gt;&amp;nbsp;is a &lt;a href="http://tinymce.moxiecode.com/using.php"&gt;widely&amp;nbsp;used&lt;/a&gt;&amp;nbsp;Javascript editor which converts an ASP.Net&amp;nbsp;TextBox control into an editor that outputs XHTML documents. The &lt;a href="http://www.wordpress.com/"&gt;WordPress&lt;/a&gt; blogging site and the&amp;nbsp;&lt;a href="http://www.joomla.org/"&gt;Joomla&lt;/a&gt; Content Management Software use this technology.&amp;nbsp;The installation instructions are easy to follow. Configuration is a bit harder so, for the purposes of these posts, I'm keeping it simple.&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Prototyping Approach&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;My favourite approach to using visual development tools is prototyping and this is especially true if I just want to demonstrate some techniques. 
It's really easy to throw a website together using Visual Web Developer&amp;nbsp;but, having said that, the object-oriented approach required by C# means we need to give some thought to the classes of object that will be required.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Data Model&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 2. shows the data model for an epub document drawn, literally, on the back of an envelope.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S08BgZ6eUYI/AAAAAAAAAJA/9UXGudpUoMI/s800/InsideEpub0018.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S08BgZ6eUYI/AAAAAAAAAJA/9UXGudpUoMI/s288/InsideEpub0018.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2. epub data model&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Each node in this model will either be developed as a C# class or be accessible as an XmlNode within one of those classes.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Example Class &lt;em&gt;epub&lt;/em&gt;&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;Figure 3. shows the class &lt;em&gt;epub&lt;/em&gt; which is used to model the contents of&amp;nbsp;an epub file. Most of the code is hidden in this image to enable you to see the structure. The class has Members, Constructors, Methods, and Properties.&lt;/span&gt;&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S08cpZTIOMI/AAAAAAAAAJM/bn6qUwdEMQY/s800/InsideEpub0019.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S08cpZTIOMI/AAAAAAAAAJM/bn6qUwdEMQY/s288/InsideEpub0019.jpg" /&gt;&lt;/a&gt; &lt;br /&gt;&lt;br /&gt;&lt;em&gt;Figure 3. 
epub class&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;In order not to clutter the posts in this thread, reference material for the code project, including class definitions,&amp;nbsp;can be found&amp;nbsp;at:&amp;nbsp;&lt;a href="http://netkingcol.blogspot.com/2010/01/inside-epub-code-reference.html"&gt;Inside Epub Code Reference&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;That's it for setting the scene. Next time I'll look at the user interface.&lt;br /&gt;&lt;hr /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Article Navigation&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Developing an epub editor &amp;lt;&amp;lt; Previous&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="http://netkingcol.blogspot.com/2010/01/user-interface-for-online-epub-editor.html"&gt;Next &amp;gt;&amp;gt;&lt;/a&gt;&lt;br /&gt;Exploring epub standards: &lt;a href="http://netkingcol.blogspot.com/2009/12/introduction-to-epub.html"&gt;Introduction&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-2323692444558541569?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/2323692444558541569'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/2323692444558541569'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/develop-your-own-epub-editor.html' title='Develop your own epub editor'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://lh6.ggpht.com/_cvaF-9-3DHs/S07YbA3bUXI/AAAAAAAAAIk/nSLIN3fy3ks/s72-c/InsideEpub0017.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-4162937885714338623</id><published>2010-01-14T14:48:00.001Z</published><updated>2010-01-21T15:47:18.609Z</updated><title type='text'>Inside Epub Code Reference</title><content type='html'>Documentation for the Class Library developed for this project is available at &lt;a href="http://www.hazelhurst.net/insideepub/"&gt;www.hazelhurst.net/insideepub/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-4162937885714338623?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/4162937885714338623'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/4162937885714338623'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/inside-epub-code-reference.html' title='Inside Epub Code Reference'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-132854310750150373</id><published>2010-01-14T14:41:00.002Z</published><updated>2010-01-14T14:44:37.764Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Container'/><category scheme='http://www.blogger.com/atom/ns#' term='class'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='Package'/><category 
scheme='http://www.blogger.com/atom/ns#' term='epub'/><title type='text'>Class epub</title><content type='html'>&lt;span style="color: black;"&gt;This class models an epub ebook. The approach taken is to use the Zip handler only to open and save a .epub file. All other operations will use the files in the folder to which the Zip file is extracted.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;&lt;em&gt;epub&lt;/em&gt; Members&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Class epub has the following private members:&lt;br /&gt;&lt;br /&gt;&lt;table border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Member&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;_container&lt;br /&gt;&lt;/td&gt;&lt;td&gt;_container is a private instance of class &lt;em&gt;container&lt;/em&gt;. It gives access to the rest of the epub documents in the ebook, mainly through its package instance.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;_filePath&lt;br /&gt;&lt;/td&gt;&lt;td&gt;_filePath is a string holding the location in the file system where the .epub for this instance of epub is located.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;&lt;em&gt;epub &lt;/em&gt;Constructor&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The class has a constructor with the signature:&lt;br /&gt;&lt;blockquote&gt;public epub(string eBookPath, string fileSystemPath)&lt;br /&gt;&lt;/blockquote&gt;This allows the caller to create an &lt;em&gt;epub&lt;/em&gt; instance which will locate and open the ebook at location&amp;nbsp;&lt;em&gt;eBookPath. 
&lt;/em&gt;The ebook is extracted from the .epub and written to the file system at location &lt;em&gt;fileSystemPath&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;&lt;em&gt;epub&lt;/em&gt; Methods&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Class &lt;em&gt;epub&lt;/em&gt; exposes the following methods:&lt;br /&gt;&lt;br /&gt;&lt;table border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Method&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Create&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Creates a new epub ebook with the given title at the given location. Uses the given working folder path to store the files that will be zipped into the .epub.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Open&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Open an existing .epub file and expand it into the given working folder.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Save&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Save the epub instance as a .epub file at the given location.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;SaveEntry&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Save the given text of a content document.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;&lt;em&gt;epub &lt;/em&gt;Properties&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Class epub exposes the following properties:&lt;br /&gt;&lt;br /&gt;&lt;table border="1"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Property&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;container&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Get or set the &lt;em&gt;container&lt;/em&gt; instance belonging to the current instance of &lt;em&gt;epub&lt;/em&gt;. 
This property is widely used to gain access to most of the ebook's package and NCX data.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Code of the &lt;em&gt;epub&lt;/em&gt; class (14Jan10 14:20)&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;code&gt;The code of the &lt;em&gt;epub&lt;/em&gt; class is shown below.&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;#region using&lt;br /&gt;using System;&lt;br /&gt;using System.IO;&lt;br /&gt;#endregion&lt;br /&gt;&lt;br /&gt;namespace net.hazelhurst.epub&lt;br /&gt;{&lt;br /&gt;public class epub&lt;br /&gt;{&lt;br /&gt;#region Members&lt;br /&gt;private container _container;&lt;br /&gt;private string _filePath; &lt;br /&gt;#endregion &lt;br /&gt;&lt;br /&gt;#region Constructors&lt;br /&gt;public epub(string eBookPath, string fileSystemPath)&lt;br /&gt;{&lt;br /&gt;Open(eBookPath, fileSystemPath);&lt;br /&gt;} &lt;br /&gt;#endregion&lt;br /&gt;&lt;br /&gt;#region Methods&lt;br /&gt;public static epub Create(string title, string booksPath, out string workPath)&lt;br /&gt;{&lt;br /&gt;epub result = null;&lt;br /&gt;string emptyBookPath;&lt;br /&gt;&lt;br /&gt;workPath = booksPath + title + "\\";&lt;br /&gt;if (!Directory.Exists(workPath) &amp;amp;&amp;amp; !File.Exists(booksPath + title + ".epub"))&lt;br /&gt;{&lt;br /&gt;string newbook = booksPath + title + ".epub";&lt;br /&gt;emptyBookPath = booksPath + "empty\\";&lt;br /&gt;File.Copy(emptyBookPath + "empty.zip", newbook);&lt;br /&gt;result = new epub(newbook, workPath);&lt;br /&gt;result.container.package.Title = title;&lt;br /&gt;}&lt;br /&gt;else&lt;br /&gt;{&lt;br /&gt;throw new Exception("A book with that title already exists");&lt;br /&gt;}&lt;br /&gt;return result;&lt;br /&gt;} &lt;br /&gt;public bool Open(string eBookPath, string fileSystemPath)&lt;br /&gt;{&lt;br /&gt;bool bookOpen = false;&lt;br /&gt;&lt;br /&gt;_filePath = eBookPath;&lt;br /&gt;&lt;br /&gt;// get the book's container&lt;br 
/&gt;_container = new container(eBookPath, fileSystemPath);&lt;br /&gt;&lt;br /&gt;// if the container is available&lt;br /&gt;if (_container != null)&lt;br /&gt;{&lt;br /&gt;bookOpen = true;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;return bookOpen;&lt;br /&gt;}&lt;br /&gt;public bool Save()&lt;br /&gt;{&lt;br /&gt;bool result = false;&lt;br /&gt;_container.Save();&lt;br /&gt;// reached only if the container saved without throwing&lt;br /&gt;result = true;&lt;br /&gt;//ZipFileHandler.Save(Title, filePath, outputPath);&lt;br /&gt;return result;&lt;br /&gt;}&lt;br /&gt;public bool SaveEntry(string fileName, string content)&lt;br /&gt;{&lt;br /&gt;bool result = false;&lt;br /&gt;&lt;br /&gt;try&lt;br /&gt;{&lt;br /&gt;_container.SaveEntry(fileName, content);&lt;br /&gt;result = true;&lt;br /&gt;}&lt;br /&gt;catch (Exception)&lt;br /&gt;{&lt;br /&gt;// rethrow, preserving the original stack trace&lt;br /&gt;throw;&lt;br /&gt;}&lt;br /&gt;return result;&lt;br /&gt;}&lt;br /&gt;#endregion&lt;br /&gt;&lt;br /&gt;#region Properties&lt;br /&gt;public container container&lt;br /&gt;{&lt;br /&gt;get&lt;br /&gt;{&lt;br /&gt;return _container;&lt;br /&gt;}&lt;br /&gt;set&lt;br /&gt;{&lt;br /&gt;_container = value;&lt;br /&gt;}&lt;br /&gt;}&lt;br /&gt;public string Title&lt;br /&gt;{&lt;br /&gt;get { return _container.package.Title; }&lt;br /&gt;set { _container.package.Title = value; }&lt;br /&gt;} &lt;br /&gt;#endregion&lt;br /&gt;&lt;br /&gt;} //epub&lt;br /&gt;}&lt;br /&gt;&lt;/code&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/132854310750150373'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/132854310750150373'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/class-epub.html' title='Class 
epub'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-2718886034442561124</id><published>2010-01-06T18:00:00.005Z</published><updated>2010-02-15T14:41:57.041Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='PDF'/><category scheme='http://www.blogger.com/atom/ns#' term='WinZip'/><category scheme='http://www.blogger.com/atom/ns#' term='IDPF'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Packaging Format'/><category scheme='http://www.blogger.com/atom/ns#' term='Smashwords'/><category scheme='http://www.blogger.com/atom/ns#' term='OCF'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='Zip'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Container Format'/><category scheme='http://www.blogger.com/atom/ns#' term='OPF'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><title type='text'>OPS and PDF in one OCF container</title><content type='html'>I showed you what an OCF container file looked like&amp;nbsp;in the topic: &lt;a href="http://netkingcol.blogspot.com/2010/01/how-standards-work-together.html"&gt;How the standards work together&lt;/a&gt;. Here it is again:&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S0CMbIf0i_I/AAAAAAAAAE4/1UCNxoFno6g/s800/InsideEpub0006.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S0CMbIf0i_I/AAAAAAAAAE4/1UCNxoFno6g/s288/InsideEpub0006.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. 
Container with one &amp;lt;rootfile&amp;gt; element&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The container is&amp;nbsp;always called &lt;em&gt;container.xml.&amp;nbsp;&lt;/em&gt;The&amp;nbsp;example&amp;nbsp;has a &amp;lt;rootfile&amp;gt; element in it that points to the OPF package file&amp;nbsp;&lt;em&gt;ebp.opf &lt;/em&gt;in&amp;nbsp;folder &lt;em&gt;OPS.&amp;nbsp;&lt;/em&gt;That folder also holds all of the other items in the manifest. &lt;br /&gt;&lt;br /&gt;A key strength of epub is that it delivers text that&amp;nbsp;will fit easily on any size of reading device, from the smallest smartphone to the largest PC monitor. Text will flow smoothly from one&amp;nbsp;screen to the next regardless of&amp;nbsp;the size of screen or the size of font that the reader prefers.&lt;br /&gt;&lt;br /&gt;Supposing you want to give the reader the ability to print the book in an attractive format. PDF would be a good choice for this.&amp;nbsp;Sites like &lt;a href="http://www.smashwords.com/books/view/6855"&gt;Smashwords&lt;/a&gt;&amp;nbsp;offer separate downloads of epub and PDF versions of a book, but they could be bundled together using the Open Container Format. The&amp;nbsp;OCF is a general-purpose container specification and you can store more than one version of your publication in it.&lt;br /&gt;&lt;br /&gt;Figure 2. shows &lt;em&gt;container.xml&lt;/em&gt; for a sample book available in the &lt;a href="http://www.idpf.org/forums/viewtopic.php?t=54&amp;amp;sid=fbe41d710cbcb60b7572f0e9ef671025"&gt;IDPF forums&lt;/a&gt;. 
The book&amp;nbsp;contains&amp;nbsp;some Sherlock Holmes stories and Figure 2.&amp;nbsp;shows how two renditions of the text - OPS and PDF -&amp;nbsp;are represented:&lt;br /&gt;&lt;a href="http://lh6.ggpht.com/_cvaF-9-3DHs/S0S8emRX-qI/AAAAAAAAAHg/6FhOc8XjKQM/s800/InsideEpub0015.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh6.ggpht.com/_cvaF-9-3DHs/S0S8emRX-qI/AAAAAAAAAHg/6FhOc8XjKQM/s288/InsideEpub0015.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2. Container with two renditions&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Quite simply, there are two &amp;lt;rootfile&amp;gt; elements present, one for each rendition:&lt;br /&gt;&lt;blockquote&gt;&amp;lt;&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;rootfiles&lt;/strong&gt;&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;rootfile&lt;/strong&gt;&lt;/span&gt; &lt;strong&gt;&lt;span style="color: #cc0000;"&gt;full-path&lt;/span&gt;&lt;/strong&gt;="&lt;strong&gt;&lt;span style="color: blue;"&gt;OEBPS/content.opf&lt;/span&gt;&lt;/strong&gt;" &lt;strong&gt;&lt;span style="color: #cc0000;"&gt;media-type&lt;/span&gt;&lt;/strong&gt;="&lt;strong&gt;&lt;span style="color: blue;"&gt;application/oebps-package+xml&lt;/span&gt;&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;rootfile&lt;/strong&gt;&lt;/span&gt; &lt;strong&gt;&lt;span style="color: #cc0000;"&gt;full-path&lt;/span&gt;&lt;/strong&gt;="&lt;strong&gt;&lt;span style="color: blue;"&gt;PDF/Holmes.pdf&lt;/span&gt;&lt;/strong&gt;" &lt;strong&gt;&lt;span style="color: #cc0000;"&gt;media-type&lt;/span&gt;&lt;/strong&gt;="&lt;strong&gt;&lt;span style="color: blue;"&gt;application/pdf&lt;/span&gt;&lt;/strong&gt;"/&amp;gt;&lt;br /&gt;&amp;lt;/&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;rootfiles&lt;/strong&gt;&lt;/span&gt;&amp;gt;&lt;/blockquote&gt;The first &amp;lt;rootfile&amp;gt; identifies the OPF 
package, as in the earlier example. Reading software designed to handle OPF packages would use this &amp;lt;rootfile&amp;gt; to locate the package and render the book from it.&lt;br /&gt;&lt;br /&gt;The second &amp;lt;rootfile&amp;gt; points to the file &lt;em&gt;Holmes.pdf&lt;/em&gt; in folder &lt;em&gt;PDF&lt;/em&gt; and the media-type indicates that this should be handled as a PDF file. Here, then, is&amp;nbsp;the alternate rendition of the text. Notice that the container follows the OCF recommendation that alternate renditions should be placed in dedicated folders. Figure 3. illustrates this by showing the folder structure in a&amp;nbsp;WinZip® view of the epub file.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S0S8e-szDfI/AAAAAAAAAHk/sQrp_L69JjM/s800/InsideEpub0016.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S0S8e-szDfI/AAAAAAAAAHk/sQrp_L69JjM/s288/InsideEpub0016.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 3. Dedicated folders for alternate renditions&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The PDF file is held in a distinct folder in the Zip archive.&lt;br /&gt;&lt;br /&gt;Not now, but in the future, you may be able to store more than one publication in an OCF container, for example the complete works of an author held as separate ebooks. 
This would require the reading software to read the container and present the list of contained ebooks to the reader.&lt;br /&gt;&lt;br /&gt;Article Navigation&lt;br /&gt;&lt;a href="http://netkingcol.blogspot.com/2010/02/xml-islands-in-epub-publications.html"&gt;Next &amp;gt;&amp;gt;&lt;/a&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/2718886034442561124'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/2718886034442561124'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/ops-and-pdf-in-one-ocf-container.html' title='OPS and PDF in one OCF container'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh5.ggpht.com/_cvaF-9-3DHs/S0CMbIf0i_I/AAAAAAAAAE4/1UCNxoFno6g/s72-c/InsideEpub0006.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-720551080970369049</id><published>2010-01-05T16:00:00.001Z</published><updated>2010-01-06T14:35:29.226Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='DAISY'/><category scheme='http://www.blogger.com/atom/ns#' term='NCX'/><category scheme='http://www.blogger.com/atom/ns#' term='NISO'/><category scheme='http://www.blogger.com/atom/ns#' term='navPoint'/><category scheme='http://www.blogger.com/atom/ns#' term='navMap'/><category 
scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='Adobe Digital Editions'/><category scheme='http://www.blogger.com/atom/ns#' term='Package'/><title type='text'>NCX navigation in epub books</title><content type='html'>We saw in an earlier post how the &amp;lt;spine&amp;gt; element in the package document is used to provide a linear reading order for the content documents of an epub publication. An ebook reader could use the spine data to retrieve and chain together the book's content documents and present them to you, starting at the first screen of the first document.&lt;br /&gt;&lt;br /&gt;But, suppose you're looking at Victor Hugo's &lt;em&gt;Les Misérables&lt;/em&gt; and you want to find the passage where Jean Valjean is carrying the wounded Marius through the sewers of Paris. How to find it? You don't want to scroll forwards through the book from the beginning (&lt;em&gt;my Penguin Classics version has 1200 pages and I'd be looking for page 1083&lt;/em&gt;). What you need is a Table of Contents that would guide you to Part Five, Book III, Chapter III. The epub specifications make this possible through the use of an XML document that follows yet another open standard.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;The DAISY Consortium and NCX&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The &lt;a href="http://www.daisy.org/about-us"&gt;&lt;strong&gt;D&lt;/strong&gt;igital &lt;strong&gt;A&lt;/strong&gt;ccessible &lt;strong&gt;I&lt;/strong&gt;nformation &lt;strong&gt;Sy&lt;/strong&gt;stem&amp;nbsp;consortium&lt;/a&gt; is an organisation aiming&amp;nbsp;"to lead the worldwide transition from analog to Digital Talking Books." As such, they have been&amp;nbsp;heavily involved in developing and maintaining information standards. 
The DAISY/NISO standard entitled &lt;a href="http://www.daisy.org/z3986/2005/Z3986-2005.html"&gt;DAISY Specifications for the Digital Talking Book&lt;/a&gt;&amp;nbsp;is the basis of the approach taken by the International Digital Publishing Forum (IDPF) to providing a Table of Contents in an epub publication.&lt;br /&gt;&lt;br /&gt;The standard has an acronym. It's called &lt;a href="http://www.daisy.org/z3986/2005/Z3986-2005.html#NCX"&gt;NCX&lt;/a&gt;. The Open Packaging Format specification offers two meanings for the acronym:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Navigation Center eXtended&lt;/li&gt;&lt;li&gt;Navigation Control for XML applications&lt;/li&gt;&lt;/ul&gt;Take your pick, it doesn't really matter. The important thing is that the standard is followed. To give you immediately an idea of how NCX can be used to present a Table of Contents, take a look at Figure 1.&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S0M3hBeI5xI/AAAAAAAAAF4/-y_eoMQX1Nk/s800/InsideEpub0010.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S0M3hBeI5xI/AAAAAAAAAF4/-y_eoMQX1Nk/s288/InsideEpub0010.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. NCX for The Curious Case of Benjamin Button&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The screenshot in Figure 1. shows the NCX document that's shipped with the epubBooks version of &lt;em&gt;The Curious Case of Benjamin Button&lt;/em&gt;. In this view, I've collapsed the &amp;lt;head&amp;gt;, &amp;lt;docTitle&amp;gt;, and &amp;lt;docAuthor&amp;gt; elements in order to concentrate on the &amp;lt;navMap&amp;gt; information. &lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;A simple NCX example&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;A book can be said to have a hierarchical structure -&amp;nbsp;it may have Parts which comprise Chapters which in turn may be divided into Sections and Sub-sections. 
This hierarchy is expressed in the&amp;nbsp;navMap section of the NCX document. Starting with&amp;nbsp;our simple example, which has no hierarchy&amp;nbsp;to speak of, take a look at Figure 2, below.&lt;br /&gt;&lt;blockquote&gt;&amp;lt;&lt;span style="color: #b45f06;"&gt;navMap&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;navPoint&lt;/span&gt; &lt;span style="color: #cc0000;"&gt;id&lt;/span&gt;="&lt;span style="color: blue;"&gt;navpoint-1&lt;/span&gt;" &lt;span style="color: #cc0000;"&gt;playOrder&lt;/span&gt;="&lt;span style="color: blue;"&gt;1&lt;/span&gt;"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;navLabel&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;text&lt;/span&gt;&amp;gt;Title Page&amp;lt;/&lt;span style="color: #b45f06;"&gt;text&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: #b45f06;"&gt;navLabel&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;content&lt;/span&gt; &lt;span style="color: #990000;"&gt;src&lt;/span&gt;="&lt;span style="color: blue;"&gt;title.xml&lt;/span&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: #b45f06;"&gt;navPoint&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;navPoint&lt;/span&gt; &lt;span style="color: #cc0000;"&gt;id&lt;/span&gt;="&lt;span style="color: blue;"&gt;navpoint-2&lt;/span&gt;" &lt;span style="color: #cc0000;"&gt;playOrder&lt;/span&gt;="&lt;span style="color: blue;"&gt;2&lt;/span&gt;"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;navLabel&lt;/span&gt;&amp;gt;&lt;br 
/&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;text&lt;/span&gt;&amp;gt;epubBooks Information&amp;lt;/&lt;span style="color: #b45f06;"&gt;text&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: #b45f06;"&gt;navLabel&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;content&lt;/span&gt; &lt;span style="color: #cc0000;"&gt;src&lt;/span&gt;="&lt;span style="color: blue;"&gt;epubbooksinfo.xml&lt;/span&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: #b45f06;"&gt;navPoint&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;navPoint&lt;/span&gt; &lt;span style="color: #cc0000;"&gt;id&lt;/span&gt;="&lt;span style="color: blue;"&gt;navpoint-3&lt;/span&gt;" &lt;span style="color: #cc0000;"&gt;playOrder&lt;/span&gt;="&lt;span style="color: blue;"&gt;3&lt;/span&gt;"&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;navLabel&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;text&lt;/span&gt;&amp;gt;1&amp;lt;/&lt;span style="color: #b45f06;"&gt;text&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: #b45f06;"&gt;navLabel&lt;/span&gt;&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: #b45f06;"&gt;content&lt;/span&gt; &lt;span style="color: #cc0000;"&gt;src&lt;/span&gt;="&lt;span style="color: blue;"&gt;chapter-001.xml&lt;/span&gt;"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: #b45f06;"&gt;navPoint&lt;/span&gt;&amp;gt;&lt;br /&gt;&lt;/blockquote&gt;&lt;em&gt;Figure 2. 
A simple navMap&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The&amp;nbsp;&amp;lt;navMap&amp;gt; element can have any number of &amp;lt;navPoint&amp;gt; elements. Each &amp;lt;navPoint&amp;gt; identifies a significant subdivision of the book to which the reader may navigate directly - for instance to the start of a given chapter. The &amp;lt;navPoint&amp;gt; element has the attributes 'id' and 'playOrder'. 'id' is a unique identifier and 'playOrder' is a number, starting from 1,&amp;nbsp;that indicates the position of the navPoint in the sequence of content documents making up the publication.&lt;br /&gt;&lt;br /&gt;A &amp;lt;navPoint&amp;gt; element contains a&amp;nbsp;&amp;lt;navLabel&amp;gt; element and a &amp;lt;content&amp;gt; element. The &amp;lt;navLabel&amp;gt; has a &amp;lt;text&amp;gt; element which holds the text that will be displayed in the Table of Contents. The &amp;lt;content&amp;gt; element has a 'src' attribute that tells the reading software the name of the content document to display.&lt;br /&gt;&lt;br /&gt;Figure 3. shows &lt;em&gt;The Curious Case of Benjamin Button&lt;/em&gt; opened using Adobe Digital Editions®. You can see how the &amp;lt;text&amp;gt; values from each &amp;lt;navPoint&amp;gt; are displayed in the Table of Contents.&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S0NPcwKtVnI/AAAAAAAAAGA/5diSCoUtYvw/s800/InsideEpub0012.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S0NPcwKtVnI/AAAAAAAAAGA/5diSCoUtYvw/s288/InsideEpub0012.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 3. 
NCX Table of Contents in Adobe Digital Editions®&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;When the reader clicks on chapter 1, represented by the imaginatively selected&amp;nbsp;"1" in the Table of Contents, the reading software looks up the &amp;lt;content&amp;gt; element of that &amp;lt;navPoint&amp;gt; and fetches 'chapter-001.xml' for display.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;An NCX with two levels&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;A &amp;lt;navPoint&amp;gt; may itself contain further &amp;lt;navPoint&amp;gt; elements. That is how the hierarchical structure of the book is expressed. In the simple example above the nesting is only one level deep, but take a look at Figure 4.&lt;br /&gt;&lt;blockquote&gt;&amp;lt;&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;navPoint&lt;/strong&gt;&lt;/span&gt; id="navpoint-3" playOrder="3"&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;navLabel&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;text&amp;gt;PART ONE&amp;lt;/text&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/navLabel&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;content src="part-01.xml"/&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: blue;"&gt;&lt;strong&gt;navPoint&lt;/strong&gt;&lt;/span&gt; id="navpoint-4" playOrder="4"&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;navLabel&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;text&amp;gt;CHAPTER I&amp;lt;/text&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/navLabel&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;content src="chapter-001.xml"/&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: blue;"&gt;&lt;strong&gt;navPoint&lt;/strong&gt;&lt;/span&gt;&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;span style="color: 
blue;"&gt;&lt;strong&gt;navPoint&lt;/strong&gt;&lt;/span&gt; id="navpoint-5" playOrder="5"&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;navLabel&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;text&amp;gt;CHAPTER II&amp;lt;/text&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/navLabel&amp;gt; &lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;content src="chapter-002.xml"/&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;lt;/&lt;span style="color: blue;"&gt;&lt;strong&gt;navPoint&lt;/strong&gt;&lt;/span&gt;&amp;gt;&lt;br /&gt;&lt;br /&gt;...intervening navPoints in Part One&lt;br /&gt;&lt;br /&gt;&amp;lt;/&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;navPoint&lt;/strong&gt;&lt;/span&gt;&amp;gt; (end of navpoint-3)&lt;br /&gt;&lt;/blockquote&gt;&lt;em&gt;Figure 4. navPoints nested to two levels&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The NCX extract in Figure 4. is taken from Jules Verne's &lt;em&gt;20,000 Leagues Under the Sea&lt;/em&gt;, also downloaded from epubBooks. I've changed the colour coding of the elements to show the hierarchical relationship between the navPoints.&lt;br /&gt;&lt;br /&gt;navpoint-3 introduces Part One of the book. As before, it has a navLabel (PART ONE) and its content document is called 'part-01.xml'. Next comes another navPoint &lt;em&gt;within&lt;/em&gt; the Part One navPoint, i.e. before navpoint-3's closing &amp;lt;/navPoint&amp;gt; tag. This indicates that the new navPoint, and every other navPoint that appears before that closing tag,&amp;nbsp;belongs to Part One of the book. Figure 5. 
shows how this looks in Adobe Digital Editions®:&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S0NdFSXYC2I/AAAAAAAAAGE/nlCyKyTvfsQ/s800/InsideEpub0013.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S0NdFSXYC2I/AAAAAAAAAGE/nlCyKyTvfsQ/s288/InsideEpub0013.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 5. Two-level NCX&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Notice how the chapter titles are indented within PART ONE. The standard does not impose any limit on the depth of nesting, so if you're a corporate lawyer drawing up an eContract you can create sub-sections to your heart's content. Also, NCX doesn't impose names like part, chapter, and section on the structure of a book. It's left entirely to the publisher how they want to name these subdivisions; the NCX document holds whatever you choose in the navLabel/text entries.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Ebooks are nothing without the writing&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;To finish off this topic, I went and found my all-time favourite book,&amp;nbsp;&lt;em&gt;Les Misérables,&lt;/em&gt;&amp;nbsp;and opened it in Adobe Digital Editions to locate that passage I mentioned earlier. Figure 6. shows how epubBooks built the NCX for this book. They have three levels of nesting, dividing the work into Volumes, Books, and Chapters.&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S0SdsWm9XXI/AAAAAAAAAHE/PApNK8Q3Tw0/s800/InsideEpub0014.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S0SdsWm9XXI/AAAAAAAAAHE/PApNK8Q3Tw0/s288/InsideEpub0014.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 6. 
Jean Valjean carrying Marius&lt;/em&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/720551080970369049'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/720551080970369049'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/ncx-navigation-in-epub-books.html' title='NCX navigation in epub books'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh4.ggpht.com/_cvaF-9-3DHs/S0M3hBeI5xI/AAAAAAAAAF4/-y_eoMQX1Nk/s72-c/InsideEpub0010.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-5601511497967067604</id><published>2010-01-04T15:09:00.012Z</published><updated>2010-01-04T16:02:59.462Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='ISBN'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Packaging Format'/><category scheme='http://www.blogger.com/atom/ns#' term='XHTML'/><category scheme='http://www.blogger.com/atom/ns#' term='GUID'/><category scheme='http://www.blogger.com/atom/ns#' term='metadata'/><category scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='manifest'/><category scheme='http://www.blogger.com/atom/ns#' term='Package'/><category scheme='http://www.blogger.com/atom/ns#' term='OPF'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Dublin Core'/><category scheme='http://www.blogger.com/atom/ns#' term='spine'/><title type='text'>A closer look at OPF</title><content type='html'>Let's take a closer look at the Open Packaging Format. &lt;br /&gt;&lt;br /&gt;Recall from the&amp;nbsp;previous post&amp;nbsp;that an electronic publication conforming to the OPF standard must provide a package document. This must be&amp;nbsp;an XML document with a root element of &amp;lt;package&amp;gt;&amp;nbsp;which includes elements called &amp;lt;metadata&amp;gt;, &amp;lt;manifest&amp;gt;, and &amp;lt;spine&amp;gt;.&amp;nbsp;Figure 1. shows an overview of the package document that is delivered with our sample epub ebook.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/Sz-YE494IgI/AAAAAAAAAEc/VCiyLGXqtHg/s800/InsideEpub0005.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/Sz-YE494IgI/AAAAAAAAAEc/VCiyLGXqtHg/s288/InsideEpub0005.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. Package Overview&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;You can see this document has the correct set of package elements. Before looking at each in detail, there are two things I'd like to point out on this screenshot. First, the unique-identifier attribute in the &amp;lt;package&amp;gt;:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: red;"&gt;&lt;span style="color: #990000;"&gt;unique-identifier&lt;/span&gt;&lt;span style="color: black;"&gt;="&lt;/span&gt;&lt;span style="color: blue;"&gt;EPB-UUID&lt;/span&gt;&lt;/span&gt;&lt;span style="color: black;"&gt;"&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;This attribute tells the software of the reading device to look out for a metadata item of type &amp;lt;dc:identifier&amp;gt; that has an 'id' attribute with the value 'EPB-UUID'. The value of this element is an identifier that is globally unique. 
Towards the bottom of the metadata, when we look at it,&amp;nbsp;you will find the following element:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;dc:identifier&lt;/span&gt;&lt;span style="color: red;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;&lt;span style="color: #990000;"&gt;id&lt;/span&gt;="&lt;span style="color: blue;"&gt;EPB-UUID&lt;/span&gt;"&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; urn:uuid:CBC56AFC-6C29-1014-8672-92A1DF1F0AF1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;/&lt;span style="color: #b45f06;"&gt;dc:identifier&lt;/span&gt;&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;That 32-digit hexadecimal value is the GUID (Globally Unique Identifier) generated by epubBooks to identify this particular publication. Of course, an ISBN is another globally unique identifier, and if you are in the publishing business and routinely buy ISBNs by the dozen, you would probably insert the ISBN here. &lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;Second, the attributes in the &amp;lt;metadata&amp;gt; element:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: #990000;"&gt;xmlns:opf&lt;/span&gt;="&lt;span style="color: black;"&gt;&lt;/span&gt;&lt;span style="color: blue;"&gt;http://www.idpf.org/2007/opf&lt;span style="color: black;"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: #990000;"&gt;xmlns:dc&lt;/span&gt;="&lt;span style="color: black;"&gt;&lt;/span&gt;&lt;span style="color: blue;"&gt;http://purl.org/dc/elements/1.1&lt;/span&gt;"&lt;br /&gt;&lt;/blockquote&gt;indicate that the prefix 'dc' refers to the &lt;a href="http://www.dublincore.org/"&gt;Dublin Core&lt;/a&gt; element definitions (see below for more detail), and that the 'opf' prefix refers to OPF extensions to the Dublin Core. 
For instance, look at the two date elements in the metadata:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;dc:date&lt;/span&gt; &lt;span style="color: #990000;"&gt;opf:event&lt;/span&gt;="&lt;span style="color: blue;"&gt;original-publication&lt;/span&gt;"&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;span style="color: black;"&gt;1922&lt;/span&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;/&lt;span style="color: #b45f06;"&gt;dc:date&lt;/span&gt;&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;dc:date&lt;/span&gt; &lt;span style="color: #990000;"&gt;opf:event&lt;/span&gt;="&lt;span style="color: blue;"&gt;epub-publication&lt;/span&gt;"&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;span style="color: black;"&gt;2009-09-24&lt;/span&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;/&lt;span style="color: #b45f06;"&gt;dc:date&lt;/span&gt;&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;The 'dc' prefix on the 'date' elements identifies them as publication&amp;nbsp;dates that follow the Dublin Core specification. The 'opf' prefix on the 'event' attribute identifies 'event' as belonging to the OPF specification.&lt;br /&gt;&lt;br /&gt;Unfortunately, looking at the OPF specification, it seems the publisher is free to give the event attribute whatever value they like:&lt;br /&gt;&lt;blockquote&gt;"The set of values for event are not defined by this specification; possible values may include: &lt;em&gt;creation&lt;/em&gt;, &lt;em&gt;publication&lt;/em&gt;, and &lt;em&gt;modification&lt;/em&gt;."&lt;br /&gt;&lt;/blockquote&gt;Now, let's look in more detail at the package metadata. 
&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Package &amp;lt;metadata&amp;gt;&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The &amp;lt;metadata&amp;gt; element of the package can contain wide-ranging information about the publication. To keep OPF as open as possible, the metadata of an OPF package makes use of another open standard, namely the &lt;a href="http://www.dublincore.org/"&gt;Dublin Core Metadata&lt;/a&gt; standard. &lt;br /&gt;&lt;br /&gt;The Dublin Core is an initiative working towards standard ways of describing resources. It actively promotes the standardised sharing of information, thereby increasing interoperability between organisations - let's all agree to call a spade a spade and not a shovel or a digger. &lt;br /&gt;&lt;br /&gt;The Dublin Core has a wider scope than just ebooks. However, there is a rich set of attributes that &lt;em&gt;can &lt;/em&gt;be applied to electronic publications. Figure 2. shows the metadata that epubBooks placed in the package of &lt;em&gt;The Curious Case of Benjamin Button.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S0G3iL_6TzI/AAAAAAAAAFU/VkebRCTJJBg/s800/InsideEpub0007.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S0G3iL_6TzI/AAAAAAAAAFU/VkebRCTJJBg/s288/InsideEpub0007.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2. Package Metadata&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;For convenience, I'll list the metadata elements&amp;nbsp;again here:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;title&lt;/li&gt;&lt;li&gt;language&lt;/li&gt;&lt;li&gt;identifier&lt;/li&gt;&lt;li&gt;---------&lt;/li&gt;&lt;li&gt;creator&lt;/li&gt;&lt;li&gt;date(s)&lt;/li&gt;&lt;li&gt;publisher&lt;/li&gt;&lt;li&gt;subject&lt;/li&gt;&lt;li&gt;source&lt;/li&gt;&lt;li&gt;rights&lt;/li&gt;&lt;/ul&gt;The elements above the dashed line are mandatory. 
The OPF Package Schema says that there must be at least one title, at least one language element, and at least one identifier element. All other elements are optional.&lt;br /&gt;&lt;br /&gt;&lt;span style="color: #b45f06;"&gt;&lt;strong&gt;Title&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;In fact the schema says there must be 'One Or More' of the mandatory elements. In other words, there can be more than one title, more than one language, and more than one identifier. The standard does not specify which title should be displayed, only&amp;nbsp;that a reading device should choose 'the most appropriate title' for display, perhaps based on available fonts or language.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Identifier&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;There can also be more than one&amp;nbsp;identifier element in the metadata. We've seen above how the unique identifier is handled. If you wanted, you could&amp;nbsp;publish an ebook with several identifiers: your internally generated identifier, a GUID, and your ISBN. You then have to say which is to be considered the globally unique identifier.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Language&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The specification says there must be at least one &amp;lt;language&amp;gt; metadata element, but there may be more than one. 
I suppose if you were publishing an English-Mandarin dictionary or were writing a learned text about the Rosetta Stone you might have a reason to specify more than one language.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Full list of metadata elements&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;The following table summarises the full set of metadata elements that can appear in the &amp;lt;metadata&amp;gt; section of a &amp;lt;package&amp;gt;:&lt;br /&gt;&lt;br /&gt;&lt;table border="1" cellpadding="2"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Element&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Number&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;title&lt;br /&gt;&lt;/td&gt;&lt;td&gt;One or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The title of the publication. As we've seen, there can be more than one, but there must be at least one title.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;creator&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The primary creators or authors of the publication. The recommendation is that each element holds one name, written in the form in which it should be presented to the reader. When there's more than one creator, it's expected they would be displayed in the order in which the elements appear in the metadata. Other contributors should be identified in &amp;lt;contributor&amp;gt; elements.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;subject&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The subject matter of the publication. There is no standardisation here. 
The optional text could be a sentence, a list of keywords, or one keyword per element.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;description&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The description(s) of the publication.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;publisher&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The publisher(s) of the publication.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;contributor&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The person(s) making contributions to the publication in a manner that is secondary to the role of creator. OPF defines nearly 30 different roles as contributor and specifies the syntax for their identification.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;date&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The publication date(s) for the publication. We've already seen that OPF extends the Dublin Core definition of this element, allowing different 'event' dates to be recognised.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;type&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The type(s) that describe the publication. This is relatively free-form although the specification recommends using words from controlled vocabularies i.e. selecting from a restricted set of words. Terms relating to genre e.g. Young Adult, Fantasy, Literary, might be used as well as terms like Fiction, Non-Fiction etc.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;format&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;The media-types of the publication. 
The recommendation is to use a MIME type.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;identifier&lt;br /&gt;&lt;/td&gt;&lt;td&gt;One or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;One or more identifiers for the publication, one of which must be defined as a unique identifier. See the discussion above.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;source&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Identification of any other documents or publications from which the current publication is derived.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;language&lt;br /&gt;&lt;/td&gt;&lt;td&gt;One or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;One or more language identifiers.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;relation&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Identifier(s) of resources to which the current publication is related.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;coverage&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Identifier(s) of the scope of the publication. OPF recommends following the Dublin Core specification of &lt;a href="http://dublincore.org/documents/dcmi-terms/index.shtml#terms-coverage"&gt;coverage&lt;/a&gt;&amp;nbsp;and using a controlled vocabulary for&amp;nbsp;geographical, temporal, and juridical&amp;nbsp;descriptions.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;rights&lt;br /&gt;&lt;/td&gt;&lt;td&gt;Zero or more&lt;br /&gt;&lt;/td&gt;&lt;td&gt;An assertion of the rights of the publisher/creator with respect to this publication.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;em&gt;Table 1. Package metadata elements&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Package &amp;lt;manifest&amp;gt;&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 3. 
shows the &amp;lt;manifest&amp;gt; element of the OPF package for our sample epub ebook, &lt;em&gt;The Curious Case of Benjamin Button.&lt;/em&gt;&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/S0G3jGCxn8I/AAAAAAAAAFY/2fbcABweuXs/s800/InsideEpub0008.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/S0G3jGCxn8I/AAAAAAAAAFY/2fbcABweuXs/s288/InsideEpub0008.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 3. Package Manifest&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The package manifest identifies all of the resources that are needed to display the ebook fully and correctly. Each entry in the manifest consists of an &amp;lt;item&amp;gt; element, as in:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;item&lt;/span&gt; &lt;span style="color: #990000;"&gt;id&lt;/span&gt;="&lt;span style="color: blue;"&gt;chapter-001&lt;/span&gt;" &lt;span style="color: #990000;"&gt;href&lt;/span&gt;="&lt;span style="color: blue;"&gt;chapter-001.xml&lt;/span&gt;" &lt;span style="color: #990000;"&gt;media-type&lt;/span&gt;="&lt;span style="color: blue;"&gt;application/xhtml+xml&lt;/span&gt;"/&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;Each &amp;lt;item&amp;gt; element has an 'id' attribute which identifies this resource uniquely within the publication. It has an 'href' attribute which points to the content document; in the example above it's an XML document called 'chapter-001.xml'. The 'media-type' attribute in this example shows that the resource should be handled as an XHTML document.&lt;br /&gt;&lt;br /&gt;You can see that the manifest lists 13 content documents: a title page, a page of information about the publisher (epubBooks), and the 11 chapters of the story.&lt;br /&gt;&lt;br /&gt;Each content document includes a 'link' element that refers to the&amp;nbsp;CSS stylesheet 'book.css'. 
Therefore, the manifest includes an &amp;lt;item&amp;gt; for it:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;item&lt;/span&gt; &lt;span style="color: #990000;"&gt;id&lt;/span&gt;="&lt;span style="color: blue;"&gt;main-css&lt;/span&gt;" &lt;span style="color: #990000;"&gt;href&lt;/span&gt;="&lt;span style="color: blue;"&gt;css/book.css&lt;/span&gt;" &lt;span style="color: #990000;"&gt;media-type&lt;/span&gt;="&lt;span style="color: blue;"&gt;text/css&lt;/span&gt;"/&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;&lt;span style="color: black;"&gt;The publisher information document, epubbooksinfo.xml, includes an image which is the company logo. Therefore, the&amp;nbsp;manifest includes an &amp;lt;item&amp;gt; for it:&lt;/span&gt;&lt;span style="color: black;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: black;"&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;item&lt;/span&gt; &lt;span style="color: #990000;"&gt;id&lt;/span&gt;="&lt;span style="color: blue;"&gt;epubbooks-logo&lt;/span&gt;" &lt;span style="color: #990000;"&gt;href&lt;/span&gt;="&lt;span style="color: blue;"&gt;images/epubbooks-logo.png&lt;/span&gt;" &lt;span style="color: #990000;"&gt;media-type&lt;/span&gt;="&lt;span style="color: blue;"&gt;image/png&lt;/span&gt;"/&lt;/span&gt;&lt;span style="color: blue;"&gt;&amp;gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;The rule is: if the content documents use it, it must be in the manifest. There are some aspects of the manifest that will be reserved for a future post so they don't clutter up this presentation. 
These cover media-types that are not part of the OPS Core, Out-Of-Line XML Islands, and the use of fallback documents to support these non-standard documents.&lt;br /&gt;&lt;br /&gt;There's one more item in the manifest, and it's quite an important one:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;item&lt;/span&gt; &lt;span style="color: #990000;"&gt;id&lt;/span&gt;&lt;span style="color: black;"&gt;="&lt;/span&gt;&lt;span style="color: blue;"&gt;ncx&lt;/span&gt;&lt;span style="color: black;"&gt;"&lt;/span&gt; &lt;span style="color: #990000;"&gt;href&lt;/span&gt;&lt;span style="color: black;"&gt;="&lt;/span&gt;&lt;span style="color: blue;"&gt;epb.ncx&lt;/span&gt;&lt;span style="color: black;"&gt;"&lt;/span&gt; &lt;span style="color: #990000;"&gt;media-type&lt;/span&gt;&lt;span style="color: black;"&gt;="&lt;/span&gt;&lt;span style="color: blue;"&gt;application/x-dtbncx+xml&lt;span style="color: black;"&gt;"/&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;The 'id' attribute is set to 'ncx', the 'href' points to a file called 'epb.ncx', and the 'media-type' indicates that the resource should be handled as an NCX document. NCX is a standard way of declaring a Table of Contents. It's another open standard, this time maintained by the DAISY Consortium.&lt;br /&gt;&lt;br /&gt;This leads us nicely into the description of the third mandatory element of an OPF package - the &amp;lt;spine&amp;gt; element.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Package &amp;lt;spine&amp;gt;&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Figure 4. 
shows the expanded &amp;lt;spine&amp;gt; element of our sample package.&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/S0G3j-4zoqI/AAAAAAAAAFc/fOQiFUI_K88/s800/InsideEpub0009.jpg" target="_blank"&gt;&lt;br /&gt;&lt;img alt="Click to see the full image" src="http://lh4.ggpht.com/_cvaF-9-3DHs/S0G3j-4zoqI/AAAAAAAAAFc/fOQiFUI_K88/s288/InsideEpub0009.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 4. Package Spine&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The &amp;lt;spine&amp;gt; starts off like this:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;spine&lt;/span&gt; &lt;span style="color: #990000;"&gt;toc&lt;/span&gt;="&lt;span style="color: blue;"&gt;ncx&lt;/span&gt;"&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;itemref&lt;/span&gt; &lt;span style="color: #990000;"&gt;idref&lt;/span&gt;="&lt;span style="color: blue;"&gt;titlepage&lt;/span&gt;" &lt;span style="color: #990000;"&gt;linear&lt;/span&gt;="&lt;span style="color: blue;"&gt;yes&lt;/span&gt;"/&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;itemref&lt;/span&gt; &lt;span style="color: #990000;"&gt;idref&lt;/span&gt;="&lt;span style="color: blue;"&gt;epubbooksinfo&lt;/span&gt;" &lt;span style="color: #990000;"&gt;linear&lt;/span&gt;="&lt;span style="color: blue;"&gt;yes&lt;/span&gt;"/&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;itemref&lt;/span&gt; &lt;span style="color: #990000;"&gt;idref&lt;/span&gt;="&lt;span style="color: blue;"&gt;chapter-001&lt;/span&gt;" &lt;span style="color: #990000;"&gt;linear&lt;/span&gt;="&lt;span style="color: 
blue;"&gt;yes&lt;/span&gt;"/&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp; &amp;lt;&lt;/span&gt;&lt;span style="color: #b45f06;"&gt;itemref&lt;/span&gt; &lt;span style="color: #990000;"&gt;idref&lt;/span&gt;="&lt;span style="color: blue;"&gt;chapter-002&lt;/span&gt;" &lt;span style="color: #990000;"&gt;linear&lt;/span&gt;="&lt;span style="color: blue;"&gt;yes&lt;/span&gt;"/&lt;span style="color: blue;"&gt;&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;The &amp;lt;spine&amp;gt; element has a 'toc' attribute with the value 'ncx'. This is the value of the 'id' attribute&amp;nbsp;of the item in the manifest that points to the mandatory table of contents document. In other words, it's a way to identify&amp;nbsp;the table of contents.&amp;nbsp;We saw in the last section that the manifest item with an id of 'ncx' points to a document called 'epb.ncx'. We'll look at NCX documents in more detail in a future article.&lt;br /&gt;&lt;br /&gt;The next thing to notice about the &amp;lt;spine&amp;gt; is that it contains a list of &amp;lt;itemref&amp;gt; elements. Each &amp;lt;itemref&amp;gt; has an attribute called 'idref', and the value of an idref is the id of an item in the manifest. &lt;br /&gt;&lt;br /&gt;For example, the first idref has value 'titlepage'. Look back at the manifest screenshot and you'll see that the first content document in the manifest has id="titlepage", and that item points to the content document itself (titlepage.xml).&lt;br /&gt;&lt;br /&gt;The spine is a list of content documents and the important thing about the list is that it specifies the linear order in which the content documents should be displayed: title page, followed by the publisher's information page, followed by chapter 1, etc.&lt;br /&gt;&lt;br /&gt;The &amp;lt;itemref&amp;gt; element has an optional attribute called 'linear'. 
This attribute takes a yes/no value and is used to indicate whether the referenced document is primary or auxiliary. This can be used by reading devices to show auxiliary information in a different way from the main flow of the primary information. In our case, the values are all set to 'yes' which is the default.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/5601511497967067604'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/5601511497967067604'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/closer-look-at-opf.html' title='A closer look at OPF'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh3.ggpht.com/_cvaF-9-3DHs/Sz-YE494IgI/AAAAAAAAAEc/VCiyLGXqtHg/s72-c/InsideEpub0005.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-5840042700211759158</id><published>2010-01-03T12:15:00.116Z</published><updated>2010-01-06T16:45:12.663Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='IDPF'/><category scheme='http://www.blogger.com/atom/ns#' term='Container'/><category scheme='http://www.blogger.com/atom/ns#' term='XHTML'/><category scheme='http://www.blogger.com/atom/ns#' term='OPS'/><category scheme='http://www.blogger.com/atom/ns#' term='OCF'/><category 
scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='Package'/><category scheme='http://www.blogger.com/atom/ns#' term='OPF'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><title type='text'>How the standards work together</title><content type='html'>In earlier posts I've referred to the Open Publication Structure, the Open Container Format, and the Open Packaging Format. What helps to make these specifications open is that they make widespread use of other, industry-wide, open standards like XHTML, XML, CSS, MIME, among others. &lt;br /&gt;&lt;br /&gt;The IDPF standards documents make frequent reference to these other standards, and they all work together to make it possible for content providers - authors and publishers - to ensure that their publications are available on the widest range of reading devices. All they have to do is to create documents that conform to the standards. Likewise, hardware and software developers know that if they also&amp;nbsp;follow the standards their reading devices will be able to display all conforming publications.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Three Standards&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;If you looked at the IDPF website you might be wondering why there are three standards documents for epub publications. The reason is that they address different aspects of an electronic publication. It's widely recognised in the industry that features like content and presentation should be stored separately. 
Just as content is stored in XML and presentation is stored in CSS, &lt;em&gt;what&lt;/em&gt; is packaged in an epub document is described separately from &lt;em&gt;how&lt;/em&gt; it is packaged.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Book Analogy&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Imagine for a moment that you have in your hand a first edition copy of &lt;em&gt;The Curious Case of Benjamin Button&lt;/em&gt;. The words that Fitzgerald used belong to a vocabulary - the English language. The &lt;strong&gt;Open Publication Structure&lt;/strong&gt; defines the 'Preferred Vocabulary' of epub Content Documents - not the words you use to write a book, rather the XHTML elements that&amp;nbsp;you use to create the&amp;nbsp;structure of the book, building it&amp;nbsp;into sections, chapters, and parts.&lt;br /&gt;&lt;br /&gt;The paper book has several chapters that are printed in order (we hope). The book also has a title page and other items that are not part of the story but which &lt;a href="http://en.wikipedia.org/wiki/Book_design" target="_blank"&gt;book designers&lt;/a&gt; call Front Matter (Foreword, Preface, Acknowledgements etc.). The book may also have a Table of Contents in the Front Matter which helps the reader find their way around the content. The &lt;strong&gt;Open Packaging Format&lt;/strong&gt; defines how all of these parts of an epub document should be described.&lt;br /&gt;&lt;br /&gt;Finally, the book has a cover which binds the content into a single entity, as opposed to loose pages or separately bound chapters. The &lt;strong&gt;Open Container Format&lt;/strong&gt; specifies how the electronic Package, which is all of the necessary information for the book to comply with the epub standards, is held as a unit. In abstract terms the output of&amp;nbsp;this bundling together is called a Container. 
On your computer it might be represented as a folder in your file system or, as in our case, a Zip archive held in one file with the .epub extension.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Open Publication Structure&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;The Open Publication Structure is mostly concerned with which XHTML elements you can use to build a content document. Let's take a look at what we mean by a content document. The following image shows one of the shorter chapters from our sample ebook &lt;em&gt;The Curious Case of Benjamin Button&lt;/em&gt;.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/Sz9z3ayw0FI/AAAAAAAAAEY/0Z2XubVIq14/s800/InsideEpub0004.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/Sz9z3ayw0FI/AAAAAAAAAEY/0Z2XubVIq14/s288/InsideEpub0004.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 1. Sample Content Document&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;You can see from the first line that a content document is an XML document. The next declaration says that it contains an XHTML document, which means it contains a version of HTML that can be held as valid XML.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;If you're not sure about XML, take a look at the &lt;a href="http://www.w3schools.com/xml/default.asp"&gt;XML Tutorial&lt;/a&gt; at &lt;a href="http://www.w3schools.com/"&gt;W3Schools&lt;/a&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;This is followed by the XHTML content itself. 
You can see the usual &amp;lt;html&amp;gt;, &amp;lt;head&amp;gt;, and &amp;lt;body&amp;gt; elements that make up an XHTML document.&amp;nbsp;Let's look at the contents of the&amp;nbsp;&amp;lt;head&amp;gt; element first:&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue; font-size: x-small;"&gt;&amp;lt;link rel="stylesheet" href="css/book.css" type="text/css"/&amp;gt; &lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue; font-size: x-small;"&gt;&amp;lt;meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8"/&amp;gt;&lt;br /&gt;&amp;lt;meta name="EPB-UUID" content="CBC56AFC-6C29-1014-8672-..."/&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;&amp;nbsp;&lt;span style="color: black;"&gt;The &amp;lt;head&amp;gt; contains a &amp;lt;link&amp;gt; element that points to the external stylesheet called 'book.css' which can be found in the css folder. It also contains a &amp;lt;meta&amp;gt; element which declares that the content type of the document is 'application/xhtml+xml' and that the character encoding is UTF-8.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;Now take a look at the &amp;lt;body&amp;gt; of the document:&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="color: blue;"&gt;&amp;lt;div class="body"&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp; &amp;lt;div class="chapter"&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;h3 class="chapter-title"&amp;gt;&lt;/span&gt;&lt;span style="color: blue;"&gt;6&amp;lt;/h3&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;p&amp;gt;When, six months later, the...&amp;lt;/p&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: 
blue;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;p&amp;gt;The Sunday supplements of...&amp;lt;/p&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;p&amp;gt;However, every one agreed with...&amp;lt;/p&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;p&amp;gt;On the part of the two people most...&amp;lt;/p&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/div&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: blue;"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;/blockquote&gt;The &amp;lt;body&amp;gt; element contains two nested &amp;lt;div&amp;gt; elements, each with a different class, so that appropriate CSS styling will be applied. It contains a chapter title with &amp;lt;h3&amp;gt; styling, and four paragraph elements (&amp;lt;p&amp;gt;). &lt;br /&gt;&lt;br /&gt;What the Open Publication Structure tells you is which XHTML elements you can use to build a content document. 
The following extract from the standard shows how the elements we've just seen are defined as &lt;strong&gt;required&lt;/strong&gt; elements that all conforming epub reading software must accept and render.&lt;br /&gt;&lt;br /&gt;&lt;table border="1" cellpadding="2"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;XHTML 1.1 Module Name&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Elements&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Structure&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;body&lt;/strong&gt;, &lt;strong&gt;head&lt;/strong&gt;, &lt;strong&gt;html&lt;/strong&gt;, title&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Text&lt;br /&gt;&lt;/td&gt;&lt;td&gt;abbr, acronym, address, blockquote, br, cite, code, dfn, &lt;strong&gt;div&lt;/strong&gt;, em, h1, h2, &lt;strong&gt;h3&lt;/strong&gt;, h4, h5, h6, kbd, &lt;strong&gt;p&lt;/strong&gt;, pre, q, samp, span, strong, var&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Meta-information&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;meta&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Link&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;link&lt;/strong&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;em&gt;Table 1. Extract of Open Publication Structure Preferred Vocabulary&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The standard refers to XHTML Module Names because XHTML can be subdivided into areas of functionality called Modules. 
This allows designers and developers to know what they can use and what they might find in a document that allows or requires&amp;nbsp;the use of a given Module.&lt;br /&gt;&lt;br /&gt;There is a lot more to write about Open Publication Structure, but this introduction serves to show that, at its simplest, it specifies what can go into a content document.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Open Packaging Format&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;The Open Packaging Format is a standard that defines how the contents of an epub electronic publication are identified and specifies how they should be presented. Recalling the post &lt;a href="http://netkingcol.blogspot.com/2010/01/first-look-inside-epub-ebook.html"&gt;First look Inside an epub ebook&lt;/a&gt;, there's a file in the Zip archive called 'ebp.opf'. That .opf extension indicates that the file contains Open Packaging Format information.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In Figure 2. I've extracted ebp.opf from the Zip and renamed it as ebp_opf.xml. Then I've loaded the XML file into Visual Web Developer 2008 Express Edition and collapsed all but the root node and the first level of child nodes. This gives an overview of the package document.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/Sz-YE494IgI/AAAAAAAAAEc/VCiyLGXqtHg/s800/InsideEpub0005.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh3.ggpht.com/_cvaF-9-3DHs/Sz-YE494IgI/AAAAAAAAAEc/VCiyLGXqtHg/s288/InsideEpub0005.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 2. Package Overview&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;The &amp;lt;package&amp;gt; node&amp;nbsp;has three child nodes:&lt;br /&gt;&lt;/div&gt;&lt;ol&gt;&lt;li&gt;A &amp;lt;metadata&amp;gt; node which holds a range of information about the publication as a whole. 
This list is quite long and includes such information as: title, creator, subject, description, and publisher.&lt;/li&gt;&lt;li&gt;A &amp;lt;manifest&amp;gt; node which holds descriptions of every file making up the publication. The manifest lists all of the content files, style sheets that are referenced, and all images that are included in the content.&lt;/li&gt;&lt;li&gt;A &amp;lt;spine&amp;gt; node that includes an ordered list of the content documents, effectively defining their reading order or the order in which they will be displayed by a reading device.&lt;/li&gt;&lt;/ol&gt;We'll look in more detail at these elements in a future post, but notice that the &amp;lt;spine&amp;gt; node has a 'toc' attribute. 'toc' stands for Table of Contents, and the value of this attribute is the unique identifier (id) of the document that contains the table of contents. In our example the value is 'ncx'. This means that the manifest will include an item which has an 'id' attribute set to&amp;nbsp;"ncx" and the file that this item points to should be treated as the Table of Contents for the publication.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;Open Container Format&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Finally in this post giving an overview of how the IDPF standards work together, we'll take a quick&amp;nbsp;look at the Open Container Format. Remember from the book analogy how the container binds all of the package contents together. The OCF specifies how this should be done by defining a container system analogous to a set of folders in your file system. &lt;br /&gt;&lt;br /&gt;The standard states that there &lt;strong&gt;must&lt;/strong&gt; be a file called 'mimetype' at the root level of the container and that the mimetype file should contain only the ASCII string: "application/epub+zip". This lets reader software confirm that the file it's been asked to open&amp;nbsp;with extension .epub is intended to be an epub publication. 
There may be other problems that make it invalid, but the mimetype file is an assertion that the container and its contents should be handled as an epub document.&lt;br /&gt;&lt;br /&gt;To further qualify as a valid epub, the container file system must include a folder called 'META-INF' and that folder must contain a file called 'container.xml'. It may contain other files, which will be the subject of later posts in the blog, but it &lt;strong&gt;must&lt;/strong&gt; have container.xml.&lt;br /&gt;&lt;br /&gt;Figure 3. shows the contents of our container.xml, again opened in Visual Web Developer 2008 Express Edition.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://lh5.ggpht.com/_cvaF-9-3DHs/S0CMbIf0i_I/AAAAAAAAAE4/1UCNxoFno6g/s800/InsideEpub0006.jpg" target="_blank"&gt;&lt;img alt="Click to see the full image" src="http://lh5.ggpht.com/_cvaF-9-3DHs/S0CMbIf0i_I/AAAAAAAAAE4/1UCNxoFno6g/s288/InsideEpub0006.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;Figure 3. OCF Container&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;container.xml has a root node called &amp;lt;container&amp;gt;, which holds a &amp;lt;rootfiles&amp;gt; node. In our example &amp;lt;rootfiles&amp;gt; contains only one &amp;lt;rootfile&amp;gt; and that points to the package file: 'OPS/ebp.opf'.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color: #b45f06;"&gt;The standards working together&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;We've seen that our sample epub, &lt;em&gt;The Curious Case of Benjamin Button&lt;/em&gt;, has satisfied the requirements of the Open Container Format - it has a valid mimetype file and a META-INF folder with a valid container.xml.&lt;br /&gt;&lt;br /&gt;Reading software accessing the epub will open container.xml and use it to&amp;nbsp;locate and open the Open Packaging Format information in the indicated .opf file. Typically, the Table of Contents will be opened and displayed and the first content document will be displayed in a viewing area. 
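&lt;br /&gt;&lt;br /&gt;The sequence just described can be sketched in a few lines of code. What follows is only a minimal illustration in Python (not the C# class model this series will develop), using the standard zipfile and xml.etree.ElementTree modules. The helper name read_epub_outline is hypothetical, the two namespace URIs are the ones I understand the OCF 1.0 and OPF 2.0 specifications to use, and the sketch assumes a well-formed, unencrypted epub.&lt;br /&gt;&lt;pre&gt;
```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs (assumed from the OCF 1.0 and OPF 2.0 specifications).
OCF_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_NS = "http://www.idpf.org/2007/opf"


def read_epub_outline(epub_file):
    """Hypothetical helper: return (opf_path, toc_path, reading_order)."""
    with zipfile.ZipFile(epub_file) as archive:
        # OCF: the 'mimetype' file asserts that this Zip is an epub.
        mimetype = archive.read("mimetype").decode("ascii").strip()
        if mimetype != "application/epub+zip":
            raise ValueError("unexpected mimetype: %r" % mimetype)

        # OCF: META-INF/container.xml names the package (.opf) file.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("{%s}rootfiles/{%s}rootfile" % (OCF_NS, OCF_NS))
        if rootfile is None:
            raise ValueError("container.xml has no rootfile element")
        opf_path = rootfile.get("full-path")

        # OPF: index the manifest by item id, then walk the spine.
        package = ET.fromstring(archive.read(opf_path))
        manifest = {
            item.get("id"): item.get("href")
            for item in package.iter("{%s}item" % OPF_NS)
        }
        spine = package.find("{%s}spine" % OPF_NS)
        toc_href = manifest.get(spine.get("toc"))
        spine_hrefs = [
            manifest[ref.get("idref")]
            for ref in spine.findall("{%s}itemref" % OPF_NS)
        ]

    # Manifest hrefs are relative to the folder that holds the .opf file.
    base = posixpath.dirname(opf_path)
    toc_path = posixpath.join(base, toc_href) if toc_href else None
    reading_order = [posixpath.join(base, href) for href in spine_hrefs]
    return opf_path, toc_path, reading_order
```
&lt;/pre&gt;Given the path of the downloaded Benjamin Button file, this helper should, if the assumptions above hold, return the package path named in container.xml, the path of the Table of Contents file, and the content documents in spine order.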
&lt;br /&gt;&lt;br /&gt;The vocabulary of the content documents should hold no surprises for the reading software and it should be able to render the text with the structure and style indicated, including the display of any referenced images.&lt;br /&gt;&lt;br /&gt;In this simple example, that is how the different IDPF standards work together. In subsequent posts we'll look at these features in greater detail, show how their scope is wider than I've shown so far, and also start looking at C# Class modelling of epub entities.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-5840042700211759158?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/5840042700211759158'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/5840042700211759158'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/how-standards-work-together.html' title='How the standards work together'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh5.ggpht.com/_cvaF-9-3DHs/Sz9z3ayw0FI/AAAAAAAAAEY/0Z2XubVIq14/s72-c/InsideEpub0004.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-8944538023297146016</id><published>2010-01-01T10:19:00.043Z</published><updated>2010-01-06T13:13:36.064Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='WinZip'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Container'/><category scheme='http://www.blogger.com/atom/ns#' term='META-INF'/><category scheme='http://www.blogger.com/atom/ns#' term='OPS'/><category scheme='http://www.blogger.com/atom/ns#' term='metadata'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='Zip'/><category scheme='http://www.blogger.com/atom/ns#' term='epubBooks'/><category scheme='http://www.blogger.com/atom/ns#' term='OPF'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><category scheme='http://www.blogger.com/atom/ns#' term='W3Schools'/><category scheme='http://www.blogger.com/atom/ns#' term='Benjamin Button'/><category scheme='http://www.blogger.com/atom/ns#' term='OCF'/><category scheme='http://www.blogger.com/atom/ns#' term='XML'/><category scheme='http://www.blogger.com/atom/ns#' term='Adobe Digital Editions'/><category scheme='http://www.blogger.com/atom/ns#' term='Package'/><title type='text'>First look inside an epub ebook</title><content type='html'>In this article we will download an ebook from the web and examine its structure using common browsing tools. For this exercise you will need to be able to view the contents of Zip files and to display XML files.&lt;br /&gt;&lt;br /&gt;I have a fully paid for and licenced copy of WinZip, so that's what I'll be using in the instructions and screenshots. If you haven't, there are many ways of opening Zip files - take a look at &lt;a href="http://www.wikihow.com/Open-a-.Zip-File-Without-Winzip"&gt;How to open a Zip file without WinZip&lt;/a&gt; for some examples.&lt;br /&gt;&lt;br /&gt;For browsing XML documents I use Internet Explorer and Visual Web Developer Express Edition from Microsoft. 
If you don't use Internet Explorer, most other browsers will also display them; or take a look at &lt;a href="http://www.w3schools.com/xmL/xml_view.asp"&gt;Viewing XML Files&lt;/a&gt; at &lt;a href="http://www.w3schools.com/"&gt;W3Schools&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div align="left"&gt;The purpose of this exercise is to get our hands dirty on a real ebook without looking too closely at the standards documents referred to in the &lt;a href="http://netkingcol.blogspot.com/2009/12/introduction-to-epub.html"&gt;Introduction to epub&lt;/a&gt; post.&lt;br /&gt;&lt;br /&gt;&lt;span style="color:#996633;"&gt;&lt;strong&gt;Download an ebook in the epub format&lt;/strong&gt;&lt;br /&gt;&lt;/span&gt;Go to the &lt;a href="http://www.epubbooks.com/"&gt;epubBooks&lt;/a&gt; website and download the free ebook: &lt;a href="http://www.epubbooks.com/book/167/curious-case-of-benjamin-button"&gt;The Curious Case of Benjamin Button&lt;/a&gt;. This is a short story by F. Scott Fitzgerald and, as an ebook, has a fairly simple structure. This makes it suitable for illustrating some key ideas about how epub books are put together. For convenience, save the download to a new folder. &lt;/div&gt;&lt;br /&gt;&lt;a href="http://lh3.ggpht.com/_cvaF-9-3DHs/Sz3aEsArtJI/AAAAAAAAACM/WagfxPT7cgk/s800/InsideEpub0001.jpg" target="_blank"&gt;&lt;img src="http://lh3.ggpht.com/_cvaF-9-3DHs/Sz3aEsArtJI/AAAAAAAAACM/WagfxPT7cgk/s288/InsideEpub0001.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;em&gt;Figure 1. Downloaded epub ebook&lt;/em&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;(Hint: click on thumbnails to see the full screenshot or image)&lt;br /&gt;&lt;br /&gt;Figure 1. shows how the saved ebook looks on my machine. 
Notice that the ebook download consists of a single file with the file extension .epub.&lt;br /&gt;&lt;br /&gt;&lt;div align="left"&gt;&lt;strong&gt;&lt;span style="color:#996633;"&gt;Open the ebook using your Zip file viewer&lt;/span&gt;&lt;/strong&gt;&lt;/div&gt;&lt;div align="left"&gt;We'll look at the Open Publication Structure shortly but for now open the file using WinZip, or whichever program you are using to view Zip files. Figure 2. shows the contents of the file when the folder view of WinZip is switched on.&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/Sz38ntVkSYI/AAAAAAAAADg/wR2873Kuhkc/s800/InsideEpub0002.jpg" target="_blank"&gt;&lt;img src="http://lh4.ggpht.com/_cvaF-9-3DHs/Sz38ntVkSYI/AAAAAAAAADg/wR2873Kuhkc/s288/InsideEpub0002.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;&lt;span style="font-size:85%;"&gt;Figure 2. epub document opened in WinZip&lt;/span&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;div align="left"&gt;Figure 2. demonstrates first of all that the epub format uses Zip compression technology to package the different parts of an ebook into a single file. Next, notice that, at the top level of the folder hierarchy, there is a file called 'mimetype' and there are two folders called 'META-INF'&lt;em&gt; &lt;/em&gt;and 'OPS'. The OPS folder has two sub-folders called 'css' and 'images'.&lt;/div&gt;&lt;br /&gt;&lt;div align="left"&gt;Figure 3. shows the full contents of the file when the folder view is switched off. This view is sorted by the Path column. &lt;/div&gt;&lt;br /&gt;&lt;a href="http://lh4.ggpht.com/_cvaF-9-3DHs/Sz3-n_5_k5I/AAAAAAAAADk/Smi4rb9kcKo/s800/InsideEpub0003.jpg" target="_blank"&gt;&lt;img src="http://lh4.ggpht.com/_cvaF-9-3DHs/Sz3-n_5_k5I/AAAAAAAAADk/Smi4rb9kcKo/s288/InsideEpub0003.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;div align="left"&gt;&lt;em&gt;&lt;span style="font-size:85%;"&gt;Figure 3. 
Alternate view of the book contents&lt;/span&gt;&lt;/em&gt;&lt;/div&gt;&lt;br /&gt;In this view, you can see that the META-INF folder holds a file called 'container.xml'&lt;em&gt;.&lt;/em&gt; This file holds information that conforms with the Open Container Format. We'll look at what that means shortly.&lt;br /&gt;&lt;br /&gt;The OPS folder contains a set of files with names like 'chapter-&lt;em&gt;nnn&lt;/em&gt;.xml'. You might guess correctly that these files hold the text of each chapter of the book in XML format. There are two other files with the .xml extension: 'title.xml' and 'epubbooksinfo.xml'. These are also parts of the book that will be displayed when you read it, namely its title page and a page of information about epubBooks.&lt;br /&gt;&lt;br /&gt;Also in the OPS folder are two important files: 'epb.opf' and 'epb.ncx'. These files contain metadata or 'data about data' and together they describe the content of the book and things like the order in which the content files should be displayed by the reading device. The .opf extension identifies a file holding information that conforms with the Open Packaging Format, and the .ncx extension identifies a document containing navigation information, i.e. the table of contents.&lt;br /&gt;&lt;br /&gt;The folder called 'css' contains the files that apply some initial styling to the document. Aspects like relative font-size, text decoration (underline etc.), and margins are included, but it's common for the reading device to leave to the reader the choice of such aspects as font, font size, and text and background colours.&lt;br /&gt;&lt;br /&gt;Folder 'images' contains the images that are displayed in the book. In our example the images folder holds only a logo for epubBooks which is displayed on the 'epubbooksinfo' page.&lt;br /&gt;&lt;br /&gt;&lt;div align="left"&gt;So far, you've seen words like Container and Package without getting any detailed explanation of what they mean. 
Nor have you seen any reference to Open Publication Structure. That has been deliberate. Each of these concepts will be the subject of separate posts, indeed each topic will span several posts as we relate what we see in the example epub document both to the OPS specifications and to the development of a class model in C# which we can use to read and manipulate epub documents.&lt;/div&gt;&lt;br /&gt;&lt;div align="left"&gt;&lt;strong&gt;&lt;span style="color:#996633;"&gt;Desperate to read Benjamin Button?&lt;/span&gt;&lt;/strong&gt;&lt;/div&gt;By the way, if you're absolutely desperate to view the book before we examine it technically, you will need an epub reading device. That means any combination of hardware and software that allows you to open and display an epub document. This could be a smartphone, a dedicated reader, or your PC. On your PC you could download and try &lt;a href="http://www.adobe.com/products/digitaleditions/"&gt;Adobe Digital Editions&lt;/a&gt; which allows you to hold and view a library of ebooks.&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Copyright © Colin Hazlehurst, 2010&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-8944538023297146016?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/8944538023297146016'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/8944538023297146016'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2010/01/first-look-inside-epub-ebook.html' title='First look inside an epub ebook'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://lh3.ggpht.com/_cvaF-9-3DHs/Sz3aEsArtJI/AAAAAAAAACM/WagfxPT7cgk/s72-c/InsideEpub0001.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-6014512293401911267.post-8818023033970546526</id><published>2009-12-31T17:40:00.005Z</published><updated>2010-01-06T10:13:30.365Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='IDPF'/><category scheme='http://www.blogger.com/atom/ns#' term='XHTML'/><category scheme='http://www.blogger.com/atom/ns#' term='OPS'/><category scheme='http://www.blogger.com/atom/ns#' term='OCF'/><category scheme='http://www.blogger.com/atom/ns#' term='ebook'/><category scheme='http://www.blogger.com/atom/ns#' term='C#'/><category scheme='http://www.blogger.com/atom/ns#' term='ASP.Net'/><category scheme='http://www.blogger.com/atom/ns#' term='Zip'/><category scheme='http://www.blogger.com/atom/ns#' term='MCE'/><category scheme='http://www.blogger.com/atom/ns#' term='OPF'/><category scheme='http://www.blogger.com/atom/ns#' term='epub'/><title type='text'>Introduction to epub</title><content type='html'>&lt;span style="font-family: inherit;"&gt;In this series of articles I intend to explore the structure of ebooks held in the epub document format as defined by the &lt;/span&gt;&lt;a href="http://www.idpf.org/"&gt;&lt;span style="font-family: inherit;"&gt;International Digital Publishing Forum&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: inherit;"&gt;. 
This will include developing an understanding of the different components of epub as defined by the following standards documents: &lt;/span&gt;&lt;a href="http://www.idpf.org/2007/ops/OPS_2.0_final_spec.html"&gt;&lt;span style="font-family: inherit;"&gt;Open Publication Structure&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: inherit;"&gt;, &lt;/span&gt;&lt;a href="http://www.idpf.org/2007/opf/OPF_2.0_final_spec.html"&gt;&lt;span style="font-family: inherit;"&gt;Open Packaging Format&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: inherit;"&gt;, and &lt;/span&gt;&lt;a href="http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm"&gt;&lt;span style="font-family: inherit;"&gt;Open Container Format&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: inherit;"&gt;. If you take a look at the standards you will see that, like so many documents of their type, they are formal, dry, and require further explanation - and that is what I aim to provide.&lt;br /&gt;&lt;br /&gt;I will also create a set of C# Classes to model the different components of epub and use them to develop a range of ASP.Net samples which will illustrate the unpacking and display of an epub document.&lt;br /&gt;&lt;br /&gt;It's in the nature of epub that it uses Zip technology to compress and package all components of a publication into a single file, normally with the .epub extension. The code samples will use a freeware tool for handling Zip files. Further, the subdivision of an ebook into Parts, Chapters, and Sections is typically handled using XHTML documents. To work with XHTML, the examples use the easily available Tiny MCE package to display ebook contents.&lt;br /&gt;&lt;br /&gt;If you want to run these samples on your own machine, you will need to download these tools. I'll give URLs for download at the appropriate places. 
&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit;"&gt;In the next article we will start by downloading a free ebook and taking a look inside it.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: inherit;"&gt;Copyright © Colin Hazlehurst, 2009 &lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6014512293401911267-8818023033970546526?l=netkingcol.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/8818023033970546526'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6014512293401911267/posts/default/8818023033970546526'/><link rel='alternate' type='text/html' href='http://netkingcol.blogspot.com/2009/12/introduction-to-epub.html' title='Introduction to epub'/><author><name>NetKingCol</name><uri>http://www.blogger.com/profile/17306179527687254106</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_cvaF-9-3DHs/S0RhmUypIbI/AAAAAAAAAGM/8Oq61dX7Lb4/S220/webpic2.jpg'/></author></entry></feed>
