Introduction to the OPF File
The OPF file is an extremely important — and extremely complex! — document in an EPUB. This Introduction to the OPF file breaks down its contents.
Suggested Prerequisite
Before reading this, you might want to read:
Introduction to the OPF File
An EPUB’s Open Package Format file, or OPF file, is a key part of an EPUB. It usually has the filename “content.opf”. The job of the OPF file is to tell reading systems how to organize and display the contents of the EPUB. There are five main components:
- OPF Head
- Metadata
- Manifest
- Spine
- Guide (not common in EPUB 3 – replaced by Landmarks section in navigation file)
We’ll discuss a bit about each component.
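As a rough orientation, the five components listed above might sit together as in the skeleton below. This is a minimal sketch, not a complete or validated package: the id "pub-id", the file names, the title, and the date are all made-up placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- OPF head: the XML declaration above plus the root package element -->
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="pub-id">
  <!-- Metadata: information about the book (placeholder values) -->
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:isbn:9780123456789</dc:identifier>
    <dc:title>Example Book Title</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
  </metadata>
  <!-- Manifest: every content file in the EPUB, in any order -->
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter-1" href="text/chapter-1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <!-- Spine: the reading order, referring to manifest ids -->
  <spine>
    <itemref idref="chapter-1"/>
  </spine>
  <!-- Guide: optional, and deprecated in EPUB 3 -->
  <guide>
    <reference type="text" title="Start of Content" href="text/chapter-1.xhtml"/>
  </guide>
</package>
```

In this sketch, the unique-identifier attribute on the package element names the id of the dc:identifier element in the metadata rather than holding the ISBN directly.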
OPF Head (Required)
The OPF file always begins with information about the file itself. The first line is:
<?xml version="1.0" encoding="UTF-8"?>
This line lets the computer know that the file is written in XML.
The second line is the package element, which usually looks something like:
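<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="pub-id">
This tells the computer about the type of XML the document uses (the “xmlns”), the EPUB version, the language of the file, and which identifier in the metadata section is the book’s unique identifier (usually the ISBN). The unique-identifier value is not the identifier itself: it must match the id attribute of a dc:identifier element in the metadata, so “pub-id” here is only an example name.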
Metadata (Required)
This section is where the book’s metadata is provided. It helps the computer parse the information related to the book, and is also where e-reading devices and software pull information. An ebook will always have a few Dublin Core metadata items (for title, author, publication date, ISBN, etc.), and an accessible ebook will also have accessibility metadata.
For more information on what metadata should be included in the OPF file, check out the Next Steps section.
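To make this concrete, a metadata section might look something like the fragment below. It is only a sketch with placeholder values (the title, author, dates, and summary text are invented), and it assumes the fragment sits inside the package element of an EPUB 3 file. The accessibility entries use the schema.org property names.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- Dublin Core items; every value here is a placeholder -->
  <dc:identifier id="pub-id">urn:isbn:9780123456789</dc:identifier>
  <dc:title>Example Book Title</dc:title>
  <dc:creator>Example Author</dc:creator>
  <dc:language>en</dc:language>
  <dc:date>2024-01-01</dc:date>
  <!-- EPUB 3 also expects a last-modified timestamp -->
  <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
  <!-- Accessibility metadata (schema.org vocabulary) -->
  <meta property="schema:accessMode">textual</meta>
  <meta property="schema:accessModeSufficient">textual</meta>
  <meta property="schema:accessibilityFeature">tableOfContents</meta>
  <meta property="schema:accessibilityHazard">none</meta>
  <meta property="schema:accessibilitySummary">Placeholder summary of the accessibility features of this book.</meta>
</metadata>
```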
Manifest (Required)
The manifest is an unordered list of all of the content files that make up the book (images, text documents, fonts, audio files, and everything else). These files are typically contained within the OEBPS or OPS directory. Note that the OPF manifest does not need to include: file folders; the OPF itself; the mimetype; or anything in the META-INF folder.
Each item in the manifest has three components:
- An item ID, which you make up and which should describe the file, like id="TitlePage" in the following example:
<item href="text/titlepage.xhtml" id="TitlePage" media-type="application/xhtml+xml"/>
- A reference to the actual file, which gives the path and name of the file relative to the OPF file, like href="fonts/VictorianParlor.ttf" in the following example:
<item id="font9" href="fonts/VictorianParlor.ttf" media-type="font/ttf"/>
- The media-type, which tells the parser (software/reading system) what type of file it is, like media-type="image/jpeg" in the following example:
<item id="part05img" href="images/Ghost.jpg" media-type="image/jpeg"/>
The items in the manifest can be in any order – but keeping it organized by the type of file it is, and the order it appears in the book, may be helpful for you.
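Put together, a small manifest grouped by file type might look like the sketch below. The ids and paths are illustrative only. The properties="nav" and properties="cover-image" attributes are EPUB 3 additions not covered above: they flag the navigation document and the cover image for reading systems.

```xml
<manifest>
  <!-- Navigation document, flagged with properties="nav" (EPUB 3) -->
  <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  <!-- Text documents, in the order they appear in the book -->
  <item id="TitlePage" href="text/titlepage.xhtml" media-type="application/xhtml+xml"/>
  <item id="chapter-1" href="text/chapter-1.xhtml" media-type="application/xhtml+xml"/>
  <!-- Stylesheet -->
  <item id="stylesheet" href="css/style.css" media-type="text/css"/>
  <!-- Images; the cover image carries properties="cover-image" (EPUB 3) -->
  <item id="cover-image" href="images/cover.jpg" media-type="image/jpeg" properties="cover-image"/>
  <item id="part05img" href="images/Ghost.jpg" media-type="image/jpeg"/>
  <!-- Fonts -->
  <item id="font9" href="fonts/VictorianParlor.ttf" media-type="font/ttf"/>
</manifest>
```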
Spine (Required)
The spine is an ordered list of the content documents of the book; it does not list images, CSS files, audio/visual files, or fonts. If you are working with EPUB 2, the NCX file does not get its own entry in the spine; instead, give its manifest id in the toc attribute of the spine element. Each item in the spine gets an “itemref”, and you use the item id that you created in the manifest for the “idref” (if these ID values don’t match, your book won’t work properly). The spine defines the order of the HTML pages in your book.
For example:
<spine>
<itemref idref="coverpage" linear="yes"/>
<itemref idref="titlepage.xhtml" linear="no"/>
<itemref idref="copyright.xhtml"/>
<itemref idref="preliminary-chapter.xhtml"/>
<itemref idref="chapter-1.xhtml"/>
<itemref idref="chapter-2.xhtml"/>
<itemref idref="chapter-3.xhtml"/>
<itemref idref="aboutTheAuthor.xhtml"/>
<itemref idref="uncopyright.xhtml"/>
</spine>
Note that items can have an attribute of linear="yes" or linear="no". Setting linear="yes" means that the item will appear in order; linear="no" means that the item will be excluded from the reading order, and items without a linear attribute will default to linear="yes". The most important use for linear="yes" is on the cover, which some reading systems will leave out by default.
So in the case of this book, when the reader first opens it up on their device or software, it will open to the cover page. When they click or swipe to the next page, it will go to the copyright page, skipping the title page. It is okay to skip some documents, as long as they can be accessed from elsewhere, like the navigation pane.
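The link between manifest ids and spine idrefs can be sketched as follows. This is an abbreviated, illustrative package only: the metadata section is omitted, and the href paths are assumptions rather than the real layout of the example book.

```xml
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <!-- metadata section omitted to keep the sketch short -->
  <manifest>
    <item id="coverpage" href="text/cover.xhtml" media-type="application/xhtml+xml"/>
    <item id="titlepage.xhtml" href="text/titlepage.xhtml" media-type="application/xhtml+xml"/>
    <item id="copyright.xhtml" href="text/copyright.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <!-- Each idref repeats, character for character, an id from the manifest -->
    <itemref idref="coverpage" linear="yes"/>
    <!-- linear="no": skipped when paging through, still reachable by link -->
    <itemref idref="titlepage.xhtml" linear="no"/>
    <itemref idref="copyright.xhtml"/>
  </spine>
</package>
```

The id values are arbitrary labels. They can look like file names, as "titlepage.xhtml" does here, but the reading system finds the actual file through the href on the matching manifest item.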
Guide (Optional)
Finally, the OPF file will sometimes have a “Guide” section. The guide points the reading system to the main content in the book. Typically, the Guide includes the cover, the table of contents, and the beginning of the actual text. The Guide section is deprecated in EPUB 3, but some vendors still use it.
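For reference, a Guide section covering those three typical entries might look like the sketch below. The titles and href paths are placeholders; the type values "cover", "toc", and "text" are the ones commonly used for these entries.

```xml
<guide>
  <reference type="cover" title="Cover" href="text/cover.xhtml"/>
  <reference type="toc" title="Table of Contents" href="text/toc.xhtml"/>
  <reference type="text" title="Start of Content" href="text/chapter-1.xhtml"/>
</guide>
```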
Instead of the Guide section, EPUBs should include a Landmarks section in their navigation file.
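A Landmarks section is a nav element inside the navigation file, not part of the OPF. A minimal sketch, with placeholder file names, might look like the fragment below. The namespace declarations are written on the nav element so that the fragment stands alone; in a real navigation document they would normally be declared on the html root element.

```xml
<nav xmlns="http://www.w3.org/1999/xhtml"
     xmlns:epub="http://www.idpf.org/2007/ops"
     epub:type="landmarks">
  <h2>Landmarks</h2>
  <ol>
    <li><a epub:type="cover" href="cover.xhtml">Cover</a></li>
    <li><a epub:type="toc" href="toc.xhtml">Table of Contents</a></li>
    <li><a epub:type="bodymatter" href="chapter-1.xhtml">Start of Content</a></li>
  </ol>
</nav>
```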
Next Steps
Understanding OPF Metadata (including accessibility metadata)
Introduction to OPF Metadata
The OPF file contains a lot of important metadata, including basic information about the book, accessibility metadata, and evaluation and conformance metadata! This resource discusses what to include, and how to include it.
Subject(s): Ebook Production
Resource Type(s): Step-by-Step
External Links to More Information
DCMI Metadata Terms
The Dublin Core Metadata Initiative (DCMI) is an organization supporting innovation in metadata design and best practices across the metadata ecology. This page shares all the possible Dublin Core terms you may want to use in your book’s OPF file.
What’s in an ePUB?: The OPF File
This page discusses the contents of the OPF file of an EPUB, which contains the book’s metadata and information about how the other files are organized.
Anatomy of an EPUB 3 file
This page gives an overview of how an EPUB file is structured. It covers the mimetype identification file, the container.xml file, and the OPF file.
EPUB Packages 3.2
This specification lays out the technical requirements for an EPUB Package, including what is required in the “Package Document”, also known as the OPF file (content.opf). This specification also details the requirements of the EPUB Navigation Document (a machine- and human-readable document, usually toc.xhtml or nav.xhtml, that provides navigation aids such as the table of contents).
Content Source Acknowledgement