We’ve looked at the need for a strategy for accessible PDFs and at the most current standard, PDF/UA. Where do Tags fit into the picture?
Every piece of content on a page has a role to play for the person reading the document.
For adaptive technology, the role of a page element is determined by the use of Tags. Even elements of a page that are decorative or in the background are marked (as an Artifact).
Tags provide the structure for content on a page, such as headings, paragraphs, lists, tables, figures/images and equations. For example, this text is a paragraph. It has the role or structure of a paragraph in a PDF document. As a paragraph, the correct Tag would be <P>.
What would happen if the Tags weren’t added? Adaptive technology would attempt to read the content on the page and guess the reading order, or simply say “not accessible”. Either way, the person using adaptive technology would hear meaningless text. For example, if a page has three columns of text, the author intended each column to be read from top to bottom. If adaptive technology guesses and reads across the page from left to right, the first line of each column would be heard, then the second line of each column, and so on, which would not make any sense. The Tags tell the adaptive technology to read down each column, and in which order to read the columns.
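The three-column example can be sketched in a few lines of Python. This is only an illustration of the two reading orders, not how any particular screen reader or PDF library works; the `columns` data and both function names are made up for the example.

```python
# Hypothetical page: three columns, each listed from top to bottom.
columns = [
    ["A1", "A2"],  # first column
    ["B1", "B2"],  # second column
    ["C1", "C2"],  # third column
]

def read_across(columns):
    """Untagged guess: read each visual row from left to right."""
    return [line for row in zip(*columns) for line in row]

def read_by_column(columns):
    """Tagged order: finish one column before starting the next."""
    return [line for column in columns for line in column]

print(read_across(columns))     # first lines of every column, then second lines
print(read_by_column(columns))  # whole first column, then the second, then the third
```

The second order is the one the author intended, and it is the order the Tags make available to adaptive technology.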
Let’s talk about headings. Without headings, a PDF document is simply pages of paragraphs with no way to find specific topics or chapters. Consider a textbook with no visual formatting to identify where you are and what you are reading. Without those visual cues, how would you find anything? Correctly Tagged headings give people using adaptive technology the same structure that others get from the visual formatting.
The title of our blog article would have the structure of a heading. The correct Tag for the title of this blog article would be <H1> because it is the first heading. PDF documents can have more than one <H1> Tag. This Tag is often reserved for chapter titles. Subsequent heading levels such as <H2>, <H3> or <H4> provide a blueprint of the document content for those using adaptive technology. This is why heading levels must not be skipped (for example, moving from an <H1> directly to an <H3>). Skipping heading levels creates a barrier to the accessibility and readability of a PDF document.
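The rule about skipped heading levels is simple enough to check mechanically. Here is a minimal sketch, assuming the heading Tags have already been pulled out of a document as a plain list of names like "H1" and "H2". The function name and the list format are hypothetical and not part of any PDF tool.

```python
def find_skipped_levels(heading_tags):
    """Return (previous_level, level) pairs where a heading level was skipped."""
    skipped = []
    previous_level = 0  # no heading seen yet, so the first heading should be H1
    for tag in heading_tags:
        level = int(tag.lstrip("H"))  # "H3" -> 3
        # Going one level deeper, staying level, or moving back up is fine.
        # Jumping more than one level deeper is a skip.
        if level > previous_level + 1:
            skipped.append((previous_level, level))
        previous_level = level
    return skipped

print(find_skipped_levels(["H1", "H2", "H3", "H2", "H1", "H2"]))  # no skips
print(find_skipped_levels(["H1", "H3"]))  # H1 straight to H3 is reported
```

Moving back up, say from an <H3> to a new <H1> chapter title, is not flagged. Only jumps that go more than one level deeper are.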
Lists are a technique frequently used in documents. Lists have Tags that provide a correct structure identifying bulleted or numbered lists through the PDF viewer/reader and the adaptive technology. This means that someone using a screen reader or Text-to-Speech tool knows when they are entering a list, how many items are in the list, where each list item begins, and when they are leaving the list and moving on to another element on the page.
Why is this important? Authors use a list to convey meaning. Lists of items show a relationship between the items in the list. Without correctly Tagged lists, adaptive technology users can’t identify the relationship between items presented in the list.
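To make the list behaviour concrete, here is a rough sketch of the kind of information a correctly Tagged list makes available. The wording of the announcements is invented for illustration, since real screen readers and Text-to-Speech tools each phrase this differently, and `announce_list` is a hypothetical helper.

```python
def announce_list(items):
    """Build the announcements a listener might hear for a Tagged list."""
    total = len(items)
    lines = [f"List with {total} items"]              # entering the list
    for position, item in enumerate(items, start=1):
        lines.append(f"{position} of {total}: {item}")  # each item, clearly identified
    lines.append("End of list")                       # leaving the list
    return lines

for line in announce_list(["Headings", "Lists", "Tables"]):
    print(line)
```

Without the list Tags, only the three item texts would be available, with nothing to say that they belong together.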
Table structure is equally important. Knowing how many rows or columns are in a table, being able to identify cell coordinates and to read the column and row titles provide valuable information to someone using a screen reader or Text-to-Speech tool.
Without a correctly Tagged table in a PDF document, someone might hear “cell B15, 543.” They are now 15 rows into the table and are presented with a number. What does that number mean? Is it related to a month, year, automobile colour, automobile model, expense category, or any other column or row title (table header) that may occur in a table?
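The difference table headers make can be sketched with a tiny made-up table. The data, the function names and the announcement format below are all assumptions for the sake of the example. The point is only that a Tagged table lets the row and column titles travel with the number.

```python
# Hypothetical expense table: first row holds column titles,
# first column holds row titles.
table = [
    ["Expense",  "January", "February"],
    ["Travel",   543,       610],
    ["Supplies", 120,       95],
]

def announce_without_headers(table, row, col):
    """Untagged: only the cell content is available."""
    return str(table[row][col])

def announce_with_headers(table, row, col):
    """Tagged: the row and column titles are read along with the cell."""
    row_title = table[row][0]
    column_title = table[0][col]
    return f"{row_title}, {column_title}: {table[row][col]}"

print(announce_without_headers(table, 1, 1))
print(announce_with_headers(table, 1, 1))
```

The first call gives the listener a bare number. The second tells them which expense and which month the number belongs to.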
Tags provide the structure, blueprint and roadmap for the person reading your content.
This article has identified a few of the major Tags in a PDF document. PDF documents come from all types of authoring tools including spreadsheet, word processing, presentation, desktop publishing and organizational charting software. PDF/UA provides specifications to ensure that the same type of content (for example a heading) is tagged the same way in the PDF version of a document, whichever authoring tool it came from. It is both a craft and a technical skill to be able to look at a PDF document and decode the visual elements on a page so that they can be tagged correctly. This often includes the ability to provide images with Alternate Text.
All of this is often overwhelming to someone assigned to “make our PDF documents accessible.” Most people in organizations have other duties and responsibilities and don’t work with accessible PDFs on a daily basis. Knowing how to Tag a document is a completely separate skill set that requires training and experience to do properly.