Paul Howson’s Website

Building a Better Structured Editor

Preparing Text for Publication is Different from Authoring — and Needs Special Tools

In this post I want to suggest that a publication designer needs a different kind of text preparation tool than the typical “word processor” used by authors.

Old and New Ways

In the past, the craftsman typesetter received a handwritten or typed manuscript from an author or editor and was able to exert complete control over the typesetting process — translating pencilled markup into a typeset text galley embodying the arcane rules of high quality typography.

Nowadays, the publication designer receives an already prepared text file to convert into professionally typeset text.

Yet neither conventional word processors nor page layout programs are well suited to this task.

Let us examine this “typesetting” process more closely.

Steps in Text Preparation

There are two important steps in preparing text for publication.

The first step is typographical clean-up.

This means correcting the use of spaces, punctuation, quotation marks, dashes, blank paragraphs, etc. It can be done with search and replace in a word processor or with a specialised “Text Cleaning” tool.
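As a rough illustration of the search-and-replace approach, here is a minimal Python sketch. The function name clean_typography and the particular rules are my own assumptions, chosen only to show the idea; a real cleaning tool would need many more rules and would have to cope with apostrophes, nested quotes and language-specific conventions.

```python
import re

def clean_typography(text: str) -> str:
    """Apply a few illustrative clean-up rules to plain text (hypothetical rule set)."""
    # Collapse runs of spaces or tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Remove spaces that precede closing punctuation.
    text = re.sub(r" +([,.;:!?])", r"\1", text)
    # Replace a double hyphen (with optional surrounding spaces) by an em dash.
    text = re.sub(r" ?-- ?", "\u2014", text)
    # A straight double quote at the start of the text, or after whitespace
    # or an opening bracket, becomes an opening curly quote...
    text = re.sub(r'(^|[\s(\[])"', "\\1\u201c", text)
    # ...and any straight double quote that remains becomes a closing one.
    text = text.replace('"', "\u201d")
    # Trim each line and drop blank paragraphs.
    lines = [line.strip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)


sample = 'He said  "hello" , then left -- quickly.\n\n\nNext  paragraph.'
cleaned = clean_typography(sample)
```

The order of the rules matters: spaces are normalised first so that the later patterns only have to deal with single spaces.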

The second step is structuring.

By structuring, I mean applying paragraph and character styles to reflect document structure. It is structure, represented through styles, which drives the visual appearance of text in page layout software. (The equivalent for web pages is the use of classes on HTML elements.)
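To make the web analogy concrete, the following Python sketch turns a list of (style name, text) pairs into HTML paragraphs whose class attribute carries the style name. The function to_html and the style names are assumptions made for illustration; the point is only that the markup records what each paragraph is, while its appearance is defined elsewhere (in a style sheet).

```python
from html import escape

def to_html(paragraphs):
    """Render (style_name, text) pairs as <p> elements carrying a class."""
    lines = []
    for style, text in paragraphs:
        # Derive a class name from the style name, e.g. "Chapter Title" -> "chapter-title".
        css_class = style.lower().replace(" ", "-")
        lines.append(f'<p class="{escape(css_class, quote=True)}">{escape(text)}</p>')
    return "\n".join(lines)


page = to_html([("Chapter Title", "Old & New Ways"), ("Body", "Some body text.")])
```

Changing the look of every chapter title then means editing one style definition rather than every paragraph.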

In this process, local formatting needs to be discovered and replaced by the use of styles. Unless this is done, styles cannot be used to ensure absolute consistency of formatting in the page layout software.

Any element left unstyled, or incorrectly styled, will most likely not have the correct formatting in the page layout. Even worse, it may appear to have the correct formatting but will in fact be an “orphan” that does not respond to style definition changes.
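The discovery step can be sketched in Python as follows. This is a simplified model under stated assumptions, not how any particular word processor stores its data: the Paragraph class, its fields (style, bold, size_pt) and both function names are hypothetical. The idea is to group paragraphs by their combination of named style and local overrides, so that each group can be reviewed once and mapped to a proper style.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Paragraph:
    text: str
    style: str                       # named style, e.g. "Normal"
    bold: bool = False               # local override (assumed field)
    size_pt: Optional[float] = None  # local override (assumed field)

def find_local_formatting(paragraphs):
    """Group paragraphs that carry local overrides by formatting signature."""
    groups = defaultdict(list)
    for p in paragraphs:
        if p.bold or p.size_pt is not None:
            groups[(p.style, p.bold, p.size_pt)].append(p.text)
    return dict(groups)

def apply_styles(paragraphs, mapping):
    """Replace each mapped signature with a named style and clear the overrides."""
    result = []
    for p in paragraphs:
        key = (p.style, p.bold, p.size_pt)
        if key in mapping:
            p = replace(p, style=mapping[key], bold=False, size_pt=None)
        result.append(p)
    return result


doc = [
    Paragraph("Intro", "Normal", bold=True, size_pt=14.0),
    Paragraph("Body text", "Normal"),
    Paragraph("Method", "Normal", bold=True, size_pt=14.0),
]
groups = find_local_formatting(doc)
styled = apply_styles(doc, {("Normal", True, 14.0): "Heading 1"})
```

After the mapping is applied, running find_local_formatting again should report nothing, which is a quick check that no “orphans” remain.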

It is this second task of “structuring” for which word processors are particularly poorly suited. Let’s examine the process more closely.

Ways of Identifying Structure

The document designer is typically handed either a plain text file or a word processor file (frequently a Microsoft Word “.doc” file).

How does a designer go about identifying the structure within such a file?

A plain text file contains only the visible characters you see on the screen — there is no structural or formatting metadata. Some notion of structure may be implied by a textual convention such as use of CAPITALS or Title Case to indicate headings, hyphens used as bullets to identify lists, and so on. However, it requires a fair amount of detective work by the designer to identify the structural purpose of every paragraph in a plain text file.
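Some of this detective work can be partly automated. The Python sketch below guesses a role for each paragraph from the textual conventions just mentioned. The function name guess_role, the role labels and the thresholds (at most ten words, words longer than three letters) are assumptions of mine, and the output is only a first guess for the designer to confirm.

```python
import re

def guess_role(paragraph: str) -> str:
    """Guess the structural role of a plain-text paragraph (heuristic only)."""
    text = paragraph.strip()
    if not text:
        return "blank"
    # A leading hyphen, asterisk or bullet followed by a space suggests a list item.
    if re.match(r"[-*\u2022]\s+", text):
        return "list-item"
    words = text.split()
    # Headings tend to be short and to lack closing punctuation.
    short = len(words) <= 10 and text[-1] not in ".!?:;,"
    # ALL CAPITALS suggests a heading.
    if short and text.isupper():
        return "heading"
    # Title Case: every word longer than three letters starts with a capital.
    significant = [w for w in words if len(w) > 3]
    if short and significant and all(w[0].isupper() for w in significant):
        return "heading"
    return "body"


roles = [guess_role(p) for p in [
    "STEPS IN TEXT PREPARATION",
    "Ways of Identifying Structure",
    "- hyphens used as bullets",
    "The first step is typographical clean up.",
]]
```

Heuristics like these will misjudge some paragraphs (a short sentence without a full stop, for instance), which is why a human still needs to review the result.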

A word processor file can have formatting metadata: either character formats such as typeface, size, weight and colour, or paragraph formats such as paragraph spacing, text alignment and indents. Formatting, used purposefully, can suggest the structural purpose of each element within the document. If we’re lucky, the author may have used built-in heading styles (Heading 1, Heading 2, etc.) to indicate the heading hierarchy. And if we’re very lucky, the author might have defined specific styles for each kind of structural element and applied these with complete consistency. That would be very lucky indeed because, in my experience, this virtually never happens.

In the next article, we will look more closely at how a different kind of text preparation tool could make this task much quicker and easier.