Paul Howson’s Website tdgq.com.au

Building a Better Structured Editor

About this Blog

After many years of using computers to prepare publications, I came to see that the widely used word processors were less than ideal tools for preparing text for publication. As I thought more about this, ideas for a new kind of structured document editing tool came together in my mind. This project is an attempt to embody those ideas in a piece of software.

The Concept of the Structured Editor

Text files received by print or web designers often need a lot of cleaning up, and part of that work concerns document structure. The files can contain many spurious direct formatting overrides at both the paragraph and character level. Some of these are structurally significant, but many are junk.

One important practical goal is speed: cleaning up and structuring a document, and getting rid of the complexity and junk, as quickly as possible.

That is the primary purpose of the Structured Editor: efficient cleanup and structuring of text documents for publication.

A secondary goal is to convert structured documents from one representation to another as seamlessly as possible. While there are readily available format converters (Word to HTML, RTF to HTML, TeX to HTML, etc.), most of these attempt to faithfully preserve all formatting, including formatting junk. The Structured Editor uses formatting only to deduce structure. Translations between different representations (e.g. RTF to HTML, HTML to TeX) are based on structure, not formatting.
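
To make the distinction concrete, here is a minimal sketch in Ruby. It is not the Structured Editor's actual code: the Para struct, the rules in deduce_role, the size threshold and the tag choices are all hypothetical, chosen only to illustrate the idea of reading formatting once to deduce structure and then translating from structure alone.

```ruby
# Hypothetical input model: a paragraph's text plus whatever direct
# formatting overrides were found on it in the source file.
Para = Struct.new(:text, :overrides)

# Hypothetical rules: a few formatting patterns are treated as
# structurally significant. Every other override is junk and is ignored.
def deduce_role(para)
  o = para.overrides
  if o[:bold] && o.fetch(:size, 12) >= 18
    :heading
  elsif o[:italic] && o.fetch(:indent, 0) > 0
    :quote
  else
    :body
  end
end

# Step 1: reduce formatted paragraphs to [role, text] pairs.
# The formatting overrides are discarded at this point.
def structure(paras)
  paras.map { |para| [deduce_role(para), para.text] }
end

HTML_TAGS = { heading: "h2", quote: "blockquote", body: "p" }.freeze
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Step 2a: render HTML from structure only.
def to_html(nodes)
  nodes.map do |role, text|
    tag = HTML_TAGS.fetch(role)
    "<#{tag}>#{text.gsub(/[&<>]/, HTML_ESCAPES)}</#{tag}>"
  end.join("\n")
end

# Step 2b: render TeX from the same structure.
# (Escaping of TeX special characters is omitted in this sketch.)
def to_tex(nodes)
  nodes.map do |role, text|
    case role
    when :heading then "\\section{#{text}}"
    when :quote   then "\\begin{quote}#{text}\\end{quote}"
    else text
    end
  end.join("\n\n")
end

# Sample input with typical junk: a stray font and a stray color.
doc = [
  Para.new("Chapter One", { bold: true, size: 24, font: "Comic Sans" }),
  Para.new("Fish & chips.", { color: "navy", size: 11 })
]

nodes = structure(doc)
puts to_html(nodes)
puts to_tex(nodes)
```

In this sketch the font and color overrides never reach the output, because no rule consults them. Both renderers see only the role and the text, so adding another output format would not require looking at the original formatting again.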

Project History

I first attempted to turn these ideas into a working piece of software in the early 2000s, using RealBasic on the Mac as the development environment. This was a time when Apple’s prognosis did not look good, and developing Mac-only software seemed a risky proposition with an unknown future. RealBasic was unusual in that it supported multiple target platforms from a single set of source code. In particular, RealBasic promised the possibility of developing an application which could be easily ported from the Mac to Windows.

RealBasic was an object-oriented language with an integrated IDE. For the era, it was quite advanced, with fast turn-around time for writing and testing code. It seemed like a good choice.

The project moved forward slowly in RealBasic until around 2004–2005, and I invested significant time and energy in it over those years.

Family issues intervened and I had to put the project aside for a few years, eventually coming back to it in 2009.

The break had provided a new perspective on the earlier work. Some of its ideas were good and others were ill-conceived. I now had a much clearer idea about what I wanted to create.

Moving to MacRuby

The question was: “What language to use?” RealBasic had in my opinion stagnated in the intervening years. The Mac was no longer fading away, but had found new vitality and acceptance. Writing Mac-only software now seemed an acceptable trade-off, and it opened up other options for development tools.

Around 2006 I had discovered the Ruby language through the Ruby on Rails web application framework and had created a couple of web applications for clients.

I was by then convinced of the advantages of Ruby over RealBasic, and in 2009 I discovered that Apple was sponsoring a Mac-specific version of Ruby called MacRuby, which integrated Ruby with the Mac’s Cocoa frameworks and Objective-C runtime. So I decided to convert the structured editor project over to MacRuby. This involved writing a rather complex Ruby program to machine-translate as much as possible of the RealBasic source into valid Ruby code.

This conversion tool was eventually able to convert almost all the RealBasic source code into correctly structured Ruby code and, remarkably, the majority of the generated Ruby code did not require further modification.

Of course it would not run because there was no RealBasic runtime present, but the hard work already done had been preserved. Then, bit by bit, the Ruby code was moved into a software test rig, to be tested and adjusted as necessary. Within a few months, the program was again running, but this time as a MacRuby application rather than a RealBasic application.

Sometime around May 2010 I was interviewed by British software developer “Scotty”, the genial host of the MDN Show (MDN stands for the Macintosh Developer Network). The subject of the interview was a “case study” of this Structured Editor project.

An edited version of this MDN Show interview is available as a download (mp3, 41.5 MB).

Where to Next?

Since then, I’ve been gradually working on extending the functionality.

One thing I have learned from this whole process, which is now about ten years old, is that I probably cannot complete this project on my own in the time I have available.

The purpose of this blog is to share some of the ideas and work on this project with others who might be interested in this subject and may wish to help make this software idea a reality.

Paul Howson
March 2012