Proposed Changes to Xmltr

If you’re using xmltr, please read the following announcement.

Multiple translation tables

Revised handling of entity translation

Dealing with stack overflows

Translating the root node of a parse tree

Change to naming of special entries in translation tables

Caching translation rules

Since xmltr was released in October 1998, I’ve had the opportunity to apply it in a number of website projects. From this experience have come ideas for improving and refining xmltr.

In addition, a number of users of xmltr have suggested changes and improvements.

Before proceeding with a new release of xmltr, I’d like to present a summary of the changes being considered.

Some of these changes may affect existing applications of xmltr, so I’d like to hear from any users who are concerned that any of the proposed changes might cause difficulties.

Also, if there are specific changes you would like to see incorporated, please let me know.

Please address your comments to phowson@flexi.net.au


Multiple Translation Tables

Currently xmltr allows only a single translation rule table to be supplied to TranslateTree(). You can write your own custom FindRule() script to implement multiple translation tables. However, there is a case for making this part of the standard behaviour.

Proposal:

Allow a list of translation tables (addresses) to be supplied to TranslateTree(), in addition to the current option of a single translation table. In this case, xmltr would search each translation table in turn for a rule match. If only a single translation table address is supplied (as it is currently), behaviour would be unchanged, so existing calls remain backward compatible.
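As a rough illustration of the proposed search order, here is a minimal sketch in Python rather than UserTalk. It is not xmltr's code: the names find_rule and as_table_list are hypothetical, tables are modelled as plain dictionaries, and matching is simplified to an exact tag-name lookup instead of xmltr's tag patterns.

```python
from collections.abc import Mapping


def as_table_list(tables):
    """Accept either one table or a list of tables (hypothetical helper)."""
    if isinstance(tables, Mapping):
        return [tables]  # a single table behaves exactly as it does today
    return list(tables)


def find_rule(tables, tag):
    """Search each table in turn; the first table containing the tag wins."""
    for table in as_table_list(tables):
        if tag in table:
            return table[tag]
    return None  # no table had a matching rule


# A shared table of common rules, plus a site-specific table searched first.
shared = {"para": "<p>", "emph": "<em>"}
site = {"para": "<p class='site'>"}

print(find_rule([site, shared], "para"))  # site rule overrides the shared one
print(find_rule([site, shared], "emph"))  # falls through to the shared table
print(find_rule(shared, "para"))          # single table, unchanged behaviour
```

Placing the site-specific table ahead of the shared one lets a site override a common rule without editing the shared table.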

Benefit:

This would allow commonly used translation rules to be factored out into a separate translation table which could be shared amongst a number of translation phases or a number of websites.

Consequences:

There are potential complications between multiple translation tables and the existing option of supplying a custom _FindRule() or _ProcessRule() script at the root level of a translation table. Each translation table could in principle have its own custom scripts, and their interaction could be confusing.

It is therefore proposed to remove support for custom _FindRule() and _ProcessRule() scripts at the root level of translation tables.

It would still be possible to supply custom scripts for FindRule() and ProcessRule() as optional parameters to TranslateTree() (as is currently the case).

Default rule handling would also need some adjustment under this scheme.

If you were specifying a series of translation tables, then you would place the default translation rule only in the last translation table to be searched. Otherwise a default entry in, say, the first translation table would prevent subsequent translation tables from being searched.

What I’ve found works well is to place the default translation rule in its own translation table which is searched last. This way you can change the default translation rule by specifying a different list of translation tables to TranslateTree(). Why might you want to do this? If you’re using xmltr as a “filter” to select certain content from an xml parse tree, you’ll want a default rule which ignores unwanted content (i.e. translates it to nothing). On the other hand, if you’re translating the content of a web page, you might want a default rule which catches unknown tags and inserts an error message into the page.
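The effect of where the default rule sits can be sketched as follows. This is a Python model, not xmltr's implementation: find_rule is a hypothetical stand-in, tables are plain dictionaries, and I am assuming that a table's “_default” entry is consulted as soon as that table has no specific match.

```python
DEFAULT = "_default"


def find_rule(tables, tag):
    """Search tables in order; a table's default entry ends the search."""
    for table in tables:
        if tag in table:
            return table[tag]
        if DEFAULT in table:
            return table[DEFAULT]
    return None


rules = {"title": "<h1>"}

# Two interchangeable default tables, each meant to be searched last.
ignore_unknown = {DEFAULT: ""}               # "filter" use: drop unwanted content
flag_unknown = {DEFAULT: "[unknown tag]"}    # web page use: make errors visible

print(repr(find_rule([rules, ignore_unknown], "footnote")))  # ''
print(find_rule([rules, flag_unknown], "footnote"))          # [unknown tag]

# A default placed first shadows every later table, even for known tags.
print(repr(find_rule([ignore_unknown, rules], "title")))     # ''
```

Swapping the last table in the list is all it takes to change the default behaviour; the specific rules are untouched.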


Revised Handling of Entity Translation

Currently entity translation is handled by a custom script at the root level of a translation rule table called “_entity”. This script is passed the name of an entity and should return the translation for that entity.

This scheme does not adapt to the notion of multiple translation tables, since there is no way for the script to signal that it cannot translate the given entity and that another translator “further down the line” should be given a chance to do so.

Proposal:

One possible solution would be to make entity translation parallel the translation of xml tags more closely:

Optionally allow the “_entity” entry in the root level of a translation table to be a subtable (instead of a script).

If it’s a script, the behaviour is unchanged from the current behaviour. Since, under this scheme, there is no way for the script to signal whether it handled the entity, only one script would get a chance to translate the entity. This would logically be the first such script found in the list of supplied translation tables (assuming multiple translation tables as described above).

If it’s a subtable, then the table is searched for an entry which matches the entity name. If a match is found then the value of that table entry is used as the entity translation. This could be a string (or wpText or outline) or a script (which would be called, and its return value used). If no match is found, then an entity translation table would be looked for in the next translation table in the supplied list of translation tables.

If none of the entity translation tables handles the entity (or if there aren’t any entity translation tables) then the entity would be handed to a script provided as an optional parameter to TranslateTree().

If no such script was provided, then a ScriptError would result, reporting the entity not translated.
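The resolution order described above can be sketched like this. It is a Python model under stated assumptions, not xmltr's code: translate_entity and its fallback parameter are hypothetical names, tables and subtables are dictionaries, scripts are Python callables, and passing the entity name to a script stored in a subtable is my assumption. LookupError stands in for Frontier's ScriptError.

```python
ENTITY = "_entity"


def translate_entity(name, tables, fallback=None):
    """Resolve an entity through a list of translation tables."""
    for table in tables:
        entry = table.get(ENTITY)
        if entry is None:
            continue  # this table has no entity handling at all
        if callable(entry):
            # A script cannot signal "not handled", so the first one is final.
            return entry(name)
        if name in entry:
            value = entry[name]
            # A subtable value may be literal text or a script to call.
            return value(name) if callable(value) else value
        # No match in this subtable: try the next translation table.
    if fallback is not None:
        return fallback(name)
    raise LookupError(f"entity not translated: {name}")


standard = {ENTITY: {"amp": "&", "lt": "<"}}
site = {ENTITY: {"company": lambda name: "Example Pty Ltd"}}

print(translate_entity("amp", [site, standard]))      # &
print(translate_entity("company", [site, standard]))  # Example Pty Ltd
print(translate_entity("nbsp", [site, standard], fallback=lambda n: " "))
```

Note that a script-style “_entity” entry early in the list prevents any later table from being consulted, which is why shared lookup-style translations work best as subtables.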

Benefits:

The ability to factor entity translation into a sequence of handlers, each associated with a translation rule table. Commonly used entity translations could be factored into a shared translation table (whose sole purpose might be to do standard entity translations).

The ability to specify simple lookup-style entity translations using a table rather than requiring a case statement inside a script.

Consequences:

This system would remain compatible with existing behaviour when a single entity translation script is provided at the root level of a translation table.


Dealing with Stack Overflows

The most commonly reported problem with xmltr is application stack overflows, caused by deeply nested xml tag patterns and the 50 stack frame limit in Frontier version 6 and earlier.

Frontier 6.1 raises this limit to 200 stack frames, which should be ample in practice.

An alternate solution of removing recursion from parts of xmltr was considered. However, the cost in lost clarity of code would have been significant.

If stack overflows are a problem for you, consider upgrading to Frontier 6.1 or later versions.


Translating the Root Node of a Parse Tree

Currently xmltr translates only the child nodes of the supplied xml parse tree (i.e. the topmost node is ignored).

Some users have implemented a system where the web page objects in a Frontier web site refer back to a branch of an xml parse tree (via an address). When the page is rendered, that segment of the xml parse tree is translated to create the page content. This “just-in-time” translation scheme reduces the number of intermediate representations of page content which need to be stored in the Frontier ODB and streamlines the rendering process.

In this case, it is desirable to translate the topmost node of the segment of the xml parse tree.

Proposal:

Add an optional parameter to TranslateTree() which causes the root node of the xml parse tree to be translated. By default the root node is not translated (current behaviour).
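To make the difference concrete, here is a small Python sketch. It is illustrative only: the dictionary-based parse tree, the (open, close) rule pairs and the translate_root parameter name are all assumptions for the example, not xmltr's data structures or parameter names.

```python
def translate_node(node, rules):
    """Translate one node and, recursively, everything beneath it."""
    if isinstance(node, str):
        return node  # text content passes through unchanged
    inner = "".join(translate_node(child, rules) for child in node["children"])
    opening, closing = rules.get(node["tag"], ("", ""))
    return opening + inner + closing


def translate_tree(tree, rules, translate_root=False):
    """By default translate only the children of the topmost node."""
    if translate_root:
        return translate_node(tree, rules)
    return "".join(translate_node(child, rules) for child in tree["children"])


rules = {"section": ("<div>", "</div>"), "para": ("<p>", "</p>")}
branch = {"tag": "section", "children": [{"tag": "para", "children": ["Hello"]}]}

print(translate_tree(branch, rules))                       # <p>Hello</p>
print(translate_tree(branch, rules, translate_root=True))  # <div><p>Hello</p></div>
```

With the flag set, a page object that points at a branch of a larger parse tree gets that branch's own tag translated too, which is what the “just-in-time” scheme needs.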

Benefits:

More flexibility in how translation is done.

Consequences:

This system would remain compatible with existing behaviour.


Change to Naming of Special Entries in Translation Tables

One user of xmltr (Marcel Graf) questioned the use of underscore “_” as the indicator of special translation table entries such as “_any”, “_default”, “_entity” and so on. He pointed out that underscore is a valid character in xml tag names and hence could create clashes if there were xml tags named “_any” or “_default” or “_entity”. He suggested instead the use of forward slash “/” which is not a valid character in an xml tag name. Forward slash is the character used to identify special entries in xml parse trees within Frontier (such as those built by blox).

While I agree with the logic of this suggestion, there is a good reason why forward slash was not originally chosen.

Translation rules may be scripts and hence the name of a translation rule needs to be a valid Frontier script name (so it can be used as the name of the handler within the script outline). Forward slash is not a valid character in a Frontier script name unless you’re prepared to write your handler as:

on ["/handlername"](parameters ...)

Proposal:

While I feel that using underscore to flag special entries is the more acceptable solution, the alternate scheme can be accommodated easily by making the names of special entries (currently “_any”, “_default” and “_entity”) string constants in the xmltr suite table.
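The idea is simply that the special names are looked up rather than hard-coded. A minimal Python sketch, in which the SPECIAL dictionary stands in for the constants in the xmltr suite table and default_rule is a hypothetical helper:

```python
# Stand-in for the string constants that would live in the xmltr suite table.
SPECIAL = {"any": "_any", "default": "_default", "entity": "_entity"}


def default_rule(table, special=SPECIAL):
    """Fetch a table's default rule via the configurable special name."""
    return table.get(special["default"])


underscore_table = {"_default": "[unknown]", "para": "<p>"}
slash_table = {"/default": "[unknown]", "para": "<p>"}

# A user who prefers forward slash edits the constants, not the xmltr scripts.
slash_names = {key: "/" + value[1:] for key, value in SPECIAL.items()}

print(default_rule(underscore_table))          # found under "_default"
print(default_rule(slash_table, slash_names))  # found under "/default"
```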

Benefits:

Users who wish to use different names for these special entries can edit the string constants which define these special entries.

Consequences:

This system would remain compatible with existing behaviour.


Caching Translation Rules

When a match is found for a translation rule, xmltr caches the address of that rule with the corresponding xml tag pattern in a special cache table. This can speed up translation by a factor of two or more since tag patterns which occur more than once in an xml document need to be matched with a translation rule only once.

By default the rule cache table is created on the stack (i.e. as a local variable in TranslateTree()) so it is automatically discarded when TranslateTree() returns.

TranslateTree() currently has an optional flag which causes the rule cache to be created at the root level of the translation table instead. This can be useful if you wish to speed up repetitive translations by reusing the rule cache.

Under a scheme which allows multiple translation tables (as described earlier), associating the rule cache with a single translation table would not make sense.

Proposal:

Change the optional parameter to TranslateTree() to be a table address, which defaults to nil. If this address is nil, then the rule cache is created on the stack (the current default behaviour) and is discarded when TranslateTree() returns.

If the user supplies a table address for the optional parameter, then that table is used as the rule cache.
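The proposed parameter behaves roughly as in the following Python sketch. It is a model only: translate_patterns, find_rule and rule_cache are hypothetical names, None plays the role of nil, and the cache here stores the rule itself keyed by a plain tag name, whereas xmltr caches the rule's address keyed by tag pattern.

```python
lookups = []  # records every full rule search, to make cache hits visible


def find_rule(tables, pattern):
    """Full (slow) search across all translation tables."""
    lookups.append(pattern)
    for table in tables:
        if pattern in table:
            return table[pattern]
    return None


def translate_patterns(patterns, tables, rule_cache=None):
    """Look up a rule per pattern, consulting the cache first."""
    # None: use a local cache that is discarded when the function returns.
    cache = {} if rule_cache is None else rule_cache
    results = []
    for pattern in patterns:
        if pattern not in cache:
            cache[pattern] = find_rule(tables, pattern)
        results.append(cache[pattern])
    return results


rules = [{"para": "<p>", "emph": "<em>"}]

kept_cache = {}  # owned by the caller, who decides when to clear it
translate_patterns(["para", "emph", "para"], rules, kept_cache)
translate_patterns(["para"], rules, kept_cache)
print(len(lookups))  # 2: the repeat and the second call were cache hits

translate_patterns(["para"], rules)  # no cache supplied: searched afresh
print(len(lookups))  # 3
```

Because the cache is no longer tied to any one translation table, it works the same way whether one table or a list of tables is supplied. The caller must clear it whenever the rules change.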

Benefits:

In the very rare situation where preserving the rule cache is desired, this can be accommodated. Preserving the cache or clearing it would then be the responsibility of the programmer.

Consequences:

This system would remain compatible with existing behaviour, unless you’ve been setting the clearCache parameter of TranslateTree() to false (in which case you’ll need to modify your calls to TranslateTree()).

 

Website built using Frontier and xmltr. Documentation also available in pdf format for offline reading. Copyright The Design Group Qld 1999. This page last updated Thu, 2 Dec 1999