| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
| <html> |
| <head> |
| <META name="Author" content="Matitiahu Allouche"> |
| </head> |
| <body bgcolor="white"> |
| |
| This package provides classes for |
| processing structured text. |
| <p> |
| There are various types of structured text. Each type should |
| be handled by a specific <i>type handler</i>. A number of standard |
| type handlers are supplied in the associated package |
| <a href="internal/consumable/package-summary.html"> |
| org.eclipse.equinox.bidi.internal.consumable</a>. |
| |
| <h2>Introduction to Structured Text</h2> |
| <p> |
| Bidirectional text offers interesting challenges to presentation systems. |
| For plain text, the Unicode Bidirectional Algorithm |
| (<a href="http://www.unicode.org/reports/tr9/">UBA</a>) |
| specifies how to reorder bidirectional text for display, generally with |
| satisfactory results. This algorithm is implemented in many presentation |
| and operating systems, such as Java/Swing, Windows, and Linux. |
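| </p><p> |
| As a small illustration, the JDK class <b>java.text.Bidi</b> exposes the |
| embedding levels that the UBA resolves for plain text. The class name |
| <b>PlainTextLevels</b> and the sample string are illustrative only and |
| are not part of this package. |

```java
import java.text.Bidi;

class PlainTextLevels {
    // Returns one digit per character: the embedding level resolved by the UBA
    // for that character, with a left-to-right base direction.
    static String levels(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append(bidi.getLevelAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A Latin word, a space, then three Hebrew letters (alef, bet, gimel).
        System.out.println(levels("abc \u05d0\u05d1\u05d2"));
    }
}
```

| Even levels denote left-to-right runs and odd levels right-to-left runs, |
| so the Hebrew letters form a single run that is reversed for display. |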
| </p><p> |
| However, not all bidirectional text is plain text. There |
| are also instances of text structured to follow a given syntax, which |
| should be reflected in the display order. The general algorithm, which |
| has no awareness of these special cases, often gives incorrect results |
| when displaying such structured text. |
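| </p><p> |
| The problem can be observed with the JDK class <b>java.text.Bidi</b>. |
| The sketch below assumes a path made of two Hebrew segments separated |
| by a slash, displayed in a left-to-right context. The class name |
| <b>PathLevels</b> and the sample path are hypothetical. |

```java
import java.text.Bidi;

class PathLevels {
    // Returns the resolved embedding level of every character as a digit string.
    static String levels(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append(bidi.getLevelAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Hebrew segments (alef bet, gimel dalet) separated by '/'.
        String path = "\u05d0\u05d1/\u05d2\u05d3";
        System.out.println(levels(path));
    }
}
```

| For this input every character, including the separator, resolves to |
| level 1. The whole string is therefore displayed as one right-to-left |
| run, and the two segments appear in the reverse of their logical order. |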
| </p><p> |
| The general idea in handling structured text in this package is to add |
| directional formatting characters at proper locations in the text to |
| supplement the standard algorithm, so that the final result is correctly |
| displayed using the UBA. |
| </p><p> |
| A class which handles structured text is thus essentially a |
| transformation engine which receives text without directional formatting |
| characters as input and produces as output the same text with added |
| directional formatting characters, ideally the minimum number needed |
| to ensure correct display, considering the type of |
| structured text involved. |
| </p><p> |
| In this package, text without directional formatting characters is |
| called <b><i>lean</i></b> text while the text with added directional |
| formatting characters is called <b><i>full</i></b> text. |
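| </p><p> |
| The relation between <i>lean</i> and <i>full</i> text can be sketched as |
| follows. This is a deliberately naive illustration, not the algorithm |
| used by this package: it adds an LRM before every separator instead of |
| only where one is needed, and the class and method names are |
| hypothetical. |

```java
class LeanFullSketch {
    static final char LRM = '\u200e'; // Left-to-Right Mark

    // Lean -> full: insert an LRM before each separator character.
    // Assumption: the lean text contains no directional formatting characters.
    static String leanToFull(String lean, String separators) {
        StringBuilder full = new StringBuilder();
        for (int i = 0; i < lean.length(); i++) {
            char c = lean.charAt(i);
            if (separators.indexOf(c) >= 0) {
                full.append(LRM);
            }
            full.append(c);
        }
        return full.toString();
    }

    // Full -> lean: remove the marks added by leanToFull.
    static String fullToLean(String full) {
        return full.replace(String.valueOf(LRM), "");
    }

    public static void main(String[] args) {
        String full = leanToFull("name=value", "=");
        // The full text is one character longer than the lean text.
        System.out.println(full.length());
        System.out.println(fullToLean(full));
    }
}
```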
| </p><p> |
| The class {@link |
| <a href="STextProcessor.html"><b>STextProcessor</b></a>} |
| is the main tool for processing structured text. It facilitates |
| handling several types of structured text, each type being handled |
| by a specific |
| {@link <a href="custom/STextTypeHandler.html"> |
| <b><i>type handler</i></b></a>} :</p> |
| <ul> |
| <li>property (name=value)</li> |
| <li>compound name (xxx_yy_zzzz)</li> |
| <li>comma delimited list</li> |
| <li>system(user)</li> |
| <li>directory and file path</li> |
| <li>e-mail address</li> |
| <li>URL</li> |
| <li>regular expression</li> |
| <li>Xpath</li> |
| <li>Java code</li> |
| <li>SQL statements</li> |
| <li>RTL arithmetic expressions</li> |
| </ul> |
| <p> |
| For each of these types, an identifier is defined in |
| {@link <a href="STextTypeHandlerFactory.html"> |
| <b><i>STextTypeHandlerFactory</i></b></a>}. |
| These identifiers can be passed as an argument to some methods of |
| <b>STextProcessor</b> to specify the type of handler to apply. |
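| </p><p> |
| The idea of selecting a handler through an identifier can be sketched as |
| a simple lookup. The identifiers, separator sets, and class name below |
| are hypothetical and do not reproduce the constants or the behavior of |
| <b>STextTypeHandlerFactory</b>. |

```java
import java.util.HashMap;
import java.util.Map;

class TypeIdSketch {
    // Hypothetical identifiers, for illustration only.
    static final String PROPERTY = "property";
    static final String COMMA_DELIMITED = "comma";

    // Each identifier selects the separator set of one (hypothetical) handler.
    private static final Map<String, String> SEPARATORS = new HashMap<>();
    static {
        SEPARATORS.put(PROPERTY, "=");
        SEPARATORS.put(COMMA_DELIMITED, ",");
    }

    // Looks up the separators for a type identifier, rejecting unknown ones.
    static String separatorsFor(String typeId) {
        String separators = SEPARATORS.get(typeId);
        if (separators == null) {
            throw new IllegalArgumentException("Unknown text type: " + typeId);
        }
        return separators;
    }

    public static void main(String[] args) {
        System.out.println(separatorsFor(PROPERTY));
    }
}
```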
| </p><p> |
| The classes included in this package are intended for users who |
| need to process structured text in the most straightforward |
| manner, when the following conditions are satisfied: |
| <ul> |
| <li>There exists an appropriate handler for the type of the |
| structured text.</li> |
| <li>There is no need to specify non-default conditions related |
| to the {@link <a href="advanced/STextEnvironment.html"> |
| environment</a>}.</li> |
| <li>The only operations needed are to transform <i>lean</i> text |
| into <i>full</i> text or vice versa.</li> |
| <li>There is no interdependence between the processing of a |
| given string and the processing of preceding or |
| succeeding strings.</li> |
| </ul> |
| <p> |
| When their needs go beyond the conditions above, |
| users can use classes in the |
| {@link <a href="advanced/package-summary.html"> |
| org.eclipse.equinox.bidi.advanced</a>} package. |
| </p><p> |
| Developers who want to develop new handlers to support types of |
| structured text not currently supported can use components |
| of the package {@link <a href="custom/package-summary.html"> |
| org.eclipse.equinox.bidi.custom</a>}. |
| </p><p> |
| There are two other packages associated with the current one:</p> |
| <ul> |
| <li>{@link <a href="internal/package-summary.html"> |
| org.eclipse.equinox.bidi.internal</a>}</li> |
| <li>{@link <a href="internal/consumable/package-summary.html"> |
| org.eclipse.equinox.bidi.internal.consumable</a>}</li> |
| </ul> |
| <p> |
| The tools in the first package can serve as examples of how to develop |
| handlers for currently unsupported types of structured text.<br> |
| The latter package contains the handler classes implementing |
| the currently supported types of structured text. |
| </p><p> |
| However, users wishing to process the currently supported types of |
| structured text typically don't need to interact with these |
| two packages. |
| </p> |
| |
| <h2>Abbreviations used in the documentation of this package</h2> |
| |
| <dl> |
| <dt><b>UBA</b><dd>Unicode Bidirectional Algorithm |
| |
| <dt><b>Bidi</b><dd>Bidirectional |
| |
| <dt><b>GUI</b><dd>Graphical User Interface |
| |
| <dt><b>UI</b><dd>User Interface |
| |
| <dt><b>LTR</b><dd>Left to Right |
| |
| <dt><b>RTL</b><dd>Right to Left |
| |
| <dt><b>LRM</b><dd>Left-to-Right Mark |
| |
| <dt><b>RLM</b><dd>Right-to-Left Mark |
| |
| <dt><b>LRE</b><dd>Left-to-Right Embedding |
| |
| <dt><b>RLE</b><dd>Right-to-Left Embedding |
| |
| <dt><b>PDF</b><dd>Pop Directional Formatting |
| </dl> |
| |
| |
| <h2>Known Limitations</h2> |
| |
| <p>The proposed solution makes extensive use of the LRM, RLM, LRE, RLE |
| and PDF directional controls, which are invisible but affect the way bidi |
| text is displayed. The following related key points merit special |
| attention:</p> |
| |
| <ul> |
| <li>Implementations of the UBA on various platforms (e.g., Windows and |
| Linux) are very similar but nevertheless have known differences. Those |
| differences are minor and will not have a visible effect in most cases. |
| However, there might be cases in which the same bidi text looks |
| different on two platforms. These differences will surface in Java |
| applications when they use the platform visual components for their UI |
| (e.g., AWT, SWT).</li> |
| |
| <li>It is assumed that the presentation engine supports LRE, RLE and |
| PDF directional formatting characters.</li> |
| |
| <li>Because some presentation engines are not strictly conformant to the |
| UBA, the implementation of structured text in this package adds LRM |
| or RLM characters in association with LRE, RLE or PDF in cases where |
| this would not be needed if the presentation engine were fully conformant |
| to the UBA. Such added marks will not have harmful effects on |
| conformant presentation engines and will help less conformant engines to |
| achieve the desired presentation.</li> |
| </ul> |
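| <p> |
| The pairing of marks with embeddings described in the last point can be |
| sketched as follows. The exact placement of the marks is an assumption |
| made for illustration, and the class and method names are hypothetical. |
| </p> |

```java
class EmbeddingSketch {
    static final char LRM = '\u200e'; // Left-to-Right Mark
    static final char RLM = '\u200f'; // Right-to-Left Mark
    static final char LRE = '\u202a'; // Left-to-Right Embedding
    static final char RLE = '\u202b'; // Right-to-Left Embedding
    static final char PDF = '\u202c'; // Pop Directional Formatting

    // Wraps the text in an embedding and adds a mark of the same direction
    // after the opening control and after the closing PDF.
    // Assumption: this placement is illustrative, not the package's exact output.
    static String wrap(String text, boolean rtl) {
        char embed = rtl ? RLE : LRE;
        char mark = rtl ? RLM : LRM;
        return "" + embed + mark + text + PDF + mark;
    }

    public static void main(String[] args) {
        // Four invisible characters are added around the three visible ones.
        System.out.println(wrap("abc", false).length());
    }
}
```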
| |
| </body> |
| </html> |