Skip to main content
aboutsummaryrefslogtreecommitdiffstats
blob: 1b23975bc2feaddaed0d69976b3b7b5f3400f156 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<META name="Author" content="Matitiahu Allouche">
</head>
<body bgcolor="white">

This package provides classes for
processing structured text.
<p>
There are various types of structured text. Each type should
be handled by a specific <i>type handler</i>. A number of standard
type handlers are supplied in the associated package
<a href="internal/consumable/package-summary.html">
org.eclipse.equinox.bidi.internal.consumable</a>.

<h2>Introduction to Structured Text</h2>
<p>
Bidirectional text offers interesting challenges to presentation systems.
For plain text, the Unicode Bidirectional Algorithm
(<a href="http://www.unicode.org/reports/tr9/">UBA</a>)
generally specifies satisfactorily how to reorder bidirectional text for
display. This algorithm is implemented in many presentation and operating 
systems, like Java/Swing, Windows, Linux.
</p><p>
However, all bidirectional text is not necessarily plain text. There
are also instances of text structured to follow a given syntax, which
should be reflected in the display order. The general algorithm, which
has no awareness of these special cases, often gives incorrect results
when displaying such structured text.
</p><p>
The general idea in handling structured text in this package is to add
directional formatting characters at proper locations in the text to
supplement the standard algorithm, so that the final result is correctly
displayed using the UBA.
</p><p>
A class which handles structured text is thus essentially a
transformation engine which receives text without directional formatting
characters as input and produces as output the same text with added
directional formatting characters, hopefully in the minimum quantity
which is sufficient to ensure correct display, considering the type of
structured text involved.
</p><p>
In this package, text without directional formatting characters is
called <b><i>lean</i></b> text while the text with added directional
formatting characters is called <b><i>full</i></b> text.
</p><p>
The class {@link
<a href="STextProcessor.html"><b>STextProcessor</b></a>}
is the main tool for processing structured text.  It facilitates
handling several types of structured text, each type being handled
by a specific
{@link <a href="custom/STextTypeHandler.html">
<b><i>type handler</i></b></a>} :</p>
<ul>
  <li>property (name=value)</li>
  <li>compound name (xxx_yy_zzzz)</li>
  <li>comma delimited list</li>
  <li>system(user)</li>
  <li>directory and file path</li>
  <li>e-mail address</li>
  <li>URL</li>
  <li>regular expression</li>
  <li>Xpath</li>
  <li>Java code</li>
  <li>SQL statements</li>
  <li>RTL arithmetic expressions</li>
</ul>
<p>
For each of these types, an identifier is defined in
{@link <a href="STextTypeHandlerFactory.html">
<b><i>STextTypeHandlerFactory</i></b></a>} :</p>
These identifiers can be used as argument in some methods of
<b>STextProcessor</b> to specify the type of handler to apply.
</p><p>
The classes included in this package are intended for users who
need to process structured text in the most straightforward
manner, when the following conditions are satisfied:
<ul>
  <li>There exists an appropriate handler for the type of the
      structured text.</li>
  <li>There is no need to specify non-default conditions related
      to the {@link <a href="advanced/STextEnvironment.html">
      environment</a>}.</li>
  <li>The only operations needed are to transform <i>lean</i> text
      into <i>full</i> text or vice versa.</li>
  <li>There is no interdependence between the processing of a
      given string and the processing of preceding or
      succeeding strings.</li>
</ul>
<p>
When their needs go beyond the conditions above,
users can use classes in the
{@link <a href="advanced/package-summary.html">
org.eclipse.equinox.bidi.advanced</a>} package.
</p><p>
Developers who want to develop new handlers to support types of
structured text not currently supported can use components
of the package {@link <a href="custom/package-summary.html">
org.eclipse.equinox.bidi.custom</a>}.
</p><p>
There are two other packages associated with the current one:</p>
<ul>
  <li>{@link <a href="internal/package-summary.html">
      org.eclipse.equinox.bidi.internal</a>}</li>
  <li>{@link <a href="internal/consumable/package-summary.html">
       org.eclipse.equinox.bidi.internal.consumable</a>}</li>
</ul>
<p>
The tools in the first package can serve as example of how to develop
processors for currently unsupported types of structured text.<br>
The latter package contains classes for the processors implementing
the currently supported types of structured text.
</p><p>
However, users wishing to process the currently supported types of
structured text typically don't need to interact with these
two packages.
</p>

<h2>Abbreviations used in the documentation of this package</h2>

<dl>
<dt><b>UBA</b><dd>Unicode Bidirectional Algorithm

<dt><b>Bidi</b><dd>Bidirectional

<dt><b>GUI</b><dd>Graphical User Interface

<dt><b>UI</b><dd>User Interface

<dt><b>LTR</b><dd>Left to Right

<dt><b>RTL</b><dd>Right to Left

<dt><b>LRM</b><dd>Left-to-Right Mark

<dt><b>RLM</b><dd>Right-to-Left Mark

<dt><b>LRE</b><dd>Left-to-Right Embedding

<dt><b>RLE</b><dd>Right-to-Left Embedding

<dt><b>PDF</b><dd>Pop Directional Formatting
</dl>

<p>&nbsp;</p>

<h2>Known Limitations</h2>

<p>The proposed solution is making extensive usage of LRM, RLM, LRE, RLE
and PDF directional controls which are invisible but affect the way bidi
text is displayed. The following related key points merit special
attention:</p>

<ul>
<li>Implementations of the UBA on various platforms (e.g., Windows and
Linux) are very similar but nevertheless have known differences. Those
differences are minor and will not have a visible effect in most cases.
However there might be cases in which the same bidi text on two
platforms will look different. These differences will surface in Java
applications when they use the platform visual components for their UI
(e.g., AWT, SWT).</li>

<li>It is assumed that the presentation engine supports LRE, RLE and
PDF directional formatting characters.</li>

<li>Because some presentation engines are not strictly conformant to the
UBA, the implementation of structured text in this package adds LRM
or RLM characters in association with LRE, RLE or PDF in cases where
this would not be needed if the presentation engine was fully conformant
to the UBA. Such added marks will not have harmful effects on
conformant presentation engines and will help less conformant engines to
achieve the desired presentation.</li>
</ul>

</body>
</html>

Back to the top