author Leo Ufimtsev 2018-06-29 14:13:22 +0000
committer Leo Ufimtsev 2018-06-29 15:52:20 +0000
commit 10147cdc9461fbde754a8d336f6c426653ca43c0
tree a528055b42cbe4eb8ec1b4387e45e4b128e39239 /bundles/org.eclipse.swt
parent f5334b55c7742348890cf84988d15c5d6cb8c5dd
Bug 535392 – [Webkit2] Browser.getText() returns wrong decoding when setText() contains utf (code point >127) characters

Removing some unicode characters due to compile issues on windows.

Bug: https://bugs.eclipse.org/bugs/show_bug.cgi?id=5353922127
Change-Id: I77ef2f4a7803fc5145ac0533e2f83832888e19fb
Signed-off-by: Leo Ufimtsev <lufimtse@redhat.com>
Diffstat (limited to 'bundles/org.eclipse.swt')
-rw-r--r-- bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java | 4
1 files changed, 2 insertions, 2 deletions
diff --git a/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java b/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java
index 1e5c67ef7e..0ba3456b7f 100644
--- a/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java
+++ b/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java
@@ -274,7 +274,7 @@ public static String byteToStringViaHeuristic(byte [] bytes) {
* a byte sequence with many null bytes is likely UTF-16.
* - Valid UTF-8 technically can contain null bytes, but it's rare.
*
- * Some times it can get confused if it receives two non-null bytes. e.g Ё = (UTF-16 [01,04])
+ * Some times it can get confused if it receives two non-null bytes. e.g (E with two dots on top) = (UTF-16 [01,04])
* It can either mean a valid set of UTF-8 characters or a single UTF-16 character.
* This issue typically only occurs for very short sequences 1-5 characters of very special characters).
* Improving the heuristic for such corner cases is complicated. We'd have to implement a mechanism
@@ -333,7 +333,7 @@ public static String byteToStringViaHeuristic(byte [] bytes) {
}
// Problem 3: Short 2-byte sequences are very ambiguous.
- // E.g Unicode Hyphen U+2010 '‐' ( which btw different from the ascii U+002D '-' Hyphen-Minus)
+ // E.g Unicode Hyphen U+2010 (which looks like a '-') ( which btw different from the ascii U+002D '-' Hyphen-Minus)
// can be miss-understood as 16 (Synchronous Idle) & 32 (Space).
// Solution: Unless we have two valid alphabet characters, it's probably a single utf-16 character.
// However, this leads to the problem that single non-alphabetic unicode characters are not recognized correctly.
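
The ambiguity described in the comments above can be reproduced with a minimal sketch. It is not the code in Converter.java; the class name AmbiguousBytes and the helpers describe and countNulls are hypothetical and exist only for illustration. It decodes the two byte pairs mentioned in the comments ([01,04] and [16,32]) both as UTF-8 and as UTF-16LE, and counts null bytes in a UTF-16LE encoded ASCII string.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of SWT.
class AmbiguousBytes {

    // Render each UTF-16 code unit of a string as U+XXXX so that
    // control characters stay visible.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Count zero bytes; a high count hints at UTF-16 encoded ASCII text.
    static int countNulls(byte[] bytes) {
        int n = 0;
        for (byte b : bytes) {
            if (b == 0) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        byte[][] samples = {
            {0x01, 0x04}, // U+0401 in UTF-16LE, or two control characters in UTF-8
            {0x10, 0x20}, // U+2010 in UTF-16LE, or a control character plus a space in UTF-8
        };
        for (byte[] bytes : samples) {
            String asUtf8 = new String(bytes, StandardCharsets.UTF_8);
            String asUtf16 = new String(bytes, StandardCharsets.UTF_16LE);
            System.out.println("UTF-8: " + describe(asUtf8)
                    + "  UTF-16LE: " + describe(asUtf16));
        }
        // ASCII text encoded as UTF-16LE carries one null byte per character.
        byte[] abc = "abc".getBytes(StandardCharsets.UTF_16LE);
        System.out.println("null bytes in UTF-16LE \"abc\": " + countNulls(abc));
    }
}
```

Both byte pairs are valid in either encoding, so a two-byte input carries no information that settles the question; only longer inputs, where null bytes start to show up, give a heuristic something to work with.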
