author | Leo Ufimtsev | 2018-06-29 14:13:22 +0000 |
---|---|---|
committer | Leo Ufimtsev | 2018-06-29 15:52:20 +0000 |
commit | 10147cdc9461fbde754a8d336f6c426653ca43c0 (patch) | |
tree | a528055b42cbe4eb8ec1b4387e45e4b128e39239 /bundles/org.eclipse.swt | |
parent | f5334b55c7742348890cf84988d15c5d6cb8c5dd (diff) | |
Bug 535392 – [Webkit2] Browser.getText() returns wrong decoding when
setText() contains utf (code point >127) characters
Removing some unicode characters due to compile issues on windows.
Bug: https://bugs.eclipse.org/bugs/show_bug.cgi?id=5353922127
Change-Id: I77ef2f4a7803fc5145ac0533e2f83832888e19fb
Signed-off-by: Leo Ufimtsev <lufimtse@redhat.com>
Diffstat (limited to 'bundles/org.eclipse.swt')
-rw-r--r-- | bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java b/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java
index 1e5c67ef7e..0ba3456b7f 100644
--- a/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java
+++ b/bundles/org.eclipse.swt/Eclipse SWT/gtk/org/eclipse/swt/internal/Converter.java
@@ -274,7 +274,7 @@ public static String byteToStringViaHeuristic(byte [] bytes) {
  * a byte sequence with many null bytes is likely UTF-16.
  * - Valid UTF-8 technically can contain null bytes, but it's rare.
  *
- * Some times it can get confused if it receives two non-null bytes. e.g Ё = (UTF-16 [01,04])
+ * Some times it can get confused if it receives two non-null bytes. e.g (E with two dots on top) = (UTF-16 [01,04])
  * It can either mean a valid set of UTF-8 characters or a single UTF-16 character.
  * This issue typically only occurs for very short sequences 1-5 characters of very special characters).
  * Improving the heuristic for such corner cases is complicated. We'd have to implement a mechanism
@@ -333,7 +333,7 @@ public static String byteToStringViaHeuristic(byte [] bytes) {
  }
 
  // Problem 3: Short 2-byte sequences are very ambiguous.
- // E.g Unicode Hyphen U+2010 '‐' ( which btw different from the ascii U+002D '-' Hyphen-Minus)
+ // E.g Unicode Hyphen U+2010 (which looks like a '-') ( which btw different from the ascii U+002D '-' Hyphen-Minus)
  // can be miss-understood as 16 (Synchronous Idle) & 32 (Space).
  // Solution: Unless we have two valid alphabet characters, it's probably a single utf-16 character.
  // However, this leads to the problem that single non-alphabetic unicode characters are not recognized correctly.
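The two-byte ambiguity described in the comments above can be reproduced with a minimal sketch. It is not the code in Converter.java: the class and method names are hypothetical, and it assumes the bytes arrive in little-endian order, which is how the [01,04] pair quoted in the comment maps to U+0401. The same two bytes decode without error under both encodings, so a heuristic has to guess.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of SWT.
public class AmbiguousBytesDemo {

    // Interpret the bytes as UTF-8 (each byte below 0x80 is one character).
    static String asUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Interpret the bytes as UTF-16, little-endian (assumption: low byte first).
    static String asUtf16le(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    // Print the code points of a string in U+XXXX notation.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] hyphen = {0x10, 0x20}; // U+2010 in UTF-16LE, or bytes 16 and 32 in UTF-8
        byte[] io = {0x01, 0x04};     // U+0401 in UTF-16LE, or bytes 1 and 4 in UTF-8

        System.out.println("UTF-16LE: " + codePoints(asUtf16le(hyphen)));
        System.out.println("UTF-8:    " + codePoints(asUtf8(hyphen)));
        System.out.println("UTF-16LE: " + codePoints(asUtf16le(io)));
        System.out.println("UTF-8:    " + codePoints(asUtf8(io)));
    }
}
```

Both readings are valid byte sequences, which is why the commit's comments treat short inputs as a corner case rather than something the decoder can detect on its own.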