JEP 112: Charset Implementation Improvements
|Discussion||core dash libs dash dev at openjdk dot java dot net|
|Endorsed by||Brian Goetz|
|Relates to||6653797: Reimplement JDK charset repository charsets.jar|
|7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])|
Improve the maintainability and performance of the standard and extended charset implementations.
Decrease the size of installed charsets
Reduce maintenance cost by generating charset implementations at build time from simple text-based mapping tables
Improve the performance of encoding/decoding
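The second goal, generating charset code from text mapping tables, can be pictured with a minimal sketch. It handles only the simplest single-byte case. The table layout (two hex columns, `#` comments) and the class name `MappingTableSketch` are assumptions made for illustration; they are not the format or tooling used by the JDK build.

```java
import java.util.Arrays;

/** Hypothetical sketch: turn a text mapping table into a byte-to-char lookup array. */
final class MappingTableSketch {

    /** Marker for byte values that the table leaves unmapped. */
    static final char UNMAPPABLE = '\uFFFD';

    /**
     * Parses lines of the assumed form "0xBB 0xCCCC" (byte value, then
     * Unicode code point). Blank lines and text after '#' are ignored.
     */
    static char[] parseSingleByteTable(String table) {
        char[] b2c = new char[256];
        Arrays.fill(b2c, UNMAPPABLE);
        for (String line : table.split("\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);   // drop trailing comment
            }
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] cols = line.split("\\s+");
            int b = Integer.decode(cols[0]);      // accepts the 0x prefix
            int cp = Integer.decode(cols[1]);
            b2c[b] = (char) cp;
        }
        return b2c;
    }

    /** Decodes bytes with the generated table, one array lookup per byte. */
    static String decode(byte[] src, char[] b2c) {
        char[] dst = new char[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = b2c[src[i] & 0xFF];
        }
        return new String(dst);
    }

    public static void main(String[] args) {
        String table = String.join("\n",
            "# byte  unicode",
            "0x41    0x0041   # LATIN CAPITAL LETTER A",
            "0xA4    0x20AC   # EURO SIGN");
        char[] b2c = parseSingleByteTable(table);
        System.out.println(decode(new byte[] {0x41, (byte) 0xA4}, b2c));
    }
}
```

With this split, the mapping data lives in a reviewable text file and the lookup code is derived from it, which is the maintenance benefit the goal describes.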
This is the second part of the sun.nio.cs/ext re-implementation project. In JDK 7 most of the charsets (80%+) were re-implemented to achieve better maintainability and performance. This JEP continues that work to:
Re-implement the remaining charsets, mainly the JIS_X_0208/0212-based Japanese charsets and a couple of IBM double-byte charsets such as IBM964 and IBM33722.
Implement the sun.nio.cs.ArrayDecoder/Encoder API for the most frequently used double-byte charsets to enhance the performance of String.getBytes()/new String(byte[]) (see 7183053).
Improve the start-up/access performance of the standard and extended charset providers.
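The idea behind array-based decoding for double-byte charsets can be sketched as follows. This is a toy illustration, not the internal sun.nio.cs interface: the class `ArrayDecodeSketch`, its `map` helper, and the single table entry are hypothetical, and the lead/trail-byte layout is a simplifying assumption. What it shows is decoding straight from a byte[] into a char[] through lookup tables, without wrapping the data in ByteBuffer/CharBuffer objects, which is the kind of path that String.getBytes() and new String(byte[]) could benefit from.

```java
import java.util.Arrays;

/** Hypothetical sketch of array-to-array decoding for a double-byte charset. */
final class ArrayDecodeSketch {

    static final char REPLACEMENT = '\uFFFD';

    /** One row per lead byte; a null row means "not a valid lead byte". */
    private final char[][] b2c = new char[256][];

    /** Registers one illustrative double-byte mapping (lead, trail) -> c. */
    void map(int lead, int trail, char c) {
        if (b2c[lead] == null) {
            b2c[lead] = new char[256];
            Arrays.fill(b2c[lead], REPLACEMENT);
        }
        b2c[lead][trail] = c;
    }

    /**
     * Decodes src[off .. off+len) directly into dst and returns the number
     * of chars written. dst needs room for at most len chars.
     */
    int decode(byte[] src, int off, int len, char[] dst) {
        int dp = 0;
        int end = off + len;
        while (off < end) {
            int b1 = src[off++] & 0xFF;
            if (b1 < 0x80) {                      // single-byte ASCII range
                dst[dp++] = (char) b1;
            } else if (b2c[b1] != null && off < end) {
                int b2 = src[off++] & 0xFF;       // trail byte
                dst[dp++] = b2c[b1][b2];
            } else {                              // bad lead byte or truncated input
                dst[dp++] = REPLACEMENT;
            }
        }
        return dp;
    }

    public static void main(String[] args) {
        ArrayDecodeSketch dec = new ArrayDecodeSketch();
        dec.map(0x88, 0x9F, '\u4E9C');            // illustrative entry only
        byte[] in = {0x41, (byte) 0x88, (byte) 0x9F, 0x42};
        char[] out = new char[in.length];
        int n = dec.decode(in, 0, in.length, out);
        System.out.println(new String(out, 0, n));
    }
}
```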
The new implementations must be completely compatible, for each and every code point, with the existing implementations. New automated unit tests, running under the current test framework, will be written to guarantee correctness.
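A per-code-point compatibility check of this kind might look like the sketch below. The class `CharsetCompatSketch` and its method are hypothetical, and how the old and new implementations are both made available to one test run is left open; here they are simply passed in as two Charset objects. The `main` method compares standard charsets only as a smoke test.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch of an exhaustive old-versus-new charset comparison. */
final class CharsetCompatSketch {

    /**
     * Counts code points for which the two charsets disagree, either in the
     * bytes they produce or in how they decode the old charset's bytes.
     */
    static int countMismatches(Charset oldCs, Charset newCs) {
        int mismatches = 0;
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                continue; // skip lone surrogates, which are not valid characters
            }
            String s = new String(Character.toChars(cp));
            byte[] oldBytes = s.getBytes(oldCs);
            byte[] newBytes = s.getBytes(newCs);
            boolean sameEncoding = Arrays.equals(oldBytes, newBytes);
            boolean sameDecoding =
                new String(oldBytes, oldCs).equals(new String(oldBytes, newCs));
            if (!sameEncoding || !sameDecoding) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // A charset compared with itself should report no mismatches.
        System.out.println(countMismatches(
            StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));
        // Two different charsets should report a non-zero count.
        System.out.println(countMismatches(
            StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1));
    }
}
```

A fuller test would also feed every one- and two-byte input sequence through both decoders, since a code-point-driven loop only exercises byte sequences that the encoder can produce.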