How to fix garbled Chinese characters in Java on Linux: 1. Download the JDK 1.8 source code for the sun packages so you can debug them; 2. Change the Font creation from a physical font to a logical font; 3. Restart the service.
Environment used for this article: Linux 5.9.8, JDK 1.8, Dell G3 computer.
How do you fix garbled Chinese characters in Java on Linux?
Fixing garbled Chinese characters in Java in a Linux environment
I expect many of you have run into garbled characters in Java. I recently fixed a case of "garbled Chinese and special characters when generating images from text". I spent a lot of time debugging the source code under sun.font and sun.awt before I finally understood the mechanism and solved the problem. I am writing down the process here as a record, so I don't run into it again.
The following is the code I want to execute (extremely simplified, but the meaning remains the same):
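import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class FontRenderDemo {
    public static void main(String[] args) throws IOException {
        File file = new File("test.png");
        // "宋体" (SimSun) is a physical font name; it only resolves if that font is installed
        Font font = new Font("宋体", Font.PLAIN, 10);
        BufferedImage bi = new BufferedImage(400, 200, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g2 = bi.createGraphics();
        // Fill the canvas with a white background
        g2.setBackground(Color.WHITE);
        g2.clearRect(0, 0, 400, 200);
        g2.setFont(font);
        g2.setColor(Color.BLACK);
        g2.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_ON);
        g2.drawString("为什么没有(ꐚꌒꑿꆺ)(ꐚꌒꑿꆺ)这名字特殊不?@¥¥¥ 为什么没有(ꐚꌒꑿꆺ)(ꐚꌒꑿꆺ)这名字特 ", 0, 10);
        g2.dispose();
        // The format name must be a string; the original passed an undefined PNG identifier
        ImageIO.write(bi, "png", file);
    }
}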
The goal, of course, is to open test.png and see the following:
Local debugging showed no problems, so I deployed the code to the test machine (Linux) and ran it there. The result was a mess:
In true programmer fashion: since there is a problem, just debug!
The catch is that the source package shipped with the JDK no longer contains the code of the sun packages!
Fortunately, it has been officially confirmed that the OpenJDK code is essentially consistent with the JDK source code, so you can download it directly from OpenJDK 8u: jdk8u
As for how to debug with the source code, I won't cover that here. It is basic enough that if you can't do it, this article is not for you.
I downloaded the source, attached remote breakpoints, and ran the server. While debugging, I first found the code that behaves differently on the local machine and on the test server:
It turns out that when the JVM creates a Font, it uses FontManagerFactory to obtain a FontManager, and different systems use different FontManagers. Mac uses CFontManager, while Linux uses X11FontManager.
So what are the differences between these two FontManagers?
CFontManager creates a CFont as the Font2D. CFont is a class written specifically for Mac. From the class and method comments you can see that in the Mac environment physical fonts are sometimes wrapped by CFont, and that this is done in native code:
The Font2D created by X11FontManager is a composite that contains logical fonts and physical fonts. X11FontManager extends FcFontManager, and FcFontManager extends SunFontManager. X11FontManager's loadFonts() method directly uses SunFontManager's loadFonts(), which loads the physical fonts. SunFontManager also implements FontManager's preferLocaleFonts() method, which loads the logical fonts:
At this point, debugging had basically confirmed that the problem is font loading that differs between environments. So what are the logical fonts and physical fonts I found when debugging in the Linux environment?
Physical fonts are actual font libraries that contain glyph data and tables that map character sequences to glyph sequences using font technologies such as TrueType or PostScript Type 1. All implementations of the Java Platform support TrueType fonts; support for other font technologies is implementation-dependent. Physical fonts can use font names such as Helvetica, Palatino, HonMincho, or any number of other font names. Typically, each physical font supports only a limited set of writing systems, for example, only Latin characters, or only Japanese and Basic Latin. The set of physical fonts available varies depending on the configuration. Applications that require specific fonts can bundle them and instantiate them using the createFont method.
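As a rough illustration of the createFont route mentioned above, here is a minimal sketch that loads a bundled TrueType file and falls back to the logical Serif font if the file is missing or cannot be parsed. The class name BundledFontLoader, the method loadOrFallback, and the path fonts/simsun.ttf are placeholders invented for this example, and the fallback policy is an assumption rather than something the original code did.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.File;
import java.io.IOException;

public class BundledFontLoader {

    /**
     * Loads a TrueType font from the given file, or returns the logical
     * Serif font when the file is missing or is not a usable font file.
     */
    static Font loadOrFallback(File ttfFile, float size) {
        Font fallback = new Font(Font.SERIF, Font.PLAIN, Math.round(size));
        if (!ttfFile.isFile()) {
            return fallback;
        }
        try {
            // createFont returns a 1-point font; deriveFont scales it to the wanted size.
            return Font.createFont(Font.TRUETYPE_FONT, ttfFile).deriveFont(size);
        } catch (FontFormatException | IOException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Hypothetical path: point it at a font file shipped with your application.
        Font font = loadOrFallback(new File("fonts/simsun.ttf"), 10f);
        System.out.println(font.getName());
    }
}
```

Bundling the font this way avoids depending on what is installed on the server, at the cost of shipping the font file with the application.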
Logical fonts are five font families defined by the Java platform that must be supported by all Java runtime environments: Serif, SansSerif, Monospaced, Dialog, and DialogInput. These logical fonts are not actual font libraries. Additionally, it is the Java runtime environment that maps logical font names to physical fonts. Mappings are implementation and generally locale dependent, so the appearance and specifications they provide vary. Typically, each logical font name is mapped to several physical fonts in order to cover the huge range of characters.
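To make the distinction concrete, the sketch below checks whether a requested name is one of the five logical families and prints the family the runtime actually resolved for it. The class LogicalFonts and the helper isLogical are names made up for this illustration, and the case-insensitive comparison is a choice of this sketch. On a machine without the requested physical font, getFamily() typically reports Dialog, which is the documented fallback of the Font constructor.

```java
import java.awt.Font;
import java.util.Arrays;
import java.util.List;

public class LogicalFonts {

    // The five logical families, using the constants defined on java.awt.Font.
    static final List<String> LOGICAL = Arrays.asList(
            Font.SERIF, Font.SANS_SERIF, Font.MONOSPACED, Font.DIALOG, Font.DIALOG_INPUT);

    /** Returns true when the name is one of the five logical families (case ignored in this sketch). */
    static boolean isLogical(String name) {
        for (String logical : LOGICAL) {
            if (logical.equalsIgnoreCase(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Serif", "宋体"}) {
            Font font = new Font(name, Font.PLAIN, 10);
            // getFamily() reports the family the runtime resolved, which may differ from the request.
            System.out.println(name + " logical=" + isLogical(name)
                    + " resolvedFamily=" + font.getFamily());
        }
    }
}
```

Running this on both the local machine and the Linux server is a quick way to see whether a physical font name silently resolved to something else.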
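I debugged a lot of source code, but this is the crux of the problem, so I won't paste the rest of the debugging details.
We have now confirmed that locally (in the Mac environment) native code wraps the physical font for us and converts it into a CFont for rendering, whereas in the Linux environment X11FontManager only loads the physical and logical fonts and leaves the choice to us. The first step of the fix is therefore obvious: change the Font creation from a physical font to a logical font.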
// Pick any of Serif, SansSerif, Monospaced, Dialog, or DialogInput
Font font = new Font("Serif", Font.PLAIN, 10);
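After this change I ran the code again, and the output was still garbled! More debugging showed that on Linux the physical fonts mapped to the logical font Serif included neither a Chinese font nor a font for the special symbols. That is easy to fix: install a Chinese font (simsun.ttf) on Linux, install a font that can display the special symbols "ꐚꌒꑿꆺ" (mysi.ttf), and also copy both fonts into the JDK's fonts directory (JAVA_HOME/jre/lib/fonts). The Linux font installation method is described later in the article.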
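Before installing fonts, it can help to find out exactly which characters the current font cannot display. The following is a minimal sketch of such a check built on Font.canDisplay(int). The class GlyphCoverage and the method missingCodePoints are names invented for this example, and passing the check in as an IntPredicate is a design choice of this sketch so the logic can be exercised without any fonts installed.

```java
import java.awt.Font;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.stream.Collectors;

public class GlyphCoverage {

    /** Returns the distinct non-whitespace code points of text that the predicate rejects. */
    static List<Integer> missingCodePoints(String text, IntPredicate canDisplay) {
        return text.codePoints()
                .distinct()
                .filter(cp -> !Character.isWhitespace(cp) && !canDisplay.test(cp))
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Font font = new Font(Font.SERIF, Font.PLAIN, 10);
        String text = "为什么没有(ꐚꌒꑿꆺ)这名字特殊不?@¥";
        // Ask the resolved font about every code point in the sample text.
        for (int cp : missingCodePoints(text, cp -> font.canDisplay(cp))) {
            System.out.printf("U+%04X %s%n", cp, new String(Character.toChars(cp)));
        }
    }
}
```

If the program prints nothing, the font reports a glyph for every character in the string; any code points it lists are the ones that need an additional font.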
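With those changes in place, I restarted the service and ran the code again, and everything displayed correctly. Time to celebrate!
The changes above are enough to fix the garbled Chinese and special characters. While debugging, however, I noticed that the JVM consults a configuration file when it loads logical fonts. The code is in sun.awt.FontConfiguration. This configuration class maps logical fonts to physical fonts and also guides SunFontManager in creating the logical fonts. The configuration file that FontConfiguration reads is fontconfig.properties, located in JAVA_HOME/jre/lib.
After some research, I found that the JVM loads font configuration files in the following order: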
JAVA_HOME/jre/lib/fontconfig.OS.Version.properties
JAVA_HOME/jre/lib/fontconfig.OS.Version.bfc
JAVA_HOME/jre/lib/fontconfig.OS.properties
JAVA_HOME/jre/lib/fontconfig.OS.bfc
JAVA_HOME/jre/lib/fontconfig.Version.properties
JAVA_HOME/jre/lib/fontconfig.Version.bfc
JAVA_HOME/jre/lib/fontconfig.properties
JAVA_HOME/jre/lib/fontconfig.bfc
OS is the operating system, for example Linux, CentOs, RedHat, and so on; Version is the version number.
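The lookup order listed above can be sketched as a small helper that builds the eight candidate paths and reports which ones exist. FontConfigLookup and candidates are names invented for this illustration, and "RedHat" and "8" are placeholder values for OS and Version. This only mirrors the order given in the list; it is not the JDK's own lookup code.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class FontConfigLookup {

    /** Builds the candidate file paths in the order listed above. */
    static List<String> candidates(String libDir, String os, String version) {
        String base = libDir + "/fontconfig";
        String[] stems = {
            base + "." + os + "." + version,
            base + "." + os,
            base + "." + version,
            base
        };
        List<String> result = new ArrayList<>();
        for (String stem : stems) {
            // For each stem the .properties file comes before the binary .bfc file.
            result.add(stem + ".properties");
            result.add(stem + ".bfc");
        }
        return result;
    }

    public static void main(String[] args) {
        // On JDK 8 the java.home property points at the jre directory, so this is JAVA_HOME/jre/lib.
        String libDir = System.getProperty("java.home") + "/lib";
        for (String path : candidates(libDir, "RedHat", "8")) {
            System.out.println(path + (new File(path).isFile() ? "  (present)" : ""));
        }
    }
}
```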
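In this configuration file you can change the mapping between logical fonts and physical fonts. In other words, you can manually choose the actual physical fonts used by the five logical fonts Serif, SansSerif, Monospaced, Dialog, and DialogInput in different scenarios.
For example, the configuration below makes the serif.plain logical font use simsun.ttf for Chinese and a font bundled with Java for Latin text: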
# @(#)linux.fontconfig.SuSE.properties 1.2 03/10/17
#
# Copyright 2003 Sun Microsystems, Inc. All rights reserved.
#

# Version
version=1

# Component Font Mappings
serif.plain.chinese=-misc-simsun-medium-r-normal--*-%d-*-*-c-*-iso10646-1
serif.plain.latin-1=-b&h-lucidabright-medium-r-normal--*-%d-*-*-p-*-iso8859-1

# Search Sequences
sequence.allfonts=latin-1,chinese

# Exclusion Ranges

# Font File Names
filename.-misc-simsun-medium-r-normal--*-%d-*-*-c-*-iso10646-1=/usr/share/fonts/myfonts/simsun.ttf
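To sanity-check a hand-edited file like the one above, one option is to read it with plain java.util.Properties and print which font file each component in the search sequence resolves to. This is only a sketch: FontConfigPeek, parse, and fileFor are names made up here, and the JDK's own parser in sun.awt.FontConfiguration does considerably more than this. A component without a filename entry is simply reported as null, which is not necessarily an error.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class FontConfigPeek {

    // The sample configuration from the article, without the comment lines.
    static final String SAMPLE = String.join("\n",
            "version=1",
            "serif.plain.chinese=-misc-simsun-medium-r-normal--*-%d-*-*-c-*-iso10646-1",
            "serif.plain.latin-1=-b&h-lucidabright-medium-r-normal--*-%d-*-*-p-*-iso8859-1",
            "sequence.allfonts=latin-1,chinese",
            "filename.-misc-simsun-medium-r-normal--*-%d-*-*-c-*-iso10646-1=/usr/share/fonts/myfonts/simsun.ttf");

    /** Parses the configuration text as an ordinary properties file. */
    static Properties parse(String config) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the font file mapped to a component key such as "serif.plain.chinese", or null. */
    static String fileFor(Properties props, String componentKey) {
        String xlfd = props.getProperty(componentKey);
        return xlfd == null ? null : props.getProperty("filename." + xlfd);
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        // Walk the scripts in search order and show the file behind each serif.plain component.
        for (String script : props.getProperty("sequence.allfonts").split(",")) {
            String key = "serif.plain." + script;
            System.out.println(key + " -> " + fileFor(props, key));
        }
    }
}
```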
PS: Nearly all of the operations above require root privileges.