Getting UTF-8 Encoding in Java Webapps
Problem: Getting UTF-8 to work end to end in a Java webapp, so that non-Latin characters in regular text and in specific alphabets are read, stored and displayed correctly.
Environment:
- Development: Windows XP
- Production: Debian
- Database: MySQL 5.x
- Browsers: Firefox 2 and 3, Opera 9.x, IE7, Google Chrome
Solution:
Configure Tomcat's server.xml:
- Enable UTF-8 decoding of GET parameters by setting URIEncoding="UTF-8" on the <Connector> element.
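CharsetFilter:
- Add a servlet filter, mapped to all requests in web.xml, that calls request.setCharacterEncoding("UTF-8") before any parameter is read and sets the response encoding to UTF-8.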
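JSP Page Encoding:
- Specify the encoding for JSP pages in web.xml (a <jsp-property-group> with <page-encoding>UTF-8</page-encoding>), or add a page directive such as <%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %> to the top of each page. A <meta> Content-Type tag in the HTML head tells the browser which encoding to use, but on its own it does not change how the container reads the JSP source.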
JDBC Connection:
- Use ?useUnicode=true&characterEncoding=UTF-8 in the connection URL (the Connector/J property is useUnicode, not useEncoding).
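As a rough illustration of the JDBC step, the sketch below only assembles the connection URL with the two encoding properties documented for the Connector/J 5.x drivers of this era (useUnicode and characterEncoding). The class name, the helper name, and the host, port and database values are placeholders, not taken from the original setup.

```java
// Illustrative sketch: JdbcUrlDemo, utf8Url and the sample values are hypothetical.
final class JdbcUrlDemo {

    // Builds a MySQL JDBC URL that asks the driver to talk UTF-8 to the server.
    static String utf8Url(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        String url = utf8Url("localhost", 3306, "mydb");
        System.out.println(url);
        // The URL would then be passed to
        // java.sql.DriverManager.getConnection(url, user, password),
        // which needs the MySQL driver on the classpath.
    }
}
```

When the URL is written inside an XML file such as a Tomcat context.xml, the & has to be escaped as &amp;.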
MySQL Database and Tables:
- Create the database and its tables with DEFAULT CHARACTER SET=utf8 COLLATE=utf8_swedish_ci.
MySQL Server Configuration:
- Set default-character-set=utf8 in my.ini (Windows) or my.cnf (Linux). From MySQL 5.5 on, the server-side option in the [mysqld] section is character-set-server=utf8.
MySQL Procedures and Functions:
- Specify the UTF-8 character set explicitly, e.g.:
CREATE FUNCTION ... RETURNS TEXT CHARACTER SET utf8
Handling GET Requests:
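- By default, older Tomcat versions (before Tomcat 8) decode the URL, and therefore GET parameters, as Latin1 (ISO-8859-1), which garbles non-ASCII characters that the browser sent as UTF-8.
- To address this, set URIEncoding="UTF-8" on the connector in server.xml.
- Instruct browsers to read pages in UTF-8 using meta tags and the Content-Type response header.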
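The mismatch described above can be reproduced with the JDK alone. The sketch below percent-encodes a character as UTF-8, as a browser would on a UTF-8 page, and then decodes it once as ISO-8859-1 and once as UTF-8. The class and method names are made up for the illustration; Tomcat's own decoding code is not shown.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustrative sketch: GetEncodingDemo and its helpers are hypothetical names.
final class GetEncodingDemo {

    // Percent-encodes a value as UTF-8 bytes.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Percent-decodes a value using the charset the server is configured with.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sent = encodeUtf8("\u00e4"); // a-umlaut becomes "%C3%A4"
        System.out.println(sent);

        // Decoded as Latin1, the two UTF-8 bytes turn into two separate characters.
        System.out.println(decode(sent, "ISO-8859-1").length()); // 2

        // Decoded as UTF-8, the original single character comes back.
        System.out.println(decode(sent, "UTF-8").equals("\u00e4")); // true
    }
}
```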
UTF-8 vs. Latin1 in GET Requests:
- Browsers encode POST form data in the encoding of the page, so a page served as UTF-8 produces UTF-8 POST bodies.
- For GET requests, even when the page is defined as UTF-8, some characters may still be percent-encoded in Latin1. The result is mixed encoding within the same application, which makes it difficult for the webapp to decode request parameters correctly.
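To make the mixed-encoding problem concrete, the sketch below feeds the same character, percent-encoded once as Latin1 and once as UTF-8, through a single UTF-8 decoder. Only the UTF-8 form survives, which is why a server-side setting alone cannot fix parameters that arrive in two encodings. The class name and sample values are assumptions made for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustrative sketch: MixedEncodingDemo is a hypothetical name.
final class MixedEncodingDemo {

    // Percent-decodes a value using the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String latin1Param = "%E4";   // a-umlaut percent-encoded as Latin1
        String utf8Param = "%C3%A4";  // a-umlaut percent-encoded as UTF-8

        // A server decoding everything as UTF-8 recovers only the UTF-8 form.
        System.out.println(decode(utf8Param, "UTF-8").equals("\u00e4"));   // true
        System.out.println(decode(latin1Param, "UTF-8").equals("\u00e4")); // false

        // The Latin1 form is only readable with a Latin1 decoder.
        System.out.println(decode(latin1Param, "ISO-8859-1").equals("\u00e4")); // true
    }
}
```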
References:
- http://tagunov.tripod.com/i18n/i18n.html
- http://wiki.apache.org/tomcat/Tomcat/UTF-8
- http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
- http://dev.mysql.com/doc/refman/5.0/en/charset-syntax.html
- http://cagan327.blogspot.com/2006/05/utf-8-encoding-fix-tomcat-jsp-etc.html
- http://cagan327.blogspot.com/2006/05/utf-8-encoding-fix-for-mysql-tomcat.html
- http://jeppesn.dk/utf-8.html
- http://www.nabble.com/request-parameters-mishandle-utf-8-encoding-td18720039.html
- http://www.utoronto.ca/webdocs/HTMLdocs/NewHTML/iso_table.html
- http://www.utf8-chartable.de/