How to Enable UTF-8 in Java Web Applications
Overview
To handle text such as Finnish (äöå) and Cyrillic (ЦжФ) correctly, every layer of a Java web application has to agree on UTF-8. This article lists the settings needed in Tomcat, JSP and HTML pages, the JDBC connection, and MySQL so that characters survive the whole round trip.
Tomcat Configuration
Configure server.xml for UTF-8 Encoding:
<Connector URIEncoding="UTF-8" ... />
Add CharsetFilter:
public class CharsetFilter implements Filter {
    ...
    if (null == request.getCharacterEncoding()) {
        request.setCharacterEncoding("UTF-8");
    }
    ...
}
Add CharsetFilter to web.xml:
<filter>
    <filter-name>CharsetFilter</filter-name>
    <filter-class>fi.foo.filters.CharsetFilter</filter-class>
    ...
</filter>
<filter-mapping>
    <filter-name>CharsetFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
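To see what the filter's null check guards against, here is a minimal sketch that uses only the standard library, not the Servlet API. The class and method names (BodyDecodingDemo, effectiveEncoding, decodeBody) are hypothetical, and the ISO-8859-1 fallback is a simplified model of the Servlet specification's historical default for request bodies that declare no encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BodyDecodingDemo {

    // Mirrors the filter's rule: keep a declared encoding, otherwise use UTF-8.
    static String effectiveEncoding(String declaredEncoding) {
        return (declaredEncoding == null) ? "UTF-8" : declaredEncoding;
    }

    // Simplified model of a container decoding a request body.
    // Assumption: with no encoding set, the container falls back to ISO-8859-1.
    static String decodeBody(byte[] body, String encoding) {
        Charset charset = (encoding == null)
                ? StandardCharsets.ISO_8859_1
                : Charset.forName(encoding);
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // The browser sends the Cyrillic sample text as UTF-8 bytes.
        byte[] body = "\u0426\u0436\u0424".getBytes(StandardCharsets.UTF_8);

        // Without the filter: no encoding declared, so the text is garbled.
        System.out.println(decodeBody(body, null));

        // With the filter: the missing encoding is replaced by UTF-8.
        System.out.println(decodeBody(body, effectiveEncoding(null)));
    }
}
```

Because the filter only fills in a missing encoding, a client that explicitly declares a different charset is still respected.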
JSP and HTML
Configure web.xml for JSP Encoding:
<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>
Declare Page Encoding in JSP:
<%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%>
Add HTML Meta Tag:
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8' />
JDBC Connection
Configure JDBC Datasource with UTF-8 Encoding:
<Resource ... url="jdbc:mysql://...?useUnicode=true&amp;characterEncoding=UTF-8" ... />
MySQL Configuration
Create UTF-8 Database:
CREATE DATABASE ... CHARSET=utf8 ...
Create UTF-8 Tables:
CREATE TABLE ... CHARSET=utf8 COLLATE=utf8_swedish_ci ...
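Configure MySQL for UTF-8 in my.cnf (Linux) or my.ini (Windows). The [mysql] section below sets the default for the mysql command-line client; the server's own default is controlled by character-set-server in the [mysqld] section:
[mysql]
default-character-set=utf8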
Functions and Procedures
Declare Functions and Procedures with UTF-8 Character Set:
CREATE FUNCTION `pathToNode` RETURNS TEXT CHARACTER SET utf8 ...
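Handling GET Requests
GET parameters travel in the URL, so Tomcat decodes them according to the connector's URIEncoding attribute. By default, request.setCharacterEncoding() in the filter affects only the request body, which is why both settings are needed.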
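The effect of the URI encoding on GET parameters can be illustrated with a small standard-library sketch. It is not Tomcat's actual parsing code; QueryDecodingDemo and decode are hypothetical names, and the ISO-8859-1 case stands in for a container that assumes Latin-1 for URLs, as older Tomcat versions did by default.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryDecodingDemo {

    // Decode a percent-encoded query value with the named charset,
    // roughly what a container does when it parses the query string.
    static String decode(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "äö" as a browser sends it when the page is UTF-8.
        String raw = "%C3%A4%C3%B6";

        // Matching charset: the original two letters come back.
        System.out.println(decode(raw, "UTF-8"));

        // Mismatched charset: each letter turns into two wrong characters.
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

The same bytes produce different strings depending only on the charset the decoder assumes, which is the mismatch that URIEncoding="UTF-8" is meant to remove.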
Important Note
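MySQL's utf8 character set stores at most 3 bytes per character, so it covers only the Basic Multilingual Plane. Characters that need 4 bytes in UTF-8, such as emoji, cannot be stored in it. For those, use utf8mb4 (requires MySQL 5.5.3 or later) or VARBINARY columns.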
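A quick way to check whether a given string would fit into the 3-byte utf8 character set is to look at its code points. The sketch below uses only the standard library; Utf8WidthDemo, utf8Length, and fitsInThreeByteUtf8 are hypothetical helper names, not part of any MySQL or JDBC API.

```java
import java.nio.charset.StandardCharsets;

public class Utf8WidthDemo {

    // Number of bytes the whole string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if every code point is in the Basic Multilingual Plane,
    // i.e. needs at most 3 bytes in UTF-8.
    static boolean fitsInThreeByteUtf8(String s) {
        return s.codePoints().allMatch(cp -> cp <= 0xFFFF);
    }

    public static void main(String[] args) {
        String finnish = "\u00e4";        // a with diaeresis
        String cyrillic = "\u0426";       // Cyrillic capital Tse
        String euro = "\u20ac";           // euro sign
        String emoji = "\uD83D\uDE00";    // U+1F600, a surrogate pair in Java

        for (String s : new String[] {finnish, cyrillic, euro, emoji}) {
            System.out.println(utf8Length(s) + " bytes, fits in utf8: "
                    + fitsInThreeByteUtf8(s));
        }
    }
}
```

The Finnish and Cyrillic samples from the overview take 2 bytes each and fit comfortably; only code points above U+FFFF need utf8mb4.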
Tomcat with Apache
If Apache httpd forwards requests to Tomcat through the mod_jk connector:
Set URIEncoding on the AJP connector that mod_jk talks to in Tomcat's server.xml:
<Connector ... URIEncoding="UTF-8" ... />
Set Apache Default Charset:
AddDefaultCharset utf-8