How to Implement UTF-8 in Java Web Applications
Understanding the Problem
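UTF-8 is essential for supporting international characters in Java web applications, such as the Finnish letters ä and ö or Cyrillic text. For it to work, every layer (the servlet container, filters, JSP pages, HTML, the JDBC connection, and the database) has to agree on the encoding. A single layer left on a different default can garble the text.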
Crafting the Solution
To set up UTF-8 end to end, follow these steps:
1. Configure Tomcat's server.xml:
Configure the connector to decode URL (GET) parameters as UTF-8:
<Connector port="8080" ... URIEncoding="UTF-8"/>
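To see why the connector's decoding charset matters, here is a minimal sketch using only the JDK. The class name UriDecodingDemo and the sample value are made up for illustration, and the code only approximates what a container does with the raw bytes of a request URI. It requires Java 10 or later for URLDecoder.decode(String, Charset).

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Tomcat or of the original article.
final class UriDecodingDemo {

    // Decodes a percent-encoded query value with the given charset,
    // roughly as a servlet container decodes the request URI.
    static String decode(String rawValue, Charset charset) {
        return URLDecoder.decode(rawValue, charset);
    }

    public static void main(String[] args) {
        // The Finnish word "\u00e4iti", percent-encoded as UTF-8 by the client.
        String raw = "%C3%A4iti";

        // Decoded with the matching charset: the original text comes back.
        System.out.println(decode(raw, StandardCharsets.UTF_8));

        // Decoded as ISO-8859-1: each UTF-8 byte becomes its own character,
        // so the first letter turns into two wrong characters.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
    }
}
```

The second call is the typical symptom of a connector that still decodes URIs as ISO-8859-1 while clients send UTF-8.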
2. Create a CharsetFilter:
Define a filter to ensure all requests and responses are handled in UTF-8:
public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
        throws IOException, ServletException {
    // Set the default character encoding
    request.setCharacterEncoding("UTF-8");
    response.setContentType("text/html; charset=UTF-8");
    response.setCharacterEncoding("UTF-8");
    next.doFilter(request, response);
}
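The effect of the response charset can be illustrated without a servlet container. The sketch below is JDK-only, and the class name ResponseEncodingDemo and the helper toHex are invented for this example. It shows which bytes a string turns into under UTF-8 and under ISO-8859-1, which is roughly what happens when the response writer encodes its output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only mimics what a response writer does
// when it turns characters into bytes.
final class ResponseEncodingDemo {

    // Encodes the text with the given charset and returns the bytes as hex.
    static String toHex(String text, Charset charset) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String finnish = "\u00e4";  // the letter a with diaeresis
        String cyrillic = "\u0416"; // the Cyrillic letter Zhe

        // Both letters can be represented in UTF-8.
        System.out.println(toHex(finnish, StandardCharsets.UTF_8));
        System.out.println(toHex(cyrillic, StandardCharsets.UTF_8));

        // ISO-8859-1 has no Cyrillic letters, so String.getBytes substitutes
        // the charset's replacement byte, a question mark (0x3F).
        System.out.println(toHex(finnish, StandardCharsets.ISO_8859_1));
        System.out.println(toHex(cyrillic, StandardCharsets.ISO_8859_1));
    }
}
```

Question marks in place of Cyrillic text on a rendered page often point to this kind of lossy encoding on the response side.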
3. Add the Filter to web.xml:
<filter>
    <filter-name>CharsetFilter</filter-name>
    <filter-class>fi.foo.filters.CharsetFilter</filter-class>
    <init-param>
        <param-name>requestEncoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>CharsetFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
4. Set JSP Page Encoding:
In web.xml:
<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>
Alternatively, in each JSP page:
<%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%>
5. Specify HTML Meta Tags:
Ensure browsers understand the encoding of the HTML page:
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8' />
6. Configure JDBC Connection:
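Tell the MySQL JDBC driver to use UTF-8 on the connection. The Connector/J property is useUnicode, and the ampersand must be escaped as &amp; inside the XML attribute:
<Resource name="jdbc/AppDB" ... url="jdbc:mysql://localhost:3306/ID_development?useUnicode=true&amp;characterEncoding=UTF-8"/>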
7. Set Up MySQL Database and Tables:
Create the database and tables using UTF-8:
CREATE DATABASE `ID_development` ... COLLATE utf8_swedish_ci;
CREATE TABLE `Users` ... COLLATE utf8_swedish_ci;
8. Configure MySQL Server:
In my.ini or my.cnf, set the default character set:
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
9. Encode GET Requests Correctly:
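Once Tomcat is configured to decode GET parameters as UTF-8 (step 1), clients must send those parameters percent-encoded as UTF-8. Browsers typically encode form data using the encoding of the page that contains the form, which is one more reason to serve every page as UTF-8.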
Latin1 and UTF-8 in GET Requests:
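Historically, URL parameters were commonly treated as Latin1 (ISO-8859-1), and not every client encodes them the same way. The character "ä", for example, is sent as %E4 in Latin1 but as %C3%A4 in UTF-8. A webapp that decodes GET parameters as UTF-8 will misread Latin1-encoded requests, and vice versa, so the client and server settings have to match.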
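The difference between the two encodings can be reproduced with the JDK's URLEncoder. This is a small sketch rather than a description of how any particular browser works; the class name QueryEncodingDemo is made up, and URLEncoder.encode(String, Charset) requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing how one character is percent-encoded
// differently depending on the charset the client uses.
final class QueryEncodingDemo {

    // Percent-encodes a query parameter value with the given charset.
    static String encode(String value, Charset charset) {
        return URLEncoder.encode(value, charset);
    }

    public static void main(String[] args) {
        String value = "\u00e4"; // the letter a with diaeresis

        // One byte in Latin1, so one percent-escape.
        System.out.println(encode(value, StandardCharsets.ISO_8859_1));

        // Two bytes in UTF-8, so two percent-escapes.
        System.out.println(encode(value, StandardCharsets.UTF_8));
    }
}
```

A server that expects one form and receives the other cannot recover the original character, which is why the encoding has to be fixed on both ends.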