How to Unescape HTML Character Entities in Java
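In Java, HTML character entities can be unescaped with the StringEscapeUtils class from Apache Commons. The class currently lives in Apache Commons Text (org.apache.commons.text.StringEscapeUtils); the older copy in Apache Commons Lang (org.apache.commons.lang3.StringEscapeUtils) has been deprecated since Lang 3.6. Its unescapeHtml4() method is the equivalent of .NET's HttpUtility.HtmlDecode method.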
This method takes a string containing HTML entity escapes and returns a string containing the corresponding Unicode characters. It recognizes the named entities defined by HTML 4.0 as well as numeric character references; an entity it does not recognize is left in the result unchanged.
For instance, browsers render the HTML character entity "&nbsp;" as a non-breaking space (U+00A0). StringEscapeUtils.unescapeHtml4() converts a string containing "&nbsp;" into a string containing that character. Similarly, "&gt;" is converted to the greater-than sign ">".
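To make the conversion concrete, here is a minimal JDK-only sketch of what unescaping does. It is not the Apache Commons implementation: the class name HtmlUnescapeSketch, the five-entry entity table, and the regular expression are all assumptions made for illustration, and the real library covers the full HTML 4.0 entity set. With Commons Text on the classpath, the single call StringEscapeUtils.unescapeHtml4(input) would be used instead.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: NOT the Apache Commons implementation.
class HtmlUnescapeSketch {

    // A tiny, hypothetical subset of the HTML 4 named entities.
    private static final Map<String, String> NAMED = Map.of(
            "nbsp", "\u00A0",
            "lt", "<",
            "gt", ">",
            "amp", "&",
            "quot", "\"");

    // Matches &name;, decimal references (&#65;) and hexadecimal references (&#x41;).
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);");

    static String unescape(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String decoded = decode(m.group(1));
            // Entities this sketch does not know are copied through unchanged.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(decoded != null ? decoded : m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded text, or null if the entity is not recognized.
    private static String decode(String body) {
        if (body.charAt(0) != '#') {
            return NAMED.get(body);
        }
        boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
        try {
            int codePoint = Integer.parseInt(body.substring(hex ? 2 : 1), hex ? 16 : 10);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
        } catch (NumberFormatException e) {
            return null; // too many digits to be a code point
        }
    }

    public static void main(String[] args) {
        System.out.println(unescape("1 &lt; 2 &amp;&amp; 3 &gt; 2")); // prints: 1 < 2 && 3 > 2
        System.out.println(unescape("&#65;&#x42;&unknown;"));         // prints: AB&unknown;
    }
}
```

The sketch decodes each entity in a single pass and never rescans its own output, so an input such as "&amp;lt;" yields the literal text "&lt;" rather than "<".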