Setting User Agent of a Java URLConnection
When you fetch a webpage in Java with URLConnection and set the User-Agent header to a specific value, older Java versions may append an extra token such as "Java/1.5.0_19" to the end of it. This is a limitation of those older versions.
Solution (Java 1.6.0_30 and Newer)
In Java 1.6.0_30 and newer, this issue is resolved. A user agent set with setRequestProperty("User-Agent", "Mozilla ...") is sent exactly as given, without the Java version appended.
Verification
To verify this, you can listen on a port using netcat, which displays the raw HTTP headers of incoming requests. Without setting the user agent, the headers will show:
    GET /foobar HTTP/1.1
    User-Agent: Java/1.6.0_30
    Host: localhost:8080
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: keep-alive
When setting the user agent, the headers will instead show:
    GET /foobar HTTP/1.1
    User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2
    Host: localhost:8080
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: keep-alive
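If netcat is not at hand, the same check can be approximated in Java itself. The following is only a minimal sketch, not part of the original answer: the class name UserAgentEcho and the helper capturedUserAgent are hypothetical, and the throwaway server handles exactly one request on a free local port. It reads the request headers, records the User-Agent value, and replies with an empty 200 response.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URL;
import java.net.URLConnection;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper class, for illustration only.
class UserAgentEcho {

    // Sends one request to a one-shot local server and returns the
    // User-Agent value that server received. Pass null to keep the default.
    static String capturedUserAgent(String userAgent) throws Exception {
        final AtomicReference<String> seen = new AtomicReference<String>();
        final ServerSocket server = new ServerSocket(0); // 0 = any free port

        Thread listener = new Thread(new Runnable() {
            public void run() {
                try {
                    Socket s = server.accept();
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(s.getInputStream(), "ISO-8859-1"));
                    String line;
                    // Request headers end at the first empty line.
                    while ((line = in.readLine()) != null && line.length() > 0) {
                        if (line.toLowerCase().startsWith("user-agent:")) {
                            seen.set(line.substring("user-agent:".length()).trim());
                        }
                    }
                    // Minimal valid response so the client call returns.
                    OutputStream out = s.getOutputStream();
                    out.write(("HTTP/1.1 200 OK\r\n"
                            + "Content-Type: text/plain\r\n"
                            + "Content-Length: 0\r\n"
                            + "Connection: close\r\n\r\n").getBytes("ISO-8859-1"));
                    out.flush();
                    s.close();
                } catch (IOException e) {
                    // Leave 'seen' unset; the caller then gets null.
                }
            }
        });
        listener.start();

        try {
            URL url = new URL("http://127.0.0.1:" + server.getLocalPort() + "/foobar");
            URLConnection hc = url.openConnection();
            if (userAgent != null) {
                hc.setRequestProperty("User-Agent", userAgent);
            }
            hc.getContentType();  // forces the request to be sent
            listener.join(5000);  // wait for the server thread, at most 5 s
        } finally {
            server.close();
        }
        return seen.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Default: " + capturedUserAgent(null));
        System.out.println("Custom:  " + capturedUserAgent("Mozilla/5.0 (Test)"));
    }
}
```

On a JVM where the issue is resolved, the second line should print the custom string unchanged. On an affected older version, a trailing "Java/..." token would be expected there instead.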
Example Code (Java 1.6.0_30)
The following code example demonstrates how to correctly set the user agent:
    import java.io.IOException;
    import java.net.URL;
    import java.net.URLConnection;

    public class TestUrlOpener {

        public static void main(String[] args) throws IOException {
            URL url = new URL("http://localhost:8080/foobar");
            URLConnection hc = url.openConnection();

            // Must be set before the connection is opened, i.e. before any
            // call that reads the response.
            hc.setRequestProperty("User-Agent",
                    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2");

            // Reading a response header triggers the actual HTTP request.
            System.out.println(hc.getContentType());
        }
    }