Scraping table data from a web page
The page source is as follows:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd">
<HTML><HEAD><TITLE></TITLE>
<META content="text/html; charset=GBK" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 8.00.7601.18106"></HEAD>
<BODY>
<FORM method=post name=pusManageForm action=pus.do><INPUT type=hidden
name=method> <INPUT value=15647695 type=hidden name=sid> <INPUT value=2
type=hidden name=partCount>
<TABLE width="100%" align=center>
<TBODY>
<TR>
<TD>
<TABLE border=0 width="100%">
<TBODY>
<TR>
<TD width=10> </TD>
<TD>
<TABLE border=0 cellSpacing=1 cellPadding=0 width="95%"
align=center>
<TBODY>
<TR>
<TD height=40 align=left><B><FONT color=rgb(0,0,20)
size=2>aaaaaa</FONT></B> <BR><B><FONT
color=rgb(0,0,20) size=2>aaaaaa</FONT></B> </TD></TR>
<TR>
<TD height=40 align=left><FONT color=rgb(0,0,20)
size=1>ああああああ</FONT> <FONT color=rgb(0,0,20)
size=1>xxxx(aaaaaa)</FONT> </TD></TR>
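One possible way to pull the cell text out of nested tables like the ones in this source is sketched below, using only Python's standard `html.parser` module. This is a minimal sketch, not a definitive solution: the names `CellTextParser`, `extract_cells`, and `SAMPLE` are hypothetical, and `SAMPLE` is a shortened fragment in the same style as the page source, not the real page. It assumes the text of interest sits directly inside `<TD>` cells and that unclosed outer cells can be ignored.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the visible text of every closed <td> that contains text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = []   # one list of text pieces per currently open <td>
        self.cells = []    # finished cell texts, in the order the cells close

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TD> arrives here as "td".
        if tag == "td":
            self._stack.append([])
        elif tag == "br" and self._stack:
            self._stack[-1].append(" ")  # treat <BR> as a word break

    def handle_endtag(self, tag):
        if tag == "td" and self._stack:
            # Collapse runs of whitespace (including line breaks) to one space.
            text = " ".join("".join(self._stack.pop()).split())
            if text:
                self.cells.append(text)

    def handle_data(self, data):
        # Text belongs to the innermost open cell, so an outer cell that
        # only wraps a nested table stays empty and is skipped.
        if self._stack:
            self._stack[-1].append(data)


def extract_cells(html_text):
    """Return the non-empty cell texts found in html_text."""
    parser = CellTextParser()
    parser.feed(html_text)
    parser.close()
    return parser.cells


# Hypothetical, shortened fragment in the style of the page source.
SAMPLE = """
<TABLE><TBODY>
<TR><TD width=10> </TD>
<TD><TABLE border=0><TBODY>
<TR><TD height=40 align=left><B><FONT color=rgb(0,0,20)
size=2>aaaaaa</FONT></B> <BR><B><FONT
color=rgb(0,0,20) size=2>aaaaaa</FONT></B> </TD></TR>
<TR><TD height=40 align=left><FONT color=rgb(0,0,20)
size=1>xxxx(aaaaaa)</FONT> </TD></TR>
"""

if __name__ == "__main__":
    for cell in extract_cells(SAMPLE):
        print(cell)
```

Since the page declares `charset=GBK`, raw bytes read from the server or from a saved file would need to be decoded first, for example `extract_cells(raw_bytes.decode("gbk"))`, before being passed to the parser.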