How to Fetch the Content of a Web Page

We often need to retrieve the content of a web page from code. The following approaches can be used:

HTTPCLIENT

GET method:

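import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;

// Uses Apache Commons HttpClient 3.x
HttpClient httpClient = new HttpClient();
GetMethod getMethod = new GetMethod("http://www.baidu.com/");
try {
    int statusCode = httpClient.executeMethod(getMethod);
    if (statusCode != HttpStatus.SC_OK) {
        System.err.println("Method failed: " + getMethod.getStatusLine());
    }
    // Read the response body
    byte[] responseBody = getMethod.getResponseBody();
    // Decode with the charset from the Content-Type header rather than the
    // platform default (HttpClient falls back to ISO-8859-1 if none is declared)
    String html = new String(responseBody, getMethod.getResponseCharSet());
    System.out.println(html);
} catch (Exception e) {
    System.err.println("The page could not be accessed: " + e.getMessage());
} finally {
    // Always release the connection
    getMethod.releaseConnection();
}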
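
Turning the response bytes into a string only works if the right charset is used; otherwise non-ASCII pages come out garbled. The following is a minimal sketch, using only the JDK, of how a charset could be picked from a Content-Type header value. The class name `ResponseDecoder`, its methods, and the UTF-8 fallback are assumptions made for illustration and are not part of HttpClient.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResponseDecoder {

    /** Returns the charset named in a Content-Type value, or the fallback if none is usable. */
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the "charset=" parameter
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // malformed or unsupported charset name
                }
            }
        }
        return fallback;
    }

    /** Decodes a response body; UTF-8 is an assumed fallback, not an HTTP rule. */
    public static String decode(byte[] body, String contentType) {
        return new String(body, charsetOf(contentType, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decode(body, "text/html; charset=ISO-8859-1"));
    }
}
```

Real pages may also declare their charset in a `<meta>` tag inside the HTML, which this sketch does not inspect.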

POST method:

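import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

// Uses Apache Commons HttpClient 3.x
HttpClient httpClient = new HttpClient();
// UrlPath is a String holding the target URL; it must be defined by the caller
PostMethod postMethod = new PostMethod(UrlPath);
// Use the default retry handler for recoverable I/O errors
postMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
        new DefaultHttpMethodRetryHandler());
// Form fields sent in the request body
NameValuePair[] postData = new NameValuePair[2];
postData[0] = new NameValuePair("username", "xkey");
postData[1] = new NameValuePair("userpass", "********");
postMethod.setRequestBody(postData);
try {
    int statusCode = httpClient.executeMethod(postMethod);
    if (statusCode == HttpStatus.SC_OK) {
        byte[] responseBody = postMethod.getResponseBody();
        // Decode with the charset declared by the response, not the platform default
        String html = new String(responseBody, postMethod.getResponseCharSet());
        System.out.println(html);
    } else {
        System.err.println("Method failed: " + postMethod.getStatusLine());
    }
} catch (Exception e) {
    System.err.println("The page could not be accessed: " + e.getMessage());
} finally {
    // Always release the connection
    postMethod.releaseConnection();
}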
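
The name/value pairs in a form POST are sent as a URL-encoded request body. As a rough illustration of what such a body looks like on the wire, here is a minimal JDK-only sketch. The class `FormBody`, the sample password value, and the choice of UTF-8 are assumptions for illustration; HttpClient performs its own encoding internally and may use a different charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormBody {

    /** Builds an application/x-www-form-urlencoded body from the given fields. */
    public static String encode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&'); // pairs are separated by '&'
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform, so this should not happen
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("username", "xkey");
        fields.put("userpass", "p@ss word"); // hypothetical value
        System.out.println(encode(fields));
    }
}
```

Reserved characters such as `@` are percent-encoded and spaces become `+`, so values containing `&` or `=` cannot be confused with field separators.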