
Filling in knowledge gaps for Java crawlers

2022-01-27 04:15:37 Big flicker love flicker



HttpClient redirect handling

[HttpClient 4.5 Chinese tutorial] 8. Aborting requests and handling redirects

1. The difference between HttpClient and a browser

When we make a request from a browser, the browser handles redirects, caching, and similar details for us. That is why, after you submit a form via POST in a browser, you receive the server's final response normally no matter how the server redirects.

With HttpClient, however, you will find that the request comes back with a 302, because by default HttpClient does not follow redirects for requests submitted via POST. So what can you do?

Method 1: handle it manually

        CloseableHttpClient httpClient = HttpClients.createDefault();

        // "http://ip:port/xxx" is a placeholder; substitute the real address
        HttpPost httpPost = new HttpPost("http://ip:port/xxx");

        CloseableHttpResponse response = httpClient.execute(httpPost);

        int statusCode = response.getStatusLine().getStatusCode();
        System.out.println("statusCode==" + statusCode); // status code

        Header header = response.getFirstHeader("Location");

        // Redirect address
        String location = header.getValue();
        System.out.println(location);

        // Release the first connection before issuing the next request
        response.close();

        // Then simply send a request to the new location
        HttpGet httpGet = new HttpGet(location);
        CloseableHttpResponse response2 = httpClient.execute(httpGet);
        System.out.println("Response body: " + EntityUtils.toString(response2.getEntity(), "UTF-8"));
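The Location header is not guaranteed to be an absolute URL, and a server may redirect more than once. The sketch below is one way to handle both cases using only the JDK. `RedirectFollower`, `Response`, and the `fetch` function are hypothetical names that stand in for the real HttpClient calls in Method 1, so the logic can be read and run without a network.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: follows a redirect chain with a hop limit. */
final class RedirectFollower {

    /** Minimal stand-in for an HTTP response: status code plus Location header (may be null). */
    static final class Response {
        final int status;
        final String location;

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    /** Resolves a possibly relative Location value against the URL that was requested. */
    static URI resolveLocation(URI requested, String location) {
        return requested.resolve(location);
    }

    /**
     * Follows redirect responses until a non-redirect response arrives and
     * returns the final URL. Throws if more than maxHops redirects are needed.
     */
    static URI follow(URI start, Function<URI, Response> fetch, int maxHops) {
        URI current = start;
        for (int hop = 0; hop <= maxHops; hop++) {
            Response response = fetch.apply(current);
            boolean redirect = response.status == 301 || response.status == 302
                    || response.status == 303 || response.status == 307
                    || response.status == 308;
            if (!redirect || response.location == null) {
                return current;
            }
            current = resolveLocation(current, response.location);
        }
        throw new IllegalStateException("Too many redirects, gave up at " + current);
    }

    public static void main(String[] args) {
        // Fake server: /app/login redirects to a relative path, which then answers 200.
        Map<String, Response> site = new HashMap<>();
        site.put("http://example.com/app/login", new Response(302, "home?ok=1"));
        site.put("http://example.com/app/home?ok=1", new Response(200, null));

        URI end = follow(URI.create("http://example.com/app/login"),
                uri -> site.get(uri.toString()), 5);
        System.out.println(end); // prints http://example.com/app/home?ok=1
    }
}
```

In real code the `fetch` function would wrap `httpClient.execute(...)` and read the status line and the Location header from the response, as in Method 1.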

Method 2: use an existing utility class

        HttpClientBuilder builder = HttpClients.custom()
            .disableAutomaticRetries() // turn off automatic request retries (this does not affect redirects)
            .setRedirectStrategy(new LaxRedirectStrategy()); // LaxRedirectStrategy also follows redirects for POST

        CloseableHttpClient client = builder.build();

        // "http://ip:port/xxx" is a placeholder; substitute the real address
        HttpPost httpPost = new HttpPost("http://ip:port/xxx");

        CloseableHttpResponse response = client.execute(httpPost);

        int statusCode = response.getStatusLine().getStatusCode();
        System.out.println("statusCode==" + statusCode); // status code

        System.out.println("Response body: " + EntityUtils.toString(response.getEntity(), "UTF-8"));
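Which method the follow-up request uses depends on the status code. As a rough guide based on HTTP semantics (RFC 7231 and RFC 7538) rather than on HttpClient internals: 303 turns the follow-up into a GET, 301 and 302 are commonly rewritten from POST to GET by clients, and 307 and 308 keep the original method and body. The sketch below encodes that rule of thumb. `RedirectMethod.next` is a hypothetical helper, and a particular HttpClient version may differ in detail, so it is worth checking the behaviour of the version you use.

```java
import java.util.Locale;

/** Hypothetical helper: a simplified model of the method used after a redirect. */
final class RedirectMethod {

    /** Returns the HTTP method a typical client would use for the follow-up request. */
    static String next(String method, int status) {
        String m = method.toUpperCase(Locale.ROOT);
        switch (status) {
            case 307:
            case 308:
                return m; // method (and body) are preserved
            case 301:
            case 302:
                return m.equals("POST") ? "GET" : m; // historical POST-to-GET rewrite
            case 303:
                return m.equals("HEAD") ? "HEAD" : "GET"; // always a retrieval request
            default:
                throw new IllegalArgumentException("Not a redirect status: " + status);
        }
    }

    public static void main(String[] args) {
        System.out.println(next("POST", 302)); // prints GET
        System.out.println(next("POST", 307)); // prints POST
    }
}
```

This is also why Method 1 follows the 302 with an HttpGet rather than repeating the POST.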

Two ways to get cookies with HttpClient

1. Getting cookies with the old version of HttpClient

P.S. This approach is officially discouraged; DefaultHttpClient is deprecated.

Instantiate the httpClient object with the DefaultHttpClient class:

public static String dooPost_deprecated(String url, Map<String, String> map, String charset) {
        DefaultHttpClient httpClient = null;
        HttpPost httpPost = null;
        String result = null;
        try {
            httpClient = new DefaultHttpClient();
            httpPost = new HttpPost(url);
            // Set parameters
            List<NameValuePair> list = new ArrayList<NameValuePair>();
            for (Entry<String, String> elem : map.entrySet()) {
                list.add(new BasicNameValuePair(elem.getKey(), elem.getValue()));
            }
            if (list.size() > 0) {
                UrlEncodedFormEntity entity = new UrlEncodedFormEntity(list, charset);
                httpPost.setEntity(entity);
            }
            HttpResponse response = httpClient.execute(httpPost);
            System.out.println(response.getStatusLine().getStatusCode());
            String JSESSIONID = null;
            String cookie_user = null;
            // Get the cookies
            CookieStore cookieStore = httpClient.getCookieStore();
            List<Cookie> cookies = cookieStore.getCookies();
            for (int i = 0; i < cookies.size(); i++) {
                // Iterate over the cookies
                System.out.println(cookies.get(i));
                System.out.println("cookiename==" + cookies.get(i).getName());
                System.out.println("cookieValue==" + cookies.get(i).getValue());
                System.out.println("Domain==" + cookies.get(i).getDomain());
                System.out.println("Path==" + cookies.get(i).getPath());
                System.out.println("Version==" + cookies.get(i).getVersion());

                if (cookies.get(i).getName().equals("JSESSIONID")) {
                    JSESSIONID = cookies.get(i).getValue();
                }
                if (cookies.get(i).getName().equals("cookie_user")) {
                    cookie_user = cookies.get(i).getValue();
                }
            }
            // Return the session ID only if the login cookie is present
            if (cookie_user != null) {
                result = JSESSIONID;
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        return result;
    }
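The parameter loop above turns the map into a form body of type application/x-www-form-urlencoded. To make it easier to see what is sent over the wire, here is a dependency-free sketch that builds a roughly equivalent string with the JDK's URLEncoder. `FormBody.encode` and the sample parameter names are hypothetical, and UrlEncodedFormEntity may differ in edge cases.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds a URL-encoded form body from a parameter map. */
final class FormBody {

    /** Joins key=value pairs with '&', percent-encoding keys and values. */
    static String encode(Map<String, String> params, String charset) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(e.getKey(), charset))
                    .append('=')
                    .append(URLEncoder.encode(e.getValue(), charset));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unknown charset: " + charset, ex);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("username", "big flicker");
        params.put("password", "p&ss=1");
        System.out.println(encode(params, "UTF-8"));
        // prints username=big+flicker&password=p%26ss%3D1
    }
}
```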

2. Getting cookies with the new version of HttpClient

Instantiate the httpClient object with the CloseableHttpClient class:

public static String doPost(Map<String, String> map, String charset) {
        String result = null;
        // The client stores every cookie it receives in this CookieStore
        CookieStore cookieStore = new BasicCookieStore();
        // try-with-resources closes the client when the method returns
        try (CloseableHttpClient httpClient = HttpClients.custom().setDefaultCookieStore(cookieStore).build()) {
            HttpPost httpPost = new HttpPost("http://localhost:8080/testtoolmanagement/LoginServlet");
            // Set parameters
            List<NameValuePair> list = new ArrayList<NameValuePair>();
            for (Map.Entry<String, String> elem : map.entrySet()) {
                list.add(new BasicNameValuePair(elem.getKey(), elem.getValue()));
            }
            if (list.size() > 0) {
                UrlEncodedFormEntity entity = new UrlEncodedFormEntity(list, charset);
                httpPost.setEntity(entity);
            }
            // Only the cookies are needed, so close the response right away to release the connection
            httpClient.execute(httpPost).close();
            String JSESSIONID = null;
            String cookie_user = null;
            List<Cookie> cookies = cookieStore.getCookies();
            for (int i = 0; i < cookies.size(); i++) {
                if (cookies.get(i).getName().equals("JSESSIONID")) {
                    JSESSIONID = cookies.get(i).getValue();
                }
                if (cookies.get(i).getName().equals("cookie_user")) {
                    cookie_user = cookies.get(i).getValue();
                }
            }
            // Return the session ID only if the login cookie is present
            if (cookie_user != null) {
                result = JSESSIONID;
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        return result;
    }
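Both versions end with the same step: scan the stored cookies for a given name. The sketch below isolates that step so it can be tried without a server. It uses the JDK's java.net.HttpCookie instead of HttpClient's Cookie class purely so that it runs without dependencies. `CookieLookup`, `parseAll`, `find`, and the sample header values are hypothetical.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: parses Set-Cookie values and looks a cookie up by name. */
final class CookieLookup {

    /** Parses raw Set-Cookie header values into cookie objects. */
    static List<HttpCookie> parseAll(List<String> setCookieHeaders) {
        List<HttpCookie> cookies = new ArrayList<>();
        for (String header : setCookieHeaders) {
            cookies.addAll(HttpCookie.parse(header));
        }
        return cookies;
    }

    /** Returns the value of the first cookie with the given name, if any. */
    static Optional<String> find(List<HttpCookie> cookies, String name) {
        for (HttpCookie cookie : cookies) {
            if (cookie.getName().equals(name)) {
                return Optional.of(cookie.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Sample header values as a login servlet might send them (made up for illustration).
        List<HttpCookie> cookies = parseAll(Arrays.asList(
                "JSESSIONID=ABC123; Path=/; HttpOnly",
                "cookie_user=alice; Path=/"));

        // Same rule as doPost: hand back the session ID only when cookie_user is present.
        String result = find(cookies, "cookie_user").isPresent()
                ? find(cookies, "JSESSIONID").orElse(null)
                : null;
        System.out.println(result); // prints ABC123
    }
}
```

Returning an Optional makes the "cookie not set" case explicit instead of relying on a null check, which is a design choice of this sketch rather than something the original code does.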

Copyright notice
Author: Big flicker love flicker. Please include the original link when reprinting. Thank you.
https://en.cdmana.com/2022/01/202201270415324929.html
