| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE preface PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" |
| "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"> |
| <!-- |
| ==================================================================== |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| ==================================================================== |
| --> |
| <chapter id="fundamentals"> |
| <title>Fundamentals</title> |
| <section> |
| <title>Request execution</title> |
| <para> The most essential function of HttpClient is to execute HTTP methods. Execution of an |
| HTTP method involves one or several HTTP request / HTTP response exchanges, usually |
| handled internally by HttpClient. The user is expected to provide a request object to |
| execute and HttpClient is expected to transmit the request to the target server and return a |
| corresponding response object, or throw an exception if execution was unsuccessful. </para> |
| <para> Quite naturally, the main entry point of the HttpClient API is the HttpClient |
| interface that defines the contract described above. </para> |
| <para>Here is an example of the request execution process in its simplest form:</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| HttpGet httpget = new HttpGet("http://localhost/"); |
| CloseableHttpResponse response = httpclient.execute(httpget); |
| try { |
| <...> |
| } finally { |
| response.close(); |
| } |
| ]]></programlisting> |
| <section> |
| <title>HTTP request</title> |
| <para>All HTTP requests have a request line consisting of a method name, a request URI and |
| an HTTP protocol version.</para> |
| <para>HttpClient supports out of the box all HTTP methods defined in the HTTP/1.1 |
| specification: <literal>GET</literal>, <literal>HEAD</literal>, |
| <literal>POST</literal>, <literal>PUT</literal>, <literal>DELETE</literal>, |
| <literal>TRACE</literal> and <literal>OPTIONS</literal>. There is a specific |
| class for each method type: <classname>HttpGet</classname>, |
| <classname>HttpHead</classname>, <classname>HttpPost</classname>, |
| <classname>HttpPut</classname>, <classname>HttpDelete</classname>, |
| <classname>HttpTrace</classname>, and <classname>HttpOptions</classname>.</para> |
| <para>The Request-URI is a Uniform Resource Identifier that identifies the resource upon |
| which to apply the request. HTTP request URIs consist of a protocol scheme, host |
| name, optional port, resource path, optional query, and optional fragment.</para> |
| <programlisting><![CDATA[ |
| HttpGet httpget = new HttpGet( |
| "http://www.google.com/search?hl=en&q=httpclient&btnG=Google+Search&aq=f&oq="); |
| ]]></programlisting> |
| <para>HttpClient provides <classname>URIBuilder</classname> utility class to simplify |
| creation and modification of request URIs.</para> |
| <programlisting><![CDATA[ |
| URI uri = new URIBuilder() |
| .setScheme("http") |
| .setHost("www.google.com") |
| .setPath("/search") |
| .setParameter("q", "httpclient") |
| .setParameter("btnG", "Google Search") |
| .setParameter("aq", "f") |
| .setParameter("oq", "") |
| .build(); |
| HttpGet httpget = new HttpGet(uri); |
| System.out.println(httpget.getURI()); |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| http://www.google.com/search?q=httpclient&btnG=Google+Search&aq=f&oq= |
| ]]></programlisting> |
| </section> |
| <section> |
| <title>HTTP response</title> |
| <para>An HTTP response is a message sent by the server back to the client after having |
| received and interpreted a request message. The first line of that message consists |
| of the protocol version followed by a numeric status code and its associated textual |
| phrase.</para> |
| <programlisting><![CDATA[ |
| HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, |
| HttpStatus.SC_OK, "OK"); |
| |
| System.out.println(response.getProtocolVersion()); |
| System.out.println(response.getStatusLine().getStatusCode()); |
| System.out.println(response.getStatusLine().getReasonPhrase()); |
| System.out.println(response.getStatusLine().toString()); |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| HTTP/1.1 |
| 200 |
| OK |
| HTTP/1.1 200 OK |
| ]]></programlisting> |
| </section> |
| <section> |
| <title>Working with message headers</title> |
| <para>An HTTP message can contain a number of headers describing properties of the |
| message such as the content length, content type and so on. HttpClient provides |
| methods to retrieve, add, remove and enumerate headers.</para> |
| <programlisting><![CDATA[ |
| HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, |
| HttpStatus.SC_OK, "OK"); |
| response.addHeader("Set-Cookie", |
| "c1=a; path=/; domain=localhost"); |
| response.addHeader("Set-Cookie", |
| "c2=b; path=\"/\", c3=c; domain=\"localhost\""); |
| Header h1 = response.getFirstHeader("Set-Cookie"); |
| System.out.println(h1); |
| Header h2 = response.getLastHeader("Set-Cookie"); |
| System.out.println(h2); |
| Header[] hs = response.getHeaders("Set-Cookie"); |
| System.out.println(hs.length); |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| Set-Cookie: c1=a; path=/; domain=localhost |
| Set-Cookie: c2=b; path="/", c3=c; domain="localhost" |
| 2 |
| ]]></programlisting> |
| <para>The most efficient way to obtain all headers of a given type is by using the |
| <interfacename>HeaderIterator</interfacename> interface.</para> |
| <programlisting><![CDATA[ |
| HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, |
| HttpStatus.SC_OK, "OK"); |
| response.addHeader("Set-Cookie", |
| "c1=a; path=/; domain=localhost"); |
| response.addHeader("Set-Cookie", |
| "c2=b; path=\"/\", c3=c; domain=\"localhost\""); |
| |
| HeaderIterator it = response.headerIterator("Set-Cookie"); |
| |
| while (it.hasNext()) { |
| System.out.println(it.next()); |
| } |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| Set-Cookie: c1=a; path=/; domain=localhost |
| Set-Cookie: c2=b; path="/", c3=c; domain="localhost" |
| ]]></programlisting> |
| <para>It also provides convenience methods to parse HTTP messages into individual header |
| elements.</para> |
| <programlisting><![CDATA[ |
| HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, |
| HttpStatus.SC_OK, "OK"); |
| response.addHeader("Set-Cookie", |
| "c1=a; path=/; domain=localhost"); |
| response.addHeader("Set-Cookie", |
| "c2=b; path=\"/\", c3=c; domain=\"localhost\""); |
| |
| HeaderElementIterator it = new BasicHeaderElementIterator( |
| response.headerIterator("Set-Cookie")); |
| |
| while (it.hasNext()) { |
| HeaderElement elem = it.nextElement(); |
| System.out.println(elem.getName() + " = " + elem.getValue()); |
| NameValuePair[] params = elem.getParameters(); |
| for (int i = 0; i < params.length; i++) { |
| System.out.println(" " + params[i]); |
| } |
| } |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| c1 = a |
| path=/ |
| domain=localhost |
| c2 = b |
| path=/ |
| c3 = c |
| domain=localhost |
| ]]></programlisting> |
| </section> |
| <section> |
| <title>HTTP entity</title> |
| <para>HTTP messages can carry a content entity associated with the request or response. |
| Entities can be found in some requests and in some responses, as they are optional. |
| Requests that use entities are referred to as entity enclosing requests. The HTTP |
| specification defines two entity enclosing request methods: <literal>POST</literal> and |
| <literal>PUT</literal>. Responses are usually expected to enclose a content |
| entity. There are exceptions to this rule such as responses to the |
| <literal>HEAD</literal> method and <literal>204 No Content</literal>, |
| <literal>304 Not Modified</literal>, <literal>205 Reset Content</literal> |
| responses.</para> |
| <para>HttpClient distinguishes three kinds of entities, depending on where their content |
| originates:</para> |
| <itemizedlist> |
| <listitem> |
| <formalpara> |
| <title>streamed:</title> |
| <para>The content is received from a stream, or generated on the fly. In |
| particular, this category includes entities being received from HTTP |
| responses. Streamed entities are generally not repeatable.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <title>self-contained:</title> |
| <para>The content is in memory or obtained by means that are independent |
| from a connection or other entity. Self-contained entities are generally |
| repeatable. This type of entity is mostly used for entity |
| enclosing HTTP requests.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <title>wrapping:</title> |
| <para>The content is obtained from another entity.</para> |
| </formalpara> |
| </listitem> |
| </itemizedlist> |
| <para>This distinction is important for connection management when streaming out content |
| from an HTTP response. For request entities that are created by an application and |
| only sent using HttpClient, the difference between streamed and self-contained is of |
| little importance. In that case, it is suggested to consider non-repeatable entities |
| as streamed, and those that are repeatable as self-contained.</para> |
| <section> |
| <title>Repeatable entities</title> |
| <para>An entity can be repeatable, meaning its content can be read more than once. |
| This is only possible with self-contained entities (like |
| <classname>ByteArrayEntity</classname> or |
| <classname>StringEntity</classname>).</para> |
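| <para>As a minimal sketch of this distinction, assuming HttpClient 4.3 or later is on the |
| classpath, <methodname>HttpEntity#isRepeatable()</methodname> can be queried before |
| attempting to read the content a second time. The class name |
| <classname>RepeatableCheck</classname> and its <methodname>describe</methodname> helper are |
| hypothetical and exist only for illustration.</para> |
| <programlisting><![CDATA[ |
| import java.io.ByteArrayInputStream; |
| import java.nio.charset.StandardCharsets; |
| |
| import org.apache.http.HttpEntity; |
| import org.apache.http.entity.ContentType; |
| import org.apache.http.entity.InputStreamEntity; |
| import org.apache.http.entity.StringEntity; |
| |
| public class RepeatableCheck { |
| |
|     // Hypothetical helper: reports whether the content can be read again. |
|     static String describe(HttpEntity entity) { |
|         return entity.isRepeatable() ? "repeatable" : "not repeatable"; |
|     } |
| |
|     public static void main(String[] args) { |
|         // Self-contained: the content is held in memory. |
|         HttpEntity inMemory = new StringEntity("important message", |
|                 ContentType.create("text/plain", "UTF-8")); |
|         // Streamed: the content can be read from the stream only once. |
|         HttpEntity streamed = new InputStreamEntity(new ByteArrayInputStream( |
|                 "important message".getBytes(StandardCharsets.UTF_8))); |
| |
|         System.out.println(describe(inMemory)); |
|         System.out.println(describe(streamed)); |
|     } |
| } |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| repeatable |
| not repeatable |
| ]]></programlisting> |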
| </section> |
| <section> |
| <title>Using HTTP entities</title> |
| <para>Since an entity can represent both binary and character content, it has |
| support for character encodings (to support the latter, i.e. character |
| content).</para> |
| <para>The entity is created when executing a request with enclosed content or when |
| the request was successful and the response body is used to send the result back |
| to the client.</para> |
| <para>To read the content from the entity, one can either retrieve the input stream |
| via the <methodname>HttpEntity#getContent()</methodname> method, which returns |
| a <classname>java.io.InputStream</classname>, or one can supply an output |
| stream to the <methodname>HttpEntity#writeTo(OutputStream)</methodname> method, |
| which will return once all content has been written to the given stream.</para> |
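| <para>The two approaches can be sketched side by side as follows, assuming HttpClient 4.3 or |
| later is on the classpath. The class <classname>ReadEntityContent</classname> and its two |
| helper methods are hypothetical names introduced for this illustration, and a |
| <classname>StringEntity</classname> stands in for an entity received from a server so that |
| the sketch runs without a network connection.</para> |
| <programlisting><![CDATA[ |
| import java.io.ByteArrayOutputStream; |
| import java.io.IOException; |
| import java.io.InputStream; |
| |
| import org.apache.http.HttpEntity; |
| import org.apache.http.entity.ContentType; |
| import org.apache.http.entity.StringEntity; |
| |
| public class ReadEntityContent { |
| |
|     // Pulls the bytes from the entity's input stream and counts them. |
|     static int countViaGetContent(HttpEntity entity) throws IOException { |
|         InputStream instream = entity.getContent(); |
|         try { |
|             int count = 0; |
|             while (instream.read() != -1) { |
|                 count++; |
|             } |
|             return count; |
|         } finally { |
|             instream.close(); |
|         } |
|     } |
| |
|     // Lets the entity push its bytes to an output stream and counts them. |
|     static int countViaWriteTo(HttpEntity entity) throws IOException { |
|         ByteArrayOutputStream out = new ByteArrayOutputStream(); |
|         entity.writeTo(out); |
|         return out.size(); |
|     } |
| |
|     public static void main(String[] args) throws IOException { |
|         // A repeatable entity, so it can be read by both helpers in turn. |
|         HttpEntity entity = new StringEntity("important message", |
|                 ContentType.create("text/plain", "UTF-8")); |
|         System.out.println(countViaGetContent(entity)); |
|         System.out.println(countViaWriteTo(entity)); |
|     } |
| } |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| 17 |
| 17 |
| ]]></programlisting> |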
| <para>When the entity has been received with an incoming message, the |
| <methodname>HttpEntity#getContentType()</methodname> and |
| <methodname>HttpEntity#getContentLength()</methodname> methods can be used |
| for reading the common metadata such as <literal>Content-Type</literal> and |
| <literal>Content-Length</literal> headers (if they are available). The |
| <literal>Content-Type</literal> header can also carry a character encoding for |
| text MIME types such as text/plain or text/html, as its <literal>charset</literal> |
| parameter. The <methodname>HttpEntity#getContentEncoding()</methodname> method |
| returns the <literal>Content-Encoding</literal> header, which describes content |
| codings such as gzip rather than the character encoding. If the headers are not |
| available, a length of -1 will be returned, and <literal>null</literal> for the |
| content type. If the <literal>Content-Type</literal> header is available, a |
| <interfacename>Header</interfacename> object will be returned.</para> |
| <para>When creating an entity for an outgoing message, this metadata has to be |
| supplied by the creator of the entity.</para> |
| <programlisting><![CDATA[ |
| StringEntity myEntity = new StringEntity("important message", |
| ContentType.create("text/plain", "UTF-8")); |
| |
| System.out.println(myEntity.getContentType()); |
| System.out.println(myEntity.getContentLength()); |
| System.out.println(EntityUtils.toString(myEntity)); |
| System.out.println(EntityUtils.toByteArray(myEntity).length);]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| Content-Type: text/plain; charset=UTF-8 |
| 17 |
| important message |
| 17 |
| ]]></programlisting> |
| </section> |
| </section> |
| <section> |
| <title>Ensuring release of low level resources</title> |
| <para> In order to ensure proper release of system resources one must close either |
| the content stream associated with the entity or the response itself.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| HttpGet httpget = new HttpGet("http://localhost/"); |
| CloseableHttpResponse response = httpclient.execute(httpget); |
| try { |
| HttpEntity entity = response.getEntity(); |
| if (entity != null) { |
| InputStream instream = entity.getContent(); |
| try { |
| // do something useful |
| } finally { |
| instream.close(); |
| } |
| } |
| } finally { |
| response.close(); |
| } |
| ]]></programlisting> |
| <para>The difference between closing the content stream and closing the response |
| is that the former will attempt to keep the underlying connection alive |
| by consuming the entity content while the latter immediately shuts down |
| and discards the connection.</para> |
| <para>Please note that the <methodname>HttpEntity#writeTo(OutputStream)</methodname> |
| method is also required to ensure proper release of system resources once the |
| entity has been fully written out. If this method obtains an instance of |
| <classname>java.io.InputStream</classname> by calling |
| <methodname>HttpEntity#getContent()</methodname>, it is also expected to close |
| the stream in a finally clause.</para> |
| <para>When working with streaming entities, one can use the |
| <methodname>EntityUtils#consume(HttpEntity)</methodname> method to ensure that |
| the entity content has been fully consumed and the underlying stream has been |
| closed.</para> |
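| <para>A minimal sketch of that pattern follows, assuming HttpClient 4.3 or later is on the |
| classpath. The class <classname>DiscardBody</classname> and its |
| <methodname>statusOnly</methodname> helper are hypothetical, and the response is built |
| locally so that the sketch runs without a network connection. With a real response the same |
| call would sit inside the try block, before the response is closed.</para> |
| <programlisting><![CDATA[ |
| import java.io.ByteArrayInputStream; |
| import java.io.IOException; |
| |
| import org.apache.http.HttpResponse; |
| import org.apache.http.HttpStatus; |
| import org.apache.http.HttpVersion; |
| import org.apache.http.entity.InputStreamEntity; |
| import org.apache.http.message.BasicHttpResponse; |
| import org.apache.http.util.EntityUtils; |
| |
| public class DiscardBody { |
| |
|     // Hypothetical helper: reads the status code and discards the body. |
|     static int statusOnly(HttpResponse response) throws IOException { |
|         int status = response.getStatusLine().getStatusCode(); |
|         // Accepts a null entity; otherwise closes the content stream. |
|         EntityUtils.consume(response.getEntity()); |
|         return status; |
|     } |
| |
|     public static void main(String[] args) throws IOException { |
|         // Locally built response standing in for one returned by execute(). |
|         HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, |
|                 HttpStatus.SC_OK, "OK"); |
|         response.setEntity(new InputStreamEntity( |
|                 new ByteArrayInputStream(new byte[] {1, 2, 3}))); |
|         System.out.println(statusOnly(response)); |
|     } |
| } |
| ]]></programlisting> |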
| <para>There can be situations, however, when only a small portion of the entire response |
| content needs to be retrieved and the performance penalty for consuming the |
| remaining content and making the connection reusable is too high, in which case |
| one can terminate the content stream by closing the response.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| HttpGet httpget = new HttpGet("http://localhost/"); |
| CloseableHttpResponse response = httpclient.execute(httpget); |
| try { |
| HttpEntity entity = response.getEntity(); |
| if (entity != null) { |
| InputStream instream = entity.getContent(); |
| int byteOne = instream.read(); |
| int byteTwo = instream.read(); |
| // Do not need the rest |
| } |
| } finally { |
| response.close(); |
| } |
| ]]></programlisting> |
| <para>The connection will not be reused, but all low-level resources held by it will be |
| correctly deallocated.</para> |
| </section> |
| <section> |
| <title>Consuming entity content</title> |
| <para>The recommended way to consume the content of an entity is by using its |
| <methodname>HttpEntity#getContent()</methodname> or |
| <methodname>HttpEntity#writeTo(OutputStream)</methodname> methods. HttpClient |
| also comes with the <classname>EntityUtils</classname> class, which exposes several |
| static methods to more easily read the content or information from an entity. |
| Instead of reading the <classname>java.io.InputStream</classname> directly, one can |
| retrieve the whole content body in a string / byte array by using the methods from |
| this class. However, the use of <classname>EntityUtils</classname> is |
| strongly discouraged unless the response entities originate from a trusted HTTP |
| server and are known to be of limited length.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| HttpGet httpget = new HttpGet("http://localhost/"); |
| CloseableHttpResponse response = httpclient.execute(httpget); |
| try { |
| HttpEntity entity = response.getEntity(); |
| if (entity != null) { |
| long len = entity.getContentLength(); |
| if (len != -1 && len < 2048) { |
| System.out.println(EntityUtils.toString(entity)); |
| } else { |
| // Stream content out |
| } |
| } |
| } finally { |
| response.close(); |
| } |
| ]]></programlisting> |
| <para>In some situations it may be necessary to be able to read entity content more than |
| once. In this case entity content must be buffered in some way, either in memory or |
| on disk. The simplest way to accomplish that is by wrapping the original entity with |
| the <classname>BufferedHttpEntity</classname> class. This will cause the content of |
| the original entity to be read into an in-memory buffer. In all other ways the entity |
| wrapper will behave like the original one.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpResponse response = <...> |
| HttpEntity entity = response.getEntity(); |
| if (entity != null) { |
| entity = new BufferedHttpEntity(entity); |
| } |
| ]]></programlisting> |
| </section> |
| <section> |
| <title>Producing entity content</title> |
| <para>HttpClient provides several classes that can be used to efficiently stream out |
| content through HTTP connections. Instances of those classes can be associated with |
| entity enclosing requests such as <literal>POST</literal> and <literal>PUT</literal> |
| in order to enclose entity content into outgoing HTTP requests. HttpClient provides |
| several classes for most common data containers such as string, byte array, input |
| stream, and file: <classname>StringEntity</classname>, |
| <classname>ByteArrayEntity</classname>, |
| <classname>InputStreamEntity</classname>, and |
| <classname>FileEntity</classname>.</para> |
| <programlisting><![CDATA[ |
| File file = new File("somefile.txt"); |
| FileEntity entity = new FileEntity(file, |
| ContentType.create("text/plain", "UTF-8")); |
| |
| HttpPost httppost = new HttpPost("http://localhost/action.do"); |
| httppost.setEntity(entity); |
| ]]></programlisting> |
| <para>Please note <classname>InputStreamEntity</classname> is not repeatable, because it |
| can only read from the underlying data stream once. Generally it is recommended to |
| implement a custom <interfacename>HttpEntity</interfacename> class which is |
| self-contained instead of using the generic <classname>InputStreamEntity</classname>. |
| <classname>FileEntity</classname> can be a good starting point.</para> |
| <section> |
| <title>HTML forms</title> |
| <para>Many applications need to simulate the process of submitting an |
| HTML form, for instance, in order to log in to a web application or submit input |
| data. HttpClient provides the entity class |
| <classname>UrlEncodedFormEntity</classname> to facilitate the |
| process.</para> |
| <programlisting><![CDATA[ |
| List<NameValuePair> formparams = new ArrayList<NameValuePair>(); |
| formparams.add(new BasicNameValuePair("param1", "value1")); |
| formparams.add(new BasicNameValuePair("param2", "value2")); |
| UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams, Consts.UTF_8); |
| HttpPost httppost = new HttpPost("http://localhost/handler.do"); |
| httppost.setEntity(entity); |
| ]]></programlisting> |
| <para>The <classname>UrlEncodedFormEntity</classname> instance will use the so |
| called URL encoding to encode parameters and produce the following |
| content:</para> |
| <programlisting><![CDATA[ |
| param1=value1&param2=value2 |
| ]]></programlisting> |
| </section> |
| <section> |
| <title>Content chunking</title> |
| <para>Generally it is recommended to let HttpClient choose the most appropriate |
| transfer encoding based on the properties of the HTTP message being transferred. |
| It is possible, however, to inform HttpClient that chunk coding is preferred |
| by setting <methodname>HttpEntity#setChunked()</methodname> to true. Please note |
| that HttpClient will use this flag as a hint only. This value will be ignored |
| when using HTTP protocol versions that do not support chunk coding, such as |
| HTTP/1.0.</para> |
| <programlisting><![CDATA[ |
| StringEntity entity = new StringEntity("important message", |
| ContentType.create("text/plain", Consts.UTF_8)); |
| entity.setChunked(true); |
| HttpPost httppost = new HttpPost("http://localhost/action.do"); |
| httppost.setEntity(entity); |
| ]]></programlisting> |
| </section> |
| </section> |
| <section> |
| <title>Response handlers</title> |
| <para>The simplest and the most convenient way to handle responses is by using |
| the <interfacename>ResponseHandler</interfacename> interface, which includes |
| the <methodname>handleResponse(HttpResponse response)</methodname> method. |
| This method completely |
| relieves the user from having to worry about connection management. When using a |
| <interfacename>ResponseHandler</interfacename>, HttpClient will automatically |
| take care of ensuring release of the connection back to the connection manager |
| regardless of whether the request execution succeeds or causes an exception.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| HttpGet httpget = new HttpGet("http://localhost/json"); |
| |
| ResponseHandler<MyJsonObject> rh = new ResponseHandler<MyJsonObject>() { |
| |
| @Override |
| public MyJsonObject handleResponse( |
| final HttpResponse response) throws IOException { |
| StatusLine statusLine = response.getStatusLine(); |
| HttpEntity entity = response.getEntity(); |
| if (statusLine.getStatusCode() >= 300) { |
| throw new HttpResponseException( |
| statusLine.getStatusCode(), |
| statusLine.getReasonPhrase()); |
| } |
| if (entity == null) { |
| throw new ClientProtocolException("Response contains no content"); |
| } |
| Gson gson = new GsonBuilder().create(); |
| ContentType contentType = ContentType.getOrDefault(entity); |
| Charset charset = contentType.getCharset(); |
| Reader reader = new InputStreamReader(entity.getContent(), charset); |
| return gson.fromJson(reader, MyJsonObject.class); |
| } |
| }; |
| MyJsonObject myjson = httpclient.execute(httpget, rh); |
| ]]></programlisting> |
| </section> |
| </section> |
| <section> |
| <title>HttpClient interface</title> |
| <para>The <interfacename>HttpClient</interfacename> interface represents the most essential |
| contract for HTTP request execution. It imposes no restrictions or particular details on |
| the request execution process and leaves the specifics of connection management, state |
| management, authentication and redirect handling up to individual implementations. This |
| should make it easier to decorate the interface with additional functionality such as |
| response content caching.</para> |
| <para>Generally <interfacename>HttpClient</interfacename> implementations act as a facade |
| to a number of special purpose handler or strategy interface implementations |
| responsible for handling of a particular aspect of the HTTP protocol such as redirect |
| or authentication handling or making decisions about connection persistence and keep |
| alive duration. This enables users to selectively replace the default implementations |
| of those aspects with custom, application specific ones.</para> |
| <programlisting><![CDATA[ |
| ConnectionKeepAliveStrategy keepAliveStrat = new DefaultConnectionKeepAliveStrategy() { |
| |
| @Override |
| public long getKeepAliveDuration( |
| HttpResponse response, |
| HttpContext context) { |
| long keepAlive = super.getKeepAliveDuration(response, context); |
| if (keepAlive == -1) { |
| // Keep connections alive 5 seconds if a keep-alive value |
| // has not been explicitly set by the server |
| keepAlive = 5000; |
| } |
| return keepAlive; |
| } |
| |
| }; |
| CloseableHttpClient httpclient = HttpClients.custom() |
| .setKeepAliveStrategy(keepAliveStrat) |
| .build(); |
| ]]></programlisting> |
| <section> |
| <title>HttpClient thread safety</title> |
| <para><interfacename>HttpClient</interfacename> implementations are expected to be |
| thread safe. It is recommended that the same instance of this class is reused for |
| multiple request executions.</para> |
| </section> |
| <section> |
| <title>HttpClient resource deallocation</title> |
| <para>When an instance of <classname>CloseableHttpClient</classname> is no longer needed |
| and is about to go out of scope the connection manager associated with it must |
| be shut down by calling the <methodname>CloseableHttpClient#close()</methodname> |
| method.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| try { |
| <...> |
| } finally { |
| httpclient.close(); |
| } |
| ]]></programlisting> |
| </section> |
| </section> |
| <section> |
| <title>HTTP execution context</title> |
| <para>HTTP was originally designed as a stateless, request-response oriented protocol. |
| However, real world applications often need to be able to persist state information |
| through several logically related request-response exchanges. In order to enable |
| applications to maintain a processing state HttpClient allows HTTP requests to be |
| executed within a particular execution context, referred to as HTTP context. Multiple |
| logically related requests can participate in a logical session if the same context is |
| reused between consecutive requests. HTTP context functions similarly to |
| a <interfacename>java.util.Map&lt;String, Object&gt;</interfacename>. It is |
| simply a collection of arbitrary named values. An application can populate context |
| attributes prior to request execution or examine the context after the execution has |
| been completed.</para> |
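| <para>The map-like behaviour can be sketched as follows, assuming HttpClient 4.3 or later is |
| on the classpath. The class <classname>ContextAttributes</classname>, its |
| <methodname>lookup</methodname> helper and the attribute name |
| <literal>my.session.user</literal> are hypothetical and chosen only for this |
| illustration.</para> |
| <programlisting><![CDATA[ |
| import org.apache.http.client.protocol.HttpClientContext; |
| import org.apache.http.protocol.BasicHttpContext; |
| import org.apache.http.protocol.HttpContext; |
| |
| public class ContextAttributes { |
| |
|     // Hypothetical helper: typed lookup through the HttpClientContext adaptor. |
|     static String lookup(HttpContext context, String name) { |
|         return HttpClientContext.adapt(context).getAttribute(name, String.class); |
|     } |
| |
|     public static void main(String[] args) { |
|         HttpContext context = new BasicHttpContext(); |
|         // Populate an attribute before any request would be executed. |
|         context.setAttribute("my.session.user", "alice"); |
| |
|         System.out.println(context.getAttribute("my.session.user")); |
|         System.out.println(lookup(context, "my.session.user")); |
| |
|         // Once the attribute is removed, later lookups return null. |
|         context.removeAttribute("my.session.user"); |
|         System.out.println(lookup(context, "my.session.user")); |
|     } |
| } |
| ]]></programlisting> |
| <para>stdout ></para> |
| <programlisting><![CDATA[ |
| alice |
| alice |
| null |
| ]]></programlisting> |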
| <para><interfacename>HttpContext</interfacename> can contain arbitrary objects and |
| therefore may be unsafe to share between multiple threads. It is recommended that |
| each thread of execution maintains its own context.</para> |
| <para>In the course of HTTP request execution HttpClient adds the following attributes to |
| the execution context:</para> |
| <itemizedlist> |
| <listitem> |
| <formalpara> |
| <para><interfacename>HttpConnection</interfacename> instance representing the |
| actual connection to the target server.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <para><classname>HttpHost</classname> instance representing the connection |
| target.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <para><classname>HttpRoute</classname> instance representing the complete |
| connection route.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <para><interfacename>HttpRequest</interfacename> instance representing the |
| actual HTTP request. The final HttpRequest object in the execution context |
| always represents the state of the message <emphasis>exactly</emphasis> |
| as it was sent to the target server. Per default HTTP/1.0 and HTTP/1.1 |
| use relative request URIs. However if the request is sent via a proxy |
| in a non-tunneling mode then the URI will be absolute.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <para><interfacename>HttpResponse</interfacename> instance representing the |
| actual HTTP response.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <para><classname>java.lang.Boolean</classname> object representing the flag |
| indicating whether the actual request has been fully transmitted to the |
| connection target.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <para><classname>RequestConfig</classname> object representing the actual |
| request configuration.</para> |
| </formalpara> |
| </listitem> |
| <listitem> |
| <formalpara> |
| <para><classname>java.util.List&lt;URI&gt;</classname> object representing a collection |
| of all redirect locations received in the process of request |
| execution.</para> |
| </formalpara> |
| </listitem> |
| </itemizedlist> |
| <para>One can use the <classname>HttpClientContext</classname> adaptor class to simplify |
| interactions with the context state.</para> |
| <programlisting><![CDATA[ |
| HttpContext context = <...> |
| HttpClientContext clientContext = HttpClientContext.adapt(context); |
| HttpHost target = clientContext.getTargetHost(); |
| HttpRequest request = clientContext.getRequest(); |
| HttpResponse response = clientContext.getResponse(); |
| RequestConfig config = clientContext.getRequestConfig(); |
| ]]></programlisting> |
| <para>Multiple request sequences that represent a logically related session should be |
| executed with the same <interfacename>HttpContext</interfacename> instance to ensure |
| automatic propagation of conversation context and state information between |
| requests.</para> |
| <para>In the following example the request configuration set by the initial request will be |
| kept in the execution context and get propagated to the consecutive requests sharing |
| the same context.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| // The context shared by both requests |
| HttpContext context = HttpClientContext.create(); |
| RequestConfig requestConfig = RequestConfig.custom() |
| .setSocketTimeout(1000) |
| .setConnectTimeout(1000) |
| .build(); |
| |
| HttpGet httpget1 = new HttpGet("http://localhost/1"); |
| httpget1.setConfig(requestConfig); |
| CloseableHttpResponse response1 = httpclient.execute(httpget1, context); |
| try { |
| HttpEntity entity1 = response1.getEntity(); |
| } finally { |
| response1.close(); |
| } |
| HttpGet httpget2 = new HttpGet("http://localhost/2"); |
| CloseableHttpResponse response2 = httpclient.execute(httpget2, context); |
| try { |
| HttpEntity entity2 = response2.getEntity(); |
| } finally { |
| response2.close(); |
| } |
| ]]></programlisting> |
| </section> |
| <section id="protocol_interceptors"> |
| <title>HTTP protocol interceptors</title> |
| <para>The HTTP protocol interceptor is a routine that implements a specific aspect of the HTTP |
| protocol. Usually protocol interceptors are expected to act upon one specific header or |
| a group of related headers of the incoming message, or populate the outgoing message with |
| one specific header or a group of related headers. Protocol interceptors can also |
| manipulate content entities enclosed with messages - transparent content compression / |
| decompression being a good example. Usually this is accomplished by using the |
| 'Decorator' pattern where a wrapper entity class is used to decorate the original |
| entity. Several protocol interceptors can be combined to form one logical unit.</para> |
| <para>Protocol interceptors can collaborate by sharing information - such as a processing |
| state - through the HTTP execution context. Protocol interceptors can use HTTP context |
| to store a processing state for one request or several consecutive requests.</para> |
| <para>Usually the order in which interceptors are executed should not matter as long as they |
| do not depend on a particular state of the execution context. If protocol interceptors |
| have interdependencies and therefore must be executed in a particular order, they should |
| be added to the protocol processor in the same sequence as their expected execution |
| order.</para> |
| <para>Protocol interceptors must be thread-safe. As with servlets, protocol interceptors |
| should not use instance variables unless access to those variables is |
| synchronized.</para> |
| <para>This is an example of how local context can be used to persist a processing state |
| between consecutive requests:</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.custom() |
| .addInterceptorLast(new HttpRequestInterceptor() { |
| |
| public void process( |
| final HttpRequest request, |
| final HttpContext context) throws HttpException, IOException { |
| AtomicInteger count = (AtomicInteger) context.getAttribute("count"); |
| request.addHeader("Count", Integer.toString(count.getAndIncrement())); |
| } |
| |
| }) |
| .build(); |
| |
| AtomicInteger count = new AtomicInteger(1); |
| HttpClientContext localContext = HttpClientContext.create(); |
| localContext.setAttribute("count", count); |
| |
| HttpGet httpget = new HttpGet("http://localhost/"); |
| for (int i = 0; i < 10; i++) { |
| CloseableHttpResponse response = httpclient.execute(httpget, localContext); |
| try { |
| HttpEntity entity = response.getEntity(); |
| } finally { |
| response.close(); |
| } |
| } |
| ]]></programlisting> |
| </section> |
| <section> |
| <title>Exception handling</title> |
| <para>HTTP protocol processors can throw two types of exceptions: |
| <exceptionname>java.io.IOException</exceptionname> in case of an I/O failure such as a |
| socket timeout or a socket reset, and <exceptionname>HttpException</exceptionname>, which |
| signals an HTTP failure such as a violation of the HTTP protocol. Usually I/O errors are |
| considered non-fatal and recoverable, whereas HTTP protocol errors are considered fatal |
| and cannot be automatically recovered from. Please note that |
| <interfacename>HttpClient</interfacename> implementations re-throw |
| <exceptionname>HttpException</exceptionname>s as |
| <exceptionname>ClientProtocolException</exceptionname>, which is a subclass |
| of <exceptionname>java.io.IOException</exceptionname>. This enables the users |
| of <interfacename>HttpClient</interfacename> to handle both I/O errors and protocol |
| violations from a single catch clause.</para> |
| <section> |
| <title>HTTP transport safety</title> |
| <para>It is important to understand that the HTTP protocol is not well suited to all |
| types of applications. HTTP is a simple request/response oriented protocol which was |
| initially designed to support static or dynamically generated content retrieval. It |
| has never been intended to support transactional operations. For instance, the HTTP |
| server will consider its part of the contract fulfilled if it succeeds in receiving |
| and processing the request, generating a response and sending a status code back to |
| the client. The server will make no attempt to roll back the transaction if the |
| client fails to receive the response in its entirety due to a read timeout, a |
| request cancellation or a system crash. If the client decides to retry the same |
| request, the server will inevitably end up executing the same transaction more than |
| once. In some cases this may lead to application data corruption or inconsistent |
| application state.</para> |
| <para>Even though HTTP has never been designed to support transactional processing, it |
| can still be used as a transport protocol for mission critical applications provided |
| certain conditions are met. To ensure HTTP transport layer safety the system must |
| ensure the idempotency of HTTP methods on the application layer.</para> |
| </section> |
| <section> |
| <title>Idempotent methods</title> |
| <para>The HTTP/1.1 specification defines an idempotent method as follows:</para> |
| <para> |
| <citation>Methods can also have the property of "idempotence" in |
| that (aside from error or expiration issues) the side-effects of N > 0 |
| identical requests is the same as for a single request</citation> |
| </para> |
| <para>In other words, the application ought to ensure that it is prepared to deal with |
| the implications of multiple executions of the same method. This can be achieved, for |
| instance, by providing a unique transaction id or by other means of avoiding |
| repeated execution of the same logical operation.</para> |
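| <para>As a rough illustration of the transaction id approach, the sketch below shows how |
| the receiving application might ignore a repeated request. It uses only JDK classes; the |
| <classname>TransactionLedgerSketch</classname> class, its <methodname>deposit</methodname> |
| method and the in-memory map are assumptions made for this example and are not part of |
| HttpClient. A real application would typically keep the record of applied transactions in |
| durable storage.</para> |
| <programlisting><![CDATA[ |
```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical server-side component; NOT an HttpClient class.
class TransactionLedgerSketch {

    // Maps each transaction id to the balance recorded when it was first applied.
    private final ConcurrentMap<String, Long> applied = new ConcurrentHashMap<>();
    private final AtomicLong balance = new AtomicLong();

    // Applies the deposit at most once per transaction id. A retried request
    // carrying the same id gets the result of the first execution back.
    long deposit(String transactionId, long amount) {
        return applied.computeIfAbsent(transactionId, id -> balance.addAndGet(amount));
    }

    long balance() {
        return balance.get();
    }
}
```
| ]]></programlisting> |
| <para>With such a guard in place, a client that retries a request after losing the |
| response does not cause the deposit to be applied twice, provided it resends the same |
| transaction id.</para> |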
| <para>Please note that this problem is not specific to HttpClient. Browser based |
| applications are subject to exactly the same issues related to the non-idempotency of |
| HTTP methods.</para> |
| <para>By default, for compatibility reasons, HttpClient assumes that only non-entity |
| enclosing methods such as <literal>GET</literal> and <literal>HEAD</literal> are |
| idempotent and that entity enclosing methods such as <literal>POST</literal> and |
| <literal>PUT</literal> are not.</para> |
| </section> |
| <section> |
| <title>Automatic exception recovery</title> |
| <para>By default HttpClient attempts to automatically recover from I/O exceptions. The |
| default auto-recovery mechanism is limited to just a few exceptions that are known |
| to be safe.</para> |
| <itemizedlist> |
| <listitem> |
| <para>HttpClient will make no attempt to recover from any logical or HTTP |
| protocol errors (those derived from |
| <exceptionname>HttpException</exceptionname> class).</para> |
| </listitem> |
| <listitem> |
| <para>HttpClient will automatically retry those methods that are assumed to be |
| idempotent.</para> |
| </listitem> |
| <listitem> |
| <para>HttpClient will automatically retry those methods that fail with a |
| transport exception while the HTTP request is still being transmitted to the |
| target server (i.e. the request has not been fully transmitted to the |
| server).</para> |
| </listitem> |
| </itemizedlist> |
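| <para>The three rules above can be read as a simple decision function. The sketch below is |
| one possible reading of them using only JDK classes; it is not HttpClient's actual |
| implementation. The <classname>RetryRulesSketch</classname> class, the |
| <exceptionname>ProtocolViolation</exceptionname> exception (a stand-in for |
| <exceptionname>HttpException</exceptionname>) and the <literal>requestFullySent</literal> |
| flag are hypothetical names introduced for illustration, and retry limits are deliberately |
| left out.</para> |
| <programlisting><![CDATA[ |
```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the default recovery rules; NOT an HttpClient class.
class RetryRulesSketch {

    // Hypothetical stand-in for a logical or HTTP protocol error.
    static class ProtocolViolation extends Exception {
        ProtocolViolation(String message) {
            super(message);
        }
    }

    // Methods assumed to be idempotent by default, as described in the text.
    private static final Set<String> ASSUMED_IDEMPOTENT =
        new HashSet<>(Arrays.asList("GET", "HEAD"));

    static boolean shouldRetry(Exception failure, String method, boolean requestFullySent) {
        if (!(failure instanceof IOException)) {
            // Rule 1: never recover from logical or protocol errors.
            return false;
        }
        if (!requestFullySent) {
            // Rule 3: the request never reached the server in full.
            return true;
        }
        // Rule 2: otherwise retry only methods assumed to be idempotent.
        return ASSUMED_IDEMPOTENT.contains(method);
    }
}
```
| ]]></programlisting> |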
| </section> |
| <section> |
| <title>Request retry handler</title> |
| <para>In order to enable a custom exception recovery mechanism one should provide an |
| implementation of the <interfacename>HttpRequestRetryHandler</interfacename> |
| interface.</para> |
| <programlisting><![CDATA[ |
| HttpRequestRetryHandler myRetryHandler = new HttpRequestRetryHandler() { |
| |
| public boolean retryRequest( |
| IOException exception, |
| int executionCount, |
| HttpContext context) { |
| if (executionCount >= 5) { |
| // Do not retry if over max retry count |
| return false; |
| } |
| if (exception instanceof InterruptedIOException) { |
| // Timeout |
| return false; |
| } |
| if (exception instanceof UnknownHostException) { |
| // Unknown host |
| return false; |
| } |
| if (exception instanceof ConnectTimeoutException) { |
| // Connection timed out |
| return false; |
| } |
| if (exception instanceof SSLException) { |
| // SSL handshake exception |
| return false; |
| } |
| HttpClientContext clientContext = HttpClientContext.adapt(context); |
| HttpRequest request = clientContext.getRequest(); |
| boolean idempotent = !(request instanceof HttpEntityEnclosingRequest); |
| if (idempotent) { |
| // Retry if the request is considered idempotent |
| return true; |
| } |
| return false; |
| } |
| |
| }; |
| CloseableHttpClient httpclient = HttpClients.custom() |
| .setRetryHandler(myRetryHandler) |
| .build(); |
| ]]></programlisting> |
| <para>Please note that one can use <classname>StandardHttpRequestRetryHandler</classname> |
| instead of the one used by default in order to treat those request methods defined |
| as idempotent by RFC 2616 as safe to retry automatically: <literal>GET</literal>, |
| <literal>HEAD</literal>, <literal>PUT</literal>, <literal>DELETE</literal>, |
| <literal>OPTIONS</literal>, and <literal>TRACE</literal>.</para> |
| </section> |
| </section> |
| <section> |
| <title>Aborting requests</title> |
| <para>In some situations HTTP request execution fails to complete within the expected time |
| frame due to high load on the target server or too many concurrent requests issued on |
| the client side. In such cases it may be necessary to terminate the request prematurely |
| and unblock the execution thread blocked in an I/O operation. HTTP requests being |
| executed by HttpClient can be aborted at any stage of execution by invoking the |
| <methodname>HttpUriRequest#abort()</methodname> method. This method is thread-safe |
| and can be called from any thread. When an HTTP request is aborted its execution thread |
| - even if currently blocked in an I/O operation - is guaranteed to unblock by throwing an |
| <exceptionname>InterruptedIOException</exceptionname>.</para> |
| </section> |
| <section> |
| <title>Redirect handling</title> |
| <para>HttpClient handles all types of redirects automatically, except those explicitly |
| prohibited by the HTTP specification as requiring user intervention. <literal>See |
| Other</literal> (status code 303) redirects on <literal>POST</literal> and |
| <literal>PUT</literal> requests are converted to <literal>GET</literal> requests as |
| required by the HTTP specification. One can use a custom redirect strategy to relax the |
| restrictions on automatic redirection of POST methods imposed by the HTTP |
| specification.</para> |
| <programlisting><![CDATA[ |
| LaxRedirectStrategy redirectStrategy = new LaxRedirectStrategy(); |
| CloseableHttpClient httpclient = HttpClients.custom() |
| .setRedirectStrategy(redirectStrategy) |
| .build(); |
| ]]></programlisting> |
| <para>HttpClient often has to rewrite the request message in the process of its execution. |
| By default, HTTP/1.0 and HTTP/1.1 generally use relative request URIs. Likewise, the |
| original request may get redirected from one location to another multiple times. The final |
| interpreted absolute HTTP location can be built using the original request and |
| the context. The utility method <classname>URIUtils#resolve</classname> can be used |
| to build the interpreted absolute URI used to generate the final request. This method |
| includes the last fragment identifier from the redirect requests or the original |
| request.</para> |
| <programlisting><![CDATA[ |
| CloseableHttpClient httpclient = HttpClients.createDefault(); |
| HttpClientContext context = HttpClientContext.create(); |
| HttpGet httpget = new HttpGet("http://localhost:8080/"); |
| CloseableHttpResponse response = httpclient.execute(httpget, context); |
| try { |
| HttpHost target = context.getTargetHost(); |
| List<URI> redirectLocations = context.getRedirectLocations(); |
| URI location = URIUtils.resolve(httpget.getURI(), target, redirectLocations); |
| System.out.println("Final HTTP location: " + location.toASCIIString()); |
| // Expected to be an absolute URI |
| } finally { |
| response.close(); |
| } |
| ]]></programlisting> |
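| <para>To make the idea of an interpreted absolute location more concrete, the following |
| sketch follows a chain of redirect locations using only <classname>java.net.URI</classname>. |
| It is not the implementation of <classname>URIUtils#resolve</classname>; the |
| <classname>RedirectResolutionSketch</classname> class, the |
| <methodname>resolveFinalLocation</methodname> method and the sample URIs are assumptions |
| made for illustration. Each location is resolved against the previous one, and the most |
| recently seen fragment identifier is carried over to the final URI.</para> |
| <programlisting><![CDATA[ |
```java
import java.net.URI;
import java.util.List;

// Hypothetical helper built on java.net.URI only; NOT an HttpClient class.
class RedirectResolutionSketch {

    static URI resolveFinalLocation(URI original, List<String> locations) {
        URI current = original;
        String lastFragment = original.getRawFragment();
        for (String location : locations) {
            // A relative location is interpreted against the previous URI.
            current = current.resolve(location);
            if (current.getRawFragment() != null) {
                lastFragment = current.getRawFragment();
            }
        }
        // Carry over the last fragment seen if the final location has none.
        if (current.getRawFragment() == null && lastFragment != null) {
            current = URI.create(current.toString() + "#" + lastFragment);
        }
        return current;
    }
}
```
| ]]></programlisting> |
| <para>For instance, starting from <literal>http://localhost:8080/app/start#top</literal> |
| and following the locations <literal>/login</literal> and <literal>home?lang=en</literal>, |
| this sketch yields <literal>http://localhost:8080/home?lang=en#top</literal>.</para> |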
| </section> |
| </chapter> |