Posted to httpclient-users@hc.apache.org by Night Mare <ni...@gmail.com> on 2011/02/11 00:51:48 UTC

Header and Content parsing and saving as html page

Hi there,
I have captured some HTTP packets using Wireshark, and I want to save the
content of these packets as a file such as index.html.
I can read the packets and convert them to a string, but I cannot decode them
(chunked encoding, gzip, etc.).
For example, I have a sample string like the one below (note: the string
contains \r\n characters, but I cannot show them in mail):


--------------------------------------------------------
HTTP/1.1 200 OK
Pragma: no-cache
Cache-Control: private, max-age=0, no-cache
Expires: Mon, 26 Jul 1997 05:00:00 GMT
P3P: CP="CUR ADM OUR NOR STA NID"
Set-Cookie: OAID=e1911d5cd304171a10af30e33c8a6126; expires=Fri, 10-Feb-2012
22:39:47 GMT; path=/
Content-type: text/html; charset=UTF-8
Content-Encoding: gzip
Vary: Accept-Encoding
Connection: close
Transfer-Encoding: chunked
Date: Thu, 10 Feb 2011 22:39:47 GMT
Server: what server

292
..........|S.n.0.}_i..].X.H.do........Z.<UN.&n.;r.........NBY$...x.>sf.89.z...r{
*]sp.....K...>./...\...7oo..z`..h.fR`...;.`.u.!........D...`.|{x4.}t.%..l....5.m...?...4..".E.B*
x.,....4..f.dG.f-...........@w
M......5gsI:....d".bPQVV:.=.4.9..K%..8..RE}.
V.9.{Ft5.......Qa.T..'..s..h$h.Y*.b....z..x...,KN.g:).,9......
....m.xa....b...W......:..:..9.|.....-.1...B7.....c.....|./..b5......
../.....B..Mm.=wk.....UE:......%p.........-d=FQ[........8UB..0....B.;...`.^.9.}.S..Ak..yk..i.8........lFk.Z.3.t....BE....%.t....{.jW.G.kJ:li.ZY."B93.../..j^.X....`...M\..6...e..*.5+....,R........f=._....coD.s..6.q..yz\v?..zp.qobn.O-.$;T=*._@.....Z..*E..e._.......9.......
0

---------------------------------------------------------------------------



Is there any way to decode this as an HTML page? If it is possible, please
share some code or a similar example.

Best Regards.
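
For readers who want to see the decoding steps spelled out, here is a minimal
sketch that uses only the JDK, not HttpClient. It assumes the whole captured
response is available as a byte array (not a String, since gzip data is
binary). The class and method names (RawResponseDecoder, decode, dechunk) are
made up for illustration. It only handles "Transfer-Encoding: chunked" and
"Content-Encoding: gzip"; trailers, deflate and Content-Length are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;

public class RawResponseDecoder {

    /** Reads one line up to LF (CR is dropped); returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Copies everything left in the stream into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Strips the chunked framing and stops at the zero-size chunk. */
    static byte[] dechunk(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine == null) {
                throw new IOException("missing chunk size line");
            }
            int semi = sizeLine.indexOf(';'); // ignore chunk extensions
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                return out.toByteArray(); // anything after this is not body data
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) {
                    throw new IOException("truncated chunk");
                }
                off += n;
            }
            out.write(chunk);
            readLine(in); // consume the CRLF that follows the chunk data
        }
    }

    /** Decodes a raw HTTP response (status line, headers, body) into body bytes. */
    public static byte[] decode(byte[] rawResponse) throws IOException {
        InputStream in = new ByteArrayInputStream(rawResponse);
        boolean chunked = false;
        boolean gzip = false;
        String line;
        // The status line and headers end at the first empty line.
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.startsWith("transfer-encoding:") && lower.contains("chunked")) {
                chunked = true;
            }
            if (lower.startsWith("content-encoding:") && lower.contains("gzip")) {
                gzip = true;
            }
        }
        byte[] body = chunked ? dechunk(in) : readAll(in);
        if (gzip) {
            body = readAll(new GZIPInputStream(new ByteArrayInputStream(body)));
        }
        return body;
    }
}
```

The returned bytes could then be written out with something like
Files.write(Paths.get("index.html"), body). The order matters: the chunked
framing has to be removed first, and only then can the gzip stream be
inflated.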

Re: Header and Content parsing and saving as html page

Posted by CodingForever <ni...@gmail.com>.
Thanks, olegk. You gave me exactly the answer I was waiting for.
I really appreciate your help.


-- 
View this message in context: http://old.nabble.com/Header-and-Content-parsing-and-saving-as-html-page-tp30897495p30902172.html
Sent from the HttpClient-User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org


Re: Header and Content parsing and saving as html page

Posted by CodingForever <ni...@gmail.com>.
I still need some help.
Can anybody help me with this problem?


-- 
View this message in context: http://old.nabble.com/Header-and-Content-parsing-and-saving-as-html-page-tp30897495p30926244.html
Sent from the HttpClient-User mailing list archive at Nabble.com.




Re: Header and Content parsing and saving as html page

Posted by CodingForever <ni...@gmail.com>.
Hi Oleg,
I used the code that you posted, but I have some problems:

1- My message can consist of multiple parts, so when the data is transferred
as chunked and a chunk is incomplete, it throws:

---------------------------------------------------------
org.apache.http.TruncatedChunkException: Truncated chunk ( expected size: 8106; actual size: 899)
        at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:182)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at java.io.Reader.read(Reader.java:140)
-------------------------------------------------------------

2- When I buffer all the data (all chunks), the buffer grows large and it
also throws:

---------------------------------------------------------------
org.apache.http.MalformedChunkCodingException: Unexpected content at the end of chunk
        at org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:239)
        at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:204)
        at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:167)
        at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:196)
        at org.apache.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:292)
        at sun.nio.cs.StreamDecoder.implClose(StreamDecoder.java:376)
        at sun.nio.cs.StreamDecoder.close(StreamDecoder.java:191)
        at java.io.InputStreamReader.close(InputStreamReader.java:199)
        at org.apache.http.util.EntityUtils.toString(EntityUtils.java:203)
        at org.apache.http.util.EntityUtils.toString(EntityUtils.java:221)
----------------------------------------------------------------------------------

Best Regards.
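
The first exception reports 899 bytes where 8106 were expected, which is
consistent with decoding a body before all of its TCP segments have been
collected. One way to guard against that is to check that the reassembled
body is complete before handing it to a decoder. The sketch below is a
JDK-only illustration of such a check; the class and method names
(ChunkedBodyCheck, isComplete) are made up, and it assumes the body bytes
start right after the blank line that ends the headers. It also assumes the
capture is kept as bytes throughout. A possible cause of the second exception,
not confirmed anywhere in this thread, is that converting binary gzip data to
a String and back changes its length, so the chunk boundaries no longer line
up.

```java
import java.nio.charset.StandardCharsets;

public class ChunkedBodyCheck {

    /** Index just past the next CRLF at or after 'from', or -1 if there is none. */
    static int endOfLine(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i + 2;
            }
        }
        return -1;
    }

    /**
     * Returns true when 'body' holds every chunk in full and the
     * terminating zero-size chunk line has been seen.
     */
    public static boolean isComplete(byte[] body) {
        int pos = 0;
        while (true) {
            int lineEnd = endOfLine(body, pos);
            if (lineEnd < 0) {
                return false; // the chunk size line itself is not fully captured
            }
            String sizeLine =
                new String(body, pos, lineEnd - 2 - pos, StandardCharsets.US_ASCII);
            int semi = sizeLine.indexOf(';'); // ignore chunk extensions
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size;
            try {
                size = Integer.parseInt(sizeLine.trim(), 16);
            } catch (NumberFormatException e) {
                return false; // not positioned at a chunk size line
            }
            if (size < 0) {
                return false;
            }
            if (size == 0) {
                return true; // terminating chunk reached
            }
            // Chunk data is followed by its own CRLF.
            long next = (long) lineEnd + size + 2;
            if (next > body.length) {
                return false; // chunk data is truncated
            }
            pos = (int) next;
        }
    }
}
```

With a check like this, more captured segments can be appended to the buffer
until isComplete returns true, and only then is the body decoded.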
-- 
View this message in context: http://old.nabble.com/Header-and-Content-parsing-and-saving-as-html-page-tp30897495p30922010.html
Sent from the HttpClient-User mailing list archive at Nabble.com.




Re: Header and Content parsing and saving as html page

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Fri, 2011-02-11 at 01:30 -0800, CodingForever wrote:
> I appreciated. 
> That is working like I want.
> You see that, i am trying to decoding html page using header and
> content(offline). And I am not a perfect about httpclient. So I could not
> find the best solution for my problem.
> 
> Think that you have a,
> Header and Content(raw content(gzip,deflate,may be chunked) )
> I need a solution that , I will give header and content then reading the
> decoded output until the end of the page.
> Can you offer me a solution for this problem?
> 
> Best Regards.
> 

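// Raw response: status line, headers, blank line, then the chunked body.
// The trailing " test" comes after the terminating zero-size chunk and
// should not appear in the decoded output.
String s =
    "HTTP/1.1 200 OK\r\n"
    + "Server: whatever\r\n"
    + "Date: some date\r\n"
    + "Set-Cookie: c1=stuff\r\n"
    + "Transfer-Encoding: chunked\r\n"
    + "Content-Type: text/html; charset=ISO-8859-1\r\n"
    + "\r\n"
    + "5\r\n01234\r\n5\r\n56789\r\n6\r\nabcdef\r\n0\r\n\r\n test";
// SessionInputBufferMockup is a helper from the HttpCore test suite
SessionInputBuffer inbuffer = new SessionInputBufferMockup(s, "US-ASCII");
// Parse the status line and headers
HttpResponseParser parser = new HttpResponseParser(
    inbuffer,
    BasicLineParser.DEFAULT,
    new DefaultHttpResponseFactory(),
    new BasicHttpParams());
HttpResponse response = (HttpResponse) parser.parse();
// Wrap the rest of the buffer according to Transfer-Encoding / Content-Length
EntityDeserializer deserializer = new EntityDeserializer(new LaxContentLengthStrategy());
HttpEntity entity = deserializer.deserialize(inbuffer, response);
response.setEntity(entity);
// Apply Content-Encoding (gzip, deflate) decoding, if any
BasicHttpContext context = new BasicHttpContext();
ResponseContentEncoding processor = new ResponseContentEncoding();
processor.process(response, context);
// The processor may replace the entity with a decompressing wrapper,
// so read the entity back from the response rather than reusing 'entity'
System.out.println(EntityUtils.toString(response.getEntity()));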

---

Oleg




Re: Header and Content parsing and saving as html page

Posted by CodingForever <ni...@gmail.com>.
Thanks, I will use Tika after the httpclient operations. 


-- 
View this message in context: http://old.nabble.com/Header-and-Content-parsing-and-saving-as-html-page-tp30897495p30901913.html
Sent from the HttpClient-User mailing list archive at Nabble.com.




Re: Header and Content parsing and saving as html page

Posted by Ken Krugler <kk...@transpac.com>.
Normally you'd let HttpClient handle decoding, chunked responses, etc.  
Then what you save is the raw content (as an array of bytes) and the  
response headers.

Converting the above into a parsable page is something best handled by  
Tika (as an example), since it will attempt to determine the charset  
encoding for the bytes, based on the response header, the HTML markup,  
and (worst case) statistics from the bytes.

The second, off-line step has nothing to do with HttpClient.

-- Ken
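
As a small illustration of the first of the three charset sources mentioned
above (the response header), the following JDK-only sketch pulls the charset
out of a Content-Type value and falls back to a caller-supplied default. It
does not use Tika and does not attempt the markup-based or statistical
detection; the class and method names (CharsetFromHeader, charsetOf) are made
up for this example.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class CharsetFromHeader {

    /**
     * Returns the charset named in a Content-Type header value,
     * or 'fallback' when the parameter is absent or not usable.
     */
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // illegal or unsupported charset name
                }
            }
        }
        return fallback;
    }
}
```

The decoded body bytes can then be turned into text with
new String(bodyBytes, charsetOf(headerValue, fallback)). When the header
carries no charset, a detector such as Tika remains the better choice.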


--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g








Re: Header and Content parsing and saving as html page

Posted by CodingForever <ni...@gmail.com>.
I appreciate it.
That works the way I want.
As you can see, I am trying to decode an HTML page offline, using its header
and content. I am not an expert in HttpClient, so I could not find the best
solution for my problem.

Suppose you have a header and content (raw content: gzip, deflate, possibly
chunked).
I need a solution where I supply the header and content and then read the
decoded output until the end of the page.
Can you offer me a solution for this problem?

Best Regards.


-- 
View this message in context: http://old.nabble.com/Header-and-Content-parsing-and-saving-as-html-page-tp30897495p30899580.html
Sent from the HttpClient-User mailing list archive at Nabble.com.




Re: Header and Content parsing and saving as html page

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Fri, 2011-02-11 at 00:39 -0800, CodingForever wrote:
> Thanks olegk for the answer,Now I am looking that. But I will ask something
> I wrote the code that below. How can I get the decoded content using header
> parameters ?

// Raw response: status line, headers, blank line, then the chunked body.
// The trailing " test" comes after the terminating zero-size chunk.
String s =
    "HTTP/1.1 200 OK\r\n"
    + "Server: whatever\r\n"
    + "Date: some date\r\n"
    + "Set-Cookie: c1=stuff\r\n"
    + "Transfer-Encoding: chunked\r\n"
    + "Content-Type: text/html; charset=ISO-8859-1\r\n"
    + "\r\n"
    + "5\r\n01234\r\n5\r\n56789\r\n6\r\nabcdef\r\n0\r\n\r\n test";
// SessionInputBufferMockup is a helper from the HttpCore test suite
SessionInputBuffer inbuffer = new SessionInputBufferMockup(s, "US-ASCII");
// Parse the status line and headers
HttpResponseParser parser = new HttpResponseParser(
    inbuffer,
    BasicLineParser.DEFAULT,
    new DefaultHttpResponseFactory(),
    new BasicHttpParams());
HttpResponse response = (HttpResponse) parser.parse();
// Let the deserializer pick the body decoding from the response headers
EntityDeserializer deserializer = new EntityDeserializer(new LaxContentLengthStrategy());
HttpEntity entity = deserializer.deserialize(inbuffer, response);
System.out.println(EntityUtils.toString(entity, HTTP.ASCII));

---

Oleg




Re: Header and Content parsing and saving as html page

Posted by CodingForever <ni...@gmail.com>.
Thanks, olegk, for the answer. I am looking at that now, but I have a question.
I wrote the code below. How can I get the decoded content using the header
parameters?
Also, I want to read only up to the end-of-chunks marker 0\r\n, but it returns
all the data, including the " test" string.

------------------------------------------------------
String s =
    "HTTP/1.1 200 OK\r\n"
    + "Server: whatever\r\n"
    + "Date: some date\r\n"
    + "Set-Cookie: c1=stuff\r\n"
    + "Transfer-Encoding: chunked\r\n"   // each header line needs its own CRLF
    + "Content-Type: text/html; charset=ISO-8859-1\r\n"
    + "\r\n";
SessionInputBuffer inbuffer = new SessionInputBufferMockup(s, "US-ASCII");
HttpResponseParser parser = new HttpResponseParser(
    inbuffer,
    BasicLineParser.DEFAULT,
    new DefaultHttpResponseFactory(),
    new BasicHttpParams());

HttpContext context = new BasicHttpContext();
HttpResponse response = (HttpResponse) parser.parse();
StringEntity entity = new StringEntity(
    "5\r\n01234\r\n5\r\n56789\r\n6\r\nabcdef\r\n0\r\n\r\n test");
entity.setChunked(true);
response.setEntity(entity);
ResponseConnControl interceptor = new ResponseConnControl();
interceptor.process(response, context);

byte[] b = new byte[1000];
int len = response.getEntity().getContent().read(b);
System.out.println(new String(b, 0, len));
-----------------------------------------------------------------------------








-- 
View this message in context: http://old.nabble.com/Header-and-Content-parsing-and-saving-as-html-page-tp30897495p30899433.html
Sent from the HttpClient-User mailing list archive at Nabble.com.




Re: Header and Content parsing and saving as html page

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2011-02-10 at 23:30 -0800, CodingForever wrote:
> I am still need help , I could not find the solution.
> CodingForever wrote:
> > 

Have a look at the test cases for EntityDeserializer

http://hc.apache.org/httpcomponents-core-ga/httpcore/xref-test/org/apache/http/impl/entity/TestEntityDeserializer.html

Oleg

> > Thanks for answer but,
> > I looked the examples that you send many times before. But I am working on
> > offline mode. But example codes are for online test.
> > I will not read requests or responses from socket. I will read from a
> > string.
> > I need a mechanism that
> > I will give string( header and body)  and I will save the output as html
> > page.
> > 
> > Best Regards.
> > 
> > On 11 February 2011 02:07, sebb <se...@gmail.com> wrote:
> > 
> >> On 10 February 2011 23:51, Night Mare <ni...@gmail.com> wrote:
> >> > Hi there,
> >> > I have captured some http packets using wireshark. And I want to save
> >> > content of these packets like index.html
> >> > I can read packets and convert them to the string but  i can not decode
> >> them
> >> > (chunk, gzip vs. problems)
> >> > For example i have a sample string like below(Note: string contains
> >> \r\n
> >> > characters but i can not show them in mail) :
> >> >
> >> >
> >> > --------------------------------------------------------
> >> > HTTP/1.1 200 OK
> >> > Pragma: no-cache
> >> > Cache-Control: private, max-age=0, no-cache
> >> > Expires: Mon, 26 Jul 1997 05:00:00 GMT
> >> > P3P: CP="CUR ADM OUR NOR STA NID"
> >> > Set-Cookie: OAID=e1911d5cd304171a10af30e33c8a6126; expires=Fri,
> >> 10-Feb-2012
> >> > 22:39:47 GMT; path=/
> >> > Content-type: text/html; charset=UTF-8
> >> > Content-Encoding: gzip
> >> > Vary: Accept-Encoding
> >> > Connection: close
> >> > Transfer-Encoding: chunked
> >> > Date: Thu, 10 Feb 2011 22:39:47 GMT
> >> > Server: what server
> >> >
> >> > 292
> >> >
> >> ..........|S.n.0.}_i..].X.H.do........Z.<UN.&n.;r.........NBY$...x.>sf.89.z...r{
> >> >
> >> *]sp.....K...>./...\...7oo..z`..h.fR`...;.`.u.!........D...`.|{x4.}t.%..l....5.m...?...4..".E.B*
> >> > x.,....4..f.dG.f-...........@w
> >> > M......5gsI:....d".bPQVV:.=.4.9..K%..8..RE}.
> >> > V.9.{Ft5.......Qa.T..'..s..h$h.Y*.b....z..x...,KN.g:).,9......
> >> > ....m.xa....b...W......:..:..9.|.....-.1...B7.....c.....|./..b5......
> >> >
> >> ../.....B..Mm.=wk.....UE:......%p.........-d=FQ[........8UB..0....B.;...`.^.9.}.S..Ak..yk..i.8........lFk.Z.3.t....BE....%.t....{.jW.G.kJ:li.ZY."B93.../..j^.X....`...M\..6...e..*.5+....,R........f=._....coD.s..6.q..yz\v?..zp.qobn.O-.$;T=*._@.....Z..*E..e._.......9.......
> >> > 0
> >> >
> >> >
> >> ---------------------------------------------------------------------------
> >> >
> >> >
> >> >
> >> > is there any way to decode this as html page. if it is possible please
> >> some
> >> > code or  similar code.
> >>
> >> http://hc.apache.org/httpcomponents-client-ga/examples.html
> >>
> >> which is the first link I found by googling for "httpclient content gzip"
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
> >> For additional commands, e-mail: httpclient-users-help@hc.apache.org
> >>
> >>
> > 
> > 
> > -- 
> > One Day You See Me...Not too far
> > 
> > 
> 





Re: Header and Content parsing and saving as html page

Posted by CodingForever <ni...@gmail.com>.
I still need help; I could not find a solution.


Re: Header and Content parsing and saving as html page

Posted by Night Mare <ni...@gmail.com>.
Thanks for the answer, but I have already looked at the examples you sent
many times. Those examples are written for live connections, whereas I am
working offline: I will not read requests or responses from a socket, I
will read them from a string.
I need a mechanism to which I can give a string (headers and body) and then
save the output as an HTML page.

Best Regards.
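
A minimal sketch of such an offline mechanism, using only the JDK rather
than HttpClient. Everything named here (OfflineHttpDecoder, bodyOffset,
dechunk, gunzip, decode) is hypothetical and not part of any HttpComponents
API. It assumes the captured response is available as raw bytes: a gzip body
is binary, so passing it through a String in a multi-byte charset would
likely corrupt it. The order matters: the chunked transfer coding is removed
first (the "292" in the sample is a hexadecimal chunk size, 658 bytes), and
only then is the gzip content coding undone. Header matching is deliberately
simplistic.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: decodes one captured HTTP response held in memory.
class OfflineHttpDecoder {

    // Offset of the first body byte, i.e. just past the blank line (CRLF CRLF).
    static int bodyOffset(byte[] raw) {
        for (int i = 0; i + 3 < raw.length; i++) {
            if (raw[i] == '\r' && raw[i + 1] == '\n'
                    && raw[i + 2] == '\r' && raw[i + 3] == '\n') {
                return i + 4;
            }
        }
        throw new IllegalArgumentException("no header/body separator found");
    }

    // Index of the next CRLF at or after 'from', or -1 if there is none.
    static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }

    // Removes the chunked transfer coding, starting at offset 'pos'.
    static byte[] dechunk(byte[] raw, int pos) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            int eol = indexOfCrlf(raw, pos);
            if (eol < 0) {
                throw new IllegalArgumentException("truncated chunk header");
            }
            String sizeLine = new String(raw, pos, eol - pos, StandardCharsets.US_ASCII);
            int semi = sizeLine.indexOf(';');          // drop chunk extensions
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // size is hex
            if (size == 0) {
                break;                                 // last chunk; trailers ignored
            }
            pos = eol + 2;
            if (pos + size > raw.length) {
                throw new IllegalArgumentException("truncated chunk data");
            }
            out.write(raw, pos, size);
            pos += size + 2;                           // skip data and its CRLF
        }
        return out.toByteArray();
    }

    // Undoes "Content-Encoding: gzip".
    static byte[] gunzip(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    // Full pipeline: split headers from body, dechunk, then gunzip.
    static byte[] decode(byte[] raw) throws IOException {
        int body = bodyOffset(raw);
        // Simplification: plain substring match on the lower-cased header block.
        String headers = new String(raw, 0, body, StandardCharsets.ISO_8859_1)
                .toLowerCase(Locale.ROOT);
        byte[] payload = headers.contains("transfer-encoding: chunked")
                ? dechunk(raw, body)
                : Arrays.copyOfRange(raw, body, raw.length);
        return headers.contains("content-encoding: gzip") ? gunzip(payload) : payload;
    }
}
```

The result of decode(...) could then be saved with something like
java.nio.file.Files.write(Paths.get("index.html"), decoded). If the capture
only exists as a String, converting it back with ISO-8859-1 may preserve the
bytes, but only if it was also decoded with ISO-8859-1 in the first place.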



-- 
One Day You See Me...Not too far

Re: Header and Content parsing and saving as html page

Posted by sebb <se...@gmail.com>.
On 10 February 2011 23:51, Night Mare <ni...@gmail.com> wrote:
> Hi there,
> I have captured some http packets using wireshark. And I want to save
> content of these packets like index.html
> I can read packets and convert them to the string but  i can not decode them
> (chunk, gzip vs. problems)
> For example i have a sample string like below(Note: string contains \r\n
> characters but i can not show them in mail) :
>
>
> --------------------------------------------------------
> HTTP/1.1 200 OK
> Pragma: no-cache
> Cache-Control: private, max-age=0, no-cache
> Expires: Mon, 26 Jul 1997 05:00:00 GMT
> P3P: CP="CUR ADM OUR NOR STA NID"
> Set-Cookie: OAID=e1911d5cd304171a10af30e33c8a6126; expires=Fri, 10-Feb-2012
> 22:39:47 GMT; path=/
> Content-type: text/html; charset=UTF-8
> Content-Encoding: gzip
> Vary: Accept-Encoding
> Connection: close
> Transfer-Encoding: chunked
> Date: Thu, 10 Feb 2011 22:39:47 GMT
> Server: what server
>
> 292
> ..........|S.n.0.}_i..].X.H.do........Z.<UN.&n.;r.........NBY$...x.>sf.89.z...r{
> *]sp.....K...>./...\...7oo..z`..h.fR`...;.`.u.!........D...`.|{x4.}t.%..l....5.m...?...4..".E.B*
> x.,....4..f.dG.f-...........@w
> M......5gsI:....d".bPQVV:.=.4.9..K%..8..RE}.
> V.9.{Ft5.......Qa.T..'..s..h$h.Y*.b....z..x...,KN.g:).,9......
> ....m.xa....b...W......:..:..9.|.....-.1...B7.....c.....|./..b5......
> ../.....B..Mm.=wk.....UE:......%p.........-d=FQ[........8UB..0....B.;...`.^.9.}.S..Ak..yk..i.8........lFk.Z.3.t....BE....%.t....{.jW.G.kJ:li.ZY."B93.../..j^.X....`...M\..6...e..*.5+....,R........f=._....coD.s..6.q..yz\v?..zp.qobn.O-.$;T=*._@.....Z..*E..e._.......9.......
> 0
>
> ---------------------------------------------------------------------------
>
>
>
> is there any way to decode this as html page. if it is possible please some
> code or  similar code.

http://hc.apache.org/httpcomponents-client-ga/examples.html

which is the first link I found by googling for "httpclient content gzip"
