Posted to dev@hc.apache.org by "Mario Sangiorgio (JIRA)" <ji...@apache.org> on 2010/05/29 23:29:35 UTC

[jira] Created: (HTTPCLIENT-947) HTTPClient downloads an empty page

HTTPClient downloads an empty page
----------------------------------

                 Key: HTTPCLIENT-947
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-947
             Project: HttpComponents HttpClient
          Issue Type: Bug
    Affects Versions: 4.0.1
         Environment: Mac OS X
            Reporter: Mario Sangiorgio
         Attachments: log

I am seeing strange behavior when HttpClient downloads dynamically generated pages: depending on the page, it either retrieves the content or returns an empty page.

The problem occurs with the IEEE Xplore website. I can reliably access information for journal papers, but I cannot retrieve data about conference papers.
This is an example of a page I can download:
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5456077
and this is one of the pages that is giving me trouble:
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=671096

Please note that without the proxy that provides the authentication I can download both pages. When I have to authenticate to see the data, however, HttpClient correctly downloads only the journal papers.

I attached the log of a simple application that first successfully accesses http://ieexplore.ieee.org/ and then tries to get http://ieexplore.ieee.org/xpls/abs_all.jsp?arnumber=840991&tag=1
For privacy reasons I replaced the username I used to authenticate with my university proxy with OMITTED.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


[jira] Commented: (HTTPCLIENT-947) HTTPClient downloads an empty page

Posted by "Mario Sangiorgio (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873484#action_12873484 ] 

Mario Sangiorgio commented on HTTPCLIENT-947:
---------------------------------------------

You can see the failing request in the log file I attached, right after the first line of ********************************************
If you need anything else (for example, a successful request for a journal paper), please let me know and I will provide it as soon as possible.



[jira] Updated: (HTTPCLIENT-947) HTTPClient downloads an empty page

Posted by "Mario Sangiorgio (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HTTPCLIENT-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mario Sangiorgio updated HTTPCLIENT-947:
----------------------------------------

    Attachment: log



[jira] Commented: (HTTPCLIENT-947) HTTPClient downloads an empty page

Posted by "Oleg Kalnichevski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HTTPCLIENT-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873483#action_12873483 ] 

Oleg Kalnichevski commented on HTTPCLIENT-947:
----------------------------------------------

Where is the log of the failing request?

Oleg



[jira] Resolved: (HTTPCLIENT-947) HTTPClient downloads an empty page

Posted by "Oleg Kalnichevski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HTTPCLIENT-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleg Kalnichevski resolved HTTPCLIENT-947.
------------------------------------------

    Resolution: Invalid

You are getting back HTTP/1.0 200 OK, which means the request was successfully authenticated by the proxy. I cannot tell why there was no content returned, but in any case this is not a problem with HttpClient.

Oleg   
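
The reasoning in the resolution above can be sketched in code: a 407 status would indicate that the proxy rejected the credentials, whereas a 2xx status with a zero-length body means the request was accepted and the empty content came from the proxy or the origin server. The following is a minimal, library-free Java sketch under those assumptions. The class name ProxyResponseCheck, the Outcome enum, and both methods are hypothetical names introduced for illustration; they are not part of HttpClient.

```java
// Hypothetical helper, not part of HttpClient: classifies a response
// from its status line and body, following the reasoning in the comment above.
class ProxyResponseCheck {

    enum Outcome { PROXY_AUTH_REQUIRED, EMPTY_BODY, OK, OTHER }

    // Extracts the numeric status code from a line such as "HTTP/1.0 200 OK".
    static int statusCode(String statusLine) {
        String[] parts = statusLine.trim().split("\\s+", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    // 407 means the proxy still wants credentials; a 2xx with no bytes
    // means the request was accepted but no content came back.
    static Outcome classify(String statusLine, byte[] body) {
        int code = statusCode(statusLine);
        if (code == 407) {
            return Outcome.PROXY_AUTH_REQUIRED;
        }
        if (code >= 200 && code < 300) {
            return (body == null || body.length == 0) ? Outcome.EMPTY_BODY : Outcome.OK;
        }
        return Outcome.OTHER;
    }

    public static void main(String[] args) {
        // The status line reported in this issue, paired with an empty body.
        System.out.println(classify("HTTP/1.0 200 OK", new byte[0]));
    }
}
```

Under this sketch, the case reported here falls into EMPTY_BODY rather than PROXY_AUTH_REQUIRED, which matches the conclusion that authentication succeeded.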
