Posted to user@nutch.apache.org by Philip Brown <ph...@primeradesigns.com> on 2006/08/28 15:30:47 UTC

opensearchservlet - HowTo ?

Hello,

I am looking at running a search on my index (retrieved with Nutch) using 
OpenSearchServlet.

I have searched this user forum and google.com for further info, but 
alas, there are no "no brainer" tutorials out there.

I have a simple idea of how it might work:
1. Set up an HTML search page.
2. Send the query (a form field) to the servlet.
3. The servlet generates XML.
4. The servlet redirects to a page which then reads the XML output from step 3.
That is roughly how I envisage it working, but it is only a guess.

Would anybody care to spend 5 minutes to kindly point me in the right 
direction?

Would appreciate any help very Nutch!

Kind regards,
Philip

Re: opensearchservlet - HowTo ?

Posted by Andrzej Bialecki <ab...@getopt.org>.
Philip Brown wrote:
>
> I did not want to drop the whole .war into my already running web app. I am 
> just looking at dropping the nutch.searcher package in, together with the 
> two imports it depends on, i.e. hadoop.conf.Configuration and 
> nutch.util.NutchConfiguration. It will access the index under 
> tomcatServerRoot/nutch-0.8/crawl-result/index...

It will access whatever you specified in your nutch-default.xml or 
nutch-site.xml property "searcher.dir".

>
> So, as I understand it, you run the query and the servlet outputs XML. 
> Does it include an XSLT stylesheet to format the page?

No, but it shouldn't be too difficult to write it. Contributions are 
welcome :)
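
For anyone who wants to try, here is a minimal sketch of what such a stylesheet could look like, applied with the JDK's built-in javax.xml.transform API. The class name, the stylesheet and the trimmed sample response are all made up for illustration; nothing like this ships with Nutch as far as this thread says, and the element names are simply taken from the example response posted elsewhere in the thread.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: turns an OpenSearch RSS response into a small HTML page.
class OpenSearchToHtml {

    // Illustrative XSLT 1.0 stylesheet (an assumption, not part of Nutch).
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:opensearch=\"http://a9.com/-/spec/opensearchrss/1.0/\""
      + " exclude-result-prefixes=\"opensearch\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/\">"
      + "<html><body>"
      + "<h1><xsl:value-of select=\"rss/channel/title\"/></h1>"
      + "<p>Total hits: <xsl:value-of select=\"rss/channel/opensearch:totalResults\"/></p>"
      + "<ul><xsl:for-each select=\"rss/channel/item\">"
      + "<li><a href=\"{link}\"><xsl:value-of select=\"title\"/></a></li>"
      + "</xsl:for-each></ul>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Trimmed copy of the example response, used only to exercise the sketch.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
      + "<rss xmlns:nutch=\"http://www.nutch.org/opensearchrss/1.0/\""
      + " xmlns:opensearch=\"http://a9.com/-/spec/opensearchrss/1.0/\" version=\"2.0\">"
      + "<channel>"
      + "<title>Nutch: cnn</title>"
      + "<opensearch:totalResults>1</opensearch:totalResults>"
      + "<item>"
      + "<title>CNN.com - Breaking News, U.S., World, Weather, Entertainment &amp; Video News</title>"
      + "<link>http://www.cnn.com/</link>"
      + "<nutch:site>www.cnn.com</nutch:site>"
      + "</item>"
      + "</channel></rss>";

    // Applies STYLESHEET to the given RSS document and returns the HTML as a string.
    static String toHtml(String rssXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(rssXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(SAMPLE));
    }
}
```

Note that the <description> element in the real response carries escaped HTML (the highlighted snippet), so a plain xsl:value-of would print the markup as literal text; rendering it as markup would need extra handling in the stylesheet.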

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: opensearchservlet - HowTo ?

Posted by Philip Brown <ph...@primeradesigns.com>.
Thanks for the reply,

I did not want to drop the whole .war into my already running web app. I am 
just looking at dropping the nutch.searcher package in, together with the 
two imports it depends on, i.e. hadoop.conf.Configuration and 
nutch.util.NutchConfiguration. It will access the index under 
tomcatServerRoot/nutch-0.8/crawl-result/index...

So, as I understand it, you run the query and the servlet outputs XML. 
Does it include an XSLT stylesheet to format the page?

Unfortunately I don't have time to start experimenting and seeing my own 
results for the next few days, so I am just trying to garner info to 
point me in the right direction. I appreciate all the replies.

Thanks

Re: opensearchservlet - HowTo ?

Posted by Andrzej Bialecki <ab...@getopt.org>.
Michael Wechner wrote:
> Sandy Polanski wrote:
>
>> I second that.  Is there anyone that can give us some tips on how to 
>> use the OpenSearchServlet?  I'd really like to see a standalone Java 
>> program that would allow me to see the results in RSS format that I 
>> can call from the "./bin/nutch" executable.
>>  
>>
>
> I guess the bin/nutch resp. some other program (maybe based on 
> NutchBean) should return a RSS feed which then can be pulled/parsed by 
> the OpenSearchServlet. The question is does something like this 
> already exist within Nutch and if not is somebody writing something 
> like this
> (for instance myself ;-) but I would rather wait if somebody might 
> answer the "exist" question ...


Folks,

As the name itself suggests, the servlet needs a servlet container to 
run. If you build a standard WAR you will get among others the 
OpenSearchServlet included in the WAR, under <contextPath>/opensearch. 
Deploy this WAR to your favorite servlet container, e.g. Tomcat, and you 
are ready to go.

This is a REST-type service, which means that you send it requests as 
standard HTTP GET-s with parameters in the URL, and as a response you 
get an XML document.

Example request:

    http://localhost:8081/nutch/opensearch?query=cnn
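
A minimal client-side sketch of building and sending such a request from plain Java, using only the JDK. The class and method names are made up for illustration, the base URL is the one from the example above, and only the "query" parameter is used, since that is the only one shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical helper for talking to a deployed OpenSearchServlet.
class OpenSearchUrl {

    // Builds the GET URL; the query text is URL-encoded so that spaces,
    // '&' and similar characters do not break the parameter.
    static String build(String baseUrl, String query) {
        try {
            return baseUrl + "?query=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Sends a plain HTTP GET and returns the response body (the XML document).
    // Requires a running servlet container at the given URL.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String url = build("http://localhost:8081/nutch/opensearch", "cnn");
        System.out.println(url);
        // Only contact the server when asked to, e.g. "java OpenSearchUrl fetch".
        if (args.length > 0 && args[0].equals("fetch")) {
            System.out.println(fetch(url));
        }
    }
}
```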

Example response:

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:nutch="http://www.nutch.org/opensearchrss/1.0/" xmlns:opensearch="http://a9.com/-/spec/opensearchrss/1.0/" version="2.0">
<channel>
<title>Nutch: cnn</title>
<description>Nutch search results for query: cnn</description>
<link>http://localhost:8081/nutch/search.jsp?query=cnn&amp;start=0&amp;hitsPerDup=2&amp;hitsPerPage=10</link>
<opensearch:totalResults>1</opensearch:totalResults>
<opensearch:startIndex>0</opensearch:startIndex>
<opensearch:itemsPerPage>10</opensearch:itemsPerPage>

<nutch:query>cnn</nutch:query>
<item>
<title>CNN.com - Breaking News, U.S., World, Weather, Entertainment &amp; Video News</title>
<description>&lt;span class="ellipsis"&gt; ... &lt;/span&gt;the world Instant Access &lt;span class="highlight"&gt;CNN&lt;/span&gt; International Live newscasts and&lt;span class="ellipsis"&gt; ... &lt;/span&gt;Pipeline Overnight Live feeds from &lt;span class="highlight"&gt;CNN&lt;/span&gt; and its global&lt;span class="ellipsis"&gt; ... &lt;/span&gt;</description>

<link>http://www.cnn.com/</link>
<nutch:site>www.cnn.com</nutch:site>
<nutch:cache>http://localhost:8081/nutch/cached.jsp?idx=0&amp;id=0</nutch:cache>
<nutch:explain>http://localhost:8081/nutch/explain.jsp?idx=0&amp;id=0&amp;query=cnn&amp;lang=null</nutch:explain>
<nutch:segment>20060817135307</nutch:segment>
<nutch:digest>6e5e1ede359a88f11fc564cf22f79305</nutch:digest>
<nutch:boost>2.5735338</nutch:boost>

</item>
</channel>
</rss>
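
To read such a response from your own code, a namespace-aware DOM parse with the JDK is enough. The following is only a sketch: the class and method names are made up for illustration, the embedded sample is a trimmed copy of the response above, and the namespace URIs are the ones declared on the <rss> element.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical reader for the RSS document returned by OpenSearchServlet.
class OpenSearchResponse {
    static final String OPENSEARCH_NS = "http://a9.com/-/spec/opensearchrss/1.0/";
    static final String NUTCH_NS = "http://www.nutch.org/opensearchrss/1.0/";

    // Trimmed copy of the example response, used only to exercise the sketch.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
      + "<rss xmlns:nutch=\"" + NUTCH_NS + "\""
      + " xmlns:opensearch=\"" + OPENSEARCH_NS + "\" version=\"2.0\">"
      + "<channel>"
      + "<title>Nutch: cnn</title>"
      + "<opensearch:totalResults>1</opensearch:totalResults>"
      + "<item>"
      + "<title>CNN.com - Breaking News, U.S., World, Weather, Entertainment &amp; Video News</title>"
      + "<link>http://www.cnn.com/</link>"
      + "<nutch:site>www.cnn.com</nutch:site>"
      + "</item>"
      + "</channel></rss>";

    // Parses the XML text with namespace support switched on.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse response", e);
        }
    }

    // Reads <opensearch:totalResults> from the channel.
    static int totalResults(Document doc) {
        NodeList nodes = doc.getElementsByTagNameNS(OPENSEARCH_NS, "totalResults");
        return Integer.parseInt(nodes.item(0).getTextContent().trim());
    }

    // Returns one "title | link | site" line per <item>.
    static List<String> hits(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            result.add(text(item.getElementsByTagName("title")) + " | "
                     + text(item.getElementsByTagName("link")) + " | "
                     + text(item.getElementsByTagNameNS(NUTCH_NS, "site")));
        }
        return result;
    }

    // Text of the first node in the list, or "" if the list is empty.
    private static String text(NodeList nodes) {
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("total: " + totalResults(doc));
        for (String hit : hits(doc)) {
            System.out.println(hit);
        }
    }
}
```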



-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: opensearchservlet - HowTo ?

Posted by Michael Wechner <mi...@wyona.com>.
Sandy Polanski wrote:

>I second that.  Is there anyone that can give us some tips on how to use the OpenSearchServlet?  I'd really like to see a standalone Java program that would allow me to see the results in RSS format that I can call from the "./bin/nutch" executable.
>  
>

I guess bin/nutch, or some other program (maybe based on NutchBean), 
should return an RSS feed which can then be pulled/parsed by the 
OpenSearchServlet. The question is whether something like this already 
exists within Nutch and, if not, whether somebody is writing something 
like this (for instance myself ;-), but I would rather wait to see if 
somebody can answer the "exist" question ...

Michi



-- 
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org
+41 44 272 91 61


Re: opensearchservlet - HowTo ?

Posted by Sandy Polanski <sa...@yahoo.com>.
I second that.  Is there anyone that can give us some tips on how to use the OpenSearchServlet?  I'd really like to see a standalone Java program that would allow me to see the results in RSS format that I can call from the "./bin/nutch" executable.

 		