Posted to java-user@lucene.apache.org by Lukáš Vlček <lu...@gmail.com> on 2010/09/02 21:20:15 UTC

Federated search with opensearch or proprietary APIs for Atlassian

Hi,

Does anybody have experience building federated search using OpenSearch
and/or the proprietary APIs of Atlassian's products?

Many of Atlassian's products have built-in full-text search modules (running
on top of Lucene, I think) that provide interesting and sometimes quite
advanced search features (including recommendations and identification of
similar or "the-like" documents). However, there does not seem to be any easy
way to let users search across multiple CMSs in a useful way (or is there?),
i.e. federated search. Atlassian seems to provide an OpenSearch API in some
products (version 1.1, still in draft since 2005!) or a proprietary API
(REST-based, for example) that is subject to change with every product
update. As far as I understand, OpenSearch is quite limited, and so is the
REST API. The latter is a bit more advanced, but the documentation for their
search API seems quite brief, and learning how to do sorting, phrase
querying, boosting or anything more advanced than a simple term query sounds
like a lot of time spent experimenting.
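
To illustrate how little an OpenSearch 1.1 endpoint lets a client express,
here is a minimal sketch of filling in a URL template. The parameter names
searchTerms, startIndex and count come from the OpenSearch 1.1 draft; the
host, path and query-string keys in TEMPLATE are hypothetical and do not
describe any real Atlassian endpoint.

```python
import re
from urllib.parse import quote

# Hypothetical template, of the kind an OpenSearch description document
# advertises. Only the {parameter} names follow the OpenSearch 1.1 draft.
TEMPLATE = ("https://wiki.example.com/opensearch"
            "?q={searchTerms}&start={startIndex?}&n={count?}")


def fill_template(template, **params):
    """Substitute template parameters; optional ones ({name?}) may be left out."""
    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # an omitted optional parameter becomes an empty string
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


url = fill_template(TEMPLATE, searchTerms='"federated search"', count=10)
print(url)
```

The template has a slot for the query string and for paging, but none for
sort order, boosts, filters or facets, so whatever the remote engine does
with the terms is outside the client's control.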

I believe the only reasonable and efficient way to provide federated search
is to pull the content from the individual CMSs into a new search server
(Solr, for example). However, this may sound like reinventing the wheel to
customers ("The Atlassian product owns the data and its developers invested a
lot of resources into all the fancy search features, right? Why build it
again?"). So I would like to hear from anybody who can prove me wrong in my
opinion that unless I can grab, pull (steal, if you will) and re-index the
data, there is no way to build something really useful. By "really useful" I
mean something more than just sending term queries to the individual CMSs and
merging the individual search results (which does not allow building facets
efficiently, does not allow good sorting and filtering, does not allow
scoring control, etc., not to mention spell checking, search suggestions and
"did you mean ...").
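
As a rough sketch of what "merging individual search results" comes down to,
assume each CMS returns a ranked list of hits with its own score. The hit
dictionaries, URLs and score values below are made up for illustration. Since
the scores come from different engines they are not comparable, so about the
only neutral merge is a round-robin interleave:

```python
from itertools import zip_longest


def interleave(result_lists):
    """Round-robin merge of per-source ranked lists, skipping duplicate URLs."""
    merged, seen = [], set()
    for tier in zip_longest(*result_lists):  # rank 1 of every source, then rank 2, ...
        for hit in tier:
            if hit is not None and hit["url"] not in seen:
                seen.add(hit["url"])
                merged.append(hit)
    return merged


# Hypothetical hits: each source scores on its own scale.
wiki = [{"url": "wiki/1", "score": 0.91},
        {"url": "wiki/2", "score": 0.40}]
tracker = [{"url": "issue/7", "score": 12.3},
           {"url": "issue/9", "score": 11.8},
           {"url": "issue/3", "score": 2.1}]

merged = interleave([wiki, tracker])
print([hit["url"] for hit in merged])
```

Sorting the merged list by score instead would simply put every tracker hit
first, and facet counts would need every source to return its full result
set, which is the inefficiency mentioned above.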

Regards,
Lukas

Re: Federated search with opensearch or proprietary APIs for Atlassian

Posted by Chris Lu <ch...@gmail.com>.
I understand your point. But Atlassian does not have a stable search API, and
you cannot do much with it.

It's better to have a separate search engine, where you get most of the
features, like faceted search, ranking, pagination, etc., done for you (see
my signature). You will also be more flexible with the structure, and can
even deal with data beyond Atlassian products.

I guess that's the reason Google did not rely on each website's own search
mechanism.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!

Re: Federated search with opensearch or proprietary APIs for Atlassian

Posted by mark harwood <ma...@yahoo.co.uk>.
There is a pretty thorough exploration of the issues in federated search here:
http://ilpubs.stanford.edu:8090/271/
I'd add "security", i.e. authentication and authorisation, to the list of
issues to be considered (key in some environments).

If you consolidate content in a centralised Solr/Lucene index by feeding it
batch data from the source systems, I don't think that can be described as
"federated search". The issues with such consolidation become:
* Establishing connectivity/feeds that pass only deltas from data sources
* Latency of updates 
* Respecting any security around source material (authentication, authorisation 
and auditing)
* Tackling concerns around overheads in "duplicating data" for large 
installations
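
A minimal sketch of the first and third points, assuming each source document
carries a modification timestamp and a set of groups allowed to read it. The
field names (id, modified, acl), the in-memory index and the sample data are
all hypothetical:

```python
def pull_deltas(source_docs, last_seen):
    """Return documents modified after last_seen, plus the new high-water mark."""
    changed = [d for d in source_docs if d["modified"] > last_seen]
    new_mark = max((d["modified"] for d in changed), default=last_seen)
    return changed, new_mark


def visible_to(index, user_groups):
    """Keep only indexed documents whose ACL shares a group with the user."""
    return [d for d in index.values() if d["acl"] & user_groups]


# Hypothetical source system and central index.
source = [
    {"id": "page-1", "modified": 100, "acl": {"staff"}},
    {"id": "page-2", "modified": 250, "acl": {"admins"}},
]
index, mark = {}, 0

# First feed: everything is new.
changed, mark = pull_deltas(source, mark)
index.update({d["id"]: d for d in changed})

# One page is edited; the next feed carries only that document.
source[0] = {"id": "page-1", "modified": 300, "acl": {"staff"}}
changed, mark = pull_deltas(source, mark)
index.update({d["id"]: d for d in changed})

second_batch = [d["id"] for d in changed]
staff_view = [d["id"] for d in visible_to(index, {"staff"})]
print(second_batch, staff_view)
```

This leaves out deletions and ACL changes that do not touch the document's
timestamp, which are usually the harder parts of keeping such a feed correct.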


Cheers,
Mark




