Posted to solr-user@lucene.apache.org by "Teki, Prasad" <pr...@standardandpoors.com> on 2010/11/09 19:39:01 UTC

RE: returning message to sender

Hi guys,
I have been exploring Solr for the last few weeks. Our main intention is
to expose the data, as WS, across various data sources by linking them
using some scenario.

I have a couple of questions.
Is there any good document/URL which answers...

How is the index built for queries across different data sources (DIH)?

Does Lucene store the actual data of each individual query, or a
combination? If yes, where?

Whenever we run a query against the built index, when exactly does it fire
the query to the database?

How does the index get the updates from the DIH? For example, if my
query includes 3 DIH and
What is the max number of data sources I can include to get better
performance?

How do we measure the scalability?

Can I run these search engines in a grid mode?

Thanks.
-- 
View this message in context:
http://lucene.472066.n3.nabble.com/Storage-tp1871155p1871155.html
Sent from the Solr - User mailing list archive at Nabble.com.

Standard & Poor's: Empowering Investors and Markets for 150 Years
 
--------------------------------------------------------

The information contained in this message is intended only for the recipient, and may be a confidential attorney-client communication or may otherwise be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient, or an employee or agent responsible for delivering this message to the intended recipient, please be aware that any dissemination or copying of this communication is strictly prohibited. If you have received this communication in error, please immediately notify us by replying to the message and deleting it from your computer. The McGraw-Hill Companies, Inc. reserves the right, subject to applicable local law, to monitor and review the content of any electronic message or information sent to or from McGraw-Hill employee e-mail addresses without informing the sender or recipient of the message.
--------------------------------------------------------

Re: returning message to sender

Posted by Lance Norskog <go...@gmail.com>.

David Smiley and Eric Pugh wrote a wonderful book on Solr:

http://www.lucidimagination.com/blog/2010/01/11/book-review-solr-packt-book/

Reading through this book and trying the examples will address all of
your questions.

On Tue, Nov 9, 2010 at 3:23 PM, Erick Erickson <er...@gmail.com> wrote:
> Hmmm, this is a little murky....
> I'm inferring that you believe that DIH somehow
> queries the data source at #query# time, and this
> is not true.  DIH is an #index time# concept.
>
> DIH is used to add data to an index. Once that index is
> created, all searches against it are unaware that there
> were different data sources.
>
> So, with a single Solr schema, you can use DIH
> on as many different data sources as you want,
> mapping the various bits of information from each
> data source into your Solr schema. Searches go
> against fields defined in the schema, so you're
> automatically searching against all the databases
> (assuming you've mapped your data into your
> schema)....
>
> If I've misunderstood, perhaps you can add some
> details?
>
> Best
> Erick



-- 
Lance Norskog
goksron@gmail.com

Re: returning message to sender

Posted by Erick Erickson <er...@gmail.com>.

Hmmm, this is a little murky....
I'm inferring that you believe that DIH somehow
queries the data source at #query# time, and this
is not true.  DIH is an #index time# concept.

DIH is used to add data to an index. Once that index is
created, all searches against it are unaware that there
were different data sources.
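
A minimal sketch of the index-time versus query-time split, in Python.
It assumes a Solr instance at http://localhost:8983/solr with the
DataImportHandler registered at /dataimport (the path is set in
solrconfig.xml, so yours may differ). The helper names and the "title"
field are made up for illustration. The script only builds the two
kinds of URL and never contacts a server.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance; adjust to your setup.
SOLR_BASE = "http://localhost:8983/solr"


def dih_import_url(command="full-import", handler="/dataimport", **params):
    """Build the URL that asks DIH to pull rows from the data sources.

    This is the only point at which the databases are read (index time).
    """
    query = {"command": command}
    query.update(params)
    return f"{SOLR_BASE}{handler}?{urlencode(query)}"


def search_url(q, **params):
    """Build a search URL. It is answered from the index alone (query time)."""
    query = {"q": q}
    query.update(params)
    return f"{SOLR_BASE}/select?{urlencode(query)}"


if __name__ == "__main__":
    print(dih_import_url())                                # index time
    print(dih_import_url("delta-import", commit="true"))   # index time, changed rows only
    print(search_url("title:solr", rows=10))               # query time
```

The point of keeping the two helpers separate is that nothing in
search_url refers to a data source: by the time a search runs, the
import has already happened.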

So, with a single Solr schema, you can use DIH
on as many different data sources as you want,
mapping the various bits of information from each
data source into your Solr schema. Searches go
against fields defined in the schema, so you're
automatically searching against all the databases
(assuming you've mapped your data into your
schema)....
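
The mapping idea can be illustrated with a toy model in Python. This is
not how Solr or DIH is implemented. The two "databases", their column
names, and the shared schema fields (id, name) are all hypothetical, and
the index is a plain dictionary. It only shows that once rows from
different sources are mapped into one schema, a search no longer needs
to know where each document came from.

```python
from collections import defaultdict

# Hypothetical rows from two unrelated data sources, each with its own column names.
CUSTOMER_DB_ROWS = [{"cust_id": 1, "cust_name": "Acme Corp"}]
VENDOR_DB_ROWS = [{"vendor_no": 7, "legal_name": "Acme Supplies"}]


def to_schema_doc(row, id_col, name_col, prefix):
    """Map one source row onto the shared schema fields `id` and `name`."""
    return {"id": f"{prefix}-{row[id_col]}", "name": row[name_col]}


def build_index(docs):
    """Build a toy inverted index: lowercase token -> set of document ids."""
    index = defaultdict(set)
    for doc in docs:
        for token in doc["name"].lower().split():
            index[token].add(doc["id"])
    return index


def search(index, term):
    """Look a term up in the index; no data source is consulted here."""
    return sorted(index.get(term.lower(), set()))


# Index time: read every source once and map it into the shared schema.
docs = [to_schema_doc(r, "cust_id", "cust_name", "customer") for r in CUSTOMER_DB_ROWS]
docs += [to_schema_doc(r, "vendor_no", "legal_name", "vendor") for r in VENDOR_DB_ROWS]
index = build_index(docs)

# Query time: one lookup covers documents that came from both sources.
print(search(index, "acme"))      # ['customer-1', 'vendor-7']
print(search(index, "supplies"))  # ['vendor-7']
```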

If I've misunderstood, perhaps you can add some
details?

Best
Erick
