Posted to solr-user@lucene.apache.org by senthil <se...@sardonyx.in> on 2018/09/15 03:51:26 UTC
Need Support - Apache Solr - 20180915
Dear Team,
We are beginners with Apache Solr and its implementation. We need the
following basic clarifications about using Apache Solr together
with an MS-SQL Server database.
1. Our MS-SQL Server database has a data table that contains 20
columns and billions of rows.
2. How to implement Apache Solr in the particular above table to increase
search capability?
3. Is there any way to query data that is distributed across 2
shards/nodes of Apache Solr at the same time?
4. Is there any performance difference between searching the data in a single
shard/node and in multiple shards/nodes?
Thanks & Regards
SENTHIL KUMAR P
Team Leader
Office: +91-4362-243433
Skype: syxsenthilp
Sardonyx Technologies Pvt. Ltd.,
Thanjavur - 613007
www.sardonyx.in <http://www.sardonyx.in/>
From: senthil [mailto:senthilkumarp@sardonyx.in]
Sent: Friday, September 14, 2018 12:40 PM
To: 'general@lucene.apache.org'; 'solr-user@lucene.apache.org';
'dev@lucene.apache.org'
Cc: kulothungand; 'karthickrm@sardonyx.in'
Subject: Need Support - Apache Solr - 20180914
Dear Team,
We need the clarifications below about "Apache Solr"; please
give us a solution.
1. Is Apache Solr a database or not?
2. Apache Solr has a limit of 2 billion records; how can we
increase it (to unlimited)?
3. How many users can use / access Apache Solr through a web application
at a time?
Re: Need Support - Apache Solr - 20180915
Posted by Walter Underwood <wu...@wunderwood.org>.
> On Sep 15, 2018, at 12:14 PM, Rajdeep Sahoo <ra...@gmail.com> wrote:
>
> Solr is a file based database for retrieving data and not for complex
> operations like joining multiple tables.
Solr is NOT a file-based database.
Solr is a search engine. It is not any kind of database because it does not meet the ACID properties.
wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/ (my blog)
Re: Need Support - Apache Solr - 20180915
Posted by Rajdeep Sahoo <ra...@gmail.com>.
You can go for SolrCloud if you have billions of records and expect the
volume to grow in the future.
Solr is a file based database for retrieving data and not for complex
operations like joining multiple tables. If your requirement is only storing
and fast retrieval, then Solr is the best option in comparison to a
conventional relational DB.
You can configure the number of threads in the Jetty server configuration.
Re: Need Support - Apache Solr - 20180915
Posted by Shawn Heisey <ap...@elyograg.org>.
On 9/14/2018 9:51 PM, senthil wrote:
>
> We are beginners to Apache Solr and its implementations. We need the
> following basic clarifications regarding Apache Solr usage and
> implementing with MS-SQL server database.
>
I don't know what you think of Erick's answers, but he's right on the
money with everything he said. Here's my contribution. Just more detail.
> 1. Our MS-SQL server database having the data table which contains 20
> columns with billions of data.
>
MS SQL probably means your environment is Windows Server. If you can,
run Solr on something other than Windows. Solr can run just fine on a
Server edition of Windows, but it will run better on something else.
Open source operating systems will serve you very well.
> 2. How to implement Apache Solr in the particular above table to
> increase search capability?
>
Setting Solr up to import from a database is not terribly difficult.
Where you will probably spend the most time is perfecting your field
analysis in your schema. Getting that right can take a lot of
experimentation, rebuilding the index every time you change something.
You probably don't want to import your whole database table every time
while you work on this step.
There are certain gotchas when using the DataImport Handler with
SolrCloud. You'll be happier with Solr if you can build your own
program to transfer data from your database into Solr. With a
multi-threaded indexing application, you can achieve import speeds far
greater than DIH can.
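To make the "build your own program" suggestion concrete, here is a minimal sketch of a multi-threaded indexer in Python. It is only an illustration, not what anyone in this thread actually runs. It assumes the rows have already been read from the database as dictionaries, that Solr listens on localhost:8983, and that the collection is named "mycollection" (both hypothetical). The function names are made up for this example.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Hypothetical host and collection name; adjust to your installation.
DEFAULT_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def post_batch(batch, update_url=DEFAULT_UPDATE_URL):
    """POST one batch of documents as a JSON array to Solr's update handler."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_rows(rows, send=post_batch, batch_size=1000, workers=4):
    """Send rows to Solr in parallel batches; return the number of batches sent.

    Only `workers` batches are held in memory at a time, so `rows` can be
    a lazy cursor over a very large table. No commit is issued here; that
    is left to Solr's autoCommit settings or a final explicit commit.
    """
    sent = 0
    batches = batched(rows, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            window = list(islice(batches, workers))
            if not window:
                break
            sent += len(list(pool.map(send, window)))
    return sent
```

The `send` argument is injectable so the batching logic can be tried without a running Solr instance. Batch size and thread count are placeholders that would need tuning against your own hardware.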
> 3. Is there any way to call the data which is distributed across 2
> shards/node of Apache Solr at a time?
>
As Erick said, this is where SolrCloud shines. You can do sharded
indexes without SolrCloud, but it is much more difficult to manage.
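As a rough illustration of the difference, the sketch below builds the query URL for both cases. With SolrCloud, a request to the collection's select handler on any node is distributed across the shards for you. Without SolrCloud, the request has to name the shards itself through the "shards" parameter. The host names, core names and the "products" collection are hypothetical, and the helper function is made up for this example.

```python
from urllib.parse import urlencode


def select_url(base_url, collection, query, shards=None, rows=10):
    """Build a Solr select URL.

    With SolrCloud, leave `shards` as None: the collection name is enough.
    Without SolrCloud, pass the list of shard addresses explicitly.
    """
    params = {"q": query, "rows": rows, "wt": "json"}
    if shards:
        params["shards"] = ",".join(shards)
    # Keep ':', '/', ',' and '*' readable instead of percent-encoding them.
    return f"{base_url}/solr/{collection}/select?" + urlencode(params, safe=":/,*")


# SolrCloud: one URL, Solr fans the query out to every shard.
cloud_url = select_url("http://localhost:8983", "products", "title:solr")

# No SolrCloud: the caller has to know and list every shard.
manual_url = select_url(
    "http://host1:8983",
    "core1",
    "title:solr",
    shards=["host1:8983/solr/core1", "host2:8983/solr/core1"],
)
```

The second form is the part that becomes hard to manage: every client has to keep the shard list up to date by hand.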
> 4. Is there any performance difference between search the data in a
> single shard/node and multiple shard/node?
>
I'm not sure how to approach this question - mostly because I cannot
tell exactly what you're asking. Are you asking about multiple shards
per node, or multiple shards in general?
The short answer is yes in either case. And if all you want to know is
whether a performance difference EXISTS, then the answer is yes. The
long answer, like MANY questions about Solr, is "it depends." If, in
addition to whether a performance difference exists, you want to know
which way has better performance, the answer is still "it depends."
If your query rate is VERY low, splitting into multiple shards on the
same node can actually perform BETTER than a single shard on the same
machine. As your query rate grows, you'll want those shards to be on
separate machines, or query performance will suffer.
Thanks,
Shawn
Re: Need Support - Apache Solr - 20180915
Posted by Erick Erickson <er...@gmail.com>.
2. It's almost always a mistake to think about it this way. You're not
going to "implement Apache Solr in the particular above table to
increase search capability", you're going to build a search
application where the system of record is the database. So forget
about the table! Just consider the data it contains and ask "how do I
want to search this?" Build out your use cases, which will be vital to
designing your schema, and _then_ consider how to move the data from your
DB into Solr to support those use cases.
3. That's what SolrCloud is all about.
4. Again, the wrong question. At the "billions" scale you have to use
shards. As was said before, there's a hard limit of 2 billion
docs/shard, and practically less than that. You have no choice but to
shard so a better question is "can I get the performance I need from a
multi-shard Solr collection with the data I have and the use-cases I
need to support", which you can't tell until you prototype. Here's a
blog on this very topic:
https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
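The arithmetic behind "you have no choice but to shard" can be sketched as follows. This is back-of-the-envelope math only, not a sizing recommendation. The constant is an assumption: it is Lucene's per-index document limit as I understand it (Integer.MAX_VALUE - 128), where the thread itself just says "2 billion". The function name and the example document counts are made up.

```python
# Assumed value of Lucene's per-index document limit (Integer.MAX_VALUE - 128).
LUCENE_MAX_DOCS = 2_147_483_519


def min_shards(total_docs, docs_per_shard=LUCENE_MAX_DOCS):
    """Return the smallest shard count that keeps every shard at or
    below `docs_per_shard` documents, assuming an even distribution."""
    if total_docs < 0:
        raise ValueError("total_docs must not be negative")
    if not 0 < docs_per_shard <= LUCENE_MAX_DOCS:
        raise ValueError("docs_per_shard must be between 1 and the hard limit")
    # Integer ceiling division; an empty collection still needs one shard.
    return max(1, -(-total_docs // docs_per_shard))


# Hypothetical 5 billion documents at the hard limit: the absolute floor.
floor_count = min_shards(5_000_000_000)

# Same data with a more practical, assumed target of 250 million per shard.
practical_count = min_shards(5_000_000_000, docs_per_shard=250_000_000)
```

The first number is only a lower bound. The per-shard target that actually gives acceptable performance is what prototyping has to find.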
Best,
Erick