Posted to user@lucenenet.apache.org by Jens Melgaard <Je...@Systematic.com.INVALID> on 2021/04/12 08:00:32 UTC

RE: Lusene.Net | .Net | SQL Server | Databases

What Andy describes is more or less exactly what we do in https://github.com/dotJEM/web-host

It uses https://github.com/dotJEM/json-storage to emulate a simple JSON-based key-value store in a SQL Server database, with some additional features, and then uses https://github.com/dotJEM/json-index to index that JSON.

(Using SQL Server was a requirement we could not deviate from, which is why it is used this way; there would certainly be better alternatives given how we use it.)

These frameworks should still be considered very much in flux (hence no v1.0 release), but maybe you can get some ideas from them.


Kind regards

Jens Melgaard
Architect

Systematic A/S
Søren Frichs Vej 39
8000 Aarhus C
Denmark

Mobile: +45 4196 5119
Jens.Melgaard@Systematic.com

-----Original Message-----
From: Andy Pook <an...@gmail.com> 
Sent: 25. marts 2021 12:01
To: user@lucenenet.apache.org
Subject: Re: Lusene.Net | .Net | SQL Server | Databases

1. Add ChangeTracking (or CDC) to your tables. Write a service to "listen"
to the changes and apply those to your index.
2. If you have a "ChangedAt" timestamp column, "poll" the table where ChangedAt > last-seen-change. Loop over those rows and update the index.
How frequently you poll determines how near-real-time (NRT) the index will be.
3. If your system is based on event messaging, listen to "update" messages and update the index from those.
4. ...
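
The polling approach in option 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from any of the projects mentioned in this thread: it uses Python's built-in sqlite3 module in place of SQL Server, a plain dict in place of a Lucene index, and a hypothetical "Articles" table with an integer "ChangedAt" column.

```python
import sqlite3

def poll_changes(conn, index, last_seen):
    """Apply rows changed since last_seen to the index.

    Returns the new high-water mark (the largest ChangedAt value seen).
    'index' is a plain dict standing in for a real search index (assumption).
    """
    rows = conn.execute(
        "SELECT Id, Title, ChangedAt FROM Articles "
        "WHERE ChangedAt > ? ORDER BY ChangedAt",
        (last_seen,),
    ).fetchall()
    for row_id, title, changed_at in rows:
        # Upsert keyed on the primary key: replaces any older copy of the row.
        index[row_id] = {"title": title}
        last_seen = changed_at
    return last_seen

def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; ChangedAt is a simple integer "timestamp" here.
    conn.execute(
        "CREATE TABLE Articles (Id INTEGER PRIMARY KEY, Title TEXT, ChangedAt INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Articles VALUES (?, ?, ?)",
        [(1, "first", 10), (2, "second", 20)],
    )
    index = {}
    mark = poll_changes(conn, index, 0)  # initial load
    # A change made directly in the database is picked up on the next poll,
    # provided ChangedAt is bumped along with it.
    conn.execute(
        "UPDATE Articles SET Title = 'second (edited)', ChangedAt = 30 WHERE Id = 2"
    )
    mark = poll_changes(conn, index, mark)
    return index, mark
```

Two caveats on this sketch: a strict "ChangedAt > last_seen" comparison can miss a row that is written with the same timestamp as the current high-water mark, and deleted rows never show up in such a query, so deletes need separate handling (for example a soft-delete flag). In Lucene.NET the dict assignment would presumably become an update keyed on an id term, such as IndexWriter.UpdateDocument.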

You can think of the index in the same way as a "data warehouse". It's just another data store with a different "schema" than your OLTP system (which is often an awesome normalized SQL thing). So this is "just" another ETL feature: Extract from one data store, Transform it, Load it into the index.

off-topic...
Often the pattern is to include some primary key of the thing being indexed as a field. Then the system can "search" the index and use the returned id/key to query the original store for the whole entity.
Several previous Lucene things I've had a hand in decided to also store the entity directly in the index (we serialized the object as JSON into a binary/blob field of the Document; you can use some other serializer/encoder with compression if you want to obsess over size :) ).
Obviously this makes the index files/folder much larger (depending on how big the entity is). But it does not have to be the entire entity, just the bits that are needed. It makes the index feel more like a document database.
The upside is that you don't have to do a two-phase query to get at the entity, and that the index can be used in isolation without the origin data store.

It does mean that there is an "eventual consistency" aspect to consider.
But depending on the polling or ChangeTracking (or whatever ETL scheme you go with) and all the NRT stuff that Shad is talking about... you can get very close to "real time".
But do consider how near to realtime your system needs to be. I would suggest that some minutes to hours is more than good enough for most systems. Needing sub-second is rare (but possible).
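
The idea of storing (part of) the entity in the index, so that a search hit carries the data and no second query against the origin store is needed, can be sketched like this. It is a toy illustration only: "TinyIndex" and its methods are hypothetical names, the "index" is a small in-memory inverted index rather than Lucene, and JSON is used as the serializer as described above.

```python
import json

class TinyIndex:
    """Toy inverted index that also stores a JSON blob per entry (illustrative only)."""

    def __init__(self):
        self.postings = {}  # term -> set of entity keys
        self.stored = {}    # entity key -> serialized JSON of the stored fields

    def add(self, key, entity, searchable_fields, stored_fields):
        # Index the searchable fields word by word (lowercased).
        for field in searchable_fields:
            for term in str(entity[field]).lower().split():
                self.postings.setdefault(term, set()).add(key)
        # Store only the bits that are needed, not necessarily the whole entity.
        self.stored[key] = json.dumps({f: entity[f] for f in stored_fields})

    def search(self, term):
        # Each hit returns the key plus the stored data: no second lookup needed.
        keys = sorted(self.postings.get(term.lower(), ()))
        return [(k, json.loads(self.stored[k])) for k in keys]

idx = TinyIndex()
idx.add(
    7,
    {"title": "Red bicycle", "price": 120, "notes": "internal"},
    searchable_fields=["title"],
    stored_fields=["title", "price"],
)
hits = idx.search("bicycle")
```

Here "hits" contains the key 7 together with the stored title and price, while the "notes" field was left out of the index to keep it smaller. The sketch does not handle re-adding an existing key (old postings are not removed), which a real index update would take care of.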

Wow, all that just flowed out :) I hope some of it might be useful/relevant to your endeavours

On Thu, 25 Mar 2021 at 08:17, Hassan Iftikhar <Ha...@enghouse.com.invalid> wrote:

> Hi Shad,
>
> Lucene.Net is a general purpose library and has nothing to do with
> data sources like SQL Server, SQLite, etc. It only knows you have a
> Lucene document that you want indexed. So when we load data into
> Lucene.Net from any data source, how can we keep the Lucene.Net
> documents up to date with the data in the SQL database (for example)?
> One way to keep both (Lucene.Net and SQL) in sync is to continually
> update the Lucene index during each database update. We also know
> that someone could make manual changes to the SQL database; in that
> scenario, how can we update the Lucene index?
>
> Thanks,
>
> Regards,
> Hassan Iftikhar
> Software Engineer, R&D
> m: +92 (0) 300 064 9845
> w: http://www.enghouseinteractive.com/
> e: hassan.iftikhar@enghouse.com
>
> As the world responds to the Covid-19 outbreak, Enghouse is committed 
> to doing its part to support organisations' risk management efforts.
> We are providing temporary licences of our secure cloud-based 
> communications platform at no cost to your organisation.
>
>
>
>
> -----Original Message-----
> From: Shad Storhaug <sh...@shadstorhaug.com>
> Sent: Wednesday, March 24, 2021 1:21 AM
> To: user@lucenenet.apache.org
> Subject: RE: Lusene.Net | .Net | SQL Server | Databases
>
> Hello Hassan,
>
> While you might extend the Directory class to provide a "native"
> communication channel with another data storage medium other than the 
> file system, you will quickly find out how challenging such a task is, 
> and there will always be a high price to pay in terms of performance.
>
> Basically, there are a few different ways people deal with data from a 
> database but most of the time you have to accept that some data will 
> be duplicated between the database and Lucene, but the risk of that 
> duplication can be reduced/eliminated by applying automation. The 
> exact solution depends on how much data there is and how often it 
> needs to be refreshed.
>
> 1. Make indexing part of the deployment process so the search index 
> is current as of the deployment date.
> 2. Design a custom job to update the index at specified intervals.
> 3. Use Lucene's near real-time search (NRT) feature to continually 
> update the Lucene index during each database update and keep a "live" 
> view of the data in search.
>
> There might be other solutions, but these are generally the best 
> options for most scenarios.
>
> Thanks,
> Shad Storhaug (NightOwl888)
> Project Chairperson - Apache Lucene.NET
>
> -----Original Message-----
> From: Ron.Git <Ro...@GiftOasis.com>
> Sent: Tuesday, March 23, 2021 7:49 PM
> To: user@lucenenet.apache.org
> Subject: RE: Lusene.Net | .Net | SQL Server | Databases
>
> Lucene is a general purpose indexing and search library.  As such it 
> is not concerned with where the data comes from.  When that data comes 
> from a sql database, developers typically use their normal approach to 
> retrieve the data from the sql database and then use that data to 
> populate a Lucene document for indexing.  Lucene itself has no 
> knowledge of where the data came from.  It only knows you have a Lucene document that you want indexed.
>
> -Ron
>
>
>
> -----Original Message-----
> From: Hassan Iftikhar [mailto:Hassan.Iftikhar@enghouse.com.INVALID]
> Sent: Tuesday, March 23, 2021 5:07 AM
> To: user@lucenenet.apache.org
> Subject: RE: Lusene.Net | .Net | SQL Server | Databases
>
> Hi,
>
> Hope you guys are doing well. Thanks for the feedback for my last email.
> Now I have another query regarding data sources.
>
> I want to know how Lucene.Net communicates with data sources, i.e. with 
> an existing SQL Server database. As I mentioned earlier, we will be 
> using Lucene.Net in existing .Net Server/Client applications. So I am 
> interested to know how we can take advantage of Lucene.Net with an 
> existing SQL Server/SQLite database.
>
> Thanks in advance!
>
> Regards,
> Hassan Iftikhar
> Software Engineer, R&D
> m: +92 (0) 300 064 9845
> w: http://www.enghouseinteractive.com/
> e: hassan.iftikhar@enghouse.com
>
> -----Original Message-----
> From: RonClabo@GiftOasis.com <Ro...@GiftOasis.com>
> Sent: Wednesday, March 17, 2021 7:29 PM
> To: user@lucenenet.apache.org
> Subject: RE: Lusene.Net | .Net | .Net Core | .Net5 | .Net framework
>
> Hi Hassan,
>
>
>
> Thanks for your interest in Lucene.Net.  I am a recent contributor to 
> the Lucene.Net 4.8 project and will do my best to answer your questions.
>
>
>
> 1)    Lucene.Net 3.0.3 is from some time ago, so it makes sense that it
> doesn't specifically target a newer version of the .Net 4.x framework.
> However, if it's documented to support .Net Framework 4.0, it's very 
> likely it will work on 4.6.1, since .Net Framework versions are generally 
> backward compatible, especially within a major version.
>
> 2)    Lucene.Net 4.8, which is currently in release Beta 13, supports
> the .Net Full Framework 4.5 as well as NetStandard2.0 and NetStandard2.1.
> As such it is compatible with .Net Core 2.0 or higher.
>
> 3)    Each Lucene.Net version attempts to be a faithful port of the
> functionality in the corresponding Java Lucene version.  This is 
> largely accomplished by porting the project from Java to C# on a line 
> by line basis.  However, occasionally a few features and bug fixes are 
> pulled in from later versions.  In general anything you read online 
> about Java Lucene 3.0.3 will be accurate for Lucene.Net 3.0.3 and 
> anything you read online about Java Lucene 4.8 will be accurate for 
> Lucene.Net 4.8.  Also note that while Lucene.Net 4.8 may seem like its 
> version number is far behind the current Java version, which is 8.8, 
> the reality is that version 4.8 contains the _vast_ majority of 
> features found in 8.8 because the big change (and multi-year effort) 
> for Lucene came in version 4.0 when codecs were introduced.  Since 
> then the features added per release have been much more modest and the 
> releases much more frequent, hence the rapid escalation in version number.
>
> 4)    Lucene.Net 4.8 can be used on the latest Full Framework and on the
> latest .Net Core Framework.  More specifically, yes - Lucene.Net 4.8 is 
> fully compatible with .Net5.  I personally use it with .Net5.  
> Lucene.Net 4.8 is in beta but I believe some people do use it in 
> production as it has been extremely stable for a very long time and has 
> a large number of unit tests ported from Java Lucene which must pass 
> before commits are added to the project.  Some in the Lucene.Net 
> developer community have even stated that they think 4.8 is already 
> more solid than 3.0.3 given that the earlier version did not have the 
> extensive unit tests to ensure accuracy of the port.  Lucene.Net 4.8 is 
> actively being worked on with a goal of getting it to final release.  
> If you have any time to donate we'd welcome your help in polishing this 
> version of Lucene.Net.
>
>
>
> Best,
>
>
>
> Ron Clabo
>
> rclabo on Github.
>
>
>
> From: Hassan Iftikhar [mailto:Hassan.Iftikhar@enghouse.com.INVALID]
> Sent: Wednesday, March 17, 2021 7:58 AM
> To: user@lucenenet.apache.org
> Subject: Lusene.Net | .Net | .Net Core | .Net5 | .Net framework
>
>
>
> Hi,
>
>
>
> I am new to Lucene.Net and am currently exploring it for use in our 
> products. I have some questions to ask:
>
>
>
> 1.      Can we use Lucene.Net with .Net Framework 4.6.1? The stable
> version of Lucene.Net is 3.0.3, and from what I can see on your 
> website it supports up to .Net Framework 4.0.
>
> 2.      Can we use Lucene.Net with .Net Core? There is no information
> on your website about Lucene.Net support for .Net Core.
>
> 3.      Does Lucene.Net provide the same set of features as Lucene
> for Java?
>
> 4.      Can we use Lucene.Net with .Net5, i.e. on the latest .Net frameworks?
>
>
>
> Regards,
>
> Hassan Iftikhar
>
> Software Engineer, R&D
>
> m: +92 (0) 300 064 9845
>
> w: http://www.enghouseinteractive.com/
>
> e: hassan.iftikhar@enghouse.com