Posted to dev@stanbol.apache.org by Andrea Di Menna <ni...@gmail.com> on 2013/01/15 15:48:50 UTC

dbpedia 3.8 index installation

Hi all,

I have been deploying a couple of Stanbol instances on remote servers, using
Tomcat as the servlet engine.
The deployment works, but when I try to install a custom dbpedia 3.8 index I
see behaviour I do not understand.

Steps:
1) deploy stanbol.war into Tomcat
2) wait for Stanbol to start
3) copy the dbpedia.solrindex.zip into stanbol/datafiles

Stanbol starts to install the index, and I see two folders growing in
size:
a) dbpedia-2013.15.1
b) dbpedia-2013.15.1-1

Once the zip file has been uncompressed, both folders are the same size and
contain the same data, and both are kept on the HDD.

In effect, the index takes up twice the disk space it should.

I remember seeing the same pattern when deploying Stanbol in a Jetty
instance, but as far as I remember one of the two folders was deleted
when the installation process ended.
What could be the cause of this behaviour?

Regards
Andrea
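
For anyone hitting the same symptom, a rough way to confirm that two index
folders really are copies of each other is to compare their file listings.
This is only a sketch: the function names and the "dbpedia-" prefix are made
up for illustration, and it compares relative paths and file sizes only, not
file contents.

```python
import os
from collections import defaultdict


def fingerprint(directory):
    """Return a sorted tuple of (relative path, size in bytes) for every file."""
    entries = []
    for base, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(base, name)
            entries.append((os.path.relpath(full, directory), os.path.getsize(full)))
    return tuple(sorted(entries))


def find_duplicate_index_dirs(root, prefix="dbpedia-"):
    """Group sibling directories under root whose file paths and sizes all match."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        # Only look at directories that follow the (assumed) naming pattern.
        if name.startswith(prefix) and os.path.isdir(path):
            groups[fingerprint(path)].append(name)
    # Keep only the groups that contain more than one directory.
    return [names for names in groups.values() if len(names) > 1]
```

Pointing find_duplicate_index_dirs at the folder that holds the two growing
directories should return one group listing both names if they are identical
in layout and size.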

Re: dbpedia 3.8 index installation

Posted by Alessandro Adamou <ad...@cs.unibo.it>.
Yep, I ended up with the 9+ GiB unpacked index

and entityhub FieldQueries seem to be picking it up indeed.

Now moving on to figure out how to serve it with the plain Solr API (see 
my other bumped thread)

--A


On 11/07/2013 19:51, Fabian Christ wrote:
> Hi,
>
> and this is exactly what should happen IIRC ;-)
>
> Best,
> - Fabian

-- 
Alessandro Adamou, Ph.D.

Knowledge Media Institute
The Open University
Walton Hall, Milton Keynes MK7 6AA
United Kingdom


"I will give you everything, just don't demand anything."
(Ettore Petrolini, 1917)

Not sent from my iSnobTechDevice


Re: dbpedia 3.8 index installation

Posted by Fabian Christ <ch...@googlemail.com>.
Hi,

and this is exactly what should happen IIRC ;-)

Best,
- Fabian
On 11.07.2013 17:05, "Alessandro Adamou" <ad...@cs.unibo.it> wrote:

> Hang in there, I just proceeded to restart
> org.apache.stanbol.data.sites.dbpedia and it seems to have triggered the
> unpacking...

Re: dbpedia 3.8 index installation

Posted by Alessandro Adamou <ad...@cs.unibo.it>.
Hang in there, I just proceeded to restart 
org.apache.stanbol.data.sites.dbpedia and it seems to have triggered the 
unpacking...


On 11/07/2013 15:49, Alessandro Adamou wrote:
> Sorry for bumping this thread, but I just needed to check something 
> about this.
>
> Is it correct that if I just place the DBPedia 3.8 solrindex in 
> {working-dir}/datafiles and start Stanbol, it will be immediately 
> unpacked to {working-dir}/indexes/default
>
> and that it will be automatically picked up by the existing Solr Yard 
> configuration for the default dbpedia index?
>
> I simply copied Andrea's dbpedia.solrindex.zip (3.3 GiB compressed) to 
> {working-dir}/datafiles then started the regular full launcher, but it 
> doesn't seem to even start unpacking it.
>
> Clearly I have just the Solr index and not the OSGi bundle that will 
> create its own Yard, Cache configuration etc. like those created by 
> the indexing tool. Am I wrong assuming the existing components for the 
> default dbpedia index will pick this one up automatically?
>
> Just to make things faster, is there a single EntityHub bundle that I 
> can simply restart to trigger the index loading?
>
> Best,
>
> Alessandro


-- 
Alessandro Adamou, Ph.D.

Knowledge Media Institute
The Open University
Walton Hall, Milton Keynes MK7 6AA
United Kingdom


"I will give you everything, just don't demand anything."
(Ettore Petrolini, 1917)

Not sent from my iSnobTechDevice


Re: dbpedia 3.8 index installation

Posted by Alessandro Adamou <ad...@cs.unibo.it>.
Sorry for bumping this thread, but I just needed to check something 
about this.

Is it correct that if I just place the DBPedia 3.8 solrindex in 
{working-dir}/datafiles and start Stanbol, it will be immediately 
unpacked to {working-dir}/indexes/default

and that it will be automatically picked up by the existing Solr Yard 
configuration for the default dbpedia index?

I simply copied Andrea's dbpedia.solrindex.zip (3.3 GiB compressed) to 
{working-dir}/datafiles then started the regular full launcher, but it 
doesn't seem to even start unpacking it.

Clearly I have just the Solr index and not the OSGi bundle that would
create its own Yard, Cache configuration etc., like those created by the
indexing tool. Am I wrong in assuming that the existing components for the
default dbpedia index will pick this one up automatically?

Just to make things faster, is there a single EntityHub bundle that I 
can simply restart to trigger the index loading?

Best,

Alessandro
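
A small check along these lines can tell whether the unpacking has started
at all. It is a sketch under assumptions: the datafiles and indexes/default
locations are simply the ones named in the question above and are not
confirmed here, and unpack_status and its parameters are hypothetical names.

```python
import os


def unpack_status(working_dir, archive="dbpedia.solrindex.zip", core_prefix="dbpedia"):
    """Report whether the archive is in place and which index dirs exist so far."""
    datafiles = os.path.join(working_dir, "datafiles")
    # Assumed unpack target, taken from the question rather than from any docs.
    index_root = os.path.join(working_dir, "indexes", "default")
    index_dirs = []
    if os.path.isdir(index_root):
        index_dirs = sorted(
            name for name in os.listdir(index_root)
            if name.startswith(core_prefix)
            and os.path.isdir(os.path.join(index_root, name))
        )
    return {
        "archive_present": os.path.isfile(os.path.join(datafiles, archive)),
        "index_dirs": index_dirs,
    }
```

If archive_present is true but index_dirs stays empty (or only ever shows
the small default index), the installation has probably not been triggered.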


On 16/01/2013 19:08, Rupert Westenthaler wrote:
> Hi Andrea
>
> Thanks for all that information. I will try to reproduce this
> behavior in the coming days. I will keep you informed.
>
> best
> Rupert


-- 
Alessandro Adamou, Ph.D.

Knowledge Media Institute
The Open University
Walton Hall, Milton Keynes MK7 6AA
United Kingdom


"I will give you everything, just don't demand anything."
(Ettore Petrolini, 1917)

Not sent from my iSnobTechDevice


Re: dbpedia 3.8 index installation

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Andrea

Thanks for all that information. I will try to reproduce this
behavior in the coming days. I will keep you informed.

best
Rupert
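
When trying to reproduce Scenario 1 from the quoted message, it may help to
record how the working directory actually resolves. The sketch below only
reports facts about the path; it does not show that the symbolic link causes
the problem, which is still an open question in this thread.
describe_working_dir is a hypothetical helper name.

```python
import os


def describe_working_dir(path):
    """Report whether path is a symlink and whether it leads to another device."""
    real = os.path.realpath(path)
    parent = os.path.dirname(os.path.abspath(path))
    return {
        "path": path,
        "is_symlink": os.path.islink(path),
        "real_path": real,
        # Different device ids mean the resolved dir lives on another filesystem
        # than the directory that contains the link.
        "crosses_device": os.stat(real).st_dev != os.stat(parent).st_dev,
    }
```

For the setup described in Scenario 1, one would call it on
/var/lib/tomcat6/stanbol and expect is_symlink to be true.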

On Wed, Jan 16, 2013 at 5:24 PM, Andrea Di Menna <ni...@gmail.com> wrote:
> Hi,
>
> I tried again in scenario 1, this time moving the dbpedia index into place
> while Tomcat was stopped.
> In this case everything seems to work:
> 1) The default dbpedia index is kept in dbpedia-2013.01.16 (~ 100MB)
> 2) The custom dbpedia index is installed in dbpedia-2013.01.16-1
>
> Hope this helps to understand what is happening.
>
> Cheers
> Andrea
>
> 2013/1/16 Andrea Di Menna <ni...@gmail.com>
>
>> Hi Rupert,
>>
>> I did some more tests, also on a local machine (with Tomcat6), and I get the
>> following behavior:
>>
>> Scenario 1: *FAILING*
>> - Tomcat is installed on disk A dir /var/lib/tomcat6
>> - Create an (empty) stanbol working dir on disk B
>> - Create a symbolic link from /var/lib/tomcat6/stanbol to the stanbol working
>> dir on disk B
>>
>> Scenario 2: *SUCCESSFUL*
>> - Tomcat is installed on disk A dir /var/lib/tomcat6
>> - No directory created, let Stanbol create a working dir itself
>>
>> The sling.fileinstall.dir parameter is set to stanbol/datafiles in the
>> web.xml of the webapp.
>>
>> In one try with Scenario 1, Stanbol started to create 3 different dbpedia
>> index dirs and only stopped because the disk ran out of space ;)
>> Is the symbolic link somehow the cause of this?
>>
>> I really don't know what is happening, but I have some more questions:
>>
>> 1) Should I explicitly set sling.fileinstall.dir in the web.xml, or can I
>> omit it? (I tried the full-war, which does not have this param set, and as
>> far as I could see copying the index into the datafiles folder did not
>> trigger the installation at all, so I guess I have to set this param to
>> enable the datafileprovider.)
>>
>> 2) Should I stop the web server before copying the index file into the
>> datafiles dir? Copying some GBs takes time, so I am wondering whether the
>> installation process starts as soon as the file is created in the dir, and
>> whether the process gets confused every time the file changes on disk
>> (I know I am talking nonsense here...)
>>
>> Answers to your questions are inline.
>>
>> Thanks
>>
>> 2013/1/15 Rupert Westenthaler <ru...@gmail.com>
>>
>>> Hi
>>>
>>> On Tue, Jan 15, 2013 at 6:35 PM, Andrea Di Menna <ni...@gmail.com>
>>> wrote:
>>> > It is like the installing phase starts with two different concurrent
>>> > threads and ends up with two different indexes (which are actually the
>>> > same).
>>> >
>>> > What do you think the cause could be?
>>>
>>> Yes it looks like that. I never observed this, but I can think of the
>>> following causes
>>>
>>> (1) Maybe you have two configurations for the "Apache Stanbol Solr:
>>> Managed Solr Server" that both use "default" as the value of the
>>> "org.apache.solr.core.CoreContainer.name" property. Two instances of
>>> the service would cause the installation to happen twice. You can
>>> check this in the Configuration tab of the Felix Webconsole.
>>>
>>
>> Only one such configuration is present in the Configuration tab.
>>
>>
>>> (2) The same could happen if you have two versions of the
>>> commons.solr.managed Bundle installed. You can check this in the
>>> Bundle Tab of the Webconsole
>>>
>>
>> Only one such version is present in the Bundle tab.
>>
>>
>>> (3) Could there be two Stanbol instances running using the same
>>> working directory?
>>>
>>>
>> There is only one Stanbol instance running.
>>
>>
>>>  >
>>> > 2) I add the dbpedia.solrindex.zip (which is about 3.5 GB) to
>>> > stanbol/datafiles and at that moment two different folders are created
>>> > dbpedia-2013.01.15-1
>>> > dbpedia-2013.01.15-2
>>>
>>> If the above is not the reason for the issue you could check if both
>>> indexes are active. Check for open Files in both directories and what
>>> processes they are assigned to
>>>
>>>
>> Only one index is active and used by the Tomcat process.
>>
>>
>>>  >
>>> > I have checked the logs, but can only see references to an error which
>>> > states "Too many close" on the SolrCore:
>>>
>>> That is caused by Stanbol during the deactivation of a SolrCore and
>>> can be ignored.
>>>
>>> Hope this helps in tracking down the reason for this behavior
>>>
>>> best
>>> Rupert
>>>
>>> --
>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>> | A-5500 Bischofshofen
>>>
>>
>>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Re: dbpedia 3.8 index installation

Posted by Andrea Di Menna <ni...@gmail.com>.
Hi,

I tried again in Scenario 1, this time moving the dbpedia index into place
while Tomcat was stopped.
In this case everything seems to work:
1) The default dbpedia index is kept in dbpedia-2013.01.16 (~100 MB)
2) The custom dbpedia index is installed in dbpedia-2013.01.16-1

I hope this helps to understand what is happening.

Cheers
Andrea

Re: dbpedia 3.8 index installation

Posted by Andrea Di Menna <ni...@gmail.com>.
Hi Rupert,

I did some more tests, also on a local machine (with Tomcat 6), and I get the
following behavior:

Scenario 1: *FAILING*
- Tomcat is installed on disk A, in /var/lib/tomcat6
- Create an (empty) Stanbol working dir on disk B
- Create a symbolic link from /var/lib/tomcat6/stanbol to the Stanbol working
dir on disk B

Scenario 2: *SUCCESSFUL*
- Tomcat is installed on disk A dir /var/lib/tomcat6
- No directory created, let Stanbol create a working dir itself

The sling.fileinstall.dir parameter is set to stanbol/datafiles in the
web.xml of the webapp.

In one attempt with Scenario 1, Stanbol started to create 3 different dbpedia
index dirs and only stopped because the disk ran out of space ;)
Is the symbolic link somehow the cause of this?
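
One way to make the difference between the two scenarios visible is to check
how the working dir resolves. The following is only a small Python sketch and
has nothing to do with Stanbol itself; the helper name describe_working_dir is
made up for illustration. It reports whether a path is a symbolic link, where
it really points, and whether the target is on the same device as the
directory that contains the link.

```python
import os


def describe_working_dir(path):
    """Report whether path is a symlink and where it really lives."""
    real = os.path.realpath(path)
    parent = os.path.dirname(os.path.abspath(path))
    return {
        "is_symlink": os.path.islink(path),
        "real_path": real,
        # Different st_dev values mean the target is on another filesystem.
        "same_device_as_parent": os.stat(real).st_dev == os.stat(parent).st_dev,
    }
```

Calling describe_working_dir("/var/lib/tomcat6/stanbol") in both scenarios
would show whether the link to disk B is the only difference between them. It
does not by itself prove that the link is the cause.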

I really don't know what is happening, but I have some more questions:

1) Should I explicitly set sling.fileinstall.dir in the web.xml, or can I
omit it? (I tried the full-war, which does not have this param set, and as
far as I could see copying the index into the datafiles folder did not
trigger the installation at all, so I guess I have to set this param to
enable the datafileprovider)

2) Should I stop the web server before copying the index file into the
datafiles dir? Copying several GB takes some time, so I am wondering whether
the installation process starts as soon as the file is created in the dir,
and whether the process could get confused every time the file changes on
disk (I know I am talking nonsense here...)
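
If a partially copied file is indeed the problem (this is not confirmed
anywhere in this thread), a common general workaround is to copy the file to a
staging dir on the same filesystem first and then rename it into the watched
dir, so that the watcher only ever sees the complete file. A minimal Python
sketch of that idea follows; place_atomically and the staging dir are
hypothetical names, not part of Stanbol or Sling.

```python
import os
import shutil


def place_atomically(source, datafiles_dir, staging_dir):
    """Copy source into staging_dir, then rename it into datafiles_dir.

    staging_dir must be on the same filesystem as datafiles_dir; otherwise
    os.replace cannot do a plain rename and raises OSError.
    """
    name = os.path.basename(source)
    staged = os.path.join(staging_dir, name + ".part")
    shutil.copyfile(source, staged)    # slow part, outside the watched dir
    final = os.path.join(datafiles_dir, name)
    os.replace(staged, final)          # single rename, no partial file visible
    return final
```

With this approach the long copy happens outside stanbol/datafiles, and the
file appears there in a single step.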

Answers to your questions are inline.

Thanks

2013/1/15 Rupert Westenthaler <ru...@gmail.com>

> Hi
>
> On Tue, Jan 15, 2013 at 6:35 PM, Andrea Di Menna <ni...@gmail.com>
> wrote:
> > It is like the installing phase starts with two different concurrent
> > threads and ends up with two different indexes (which are actually the
> > same).
> >
> > What do you think the cause could be?
>
> Yes it looks like that. I never observed this, but I can think of the
> following causes
>
> (1) Maybe you have two configurations for the "Apache Stanbol Solr:
> Managed Solr Server" that both use "default" as value for the
> "org.apache.solr.core.CoreContainer.name" property. Two instances of
> the service would cause the installation to happen two times. You can
> check this in the Configuration tab of the Felix Webconsole
>

Only one such configuration is present in the Configuration tab.


> (2) The same could happen if you have two versions of the
> commons.solr.managed Bundle installed. You can check this in the
> Bundle Tab of the Webconsole
>

Only one such version is present in the Bundle tab.


> (3) Could there be two Stanbol instances running using the same
> working directory?
>
>
There is only one Stanbol instance running.


>  >
> > 2) I add the dbpedia.solrindex.zip (which is about 3.5 GB) to
> > stanbol/datafiles and at that moment two different folders are created
> > dbpedia-2013.01.15-1
> > dbpedia-2013.01.15-2
>
> If the above is not the reason for the issue you could check if both
> indexes are active. Check for open Files in both directories and what
> processes they are assigned to
>
>
Only one index is active and used by the Tomcat process.


>  >
> > I have checked the logs, but can only see references to an error which
> > states "Too many close" on the SolrCore:
>
> That is caused by Stanbol during the deactivation of a SolrCore and
> can be ignored.
>
> Hope this helps in tracking down the reason for this behavior
>
> best
> Rupert
>
> --
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>

Re: dbpedia 3.8 index installation

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi

On Tue, Jan 15, 2013 at 6:35 PM, Andrea Di Menna <ni...@gmail.com> wrote:
> It is like the installing phase starts with two different concurrent
> threads and ends up with two different indexes (which are actually the
> same).
>
> What do you think the cause could be?

Yes, it looks like that. I have never observed this, but I can think of the
following causes:

(1) Maybe you have two configurations for the "Apache Stanbol Solr:
Managed Solr Server" that both use "default" as the value of the
"org.apache.solr.core.CoreContainer.name" property. Two instances of
the service would cause the installation to happen twice. You can
check this in the Configuration tab of the Felix Webconsole.
(2) The same could happen if you have two versions of the
commons.solr.managed bundle installed. You can check this in the
Bundle tab of the Webconsole.
(3) Could there be two Stanbol instances running that use the same
working directory?

>
> 2) I add the dbpedia.solrindex.zip (which is about 3.5 GB) to
> stanbol/datafiles and at that moment two different folders are created
> dbpedia-2013.01.15-1
> dbpedia-2013.01.15-2

If none of the above is the reason for the issue, you could check whether
both indexes are active: look for open files in both directories and see
which processes they belong to.
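
On Linux this check can be scripted by walking /proc. The Python sketch below
is an illustration only; the function name is made up and it assumes a
Linux-style /proc filesystem. It lists, per process id, the open file
descriptors that point below a given directory. Files that are only
memory-mapped would show up in /proc/<pid>/maps instead and are not covered
here, and descriptors of other users' processes are only visible with
sufficient permissions.

```python
import os


def processes_with_open_files(directory):
    """Return {pid: [open paths]} for files below directory (Linux /proc only)."""
    root = os.path.realpath(directory)
    result = {}
    if not os.path.isdir("/proc"):
        return result  # no /proc available, nothing to inspect
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = os.path.join("/proc", pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            if target == root or target.startswith(root + os.sep):
                result.setdefault(int(pid), []).append(target)
    return result
```

Running it once for each of the two dbpedia-* directories shows whether one or
both of them are held open, and by which process.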

>
> I have checked the logs, but can only see references to an error which
> states "Too many close" on the SolrCore:

That is caused by Stanbol during the deactivation of a SolrCore and
can be ignored.

Hope this helps in tracking down the reason for this behavior

best
Rupert

--
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Re: dbpedia 3.8 index installation

Posted by Andrea Di Menna <ni...@gmail.com>.
Hi Rupert,

tried again on one of the machines and I still get the same behaviour.

What actually happens is:
1) I start Stanbol and the default dbpedia index is installed in
dbpedia-2013.01.15

This dir is about 100MB

2) I add the dbpedia.solrindex.zip (which is about 3.5 GB) to
stanbol/datafiles and at that moment two different folders are created
dbpedia-2013.01.15-1
dbpedia-2013.01.15-2

They grow to the same size (about 9.5 GB)

After the process stops dbpedia-2013.01.15 is deleted (the default index).

The ref file points to dbpedia-2013.01.15-2
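
Equal size does not prove equal content, so the two folders can also be
compared file by file. The following Python sketch (helper names are made up
for illustration) hashes every file in both trees and compares the results.
On a 9.5 GB index this reads all data twice, so it will take a while.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path relative to root to the SHA-256 of its content."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large index files fit in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def same_content(dir_a, dir_b):
    """True if both trees contain the same files with identical bytes."""
    return tree_digests(dir_a) == tree_digests(dir_b)
```

Here same_content("dbpedia-2013.01.15-1", "dbpedia-2013.01.15-2") would be the
call for the two folders named above, run from the directory that contains
them.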

I have checked the logs, but can only see references to an error which
states "Too many close" on the SolrCore:

15.01.2013 15:16:14.493 *INFO* [Thread-44]
org.apache.stanbol.commons.solr.managed.impl.ManagedSolrServerImpl  ...
start to ACTIVATE Index dbpedia on ManagedSolrServer
15.01.2013 15:16:19.493 *INFO* [DataFileTrackingDaemon]
org.apache.stanbol.commons.stanboltools.datafileprovider.impl.tracking.DataFileTrackerImpl
 ... tracking stopped!
15.01.2013 15:22:57.477 *WARN* [OsgiInstallerImpl]
org.apache.solr.handler.component.SpellCheckComponent No queryConverter
defined, using default converter
15.01.2013 15:22:57.526 *INFO* [OsgiInstallerImpl]
org.apache.stanbol.commons.solr.RegisteredSolrServerTracker  ... in
addingService for IndexReference[server:null,index:dbpedia] (ref:
[org.apache.solr.core.SolrCore])
15.01.2013 15:22:57.529 *INFO* [OsgiInstallerImpl]
org.apache.stanbol.commons.solr.managed Service [173] ServiceEvent
REGISTERED
15.01.2013 15:22:57.530 *INFO* [OsgiInstallerImpl]
org.apache.stanbol.commons.solr.RegisteredSolrServerTracker  ... in
removedService for IndexReference[server:null,index:dbpedia] (ref:
[org.apache.solr.core.SolrCore], service
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer@5dcfd078)
15.01.2013 15:22:57.606 *INFO* [OsgiInstallerImpl]
org.apache.stanbol.commons.solr.managed Service [167] ServiceEvent
UNREGISTERING
15.01.2013 15:22:57.616 *ERROR* [OsgiInstallerImpl]
org.apache.solr.core.SolrCore Too many close [count:-1] on
org.apache.solr.core.SolrCore@33318b82. Please report this exception to
solr-user@lucene.apache.org
15.01.2013 15:22:57.644 *WARN* [Thread-44]
org.apache.solr.handler.component.SpellCheckComponent No queryConverter
defined, using default converter
15.01.2013 15:22:57.693 *INFO* [Thread-44]
org.apache.stanbol.commons.solr.RegisteredSolrServerTracker  ... in
addingService for IndexReference[server:null,index:dbpedia] (ref:
[org.apache.solr.core.SolrCore])
15.01.2013 15:22:57.697 *INFO* [Thread-44]
org.apache.stanbol.commons.solr.managed Service [174] ServiceEvent
REGISTERED
15.01.2013 15:22:57.698 *INFO* [Thread-44]
org.apache.stanbol.commons.solr.RegisteredSolrServerTracker  ... in
removedService for IndexReference[server:null,index:dbpedia] (ref:
[org.apache.solr.core.SolrCore], service
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer@f50dbe6)
15.01.2013 15:22:57.700 *INFO* [Thread-44]
org.apache.stanbol.commons.solr.managed Service [173] ServiceEvent
UNREGISTERING
15.01.2013 15:22:59.307 *ERROR* [Thread-44] org.apache.solr.core.SolrCore
Too many close [count:-1] on org.apache.solr.core.SolrCore@4c458d2b. Please
report this exception to solr-user@lucene.apache.org
15.01.2013 15:22:59.311 *INFO* [Thread-44]
org.apache.stanbol.commons.solr.managed.impl.ManagedSolrServerImpl  ...
Index dbpedia on ManagedSolrServer default is now ACTIVE
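
To see more easily which thread does what, the excerpt can be grouped by
thread name. The Python sketch below is only an illustration: it assumes the
"date time *LEVEL* [thread]" layout visible in the lines above, re-joins the
lines that were wrapped by the mail client, and collects the timestamps of all
entries whose message contains a given text.

```python
import re

# One log entry starts with "DD.MM.YYYY HH:MM:SS.mmm *LEVEL* [thread]".
ENTRY = re.compile(
    r"^(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) \*(\w+)\* \[([^\]]+)\]\s*(.*)$"
)


def parse_entries(text):
    """Re-join wrapped lines and return (timestamp, level, thread, message)."""
    entries = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            entries.append(list(m.groups()))
        elif entries and line.strip():
            # Continuation of the previous entry, wrapped onto a new line.
            entries[-1][3] = (entries[-1][3] + " " + line.strip()).strip()
    return [tuple(e) for e in entries]


def events_by_thread(text, needle):
    """Group timestamps of entries whose message contains needle by thread."""
    grouped = {}
    for ts, _level, thread, message in parse_entries(text):
        if needle in message:
            grouped.setdefault(thread, []).append(ts)
    return grouped
```

Applied to the excerpt above with "ServiceEvent REGISTERED" as the search
text, this groups one registration under OsgiInstallerImpl (15:22:57.529) and
one under Thread-44 (15:22:57.697).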

It is like the installing phase starts with two different concurrent
threads and ends up with two different indexes (which are actually the
same).

What do you think the cause could be?

Regards
Andrea

Re: dbpedia 3.8 index installation

Posted by Rupert Westenthaler <ru...@gmail.com>.
On Tue, Jan 15, 2013 at 3:48 PM, Andrea Di Menna <ni...@gmail.com> wrote:
> a) dbpedia-2013.15.1

This is the folder created for the DBpedia default index that is
included in the Stanbol launcher

> b) dbpedia-2013.15.1-1

This is the folder created for the dbpedia.solrindex.zip in the datafiles folder

(a) should get deleted as soon as (b) is fully copied, initialized and
added as SolrCore to the CoreContainer.


>
> when the zip file has been uncompressed both folders are the same size and
> contain the same data, and they are both kept on the HDD.

(a) should be ~100MByte in size. (b) depends on the
dbpedia.solrindex.zip you are using. (a) and (b) should not be the
same data.

>
> Of course, in a way this makes the index double its size.
>
> I remember seeing this pattern applied also when deploying Stanbol in a
> Jetty instance, but as far as I remember one of the two folders was delete
> when the installation process ended.

This is exactly the expected behavior.

> What could be the cause of this behaviour?
>

I have never seen this happen. You can go to the Stanbol working dir and
search for the dbpedia.solrindex.ref file (find . -name
"dbpedia.solrindex.ref"). This file is a Java properties file, and the
value of the "Directory" parameter will tell you which index directory is
actually used.
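
The lookup described above can also be scripted. The Python sketch below is an
illustration only and uses made-up helper names. It searches a working dir for
the .ref file and prints the value of its "Directory" key. The parser handles
only simple "key=value" and "key: value" lines; escape sequences and line
continuations of the full Java properties format are not supported.

```python
from pathlib import Path


def read_properties(path):
    """Parse a simple Java .properties file (key=value or key:value lines)."""
    props = {}
    for raw in Path(path).read_text(encoding="iso-8859-1").splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue  # skip blank lines and comments
        # Split on the first '=' or ':' separator, whichever comes first.
        positions = [i for i in (line.find("="), line.find(":")) if i != -1]
        if not positions:
            continue
        sep = min(positions)
        props[line[:sep].strip()] = line[sep + 1:].strip()
    return props


def active_index_dirs(working_dir, ref_name="dbpedia.solrindex.ref"):
    """Map every matching .ref file below working_dir to its Directory value."""
    return {
        str(ref): read_properties(ref).get("Directory")
        for ref in Path(working_dir).rglob(ref_name)
    }


if __name__ == "__main__":
    # "stanbol" is assumed to be the working dir, relative to the current dir.
    for ref, directory in active_index_dirs("stanbol").items():
        print(ref, "->", directory)
```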

However, please also have a look at the logs. Maybe this behavior is
caused by some exception during initialization.

best
Rupert

> Regards
> Andrea



--
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen