Posted to dev@manifoldcf.apache.org by ka...@nokia.com on 2010/06/03 23:34:09 UTC

Derby

For what it's worth, after some 5 days of work, and a couple of schema changes to boot, LCF now runs with Derby.
Some caveats:

(1)     You can't run more than one LCF process at a time.  That means you need to run either the daemon or the crawler-ui web application, but not both at once.
(2)     I haven't tested every query, so some are probably still broken.
(3)     It's slow.  Count yourself fortunate if it runs at 1/5 the rate of PostgreSQL.
(4)     Transactional integrity hasn't been evaluated.
(5)     Deadlock detection and unique-constraint-violation detection are probably not right, because I'd need to trigger those errors before I could key off their exception messages.
(6)     I had to turn off sorting on certain report columns - basically, any column represented as a large character field.

Nevertheless, this represents an important milestone on the path to being able to write some kind of unit tests that have at least some meaning.

If you have an existing LCF PostgreSQL database, you will need to force an upgrade after moving to the new trunk code.  To do this, after deploying the new code, repeat the "org.apache.lcf.agents.Install" command and the "org.apache.lcf.agents.Register org.apache.lcf.crawler.system.CrawlerAgent" command.  And, please, let me know of any errors you notice that could be related to the schema change.
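As a sketch, the forced upgrade amounts to re-running those two commands against the freshly deployed code. The class names come from the message; the LCF_CLASSPATH layout and the dry-run wrapper are assumptions about a particular deployment, not anything LCF prescribes:

```shell
#!/bin/sh
# Sketch of the forced-upgrade sequence for an existing LCF PostgreSQL
# database after deploying the new trunk code.  The two class names are
# from the message above; LCF_CLASSPATH and the java invocation details
# are assumptions about your deployment.
LCF_CLASSPATH="${LCF_CLASSPATH:-lib/*}"
DRY_RUN="${DRY_RUN:-1}"   # default to printing commands, not executing them

run() {
    # Print each command; execute it only when DRY_RUN is emptied.
    echo "+ $*"
    [ -n "$DRY_RUN" ] || "$@"
}

# 1) Re-run the schema install/upgrade.
run java -cp "$LCF_CLASSPATH" org.apache.lcf.agents.Install

# 2) Re-register the crawler agent.
run java -cp "$LCF_CLASSPATH" org.apache.lcf.agents.Register \
        org.apache.lcf.crawler.system.CrawlerAgent
```

With DRY_RUN left at its default the script only prints the two commands, which makes it safe to inspect before pointing it at a real deployment.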

Thanks,
Karl



RE: Derby

Posted by ka...@nokia.com.
Yup.

Karl



Re: Derby

Posted by Jack Krupansky <ja...@lucidimagination.com>.
Just to be clear, the full sequence would be:

1) Start the UI app. The agent process should not be running.
2) "Start" the LCF job in the UI.
3) Shut down the UI app - not just close the browser window.
4) AgentRun.
5) Wait long enough for the crawl to have finished. Maybe watch to see that Solr has become idle.
6) Possibly commit to Solr.
7) AgentStop.
8) Back to step 1 for additional jobs.

Correct?

-- Jack Krupansky
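That eight-step cycle could be scripted roughly as follows. This is a sketch only: start_ui/stop_ui are placeholders for however a deployment starts and stops the webapp, the exact packages of the AgentRun/AgentStop classes and the Solr commit URL are assumptions, and the sleep is a crude stand-in for watching Solr go idle:

```shell
#!/bin/sh
# Sketch of one single-process crawl cycle (steps 1-8 above).
# start_ui / stop_ui are hypothetical placeholders; the class names follow
# the AgentRun / AgentStop naming in the thread, but the exact package is
# an assumption.  By default every command is printed rather than run.
DRY_RUN="${DRY_RUN:-1}"

step() {
    echo "+ $*"
    [ -n "$DRY_RUN" ] || "$@"
}

# 1-3) Start the UI (agent must not be running), start the job, stop the UI.
step start_ui                 # placeholder: start the crawler-ui webapp
echo "  (start the LCF job from the UI by hand)"
step stop_ui                  # placeholder: a real shutdown, not just closing the browser

# 4-5) Run the agent and wait for the crawl to finish.
step java org.apache.lcf.agents.AgentRun
step sleep 600                # crude placeholder for "Solr has become idle"

# 6-7) Optionally commit to Solr, then stop the agent.
step curl "http://localhost:8983/solr/update?commit=true"
step java org.apache.lcf.agents.AgentStop

# 8) Loop back to step 1 for additional jobs.
```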


RE: Derby

Posted by ka...@nokia.com.
The daemon does not need to interact with the UI directly, only with the database.  So, you stop the UI, start the daemon, and after a while, shut down the daemon and restart the UI.

Karl


Re: Derby

Posted by Jack Krupansky <ja...@lucidimagination.com>.
> (1)     You can't run more than one LCF process at a time.  That means you 
> need to either run the daemon or the crawler-ui web application, but you 
> can't run both at the same time.

How do you "Start" a crawl, then, if not in the web app, which then starts the agent process crawling?

Thanks for all of this effort!

-- Jack Krupansky


RE: Derby

Posted by ka...@nokia.com.
This occurs because I am using Derby in embedded mode, and the restriction appears to be a limitation of that mode of operation.  However, embedded mode is necessary to meet the testing goal, which was the prime motivator behind doing a Derby implementation.  I am sure that if we were to use Derby as a service, the restriction would no longer apply, but then there would be no conceivable benefit either.
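For concreteness, the distinction shows up in Derby's two JDBC URL forms. The URL syntax and default port 1527 are standard Derby conventions; the database path and the server-start command line are illustrative assumptions, and the command is only printed here, not run:

```shell
#!/bin/sh
# Embedded mode: the Derby engine runs inside the one JVM that boots the
# database, so a second process (e.g. the crawler-ui while the daemon is
# up) cannot open it at the same time.  The "lcfdb" path is illustrative.
EMBEDDED_URL="jdbc:derby:lcfdb;create=true"

# Server mode: a separate Derby network-server process owns the database
# and any number of client JVMs connect over a port (1527 by default).
# That would lift the one-process restriction, but adds exactly the
# external service that embedded Derby was chosen to avoid for tests.
CLIENT_URL="jdbc:derby://localhost:1527/lcfdb"

# Illustrative server start (printed, not executed; the derbyrun.jar path
# is an assumption about your Derby install):
echo "java -jar \$DERBY_HOME/lib/derbyrun.jar server start"
```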

Karl


Re: Derby

Posted by Jack Krupansky <ja...@lucidimagination.com>.
What is the nature of the single-LCF-process issue? Is it because the database is being used in single-user mode, or some other issue? Is it a permanent issue, or is there a solution or workaround anticipated at some stage?

Thanks.

-- Jack Krupansky
