Posted to users@jena.apache.org by "Chaudhuri, Rajiv" <ra...@pearson.com> on 2014/02/04 14:28:45 UTC

Fuseki Data backup schedule

Hi,

We take a backup of our Fuseki dataset daily, and the backup process is
run as a cron job.

When should we take the backup?
1. During off-peak time

Or

2. During peak time. Could this cause concurrency problems or data
corruption?

We found that Fuseki stopped responding (with nothing in the error log)
during peak time, and our backup process was triggered at the same time.

Our data file directory is 300 MB and the backup file (nq.gz) is 7.6 MB.


What do you recommend? What is a suitable time to take the backup of the
Fuseki dataset?



-- 
Regards,
*Rajiv*

Re: Fuseki Data backup schedule

Posted by Andy Seaborne <an...@apache.org>.
On 05/02/14 16:45, Chaudhuri, Rajiv wrote:
> Hi Andy,
>
> Yes, it has happened once.
>
> I am running the Fuseki service as a background process.
> Command (from the home of Fuseki):
> ./fuseki-server --config=/etc/fuseki/conf.ttl --mgtPort=58080 &

stderr will be printed to the terminal.

As there is no "nohup", if the terminal goes away or the TCP
connection breaks (e.g. the ssh client disconnects), the process group is
killed and the Fuseki server will exit (terminated with SIGHUP).
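
As a minimal sketch of the "nohup" approach (not the recommended setup, which is the service script mentioned further down): start the server so that it ignores SIGHUP and send stdout and stderr to a file, so that a message printed just before an exit is kept. The helper name start_detached and the log path in the usage comment are assumptions for illustration; the fuseki-server command line is the one quoted above.

```shell
# Sketch only: start a command so that it ignores SIGHUP and keeps its
# stdout/stderr in a file. start_detached is a hypothetical helper name.
start_detached() {
    log_file=$1
    shift
    # nohup makes the command ignore SIGHUP; stdin is detached from the
    # terminal and both output streams are appended to the log file.
    nohup "$@" </dev/null >>"$log_file" 2>&1 &
    # Remember the PID of the background job for later checks.
    DETACHED_PID=$!
}

# Usage (the log path is an assumption):
# start_detached /var/log/fuseki/stderrout.log \
#     ./fuseki-server --config=/etc/fuseki/conf.ttl --mgtPort=58080
```

With stderr redirected like this, anything the JVM prints on its way out ends up in the file rather than on a terminal that may no longer exist.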

> The logger settings are as follows. All log output goes to the
> specified log file, but no stderrout.log is generated. (Can the
> configuration be changed so that we get an error log?)

See the 'fuseki' script for running in the background as a service. 
This is a much better way to do it for a server.  It's an init.d script 
as well.

You can disconnect from the machine; fuseki will not exit.

	Andy


Re: Fuseki Data backup schedule

Posted by "Chaudhuri, Rajiv" <ra...@pearson.com>.
Hi Andy,

Yes, it has happened once.

I am running the Fuseki service as a background process.
Command (from the Fuseki home directory):
./fuseki-server --config=/etc/fuseki/conf.ttl --mgtPort=58080 &

The logger settings are as follows. All log output goes to the specified
log file, but no stderrout.log is generated. (Can the configuration be
changed so that we get an error log?)

log4j.rootLogger=INFO, FusekiFileLog

#log4j.appender.stdlog=org.apache.log4j.ConsoleAppender
## log4j.appender.stdlog.target=System.err
#log4j.appender.stdlog.layout=org.apache.log4j.PatternLayout
#log4j.appender.stdlog.layout.ConversionPattern=%d{HH:mm:ss} %-5p %-20c{1} :: %m%n

## # Example for file logging.
log4j.appender.FusekiFileLog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.FusekiFileLog.DatePattern='.'yyyy-MM-dd
log4j.appender.FusekiFileLog.File=/var/log/fuseki/fuseki.log
log4j.appender.FusekiFileLog.layout=org.apache.log4j.PatternLayout
log4j.appender.FusekiFileLog.layout.ConversionPattern=%d{HH:mm:ss} %-5p %-20c{1} :: %m%n


Regards,
Rajiv


Re: Fuseki Data backup schedule

Posted by Andy Seaborne <an...@apache.org>.
Rajiv,

So it's happened once?

I suspect the process exited and the message would have been on stderr,
which is not sent to the log4j log.

I don't know how you are running it as a service, but the script in the
distribution sends stderr to a file called 'stderrout.log'.

	Andy


Re: Fuseki Data backup schedule

Posted by "Chaudhuri, Rajiv" <ra...@pearson.com>.
Hi Andy,

When a customer reported that data was not being displayed in the UI, I
checked whether the Fuseki process was running. It was no longer running
after the scheduled backup had run during peak load, and I had to start
the Fuseki server again.

Regards,
Rajiv


Re: Fuseki Data backup schedule

Posted by Andy Seaborne <an...@apache.org>.
On 04/02/14 15:33, Chaudhuri, Rajiv wrote:
> Hi Andy,
>
> The process gets killed, so it stops permanently.

I don't understand - does some other process kill it, or does it exit 
(if so, with what exit code?)
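
To answer that, it helps to capture the exit status when the server stops. The sketch below relies on the usual shell convention that a status above 128 means the process was ended by signal number (status - 128), so 129 would correspond to SIGHUP. describe_status and run_and_report are hypothetical helper names, not part of Fuseki, and the report file is whatever path you choose.

```shell
# Sketch only: turn a shell exit status into a readable description.
describe_status() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # By convention, shells report death by signal N as status 128+N.
        echo "terminated by signal $((code - 128))"
    else
        echo "exited with code $code"
    fi
}

# Run a command in the foreground and append how it ended to a file.
run_and_report() {
    report_file=$1
    shift
    "$@"
    code=$?
    # After the shift, $1 is the name of the command that was run.
    echo "$(date) $1: $(describe_status "$code")" >>"$report_file"
    return "$code"
}
```

Running the server through run_and_report (itself started in the background) leaves a one-line record that distinguishes "killed by a signal" from "exited on its own".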

	Andy


Re: Fuseki Data backup schedule

Posted by "Chaudhuri, Rajiv" <ra...@pearson.com>.
Hi Andy,

The process gets killed, so it stops permanently.

Regards,
Rajiv


Re: Fuseki Data backup schedule

Posted by Andy Seaborne <an...@apache.org>.
On 04/02/14 13:28, Chaudhuri, Rajiv wrote:
> Hi,
>
> We take a backup of our Fuseki dataset daily, and the backup process is
> run as a cron job.
>
> When should we take the backup?
> 1. During off-peak time
>
> Or
>
> 2. During peak time. Could this cause concurrency problems or data
> corruption?
>
> We found that Fuseki stopped responding (with nothing in the error log)
> during peak time, and our backup process was triggered at the same time.
>
> Our data file directory is 300 MB and the backup file (nq.gz) is 7.6 MB.
>
>
> What do you recommend? What is a suitable time to take the backup of the
> Fuseki dataset?
>
>

A backup, if you're calling the Fuseki server backup function, is a
read transaction.  A writer can be active at the same time, but if there
are many updates, they accumulate while waiting for the DB to become
free so that the main DB files can be updated from the journal.

If you have a frequent update load, then off-peak is better, but it
should all work either way.

What do you mean by "stop responding" - permanently or for the duration 
of the backup?
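
To see whether an outage really coincides with the backup, one option is to have the cron job log when each backup starts and ends. The sketch below wraps whatever command triggers the backup; timed_backup is a hypothetical helper, and the backup command itself is deliberately left as a parameter because it is not shown in this thread. It assumes a date command that supports +%s, as the GNU and BSD versions do.

```shell
# Sketch only: run a backup command and log start time, duration and status.
timed_backup() {
    log_file=$1
    shift
    start=$(date +%s)
    echo "$(date '+%Y-%m-%d %H:%M:%S') backup started" >>"$log_file"
    # Run the backup command passed as the remaining arguments.
    "$@"
    code=$?
    end=$(date +%s)
    echo "$(date '+%Y-%m-%d %H:%M:%S') backup finished status=$code seconds=$((end - start))" >>"$log_file"
    return "$code"
}
```

Comparing these timestamps with the last lines of the server log shows whether the server stopped during, or only some time after, the backup.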

	Andy
