Posted to dev@falcon.apache.org by Peng Zhang <pe...@yahoo-inc.com.INVALID> on 2014/10/03 18:38:31 UTC

Retention does not work?

Hi Falcon Devs,

I ran the example apps in the src package, and the workflow works well. However, after the job finished successfully, I didn't see any retention job scheduled, nor did my input and output paths get deleted. Attached is the sample in-feed.xml I have. In addition, I cannot see any errors in the Oozie log or the job logs. Could you please help me figure out what the problem might be?
Thanks in advance.

Best,
Peng

<feed description="input" name="in" xmlns="uri:falcon:feed:0.1">
    <groups>input</groups>

    <frequency>minutes(1)</frequency>
    <timezone>UTC</timezone>
    <late-arrival cut-off="minutes(3)"/>

    <clusters>
        <cluster name="local">
            <validity start="2013-11-15T00:00Z" end="2013-11-15T00:50Z"/>
            <retention limit="minutes(15)" action="delete"/>
        </cluster>
    </clusters>

    <locations>
        <location type="data" path="/data/in/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
    </locations>

    <ACL owner="pengzhang" group="hdfs" permission="0x777"/>
    <schema location="/schema/log/log.format.csv" provider="csv"/>
    <properties>


Re: Retention does not work?

Posted by Peng Zhang <pe...@yahoo-inc.com>.
Hi Venkatesh,
I checked the staging path, and it contains only a workflow dir, which holds the process info and logs. I didn't see the [...] Is there anything wrong with my conf?
One of the log files is shown below. It's in JSON format. What could the problem be? Looking forward to your reply.
Thanks,
Peng


{"timeStamp":"2014-10-03-15-57","status":"SUCCEEDED","feedNames":"out","runId":"0","entityType":"PROCESS","nominalTime":"2013-11-15-00-55","userWorkflowName":"oozie-co-workflow","workflowUser":"pengzhang","entityName":"oozie-co","feedInstancePaths":"hdfs:\/\/nn\/data\/out\/2013\/11\/15\/00","operation":"GENERATE","logDir":"hdfs:\/\/nn:8020\/projects\/falcon\/staging\/falcon\/workflows\/process\/oozie-co\/logs\/job-2013-11-15-00-55\/","falconInputPaths":"hdfs:\/\/nn:8020\/data\/in\/2013\/11\/15\/00\/54,hdfs:\/\/nn:8020\/data\/in\/2013\/11\/15\/00\/53,hdfs:\/\/nn:8020\/data\/in\/2013\/11\/15\/00\/52,hdfs:\/\/nng:8020\/data\/in\/2013\/11\/15\/00\/51,hdfs:\/\/nn:8020\/data\/in\/2013\/11\/15\/00\/50","falconInputFeeds":"in","userWorkflowEngine":"oozie","userWorkflowVersion":"1.0","workflowId":"0000086-140929160402261-oozie-oozi-W","cluster":"local","workflowEngineUrl":"http:\/\/oozie:4080\/oozie","subflowId":"0000086-140929160402261-oozie-oozi-W@user-oozie-workflow"}
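A record like the one above can be inspected with a short Python sketch. The `LOG_LINE` constant is an abbreviated copy of the record (only a few fields and two input paths are kept), and `summarize` is a hypothetical helper, not part of Falcon. The `entityType` and `operation` fields suggest this record describes the process run rather than a retention action on the feed, though that reading is an assumption.

```python
import json

# Abbreviated copy of the record above; most fields are omitted for brevity.
LOG_LINE = (
    r'{"status":"SUCCEEDED","entityType":"PROCESS","entityName":"oozie-co",'
    r'"operation":"GENERATE","falconInputFeeds":"in",'
    r'"falconInputPaths":"hdfs:\/\/nn:8020\/data\/in\/2013\/11\/15\/00\/54,'
    r'hdfs:\/\/nn:8020\/data\/in\/2013\/11\/15\/00\/53"}'
)


def summarize(log_line):
    """Pull out the fields that say which entity and operation a record covers."""
    record = json.loads(log_line)  # JSON's "\/" escape decodes to a plain "/"
    return {
        "entity": (record["entityType"], record["entityName"]),
        "operation": record["operation"],
        "status": record["status"],
        # The input paths are stored as one comma-separated string.
        "input_paths": record["falconInputPaths"].split(","),
    }


if __name__ == "__main__":
    for key, value in summarize(LOG_LINE).items():
        print(key, "=", value)
```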
 

     On Friday, October 3, 2014 7:11 PM, Seetharam Venkatesh <ve...@innerzeal.com> wrote:
   

Can you please check the instancePaths*.csv under the staging dir and see whether it
has any contents?



-- 
Regards,
Venkatesh

“Perfection (in design) is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.”
- Antoine de Saint-Exupéry

   

Re: Retention does not work?

Posted by Peng Zhang <pe...@yahoo-inc.com>.
Hi Sanjeev,
The coord seems to have been kicked off, but the minute folder doesn't get deleted. What might the problem be? I've attached two pictures to show that the coord was kicked off correctly.

Regards,
Peng
 

     On Friday, October 3, 2014 10:44 PM, Sanjeev Tripurari <sa...@inmobi.com> wrote:
   

 Hi Peng,

Can you check a few things?

1. The feed needs to be scheduled for retention.
    a) Check whether the Oozie coord was kicked off for retention.
        If it was not, check the Falcon application log and the Oozie log for
the issue.

2. Check the path /data/in/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}.
    If retention is running fine, it should have deleted the MINUTE directory.

Regards
-Sanjeev



   

Re: Retention does not work?

Posted by Sanjeev Tripurari <sa...@inmobi.com>.
Hi Peng,

Can you check a few things?

1. The feed needs to be scheduled for retention.
    a) Check whether the Oozie coord was kicked off for retention.
         If it was not, check the Falcon application log and the Oozie log for
the issue.

2. Check the path /data/in/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}.
    If retention is running fine, it should have deleted the MINUTE directory.
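
As a rough illustration of the second check, the Python sketch below expands the feed's path template for a set of instance times and lists the ones that fall outside the retention limit. It assumes an instance becomes eligible for deletion once its time is strictly older than "now minus limit"; the exact boundary and evaluation schedule Falcon uses may differ. `instance_path` and `paths_past_retention` are hypothetical helpers, not Falcon APIs.

```python
import re
from datetime import datetime, timedelta, timezone

# Path template copied from the feed definition in this thread.
PATH_TEMPLATE = "/data/in/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"


def instance_path(template, t):
    """Expand the ${YEAR}..${MINUTE} variables for one instance time."""
    values = {
        "YEAR": f"{t.year:04d}",
        "MONTH": f"{t.month:02d}",
        "DAY": f"{t.day:02d}",
        "HOUR": f"{t.hour:02d}",
        "MINUTE": f"{t.minute:02d}",
    }
    return re.sub(r"\$\{(\w+)\}", lambda m: values[m.group(1)], template)


def paths_past_retention(template, instances, now, limit):
    """Return paths of instances strictly older than now - limit (assumed rule)."""
    cutoff = now - limit
    return [instance_path(template, t) for t in instances if t < cutoff]


if __name__ == "__main__":
    now = datetime(2013, 11, 15, 0, 50, tzinfo=timezone.utc)
    instances = [now - timedelta(minutes=m) for m in (20, 16, 15, 10)]
    # With limit minutes(15), only the instances older than 00:35 are listed.
    for path in paths_past_retention(PATH_TEMPLATE, instances, now, timedelta(minutes=15)):
        print(path)
```

Directories printed by such a check are the ones that would be expected to disappear once the retention coord has run.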

Regards
-Sanjeev



Re: Retention does not work?

Posted by Seetharam Venkatesh <ve...@innerzeal.com>.
Can you please check the instancePaths*.csv under the staging dir and see whether it
has any contents?




Re: Retention does not work?

Posted by Peng Zhang <pe...@yahoo-inc.com.INVALID>.
Hi Sowmya,

Thanks for your reply. I'll look through the log and retry with an end time in the future.

Regards,
Peng
On Oct 15, 2014, at 3:12 PM, Sowmya Ramesh <sr...@hortonworks.com> wrote:

> "Feed Retention is not applicable as Feed's end time for cluster local is


Re: Retention does not work?

Posted by Sowmya Ramesh <sr...@hortonworks.com>.
Hey Peng,

In your feed entity definition, the validity is not set correctly. Please change
the validity end to a date in the future. The retention coord is not getting
kicked off because of this. You should be seeing the warning below in
falcon.application.log:
*Warning:*
"Feed Retention is not applicable as Feed's end time for cluster local is
not in the future"

 <clusters>
        <cluster name="local">
            <validity start="2013-11-15T00:00Z" *end="2013-11-15T00:50Z"*/>
            <retention limit="minutes(15)" action="delete"/>
        </cluster>
    </clusters>

Thanks!
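
A quick way to catch this before submitting a feed is to compare each cluster's validity end with the current time. The Python sketch below does that for a trimmed copy of the feed from this thread. `clusters_with_expired_validity` is a hypothetical helper, not part of Falcon, and the sketch assumes the validity timestamps always use the `yyyy-MM-ddTHH:mmZ` form seen here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace taken from the feed definition in this thread.
FEED_NS = {"f": "uri:falcon:feed:0.1"}

# Trimmed copy of the feed: only the parts relevant to validity are kept.
SAMPLE_FEED = """<feed description="input" name="in" xmlns="uri:falcon:feed:0.1">
    <clusters>
        <cluster name="local">
            <validity start="2013-11-15T00:00Z" end="2013-11-15T00:50Z"/>
            <retention limit="minutes(15)" action="delete"/>
        </cluster>
    </clusters>
</feed>"""


def clusters_with_expired_validity(feed_xml, now=None):
    """Return the names of clusters whose validity end is not in the future."""
    if now is None:
        now = datetime.now(timezone.utc)
    root = ET.fromstring(feed_xml)
    expired = []
    for cluster in root.findall("f:clusters/f:cluster", FEED_NS):
        validity = cluster.find("f:validity", FEED_NS)
        if validity is None:
            continue  # nothing to check for this cluster
        # Assumed timestamp format: 2013-11-15T00:50Z (UTC).
        end = datetime.strptime(validity.get("end"), "%Y-%m-%dT%H:%MZ")
        end = end.replace(tzinfo=timezone.utc)
        if end <= now:
            expired.append(cluster.get("name"))
    return expired


if __name__ == "__main__":
    # Prints the cluster names whose validity window has already closed.
    print(clusters_with_expired_validity(SAMPLE_FEED))
```

Any cluster name this returns has a validity end that is already in the past, which matches the condition described by the warning above.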



