Posted to user@gobblin.apache.org by "Zhang, Xiuzhu(AWF)" <xi...@paypal.com> on 2017/12/19 03:28:06 UTC

compaction feature

Hi guys,

The compaction feature can currently handle datasets that fall between 'compaction.timebased.max.time.ago' and '...min.time.ago'. Both bounds are relative to the current time, so the window moves as time passes.
Could it also handle all datasets after a fixed point in time, identified by the directory path (e.g. 2017/12/16/08)?

I also know that 'compaction.timebased.folder.pattern' can be configured to set the time pattern, based on TimeBasedSubDirDatasetsFinder.java at line 136:
    String folderStructure = getFolderStructure();
    for (FileStatus status : this.fs.globStatus(new Path(inputPath, folderStructure))) { ... }
Could multiple time patterns be configured in the xx.pull file (e.g. YYYY/MM/dd/HH and YYYY/MM/dd)?
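
To make the two requests concrete, here is a rough sketch of the selection logic in plain Java. It is not Gobblin code: the class and method names are hypothetical, the folder paths are hard-coded strings rather than FileStatus results, and it uses java.time only so that it runs standalone. The patterns are written with lowercase yyyy because in java.time an uppercase YYYY means week-based year. A day-level folder is treated as starting at midnight, which is an assumption a real implementation might want to revisit.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; none of these names exist in Gobblin.
class FixedCutoffFolderFilter {

  // Build a formatter for one folder pattern. The hour defaults to 0 so that
  // day-level folders (yyyy/MM/dd) also resolve to a full LocalDateTime.
  static DateTimeFormatter formatterFor(String pattern) {
    return new DateTimeFormatterBuilder()
        .appendPattern(pattern)
        .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
        .toFormatter();
  }

  // Try each configured pattern in order; the first one that matches the
  // whole relative path wins. Paths that match no pattern yield empty.
  static Optional<LocalDateTime> parseFolderTime(String relativePath,
                                                 List<DateTimeFormatter> formatters) {
    for (DateTimeFormatter formatter : formatters) {
      try {
        return Optional.of(LocalDateTime.parse(relativePath, formatter));
      } catch (DateTimeParseException e) {
        // This pattern does not fit the path; fall through to the next one.
      }
    }
    return Optional.empty();
  }

  // Keep only the folders whose parsed time is at or after the fixed cutoff.
  static List<String> foldersAtOrAfter(List<String> relativePaths,
                                       List<String> patterns,
                                       LocalDateTime cutoff) {
    List<DateTimeFormatter> formatters = new ArrayList<>();
    for (String pattern : patterns) {
      formatters.add(formatterFor(pattern));
    }
    List<String> selected = new ArrayList<>();
    for (String path : relativePaths) {
      Optional<LocalDateTime> folderTime = parseFolderTime(path, formatters);
      if (folderTime.isPresent() && !folderTime.get().isBefore(cutoff)) {
        selected.add(path);
      }
    }
    return selected;
  }
}
```

With the patterns yyyy/MM/dd/HH and yyyy/MM/dd and a cutoff of 2017/12/16/08, the folders 2017/12/16/08 and 2017/12/17 would be selected, while 2017/12/15 and 2017/12/16/07 would be skipped.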

Thanks,
Ethan

Re: compaction feature

Posted by Abhishek Tiwari <ab...@apache.org>.
Hi Ethan,

Did you modify it already? If so, please feel free to send a PR.

Abhishek

On Mon, Dec 18, 2017 at 7:28 PM, Zhang, Xiuzhu(AWF) <xi...@paypal.com>
wrote:

> Hi guys,
>
> The compaction feature can currently handle datasets that fall between
> 'compaction.timebased.max.time.ago' and '...min.time.ago'. Both bounds
> are relative to the current time, so the window moves as time passes.
>
> Could it also handle all datasets after a fixed point in time, identified
> by the directory path (e.g. 2017/12/16/08)?
>
> I also know that 'compaction.timebased.folder.pattern' can be configured
> to set the time pattern, based on TimeBasedSubDirDatasetsFinder.java at
> line 136:
>
>     String folderStructure = getFolderStructure();
>     for (FileStatus status : this.fs.globStatus(new Path(inputPath, folderStructure))) { ... }
>
> Could multiple time patterns be configured in the xx.pull file
> (e.g. YYYY/MM/dd/HH and YYYY/MM/dd)?
>
> Thanks,
> Ethan
