Posted to user@spark.apache.org by durga <du...@gmail.com> on 2014/12/17 00:35:30 UTC
S3 globbing
Hi All,
I need help with the file-matching pattern in my sc.textFile().
I have lots of files named with an epoch-millisecond timestamp,
e.g. abc_1418759383723.json.
Now I need to consume the last hour's files using that epoch timestamp.
I tried a couple of options, but nothing seems to work for me.
If any of you have faced this issue and found a solution, please help me.
Appreciate your help,
Thanks,
D
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/S3-globbing-tp20731.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
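[Editor's note, not part of the original mail: sc.textFile paths are matched with Hadoop glob syntax (`*`, `?`, `[0-9]`, `{a,b}`), not regular expressions. java.nio's glob matcher follows similar rules and is a convenient way to try a pattern locally before pointing Spark at S3 — a sketch, using a hypothetical filename:]

```scala
import java.nio.file.{FileSystems, Paths}

object GlobDemo {
  // Check a single file name against a glob pattern (not a regex).
  // This uses java.nio's glob dialect, which is close to Hadoop's for
  // simple single-component patterns like the ones in this thread.
  def matches(pattern: String, name: String): Boolean =
    FileSystems.getDefault.getPathMatcher("glob:" + pattern).matches(Paths.get(name))

  def main(args: Array[String]): Unit = {
    println(matches("abc_14187*.json", "abc_1418759383723.json")) // true: * matches any run
    println(matches("abc_[0-9]*.json", "abc_x.json"))             // false: [0-9] needs a digit
  }
}
```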
Re: S3 globbing
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Yes, you can create a Calendar instance, set it back one hour, and use that
object's time in milliseconds.
E.g.:

import java.util.Calendar

val hourBefore = Calendar.getInstance()
hourBefore.add(Calendar.HOUR, -1)
val cutoff = hourBefore.getTimeInMillis  // epoch millis, one hour ago
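[Editor's note, not part of the original mail: to answer the follow-up about a between-this-time-and-that-time range — a glob alone cannot express a numeric range, so one option (a sketch; the filename shape and helper names are assumptions) is to compute the window in epoch millis, filter the listed file names by their embedded timestamp, and hand the survivors to sc.textFile as a comma-separated list:]

```scala
import java.util.Calendar

object TimestampRange {
  // Filenames like "abc_1418759383723.json", as in the thread.
  private val TsPattern = """abc_(\d{13})\.json""".r

  // Keep only names whose embedded epoch-millis timestamp lies in [from, to).
  def inRange(names: Seq[String], from: Long, to: Long): Seq[String] =
    names.filter {
      case TsPattern(ts) => ts.toLong >= from && ts.toLong < to
      case _             => false
    }

  def main(args: Array[String]): Unit = {
    val to   = Calendar.getInstance().getTimeInMillis
    val from = to - 3600L * 1000L // one hour ago
    // In a real job: list the S3 keys, filter them, then call
    //   sc.textFile(matching.map("s3n://bucket/" + _).mkString(","))
    // since sc.textFile accepts a comma-separated list of paths.
    println(inRange(Seq("abc_1418759383723.json"), from, to))
  }
}
```

Listing the keys first costs an extra S3 listing call, but it is exact, unlike prefix globbing.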
Thanks
Best Regards
On Thu, Dec 18, 2014 at 12:57 AM, durga katakam <du...@gmail.com> wrote:
Re: S3 globbing
Posted by durga katakam <du...@gmail.com>.
Hi Akhil,
Thanks for your time, I appreciate it. I tried this approach, but I am getting
either fewer or more files, not exactly the last hour's files.
Is there any way I can specify a range (between this time and that time)?
Thanks,
D
On Tue, Dec 16, 2014 at 11:04 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:
Re: S3 globbing
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Did you try something like:

// one hour ago, in epoch millis
val d = System.currentTimeMillis() - 3600 * 1000
val ex = "abc_" + d.toString.substring(0, 7) + "*.json"
Thanks
Best Regards
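[Editor's note, not part of the original mail: a likely reason the substring approach over- or under-shoots is granularity — a current epoch-millis value has 13 digits, so its first 7 digits advance once every 10^6 ms, about 16.7 minutes. A single 7-digit prefix therefore matches roughly a 16-minute bucket, not an hour. One workaround (a sketch; the bucket path is hypothetical) is to enumerate every prefix bucket overlapping the window and join the resulting globs into the comma-separated list that sc.textFile accepts:]

```scala
object PrefixWindow {
  // Every 7-digit millis prefix whose 10^6-ms bucket overlaps [from, to].
  def prefixes(from: Long, to: Long): Seq[String] =
    (from / 1000000L to to / 1000000L).map(_.toString)

  def main(args: Array[String]): Unit = {
    val to   = System.currentTimeMillis()
    val from = to - 3600L * 1000L
    // e.g. "s3n://bucket/abc_1418759*.json,s3n://bucket/abc_1418760*.json,..."
    val pattern = prefixes(from, to).map(p => s"s3n://bucket/abc_$p*.json").mkString(",")
    println(pattern)
  }
}
```

An hour spans four or five such buckets, so the joined pattern slightly over-matches at the edges; combine it with an exact timestamp filter if precision matters.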