Posted to dev@chukwa.apache.org by "Bill Graham (JIRA)" <ji...@apache.org> on 2010/01/22 00:48:54 UTC

[jira] Created: (CHUKWA-449) Clreate utility to generate a sequence files from a log file

Clreate utility to generate a sequence files from a log file
------------------------------------------------------------

                 Key: CHUKWA-449
                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
             Project: Hadoop Chukwa
          Issue Type: New Feature
            Reporter: Bill Graham


See this thread:
http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html

We should have a utility class that can generate a Chukwa sequence file from a raw log file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CHUKWA-449) Create utility to generate a sequence file from a log file

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803550#action_12803550 ] 

Ari Rabkin commented on CHUKWA-449:
-----------------------------------

Code looks good. I think it should live in a new file in org.apache.hadoop.chukwa.util, rather than in TempFileUtil. I would also give it a name indicating that it produces a sequence file of Records, not of raw Chunks: something like CreateRecordFile.

> Create utility to generate a sequence file from a log file
> ----------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-449-1.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Updated: (CHUKWA-449) Create utility to generate a sequence file from a log file

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated CHUKWA-449:
-------------------------------

    Attachment: CHUKWA-449-2.patch

Attaching CHUKWA-449-2.patch.

I've moved the code to org.apache.hadoop.chukwa.util.CreateRecordFile and added a unit test. The test reads test/samples/ClientTrace.log, writes a SequenceFile to disk, then reads the sequence file back and asserts its entries against the original.
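
The write, read back, and compare pattern described above can be sketched with the JDK alone. This is not the test from CHUKWA-449-2.patch: the class and method names are hypothetical, and a simple DataOutputStream record format stands in for a Hadoop SequenceFile, so it only illustrates the shape of such a round-trip check.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a SequenceFile round-trip test.
final class RoundTripSketch {

    // Write a record count, then each line as a (line index, text) pair.
    static void writeRecords(List<String> lines, Path out) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(Files.newOutputStream(out))) {
            dos.writeInt(lines.size());
            for (int i = 0; i < lines.size(); i++) {
                dos.writeInt(i);              // key: line index
                dos.writeUTF(lines.get(i));   // value: line text
            }
        }
    }

    // Read the records back in order, checking that keys are sequential.
    static List<String> readRecords(Path in) throws IOException {
        List<String> result = new ArrayList<>();
        try (DataInputStream dis = new DataInputStream(Files.newInputStream(in))) {
            int count = dis.readInt();
            for (int i = 0; i < count; i++) {
                int key = dis.readInt();
                if (key != i) {
                    throw new IOException("unexpected key " + key + " at record " + i);
                }
                result.add(dis.readUTF());
            }
        }
        return result;
    }

    // Write the lines to a temporary file, read them back, and clean up.
    static List<String> roundTrip(List<String> lines) throws IOException {
        Path tmp = Files.createTempFile("records", ".bin");
        try {
            writeRecords(lines, tmp);
            return readRecords(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> original = Arrays.asList("first log line", "second log line");
        List<String> copy = roundTrip(original);
        if (!original.equals(copy)) {
            throw new AssertionError("entries differ after round trip");
        }
        System.out.println("round trip preserved " + copy.size() + " entries");
    }
}
```

In a real test the comparison would be made against the ChukwaRecord entries read from the sequence file rather than against plain strings.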

> Create utility to generate a sequence file from a log file
> ----------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-449-1.patch, CHUKWA-449-2.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Assigned: (CHUKWA-449) Clreate utility to generate a sequence files from a log file

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham reassigned CHUKWA-449:
----------------------------------

    Assignee: Bill Graham

> Clreate utility to generate a sequence files from a log file
> ------------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Commented: (CHUKWA-449) Create utility to generate a sequence file from a log file

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850982#action_12850982 ] 

Hudson commented on CHUKWA-449:
-------------------------------

Integrated in Chukwa-trunk #330 (see http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/330/)

> Create utility to generate a sequence file from a log file
> ----------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: Data Processors
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>             Fix For: 0.4.0
>
>         Attachments: CHUKWA-449-1.patch, CHUKWA-449-2.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Updated: (CHUKWA-449) Create utility to generate a sequence file from a log file

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-449:
------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Bill!

> Create utility to generate a sequence file from a log file
> ----------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-449-1.patch, CHUKWA-449-2.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Updated: (CHUKWA-449) Clreate utility to generate a sequence files from a log file

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated CHUKWA-449:
-------------------------------

    Attachment: CHUKWA-449-1.patch

Attaching CHUKWA-449-1.patch.

I've added a new method to TempFileUtil: 

{code}
public static void makeTestSequenceFile(File inputFile,
                                        Path outputFile,
                                        String clusterName,
                                        String dataType,
                                        String streamName,
                                        MapProcessor processor) throws IOException
{code}

I've also included a main method, with the following usage message:
{code}
Usage: java org.apache.hadoop.chukwa.util.TempFileUtil <inputFile> <outputFile> [clusterName] [dataType] [streamName] [processorClass]
Description: Takes a plain text input file and generates a Hadoop sequence
             file containing ChukwaRecordKey,ChukwaRecord entries
Parameters: inputFile      - Text input file to read
            outputFile     - Where to write the sequence file
            clusterName    - Cluster name to use in the records
            dataType       - Data type to use in the records
            streamName     - Stream name to use in the records
            processorClass - Processor class to use. Defaults to TsProcessor
{code}
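
As an illustration of how the two required and four optional positional arguments in the usage message might be handled, here is a minimal sketch. It is not the code from CHUKWA-449-1.patch: the class name RecordFileArgs and the placeholder defaults for clusterName, dataType, and streamName are assumptions made for the example. Only the TsProcessor default comes from the usage message.

```java
// Hypothetical sketch of positional-argument handling; not the patch itself.
final class RecordFileArgs {
    // Placeholder defaults, assumed for illustration only.
    static final String DEFAULT_CLUSTER = "testCluster";
    static final String DEFAULT_DATA_TYPE = "testDataType";
    static final String DEFAULT_STREAM = "testStream";
    // The usage message states that the processor defaults to TsProcessor.
    static final String DEFAULT_PROCESSOR = "TsProcessor";

    final String inputFile;
    final String outputFile;
    final String clusterName;
    final String dataType;
    final String streamName;
    final String processorClass;

    private RecordFileArgs(String[] args) {
        inputFile = args[0];
        outputFile = args[1];
        clusterName = optional(args, 2, DEFAULT_CLUSTER);
        dataType = optional(args, 3, DEFAULT_DATA_TYPE);
        streamName = optional(args, 4, DEFAULT_STREAM);
        processorClass = optional(args, 5, DEFAULT_PROCESSOR);
    }

    // Return args[index] if it was supplied, otherwise the fallback.
    private static String optional(String[] args, int index, String fallback) {
        return index < args.length ? args[index] : fallback;
    }

    // Validate the argument count before reading any positions.
    static RecordFileArgs parse(String[] args) {
        if (args.length < 2 || args.length > 6) {
            throw new IllegalArgumentException(
                "expected 2 to 6 arguments, got " + args.length);
        }
        return new RecordFileArgs(args);
    }

    public static void main(String[] args) {
        RecordFileArgs parsed = parse(args);
        System.out.println(parsed.inputFile + " -> " + parsed.outputFile
            + " using " + parsed.processorClass);
    }
}
```

Because the optional arguments are positional, a caller who wants to override processorClass has to supply clusterName, dataType, and streamName as well.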

I wasn't sure where to put this code, so let me know if there's a better home for it. Also, since this is just a static helper utility, there isn't a unit test.

> Clreate utility to generate a sequence files from a log file
> ------------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-449-1.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Updated: (CHUKWA-449) Create utility to generate a sequence file from a log file

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated CHUKWA-449:
-------------------------------

    Summary: Create utility to generate a sequence file from a log file  (was: Clreate utility to generate a sequence files from a log file)

> Create utility to generate a sequence file from a log file
> ----------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-449-1.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Updated: (CHUKWA-449) Create utility to generate a sequence file from a log file

Posted by "Ari Rabkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin updated CHUKWA-449:
------------------------------

      Component/s: Data Processors
    Fix Version/s: 0.4.0

> Create utility to generate a sequence file from a log file
> ----------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: Data Processors
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>             Fix For: 0.4.0
>
>         Attachments: CHUKWA-449-1.patch, CHUKWA-449-2.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.



[jira] Updated: (CHUKWA-449) Clreate utility to generate a sequence files from a log file

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated CHUKWA-449:
-------------------------------

    Release Note: Added new utility for creating Chukwa sequence files in development.
          Status: Patch Available  (was: Open)

> Clreate utility to generate a sequence files from a log file
> ------------------------------------------------------------
>
>                 Key: CHUKWA-449
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-449
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>         Attachments: CHUKWA-449-1.patch
>
>
> See this thread:
> http://www.mail-archive.com/chukwa-user%40hadoop.apache.org/msg00084.html
> We should have a utility class that can generate a Chukwa sequence file from a raw log file.
