You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flume.apache.org by "Nicholas Verbeck (JIRA)" <ji...@apache.org> on 2011/08/04 04:01:28 UTC
[jira] [Created] (FLUME-722) RegexAllExtractor doesn't ignore empty
groups
RegexAllExtractor doesn't ignore empty groups
---------------------------------------------
Key: FLUME-722
URL: https://issues.apache.org/jira/browse/FLUME-722
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Affects Versions: v0.9.4
Reporter: Nicholas Verbeck
Assignee: Nicholas Verbeck
Priority: Minor
Hi flume devs.
I saw a bug when using RegexAllExtractor: line 94:
if(names.get(grp-1) != ""){
Attributes.setString(e, names.get(grp-1), val);
}
Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
(I don't think I can open a issue at cloudera jira.)
Thanks,
Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-722) RegexAllExtractor doesn't ignore empty
groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hsieh updated FLUME-722:
---------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Fix For: v0.9.5
>
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-722) RegexAllExtractor doesn't ignore empty
groups
Posted by "Mingjie Lai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mingjie Lai updated FLUME-722:
------------------------------
Attachment: FLUME-722.patch
Patch available.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Nicholas Verbeck
> Priority: Minor
> Labels: decorator, regexall
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129329#comment-13129329 ]
Hudson commented on FLUME-722:
------------------------------
Integrated in flume-728 #40 (See [https://builds.apache.org/job/flume-728/40/])
FLUME-722. Transactinal memory channel implementation.
(Prasad Mujumdar via Arvind Prabhakar)
arvind : http://svn.apache.org/viewvc/?view=rev&rev=1185412
Files :
* /incubator/flume/branches/flume-728/flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/main/java/org/apache/flume/channel/MultiOpMemChannel.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/main/java/org/apache/flume/channel/PseudoTxnMemoryChannel.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannelTransaction.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/sink/TestAvroSink.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/sink/TestLoggerSink.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/sink/TestRollingFileSink.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/source/TestAvroSource.java
* /incubator/flume/branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/source/TestSequenceGeneratorSource.java
* /incubator/flume/branches/flume-728/flume-ng-node/src/test/java/org/apache/flume/source/TestNetcatSource.java
* /incubator/flume/branches/flume-728/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Fix For: v0.9.5
>
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (FLUME-722) RegexAllExtractor doesn't
ignore empty groups
Posted by "Mingjie Lai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079579#comment-13079579 ]
Mingjie Lai edited comment on FLUME-722 at 8/4/11 8:40 PM:
-----------------------------------------------------------
I cannot reproduce it with test cases -- TestExtractors.
But for a flume deployment, I have an flume node configured as:
<code>
tail ("/tmp/log.log") | regexAll( "(.+)\\t(.+)\\t(.+)\\t(.+)\\t(.+)", "", "license", "timestamp", "rating", "url") => text("/tmp/aa.txt")
<code>
what I saw is:
<code>
localhost [INFO Thu Aug 04 13:05:12 PDT 2011] { : 210.5.102.6 } { license : xxx } { rating : 38 } { tailSrcFile : (long)7382926376034922343 (string) 'full.log' (double)3.641231503360977E185 } { timestamp : 1304406740 } { url : xxx}
localhost [INFO Thu Aug 04 13:05:12 PDT 2011] { : 194.78.100.30 } { license : xxx } { rating : 41 } { tailSrcFile : (long)7382926376034922343 (string) 'full.log' (double)3.641231503360977E185 } { timestamp : 1304406740 } { url : xxx}
<code>
It's' an obvious java string comparison bug. So no test cases for this issue.
was (Author: mingjielai):
I cannot reproduce it with test cases -- TestExtractors.
But for a flume deployment, I have an flume node configured as:
<code>
tail ("/tmp/log.log") | regexAll( "(.+)\\t(.+)\\t(.+)\\t(.+)\\t(.+)", "", "license", "timestamp", "rating", "url") => text("/tmp/aa.txt")
</code>
what I saw is:
<code>
localhost [INFO Thu Aug 04 13:05:12 PDT 2011] { : 210.5.102.6 } { license : xxx } { rating : 38 } { tailSrcFile : (long)7382926376034922343 (string) 'full.log' (double)3.641231503360977E185 } { timestamp : 1304406740 } { url : xxx}
localhost [INFO Thu Aug 04 13:05:12 PDT 2011] { : 194.78.100.30 } { license : xxx } { rating : 41 } { tailSrcFile : (long)7382926376034922343 (string) 'full.log' (double)3.641231503360977E185 } { timestamp : 1304406740 } { url : xxx}
</code>
It's' an obvious java string comparison bug. So no test cases for this issue.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Nicholas Verbeck
> Priority: Minor
> Labels: decorator, regexall
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hsieh reassigned FLUME-722:
------------------------------------
Assignee: Mingjie Lai (was: Nicholas Verbeck)
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079876#comment-13079876 ]
Jonathan Hsieh commented on FLUME-722:
--------------------------------------
Ok, I've been able to cause the problem and it seems to come from the parsing of the config. Diving deeper.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079877#comment-13079877 ]
Jonathan Hsieh commented on FLUME-722:
--------------------------------------
Here's a test the fails before and succeeds after the fix. Fold this into TestExtractors please?
{code}
/**
* the parser step creates a "" that is not the canonical "" which requires equals vs == test
**/
@Test
public void testRegexAllExtractorEmptyProblem() throws IOException,
InterruptedException, FlumeSpecException {
final MemorySinkSource mem = new MemorySinkSource();
mem.open();
SinkFactoryImpl sf = new SinkFactoryImpl();
sf.setSink("mem", new SinkBuilder() {
@Override
public EventSink build(Context context, String... argv) {
return mem;
}
});
FlumeBuilder.setSinkFactory(sf);
RegexAllExtractor re = (RegexAllExtractor) FlumeBuilder.buildSink(
LogicalNodeContext.testingContext(), "regexAll(\"(.+)\\\\t(.+)\","
+ "\"\", \"keep\") mem");
re.open();
re.append(new EventImpl("ignoreme\tkeepme".getBytes()));
re.close();
mem.close();
mem.open();
Event e1 = mem.next();
assertEquals(null, Attributes.readString(e1, ""));
assertEquals("keepme", Attributes.readString(e1, "keep"));
}
{code}
Thanks,
Jon.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-722) RegexAllExtractor doesn't ignore empty
groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hsieh updated FLUME-722:
---------------------------------
Fix Version/s: v0.9.5
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Fix For: v0.9.5
>
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-722) RegexAllExtractor doesn't ignore empty
groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hsieh updated FLUME-722:
---------------------------------
Status: Patch Available (was: Open)
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Mingjie Lai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081283#comment-13081283 ]
Mingjie Lai commented on FLUME-722:
-----------------------------------
@jon Thanks for your comments.
I created https://review.cloudera.org/r/1891/ for easier review. And I also posted a new patch which includes the test case at the rb.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Fix For: v0.9.5
>
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081459#comment-13081459 ]
Jonathan Hsieh commented on FLUME-722:
--------------------------------------
looks good. Thanks Mingjie!
Still waiting on svn import so committing to github.com for now. Will close issue when pushed to apache svn.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Fix For: v0.9.5
>
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079867#comment-13079867 ]
Jonathan Hsieh commented on FLUME-722:
--------------------------------------
@Mingjie
Nice catch! I agree this is a bad code smell. Am I right to assume that after the patch, the { : 210.5.102.6 } and { :194.78.100.30 } parts are not present?
I'm really curious about how to duplicate this and why this happens and would really like to figure this out before we commit. I've tried to make a test case that would make an "" attribute show up but I can't seem to do it either. Can you give a scrubbed example line? Do they look like:
"1.2.3.4\txxxLicense\t1304406740\t42\txxxUrl"
Also, a style nit -- can you invert the equals expression to eliminate the possibility of a NPE? This could be done by changing:
!names.get(grp-1).equals("")
into
!"".equals(names.get(grp-1))
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Mingjie Lai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079579#comment-13079579 ]
Mingjie Lai commented on FLUME-722:
-----------------------------------
I cannot reproduce it with test cases -- TestExtractors.
But for a flume deployment, I have an flume node configured as:
<code>
tail ("/tmp/log.log") | regexAll( "(.+)\\t(.+)\\t(.+)\\t(.+)\\t(.+)", "", "license", "timestamp", "rating", "url") => text("/tmp/aa.txt")
</code>
what I saw is:
<code>
localhost [INFO Thu Aug 04 13:05:12 PDT 2011] { : 210.5.102.6 } { license : xxx } { rating : 38 } { tailSrcFile : (long)7382926376034922343 (string) 'full.log' (double)3.641231503360977E185 } { timestamp : 1304406740 } { url : xxx}
localhost [INFO Thu Aug 04 13:05:12 PDT 2011] { : 194.78.100.30 } { license : xxx } { rating : 41 } { tailSrcFile : (long)7382926376034922343 (string) 'full.log' (double)3.641231503360977E185 } { timestamp : 1304406740 } { url : xxx}
</code>
It's' an obvious java string comparison bug. So no test cases for this issue.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Nicholas Verbeck
> Priority: Minor
> Labels: decorator, regexall
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079883#comment-13079883 ]
Jonathan Hsieh commented on FLUME-722:
--------------------------------------
Actually, instead of the previous comment, this is a smaller and better change that better demonstrates the root cause. (second call to names.add different and added an assert).
Can you modify the existing function to be this instead?
{code}
@Test
public void testRegexAllExtractor() throws IOException, InterruptedException {
MemorySinkSource mem = new MemorySinkSource();
mem.open();
ArrayList<String> names = new ArrayList<String>();
names.add("d1");
// when parsed, a separate instance of "" is created
names.add(new String(new byte[0]));
names.add("d2");
RegexAllExtractor re = new RegexAllExtractor(mem, "(\\d):(\\d):(\\d)",
names);
re.open();
re.append(new EventImpl("1:2:3.4foobar5".getBytes()));
re.close();
mem.close();
mem.open();
Event e1 = mem.next();
assertEquals("1", Attributes.readString(e1, "d1"));
assertEquals(null, Attributes.readString(e1, ""));
assertEquals("3", Attributes.readString(e1, "d2"));
}
{code}
Thanks,
Jon.
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-722) RegexAllExtractor doesn't ignore
empty groups
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082202#comment-13082202 ]
jiraposter@reviews.apache.org commented on FLUME-722:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1448/
-----------------------------------------------------------
(Updated 2011-08-10 07:11:31.240611)
Review request for Flume and jmhsieh.
Changes
-------
Update do include link to jira issue
Summary
-------
Reposted for Mingjie Lai (was originally posted and reviewed on https://review.cloudera.org/r/1891/)
This addresses bug flume-722.
https://issues.apache.org/jira/browse/flume-722
Diffs
-----
/trunk/flume-core/src/main/java/com/cloudera/flume/core/extractors/RegexAllExtractor.java 1155987
/trunk/flume-core/src/test/java/com/cloudera/flume/core/extractors/TestExtractors.java 1155987
Diff: https://reviews.apache.org/r/1448/diff
Testing
-------
Ran test, it passed.
Thanks,
jmhsieh
> RegexAllExtractor doesn't ignore empty groups
> ---------------------------------------------
>
> Key: FLUME-722
> URL: https://issues.apache.org/jira/browse/FLUME-722
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.4
> Reporter: Nicholas Verbeck
> Assignee: Mingjie Lai
> Priority: Minor
> Labels: decorator, regexall
> Fix For: v0.9.5
>
> Attachments: FLUME-722.patch
>
>
> Hi flume devs.
> I saw a bug when using RegexAllExtractor: line 94:
> if(names.get(grp-1) != ""){
> Attributes.setString(e, names.get(grp-1), val);
> }
> Please help to file a jira and correct it to use String.equal(), otherwise it doesn't ignore empty groups.
> (I don't think I can open a issue at cloudera jira.)
> Thanks,
> Mingjie
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira