Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2011/03/05 18:56:46 UTC

[jira] Created: (PIG-1885) SUBSTRING fails when input length less than start

SUBSTRING fails when input length less than start
-------------------------------------------------

                 Key: PIG-1885
                 URL: https://issues.apache.org/jira/browse/PIG-1885
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0, 0.9.0
            Reporter: Alan Gates
            Priority: Minor


SUBSTRING throws an error if it gets a string which has a length less than its start value.  For example, SUBSTRING(x, 100, 120) will fail with any chararray of length less than 100.  It should return null instead.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Kumar V updated PIG-1885:
--------------------------------

    Attachment: PIG-1885.txt


[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004136#comment-13004136 ] 

Alan Gates commented on PIG-1885:
---------------------------------

Changes look good.  One thought I had is that, rather than paying the cost of three ifs up front, we could still call String.substring as we do today and catch the StringIndexOutOfBoundsException, and only then figure out what went wrong and give an error message.  That way you only pay the cost of the checks once (since String.substring will do them anyway).  Up to you.

You will need to add unit tests before we can commit it.  Add them to the existing TestStringUDFs class.  A test should be added for each of the error conditions you check for.
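
The catch-based alternative suggested above might look roughly like the following. This is only a sketch: the class and method names (SubstringCatchSketch, substringOrNull) are hypothetical, the Math.min clamp is borrowed from the snippet proposed elsewhere in this thread, and returning null for a null input is an assumption rather than something taken from SUBSTRING.java.

```java
// Hypothetical helper, not Pig's actual SUBSTRING.java: take the fast path
// first and only react to bad arguments if String.substring rejects them.
final class SubstringCatchSketch {

    /** Returns the substring, or null when the indexes cannot be satisfied. */
    static String substringOrNull(String source, int beginIndex, int endIndex) {
        if (source == null) {
            return null; // assumption: a null input yields a null result
        }
        try {
            // String.substring performs its own bounds checks, so the common
            // case pays for them only once.
            return source.substring(beginIndex, Math.min(source.length(), endIndex));
        } catch (StringIndexOutOfBoundsException e) {
            // Slow path: a real UDF could work out here which argument was
            // bad and report it before returning null.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(substringOrNull("abcde", 0, 3));
        System.out.println(substringOrNull("abcde", 100, 120));
    }
}
```

The trade-off is that valid input skips the extra comparisons, while invalid input pays for constructing and catching an exception.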


[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006561#comment-13006561 ] 

Alan Gates commented on PIG-1885:
---------------------------------

     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

Unit tests pass


[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005789#comment-13005789 ] 

Deepak Kumar V commented on PIG-1885:
-------------------------------------

Please review.


[jira] Updated: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-1885:
----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Patch committed.  Thanks Deepak.


[jira] Assigned: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates reassigned PIG-1885:
-------------------------------

    Assignee: Deepak Kumar V


[jira] Updated: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Kumar V updated PIG-1885:
--------------------------------

    Release Note: 
Fix has been tested with 0.9.0 source. (http://svn.apache.org/repos/asf/pig/trunk/)
JUnit test case included.
Created individual test cases for each scenario. This way it is clear that SUBSTRING has been tested with different scenarios, and the same is reflected in the test reports. With all the scenarios inside a single test case, the reports would not show the various scenarios that the API has been tested with.

  was:Fix has been tested with 0.9.0 source.

          Status: Patch Available  (was: Open)
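
The one-test-per-scenario layout described in the release note could be sketched as below. Plain Java is used so the example runs without JUnit; the class name, the method names, and the stand-in substringOrNull function are all hypothetical and are not taken from TestStringUDFs or the attached patch.

```java
final class SubstringScenarioChecks {

    // Stand-in for the UDF under test (hypothetical; it mirrors the guard
    // proposed in this thread, with an assumed null check added).
    static String substringOrNull(String s, int begin, int end) {
        if (s == null || begin < 0 || begin > s.length() || begin > end) {
            return null;
        }
        return s.substring(begin, Math.min(s.length(), end));
    }

    // One small, named check per scenario, so a report lists each by name.
    static boolean negativeBeginReturnsNull() {
        return substringOrNull("abcde", -1, 3) == null;
    }

    static boolean beginPastLengthReturnsNull() {
        return substringOrNull("abcde", 8, 10) == null;
    }

    static boolean beginAfterEndReturnsNull() {
        return substringOrNull("abcde", 2, 0) == null;
    }

    static boolean negativeEndReturnsNull() {
        return substringOrNull("abcde", 0, -2) == null;
    }

    static boolean endPastLengthIsClamped() {
        return "abcde".equals(substringOrNull("abcde", 0, 9));
    }

    /** Runs every scenario and records its outcome under its own name. */
    static java.util.Map<String, Boolean> runAll() {
        java.util.Map<String, Boolean> report = new java.util.LinkedHashMap<>();
        report.put("negativeBeginReturnsNull", negativeBeginReturnsNull());
        report.put("beginPastLengthReturnsNull", beginPastLengthReturnsNull());
        report.put("beginAfterEndReturnsNull", beginAfterEndReturnsNull());
        report.put("negativeEndReturnsNull", negativeEndReturnsNull());
        report.put("endPastLengthIsClamped", endPastLengthIsClamped());
        return report;
    }

    public static void main(String[] args) {
        runAll().forEach((name, ok) ->
                System.out.println((ok ? "PASS " : "FAIL ") + name));
    }
}
```

With this layout a failure points directly at the scenario that broke, which is the benefit the release note argues for.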


[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003968#comment-13003968 ] 

Deepak Kumar V commented on PIG-1885:
-------------------------------------

There are a few more cases that need to be handled.

a) beginindex is negative
Current Behavior: SUBSTRING throws ExecException because of java.lang.StringIndexOutOfBoundsException

b) beginindex > endindex
Current Behavior: SUBSTRING throws ExecException because of java.lang.StringIndexOutOfBoundsException

c) beginindex > string length -- bug reported
Current Behavior: SUBSTRING throws ExecException because of java.lang.StringIndexOutOfBoundsException

d) endindex is negative
Current Behavior: SUBSTRING throws ExecException because of java.lang.StringIndexOutOfBoundsException


SUBSTRING.java can be modified to return null in all of the above cases as follows:
            if (beginindex < 0 || beginindex > source.length() || beginindex > endindex) {
                return null;
            } else {
                return source.substring(beginindex, Math.min(source.length(), endindex));
            }
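
For reference, the guard above can be exercised outside Pig with a small standalone program. This is a sketch under assumptions: SubstringGuardSketch and substringOrNull are hypothetical names, the indexes are plain ints rather than fields read from a Pig tuple, and the null-input check is an addition not shown in the snippet.

```java
final class SubstringGuardSketch {

    /** Returns null instead of throwing when the indexes are unusable. */
    static String substringOrNull(String source, int beginIndex, int endIndex) {
        if (source == null) {
            return null; // assumption: null in, null out
        }
        // One guard covers cases (a) to (d): a negative endIndex is always
        // smaller than a non-negative beginIndex, so it trips the last test.
        if (beginIndex < 0 || beginIndex > source.length() || beginIndex > endIndex) {
            return null;
        }
        // Clamp endIndex so that an end past the string is not an error.
        return source.substring(beginIndex, Math.min(source.length(), endIndex));
    }

    public static void main(String[] args) {
        // Cases (a), (b), (c), (d), followed by one normal call.
        int[][] cases = { {-1, 3}, {2, 0}, {100, 120}, {0, -2}, {0, 3} };
        for (int[] c : cases) {
            System.out.println("SUBSTRING(abcde, " + c[0] + ", " + c[1] + ") -> "
                    + substringOrNull("abcde", c[0], c[1]));
        }
    }
}
```

The first four calls correspond to cases (a) through (d) and yield null; the last one is the normal case and yields "abc".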


If no one is working on this defect, please reassign it to me.





[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005904#comment-13005904 ] 

Alan Gates commented on PIG-1885:
---------------------------------

Patch looks good.  I'll run the unit tests and test_patch.


[jira] Updated: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Kumar V updated PIG-1885:
--------------------------------

    Status: Open  (was: Patch Available)


[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006795#comment-13006795 ] 

Deepak Kumar V commented on PIG-1885:
-------------------------------------

Yee-Haw
1st Commit.

@Alan
Can you review https://issues.apache.org/jira/browse/PIG-671 


[jira] Updated: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Kumar V updated PIG-1885:
--------------------------------

    Fix Version/s: 0.9.0
     Release Note: Fix has been tested with 0.9.0 source.
           Status: Patch Available  (was: Open)

The following scenarios were tested:

grunt> a = load 'substrinput.txt' as (data:chararray);   
grunt> dump a;                                        
(abcde)
()
(fghi)

1) beginindex == endindex
grunt> a = load 'substrinput.txt' as (data:chararray);
grunt> b = foreach a generate SUBSTRING(data,0,0);    
grunt> dump b;
Output(s):
Successfully stored records in: "file:/tmp/temp1752997904/tmp55103794"
()
()
()

2) beginindex < endindex  - normal scenario 
grunt> b = foreach a generate SUBSTRING(data,0,3);
grunt> dump b;                                    
Output(s):
Successfully stored records in: "file:/tmp/temp1752997904/tmp-314465179"
(abc)
()
(fgh)

3) beginindex is -ve
grunt> b = foreach a generate SUBSTRING(data,-1,3);
grunt> dump b;                                     
Output(s):
Successfully stored records in: "file:/tmp/temp1752997904/tmp286819157"
()
()
()

4) beginindex > String length.
grunt> b = foreach a generate SUBSTRING(data,8,3); 
grunt> dump b;                                    
Output(s):
Successfully stored records in: "file:/tmp/temp1752997904/tmp-900213360"
()
()
()

5) beginindex > endindex
grunt> b = foreach a generate SUBSTRING(data,2,0);
grunt> dump b;                                    
Output(s):
Successfully stored records in: "file:/tmp/temp1752997904/tmp-834743686"
()
()
()

6) beginindex is valid, endindex > string length
grunt> b = foreach a generate SUBSTRING(data,0,9);
grunt> dump b;
Output(s):
Successfully stored records in: "file:/tmp/temp1752997904/tmp243973324"
(abcde)
()
(fghi)

7) beginindex is valid and endindex is negative
grunt> b = foreach a generate SUBSTRING(data,0,-2);   
grunt> dump b;
Output(s):
Successfully stored records in: "file:/tmp/temp970048830/tmp1979923245"
()
()
()
grunt> quit



[jira] Updated: (PIG-1885) SUBSTRING fails when input length less than start

Posted by "Deepak Kumar V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Kumar V updated PIG-1885:
--------------------------------

    Attachment: PIG-1885.txt
