Posted to dev@commons.apache.org by arunvinudss <gi...@git.apache.org> on 2017/07/24 04:04:27 UTC

[GitHub] commons-text pull request #57: TEXT-98: Remove isDelimiter and use HashSets ...

GitHub user arunvinudss opened a pull request:

    https://github.com/apache/commons-text/pull/57

    TEXT-98: Remove isDelimiter and use HashSets for delimiter checks

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arunvinudss/commons-text TEXT-98

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/commons-text/pull/57.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #57
    
----
commit eabb18efa39b1fbebf66d46282d6abc3f9b2c7aa
Author: Arun Vinud <ar...@capitalone.com>
Date:   2017-07-23T14:57:37Z

    Remove isDelimiter and using HashSets for delimiter checks

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by arunvinudss <gi...@git.apache.org>.
Github user arunvinudss commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    @ameyjadiye I want to remove the isDelimiter method. I would be surprised if anyone uses isDelimiter on its own, because all it does is check whether a given element is present in an array. Moreover, the char version of isDelimiter is already dead code, as we don't use it anymore. I would say isDelimiter should have been private in the first place.
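
A minimal sketch of the two lookup styles discussed here, for illustration only. `isDelimiterScan` mirrors the removed `isDelimiter(char, char[])` shown in the quoted diff; `toDelimiterSet` is a hypothetical stand-in for the PR's private `generateDelimiterSet`, whose full body is not visible in this thread, so its loop is an assumption. The class name is also made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: not the code merged into commons-text.
final class DelimiterLookupSketch {

    // Linear scan over the array on every call, as in the removed isDelimiter(char, char[]).
    static boolean isDelimiterScan(final char ch, final char[] delimiters) {
        if (delimiters == null) {
            return Character.isWhitespace(ch);
        }
        for (final char delimiter : delimiters) {
            if (ch == delimiter) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: build the set once, then use contains() for each character.
    // As in the quoted diff, null maps to a set holding only the space code point (32)
    // and an empty array maps to an empty set.
    static Set<Integer> toDelimiterSet(final char[] delimiters) {
        final Set<Integer> set = new HashSet<>();
        if (delimiters == null) {
            set.add((int) ' ');
            return set;
        }
        // Assumption: walk the array code point by code point.
        for (int i = 0; i < delimiters.length; ) {
            final int codePoint = Character.codePointAt(delimiters, i);
            set.add(codePoint);
            i += Character.charCount(codePoint);
        }
        return set;
    }

    public static void main(final String[] args) {
        final char[] delimiters = {'-', '_'};
        final Set<Integer> set = toDelimiterSet(delimiters);
        System.out.println(isDelimiterScan('-', delimiters));
        // Cast to int so the lookup boxes to Integer rather than Character.
        System.out.println(set.contains((int) '-'));
        // With null delimiters the two styles differ for a tab character:
        // the scan accepts any whitespace, the set holds only the space.
        System.out.println(isDelimiterScan('\t', null));
        System.out.println(toDelimiterSet(null).contains((int) '\t'));
    }
}
```

Going by the quoted diff alone, the null case is not equivalent between the two: the old method accepts any character for which `Character.isWhitespace` is true, while the set contains only the space.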




[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by chtompki <gi...@git.apache.org>.
Github user chtompki commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    This all raises the question of going to `2.x`. I think we have a couple of things that would warrant a 2.x move. Do we want to attempt that in the fall, or is it still premature?



[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by chtompki <gi...@git.apache.org>.
Github user chtompki commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    Will get to this today.




[GitHub] commons-text pull request #57: TEXT-98: Remove isDelimiter and use HashSets ...

Posted by ameyjadiye <gi...@git.apache.org>.
Github user ameyjadiye commented on a diff in the pull request:

    https://github.com/apache/commons-text/pull/57#discussion_r129102692
  
    --- Diff: src/main/java/org/apache/commons/text/WordUtils.java ---
    @@ -747,45 +750,29 @@ public static boolean containsAllWords(final CharSequence word, final CharSequen
             return true;
         }
     
    -    //-----------------------------------------------------------------------
    +    // -----------------------------------------------------------------------
         /**
    -     * Is the character a delimiter.
    +     * <p>
    +     * Converts an array of delimiters to a hash set of code points. Code point of space(32) is added as the default
    +     * value if delimiters is null. The generated hash set provides O(1) lookup time.
    +     * </p>
          *
    -     * @param ch  the character to check
    -     * @param delimiters  the delimiters
    -     * @return true if it is a delimiter
    +     * @param delimiters set of characters to determine capitalization, null means whitespace
    +     * @return Set<Integer>
          */
    -    public static boolean isDelimiter(final char ch, final char[] delimiters) {
    -        if (delimiters == null) {
    -            return Character.isWhitespace(ch);
    -        }
    -        for (final char delimiter : delimiters) {
    -            if (ch == delimiter) {
    -                return true;
    +    private static Set<Integer> generateDelimiterSet(final char[] delimiters) {
    +        Set<Integer> delimiterHashSet = new HashSet<>();
    +        if (delimiters == null || delimiters.length == 0) {
    +            if (delimiters == null) {
    +                delimiterHashSet.add(Character.codePointAt(new char[] {' '}, 0));
                 }
    +            return delimiterHashSet;
             }
    -        return false;
    -    }
     
    -  //-----------------------------------------------------------------------
    -    /**
    -     * Is the codePoint a delimiter.
    -     *
    -     * @param codePoint the codePint to check
    -     * @param delimiters  the delimiters
    -     * @return true if it is a delimiter
    -     */
    -    public static boolean isDelimiter(final int codePoint, final char[] delimiters) {
    --- End diff --
    
    Rather than removing this method, we should keep it.




[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by PascalSchumacher <gi...@git.apache.org>.
Github user PascalSchumacher commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    @arunvinudss While I agree that `isDelimiter` should have been private, it is public and was released with commons-text `1.1`. Due to the strict binary compatibility promise of Commons, it cannot be removed before `2.0`. For now, the best we can do is mark it as deprecated and explain that it will be removed in version `2.0`.
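
A rough sketch of what keeping the method as deprecated could look like. The method body is copied from the removed lines in the quoted diff; the class name and the Javadoc wording are assumptions made for this example, not the text that was committed.

```java
// Illustrative stand-in for WordUtils; only the char overload is shown because
// the body of the int overload is not visible in the quoted diff.
final class WordUtilsDeprecationSketch {

    /**
     * Is the character a delimiter.
     *
     * @param ch  the character to check
     * @param delimiters  the delimiters
     * @return true if it is a delimiter
     * @deprecated kept for binary compatibility only; proposed for removal in 2.0
     */
    @Deprecated
    public static boolean isDelimiter(final char ch, final char[] delimiters) {
        if (delimiters == null) {
            return Character.isWhitespace(ch);
        }
        for (final char delimiter : delimiters) {
            if (ch == delimiter) {
                return true;
            }
        }
        return false;
    }

    public static void main(final String[] args) {
        // Existing callers keep compiling and linking; they only see a deprecation warning.
        System.out.println(isDelimiter('-', new char[] {'-'}));
    }
}
```

The signature and behavior stay untouched, so code compiled against `1.1` would continue to link, which is the point of deprecating instead of deleting.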




[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by arunvinudss <gi...@git.apache.org>.
Github user arunvinudss commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    @chtompki Please review and merge.




[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by coveralls <gi...@git.apache.org>.
Github user coveralls commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    
    [![Coverage Status](https://coveralls.io/builds/12640859/badge)](https://coveralls.io/builds/12640859)
    
    Coverage decreased (-0.2%) to 98.021% when pulling **fb6d5935451397c561bd52cf1d483975f83b2c7b on arunvinudss:TEXT-98** into **998764ebe38113eb51e6850058ca01936625dd92 on apache:master**.





[GitHub] commons-text pull request #57: TEXT-98: Remove isDelimiter and use HashSets ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/commons-text/pull/57




[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by ameyjadiye <gi...@git.apache.org>.
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    @arunvinudss, just an addition to @PascalSchumacher's comment: at this point we don't know who else, other than Commons Text, depends on `isDelimiter`, so it is better to deprecate it now and remove the code altogether in 2.x.



[GitHub] commons-text pull request #57: TEXT-98: Remove isDelimiter and use HashSets ...

Posted by ameyjadiye <gi...@git.apache.org>.
Github user ameyjadiye commented on a diff in the pull request:

    https://github.com/apache/commons-text/pull/57#discussion_r129102634
  
    --- Diff: src/main/java/org/apache/commons/text/WordUtils.java ---
    @@ -747,45 +750,29 @@ public static boolean containsAllWords(final CharSequence word, final CharSequen
             return true;
         }
     
    -    //-----------------------------------------------------------------------
    +    // -----------------------------------------------------------------------
         /**
    -     * Is the character a delimiter.
    +     * <p>
    +     * Converts an array of delimiters to a hash set of code points. Code point of space(32) is added as the default
    +     * value if delimiters is null. The generated hash set provides O(1) lookup time.
    +     * </p>
          *
    -     * @param ch  the character to check
    -     * @param delimiters  the delimiters
    -     * @return true if it is a delimiter
    +     * @param delimiters set of characters to determine capitalization, null means whitespace
    +     * @return Set<Integer>
          */
    -    public static boolean isDelimiter(final char ch, final char[] delimiters) {
    --- End diff --
    
    Rather than removing this method, we should keep it.




[GitHub] commons-text issue #57: TEXT-98: Remove isDelimiter and use HashSets for del...

Posted by ameyjadiye <gi...@git.apache.org>.
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/57
  
    @chtompki, I think the items piled up for 2.x are not too critical, so we should hold off on a 2.x release.
    If we release some major improvement or fix, we can ship all the queued items with it. At the moment I don't see anything like that.

