Posted to dev@griffin.apache.org by "Spencer Hivert (JIRA)" <ji...@apache.org> on 2018/08/01 17:58:00 UTC

[jira] [Commented] (GRIFFIN-164) Make 'Regular expression detection count' available in UI

    [ https://issues.apache.org/jira/browse/GRIFFIN-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565721#comment-16565721 ] 

Spencer Hivert commented on GRIFFIN-164:
----------------------------------------

I agree, this feature would be quite useful!

I'm working on this right now!



> Make 'Regular expression detection count' available in UI
> ---------------------------------------------------------
>
>                 Key: GRIFFIN-164
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-164
>             Project: Griffin (Incubating)
>          Issue Type: Improvement
>    Affects Versions: 0.1.6-incubating
>            Reporter: Enrico D'Urso
>            Priority: Minor
>             Fix For: 1.0.0-incubating
>
>
> Hi,
> I have been playing with Griffin for one month now.
> In my experience, some companies (including the one I am working for as a consultant) prefer doing things through the UI.
> Personally, I find the following feature very useful:
>  
>  * Regular expression detection count
> That is, I have a column that should contain only numbers, and I want to check whether my ETL process has wrongly populated my table with non-numeric values.
> I have been able to run such a job by creating the right config.json myself, in particular using spark-sql as the dialect:
> {code:java}
> select count(*) from src where account_id rlike '[^0-9]'
> {code}
> I saw that in pr.component.ts there is a commented line of code:
> {code:java}
> // {"id":10,"itemName":"Regular Expression Detection Count","category": "Advanced Statistics"}
> {code}
> which I think is what I am talking about.
> Also, I can read:
> {code:java}
> // case 'Regular Expression Detection Count':
> //   return 'count(source.`'+col.name+'`) where source.`'+col.name+'` LIKE ';
> {code}
> which should be the griffin-dsl dialect, although the regex should probably be added just after LIKE.
> Then, once the above griffin-dsl statement is available in the backend, the ProfilingRulePlanTrans class should map it to the Spark SQL 'rlike' clause.
> I am not sure where (or whether) ProfilingRulePlanTrans should be modified, as preGroupbyClause should contain everything, but I do not know enough about it.
>  
> Please judge the priority of this feature yourself; for someone who knows the code well, it should not be too hard to implement.
> Thanks,
>  
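
For illustration only, here is a rough sketch of how the commented-out case quoted above might append the user's regex to the rule text. The function name regexDetectionCountRule is hypothetical and does not exist in Griffin. The sketch emits RLIKE directly, which is an assumption: the ticket only shows LIKE in the UI code and suggests the backend would do the mapping to 'rlike'. The escaping assumes Spark SQL's default string-literal rules (backslash escapes).

```typescript
// Hypothetical helper, not part of Griffin: builds the rule text for
// 'Regular Expression Detection Count' from a column name and a regex.
function regexDetectionCountRule(columnName: string, pattern: string): string {
  // Escape backslashes first, then single quotes, so the pattern stays
  // inside one single-quoted SQL string literal (assumed Spark SQL defaults).
  const escaped = pattern.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  // Same column reference shape as the commented-out line in pr.component.ts.
  const column = "source.`" + columnName + "`";
  // Assumption: RLIKE is written directly instead of LIKE.
  return "count(" + column + ") where " + column + " RLIKE '" + escaped + "'";
}

// Example: rule text for the non-numeric check described in the ticket.
const exampleRule = regexDetectionCountRule("account_id", "[^0-9]");
```

Building the column reference once keeps the two occurrences in sync, and escaping the pattern in the UI avoids a broken rule when the regex contains a quote or a backslash.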



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)