Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2016/08/16 00:15:20 UTC
[jira] [Updated] (MADLIB-943) Path - multiple symbol matches per row
[ https://issues.apache.org/jira/browse/MADLIB-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan updated MADLIB-943:
-----------------------------------
Fix Version/s: (was: v1.9.1)
> Path - multiple symbol matches per row
> --------------------------------------
>
> Key: MADLIB-943
> URL: https://issues.apache.org/jira/browse/MADLIB-943
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Utilities
> Reporter: Frank McQuillan
> Attachments: Ecommerce data set for path test 3.csv, path-multi-symbol-per-row.ipynb, screenshot-1.png
>
>
> Story
> As a data scientist, I want to be able to define multiple symbols per row for pattern matching.
> See
> http://madlib.incubator.apache.org/docs/latest/group__grp__path.html
> for a description of what a symbol is.
> Currently in 1.9, a given row can match only one symbol. If a row matches multiple symbols, the symbol that comes first in the symbol definition list takes precedence. This story is about using all matching symbols on a row.
> Acceptance
> The attached data set and query should produce the following output (also shown in the attached screenshot):
> Event Timestamp | User ID | Age Group | Income Group | Gender | Region | Household Size | Click Event | Purchase Event | Revenue | Margin
> 4/14/12 23:43   | 102201  | 3         | 3            | Female | East   | 3              | 1           | 1              | 112     | 36
> 4/15/12 2:53    | 102201  | 3         | 3            | Female | East   | 3              | 1           | 1              | 117     | 28
> 4/15/12 8:51    | 102201  | 3         | 3            | Female | East   | 3              | 0           | 0              | 0       | 0
> 4/15/12 23:13   | 102201  | 3         | 3            | Female | East   | 3              | 0           | 0              | 0       | 0
> 4/16/12 4:20    | 102201  | 3         | 3            | Female | East   | 3              | 0           | 0              | 0       | 0
> 4/16/12 5:44    | 102201  | 3         | 3            | Female | East   | 3              | 1           | 0              | 0       | 0
> There are symbol matches for:
> Gender=Female
> Region=East
> Household Size=3
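
To make the difference between the 1.9 behavior and the requested behavior concrete, here is a minimal Python sketch. It is not MADlib code and does not use the path function's API. The symbol names (`female`, `east`, `hh3`), the row keys, and both helper functions are hypothetical, chosen only to mirror the three matches listed above.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Symbol = Tuple[str, Callable[[Row], bool]]

# Hypothetical symbol definitions, in definition order.
# The predicates mirror the three matches listed in the issue.
SYMBOLS: List[Symbol] = [
    ("female", lambda r: r["gender"] == "Female"),
    ("east", lambda r: r["region"] == "East"),
    ("hh3", lambda r: r["household_size"] == 3),
]


def first_match(row: Row, symbols: List[Symbol]) -> List[str]:
    """Behavior described for 1.9: the first matching symbol wins."""
    for name, predicate in symbols:
        if predicate(row):
            return [name]
    return []


def all_matches(row: Row, symbols: List[Symbol]) -> List[str]:
    """Requested behavior: every matching symbol on the row is kept."""
    return [name for name, predicate in symbols if predicate(row)]


# One row from the sample output (only the columns the symbols inspect).
row: Row = {
    "user_id": 102201,
    "gender": "Female",
    "region": "East",
    "household_size": 3,
}

print(first_match(row, SYMBOLS))  # ['female']
print(all_matches(row, SYMBOLS))  # ['female', 'east', 'hh3']
```

With first-match precedence, reordering the symbol definition list changes which single symbol a row receives. With all matches kept, the order only affects the order of the returned list.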