Posted to user@hive.apache.org by Roberto Coluccio <ro...@gmail.com> on 2014/07/22 12:04:51 UTC

Querying arrays of structs with regular expressions or like/rlike functions

Hello folks,

I am running some tests with Hive 0.12.0 (cdh5.0.3). My data model is quite
complex; in particular, I modeled a field in my table as an array of
structs, like:

people array<
           struct<
               name: string,
               surname: string,
               address: string,
               role: string,
               dateofbirth: int,
               id: string
           >
       >

I am able to query this field with the built-in UDF
"array_contains(Array<T>, value)", using a query like:

select * from table1 where array_contains(people.name, "Roberto");

What I have found is that this function performs an exact (1:1)
match/comparison, which is fine for some problems. But (how) can I apply
a regular expression to the sub-field name of my struct (inside my
array people) in order to retrieve, e.g., all the people whose name starts
with "Ro"? I know that Hive gives us the "like" and "rlike" functions, but
how can I apply them to a field inside a struct that is one of the elements
of an array?
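
To make the goal concrete, here is a minimal sketch of the kind of result I
am after. It assumes that LATERAL VIEW with explode() can flatten the people
array into one row per struct, so that rlike can be applied to each name;
the aliases p and person are hypothetical names chosen for illustration, and
the query is untested against the real table.

```sql
-- Sketch only: explode the array so each struct in "people" becomes a row,
-- then filter on the struct's "name" field with a regular expression.
-- "p" (lateral view alias) and "person" (column alias) are illustrative.
SELECT person.name, person.surname
FROM table1
LATERAL VIEW explode(people) p AS person
WHERE person.name RLIKE '^Ro';
```

Note that such a query would return one row per matching person rather than
one row per table1 record, which may or may not be acceptable here.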

Please, do not just tell me to change my data model: I'm already
considering this, but the problem is that my table is way more complex (it
is made of several more fields, arrays and arrays of structs).

Thank you.
Roberto