Posted to notifications@asterixdb.apache.org by "Ian Maxon (JIRA)" <ji...@apache.org> on 2016/02/21 09:09:18 UTC

[jira] [Created] (ASTERIXDB-1308) Query for calculating MAD of list fails to optimize

Ian Maxon created ASTERIXDB-1308:
------------------------------------

             Summary: Query for calculating MAD of list fails to optimize
                 Key: ASTERIXDB-1308
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1308
             Project: Apache AsterixDB
          Issue Type: Bug
          Components: AsterixDB, Optimizer
            Reporter: Ian Maxon
            Assignee: Yingyi Bu


This is a complicated query: it already does time binning, but now it also needs to include the median absolute deviation (MAD) summary statistic. My first crack at doing this was to create two functions:

create type HRMType as closed {
  row_id: int32,
  sid: int32,
  date: date,
  day: int32,
  time: time,
  bpm: int32,
  RR: float
};

declare function median($x){
if (count($x)%2 = 0) then avg([$x[count($x)],$x[count($x)-1]])
else $x[count($x)]
}
 
declare function MAD($x){
median(
for $xi in $x
let $dist := abs(median($x)-$xi)
return $dist
)
}
 
 
for $i in dataset HRM
group by $sid := $i.sid, $gdate := $i.date, $gday := $i.day, $timebin := interval-bin($i.time, time("00:00:00"), day-time-duration("PT15M")) with $i
return {
"sid": $sid,
"gdate": $gdate,
"gday": $gday,
"timebin": $timebin,
"stdv": (avg(for $ii in $i return $ii.RR * $ii.RR) - avg(for $ii in $i return $ii.RR) * avg(for $ii in $i return $ii.RR))^(0.5),
"MAD": MAD(for $ii in $i return $ii.RR)
};


But this query fails to optimize with error: "Could not infer type for variable '$$30'. [AlgebricksException]"
Any suggestions for a work-around would be welcome. Would writing everything without declaring functions perhaps get around this? 
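
For checking results once the query does run, here is a minimal Python sketch of the statistics the query is meant to produce. It assumes the conventional definitions: the median is taken over the sorted values (the middle element, or the mean of the two middle elements), MAD is the median of the absolute deviations from that median, and "stdv" is the population standard deviation computed as sqrt(E[x^2] - E[x]^2), as in the query. The function names mad and population_stddev are illustrative only and are not part of AsterixDB.

```python
from statistics import median


def mad(values):
    """Median absolute deviation: median of |x - median(values)|."""
    values = list(values)
    if not values:
        raise ValueError("mad() requires at least one value")
    center = median(values)  # statistics.median sorts its input internally
    return median(abs(x - center) for x in values)


def population_stddev(values):
    """Population standard deviation via sqrt(E[x^2] - E[x]^2)."""
    values = list(values)
    if not values:
        raise ValueError("population_stddev() requires at least one value")
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(x * x for x in values) / n
    # Clamp at zero to guard against tiny negative values from rounding.
    return max(mean_sq - mean * mean, 0.0) ** 0.5
```

Note that the AQL median function above neither sorts its input nor indexes the middle of the list, so its results would likely differ from this sketch even after the type inference error is resolved.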