Posted to notifications@asterixdb.apache.org by "Ian Maxon (JIRA)" <ji...@apache.org> on 2016/02/21 09:09:18 UTC
[jira] [Created] (ASTERIXDB-1308) Query for calculating MAD of list fails to optimize
Ian Maxon created ASTERIXDB-1308:
------------------------------------
Summary: Query for calculating MAD of list fails to optimize
Key: ASTERIXDB-1308
URL: https://issues.apache.org/jira/browse/ASTERIXDB-1308
Project: Apache AsterixDB
Issue Type: Bug
Components: AsterixDB, Optimizer
Reporter: Ian Maxon
Assignee: Yingyi Bu
This is a complicated query: it already does time binning, but it now also needs to include the Median Absolute Deviation (MAD) summary statistic. My first crack at this was to create two functions, median and MAD:
create type HRMType as closed {
    row_id: int32,
    sid: int32,
    date: date,
    day: int32,
    time: time,
    bpm: int32,
    RR: float
};
declare function median($x){
    if (count($x)%2 = 0) then avg([$x[count($x)],$x[count($x)-1]])
    else $x[count($x)]
}
declare function MAD($x){
    median(
        for $xi in $x
        let $dist := abs(median($x)-$xi)
        return $dist
    )
}
for $i in dataset HRM
group by $sid := $i.sid, $gdate := $i.date, $gday := $i.day, $timebin := interval-bin($i.time, time("00:00:00"), day-time-duration("PT15M")) with $i
return {
    "sid": $sid,
    "gdate": $gdate,
    "gday": $gday,
    "timebin": $timebin,
    "stdv": (avg(for $ii in $i return $ii.RR * $ii.RR) - avg(for $ii in $i return $ii.RR) * avg(for $ii in $i return $ii.RR))^(0.5),
    "MAD": MAD(for $ii in $i return $ii.RR)
};
However, this query fails during optimization with the error: "Could not infer type for variable '$$30'. [AlgebricksException]"
Any suggestions for a work-around would be welcome. Would writing everything without declaring functions perhaps get around this?
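For reference, here is a minimal Python sketch of what the two functions and the "stdv" field are intended to compute for one group's RR values. It is not AQL and says nothing about the optimizer error. The function names and sample values are hypothetical and chosen only for illustration. The sketch sorts the values and takes the middle element(s), whereas the median function in the query above indexes at count($x) without sorting, so that may need revisiting separately from the type-inference failure.

```python
import math


def median(values):
    """Median of a non-empty sequence: middle of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 0:
        # Even length: average the two middle elements.
        return (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[mid]


def mad(values):
    """Median of the absolute distances of each value from the median."""
    center = median(values)
    return median([abs(center - v) for v in values])


def population_stdev(values):
    """sqrt(E[x^2] - E[x]^2), the same form as the "stdv" field in the query."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(mean_sq - mean * mean, 0.0))


if __name__ == "__main__":
    rr = [1, 1, 2, 2, 4, 6, 9]  # hypothetical RR values for one time bin
    print(median(rr), mad(rr), population_stdev(rr))
```

Computing the median once per call to mad, rather than once per element as in the AQL version, also avoids repeating the same work for every value in the group.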