Posted to dev@climate.apache.org by Michael Anderson <mi...@gmail.com> on 2017/11/26 23:40:06 UTC

Re: data_processor.mask_missing_data - What is the definition of a missing value?

Apologies, the method is in data_process, not utils, as incorrectly stated
in the header of the previous email.

On Sun, Nov 26, 2017 at 6:29 PM, Michael Anderson <
michael.arthur.anderson@gmail.com> wrote:

> I'm working on https://issues.apache.org/jira/browse/CLIMATE-797.
>
> The comments in the method in question state:
>
> If any of dataset in dataset_array has missing values at a grid point,
> the values at the grid point in all other datasets are masked.
>
>
> The problem here is that the method assumes a masked array is passed as an input.
>
> If a regular NumPy array (e.g. an OCW dataset) is passed, it does not have a mask attribute and an error is thrown.
>
>
> 1.  I could tidy up the error handling to make it clearer to the caller that a masked array was expected.
>
>
> 2.  I could check if a mask exists and use that.  If no mask is supplied, I could carry out the intent of the function and manually check the array for "missing values".  Other than None or NaN, are there any other values that by convention denote a missing value?  The netCDF default fill values, for example?
>
>
> Preferences on the approach and / or suggestions on the second approach?
>
>
> Thanks,
>
>
> Michael A. Anderson
>
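
For reference, a minimal sketch of the second option above: use the mask when one exists, otherwise build one by checking for missing values. This is not the existing OCW code. The helper names (as_masked, mask_missing_everywhere) and the NC_FILL_DOUBLE constant are hypothetical, the inputs are assumed to be plain arrays of identical shape, and treating the netCDF default float fill value as "missing" is an assumption that is still an open question in this thread.

```python
import numpy as np
import numpy.ma as ma

# Assumption: the netCDF default fill value for float/double data.
# Whether this should count as "missing" is the open question in this thread.
NC_FILL_DOUBLE = 9.969209968386869e36


def as_masked(values, fill_values=(NC_FILL_DOUBLE,)):
    """Hypothetical helper: return `values` as a masked array.

    An existing mask is used unchanged. Otherwise None, NaN, and any of
    `fill_values` are treated as missing.
    """
    if ma.isMaskedArray(values):
        return values
    # Converting to float turns None into NaN.
    data = np.asarray(values, dtype=float)
    mask = np.isnan(data)
    for fill in fill_values:
        mask |= np.isclose(data, fill)
    return ma.masked_array(data, mask=mask)


def mask_missing_everywhere(arrays):
    """Hypothetical helper: mask each grid point that is missing in any input.

    Assumes all inputs have the same shape.
    """
    masked = [as_masked(a) for a in arrays]
    common = np.zeros(masked[0].shape, dtype=bool)
    for m in masked:
        common |= ma.getmaskarray(m)
    # Copy the mask so the returned arrays do not share one mask object.
    return [ma.masked_array(m.data, mask=common.copy()) for m in masked]


if __name__ == "__main__":
    a = np.array([1.0, np.nan, 3.0, 4.0])                     # plain array with NaN
    b = ma.masked_array([1.0, 2.0, 3.0, 4.0],
                        mask=[False, False, True, False])    # already masked
    c = [1.0, 2.0, 3.0, NC_FILL_DOUBLE]                       # fill-value sentinel
    for result in mask_missing_everywhere([a, b, c]):
        print(result)
```

Taking fill_values as a parameter keeps the definition of "missing" in the caller's hands rather than hard-coding a convention. Note that an input that already carries a mask is trusted as-is, so unmasked NaNs inside it are not detected.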