Posted to user@drill.apache.org by Yun Liu <y....@castsoftware.com> on 2017/11/02 14:46:12 UTC

Drill Capacity

Hi,

I've been using Apache Drill actively and am just wondering what the capacity of Drill is. I have a JSON file which is 390MB and it keeps throwing a DATA_READ ERROR. I have another JSON file with the exact same format but only 150MB, and it processes fine. When I did a *select* on the large JSON file, it returned successfully for some of the fields. None of these errors really apply to me. So I am trying to understand the maximum JSON file size Drill supports, or if there's something else I missed.

Thanks,

Yun Liu
Solutions Delivery Consultant
321 West 44th St | Suite 501 | New York, NY 10036
+1 212.871.8355 office | +1 646.752.4933 mobile

CAST, Leader in Software Analysis and Measurement
Achieve Insight. Deliver Excellence.
Join the discussion http://blog.castsoftware.com/
LinkedIn<http://www.linkedin.com/companies/162909> | Twitter<http://twitter.com/onquality> | Facebook<http://www.facebook.com/pages/CAST/105668942817177>

Re: Drill Capacity

Posted by Prasad Nagaraj Subramanya <pr...@gmail.com>.

Hi Yun,

Drill is designed to query large datasets. There is no specific limit on
the size; it works well even when the data is in the hundreds of GBs.

DATA_READ ERROR has something to do with the data in your file. The data in
some of the columns may not be consistent with the datatype.
Please refer to this link for one such example -
https://stackoverflow.com/questions/40217328/apache-drill-mysql-and-data-read-error-failure-while-attempting-to-read-from
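
One way to look for this kind of inconsistency before querying is to scan the file and list every field whose values show up with more than one JSON type (for example, a number that is an integer in one record and a float in another). The following is only a minimal Python sketch under stated assumptions: the function names are made up for illustration, the records are assumed to fit in memory, and it is not part of Drill. If mixed integers and floats turn out to be the cause, Drill's `store.json.read_numbers_as_double` option may be worth trying.

```python
import json
from collections import defaultdict


def json_type(value):
    """Return a coarse JSON type name for a scalar or container value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def collect_types(value, path, seen):
    """Walk one record and note the type of every scalar under its dotted path."""
    if isinstance(value, dict):
        for key, child in value.items():
            collect_types(child, f"{path}.{key}" if path else key, seen)
    elif isinstance(value, list):
        for child in value:
            collect_types(child, path + "[]", seen)
    else:
        seen[path].add(json_type(value))


def mixed_type_fields(records):
    """Return {path: [types]} for every field seen with more than one type."""
    seen = defaultdict(set)
    for record in records:
        collect_types(record, "", seen)
    return {path: sorted(types) for path, types in seen.items() if len(types) > 1}


# Tiny illustrative input (not real data): "grade" is 2 in one record, 4.0 in another.
sample = json.loads('[{"result": {"grade": 2}}, {"result": {"grade": 4.0}}]')
print(mixed_type_fields(sample))
```

For a real file, the list passed to `mixed_type_fields` would come from `json.load` on the opened file instead of the inline sample.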

Thanks,
Prasad


RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Yes, I guess breaking them into smaller files will solve this.

Thanks!
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com] 
Sent: Friday, November 3, 2017 5:40 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity



I have seen a use case where a query fails on a single 12 GB JSON file with the structure '{ "key":[obj1, obj2, obj3..objn]}'. Here the JSON file has one key element whose value is an array of JSON objects 'obj'. There were around 175K objects in this array, and each obj is itself a complex JSON object with nested array elements. From what I understood, Drill reads the entire file content as a single JSON record (which it actually is) and fails with DATA_READ ERROR.


The solution was to reorganize the data into either of the following structures, and also to break the single file into multiple smaller files for better parallelism.


Structure 2: File has an array of JSON objects like below
[ {obj1},{obj2}..,{objn}]

Structure 3: File has JSON objects, one after another, as below
{obj1}
{obj2}
..
{objn}
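
As a rough illustration of that reorganization, the Python sketch below rewrites a file of the form { "key":[obj1..objn] } into several smaller files in the "Structure 3" layout (one JSON object per line). The function name, the `part-00000.json` naming, and the `records_per_file` default are assumptions made for this example, not anything Drill requires. It also loads the whole input into memory, so for a file as large as 12 GB a streaming parser would likely be needed instead.

```python
import json
from pathlib import Path


def split_to_json_lines(source_path, out_dir, key, records_per_file=50000):
    """Split {"key": [obj1, ..., objn]} into files holding one object per line.

    Returns the list of output paths that were written.
    """
    # Assumption: the input fits in memory; json.load reads it all at once.
    with open(source_path, encoding="utf-8") as f:
        records = json.load(f)[key]

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for start in range(0, len(records), records_per_file):
        # Hypothetical naming scheme: part-00000.json, part-00001.json, ...
        part = out_dir / f"part-{start // records_per_file:05d}.json"
        with open(part, "w", encoding="utf-8") as out:
            for record in records[start:start + records_per_file]:
                # json.dumps emits each object on a single line by default.
                out.write(json.dumps(record) + "\n")
        written.append(part)
    return written


# Example call (paths and key name are placeholders):
# split_to_json_lines("big.json", "big_split", "key", records_per_file=10000)
```

Pointing the query at the output directory rather than at a single file should then let the smaller files be read in parallel.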


I was checking if this is the case here..


Thanks,


Arjun


________________________________
From: Yun Liu <y....@castsoftware.com>
Sent: Saturday, November 4, 2017 2:27 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Arjun,

Column 4 has the most data and is a bit long here. The other 3 columns have maybe a word or two. Thanks for your patience.

[ {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid unreferenced Tables",
    "key" : "1634",
    "critical" : false
  },
  "result" : {
    "grade" : 2,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 5,
      "successfulChecks" : 47,
      "ratio" : 0.9038461538461539
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.0769230769230775,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 5,
        "successfulChecks" : 47,
        "ratio" : 0.9038461538461539
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Namespace naming convention - case control",
    "key" : "3550",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 0,
      "successfulChecks" : 31,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 0,
        "successfulChecks" : 31,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "2",
    "name" : "Interface naming convention - case and character set control",
    "key" : "3554",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumerations naming convention - case and character set control",
    "key" : "3558",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 7,
      "failedChecks" : 0,
      "successfulChecks" : 7,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 7,
        "failedChecks" : 0,
        "successfulChecks" : 7,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumeration Items naming convention - case and character set control",
    "key" : "3560",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 65,
      "failedChecks" : 0,
      "successfulChecks" : 65,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 65,
        "failedChecks" : 0,
        "successfulChecks" : 65,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Private Fields naming convention - case and character set control",
    "key" : "3562",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 90,
      "successfulChecks" : 40,
      "ratio" : 0.3076923076923077
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 90,
        "successfulChecks" : 40,
        "ratio" : 0.3076923076923077
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Methods naming convention - case and character set control",
    "key" : "3566",
    "critical" : false
  },
  "result" : {
    "grade" : 3.759803921568629,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 11,
      "successfulChecks" : 550,
      "ratio" : 0.9803921568627451
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.759803921568629,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 11,
        "successfulChecks" : 550,
        "ratio" : 0.9803921568627451
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid using Keywords as names",
    "key" : "3570",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 1596,
      "failedChecks" : 4,
      "successfulChecks" : 1592,
      "ratio" : 0.9974937343358395
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 1596,
        "failedChecks" : 4,
        "successfulChecks" : 1592,
        "ratio" : 0.9974937343358395
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3574/snapshots/3722",
    "name" : "Properties naming convention - case and character set control",
    "key" : "3574",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 352,
      "failedChecks" : 0,
      "successfulChecks" : 352,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 352,
        "failedChecks" : 0,
        "successfulChecks" : 352,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3576/snapshots/3722",
    "name" : "Avoid declaring public Fields",
    "key" : "3576",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 0,
      "successfulChecks" : 130,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 0,
        "successfulChecks" : 130,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3578/snapshots/3722",
    "name" : "Avoid large Classes - too many Constructors",
    "key" : "3578",
    "critical" : false
  },
  "result" : {
    "grade" : 3.9691011235955074,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 2,
      "successfulChecks" : 176,
      "ratio" : 0.9887640449438202
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.9691011235955074,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 2,
        "successfulChecks" : 176,
        "ratio" : 0.9887640449438202
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3580/snapshots/3722",
    "name" : "Avoid large Classes - too many Methods",
    "key" : "3580",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 1,
      "successfulChecks" : 177,
      "ratio" : 0.9943820224719101
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 1,
        "successfulChecks" : 177,
        "ratio" : 0.9943820224719101
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3586/snapshots/3722",
    "name" : "Avoid large Methods - too many Lines of Code",
    "key" : "3586",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 4,
      "successfulChecks" : 492,
      "ratio" : 0.9919354838709677
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 4,
        "successfulChecks" : 492,
        "ratio" : 0.9919354838709677
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3612/snapshots/3722",
    "name" : "Close SQL connection ASAP",
    "key" : "3612",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3614/snapshots/3722",
    "name" : "Avoid using String.Empty for empty string tests",
    "key" : "3614",
    "critical" : false
  },
  "result" : {
    "grade" : 3.971603563474389,
    "violationRatio" : {
      "totalChecks" : 449,
      "failedChecks" : 5,
      "successfulChecks" : 444,
      "ratio" : 0.9888641425389755
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.971603563474389,
      "violationRatio" : {
        "totalChecks" : 449,
        "failedChecks" : 5,
        "successfulChecks" : 444,
        "ratio" : 0.9888641425389755
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3626/snapshots/3722",
    "name" : "Avoid Interfaces with a very low comment/code ratio",
    "key" : "3626",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 8,
      "successfulChecks" : 2,
      "ratio" : 0.2
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 8,
        "successfulChecks" : 2,
        "ratio" : 0.2
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3630/snapshots/3722",
    "name" : "Avoid having Classes implementing too many Interfaces",
    "key" : "3630",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 0,
      "successfulChecks" : 178,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 0,
        "successfulChecks" : 178,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4062/snapshots/3722",
    "name" : "Avoid Functions and Procedures doing an Insert, Update or Delete without managing a transaction",
    "key" : "4062",
    "critical" : false
  },
  "result" : {
    "grade" : 1.3841463414634148,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 71,
      "successfulChecks" : 134,
      "ratio" : 0.6536585365853659
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 12,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.3841463414634148,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 71,
        "successfulChecks" : 134,
        "ratio" : 0.6536585365853659
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 12,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4064/snapshots/3722",
    "name" : "Avoid Procedures using an Insert, Update, Delete, Create Table or Select without including error management",
    "key" : "4064",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 172,
      "successfulChecks" : 31,
      "ratio" : 0.15270935960591134
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 23,
      "removedCriticalViolations" : 2,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 172,
        "successfulChecks" : 31,
        "ratio" : 0.15270935960591134
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 23,
        "removedCriticalViolations" : 2,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4066/snapshots/3722",
    "name" : "Avoid Stored Procedures not returning a status value",
    "key" : "4066",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 184,
      "successfulChecks" : 19,
      "ratio" : 0.09359605911330049
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 184,
        "successfulChecks" : 19,
        "ratio" : 0.09359605911330049
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4070/snapshots/3722",
    "name" : "Avoid use of \"truncate table\"",
    "key" : "4070",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4076/snapshots/3722",
    "name" : "Avoid using temporary Objects",
    "key" : "4076",
    "critical" : false
  },
  "result" : {
    "grade" : 3.274390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 8,
      "successfulChecks" : 197,
      "ratio" : 0.9609756097560975
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.274390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 8,
        "successfulChecks" : 197,
        "ratio" : 0.9609756097560975
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4084/snapshots/3722",
    "name" : "Avoid nested Stored Procedures using temporary Tables",
    "key" : "4084",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 0,
      "successfulChecks" : 203,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 0,
        "successfulChecks" : 203,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7126/snapshots/3722",
    "name" : "Avoid Artifacts with high Commented-out Code Lines/Code Lines ratio",
    "key" : "7126",
    "critical" : false
  },
  "result" : {
    "grade" : 3.7591514143094855,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 59,
      "successfulChecks" : 2946,
      "ratio" : 0.9803660565723794
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.8659342622365145,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 43,
        "successfulChecks" : 2756,
        "ratio" : 0.9846373704894605
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4466019417475735,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 16,
        "successfulChecks" : 190,
        "ratio" : 0.9223300970873787
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7130/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Nested Subqueries",
    "key" : "7130",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7156/snapshots/3722",
    "name" : "Avoid Too Many Copy Pasted Artifacts",
    "key" : "7156",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6478102189781023,
    "violationRatio" : {
      "totalChecks" : 959,
      "failedChecks" : 231,
      "successfulChecks" : 728,
      "ratio" : 0.7591240875912408
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 7,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.671151439299124,
      "violationRatio" : {
        "totalChecks" : 799,
        "failedChecks" : 185,
        "successfulChecks" : 614,
        "ratio" : 0.7684605757196495
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.53125,
      "violationRatio" : {
        "totalChecks" : 160,
        "failedChecks" : 46,
        "successfulChecks" : 114,
        "ratio" : 0.7125
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 7,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7194/snapshots/3722",
    "name" : "Avoid large number of String concatenation",
    "key" : "7194",
    "critical" : false
  },
  "result" : {
    "grade" : 3.796370967741936,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 9,
      "successfulChecks" : 487,
      "ratio" : 0.9818548387096774
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.796370967741936,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 9,
        "successfulChecks" : 487,
        "ratio" : 0.9818548387096774
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7198/snapshots/3722",
    "name" : "Avoid String concatenation in loops",
    "key" : "7198",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 15,
      "successfulChecks" : 481,
      "ratio" : 0.969758064516129
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 15,
        "successfulChecks" : 481,
        "ratio" : 0.969758064516129
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7204/snapshots/3722",
    "name" : "Avoid method invocation in a loop termination expression",
    "key" : "7204",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 2,
      "successfulChecks" : 559,
      "ratio" : 0.9964349376114082
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 2,
        "successfulChecks" : 559,
        "ratio" : 0.9964349376114082
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7208/snapshots/3722",
    "name" : "Avoid the use of is inside loops",
    "key" : "7208",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 0,
      "successfulChecks" : 496,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 0,
        "successfulChecks" : 496,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7212/snapshots/3722",
    "name" : "Avoid instantiations inside loops",
    "key" : "7212",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 23,
      "successfulChecks" : 473,
      "ratio" : 0.9536290322580645
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 23,
        "successfulChecks" : 473,
        "ratio" : 0.9536290322580645
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7260/snapshots/3722",
    "name" : "User Interface elements must not use directly the database",
    "key" : "7260",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7262/snapshots/3722",
    "name" : "Avoid Namespaces with High Efferent Coupling (CE)",
    "key" : "7262",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7264/snapshots/3722",
    "name" : "Avoid namespaces with High Afferent Coupling (CA)",
    "key" : "7264",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7266/snapshots/3722",
    "name" : "Call 'base.Dispose()' or 'MyBase.Finalize()' in the \"finally\" block of 'Dispose(bool)' methods",
    "key" : "7266",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 3,
      "failedChecks" : 3,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 3,
        "failedChecks" : 3,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7268/snapshots/3722",
    "name" : "Dispose() methods should call GC.SuppressFinalize",
    "key" : "7268",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 5,
      "failedChecks" : 0,
      "successfulChecks" : 5,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 5,
        "failedChecks" : 0,
        "successfulChecks" : 5,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7270/snapshots/3722",
    "name" : "Declare as Static all methods not using instance members",
    "key" : "7270",
    "critical" : false
  },
  "result" : {
    "grade" : 3.5369875222816383,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 16,
      "successfulChecks" : 545,
      "ratio" : 0.9714795008912656
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.5369875222816383,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 16,
        "successfulChecks" : 545,
        "ratio" : 0.9714795008912656
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7272/snapshots/3722",
    "name" : "Provide a private default Constructor for utility Classes",
    "key" : "7272",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 6,
      "failedChecks" : 6,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 6,
        "failedChecks" : 6,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7294/snapshots/3722",
    "name" : "Avoid cyclical calls and inheritances between namespaces content",
    "key" : "7294",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 5,
      "successfulChecks" : 26,
      "ratio" : 0.8387096774193549
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 5,
        "successfulChecks" : 26,
        "ratio" : 0.8387096774193549
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7344/snapshots/3722",
    "name" : "Avoid \"SELECT *\" queries",
    "key" : "7344",
    "critical" : false
  },
  "result" : {
    "grade" : 2.932038834951456,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 11,
      "successfulChecks" : 195,
      "ratio" : 0.9466019417475728
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.932038834951456,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7346/snapshots/3722",
    "name" : "Avoid redundant indexes",
    "key" : "7346",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7348/snapshots/3722",
    "name" : "Avoid too many Indexes on one Table",
    "key" : "7348",
    "critical" : false
  },
  "result" : {
    "grade" : 3.76923076923077,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 1,
      "successfulChecks" : 51,
      "ratio" : 0.9807692307692307
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.76923076923077,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 1,
        "successfulChecks" : 51,
        "ratio" : 0.9807692307692307
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7352/snapshots/3722",
    "name" : "Avoid calling properties that clone values in loops",
    "key" : "7352",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7386/snapshots/3722",
    "name" : "Avoid Tables without a clustered Index",
    "key" : "7386",
    "critical" : false
  },
  "result" : {
    "grade" : 1.721153846153846,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 11,
      "successfulChecks" : 41,
      "ratio" : 0.7884615384615384
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.721153846153846,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 11,
        "successfulChecks" : 41,
        "ratio" : 0.7884615384615384
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7388/snapshots/3722",
    "name" : "Avoid artifacts having recursive calls",
    "key" : "7388",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 19,
      "successfulChecks" : 2986,
      "ratio" : 0.9936772046589019
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 19,
        "successfulChecks" : 2780,
        "ratio" : 0.9932118613790639
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7390/snapshots/3722",
    "name" : "Avoid having multiple Artifacts inserting data on the same SQL Table",
    "key" : "7390",
    "critical" : false
  },
  "result" : {
    "grade" : 1.673076923076923,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 12,
      "successfulChecks" : 40,
      "ratio" : 0.7692307692307693
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.673076923076923,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 12,
        "successfulChecks" : 40,
        "ratio" : 0.7692307692307693
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7392/snapshots/3722",
    "name" : "Avoid having multiple artifacts deleting data on the same SQL table",
    "key" : "7392",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8653846153846154,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 8,
      "successfulChecks" : 44,
      "ratio" : 0.8461538461538461
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.8653846153846154,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 8,
        "successfulChecks" : 44,
        "ratio" : 0.8461538461538461
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7394/snapshots/3722",
    "name" : "Avoid having multiple Artifacts updating data on the same SQL Table",
    "key" : "7394",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4615384615384612,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 4,
      "successfulChecks" : 48,
      "ratio" : 0.9230769230769231
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4615384615384612,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 4,
        "successfulChecks" : 48,
        "ratio" : 0.9230769230769231
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7418/snapshots/3722",
    "name" : "Avoid SQL queries using functions on indexed columns in the WHERE clause",
    "key" : "7418",
    "critical" : false
  },
  "result" : {
    "grade" : 3.28846153846154,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 7,
      "successfulChecks" : 175,
      "ratio" : 0.9615384615384616
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 5,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.28846153846154,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 7,
        "successfulChecks" : 175,
        "ratio" : 0.9615384615384616
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 5,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7420/snapshots/3722",
    "name" : "Avoid SQL queries with implicit conversions in the WHERE clause",
    "key" : "7420",
    "critical" : false
  },
  "result" : {
    "grade" : 3.425824175824175,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 6,
      "successfulChecks" : 176,
      "ratio" : 0.967032967032967
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.425824175824175,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 6,
        "successfulChecks" : 176,
        "ratio" : 0.967032967032967
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7424/snapshots/3722",
    "name" : "Avoid using SQL queries inside a loop",
    "key" : "7424",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 221,
      "failedChecks" : 11,
      "successfulChecks" : 210,
      "ratio" : 0.9502262443438914
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 15,
        "failedChecks" : 0,
        "successfulChecks" : 15,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7428/snapshots/3722",
    "name" : "Avoid SQL queries not using the first column of a composite index in the WHERE clause",
    "key" : "7428",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 24,
      "failedChecks" : 24,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 3,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 24,
        "failedChecks" : 24,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 3,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7458/snapshots/3722",
    "name" : "Avoid large Interfaces - too many Methods",
    "key" : "7458",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7460/snapshots/3722",
    "name" : "Avoid using untyped DataSet",
    "key" : "7460",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 17,
      "failedChecks" : 17,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 17,
        "failedChecks" : 17,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7470/snapshots/3722",
    "name" : "Avoid doing select on Datatable in loop",
    "key" : "7470",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 2,
      "failedChecks" : 0,
      "successfulChecks" : 2,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2,
        "failedChecks" : 0,
        "successfulChecks" : 2,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7760/snapshots/3722",
    "name" : "Avoid triggers, functions and procedures with a very low comment/code ratio",
    "key" : "7760",
    "critical" : false
  },
  "result" : {
    "grade" : 1.4695121951219512,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 64,
      "successfulChecks" : 141,
      "ratio" : 0.6878048780487804
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 2,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.4695121951219512,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 64,
        "successfulChecks" : 141,
        "ratio" : 0.6878048780487804
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 2,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7762/snapshots/3722",
    "name" : "Avoid undocumented Triggers, Functions and Procedures",
    "key" : "7762",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6524390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 49,
      "successfulChecks" : 156,
      "ratio" : 0.7609756097560976
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.6524390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 49,
        "successfulChecks" : 156,
        "ratio" : 0.7609756097560976
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7766/snapshots/3722",
    "name" : "Avoid Artifacts with High Cyclomatic Complexity",
    "key" : "7766",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8886054421768708,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 425,
      "successfulChecks" : 2515,
      "ratio" : 0.8554421768707483
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.8632040965618142,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 423,
        "successfulChecks" : 2311,
        "ratio" : 0.8452816386247257
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 2,
        "successfulChecks" : 204,
        "ratio" : 0.9902912621359223
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7768/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Code",
    "key" : "7768",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4081632653061233,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 234,
      "successfulChecks" : 2706,
      "ratio" : 0.9204081632653062
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 2.2882223847842,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 234,
        "successfulChecks" : 2500,
        "ratio" : 0.9144111192392099
      },
      "evolutionSummary" : {
        "addedCritica

Thanks,
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com]
Sent: Friday, November 3, 2017 4:41 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Could you please provide more details on the data structure of your 400 MB JSON file? For example, does it match one of the following?


Structure 1:


'{ "key":[obj1, obj2, obj3..objn]}'


Structure 2:
[ {obj1},{obj2}..,{objn}]

Structure 3:
{obj1}
{obj2}
..
{objn}



Thanks,


Arjun


________________________________
From: Yun Liu <y....@castsoftware.com>
Sent: Saturday, November 4, 2017 1:49 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Paul,

Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clear them up before continuing.

Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node = 5

Issue # 1:
When loading a 400 MB JSON file, I keep getting a DATA_READ ERROR.
Each record in the file is about 64 KB. Since it's a JSON file, there are only 4 fields per record. I'm not sure how many records the file contains, as it's too large to open with any of my tools, but I'm guessing about 3k rows.
None of the recommendations from the various experts has worked so far.

Issue # 2:
While processing a query that joins 2 functional .json files, I get a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 JSON files processes fine on its own, but when they are joined, Drill throws that error.
Json#1 is 11k KB, has 8 fields with 74091 rows
Json#2 is 752kb, has 8 fields with 4245 rows

Besides breaking them up into smaller files, I'm not sure what else I could do.

Thanks for the help so far!

Yun

-----Original Message-----
From: Paul Rogers [mailto:progers@mapr.com]
Sent: Thursday, November 2, 2017 11:06 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.

How large is each record in your file? How many fields? How many bytes? (Alternatively, how big is a single input file and how many records does it contain?)

You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.

We can guess the size, however, from the info you provided:

batchGroups.size 1
spilledBatchGroups.size 0
allocated memory 42768000
allocator limit 41943040

This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
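
As a rough illustration of the reasoning above, the two numbers from the log can be compared directly. This is only a sketch: the function name is made up for illustration, and the "at least two batches" rule is taken from the explanation above rather than from Drill's source.

```python
# Values copied from the error output quoted above.
ALLOCATED_BYTES = 42_768_000   # "allocated memory", taken here as the size of one batch
ALLOCATOR_LIMIT = 41_943_040   # "allocator limit", the sort's memory budget


def sort_can_make_progress(batch_bytes, limit_bytes, min_batches=2):
    """Return True if the budget can hold the minimum number of batches.

    min_batches=2 reflects the statement that the sort needs at least
    two batches in memory; it is an assumption of this sketch.
    """
    return batch_bytes * min_batches <= limit_bytes


print(sort_can_make_progress(ALLOCATED_BYTES, ALLOCATOR_LIMIT))
```

With these numbers the check fails: a single batch is already larger than the whole budget.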

It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.

(If you do want to use the "managed" version of the sort, I suggest you try Drill 1.12 when it is released as that version contains additional fixes to handle constrained memory.)

Also, at present, the JSON record reader loads 4096 records into each batch. If your file has at least that many records, then we can guess each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as noted above, by dividing total file size by record count.)
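
The same estimate written out as arithmetic. This is a sketch only; it assumes the batch really holds the full 4096 records and that the allocated memory figure from the log is the batch size.

```python
BATCH_BYTES = 42_768_000     # "allocated memory" from the log, assumed to be one batch
RECORDS_PER_BATCH = 4096     # records the JSON reader loads per batch, per the text above


def estimated_record_bytes(batch_bytes, records_per_batch):
    """Average bytes per record, assuming the batch is completely full."""
    return batch_bytes / records_per_batch


print(round(estimated_record_bytes(BATCH_BYTES, RECORDS_PER_BATCH)))  # 10441, i.e. about 10K
```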

We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.

Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:

* Take the "max query memory per node" (default of 2 GB regardless of actual direct memory),
* Divide by the number of sort operators in the plan (as shown in the visualized query profile)
* Divide by the "planner width" which is, by default, 70% of the number of cores on your system.

In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
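
The three steps above can be sketched as a small calculation. This is a simplification of the description in this message, not Drill's actual code; the function names and the rounding of the 70% figure are assumptions.

```python
GIB = 1024 ** 3


def default_width(cores, fraction=0.7):
    """Default planner width: roughly 70% of the cores (rounding is assumed)."""
    return max(1, int(cores * fraction))


def memory_per_sort(max_query_memory, sort_count, width_per_node):
    """Split the per-node query memory evenly across all sort instances."""
    return max_query_memory // (sort_count * width_per_node)


# Hypothetical machine: 32 cores, 2 sorts in the plan, default 2 GB per query.
width = default_width(32)
print(width, memory_per_sort(2 * GIB, 2, width))

# Same plan after raising the memory to 4 GB and lowering the width to 5.
print(memory_per_sort(4 * GIB, 2, 5))
```

Comparing the two printed budgets shows why both knobs help: the numerator grows and the divisor shrinks, so each sort gets room for several ~43 MB batches.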

We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.

First, we can increase the memory per query:

ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296

That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.

And/or, we can reduce the number of fragments:

ALTER SESSION SET `planner.width.max_per_node` = <a number>

The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.

As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.

That is three separate possible solutions. Try them one by one or (carefully) together.

- Paul

>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>
>>    Hi Kunal and Andries,
>>
>>    Thanks for your reply. We need json in this case because Drill
>> only supports up to 65536 columns in a csv file.

Re: Drill Capacity

Posted by Arjun kr <ar...@outlook.com>.


I have seen a use case where a query fails for a single 12 GB JSON file with the structure '{ "key":[obj1, obj2, obj3..objn]}'. Here the JSON file has a single key whose value is an array of JSON objects 'obj'. There were around 175K objects in this array, and each obj is itself a complex JSON object with nested array elements. From what I understood, Drill reads the entire file content as a single JSON record (which it actually is) and fails with DATA_READ ERROR.


The solution was to reorganize the data into either of the following structures, and also to break the single file into multiple smaller files for better parallelism.


Structure 2: The file is an array of JSON objects, as below
[ {obj1},{obj2}..,{objn}]

Structure 3: The file has one JSON object after another, as below
{obj1}
{obj2}
..
{objn}
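
A minimal sketch of that reorganization, going from Structure 1 to several Structure 3 files. The function name, the output file naming, and the records_per_file default are all made up for illustration. Note that json.load reads the whole input into memory, so a file as large as 12 GB would need a streaming parser instead.

```python
import json
from pathlib import Path


def split_wrapped_array(src, out_dir, key, records_per_file=1000):
    """Rewrite {"key": [obj1, ...]} as several files with one JSON object per line."""
    with open(src, encoding="utf-8") as f:
        records = json.load(f)[key]  # whole file in memory; acceptable only for a sketch

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for start in range(0, len(records), records_per_file):
        path = out_dir / f"part-{start // records_per_file:05d}.json"
        with open(path, "w", encoding="utf-8") as out:
            for record in records[start:start + records_per_file]:
                out.write(json.dumps(record) + "\n")
        paths.append(path)
    return paths
```

Called as split_wrapped_array("big.json", "split", "key"), this would write split/part-00000.json, split/part-00001.json and so on, and the resulting directory could then be queried in place of the single file.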


I was checking if this is the case here..


Thanks,


Arjun


________________________________
From: Yun Liu <y....@castsoftware.com>
Sent: Saturday, November 4, 2017 2:27 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Arjun,

Column 4 has the most data, so this sample is a bit long. The other 3 columns have maybe a word or 2 each. Thanks for your patience.

[ {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid unreferenced Tables",
    "key" : "1634",
    "critical" : false
  },
  "result" : {
    "grade" : 2,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 5,
      "successfulChecks" : 47,
      "ratio" : 0.9038461538461539
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.0769230769230775,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 5,
        "successfulChecks" : 47,
        "ratio" : 0.9038461538461539
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Namespace naming convention - case control",
    "key" : "3550",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 0,
      "successfulChecks" : 31,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 0,
        "successfulChecks" : 31,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "2",
    "name" : "Interface naming convention - case and character set control",
    "key" : "3554",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumerations naming convention - case and character set control",
    "key" : "3558",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 7,
      "failedChecks" : 0,
      "successfulChecks" : 7,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 7,
        "failedChecks" : 0,
        "successfulChecks" : 7,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumeration Items naming convention - case and character set control",
    "key" : "3560",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 65,
      "failedChecks" : 0,
      "successfulChecks" : 65,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 65,
        "failedChecks" : 0,
        "successfulChecks" : 65,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Private Fields naming convention - case and character set control",
    "key" : "3562",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 90,
      "successfulChecks" : 40,
      "ratio" : 0.3076923076923077
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 90,
        "successfulChecks" : 40,
        "ratio" : 0.3076923076923077
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Methods naming convention - case and character set control",
    "key" : "3566",
    "critical" : false
  },
  "result" : {
    "grade" : 3.759803921568629,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 11,
      "successfulChecks" : 550,
      "ratio" : 0.9803921568627451
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.759803921568629,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 11,
        "successfulChecks" : 550,
        "ratio" : 0.9803921568627451
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid using Keywords as names",
    "key" : "3570",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 1596,
      "failedChecks" : 4,
      "successfulChecks" : 1592,
      "ratio" : 0.9974937343358395
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 1596,
        "failedChecks" : 4,
        "successfulChecks" : 1592,
        "ratio" : 0.9974937343358395
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3574/snapshots/3722",
    "name" : "Properties naming convention - case and character set control",
    "key" : "3574",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 352,
      "failedChecks" : 0,
      "successfulChecks" : 352,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 352,
        "failedChecks" : 0,
        "successfulChecks" : 352,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3576/snapshots/3722",
    "name" : "Avoid declaring public Fields",
    "key" : "3576",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 0,
      "successfulChecks" : 130,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 0,
        "successfulChecks" : 130,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3578/snapshots/3722",
    "name" : "Avoid large Classes - too many Constructors",
    "key" : "3578",
    "critical" : false
  },
  "result" : {
    "grade" : 3.9691011235955074,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 2,
      "successfulChecks" : 176,
      "ratio" : 0.9887640449438202
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.9691011235955074,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 2,
        "successfulChecks" : 176,
        "ratio" : 0.9887640449438202
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3580/snapshots/3722",
    "name" : "Avoid large Classes - too many Methods",
    "key" : "3580",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 1,
      "successfulChecks" : 177,
      "ratio" : 0.9943820224719101
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 1,
        "successfulChecks" : 177,
        "ratio" : 0.9943820224719101
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3586/snapshots/3722",
    "name" : "Avoid large Methods - too many Lines of Code",
    "key" : "3586",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 4,
      "successfulChecks" : 492,
      "ratio" : 0.9919354838709677
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 4,
        "successfulChecks" : 492,
        "ratio" : 0.9919354838709677
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3612/snapshots/3722",
    "name" : "Close SQL connection ASAP",
    "key" : "3612",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3614/snapshots/3722",
    "name" : "Avoid using String.Empty for empty string tests",
    "key" : "3614",
    "critical" : false
  },
  "result" : {
    "grade" : 3.971603563474389,
    "violationRatio" : {
      "totalChecks" : 449,
      "failedChecks" : 5,
      "successfulChecks" : 444,
      "ratio" : 0.9888641425389755
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.971603563474389,
      "violationRatio" : {
        "totalChecks" : 449,
        "failedChecks" : 5,
        "successfulChecks" : 444,
        "ratio" : 0.9888641425389755
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3626/snapshots/3722",
    "name" : "Avoid Interfaces with a very low comment/code ratio",
    "key" : "3626",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 8,
      "successfulChecks" : 2,
      "ratio" : 0.2
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 8,
        "successfulChecks" : 2,
        "ratio" : 0.2
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3630/snapshots/3722",
    "name" : "Avoid having Classes implementing too many Interfaces",
    "key" : "3630",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 0,
      "successfulChecks" : 178,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 0,
        "successfulChecks" : 178,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4062/snapshots/3722",
    "name" : "Avoid Functions and Procedures doing an Insert, Update or Delete without managing a transaction",
    "key" : "4062",
    "critical" : false
  },
  "result" : {
    "grade" : 1.3841463414634148,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 71,
      "successfulChecks" : 134,
      "ratio" : 0.6536585365853659
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 12,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.3841463414634148,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 71,
        "successfulChecks" : 134,
        "ratio" : 0.6536585365853659
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 12,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4064/snapshots/3722",
    "name" : "Avoid Procedures using an Insert, Update, Delete, Create Table or Select without including error management",
    "key" : "4064",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 172,
      "successfulChecks" : 31,
      "ratio" : 0.15270935960591134
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 23,
      "removedCriticalViolations" : 2,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 172,
        "successfulChecks" : 31,
        "ratio" : 0.15270935960591134
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 23,
        "removedCriticalViolations" : 2,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4066/snapshots/3722",
    "name" : "Avoid Stored Procedures not returning a status value",
    "key" : "4066",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 184,
      "successfulChecks" : 19,
      "ratio" : 0.09359605911330049
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 184,
        "successfulChecks" : 19,
        "ratio" : 0.09359605911330049
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4070/snapshots/3722",
    "name" : "Avoid use of \"truncate table\"",
    "key" : "4070",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4076/snapshots/3722",
    "name" : "Avoid using temporary Objects",
    "key" : "4076",
    "critical" : false
  },
  "result" : {
    "grade" : 3.274390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 8,
      "successfulChecks" : 197,
      "ratio" : 0.9609756097560975
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.274390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 8,
        "successfulChecks" : 197,
        "ratio" : 0.9609756097560975
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4084/snapshots/3722",
    "name" : "Avoid nested Stored Procedures using temporary Tables",
    "key" : "4084",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 0,
      "successfulChecks" : 203,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 0,
        "successfulChecks" : 203,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7126/snapshots/3722",
    "name" : "Avoid Artifacts with high Commented-out Code Lines/Code Lines ratio",
    "key" : "7126",
    "critical" : false
  },
  "result" : {
    "grade" : 3.7591514143094855,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 59,
      "successfulChecks" : 2946,
      "ratio" : 0.9803660565723794
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.8659342622365145,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 43,
        "successfulChecks" : 2756,
        "ratio" : 0.9846373704894605
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4466019417475735,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 16,
        "successfulChecks" : 190,
        "ratio" : 0.9223300970873787
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7130/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Nested Subqueries",
    "key" : "7130",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7156/snapshots/3722",
    "name" : "Avoid Too Many Copy Pasted Artifacts",
    "key" : "7156",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6478102189781023,
    "violationRatio" : {
      "totalChecks" : 959,
      "failedChecks" : 231,
      "successfulChecks" : 728,
      "ratio" : 0.7591240875912408
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 7,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.671151439299124,
      "violationRatio" : {
        "totalChecks" : 799,
        "failedChecks" : 185,
        "successfulChecks" : 614,
        "ratio" : 0.7684605757196495
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.53125,
      "violationRatio" : {
        "totalChecks" : 160,
        "failedChecks" : 46,
        "successfulChecks" : 114,
        "ratio" : 0.7125
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 7,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7194/snapshots/3722",
    "name" : "Avoid large number of String concatenation",
    "key" : "7194",
    "critical" : false
  },
  "result" : {
    "grade" : 3.796370967741936,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 9,
      "successfulChecks" : 487,
      "ratio" : 0.9818548387096774
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.796370967741936,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 9,
        "successfulChecks" : 487,
        "ratio" : 0.9818548387096774
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7198/snapshots/3722",
    "name" : "Avoid String concatenation in loops",
    "key" : "7198",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 15,
      "successfulChecks" : 481,
      "ratio" : 0.969758064516129
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 15,
        "successfulChecks" : 481,
        "ratio" : 0.969758064516129
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7204/snapshots/3722",
    "name" : "Avoid method invocation in a loop termination expression",
    "key" : "7204",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 2,
      "successfulChecks" : 559,
      "ratio" : 0.9964349376114082
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 2,
        "successfulChecks" : 559,
        "ratio" : 0.9964349376114082
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7208/snapshots/3722",
    "name" : "Avoid the use of is inside loops",
    "key" : "7208",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 0,
      "successfulChecks" : 496,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 0,
        "successfulChecks" : 496,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7212/snapshots/3722",
    "name" : "Avoid instantiations inside loops",
    "key" : "7212",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 23,
      "successfulChecks" : 473,
      "ratio" : 0.9536290322580645
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 23,
        "successfulChecks" : 473,
        "ratio" : 0.9536290322580645
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7260/snapshots/3722",
    "name" : "User Interface elements must not use directly the database",
    "key" : "7260",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7262/snapshots/3722",
    "name" : "Avoid Namespaces with High Efferent Coupling (CE)",
    "key" : "7262",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7264/snapshots/3722",
    "name" : "Avoid namespaces with High Afferent Coupling (CA)",
    "key" : "7264",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7266/snapshots/3722",
    "name" : "Call 'base.Dispose()' or 'MyBase.Finalize()' in the \"finally\" block of 'Dispose(bool)' methods",
    "key" : "7266",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 3,
      "failedChecks" : 3,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 3,
        "failedChecks" : 3,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7268/snapshots/3722",
    "name" : "Dispose() methods should call GC.SuppressFinalize",
    "key" : "7268",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 5,
      "failedChecks" : 0,
      "successfulChecks" : 5,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 5,
        "failedChecks" : 0,
        "successfulChecks" : 5,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7270/snapshots/3722",
    "name" : "Declare as Static all methods not using instance members",
    "key" : "7270",
    "critical" : false
  },
  "result" : {
    "grade" : 3.5369875222816383,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 16,
      "successfulChecks" : 545,
      "ratio" : 0.9714795008912656
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.5369875222816383,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 16,
        "successfulChecks" : 545,
        "ratio" : 0.9714795008912656
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7272/snapshots/3722",
    "name" : "Provide a private default Constructor for utility Classes",
    "key" : "7272",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 6,
      "failedChecks" : 6,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 6,
        "failedChecks" : 6,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7294/snapshots/3722",
    "name" : "Avoid cyclical calls and inheritances between namespaces content",
    "key" : "7294",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 5,
      "successfulChecks" : 26,
      "ratio" : 0.8387096774193549
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 5,
        "successfulChecks" : 26,
        "ratio" : 0.8387096774193549
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7344/snapshots/3722",
    "name" : "Avoid \"SELECT *\" queries",
    "key" : "7344",
    "critical" : false
  },
  "result" : {
    "grade" : 2.932038834951456,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 11,
      "successfulChecks" : 195,
      "ratio" : 0.9466019417475728
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.932038834951456,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7346/snapshots/3722",
    "name" : "Avoid redundant indexes",
    "key" : "7346",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7348/snapshots/3722",
    "name" : "Avoid too many Indexes on one Table",
    "key" : "7348",
    "critical" : false
  },
  "result" : {
    "grade" : 3.76923076923077,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 1,
      "successfulChecks" : 51,
      "ratio" : 0.9807692307692307
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.76923076923077,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 1,
        "successfulChecks" : 51,
        "ratio" : 0.9807692307692307
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7352/snapshots/3722",
    "name" : "Avoid calling properties that clone values in loops",
    "key" : "7352",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7386/snapshots/3722",
    "name" : "Avoid Tables without a clustered Index",
    "key" : "7386",
    "critical" : false
  },
  "result" : {
    "grade" : 1.721153846153846,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 11,
      "successfulChecks" : 41,
      "ratio" : 0.7884615384615384
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.721153846153846,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 11,
        "successfulChecks" : 41,
        "ratio" : 0.7884615384615384
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7388/snapshots/3722",
    "name" : "Avoid artifacts having recursive calls",
    "key" : "7388",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 19,
      "successfulChecks" : 2986,
      "ratio" : 0.9936772046589019
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 19,
        "successfulChecks" : 2780,
        "ratio" : 0.9932118613790639
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7390/snapshots/3722",
    "name" : "Avoid having multiple Artifacts inserting data on the same SQL Table",
    "key" : "7390",
    "critical" : false
  },
  "result" : {
    "grade" : 1.673076923076923,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 12,
      "successfulChecks" : 40,
      "ratio" : 0.7692307692307693
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.673076923076923,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 12,
        "successfulChecks" : 40,
        "ratio" : 0.7692307692307693
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7392/snapshots/3722",
    "name" : "Avoid having multiple artifacts deleting data on the same SQL table",
    "key" : "7392",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8653846153846154,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 8,
      "successfulChecks" : 44,
      "ratio" : 0.8461538461538461
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.8653846153846154,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 8,
        "successfulChecks" : 44,
        "ratio" : 0.8461538461538461
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7394/snapshots/3722",
    "name" : "Avoid having multiple Artifacts updating data on the same SQL Table",
    "key" : "7394",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4615384615384612,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 4,
      "successfulChecks" : 48,
      "ratio" : 0.9230769230769231
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4615384615384612,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 4,
        "successfulChecks" : 48,
        "ratio" : 0.9230769230769231
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7418/snapshots/3722",
    "name" : "Avoid SQL queries using functions on indexed columns in the WHERE clause",
    "key" : "7418",
    "critical" : false
  },
  "result" : {
    "grade" : 3.28846153846154,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 7,
      "successfulChecks" : 175,
      "ratio" : 0.9615384615384616
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 5,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.28846153846154,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 7,
        "successfulChecks" : 175,
        "ratio" : 0.9615384615384616
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 5,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7420/snapshots/3722",
    "name" : "Avoid SQL queries with implicit conversions in the WHERE clause",
    "key" : "7420",
    "critical" : false
  },
  "result" : {
    "grade" : 3.425824175824175,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 6,
      "successfulChecks" : 176,
      "ratio" : 0.967032967032967
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.425824175824175,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 6,
        "successfulChecks" : 176,
        "ratio" : 0.967032967032967
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7424/snapshots/3722",
    "name" : "Avoid using SQL queries inside a loop",
    "key" : "7424",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 221,
      "failedChecks" : 11,
      "successfulChecks" : 210,
      "ratio" : 0.9502262443438914
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 15,
        "failedChecks" : 0,
        "successfulChecks" : 15,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7428/snapshots/3722",
    "name" : "Avoid SQL queries not using the first column of a composite index in the WHERE clause",
    "key" : "7428",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 24,
      "failedChecks" : 24,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 3,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 24,
        "failedChecks" : 24,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 3,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7458/snapshots/3722",
    "name" : "Avoid large Interfaces - too many Methods",
    "key" : "7458",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7460/snapshots/3722",
    "name" : "Avoid using untyped DataSet",
    "key" : "7460",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 17,
      "failedChecks" : 17,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 17,
        "failedChecks" : 17,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7470/snapshots/3722",
    "name" : "Avoid doing select on Datatable in loop",
    "key" : "7470",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 2,
      "failedChecks" : 0,
      "successfulChecks" : 2,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2,
        "failedChecks" : 0,
        "successfulChecks" : 2,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7760/snapshots/3722",
    "name" : "Avoid triggers, functions and procedures with a very low comment/code ratio",
    "key" : "7760",
    "critical" : false
  },
  "result" : {
    "grade" : 1.4695121951219512,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 64,
      "successfulChecks" : 141,
      "ratio" : 0.6878048780487804
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 2,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.4695121951219512,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 64,
        "successfulChecks" : 141,
        "ratio" : 0.6878048780487804
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 2,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7762/snapshots/3722",
    "name" : "Avoid undocumented Triggers, Functions and Procedures",
    "key" : "7762",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6524390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 49,
      "successfulChecks" : 156,
      "ratio" : 0.7609756097560976
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.6524390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 49,
        "successfulChecks" : 156,
        "ratio" : 0.7609756097560976
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7766/snapshots/3722",
    "name" : "Avoid Artifacts with High Cyclomatic Complexity",
    "key" : "7766",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8886054421768708,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 425,
      "successfulChecks" : 2515,
      "ratio" : 0.8554421768707483
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.8632040965618142,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 423,
        "successfulChecks" : 2311,
        "ratio" : 0.8452816386247257
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 2,
        "successfulChecks" : 204,
        "ratio" : 0.9902912621359223
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7768/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Code",
    "key" : "7768",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4081632653061233,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 234,
      "successfulChecks" : 2706,
      "ratio" : 0.9204081632653062
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 2.2882223847842,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 234,
        "successfulChecks" : 2500,
        "ratio" : 0.9144111192392099
      },
      "evolutionSummary" : {
        "addedCritica

Thanks,
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com]
Sent: Friday, November 3, 2017 4:41 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Could you please provide more details on the structure of your 400 MB JSON file?


Structure 1:


'{ "key":[obj1, obj2, obj3..objn]}'


Structure 2:
[ {obj1},{obj2}..,{objn}]

Structure 3:
{obj1}
{obj2}
..
{objn}



Thanks,


Arjun


________________________________
From: Yun Liu <y....@castsoftware.com>
Sent: Saturday, November 4, 2017 1:49 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Paul,

Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clear them up before continuing.

Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node = 5

Issue # 1:
When loading a json file with 400MB I keep getting a DATA_READ ERROR.
Each record in the file is about 64KB. Since it's a JSON file, there are only 4 fields per record. I am not sure how many records this file contains, as it's too large to open with any tools, but I am guessing about 3k rows.
With all the recommendations provided by various experts, nothing has worked.

Issue # 2:
While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 JSON files processes fine on its own, but when they are joined together, Drill throws that error.
Json#1 is 11k KB, has 8 fields with 74091 rows
Json#2 is 752kb, has 8 fields with 4245 rows

Besides breaking them up to smaller files, not sure what else I could do.

Thanks for the help so far!

Yun

-----Original Message-----
From: Paul Rogers [mailto:progers@mapr.com]
Sent: Thursday, November 2, 2017 11:06 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.

How large is each record in your file? How many fields? How many bytes? (Alternatively, how big is a single input file and how many records does it contain?)

You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.

We can guess the size, however, from the info you provided:

batchGroups.size 1
spilledBatchGroups.size 0
allocated memory 42768000
allocator limit 41943040

This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.

It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.

(If you do want to use the "managed" version of the sort, I suggest you try Drill 1.12 when it is released as that version contains additional fixes to handle constrained memory.)

Also, at present, the JSON record reader loads 4096 records into each batch. If your file has at least that many records, then we can guess each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as noted above, by dividing total file size by record count.)
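
The record-size estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the byte count is the "allocated memory" figure from the error message, 4096 is the batch size stated above, and the function name is made up for illustration.

```python
# Rough per-record size estimate, assuming the batch holds a full 4096 records.
BATCH_BYTES = 42_768_000     # "allocated memory" reported in the error message
RECORDS_PER_BATCH = 4096     # batch size of the JSON record reader, per the text


def bytes_per_record(batch_bytes: int, records: int) -> float:
    """Average record size, assuming every record in the batch is similar."""
    return batch_bytes / records


estimate = bytes_per_record(BATCH_BYTES, RECORDS_PER_BATCH)
print(f"about {estimate / 1024:.1f} KiB per record")
```

If the file holds fewer than 4096 records, the batch is not full and the real per-record size is larger than this estimate.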

We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.

Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:

* Take the "max query memory per node" (default of 2 GB regardless of actual direct memory),
* Divide by the number of sort operators in the plan (as shown in the visualized query profile)
* Divide by the "planner width" which is, by default, 70% of the number of cores on your system.

In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
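
The division described above can be sketched in Python to see how the two settings change the memory each sort receives. This is an illustration of the rule of thumb only, not Drill's actual allocation code; the function names and the example core and sort counts are assumptions.

```python
# Sketch of the per-sort memory rule of thumb described above.
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_width(cores: int) -> int:
    """Default planner width: roughly 70% of the cores, at least 1."""
    return max(1, int(cores * 0.7))


def memory_per_sort(max_query_memory: int, num_sorts: int, width: int) -> int:
    """Bytes available to each sort operator instance on one node."""
    return max_query_memory // (num_sorts * width)


# Hypothetical machine: 32 cores, a plan with 2 sorts.
before = memory_per_sort(2 * GIB, num_sorts=2, width=default_width(32))
after = memory_per_sort(4 * GIB, num_sorts=2, width=5)
print(f"defaults: {before / MIB:.0f} MiB per sort")
print(f"4 GB and width 5: {after / MIB:.0f} MiB per sort")
```

Raising the query memory and lowering the width both increase the per-sort figure, which is what the two options below rely on.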

We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.

First, we can increase the memory per query:

ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296

That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.

And/or, we can reduce the number of fragments:

ALTER SESSION SET `planner.width.max_per_node` = <a number>

The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.

As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
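
One possible way to produce the smaller files is a short Python script. This is a sketch under assumptions: it expects the input to be a single top-level JSON array, writes one record per line to each output file, and loads the whole input into memory, so the machine running it needs enough RAM for the file. The function name, file names and chunk size are made up for illustration.

```python
import json
from pathlib import Path


def split_json_array(src: Path, out_dir: Path, records_per_file: int = 1000) -> list[Path]:
    """Split a top-level JSON array into smaller one-record-per-line files."""
    if records_per_file < 1:
        raise ValueError("records_per_file must be at least 1")
    with src.open(encoding="utf-8") as f:
        records = json.load(f)  # reads the whole file into memory
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for start in range(0, len(records), records_per_file):
        chunk = records[start:start + records_per_file]
        path = out_dir / f"{src.stem}_{start // records_per_file:05d}.json"
        with path.open("w", encoding="utf-8") as out:
            for record in chunk:
                out.write(json.dumps(record) + "\n")
        written.append(path)
    return written


# Example call (hypothetical paths):
# split_json_array(Path("big.json"), Path("big_parts"), records_per_file=500)
```

The output directory can then be queried as a single table, with each file small enough to keep batches manageable.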

That is three separate possible solutions. Try them one by one or (carefully) together.

- Paul

>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>
>>    Hi Kunal and Andries,
>>
>>    Thanks for your reply. We need json in this case because Drill
>> only supports up to 65536 columns in a csv file.

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Arjun,

Column 4 has the most data and is a bit long here. The other 3 columns have maybe a word or 2. Thanks for your patience.

[ {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid unreferenced Tables",
    "key" : "1634",
    "critical" : false
  },
  "result" : {
    "grade" : 2,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 5,
      "successfulChecks" : 47,
      "ratio" : 0.9038461538461539
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.0769230769230775,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 5,
        "successfulChecks" : 47,
        "ratio" : 0.9038461538461539
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Namespace naming convention - case control",
    "key" : "3550",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 0,
      "successfulChecks" : 31,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 0,
        "successfulChecks" : 31,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "2",
    "name" : "Interface naming convention - case and character set control",
    "key" : "3554",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumerations naming convention - case and character set control",
    "key" : "3558",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 7,
      "failedChecks" : 0,
      "successfulChecks" : 7,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 7,
        "failedChecks" : 0,
        "successfulChecks" : 7,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumeration Items naming convention - case and character set control",
    "key" : "3560",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 65,
      "failedChecks" : 0,
      "successfulChecks" : 65,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 65,
        "failedChecks" : 0,
        "successfulChecks" : 65,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Private Fields naming convention - case and character set control",
    "key" : "3562",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 90,
      "successfulChecks" : 40,
      "ratio" : 0.3076923076923077
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 90,
        "successfulChecks" : 40,
        "ratio" : 0.3076923076923077
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Methods naming convention - case and character set control",
    "key" : "3566",
    "critical" : false
  },
  "result" : {
    "grade" : 3.759803921568629,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 11,
      "successfulChecks" : 550,
      "ratio" : 0.9803921568627451
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.759803921568629,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 11,
        "successfulChecks" : 550,
        "ratio" : 0.9803921568627451
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid using Keywords as names",
    "key" : "3570",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 1596,
      "failedChecks" : 4,
      "successfulChecks" : 1592,
      "ratio" : 0.9974937343358395
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 1596,
        "failedChecks" : 4,
        "successfulChecks" : 1592,
        "ratio" : 0.9974937343358395
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3574/snapshots/3722",
    "name" : "Properties naming convention - case and character set control",
    "key" : "3574",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 352,
      "failedChecks" : 0,
      "successfulChecks" : 352,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 352,
        "failedChecks" : 0,
        "successfulChecks" : 352,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3576/snapshots/3722",
    "name" : "Avoid declaring public Fields",
    "key" : "3576",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 0,
      "successfulChecks" : 130,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 0,
        "successfulChecks" : 130,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3578/snapshots/3722",
    "name" : "Avoid large Classes - too many Constructors",
    "key" : "3578",
    "critical" : false
  },
  "result" : {
    "grade" : 3.9691011235955074,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 2,
      "successfulChecks" : 176,
      "ratio" : 0.9887640449438202
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.9691011235955074,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 2,
        "successfulChecks" : 176,
        "ratio" : 0.9887640449438202
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3580/snapshots/3722",
    "name" : "Avoid large Classes - too many Methods",
    "key" : "3580",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 1,
      "successfulChecks" : 177,
      "ratio" : 0.9943820224719101
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 1,
        "successfulChecks" : 177,
        "ratio" : 0.9943820224719101
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3586/snapshots/3722",
    "name" : "Avoid large Methods - too many Lines of Code",
    "key" : "3586",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 4,
      "successfulChecks" : 492,
      "ratio" : 0.9919354838709677
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 4,
        "successfulChecks" : 492,
        "ratio" : 0.9919354838709677
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3612/snapshots/3722",
    "name" : "Close SQL connection ASAP",
    "key" : "3612",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3614/snapshots/3722",
    "name" : "Avoid using String.Empty for empty string tests",
    "key" : "3614",
    "critical" : false
  },
  "result" : {
    "grade" : 3.971603563474389,
    "violationRatio" : {
      "totalChecks" : 449,
      "failedChecks" : 5,
      "successfulChecks" : 444,
      "ratio" : 0.9888641425389755
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.971603563474389,
      "violationRatio" : {
        "totalChecks" : 449,
        "failedChecks" : 5,
        "successfulChecks" : 444,
        "ratio" : 0.9888641425389755
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3626/snapshots/3722",
    "name" : "Avoid Interfaces with a very low comment/code ratio",
    "key" : "3626",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 8,
      "successfulChecks" : 2,
      "ratio" : 0.2
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 8,
        "successfulChecks" : 2,
        "ratio" : 0.2
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3630/snapshots/3722",
    "name" : "Avoid having Classes implementing too many Interfaces",
    "key" : "3630",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 0,
      "successfulChecks" : 178,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 0,
        "successfulChecks" : 178,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4062/snapshots/3722",
    "name" : "Avoid Functions and Procedures doing an Insert, Update or Delete without managing a transaction",
    "key" : "4062",
    "critical" : false
  },
  "result" : {
    "grade" : 1.3841463414634148,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 71,
      "successfulChecks" : 134,
      "ratio" : 0.6536585365853659
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 12,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.3841463414634148,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 71,
        "successfulChecks" : 134,
        "ratio" : 0.6536585365853659
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 12,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4064/snapshots/3722",
    "name" : "Avoid Procedures using an Insert, Update, Delete, Create Table or Select without including error management",
    "key" : "4064",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 172,
      "successfulChecks" : 31,
      "ratio" : 0.15270935960591134
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 23,
      "removedCriticalViolations" : 2,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 172,
        "successfulChecks" : 31,
        "ratio" : 0.15270935960591134
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 23,
        "removedCriticalViolations" : 2,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4066/snapshots/3722",
    "name" : "Avoid Stored Procedures not returning a status value",
    "key" : "4066",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 184,
      "successfulChecks" : 19,
      "ratio" : 0.09359605911330049
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 184,
        "successfulChecks" : 19,
        "ratio" : 0.09359605911330049
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4070/snapshots/3722",
    "name" : "Avoid use of \"truncate table\"",
    "key" : "4070",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4076/snapshots/3722",
    "name" : "Avoid using temporary Objects",
    "key" : "4076",
    "critical" : false
  },
  "result" : {
    "grade" : 3.274390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 8,
      "successfulChecks" : 197,
      "ratio" : 0.9609756097560975
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.274390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 8,
        "successfulChecks" : 197,
        "ratio" : 0.9609756097560975
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4084/snapshots/3722",
    "name" : "Avoid nested Stored Procedures using temporary Tables",
    "key" : "4084",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 0,
      "successfulChecks" : 203,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 0,
        "successfulChecks" : 203,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7126/snapshots/3722",
    "name" : "Avoid Artifacts with high Commented-out Code Lines/Code Lines ratio",
    "key" : "7126",
    "critical" : false
  },
  "result" : {
    "grade" : 3.7591514143094855,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 59,
      "successfulChecks" : 2946,
      "ratio" : 0.9803660565723794
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.8659342622365145,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 43,
        "successfulChecks" : 2756,
        "ratio" : 0.9846373704894605
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4466019417475735,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 16,
        "successfulChecks" : 190,
        "ratio" : 0.9223300970873787
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7130/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Nested Subqueries",
    "key" : "7130",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7156/snapshots/3722",
    "name" : "Avoid Too Many Copy Pasted Artifacts",
    "key" : "7156",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6478102189781023,
    "violationRatio" : {
      "totalChecks" : 959,
      "failedChecks" : 231,
      "successfulChecks" : 728,
      "ratio" : 0.7591240875912408
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 7,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.671151439299124,
      "violationRatio" : {
        "totalChecks" : 799,
        "failedChecks" : 185,
        "successfulChecks" : 614,
        "ratio" : 0.7684605757196495
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.53125,
      "violationRatio" : {
        "totalChecks" : 160,
        "failedChecks" : 46,
        "successfulChecks" : 114,
        "ratio" : 0.7125
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 7,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7194/snapshots/3722",
    "name" : "Avoid large number of String concatenation",
    "key" : "7194",
    "critical" : false
  },
  "result" : {
    "grade" : 3.796370967741936,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 9,
      "successfulChecks" : 487,
      "ratio" : 0.9818548387096774
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.796370967741936,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 9,
        "successfulChecks" : 487,
        "ratio" : 0.9818548387096774
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7198/snapshots/3722",
    "name" : "Avoid String concatenation in loops",
    "key" : "7198",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 15,
      "successfulChecks" : 481,
      "ratio" : 0.969758064516129
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 15,
        "successfulChecks" : 481,
        "ratio" : 0.969758064516129
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7204/snapshots/3722",
    "name" : "Avoid method invocation in a loop termination expression",
    "key" : "7204",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 2,
      "successfulChecks" : 559,
      "ratio" : 0.9964349376114082
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 2,
        "successfulChecks" : 559,
        "ratio" : 0.9964349376114082
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7208/snapshots/3722",
    "name" : "Avoid the use of is inside loops",
    "key" : "7208",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 0,
      "successfulChecks" : 496,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 0,
        "successfulChecks" : 496,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7212/snapshots/3722",
    "name" : "Avoid instantiations inside loops",
    "key" : "7212",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 23,
      "successfulChecks" : 473,
      "ratio" : 0.9536290322580645
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 23,
        "successfulChecks" : 473,
        "ratio" : 0.9536290322580645
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7260/snapshots/3722",
    "name" : "User Interface elements must not use directly the database",
    "key" : "7260",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7262/snapshots/3722",
    "name" : "Avoid Namespaces with High Efferent Coupling (CE)",
    "key" : "7262",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7264/snapshots/3722",
    "name" : "Avoid namespaces with High Afferent Coupling (CA)",
    "key" : "7264",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7266/snapshots/3722",
    "name" : "Call 'base.Dispose()' or 'MyBase.Finalize()' in the \"finally\" block of 'Dispose(bool)' methods",
    "key" : "7266",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 3,
      "failedChecks" : 3,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 3,
        "failedChecks" : 3,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7268/snapshots/3722",
    "name" : "Dispose() methods should call GC.SuppressFinalize",
    "key" : "7268",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 5,
      "failedChecks" : 0,
      "successfulChecks" : 5,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 5,
        "failedChecks" : 0,
        "successfulChecks" : 5,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7270/snapshots/3722",
    "name" : "Declare as Static all methods not using instance members",
    "key" : "7270",
    "critical" : false
  },
  "result" : {
    "grade" : 3.5369875222816383,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 16,
      "successfulChecks" : 545,
      "ratio" : 0.9714795008912656
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.5369875222816383,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 16,
        "successfulChecks" : 545,
        "ratio" : 0.9714795008912656
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7272/snapshots/3722",
    "name" : "Provide a private default Constructor for utility Classes",
    "key" : "7272",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 6,
      "failedChecks" : 6,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 6,
        "failedChecks" : 6,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7294/snapshots/3722",
    "name" : "Avoid cyclical calls and inheritances between namespaces content",
    "key" : "7294",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 5,
      "successfulChecks" : 26,
      "ratio" : 0.8387096774193549
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 5,
        "successfulChecks" : 26,
        "ratio" : 0.8387096774193549
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7344/snapshots/3722",
    "name" : "Avoid \"SELECT *\" queries",
    "key" : "7344",
    "critical" : false
  },
  "result" : {
    "grade" : 2.932038834951456,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 11,
      "successfulChecks" : 195,
      "ratio" : 0.9466019417475728
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.932038834951456,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7346/snapshots/3722",
    "name" : "Avoid redundant indexes",
    "key" : "7346",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7348/snapshots/3722",
    "name" : "Avoid too many Indexes on one Table",
    "key" : "7348",
    "critical" : false
  },
  "result" : {
    "grade" : 3.76923076923077,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 1,
      "successfulChecks" : 51,
      "ratio" : 0.9807692307692307
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.76923076923077,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 1,
        "successfulChecks" : 51,
        "ratio" : 0.9807692307692307
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7352/snapshots/3722",
    "name" : "Avoid calling properties that clone values in loops",
    "key" : "7352",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7386/snapshots/3722",
    "name" : "Avoid Tables without a clustered Index",
    "key" : "7386",
    "critical" : false
  },
  "result" : {
    "grade" : 1.721153846153846,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 11,
      "successfulChecks" : 41,
      "ratio" : 0.7884615384615384
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.721153846153846,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 11,
        "successfulChecks" : 41,
        "ratio" : 0.7884615384615384
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7388/snapshots/3722",
    "name" : "Avoid artifacts having recursive calls",
    "key" : "7388",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 19,
      "successfulChecks" : 2986,
      "ratio" : 0.9936772046589019
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 19,
        "successfulChecks" : 2780,
        "ratio" : 0.9932118613790639
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7390/snapshots/3722",
    "name" : "Avoid having multiple Artifacts inserting data on the same SQL Table",
    "key" : "7390",
    "critical" : false
  },
  "result" : {
    "grade" : 1.673076923076923,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 12,
      "successfulChecks" : 40,
      "ratio" : 0.7692307692307693
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.673076923076923,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 12,
        "successfulChecks" : 40,
        "ratio" : 0.7692307692307693
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7392/snapshots/3722",
    "name" : "Avoid having multiple artifacts deleting data on the same SQL table",
    "key" : "7392",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8653846153846154,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 8,
      "successfulChecks" : 44,
      "ratio" : 0.8461538461538461
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.8653846153846154,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 8,
        "successfulChecks" : 44,
        "ratio" : 0.8461538461538461
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7394/snapshots/3722",
    "name" : "Avoid having multiple Artifacts updating data on the same SQL Table",
    "key" : "7394",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4615384615384612,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 4,
      "successfulChecks" : 48,
      "ratio" : 0.9230769230769231
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4615384615384612,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 4,
        "successfulChecks" : 48,
        "ratio" : 0.9230769230769231
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7418/snapshots/3722",
    "name" : "Avoid SQL queries using functions on indexed columns in the WHERE clause",
    "key" : "7418",
    "critical" : false
  },
  "result" : {
    "grade" : 3.28846153846154,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 7,
      "successfulChecks" : 175,
      "ratio" : 0.9615384615384616
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 5,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.28846153846154,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 7,
        "successfulChecks" : 175,
        "ratio" : 0.9615384615384616
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 5,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7420/snapshots/3722",
    "name" : "Avoid SQL queries with implicit conversions in the WHERE clause",
    "key" : "7420",
    "critical" : false
  },
  "result" : {
    "grade" : 3.425824175824175,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 6,
      "successfulChecks" : 176,
      "ratio" : 0.967032967032967
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.425824175824175,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 6,
        "successfulChecks" : 176,
        "ratio" : 0.967032967032967
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7424/snapshots/3722",
    "name" : "Avoid using SQL queries inside a loop",
    "key" : "7424",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 221,
      "failedChecks" : 11,
      "successfulChecks" : 210,
      "ratio" : 0.9502262443438914
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 15,
        "failedChecks" : 0,
        "successfulChecks" : 15,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7428/snapshots/3722",
    "name" : "Avoid SQL queries not using the first column of a composite index in the WHERE clause",
    "key" : "7428",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 24,
      "failedChecks" : 24,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 3,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 24,
        "failedChecks" : 24,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 3,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7458/snapshots/3722",
    "name" : "Avoid large Interfaces - too many Methods",
    "key" : "7458",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7460/snapshots/3722",
    "name" : "Avoid using untyped DataSet",
    "key" : "7460",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 17,
      "failedChecks" : 17,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 17,
        "failedChecks" : 17,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7470/snapshots/3722",
    "name" : "Avoid doing select on Datatable in loop",
    "key" : "7470",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 2,
      "failedChecks" : 0,
      "successfulChecks" : 2,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2,
        "failedChecks" : 0,
        "successfulChecks" : 2,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7760/snapshots/3722",
    "name" : "Avoid triggers, functions and procedures with a very low comment/code ratio",
    "key" : "7760",
    "critical" : false
  },
  "result" : {
    "grade" : 1.4695121951219512,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 64,
      "successfulChecks" : 141,
      "ratio" : 0.6878048780487804
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 2,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.4695121951219512,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 64,
        "successfulChecks" : 141,
        "ratio" : 0.6878048780487804
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 2,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7762/snapshots/3722",
    "name" : "Avoid undocumented Triggers, Functions and Procedures",
    "key" : "7762",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6524390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 49,
      "successfulChecks" : 156,
      "ratio" : 0.7609756097560976
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.6524390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 49,
        "successfulChecks" : 156,
        "ratio" : 0.7609756097560976
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7766/snapshots/3722",
    "name" : "Avoid Artifacts with High Cyclomatic Complexity",
    "key" : "7766",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8886054421768708,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 425,
      "successfulChecks" : 2515,
      "ratio" : 0.8554421768707483
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.8632040965618142,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 423,
        "successfulChecks" : 2311,
        "ratio" : 0.8452816386247257
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 2,
        "successfulChecks" : 204,
        "ratio" : 0.9902912621359223
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7768/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Code",
    "key" : "7768",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4081632653061233,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 234,
      "successfulChecks" : 2706,
      "ratio" : 0.9204081632653062
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 2.2882223847842,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 234,
        "successfulChecks" : 2500,
        "ratio" : 0.9144111192392099
      },
      "evolutionSummary" : {
        "addedCritica

Thanks,
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com] 
Sent: Friday, November 3, 2017 4:41 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Could you please provide more details on the data structure of your 400 MB JSON file? Which of the following does it follow?


Structure 1:


'{ "key":[obj1, obj2, obj3..objn]}'


Structure 2:
[ {obj1},{obj2}..,{objn}]

Structure 3:
{obj1}
{obj2}
..
{objn}



Thanks,


Arjun


________________________________
From: Yun Liu <y....@castsoftware.com>
Sent: Saturday, November 4, 2017 1:49 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Paul,

Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clear them up before continuing.

Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node = 5

Issue # 1:
When loading a 400MB json file I keep getting a DATA_READ ERROR.
Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
None of the recommendations provided by various experts has worked.

Issue # 2:
While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 json files processes fine alone, but when they are joined together, Drill throws that error.
Json#1 is 11k KB, has 8 fields with 74091 rows
Json#2 is 752kb, has 8 fields with 4245 rows

Besides breaking them up to smaller files, not sure what else I could do.

Thanks for the help so far!

Yun

-----Original Message-----
From: Paul Rogers [mailto:progers@mapr.com]
Sent: Thursday, November 2, 2017 11:06 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.

How large is each record in your file? How many fields? How many bytes? (Alternatively, how big is a single input file and how many records does it contain?)

You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.

We can guess the size, however, from the info you provided:

batchGroups.size 1
spilledBatchGroups.size 0
allocated memory 42768000
allocator limit 41943040

This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.

It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.

(If you do want to use the "managed" version of the sort, I suggest you try Drill 1.12 when it is released as that version contains additional fixes to handle constrained memory.)

Also, at present, the JSON record reader loads 4096 records into each batch. If your file has at least that many records, then we can guess each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as noted above, by dividing total file size by record count.)
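
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the byte counts and the 4096-records-per-batch figure are taken from the message above, while the function and constant names are made up for illustration.

```python
# Figures quoted in the message above.
ALLOCATED_BYTES = 42_768_000   # "allocated memory" from the error output
ALLOCATOR_LIMIT = 41_943_040   # "allocator limit" from the error output
RECORDS_PER_BATCH = 4096       # records per batch for the JSON reader, per the message


def estimate_record_bytes(batch_bytes: int, records_per_batch: int = RECORDS_PER_BATCH) -> float:
    """Average record size if one batch holds `records_per_batch` records."""
    return batch_bytes / records_per_batch


record_bytes = estimate_record_bytes(ALLOCATED_BYTES)
print(f"approximate record size: {record_bytes / 1024:.1f} KiB")

# The sort needs at least two batches in memory to make progress,
# so check whether two batches fit under the allocator limit.
fits = 2 * ALLOCATED_BYTES <= ALLOCATOR_LIMIT
print(f"two batches fit under the sort's limit: {fits}")
```

With these numbers even a single batch exceeds the limit, which is consistent with the out-of-memory error described above.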

We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.

Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:

* Take the "max query memory per node" (default of 2 GB regardless of actual direct memory),
* Divide by the number of sort operators in the plan (as shown in the visualized query profile)
* Divide by the "planner width" which is, by default, 70% of the number of cores on your system.

In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
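
The allocation rule described above can be written out as a small calculation. This is a sketch under the assumptions stated in the message (2 GB default, planner width of 70% of the cores); the function names are hypothetical, and the real planner may round or adjust these values differently.

```python
GIB = 1024 ** 3  # bytes in one GiB


def default_width(cores: int) -> float:
    """Default planner width as described above: 70% of the core count."""
    return cores * 0.7


def memory_per_sort(max_query_memory: int, num_sorts: int, width: float) -> float:
    """Bytes available to each sort: query memory / number of sorts / planner width."""
    return max_query_memory / (num_sorts * width)


# Work backwards from the reported allocator limit, assuming the 2 GB default.
implied_divisor = (2 * GIB) / 41_943_040
print(f"implied divisor: {implied_divisor:.1f}")

# The guess from the message: 2 sorts on a 32-core machine.
guess = memory_per_sort(2 * GIB, num_sorts=2, width=default_width(32))
print(f"2 sorts, 32 cores: {guess / 1024 ** 2:.1f} MiB per sort")

# The settings reported earlier in the thread (10 GB, width 5), assuming a single sort.
tuned = memory_per_sort(10 * GIB, num_sorts=1, width=5)
print(f"10 GB, 1 sort, width 5: {tuned / GIB:.1f} GiB per sort")
```

The point of the calculation is that the per-sort budget shrinks with every additional sort and every additional fragment, so lowering the width or raising the query memory are the two levers available.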

We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.

First, we can increase the memory per query:

ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296

That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.

And/or, we can reduce the number of fragments:

ALTER SESSION SET `planner.width.max_per_node` = <a number>

The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.

As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
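
A minimal sketch of the file-splitting approach, assuming the input holds one JSON object per line (Arjun's "Structure 3"). The function names, output naming scheme, and the default of 1000 records per file are illustrative choices, not anything Drill requires.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def chunk_records(lines: Iterable[str], records_per_file: int) -> Iterator[List[str]]:
    """Group non-blank lines (one JSON record each) into lists of at most records_per_file."""
    chunk: List[str] = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between records
        chunk.append(line.rstrip("\n"))
        if len(chunk) == records_per_file:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly shorter, chunk


def split_json_lines(src: Path, out_dir: Path, records_per_file: int = 1000) -> List[Path]:
    """Write each chunk of src to its own file in out_dir and return the new paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    parts: List[Path] = []
    with src.open(encoding="utf-8") as fin:
        for i, chunk in enumerate(chunk_records(fin, records_per_file)):
            part = out_dir / f"{src.stem}_{i:04d}.json"
            part.write_text("\n".join(chunk) + "\n", encoding="utf-8")
            parts.append(part)
    return parts
```

The input is streamed line by line, so the source file never has to fit in memory. A file that is a single top-level array or object would first need converting to one record per line.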

Those are three separate possible solutions. Try them one by one or (carefully) together.

- Paul

>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>
>>    Hi Kunal and Andries,
>>
>>    Thanks for your reply. We need json in this case because Drill 
>> only supports up to 65536 columns in a csv file.

Re: Drill Capacity

Posted by Arjun kr <ar...@outlook.com>.

Hi Yun,

Could you please provide more details on the data structure of your 400 MB JSON file? Which of the following does it follow?

Structure 1:

'{ "key":[obj1, obj2, obj3..objn]}'

Structure 2:
[ {obj1},{obj2}..,{objn}]

Structure 3:
{obj1}
{obj2}
..
{objn}
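
One way to answer this question without opening the file in an editor is to let a JSON parser look at the top level. The sketch below uses only Python's standard `json` module; the function name and the 1/2/3 return codes are made up to match the numbering above. It reads the whole text into memory, which is a simplifying assumption that may be impractical for very large files.

```python
import json


def detect_structure(text: str) -> int:
    """Return 1, 2 or 3 for the structures listed above, or 0 if none matches."""
    s = text.strip()
    try:
        # raw_decode parses the first JSON value and reports where it ended.
        first, end = json.JSONDecoder().raw_decode(s)
    except json.JSONDecodeError:
        return 0
    if isinstance(first, list):
        return 2  # top-level array of objects
    if isinstance(first, dict):
        if end < len(s):
            return 3  # more data follows the first object: a sequence of objects
        if len(first) == 1 and isinstance(next(iter(first.values())), list):
            return 1  # a single object wrapping one array
    return 0


print(detect_structure('{ "key": [ {"a": 1}, {"a": 2} ] }'))
print(detect_structure('[ {"a": 1}, {"a": 2} ]'))
print(detect_structure('{"a": 1}\n{"a": 2}\n'))
```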

Thanks,

Arjun

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Paul,

I've already done this: alter session set `store.json.all_text_mode`=true;

I don't believe this is an accurate error message because when I reduce # of rows in the Compliance.json file by half (while all fields and queries stay the same), everything works with no issues. I've tried the same with another dataset (same format, same fields but smaller size pool), no issue there either. So I am still convinced it's a size issue.

Please let me know what else I could provide to troubleshoot this.

Thanks for all your help so far.

Yun

-----Original Message-----
From: Paul Rogers [mailto:progers@mapr.com] 
Sent: Tuesday, November 7, 2017 7:55 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

I looked at the sqlline.log file you posted. (Thanks much for doing so.) Here’s what I noted:

The log shows a failed query, but this one is different than the one we discussed earlier. Query:

SELECT * FROM `dfs`.`Inputs`.`./Compliance.json` LIMIT 100

Since this is a LIMIT query, with no ORDER BY, we got a different plan than the query we discussed earlier. The earlier one had a stack trace that suggested the query had an ORDER BY that used the legacy (non-managed) version of the sort.

Despite the fact that the query is different, the above query did, in fact, fail, but for a different reason.

JsonReader - User Error Occurred: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl. (You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.

This is saying that there is a schema change, but one within a single batch. Variations of this bug can occur if you have several values of the form 10, 20, 30. But then, later, you have a value like “Hi”: we create a numeric vector, then try to write a string into it.

Here, it appears you have values that are boolean, followed by a string:

… "a": true …
… "a": false …
… "a": "a string!"

The JSON writer sees the boolean and allocates a bit vector. Then, it sees the string, tries to write that into the bit vector, and gets the error displayed above.

You can work around this by using “all text mode” (`store.json.all_text_mode`), which reads all fields as text. Or, you can clean up your data.
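
If cleaning up the data is the route taken, a scan for fields whose JSON type varies between records can show where to look. This is a rough sketch using only the Python standard library; the function names and the dotted-path notation are invented for illustration, and it reports type variation in general rather than reproducing Drill's own schema rules.

```python
from collections import defaultdict


def json_type(value) -> str:
    """Name the JSON type of an already-parsed value."""
    if isinstance(value, bool):  # check bool before int: True is also an int in Python
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def collect_types(value, path, seen):
    """Record the type seen at each field path, descending into maps and arrays."""
    if value is None:
        return  # nulls are ignored in this sketch
    seen[path].add(json_type(value))
    if isinstance(value, dict):
        for key, child in value.items():
            collect_types(child, f"{path}.{key}" if path else key, seen)
    elif isinstance(value, list):
        for item in value:
            collect_types(item, f"{path}[]", seen)


def mixed_type_fields(records):
    """Return {field path: sorted type names} for fields seen with more than one type."""
    seen = defaultdict(set)
    for record in records:
        collect_types(record, "", seen)
    return {path: sorted(types) for path, types in seen.items() if len(types) > 1}


sample = [{"a": True}, {"a": False}, {"a": "a string!"}, {"b": 1}, {"b": 2.5}]
print(mixed_type_fields(sample))
```

For a file with one object per line, the records could be produced by calling `json.loads` on each line. Any field the scan reports with both "boolean" and "string" is a candidate for the error shown above.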

Once this file works, perhaps you can try another run to recreate the original memory issue with the sort so we can track that one down.

Thanks,

- Paul

> On Nov 7, 2017, at 1:49 PM, Kunal Khatua <kk...@mapr.com> wrote:
> 
> Hi Yun
> 
> The new release might not address this issue as we don't have a repro for it. Any chance you can provide a sample anonymized data set? The JSON data doesn't have to be meaningful, but we need to be able to reproduce the problem to ensure that we are indeed addressing the issue you faced.
> 
> Thanks
> ~K
> -----Original Message-----
> From: Yun Liu [mailto:y.liu@castsoftware.com]
> Sent: Tuesday, November 07, 2017 7:17 AM
> To: user@drill.apache.org
> Subject: RE: Drill Capacity
> 
> Hi Arjun,
> 
> That was already altered and schema was not changed. I've reduced the json size and everything works fine. I believe it was giving a false error. Seems that's the only way to bypass this error until your new release comes out?
> 
> Thanks,
> Yun
> 
> -----Original Message-----
> From: Arjun kr [mailto:arjun.kr@outlook.com]
> Sent: Monday, November 6, 2017 7:39 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> 
> Looking at the log you shared, you seem to be running the query below.
> 
> 
> 2017-11-06 15:09:37,383 [25ff3e7e-39ef-a175-93e7-e4e62b284add:foreman] 
> INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 25ff3e7e-39ef-a175-93e7-e4e62b284add: SELECT * FROM 
> `dfs`.`Inputs`.`./Compliance.json` LIMIT 100
> 
> 
> Below is the exception with query failure.
> 
> 
> 2017-11-06 15:09:45,852 
> [25ff3e7e-39ef-a175-93e7-e4e62b284add:frag:0:0] INFO  
> o.a.d.e.vector.complex.fn.JsonReader - User Error Occurred: You tried 
> to write a VarChar type when you are using a ValueWriter of type 
> NullableBitWriterImpl. (You tried to write a VarChar type when you are 
> using a ValueWriter of type NullableBitWriterImpl.)
> org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.
> 
> It could be related to a schema change. Can you try setting the session parameter below, if you have not tried it already?
> 
> 
> alter session set `store.json.all_text_mode`=true;
> 
> 
> 
> Thanks,
> 
> Arjun
> ________________________________
> From: Yun Liu <y....@castsoftware.com>
> Sent: Tuesday, November 7, 2017 1:46 AM
> To: user@drill.apache.org
> Subject: RE: Drill Capacity
> 
> Hi Arjun and Paul,
> 
> Yep, those are turned on and I am reading it from sqlline.log. The only max allocation number I am seeing is 10,000,000,000. I posted the logs in my Dropbox:
> https://www.dropbox.com/sh/5akxrzm078jsabw/AADuD92swH6c9jwijTjkkac_a?dl=0
> 
> Thank you!
> Yun
> 
> -----Original Message-----
> From: Arjun kr [mailto:arjun.kr@outlook.com]
> Sent: Monday, November 6, 2017 1:20 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> 
> Are you running Drill in embedded mode? If so, the logs will be available in sqlline.log, and drillbit.log will not be populated. You can enable DEBUG logging in logback.xml, run the query, and share the log file as Paul suggested.
> 
> 
> Edit $DRILL_HOME/conf/logback.xml to enable DEBUG level logging.
> 
> 
> <logger name="org.apache.drill" additivity="false">
>    <level value="debug" />
>    <appender-ref ref="FILE" />
>  </logger>
> 
> 
> Thanks,
> 
> 
> Arjun
> 
> ________________________________
> From: Paul Rogers <pr...@mapr.com>
> Sent: Monday, November 6, 2017 10:56 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile, which is what you found. The other is the physical plan used to run the query. It is the physical plan you want to find; that is the one that has the max allocation.
> 
> If you can post your logs somewhere, I'll d/l them and take a look.
> 
> - Paul
> 
>> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
>> 
>> Hi Paul,
>> 
>> I am using Drill v 1.11.0 so I am only seeing sqlline.log and sqlline_queries.log. hopefully the same.
>> 
>> I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:
>> 
>> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}
>> 
>> Is this what you're looking for?
>> 
>> Thanks,
>> Yun
>> 
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Friday, November 3, 2017 6:45 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>> 
>> Thanks for the info. Clearly you are way ahead of me.
>> 
>> In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.
>> 
>> You mentioned each record is 64KB and there are 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.
>> 
>> If your records are 64K in size, and we read 4K records per batch, then the batch size is 256MB, which is also large.
>> 
>> So, we are dealing with jumbo records and you really want the "batch size control" feature that we are working on, but have not yet shipped.
>> 
>> Let's work out the math. How many sorts in your query? What other operators does the query include? Let's assume a single sort.
>> 
>> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
>> 
>> Since things don't work, I suspect that we're missing something.
>> (Note that the memory size we just calculated does not match the 
>> numbers shown in an earlier post in which the sort got just ~40 MB of
>> memory...)
>> 
>> Try this:
>> 
>> * With your current settings, enable debug-level logging. Run your query.
>> 
>> * Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort:
>> 
>> maxAllocation: (some number)
>> 
>> * Ignore the one that says 10,000,000,000; find the one with a smaller number. What is that number?
>> 
>> * Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?
>> 
>> With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
>> 
>> Thanks,
>> 
>> - Paul
>> 
>>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>>> 
>>> Hi Paul,
>>> 
>>> Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clear them up before continuing.
>>> 
>>> Current setting: planner.memory.max_query_memory_per_node = 10GB, 
>>> HEAP = 12G, Direct memory = 32G, Perm 1024M, and 
>>> planner.width.max_per_node = 5
>>> 
>>> Issue # 1:
>>> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
>>> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
>>> With all the recommendations provided by various experts, nothing has worked.
>>> 
>>> Issue # 2:
>>> While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 json files processes fine alone, but when they are joined together, Drill throws that error.
>>> Json#1 is 11k KB, has 8 fields with 74091 rows
>>> Json#2 is 752kb, has 8 fields with 4245 rows
>>> 
>>> Besides breaking them up to smaller files, not sure what else I could do.
>>> 
>>> Thanks for the help so far!
>>> 
>>> Yun
>>> 
>>> -----Original Message-----
>>> From: Paul Rogers [mailto:progers@mapr.com]
>>> Sent: Thursday, November 2, 2017 11:06 PM
>>> To: user@drill.apache.org
>>> Subject: Re: Drill Capacity
>>> 
>>> Hi Yun,
>>> 
>>> I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.
>>> 
>>> How large is each record in your file? How many fields? How many 
>>> bytes? (Alternatively, how big is a single input file and how many 
>>> records does it contain?)
>>> 
>>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.
>>> 
>>> We can guess the size, however, from the info you provided:
>>> 
>>> batchGroups.size 1
>>> spilledBatchGroups.size 0
>>> allocated memory 42768000
>>> allocator limit 41943040
>>> 
>>> This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>>> 
>>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>>> 
>>> (If you do want to use the "managed" version of the sort, I suggest 
>>> you try Drill 1.12 when it is released as that version contains 
>>> additional fixes to handle constrained memory.)
>>> 
>>> Also, at present, the JSON record reader loads 4096 records into 
>>> each batch. If your file has at least that many records, then we can 
>>> guess each record is about 43 MB / 4096 =~ 10K in size. (You can 
>>> confirm, as noted above, by dividing total file size by record 
>>> count.)
>>> 
>>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.
>>> 
>>> Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:
>>> 
>>> * Take the "max query memory per node" (default of 2 GB regardless 
>>> of actual direct memory),
>>> * Divide by the number of sort operators in the plan (as shown in 
>>> the visualized query profile)
>>> * Divide by the "planner width" which is, by default, 70% of the number of cores on your system.
>>> 
>>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
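The three allocation steps above can be checked with a few lines of arithmetic. This is only a sketch of the rule as described in this message, not Drill's actual code: the function names are made up for illustration, and the byte values are the ones quoted in the thread (2 GB default, allocator limit 41943040, allocated memory 42768000).

```python
GB = 1024 ** 3
MB = 1024 ** 2

def memory_per_sort(max_query_memory, sort_count, width_per_node):
    """Budget of one sort instance: query memory / sorts / planner width."""
    return max_query_memory // (sort_count * width_per_node)

def implied_divisor(max_query_memory, allocator_limit):
    """Work backwards from a logged allocator limit to sorts * width."""
    return max_query_memory / allocator_limit

def sort_can_progress(allocator_limit, batch_size):
    """The sort is described as needing at least two batches in memory."""
    return allocator_limit >= 2 * batch_size

# Default 2 GB per node versus the logged allocator limit.
print(implied_divisor(2 * GB, 41943040))

# Hypothetical tuning: 4 GB per node, 2 sorts, width limited to 5.
print(memory_per_sort(4 * GB, 2, 5) // MB, "MB per sort")

# The logged batch (about 43 MB) against the logged limit.
print(sort_can_progress(41943040, 42768000))
```

The first value comes out near 50, which matches the divisor guessed above; the last check is False, which is consistent with the out-of-memory error.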
>>> 
>>> We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.
>>> 
>>> First, we can increase the memory per query:
>>> 
>>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` =
>>> 4294967296
>>> 
>>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>>> 
>>> And/or, we can reduce the number of fragments:
>>> 
>>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>> 
>>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.
>>> 
>>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
>>> 
>>> That is three separate possible solutions. Try them one by one or (carefully) together.
>>> 
>>> - Paul
>>> 
>>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>> 
>>>>> Hi Kunal and Andries,
>>>>> 
>>>>> Thanks for your reply. We need json in this case because Drill 
>>>>> only supports up to 65536 columns in a csv file.
>> 
>

Re: Drill Capacity

Posted by Paul Rogers <pr...@mapr.com>.

Hi Yun,

I looked at the sqlline.log file you posted. (Thanks much for doing so.) Here’s what I noted:

The log shows a failed query, but this one is different than the one we discussed earlier. Query:

SELECT * FROM `dfs`.`Inputs`.`./Compliance.json` LIMIT 100

Since this is a LIMIT query, with no ORDER BY, we got a different plan than the query we discussed earlier. The earlier one had a stack trace that suggested the query had an ORDER BY that used the legacy (non-managed) version of the sort.

Despite the fact that the query is different, the above query did, in fact, fail, but for a different reason.

JsonReader - User Error Occurred: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl. (You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.

This says that there is a schema change, but one within a single batch. Variations of this bug can occur if you have several values of the form 10, 20, 30, but then later have a value like “Hi”: we create a numeric vector, then try to write a string into it.

Here, it appears you have values that are boolean, followed by a string:

… "a": true …
… "a": false …
… "a": "a string!"

The JSON writer sees the boolean and allocates a bit vector. Then, it sees the string, tries to write that into a bit vector, and gets the error displayed above.

You can work around this by using “all text mode” (the `store.json.all_text_mode` session option that Arjun mentioned), which reads all fields as text. Or, you can clean up your data.
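
If you go the data clean-up route, a small script can point at the offending fields. The following is a rough sketch, not part of Drill: the function names are made up for illustration, and it assumes the file is a single JSON array that fits in memory when loaded with Python's json module. It reports every field path that holds more than one JSON type. Integers and decimals are counted as different types on the assumption that a reader may map them to different vector types.

```python
import json
from collections import defaultdict

def json_type(value):
    """Return a JSON type name for a value decoded by the json module."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"

def collect_types(value, path, seen):
    """Record the scalar types seen at each path, descending into maps and arrays."""
    if isinstance(value, dict):
        for key, child in value.items():
            collect_types(child, f"{path}.{key}" if path else key, seen)
    elif isinstance(value, list):
        for child in value:
            collect_types(child, path + "[]", seen)
    else:
        seen[path].add(json_type(value))

def mixed_type_fields(records):
    """Map each field path with more than one non-null type to its sorted types."""
    seen = defaultdict(set)
    for record in records:
        collect_types(record, "", seen)
    return {p: sorted(t - {"null"}) for p, t in seen.items() if len(t - {"null"}) > 1}

# The boolean-then-string pattern described above.
sample = json.loads('[{"a": true}, {"a": false}, {"a": "a string!"}, {"b": 1}]')
print(mixed_type_fields(sample))
```

For a real file, replace the sample with something like json.load(open(path)) and look at the paths it prints.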

Once this file works, perhaps you can try another run to recreate the original memory issue with the sort so we can track that one down.

Thanks,

- Paul

> On Nov 7, 2017, at 1:49 PM, Kunal Khatua <kk...@mapr.com> wrote:
> 
> Hi Yun
> 
> The new release might not address this issue as we don't have a repro for this. Any chance you can provide a sample anonymized data set? The JSON data doesn't have to be meaningful, but we need to be able to reproduce it to ensure that we are indeed addressing the issue you faced. 
> 
> Thanks
> ~K
> -----Original Message-----
> From: Yun Liu [mailto:y.liu@castsoftware.com] 
> Sent: Tuesday, November 07, 2017 7:17 AM
> To: user@drill.apache.org
> Subject: RE: Drill Capacity
> 
> Hi Arjun,
> 
> That was already altered and schema was not changed. I've reduced the json size and everything works fine. I believe it was giving a false error. Seems that's the only way to bypass this error until your new release comes out?
> 
> Thanks,
> Yun
> 
> -----Original Message-----
> From: Arjun kr [mailto:arjun.kr@outlook.com]
> Sent: Monday, November 6, 2017 7:39 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> 
> Looking at the log you shared, you seem to be running the query below.
> 
> 
> 2017-11-06 15:09:37,383 [25ff3e7e-39ef-a175-93e7-e4e62b284add:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 25ff3e7e-39ef-a175-93e7-e4e62b284add: SELECT * FROM `dfs`.`Inputs`.`./Compliance.json` LIMIT 100
> 
> 
> Below is the exception with query failure.
> 
> 
> 2017-11-06 15:09:45,852 [25ff3e7e-39ef-a175-93e7-e4e62b284add:frag:0:0] INFO  o.a.d.e.vector.complex.fn.JsonReader - User Error Occurred: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl. (You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.)
> org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.
> 
> It could be related to a schema change. Can you try setting the session parameter below, if you have not tried it already?
> 
> 
> alter session set `store.json.all_text_mode`=true;
> 
> 
> 
> Thanks,
> 
> Arjun
> ________________________________
> From: Yun Liu <y....@castsoftware.com>
> Sent: Tuesday, November 7, 2017 1:46 AM
> To: user@drill.apache.org
> Subject: RE: Drill Capacity
> 
> Hi Arjun and Paul,
> 
> Yep, those are turned on and I am reading it from sqlline.log. The only max allocation number I see is 10,000,000,000. I posted the logs in my Dropbox:
> https://www.dropbox.com/sh/5akxrzm078jsabw/AADuD92swH6c9jwijTjkkac_a?dl=0
> 
> Thank you!
> Yun
> 
> -----Original Message-----
> From: Arjun kr [mailto:arjun.kr@outlook.com]
> Sent: Monday, November 6, 2017 1:20 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> 
> Are you running in Drill embedded mode? If so, the logs will be available in sqlline.log and drillbit.log will not be populated. You can enable DEBUG logging in logback.xml, run the query, and share the log file as Paul suggested.
> 
> 
> Edit $DRILL_HOME/conf/logback.xml to enable DEBUG level logging.
> 
> 
> <logger name="org.apache.drill" additivity="false">
>    <level value="debug" />
>    <appender-ref ref="FILE" />
>  </logger>
> 
> 
> Thanks,
> 
> 
> Arjun
> 
> ________________________________
> From: Paul Rogers <pr...@mapr.com>
> Sent: Monday, November 6, 2017 10:56 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile, which is what you found. The other is the physical plan used to run the query. It is the physical plan you want to find; that is the one that has the max allocation.
> 
> If you can post your logs somewhere, I'll d/l them and take a look.
> 
> - Paul
> 
>> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
>> 
>> Hi Paul,
>> 
>> I am using Drill v 1.11.0 so I am only seeing sqlline.log and sqlline_queries.log. Hopefully those are the same thing.
>> 
>> I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:
>> 
>> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}
>> 
>> Is this what you're looking for?
>> 
>> Thanks,
>> Yun
>> 
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Friday, November 3, 2017 6:45 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>> 
>> Thanks for the info. Clearly you are way ahead of me.
>> 
>> In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.
>> 
>> You mentioned each record is 64KB and 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.
>> 
>> If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.
>> 
>> So, we are dealing with jumbo records and you really want the "batch size control" feature that we are working on, but have not yet shipped.
>> 
>> Let's work out the math. How many sorts in your query? What other operators does the query include? Let's assume a single sort.
>> 
>> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
>> 
>> Since things don't work, I suspect that we're missing something.
>> (Note that the memory size we just calculated does not match the 
>> numbers shown in an earlier post in which the sort got just ~40 MB of
>> memory...)
>> 
>> Try this:
>> 
>> * With your current settings, enable debug-level logging. Run your query.
>> 
>> * Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort:
>> 
>> maxAllocation: (some number)
>> 
>> * Ignore the one that says 10,000,000,000; find the one with a smaller number. What is that number?
>> 
>> * Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?
>> 
>> With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
>> 
>> Thanks,
>> 
>> - Paul
>> 
>>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>>> 
>>> Hi Paul,
>>> 
>>> Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clear them up before continuing.
>>> 
>>> Current setting: planner.memory.max_query_memory_per_node = 10GB, 
>>> HEAP = 12G, Direct memory = 32G, Perm 1024M, and 
>>> planner.width.max_per_node = 5
>>> 
>>> Issue # 1:
>>> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
>>> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
>>> With all the recommendations provided by various experts, nothing has worked.
>>> 
>>> Issue # 2:
>>> While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 json files processes fine alone, but when they are joined together, Drill throws that error.
>>> Json#1 is 11k KB, has 8 fields with 74091 rows
>>> Json#2 is 752kb, has 8 fields with 4245 rows
>>> 
>>> Besides breaking them up to smaller files, not sure what else I could do.
>>> 
>>> Thanks for the help so far!
>>> 
>>> Yun
>>> 
>>> -----Original Message-----
>>> From: Paul Rogers [mailto:progers@mapr.com]
>>> Sent: Thursday, November 2, 2017 11:06 PM
>>> To: user@drill.apache.org
>>> Subject: Re: Drill Capacity
>>> 
>>> Hi Yun,
>>> 
>>> I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.
>>> 
>>> How large is each record in your file? How many fields? How many 
>>> bytes? (Alternatively, how big is a single input file and how many 
>>> records does it contain?)
>>> 
>>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.
>>> 
>>> We can guess the size, however, from the info you provided:
>>> 
>>> batchGroups.size 1
>>> spilledBatchGroups.size 0
>>> allocated memory 42768000
>>> allocator limit 41943040
>>> 
>>> This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>>> 
>>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>>> 
>>> (If you do want to use the "managed" version of the sort, I suggest 
>>> you try Drill 1.12 when it is released as that version contains 
>>> additional fixes to handle constrained memory.)
>>> 
>>> Also, at present, the JSON record reader loads 4096 records into each 
>>> batch. If your file has at least that many records, then we can guess 
>>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, 
>>> as noted above, by dividing total file size by record count.)
>>> 
>>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.
>>> 
>>> Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:
>>> 
>>> * Take the "max query memory per node" (default of 2 GB regardless of 
>>> actual direct memory),
>>> * Divide by the number of sort operators in the plan (as shown in the 
>>> visualized query profile)
>>> * Divide by the "planner width" which is, by default, 70% of the number of cores on your system.
>>> 
>>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
>>> 
>>> We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.
>>> 
>>> First, we can increase the memory per query:
>>> 
>>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` =
>>> 4294967296
>>> 
>>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>>> 
>>> And/or, we can reduce the number of fragments:
>>> 
>>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>> 
>>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.
>>> 
>>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
>>> 
>>> That is three separate possible solutions. Try them one by one or (carefully) together.
>>> 
>>> - Paul
>>> 
>>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>> 
>>>>> Hi Kunal and Andries,
>>>>> 
>>>>> Thanks for your reply. We need json in this case because Drill 
>>>>> only supports up to 65536 columns in a csv file.
>> 
>

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Kunal,

Please see below the dataset I provided this week. Hope it helps:

[ {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid unreferenced Tables",
    "key" : "1634",
    "critical" : false
  },
  "result" : {
    "grade" : 2,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 5,
      "successfulChecks" : 47,
      "ratio" : 0.9038461538461539
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.0769230769230775,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 5,
        "successfulChecks" : 47,
        "ratio" : 0.9038461538461539
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Namespace naming convention - case control",
    "key" : "3550",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 0,
      "successfulChecks" : 31,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 0,
        "successfulChecks" : 31,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "2",
    "name" : "Interface naming convention - case and character set control",
    "key" : "3554",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumerations naming convention - case and character set control",
    "key" : "3558",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 7,
      "failedChecks" : 0,
      "successfulChecks" : 7,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 7,
        "failedChecks" : 0,
        "successfulChecks" : 7,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Enumeration Items naming convention - case and character set control",
    "key" : "3560",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 65,
      "failedChecks" : 0,
      "successfulChecks" : 65,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 65,
        "failedChecks" : 0,
        "successfulChecks" : 65,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Private Fields naming convention - case and character set control",
    "key" : "3562",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 90,
      "successfulChecks" : 40,
      "ratio" : 0.3076923076923077
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 90,
        "successfulChecks" : 40,
        "ratio" : 0.3076923076923077
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Methods naming convention - case and character set control",
    "key" : "3566",
    "critical" : false
  },
  "result" : {
    "grade" : 3.759803921568629,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 11,
      "successfulChecks" : 550,
      "ratio" : 0.9803921568627451
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.759803921568629,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 11,
        "successfulChecks" : 550,
        "ratio" : 0.9803921568627451
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "",
    "name" : "Avoid using Keywords as names",
    "key" : "3570",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 1596,
      "failedChecks" : 4,
      "successfulChecks" : 1592,
      "ratio" : 0.9974937343358395
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 1596,
        "failedChecks" : 4,
        "successfulChecks" : 1592,
        "ratio" : 0.9974937343358395
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3574/snapshots/3722",
    "name" : "Properties naming convention - case and character set control",
    "key" : "3574",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 352,
      "failedChecks" : 0,
      "successfulChecks" : 352,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 352,
        "failedChecks" : 0,
        "successfulChecks" : 352,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3576/snapshots/3722",
    "name" : "Avoid declaring public Fields",
    "key" : "3576",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 130,
      "failedChecks" : 0,
      "successfulChecks" : 130,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 130,
        "failedChecks" : 0,
        "successfulChecks" : 130,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3578/snapshots/3722",
    "name" : "Avoid large Classes - too many Constructors",
    "key" : "3578",
    "critical" : false
  },
  "result" : {
    "grade" : 3.9691011235955074,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 2,
      "successfulChecks" : 176,
      "ratio" : 0.9887640449438202
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.9691011235955074,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 2,
        "successfulChecks" : 176,
        "ratio" : 0.9887640449438202
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3580/snapshots/3722",
    "name" : "Avoid large Classes - too many Methods",
    "key" : "3580",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 1,
      "successfulChecks" : 177,
      "ratio" : 0.9943820224719101
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 1,
        "successfulChecks" : 177,
        "ratio" : 0.9943820224719101
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3586/snapshots/3722",
    "name" : "Avoid large Methods - too many Lines of Code",
    "key" : "3586",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 4,
      "successfulChecks" : 492,
      "ratio" : 0.9919354838709677
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 4,
        "successfulChecks" : 492,
        "ratio" : 0.9919354838709677
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3612/snapshots/3722",
    "name" : "Close SQL connection ASAP",
    "key" : "3612",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3614/snapshots/3722",
    "name" : "Avoid using String.Empty for empty string tests",
    "key" : "3614",
    "critical" : false
  },
  "result" : {
    "grade" : 3.971603563474389,
    "violationRatio" : {
      "totalChecks" : 449,
      "failedChecks" : 5,
      "successfulChecks" : 444,
      "ratio" : 0.9888641425389755
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.971603563474389,
      "violationRatio" : {
        "totalChecks" : 449,
        "failedChecks" : 5,
        "successfulChecks" : 444,
        "ratio" : 0.9888641425389755
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3626/snapshots/3722",
    "name" : "Avoid Interfaces with a very low comment/code ratio",
    "key" : "3626",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 8,
      "successfulChecks" : 2,
      "ratio" : 0.2
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 8,
        "successfulChecks" : 2,
        "ratio" : 0.2
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "3630/snapshots/3722",
    "name" : "Avoid having Classes implementing too many Interfaces",
    "key" : "3630",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 178,
      "failedChecks" : 0,
      "successfulChecks" : 178,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 178,
        "failedChecks" : 0,
        "successfulChecks" : 178,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4062/snapshots/3722",
    "name" : "Avoid Functions and Procedures doing an Insert, Update or Delete without managing a transaction",
    "key" : "4062",
    "critical" : false
  },
  "result" : {
    "grade" : 1.3841463414634148,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 71,
      "successfulChecks" : 134,
      "ratio" : 0.6536585365853659
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 12,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.3841463414634148,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 71,
        "successfulChecks" : 134,
        "ratio" : 0.6536585365853659
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 12,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4064/snapshots/3722",
    "name" : "Avoid Procedures using an Insert, Update, Delete, Create Table or Select without including error management",
    "key" : "4064",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 172,
      "successfulChecks" : 31,
      "ratio" : 0.15270935960591134
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 23,
      "removedCriticalViolations" : 2,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 172,
        "successfulChecks" : 31,
        "ratio" : 0.15270935960591134
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 23,
        "removedCriticalViolations" : 2,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4066/snapshots/3722",
    "name" : "Avoid Stored Procedures not returning a status value",
    "key" : "4066",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 184,
      "successfulChecks" : 19,
      "ratio" : 0.09359605911330049
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 23,
      "removedViolations" : 2
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 184,
        "successfulChecks" : 19,
        "ratio" : 0.09359605911330049
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 23,
        "removedViolations" : 2
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4070/snapshots/3722",
    "name" : "Avoid use of \"truncate table\"",
    "key" : "4070",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4076/snapshots/3722",
    "name" : "Avoid using temporary Objects",
    "key" : "4076",
    "critical" : false
  },
  "result" : {
    "grade" : 3.274390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 8,
      "successfulChecks" : 197,
      "ratio" : 0.9609756097560975
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.274390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 8,
        "successfulChecks" : 197,
        "ratio" : 0.9609756097560975
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "4084/snapshots/3722",
    "name" : "Avoid nested Stored Procedures using temporary Tables",
    "key" : "4084",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 203,
      "failedChecks" : 0,
      "successfulChecks" : 203,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 203,
        "failedChecks" : 0,
        "successfulChecks" : 203,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7126/snapshots/3722",
    "name" : "Avoid Artifacts with high Commented-out Code Lines/Code Lines ratio",
    "key" : "7126",
    "critical" : false
  },
  "result" : {
    "grade" : 3.7591514143094855,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 59,
      "successfulChecks" : 2946,
      "ratio" : 0.9803660565723794
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.8659342622365145,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 43,
        "successfulChecks" : 2756,
        "ratio" : 0.9846373704894605
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4466019417475735,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 16,
        "successfulChecks" : 190,
        "ratio" : 0.9223300970873787
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7130/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Nested Subqueries",
    "key" : "7130",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 0,
      "successfulChecks" : 206,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7156/snapshots/3722",
    "name" : "Avoid Too Many Copy Pasted Artifacts",
    "key" : "7156",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6478102189781023,
    "violationRatio" : {
      "totalChecks" : 959,
      "failedChecks" : 231,
      "successfulChecks" : 728,
      "ratio" : 0.7591240875912408
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 7,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.671151439299124,
      "violationRatio" : {
        "totalChecks" : 799,
        "failedChecks" : 185,
        "successfulChecks" : 614,
        "ratio" : 0.7684605757196495
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.53125,
      "violationRatio" : {
        "totalChecks" : 160,
        "failedChecks" : 46,
        "successfulChecks" : 114,
        "ratio" : 0.7125
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 7,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7194/snapshots/3722",
    "name" : "Avoid large number of String concatenation",
    "key" : "7194",
    "critical" : false
  },
  "result" : {
    "grade" : 3.796370967741936,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 9,
      "successfulChecks" : 487,
      "ratio" : 0.9818548387096774
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.796370967741936,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 9,
        "successfulChecks" : 487,
        "ratio" : 0.9818548387096774
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7198/snapshots/3722",
    "name" : "Avoid String concatenation in loops",
    "key" : "7198",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 15,
      "successfulChecks" : 481,
      "ratio" : 0.969758064516129
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 15,
        "successfulChecks" : 481,
        "ratio" : 0.969758064516129
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7204/snapshots/3722",
    "name" : "Avoid method invocation in a loop termination expression",
    "key" : "7204",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 2,
      "successfulChecks" : 559,
      "ratio" : 0.9964349376114082
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 2,
        "successfulChecks" : 559,
        "ratio" : 0.9964349376114082
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7208/snapshots/3722",
    "name" : "Avoid the use of is inside loops",
    "key" : "7208",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 0,
      "successfulChecks" : 496,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 0,
        "successfulChecks" : 496,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7212/snapshots/3722",
    "name" : "Avoid instantiations inside loops",
    "key" : "7212",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 496,
      "failedChecks" : 23,
      "successfulChecks" : 473,
      "ratio" : 0.9536290322580645
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 496,
        "failedChecks" : 23,
        "successfulChecks" : 473,
        "ratio" : 0.9536290322580645
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7260/snapshots/3722",
    "name" : "User Interface elements must not use directly the database",
    "key" : "7260",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7262/snapshots/3722",
    "name" : "Avoid Namespaces with High Efferent Coupling (CE)",
    "key" : "7262",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7264/snapshots/3722",
    "name" : "Avoid namespaces with High Afferent Coupling (CA)",
    "key" : "7264",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7266/snapshots/3722",
    "name" : "Call 'base.Dispose()' or 'MyBase.Finalize()' in the \"finally\" block of 'Dispose(bool)' methods",
    "key" : "7266",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 3,
      "failedChecks" : 3,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 3,
        "failedChecks" : 3,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7268/snapshots/3722",
    "name" : "Dispose() methods should call GC.SuppressFinalize",
    "key" : "7268",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 5,
      "failedChecks" : 0,
      "successfulChecks" : 5,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 5,
        "failedChecks" : 0,
        "successfulChecks" : 5,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7270/snapshots/3722",
    "name" : "Declare as Static all methods not using instance members",
    "key" : "7270",
    "critical" : false
  },
  "result" : {
    "grade" : 3.5369875222816383,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 16,
      "successfulChecks" : 545,
      "ratio" : 0.9714795008912656
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 3.5369875222816383,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 16,
        "successfulChecks" : 545,
        "ratio" : 0.9714795008912656
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7272/snapshots/3722",
    "name" : "Provide a private default Constructor for utility Classes",
    "key" : "7272",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 6,
      "failedChecks" : 6,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 6,
        "failedChecks" : 6,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7294/snapshots/3722",
    "name" : "Avoid cyclical calls and inheritances between namespaces content",
    "key" : "7294",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 31,
      "failedChecks" : 5,
      "successfulChecks" : 26,
      "ratio" : 0.8387096774193549
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 31,
        "failedChecks" : 5,
        "successfulChecks" : 26,
        "ratio" : 0.8387096774193549
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7344/snapshots/3722",
    "name" : "Avoid \"SELECT *\" queries",
    "key" : "7344",
    "critical" : false
  },
  "result" : {
    "grade" : 2.932038834951456,
    "violationRatio" : {
      "totalChecks" : 206,
      "failedChecks" : 11,
      "successfulChecks" : 195,
      "ratio" : 0.9466019417475728
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.932038834951456,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7346/snapshots/3722",
    "name" : "Avoid redundant indexes",
    "key" : "7346",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 0,
      "successfulChecks" : 52,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 0,
        "successfulChecks" : 52,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7348/snapshots/3722",
    "name" : "Avoid too many Indexes on one Table",
    "key" : "7348",
    "critical" : false
  },
  "result" : {
    "grade" : 3.76923076923077,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 1,
      "successfulChecks" : 51,
      "ratio" : 0.9807692307692307
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.76923076923077,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 1,
        "successfulChecks" : 51,
        "ratio" : 0.9807692307692307
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7352/snapshots/3722",
    "name" : "Avoid calling properties that clone values in loops",
    "key" : "7352",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 561,
      "failedChecks" : 0,
      "successfulChecks" : 561,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 561,
        "failedChecks" : 0,
        "successfulChecks" : 561,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7386/snapshots/3722",
    "name" : "Avoid Tables without a clustered Index",
    "key" : "7386",
    "critical" : false
  },
  "result" : {
    "grade" : 1.721153846153846,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 11,
      "successfulChecks" : 41,
      "ratio" : 0.7884615384615384
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 4,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.721153846153846,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 11,
        "successfulChecks" : 41,
        "ratio" : 0.7884615384615384
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 4,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7388/snapshots/3722",
    "name" : "Avoid artifacts having recursive calls",
    "key" : "7388",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 3005,
      "failedChecks" : 19,
      "successfulChecks" : 2986,
      "ratio" : 0.9936772046589019
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2799,
        "failedChecks" : 19,
        "successfulChecks" : 2780,
        "ratio" : 0.9932118613790639
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 0,
        "successfulChecks" : 206,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7390/snapshots/3722",
    "name" : "Avoid having multiple Artifacts inserting data on the same SQL Table",
    "key" : "7390",
    "critical" : false
  },
  "result" : {
    "grade" : 1.673076923076923,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 12,
      "successfulChecks" : 40,
      "ratio" : 0.7692307692307693
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.673076923076923,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 12,
        "successfulChecks" : 40,
        "ratio" : 0.7692307692307693
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7392/snapshots/3722",
    "name" : "Avoid having multiple artifacts deleting data on the same SQL table",
    "key" : "7392",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8653846153846154,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 8,
      "successfulChecks" : 44,
      "ratio" : 0.8461538461538461
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.8653846153846154,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 8,
        "successfulChecks" : 44,
        "ratio" : 0.8461538461538461
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7394/snapshots/3722",
    "name" : "Avoid having multiple Artifacts updating data on the same SQL Table",
    "key" : "7394",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4615384615384612,
    "violationRatio" : {
      "totalChecks" : 52,
      "failedChecks" : 4,
      "successfulChecks" : 48,
      "ratio" : 0.9230769230769231
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 2.4615384615384612,
      "violationRatio" : {
        "totalChecks" : 52,
        "failedChecks" : 4,
        "successfulChecks" : 48,
        "ratio" : 0.9230769230769231
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7418/snapshots/3722",
    "name" : "Avoid SQL queries using functions on indexed columns in the WHERE clause",
    "key" : "7418",
    "critical" : false
  },
  "result" : {
    "grade" : 3.28846153846154,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 7,
      "successfulChecks" : 175,
      "ratio" : 0.9615384615384616
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 5,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.28846153846154,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 7,
        "successfulChecks" : 175,
        "ratio" : 0.9615384615384616
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 5,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7420/snapshots/3722",
    "name" : "Avoid SQL queries with implicit conversions in the WHERE clause",
    "key" : "7420",
    "critical" : false
  },
  "result" : {
    "grade" : 3.425824175824175,
    "violationRatio" : {
      "totalChecks" : 182,
      "failedChecks" : 6,
      "successfulChecks" : 176,
      "ratio" : 0.967032967032967
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 3.425824175824175,
      "violationRatio" : {
        "totalChecks" : 182,
        "failedChecks" : 6,
        "successfulChecks" : 176,
        "ratio" : 0.967032967032967
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7424/snapshots/3722",
    "name" : "Avoid using SQL queries inside a loop",
    "key" : "7424",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 221,
      "failedChecks" : 11,
      "successfulChecks" : 210,
      "ratio" : 0.9502262443438914
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 15,
        "failedChecks" : 0,
        "successfulChecks" : 15,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 11,
        "successfulChecks" : 195,
        "ratio" : 0.9466019417475728
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7428/snapshots/3722",
    "name" : "Avoid SQL queries not using the first column of a composite index in the WHERE clause",
    "key" : "7428",
    "critical" : false
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 24,
      "failedChecks" : 24,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 3,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 24,
        "failedChecks" : 24,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 3,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7458/snapshots/3722",
    "name" : "Avoid large Interfaces - too many Methods",
    "key" : "7458",
    "critical" : false
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 10,
      "failedChecks" : 0,
      "successfulChecks" : 10,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 10,
        "failedChecks" : 0,
        "successfulChecks" : 10,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7460/snapshots/3722",
    "name" : "Avoid using untyped DataSet",
    "key" : "7460",
    "critical" : true
  },
  "result" : {
    "grade" : 1.0,
    "violationRatio" : {
      "totalChecks" : 17,
      "failedChecks" : 17,
      "successfulChecks" : 0,
      "ratio" : 0.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.0,
      "violationRatio" : {
        "totalChecks" : 17,
        "failedChecks" : 17,
        "successfulChecks" : 0,
        "ratio" : 0.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7470/snapshots/3722",
    "name" : "Avoid doing select on Datatable in loop",
    "key" : "7470",
    "critical" : true
  },
  "result" : {
    "grade" : 4.0,
    "violationRatio" : {
      "totalChecks" : 2,
      "failedChecks" : 0,
      "successfulChecks" : 2,
      "ratio" : 1.0
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 2,
        "failedChecks" : 0,
        "successfulChecks" : 2,
        "ratio" : 1.0
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7760/snapshots/3722",
    "name" : "Avoid triggers, functions and procedures with a very low comment/code ratio",
    "key" : "7760",
    "critical" : false
  },
  "result" : {
    "grade" : 1.4695121951219512,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 64,
      "successfulChecks" : 141,
      "ratio" : 0.6878048780487804
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 2,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.4695121951219512,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 64,
        "successfulChecks" : 141,
        "ratio" : 0.6878048780487804
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 2,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7762/snapshots/3722",
    "name" : "Avoid undocumented Triggers, Functions and Procedures",
    "key" : "7762",
    "critical" : false
  },
  "result" : {
    "grade" : 1.6524390243902438,
    "violationRatio" : {
      "totalChecks" : 205,
      "failedChecks" : 49,
      "successfulChecks" : 156,
      "ratio" : 0.7609756097560976
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 1
    }
  },
  "technologyResults" : [ {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 1.6524390243902438,
      "violationRatio" : {
        "totalChecks" : 205,
        "failedChecks" : 49,
        "successfulChecks" : 156,
        "ratio" : 0.7609756097560976
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 1
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7766/snapshots/3722",
    "name" : "Avoid Artifacts with High Cyclomatic Complexity",
    "key" : "7766",
    "critical" : false
  },
  "result" : {
    "grade" : 1.8886054421768708,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 425,
      "successfulChecks" : 2515,
      "ratio" : 0.8554421768707483
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 1,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 1.8632040965618142,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 423,
        "successfulChecks" : 2311,
        "ratio" : 0.8452816386247257
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 0,
        "removedViolations" : 0
      }
    }
  }, {
    "technology" : "Microsoft T-SQL",
    "result" : {
      "grade" : 4.0,
      "violationRatio" : {
        "totalChecks" : 206,
        "failedChecks" : 2,
        "successfulChecks" : 204,
        "ratio" : 0.9902912621359223
      },
      "evolutionSummary" : {
        "addedCriticalViolations" : 0,
        "removedCriticalViolations" : 0,
        "addedViolations" : 1,
        "removedViolations" : 0
      }
    }
  } ]
}, {
  "type" : "quality-rules",
  "reference" : {
    "href" : "7768/snapshots/3722",
    "name" : "Avoid Artifacts with High Depth of Code",
    "key" : "7768",
    "critical" : false
  },
  "result" : {
    "grade" : 2.4081632653061233,
    "violationRatio" : {
      "totalChecks" : 2940,
      "failedChecks" : 234,
      "successfulChecks" : 2706,
      "ratio" : 0.9204081632653062
    },
    "evolutionSummary" : {
      "addedCriticalViolations" : 0,
      "removedCriticalViolations" : 0,
      "addedViolations" : 0,
      "removedViolations" : 0
    }
  },
  "technologyResults" : [ {
    "technology" : ".NET",
    "result" : {
      "grade" : 2.2882223847842,
      "violationRatio" : {
        "totalChecks" : 2734,
        "failedChecks" : 234,
        "successfulChecks" : 2500,
        "ratio" : 0.9144111192392099
      },
      "evolutionSummary" : {
        "addedCritica

Thanks,
Yun

-----Original Message-----
From: Kunal Khatua [mailto:kkhatua@mapr.com] 
Sent: Tuesday, November 7, 2017 4:50 PM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Yun

The new release might not address this issue, as we don't have a repro for it. Any chance you can provide a sample anonymized data set? The JSON data doesn't have to be meaningful, but we need to be able to reproduce the problem to ensure that we are indeed addressing the issue you faced.

Thanks
~K
-----Original Message-----
From: Yun Liu [mailto:y.liu@castsoftware.com]
Sent: Tuesday, November 07, 2017 7:17 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Arjun,

That was already altered and schema was not changed. I've reduced the json size and everything works fine. I believe it was giving a false error. Seems that's the only way to bypass this error until your new release comes out?

Thanks,
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com]
Sent: Monday, November 6, 2017 7:39 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Looking at the log you shared, you seem to be running the query below.


2017-11-06 15:09:37,383 [25ff3e7e-39ef-a175-93e7-e4e62b284add:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 25ff3e7e-39ef-a175-93e7-e4e62b284add: SELECT * FROM `dfs`.`Inputs`.`./Compliance.json` LIMIT 100


Below is the exception from the query failure.


2017-11-06 15:09:45,852 [25ff3e7e-39ef-a175-93e7-e4e62b284add:frag:0:0] INFO  o.a.d.e.vector.complex.fn.JsonReader - User Error Occurred: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl. (You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.

It could be related to a schema change. Can you try setting the session parameter below, if you have not tried it already?


alter session set `store.json.all_text_mode`=true;
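
If the schema-change theory holds, some field carries values of different JSON types in different records; the error text reads as if a field first seen as a boolean later held a string. A minimal Python sketch for checking that outside Drill, assuming the records can be loaded as a list of objects. The function name `type_conflicts` and the sample records are made up for illustration:

```python
from collections import defaultdict


def type_conflicts(records):
    """Return {field path: sorted type names} for paths seen with more than one type."""
    seen = defaultdict(set)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            for child in value:
                walk(child, path + "[]")
        elif value is not None:
            # Nulls are skipped so that only real type changes are reported.
            seen[path].add(type(value).__name__)

    for record in records:
        walk(record, "")
    return {path: sorted(types) for path, types in seen.items() if len(types) > 1}


# Hypothetical records: "critical" is a boolean in one and a string in the other.
sample = [
    {"reference": {"key": "7348", "critical": False}},
    {"reference": {"key": "7424", "critical": "true"}},
]
conflicts = type_conflicts(sample)
print(conflicts)
```

For a real file that is a single top-level array, the records could come from `json.load`, memory permitting. Note that the sketch also reports a field that mixes integers and decimals.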



Thanks,

Arjun
________________________________
From: Yun Liu <y....@castsoftware.com>
Sent: Tuesday, November 7, 2017 1:46 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Arjun and Paul,

Yep, those are turned on and I am reading it from sqlline.log. The only max allocation number I see is 10,000,000,000. I posted the logs in my Dropbox:
https://www.dropbox.com/sh/5akxrzm078jsabw/AADuD92swH6c9jwijTjkkac_a?dl=0

Thank you!
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com]
Sent: Monday, November 6, 2017 1:20 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Are you running in Drill embedded mode? If so, the logs will be available in sqlline.log and drillbit.log will not be populated. You can enable DEBUG logging in logback.xml, run the query and share the log file as Paul suggested.


Edit $DRILL_HOME/conf/logback.xml to enable DEBUG level logging.


<logger name="org.apache.drill" additivity="false">
  <level value="debug" />
  <appender-ref ref="FILE" />
</logger>


Thanks,


Arjun

________________________________
From: Paul Rogers <pr...@mapr.com>
Sent: Monday, November 6, 2017 10:56 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile, which is what you found. The other is the physical plan used to run the query. It is the physical plan you want to find; that is the one that has the max allocation.

If you can post your logs somewhere, I'll d/l them and take a look.

- Paul

> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
>
> Hi Paul,
>
> I am using Drill v 1.11.0 so I am only seeing sqlline.log and sqlline_queries.log. hopefully the same.
>
> I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:
>
> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryTe
> xt":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 
> 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","
> username":"","remoteAddress":"localhost"}
>
> Is this what you're looking for?
>
> Thanks,
> Yun
>
> -----Original Message-----
> From: Paul Rogers [mailto:progers@mapr.com]
> Sent: Friday, November 3, 2017 6:45 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
>
> Thanks for the info. Clearly you are way ahead of me.
>
> In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.
>
> You mentioned each record is 64KB and 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.
>
> If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.
>
> So, we are dealing with jumbo records and you really want the "batch size control" feature that we are working on, but have not yet shipped.
>
> Let's work out the math. How many sorts in your query? What other operators does the query include? Let's assume a single sort.
>
> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
>
> Since things don't work, I suspect that we're missing something.
> (Note that the memory size we just calculated does not match the 
> numbers shown in an earlier post in which the sort got just ~40 MB of
> memory...)
>
> Try this:
>
> * With your current settings, enable debug-level logging. Run your query.
>
> * Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort:
>
> maxAllocation: (some number)
>
> * Ignore the one that says 10,000,000,000; find the one with a smaller number. What is that number?
>
> * Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?
>
> With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
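
The log search described above can be done with a short script. A minimal Python sketch, assuming the plan JSON in the log writes the setting as `"maxAllocation" : <number>`; the sample text below is invented for illustration and is not taken from the real log:

```python
import re

# Matches e.g.  "maxAllocation" : 10000000000  (quotes and digit commas optional).
MAX_ALLOC = re.compile(r'"?maxAllocation"?\s*:\s*([\d,]+)')


def max_allocations(log_text):
    """Return the distinct maxAllocation values found in log_text, smallest first."""
    values = {int(match.replace(",", "")) for match in MAX_ALLOC.findall(log_text)}
    return sorted(values)


# Hypothetical log fragment with one query-wide and one per-operator value.
sample_log = '''
  "initialAllocation" : 1000000, "maxAllocation" : 10000000000
  "initialAllocation" : 1000000, "maxAllocation" : 41943040
  "initialAllocation" : 1000000, "maxAllocation" : 10000000000
'''
found = max_allocations(sample_log)
print(found)
```

If only one value comes back, the plan contains no operator with a smaller limit.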
>
> Thanks,
>
> - Paul
>
>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clear them up before continuing.
>>
>> Current setting: planner.memory.max_query_memory_per_node = 10GB, 
>> HEAP = 12G, Direct memory = 32G, Perm 1024M, and 
>> planner.width.max_per_node = 5
>>
>> Issue # 1:
>> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
>> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
>> With all the recommendations provided by various experts, nothing has worked.
>>
>> Issue # 2:
>> While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. These 2 json files alone process fine but when joined together, Drill throws me that error.
>> Json#1 is 11k KB, has 8 fields with 74091 rows
>> Json#2 is 752kb, has 8 fields with 4245 rows
>>
>> Besides breaking them up to smaller files, not sure what else I could do.
>>
>> Thanks for the help so far!
>>
>> Yun
>>
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Thursday, November 2, 2017 11:06 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>>
>> Hi Yun,
>>
>> I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.
>>
>> How large is each record in your file? How many fields? How many 
>> bytes? (Alternatively, how big is a single input file and how many 
>> records does it contain?)
>>
>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.
>>
>> We can guess the size, however, from the info you provided:
>>
>> batchGroups.size 1
>> spilledBatchGroups.size 0
>> allocated memory 42768000
>> allocator limit 41943040
>>
>> This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>>
>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>>
>> (If you do want to use the "managed" version of the sort, I suggest 
>> you try Drill 1.12 when it is released as that version contains 
>> additional fixes to handle constrained memory.)
>>
>> Also, at present, the JSON record reader loads 4096 records into each 
>> batch. If your file has at least that many records, then we can guess 
>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, 
>> as noted above, by dividing total file size by record count.)
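
The size estimate above can be checked with a few lines of arithmetic. A minimal Python sketch using only the numbers quoted from the error message and the 4096-record batch size; the helper name `batches_that_fit` is made up for illustration:

```python
def batches_that_fit(allocator_limit, batch_bytes):
    """How many whole batches of batch_bytes fit under allocator_limit."""
    return allocator_limit // batch_bytes


allocated = 42_768_000     # "allocated memory" from the error message
limit = 41_943_040         # "allocator limit" from the error message (40 MiB)
records_per_batch = 4096   # JSON reader batch size stated in the thread

fits = batches_that_fit(limit, allocated)
approx_record_bytes = allocated // records_per_batch

print(fits)                 # whole batches that fit; the sort needs at least two
print(approx_record_bytes)  # rough bytes per record if the batch was full
```

The per-record figure only holds if the batch really contained 4096 records; a file with fewer records implies larger records.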
>>
>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.
>>
>> Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:
>>
>> * Take the "max query memory per node" (default of 2 GB regardless of 
>> actual direct memory),
>> * Divide by the number of sort operators in the plan (as shown in the 
>> visualized query profile)
>> * Divide by the "planner width" which is, by default, 70% of the number of cores on your system.
>>
>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
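
The allocation rule described above is plain division, so it is easy to try different settings before changing them in Drill. A minimal Python sketch of the rule as stated in this thread (not taken from Drill's source); the function name `per_sort_memory` is hypothetical:

```python
GIB = 1024 ** 3


def per_sort_memory(max_query_memory, sort_count, width):
    """Memory per sort: query memory divided by (number of sorts * planner width)."""
    return max_query_memory // (sort_count * width)


# Divisor implied by the default 2 GB and the observed 41,943,040-byte limit.
implied_divisor = round(2 * GIB / 41_943_040, 1)

# Settings mentioned elsewhere in the thread: 10 GB, one sort, width 5.
tuned = per_sort_memory(10 * GIB, sort_count=1, width=5)

print(implied_divisor)
print(tuned / GIB)
```

Raising the query memory or lowering the width both increase the result, which is why either change can help with large batches.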
>>
>> We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.
>>
>> First, we can increase the memory per query:
>>
>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296;
>>
>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>>
>> And/or, we can reduce the number of fragments:
>>
>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>
>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.
>>
>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
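
Splitting one large file into several smaller ones can be scripted. A minimal Python sketch, assuming the source file is a single top-level JSON array that fits in memory; the function names and the `part-NNNN.json` naming are made up for illustration:

```python
import json
from pathlib import Path


def chunked(records, size):
    """Split a list of records into consecutive lists of at most `size` items."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def split_json_array(src, out_dir, size):
    """Write each chunk of the top-level array in `src` to its own file in `out_dir`."""
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for n, chunk in enumerate(chunked(records, size)):
        path = out / f"part-{n:04d}.json"
        path.write_text(json.dumps(chunk), encoding="utf-8")
        paths.append(path)
    return paths
```

The resulting directory can then be queried in place of the single file, since Drill can read a directory of files as one table.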
>>
>> That is three separate possible solutions. Try them one by one or (carefully) together.
>>
>> - Paul
>>
>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>
>>>>  Hi Kunal and Andries,
>>>>
>>>>  Thanks for your reply. We need json in this case because Drill 
>>>> only supports up to 65536 columns in a csv file.
>

RE: Drill Capacity

Posted by Kunal Khatua <kk...@mapr.com>.

Hi Yun

The new release might not address this issue, as we don't have a repro for it. Any chance you can provide a sample anonymized data set? The JSON data doesn't have to be meaningful, but we need to be able to reproduce the problem to ensure that we are indeed addressing the issue you faced.

Thanks
~K
-----Original Message-----
From: Yun Liu [mailto:y.liu@castsoftware.com] 
Sent: Tuesday, November 07, 2017 7:17 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Arjun,

That was already altered and schema was not changed. I've reduced the json size and everything works fine. I believe it was giving a false error. Seems that's the only way to bypass this error until your new release comes out?

Thanks,
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com]
Sent: Monday, November 6, 2017 7:39 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Looking at the log you shared, you seem to be running the query below.


2017-11-06 15:09:37,383 [25ff3e7e-39ef-a175-93e7-e4e62b284add:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 25ff3e7e-39ef-a175-93e7-e4e62b284add: SELECT * FROM `dfs`.`Inputs`.`./Compliance.json` LIMIT 100


Below is the exception from the query failure.


2017-11-06 15:09:45,852 [25ff3e7e-39ef-a175-93e7-e4e62b284add:frag:0:0] INFO  o.a.d.e.vector.complex.fn.JsonReader - User Error Occurred: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl. (You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.

It could be related to a schema change. Can you try setting the session parameter below, if you have not tried it already?


alter session set `store.json.all_text_mode`=true;



Thanks,

Arjun
________________________________
From: Yun Liu <y....@castsoftware.com>
Sent: Tuesday, November 7, 2017 1:46 AM
To: user@drill.apache.org
Subject: RE: Drill Capacity

Hi Arjun and Paul,

Yep, those are turned on and I am reading it from sqlline.log. The only max allocation number I see is 10,000,000,000. Posted the logs in my Dropbox:
https://www.dropbox.com/sh/5akxrzm078jsabw/AADuD92swH6c9jwijTjkkac_a?dl=0

Thank you!
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com]
Sent: Monday, November 6, 2017 1:20 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Are you running Drill in embedded mode? If so, the logs will be in sqlline.log, and drillbit.log will not be populated. You can enable DEBUG logging in logback.xml, run the query, and share the log file as Paul suggested.


Edit $DRILL_HOME/conf/logback.xml to enable DEBUG level logging.


 <logger name="org.apache.drill" additivity="false">
    <level value="debug" />
    <appender-ref ref="FILE" />
  </logger>


Thanks,


Arjun

________________________________
From: Paul Rogers <pr...@mapr.com>
Sent: Monday, November 6, 2017 10:56 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile, which is what you found. The other is the physical plan used to run the query. It is the physical plan you want to find; that is the one that has the max allocation.

If you can post your logs somewhere, I'll d/l them and take a look.

- Paul

> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
>
> Hi Paul,
>
> I am using Drill v1.11.0, so I am only seeing sqlline.log and sqlline_queries.log. Hopefully that is the same thing.
>
> I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:
>
> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}
>
> Is this what you're looking for?
>
> Thanks,
> Yun
>
> -----Original Message-----
> From: Paul Rogers [mailto:progers@mapr.com]
> Sent: Friday, November 3, 2017 6:45 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
>
> Thanks for the info. Clearly you are way ahead of me.
>
> In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.
>
> You mentioned each record is 64KB and 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.
>
> If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.
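
The two estimates above can be checked with a little arithmetic. This is only a sketch under the figures quoted in this thread: 4096 records per JSON reader batch, 64 KB records for the first case, and 3K rows at 128 KB per record for the second. `batch_size_bytes` is a made-up helper name, not a Drill function.

```python
KB = 1024
MB = 1024 * KB
RECORDS_PER_BATCH = 4096  # JSON reader batch row count, as stated in this thread

def batch_size_bytes(record_bytes, records_in_file):
    """Approximate size of one batch: rows in the batch times bytes per row."""
    rows = min(RECORDS_PER_BATCH, records_in_file)
    return rows * record_bytes

# 64 KB records in a file with more than 4096 records (10,000 is arbitrary).
print(batch_size_bytes(64 * KB, 10_000) / MB)   # 256.0

# 3K records of 128 KB each: the whole file fits in a single batch.
print(batch_size_bytes(128 * KB, 3_000) / MB)   # 375.0
```

Either way the batch is in the hundreds of MB, which is the "jumbo record" situation described next.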
>
> So, we are dealing with jumbo records and you really want the "batch size control" feature that we are working on, but have not yet shipped.
>
> Let's work out the math. How many sorts in your query? What other operators does the query include? Let's assume a single sort.
>
> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
>
> Since things don't work, I suspect that we're missing something.
> (Note that the memory size we just calculated does not match the 
> numbers shown in an earlier post in which the sort got just ~40 MB of
> memory...)
>
> Try this:
>
> * With your current settings, enable debug-level logging. Run your query.
>
> * Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort:
>
> maxAllocation: (some number)
>
> * Ignore the one that says 10,000,000,000 (the 10 GB query limit) and find the one with a smaller number. What is that number?
>
> * Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?
>
> With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
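
For the maxAllocation step, a rough way to pull the values out of a large log without opening it in an editor. This sketch assumes the plan JSON in the log contains text of the form `"maxAllocation" : <digits>`; that layout is an assumption, so adjust the pattern to whatever your log actually shows. The sample lines are made up.

```python
import re
from collections import Counter

# Assumed layout: optional quotes around the key, then ":" or "=", then digits.
MAX_ALLOCATION = re.compile(r'"?maxAllocation"?\s*[:=]\s*(\d[\d,]*)')

def max_allocations(lines):
    """Count how often each distinct maxAllocation value appears in the given lines."""
    counts = Counter()
    for line in lines:
        for match in MAX_ALLOCATION.finditer(line):
            counts[int(match.group(1).replace(",", ""))] += 1
    return counts

# Made-up sample lines; with a real log, pass the open file instead:
#   with open("sqlline.log", encoding="utf-8", errors="replace") as log:
#       print(max_allocations(log))
sample = [
    '    "maxAllocation" : 10000000000,',
    '    "maxAllocation" : 41943040,',
    '    "cost" : 1.0',
]
for value, seen in sorted(max_allocations(sample).items()):
    print(value, seen)
```

Any value smaller than the overall query limit is the one of interest.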
>
> Thanks,
>
> - Paul
>
>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clear them up before continuing.
>>
>> Current setting: planner.memory.max_query_memory_per_node = 10GB, 
>> HEAP = 12G, Direct memory = 32G, Perm 1024M, and 
>> planner.width.max_per_node = 5
>>
>> Issue # 1:
>> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
>> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
>> With all the recommendations provided by various experts, nothing has worked.
>>
>> Issue # 2:
>> While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. These 2 json files alone process fine but when joined together, Drill throws me that error.
>> Json#1 is 11k KB, has 8 fields with 74091 rows
>> Json#2 is 752kb, has 8 fields with 4245 rows
>>
>> Besides breaking them up to smaller files, not sure what else I could do.
>>
>> Thanks for the help so far!
>>
>> Yun
>>
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Thursday, November 2, 2017 11:06 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>>
>> Hi Yun,
>>
>> I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.
>>
>> How large is each record in your file? How many fields? How many 
>> bytes? (Alternatively, how big is a single input file and how many 
>> records does it contain?)
>>
>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.
>>
>> We can guess the size, however, from the info you provided:
>>
>> batchGroups.size 1
>> spilledBatchGroups.size 0
>> allocated memory 42768000
>> allocator limit 41943040
>>
>> This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>>
>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>>
>> (If you do want to use the "managed" version of the sort, I suggest 
>> you try Drill 1.12 when it is released as that version contains 
>> additional fixes to handle constrained memory.)
>>
>> Also, at present, the JSON record reader loads 4096 records into each
>> batch. If your file has at least that many records, then we can guess 
>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, 
>> as noted above, by dividing total file size by record count.)
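
If the file is too large to open in an editor, the record count can still be obtained by streaming it. The sketch below is one way to do that with Python's standard `json` module. It assumes the file is a sequence of top-level JSON objects separated by whitespace; `iter_json_records` and `average_record_bytes` are illustrative names, not part of Drill.

```python
import json
import os

def iter_json_records(chunks):
    """Yield each top-level JSON object from an iterable of text chunks."""
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                record, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # the record is incomplete; wait for the next chunk
            yield record
            buffer = buffer[end:]
    if buffer.strip():
        raise ValueError("trailing text is not a complete JSON value")

def average_record_bytes(path, chunk_chars=1 << 20):
    """Return (record count, file size in bytes divided by record count)."""
    with open(path, encoding="utf-8") as handle:
        chunks = iter(lambda: handle.read(chunk_chars), "")
        count = sum(1 for _ in iter_json_records(chunks))
    if count == 0:
        return 0, 0.0
    return count, os.path.getsize(path) / count

# Small in-memory demonstration; a record may span two chunks.
demo = ['{"a": 1}\n{"b"', ': [2, 3]} {"c": null}']
print(sum(1 for _ in iter_json_records(demo)))  # 3
```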
>>
>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.
>>
>> Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:
>>
>> * Take the "max query memory per node" (default of 2 GB regardless of 
>> actual direct memory),
>> * Divide by the number of sort operators in the plan (as shown in the 
>> visualized query profile)
>> * Divide by the "planner width" which is, by default, 70% of the number of cores on your system.
>>
>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
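
The allocation rule and the two-batch requirement described above can be written out as arithmetic. This is a sketch of the reasoning in this message, not Drill's actual allocator code; the function names are made up, and the sort count and width in the last example are hypothetical.

```python
GB = 1024 ** 3

def memory_per_sort(max_query_memory, sort_count, width_per_node):
    """Bytes each sort gets: the query budget divided by sorts, then by fragments per node."""
    return max_query_memory // (sort_count * width_per_node)

def sort_can_progress(per_sort_bytes, batch_bytes):
    """The sort needs room for at least two batches in memory at once."""
    return per_sort_bytes >= 2 * batch_bytes

BATCH = 42_768_000   # "allocated memory" from the error message
LIMIT = 41_943_040   # "allocator limit" from the error message

print((2 * GB) / LIMIT)                  # 51.2, the implied divisor at the 2 GB default
print(sort_can_progress(LIMIT, BATCH))   # False

# Hypothetical: 4 GB per query, 2 sorts, planner width reduced to 5.
bigger = memory_per_sort(4 * GB, sort_count=2, width_per_node=5)
print(bigger, sort_can_progress(bigger, BATCH))   # 429496729 True
```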
>>
>> We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.
>>
>> First, we can increase the memory per query:
>>
>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296;
>>
>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>>
>> And/or, we can reduce the number of fragments:
>>
>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>
>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.
>>
>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
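
A minimal sketch of the file-splitting option. It assumes one JSON record per line, which may not hold for your file; if records span lines, split on record boundaries instead. `split_records` and the `part-NNNNN.json` naming are illustrative choices only.

```python
import tempfile
from pathlib import Path

def split_records(source, out_dir, records_per_file):
    """Split a one-record-per-line JSON file into numbered part files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out = None
    written = 0
    try:
        with open(source, encoding="utf-8") as lines:
            for line in lines:
                if not line.strip():
                    continue  # skip blank lines between records
                if out is None or written == records_per_file:
                    if out is not None:
                        out.close()
                    path = out_dir / f"part-{len(parts):05d}.json"
                    out = open(path, "w", encoding="utf-8")
                    parts.append(path)
                    written = 0
                out.write(line if line.endswith("\n") else line + "\n")
                written += 1
    finally:
        if out is not None:
            out.close()
    return parts

# Demonstration on a tiny temporary file with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "big.json"
    source.write_text('{"id": 1}\n{"id": 2}\n{"id": 3}\n', encoding="utf-8")
    parts = split_records(source, Path(tmp) / "parts", records_per_file=2)
    names = [part.name for part in parts]
    counts = [len(part.read_text(encoding="utf-8").splitlines()) for part in parts]

print(names)   # ['part-00000.json', 'part-00001.json']
print(counts)  # [2, 1]
```

The resulting directory can then be queried as a single table in place of the one large file.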
>>
>> That is three separate possible solutions. Try them one by one or (carefully) together.
>>
>> - Paul
>>
>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>
>>>>  Hi Kunal and Andries,
>>>>
>>>>  Thanks for your reply. We need json in this case because Drill 
>>>> only supports up to 65536 columns in a csv file.
>

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Arjun,

That was already altered and schema was not changed. I've reduced the json size and everything works fine. I believe it was giving a false error. Seems that's the only way to bypass this error until your new release comes out?

Thanks,
Yun


Re: Drill Capacity

Posted by Arjun kr <ar...@outlook.com>.

Hi Yun,


Looking at the log you shared, you seem to be running the query below.


2017-11-06 15:09:37,383 [25ff3e7e-39ef-a175-93e7-e4e62b284add:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 25ff3e7e-39ef-a175-93e7-e4e62b284add: SELECT * FROM `dfs`.`Inputs`.`./Compliance.json` LIMIT 100


Below is the exception from the query failure.


2017-11-06 15:09:45,852 [25ff3e7e-39ef-a175-93e7-e4e62b284add:frag:0:0] INFO  o.a.d.e.vector.complex.fn.JsonReader - User Error Occurred: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl. (You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: You tried to write a VarChar type when you are using a ValueWriter of type NullableBitWriterImpl.

It could be related to a schema change. Can you try setting the session parameter below, if you have not tried it already?


alter session set `store.json.all_text_mode`=true;
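
The exception above says a string (VarChar) arrived for a field that had been set up as a boolean (Bit). To find out which field that is, a scan along the following lines may help. It is only a sketch: it records which JSON types each field path takes across the records and reports paths with more than one non-null type. It does not distinguish integers from floating-point numbers, and the sample records are made up.

```python
import json
from collections import defaultdict

def json_type(value):
    """Name the JSON type of a parsed value."""
    if value is None:
        return "null"
    if isinstance(value, bool):   # check bool before int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"

def collect_types(value, path, seen):
    """Record the type seen at this path, then descend into maps and arrays."""
    seen[path].add(json_type(value))
    if isinstance(value, dict):
        for key, child in value.items():
            collect_types(child, f"{path}.{key}" if path else key, seen)
    elif isinstance(value, list):
        for child in value:
            collect_types(child, path + "[]", seen)

def mixed_type_fields(records):
    """Return {field path: sorted types} for paths holding more than one non-null type."""
    seen = defaultdict(set)
    for record in records:
        collect_types(record, "", seen)
    return {
        path: sorted(types - {"null"})
        for path, types in seen.items()
        if len(types - {"null"}) > 1
    }

lines = [
    '{"flag": true, "info": {"level": 1}}',
    '{"flag": "yes", "info": {"level": 2}}',
    '{"flag": null, "info": {"level": 3}}',
]
print(mixed_type_fields(json.loads(line) for line in lines))
# {'flag': ['boolean', 'string']}
```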



Thanks,

Arjun
>>
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Thursday, November 2, 2017 11:06 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>>
>> Hi Yun,
>>
>> I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.
>>
>> How large is each record in your file? How many fields? How many
>> bytes? (Alternatively, how big is a single input file and how many
>> records does it contain?)
>>
>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.
>>
>> We can guess the size, however, from the info you provided:
>>
>> batchGroups.size 1
>> spilledBatchGroups.size 0
>> allocated memory 42768000
>> allocator limit 41943040
>>
>> This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>>
>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>>
>> (If you do want to use the "managed" version of the sort, I suggest
>> you try Drill 1.12 when it is released as that version contains
>> additional fixes to handle constrained memory.)
>>
>> Also, at present, the JSON record reader loads 4096 records into each
>> batch. If your file has at least that many records, then we can guess
>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm,
>> as noted above, by dividing total file size by record count.)
>>
>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.
>>
>> Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:
>>
>> * Take the "max query memory per node" (default of 2 GB regardless of
>> actual direct memory),
>> * Divide by the number of sort operators in the plan (as shown in the
>> visualized query profile)
>> * Divide by the "planner width" which is, by default, 70% of the number of cores on your system.
>>
>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
>>
>> We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.
>>
>> First, we can increase the memory per query:
>>
>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296
>>
>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>>
>> And/or, we can reduce the number of fragments:
>>
>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>
>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.
>>
>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
>>
>> That is three separate possible solutions. Try them one by one or (carefully) together.
>>
>> - Paul
>>
>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>
>>>>  Hi Kunal and Andries,
>>>>
>>>>  Thanks for your reply. We need json in this case because Drill
>>>> only supports up to 65536 columns in a csv file.
>
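
The sort-memory arithmetic in Paul's quoted message above (query memory, divided by the number of sorts, divided by the planner width) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and Drill's exact rounding and its way of deriving the 70% width may differ.

```python
GIB = 1024 ** 3


def default_planner_width(cores, fraction=0.7):
    """Assumed default width: about 70% of the cores, at least 1."""
    return max(1, int(cores * fraction))


def sort_memory_budget(max_query_memory, sort_count, width):
    """Memory each sort gets: query memory / sorts / planner width."""
    return max_query_memory // (sort_count * width)


def can_sort(budget, batch_bytes):
    """The sort needs room for at least two batches to make progress."""
    return budget >= 2 * batch_bytes


# Figures quoted from the error message in the thread.
allocator_limit = 41_943_040
batch_bytes = 42_768_000

# Working backwards: which divisor turns 2 GB into the observed limit?
implied_divisor = (2 * GIB) / allocator_limit
print(f"implied divisor: {implied_divisor:.1f}")

# Paul's guess of 2 sorts on a 32-core machine.
width = default_planner_width(32)
print(f"2 sorts x width {width} = {2 * width}")

# The observed limit cannot hold even one ~43 MB batch plus working space.
print("fits at observed limit:", can_sort(allocator_limit, batch_bytes))

# Raising query memory to 4 GB and capping the width at 5 (2 sorts assumed).
bigger = sort_memory_budget(4 * GIB, sort_count=2, width=5)
print("fits after tuning:", can_sort(bigger, batch_bytes))
```

The same helper can be rerun with the real sort count from the query profile once it is known.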

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Arjun and Paul,

Yep, those are turned on and I am reading it from sqlline.log. The only max allocation number I see is 10,000,000,000. I posted the logs in my Dropbox:
https://www.dropbox.com/sh/5akxrzm078jsabw/AADuD92swH6c9jwijTjkkac_a?dl=0

Thank you!
Yun

-----Original Message-----
From: Arjun kr [mailto:arjun.kr@outlook.com] 
Sent: Monday, November 6, 2017 1:20 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,


Are you running in Drill embedded mode? If so, the logs will be available in sqlline.log and drillbit.log will not be populated. You can enable DEBUG logging in logback.xml, run the query, and share the log file as Paul suggested.


Edit $DRILL_HOME/conf/logback.xml to enable DEBUG level logging.


<logger name="org.apache.drill" additivity="false">
  <level value="debug" />
  <appender-ref ref="FILE" />
</logger>


Thanks,


Arjun

________________________________
From: Paul Rogers <pr...@mapr.com>
Sent: Monday, November 6, 2017 10:56 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile, which is what you found. The other is the physical plan used to run the query. It is the physical plan you want to find; that is the one that has the max allocation.

If you can post your logs somewhere, I'll d/l them and take a look.

- Paul

> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
>
> Hi Paul,
>
> I am using Drill v 1.11.0 so I am only seeing sqlline.log and sqlline_queries.log. Hopefully they contain the same information.
>
> I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:
>
> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}
>
> Is this what you're looking for?
>
> Thanks,
> Yun
>
> -----Original Message-----
> From: Paul Rogers [mailto:progers@mapr.com]
> Sent: Friday, November 3, 2017 6:45 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
>
> Thanks for the info. Clearly you are way ahead of me.
>
> In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.
>
> You mentioned each record is 64KB and 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.
>
> If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.
>
> So, we are dealing with jumbo records and you really want the "batch size control" feature that we are working on, but have not yet shipped.
>
> Let's work out the math. How many sorts in your query? What other operators does the query include? Let's assume a single sort.
>
> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
>
> Since things don't work, I suspect that we're missing something.  
> (Note that the memory size we just calculated does not match the 
> numbers shown in an earlier post in which the sort got just ~40 MB of 
> memory...)
>
> Try this:
>
> * With your current settings, enable debug-level logging. Run your query.
>
> * Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort:
>
> maxAllocation: (some number)
>
> * Ignore the one that says 10,000,000,000; find the one with a smaller number. What is that number?
>
> * Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?
>
> With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
>
> Thanks,
>
> - Paul
>
>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for your detailed explanation. First off, I have 2 issues that I wanted to clear up before continuing.
>>
>> Current setting: planner.memory.max_query_memory_per_node = 10GB, 
>> HEAP = 12G, Direct memory = 32G, Perm 1024M, and 
>> planner.width.max_per_node = 5
>>
>> Issue # 1:
>> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
>> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
>> With all the recommendations provided by various experts, nothing has worked.
>>
>> Issue # 2:
>> While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 json files processes fine on its own, but when they are joined together, Drill throws that error.
>> Json#1 is 11k KB, has 8 fields with 74091 rows
>> Json#2 is 752kb, has 8 fields with 4245 rows
>>
>> Besides breaking them up to smaller files, not sure what else I could do.
>>
>> Thanks for the help so far!
>>
>> Yun
>>
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Thursday, November 2, 2017 11:06 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>>
>> Hi Yun,
>>
>> I'm going to give you multiple ways to understand the issue based on the information you've provided. I generally like to see the full logs to diagnose such problems, but we'll start with what you've provided thus far.
>>
>> How large is each record in your file? How many fields? How many 
>> bytes? (Alternatively, how big is a single input file and how many 
>> records does it contain?)
>>
>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a "jumbo" record. If each individual record is large, then there won't be enough space in the sort to take even a single batch of records, and you'll get the sv2 error that you saw.
>>
>> We can guess the size, however, from the info you provided:
>>
>> batchGroups.size 1
>> spilledBatchGroups.size 0
>> allocated memory 42768000
>> allocator limit 41943040
>>
>> This says you have a batch in memory and are trying to allocate some memory (the "sv2"). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>>
>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the "managed" version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>>
>> (If you do want to use the "managed" version of the sort, I suggest 
>> you try Drill 1.12 when it is released as that version contains 
>> additional fixes to handle constrained memory.)
>>
>> Also, at present, the JSON record reader loads 4096 records into each
>> batch. If your file has at least that many records, then we can guess 
>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, 
>> as noted above, by dividing total file size by record count.)
>>
>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don't let you control the batch size. But, we can provide another solution.
>>
>> Let's explain why the message you provided said that the "allocator limit" was 42 MB. Drill does the following to allocate memory to the sort:
>>
>> * Take the "max query memory per node" (default of 2 GB regardless of 
>> actual direct memory),
>> * Divide by the number of sort operators in the plan (as shown in the 
>> visualized query profile)
>> * Divide by the "planner width" which is, by default, 70% of the number of cores on your system.
>>
>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
>>
>> We can't reduce the number of sorts; that's determined by your query. But, we can play with the other numbers.
>>
>> First, we can increase the memory per query:
>>
>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296
>>
>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>>
>> And/or, we can reduce the number of fragments:
>>
>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>
>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let's say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the "jumbo" batches in your query. Pick a number based on your actual number of cores.
>>
>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
>>
>> That is three separate possible solutions. Try them one by one or (carefully) together.
>>
>> - Paul
>>
>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>
>>>>  Hi Kunal and Andries,
>>>>
>>>>  Thanks for your reply. We need json in this case because Drill 
>>>> only supports up to 65536 columns in a csv file.
>
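
Several of the quoted replies above ask how many records the file holds and how large each one is, and Yun notes the file is too large to open. A streaming count avoids loading the file at once. The sketch below is a rough illustration, not a Drill utility: `count_json_records` is a hypothetical helper, and it assumes the file is a sequence of top-level JSON objects (not one enclosing array).

```python
import io


def count_json_records(stream, chunk_size=1 << 20):
    """Count top-level {...} objects in a text stream, chunk by chunk.

    Returns (record_count, characters_read). Braces inside string
    values are ignored by tracking string and escape state.
    """
    depth = 0
    in_string = False
    escaped = False
    records = 0
    chars = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chars += len(chunk)
        for ch in chunk:
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch in "{[":
                depth += 1
            elif ch in "}]":
                depth -= 1
                if depth == 0 and ch == "}":
                    records += 1
    return records, chars


# Small in-memory demo; for a real file, pass open(path, encoding="utf-8").
sample = io.StringIO('{"a": {"b": "x}y"}}\n{"a": [1, {"c": "\\"{"}]}\n')
records, chars = count_json_records(sample, chunk_size=8)
print(records, "records, about", chars // records, "characters each")
```

Dividing the characters read by the record count gives the average record size that Paul asks about.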

Re: Drill Capacity

Posted by Arjun kr <ar...@outlook.com>.

Hi Yun,


Are you running in Drill embedded mode? If so, the logs will be available in sqlline.log and drillbit.log will not be populated. You can enable DEBUG logging in logback.xml, run the query, and share the log file as Paul suggested.


Edit $DRILL_HOME/conf/logback.xml to enable DEBUG level logging.


<logger name="org.apache.drill" additivity="false">
  <level value="debug" />
  <appender-ref ref="FILE" />
</logger>


Thanks,


Arjun
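
Once DEBUG logging is on, the maxAllocation values Paul asked about can be pulled out of the log with a short script. This is a sketch under assumptions: the regular expression guesses that the plan JSON prints the field as `"maxAllocation" : <number>` (with or without quotes and thousands separators), `max_allocations` is a hypothetical helper, and the sample lines below are made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of the field in the logged plan JSON.
MAX_ALLOC = re.compile(r'"?maxAllocation"?\s*:\s*([0-9][0-9,]*)')


def max_allocations(lines):
    """Return a Counter of every maxAllocation value seen in the lines."""
    counts = Counter()
    for line in lines:
        for match in MAX_ALLOC.finditer(line):
            counts[int(match.group(1).replace(",", ""))] += 1
    return counts


# Made-up sample lines; for a real run, pass open("sqlline.log") instead.
sample_lines = [
    '    "maxAllocation" : 10000000000,',
    '    "maxAllocation" : 41943040',
    "maxAllocation: 10,000,000,000",
]
found = max_allocations(sample_lines)
for value, count in sorted(found.items()):
    print(f"{value:>14,d} bytes  seen {count} time(s)")
```

Any value smaller than 10,000,000,000 is the one worth reporting back to the list.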

________________________________
From: Paul Rogers <pr...@mapr.com>
Sent: Monday, November 6, 2017 10:56 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile, which is what you found. The other is the physical plan used to run the query. It is the physical plan you want to find; that is the one that has the max allocation.

If you can post your logs somewhere, I’ll d/l them and take a look.

- Paul

> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
>
> Hi Paul,
>
> I am using Drill v 1.11.0 so I am only seeing sqlline.log and sqlline_queries.log. Hopefully they contain the same information.
>
> I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:
>
> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}
>
> Is this what you're looking for?
>
> Thanks,
> Yun
>
> -----Original Message-----
> From: Paul Rogers [mailto:progers@mapr.com]
> Sent: Friday, November 3, 2017 6:45 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
>
> Thanks for the info. Clearly you are way ahead of me.
>
> In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.
>
> You mentioned each record is 64KB and 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.
>
> If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.
>
> So, we are dealing with jumbo records and you really want the “batch size control” feature that we are working on, but have not yet shipped.
>
> Let’s work out the math. How many sorts in your query? What other operators does the query include? Let’s assume a single sort.
>
> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
>
> Since things don’t work, I suspect that we’re missing something.  (Note that the memory size we just calculated does not match the numbers shown in an earlier post in which the sort got just ~40 MB of memory…)
>
> Try this:
>
> * With your current settings, enable debug-level logging. Run your query.
>
> * Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort:
>
> maxAllocation: (some number)
>
> * Ignore the one that says 10,000,000,000; find the one with a smaller number. What is that number?
>
> * Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?
>
> With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
>
> Thanks,
>
> - Paul
>
>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for your detailed explanation. First off, I have 2 issues that I wanted to clear up before continuing.
>>
>> Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP
>> = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node
>> = 5
>>
>> Issue # 1:
>> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
>> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
>> With all the recommendations provided by various experts, nothing has worked.
>>
>> Issue # 2:
>> While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 json files processes fine on its own, but when they are joined together, Drill throws that error.
>> Json#1 is 11k KB, has 8 fields with 74091 rows
>> Json#2 is 752kb, has 8 fields with 4245 rows
>>
>> Besides breaking them up to smaller files, not sure what else I could do.
>>
>> Thanks for the help so far!
>>
>> Yun
>>
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Thursday, November 2, 2017 11:06 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>>
>> Hi Yun,
>>
>> I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll start with what you’ve provided thus far.
>>
>> How large is each record in your file? How many fields? How many
>> bytes? (Alternatively, how big is a single input file and how many
>> records does it contain?)
>>
>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a “jumbo” record. If each individual record is large, then there won’t be enough space in the sort to take even a single batch of records, and you’ll get the sv2 error that you saw.
>>
>> We can guess the size, however, from the info you provided:
>>
>> batchGroups.size 1
>> spilledBatchGroups.size 0
>> allocated memory 42768000
>> allocator limit 41943040
>>
>> This says you have a batch in memory and are trying to allocate some memory (the “sv2”). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>>
>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the “managed” version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>>
>> (If you do want to use the “managed” version of the sort, I suggest
>> you try Drill 1.12 when it is released as that version contains
>> additional fixes to handle constrained memory.)
>>
>> Also, at present, the JSON record reader loads 4096 records into each
>> batch. If your file has at least that many records, then we can guess
>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as
>> noted above, by dividing total file size by record count.)
>>
>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don’t let you control the batch size. But, we can provide another solution.
>>
>> Let's explain why the message you provided said that the “allocator limit” was 42 MB. Drill does the following to allocate memory to the sort:
>>
>> * Take the “max query memory per node” (default of 2 GB regardless of
>> actual direct memory),
>> * Divide by the number of sort operators in the plan (as shown in the
>> visualized query profile)
>> * Divide by the “planner width” which is, by default, 70% of the number of cores on your system.
>>
>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
>>
>> We can’t reduce the number of sorts; that’s determined by your query. But, we can play with the other numbers.
>>
>> First, we can increase the memory per query:
>>
>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296
>>
>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>>
>> And/or, we can reduce the number of fragments:
>>
>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>
>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let’s say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the “jumbo” batches in your query. Pick a number based on your actual number of cores.
>>
>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
>>
>> That is three separate possible solutions. Try them one by one or (carefully) together.
>>
>> - Paul
>>
>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>
>>>>  Hi Kunal and Andries,
>>>>
>>>>  Thanks for your reply. We need json in this case because Drill
>>>> only supports up to 65536 columns in a csv file.
>

Re: Drill Capacity

Posted by Paul Rogers <pr...@mapr.com>.

Hi Yun,

Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile, which is what you found. The other is the physical plan used to run the query. It is the physical plan you want to find; that is the one that has the max allocation.

If you can post your logs somewhere, I’ll d/l them and take a look.

- Paul

> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
> 
> Hi Paul,
> 
> I am using Drill v 1.11.0 so I am only seeing sqlline.log and sqlline_queries.log. Hopefully they contain the same information.
> 
> I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:
> 
> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}
> 
> Is this what you're looking for?
> 
> Thanks,
> Yun
> 
> -----Original Message-----
> From: Paul Rogers [mailto:progers@mapr.com] 
> Sent: Friday, November 3, 2017 6:45 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Thanks for the info. Clearly you are way ahead of me.
> 
> In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.
> 
> You mentioned each record is 64KB and 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.
> 
> If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.
> 
> So, we are dealing with jumbo records and you really want the “batch size control” feature that we are working on, but have not yet shipped.
> 
> Let’s work out the math. How many sorts in your query? What other operators does the query include? Let’s assume a single sort.
> 
> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
> 
> Since things don’t work, I suspect that we’re missing something.  (Note that the memory size we just calculated does not match the numbers shown in an earlier post in which the sort got just ~40 MB of memory…)
> 
> Try this:
> 
> * With your current settings, enable debug-level logging. Run your query.
> 
> * Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort: 
> 
> maxAllocation: (some number)
> 
> * Ignore the one that says 10,000,000,000; find the one with a smaller number. What is that number?
> 
> * Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?
> 
> With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
> 
> Thanks,
> 
> - Paul
> 
>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>> 
>> Hi Paul,
>> 
>> Thanks for your detailed explanation. First off, I have 2 issues that I wanted to clear up before continuing.
>> 
>> Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP 
>> = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node 
>> = 5
>> 
>> Issue # 1:
>> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
>> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
>> With all the recommendations provided by various experts, nothing has worked.
>> 
>> Issue # 2:
>> While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Each of these 2 json files processes fine on its own, but when they are joined together, Drill throws that error.
>> Json#1 is 11k KB, has 8 fields with 74091 rows
>> Json#2 is 752kb, has 8 fields with 4245 rows
>> 
>> Besides breaking them up to smaller files, not sure what else I could do.
>> 
>> Thanks for the help so far!
>> 
>> Yun
>> 
>> -----Original Message-----
>> From: Paul Rogers [mailto:progers@mapr.com]
>> Sent: Thursday, November 2, 2017 11:06 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>> 
>> Hi Yun,
>> 
>> I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll start with what you’ve provided thus far.
>> 
>> How large is each record in your file? How many fields? How many 
>> bytes? (Alternatively, how big is a single input file and how many 
>> records does it contain?)
>> 
>> You mention the limit of 64K columns in CSV. This makes me wonder if you have a “jumbo” record. If each individual record is large, then there won’t be enough space in the sort to take even a single batch of records, and you’ll get the sv2 error that you saw.
>> 
>> We can guess the size, however, from the info you provided:
>> 
>> batchGroups.size 1
>> spilledBatchGroups.size 0
>> allocated memory 42768000
>> allocator limit 41943040
>> 
>> This says you have a batch in memory and are trying to allocate some memory (the “sv2”). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
>> 
>> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the “managed” version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
>> 
>> (If you do want to use the “managed” version of the sort, I suggest 
>> you try Drill 1.12 when it is released as that version contains 
>> additional fixes to handle constrained memory.)
>> 
>> Also, at present, the JSON record reader loads 4096 records into each
>> batch. If your file has at least that many records, then we can guess 
>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as 
>> noted above, by dividing total file size by record count.)
>> 
>> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don’t let you control the batch size. But, we can provide another solution.
>> 
>> Let's explain why the message you provided said that the “allocator limit” was 42 MB. Drill does the following to allocate memory to the sort:
>> 
>> * Take the “max query memory per node” (default of 2 GB regardless of 
>> actual direct memory),
>> * Divide by the number of sort operators in the plan (as shown in the 
>> visualized query profile)
>> * Divide by the “planner width” which is, by default, 70% of the number of cores on your system.
>> 
>> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
>> 
>> We can’t reduce the number of sorts; that’s determined by your query. But, we can play with the other numbers.
>> 
>> First, we can increase the memory per query:
>> 
>> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296
>> 
>> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
>> 
>> And/or, we can reduce the number of fragments:
>> 
>> ALTER SESSION SET `planner.width.max_per_node` = <a number>
>> 
>> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let’s say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the “jumbo” batches in your query. Pick a number based on your actual number of cores.
>> 
>> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
>> 
>> That is three separate possible solutions. Try them one by one or (carefully) together.
>> 
>> - Paul
>> 
>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>> 
>>>>  Hi Kunal and Andries,
>>>> 
>>>>  Thanks for your reply. We need json in this case because Drill 
>>>> only supports up to 65536 columns in a csv file.
>

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Paul,

I am using Drill v 1.11.0 so I am only seeing sqlline.log and sqlline_queries.log. Hopefully they contain the same information.

I am following your instructions and I am not seeing any maxAllocation other than 10,000,000,000. No other number (or small number) than this. The query profile reads the following:

{"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}

Is this what you're looking for?

Thanks,
Yun
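
For reference, the one-line entry above can be decoded to see what it does and does not contain. This is only a sketch: it parses the exact line quoted in this message with Python's standard json module and makes no assumption about any other log format.

```python
import json

# The entry exactly as quoted in this message.
entry = (
    '{"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"",'
    '"queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100",'
    '"start":1509981699406,"finish":1509981707544,"outcome":"FAILED",'
    '"username":"","remoteAddress":"localhost"}'
)

record = json.loads(entry)

# start and finish are epoch timestamps in milliseconds.
elapsed_ms = record["finish"] - record["start"]

print("query:     ", record["queryText"])
print("outcome:   ", record["outcome"])
print("elapsed ms:", elapsed_ms)

# This record carries no memory settings, so it cannot be the physical plan.
print("has maxAllocation:", "maxAllocation" in record)
```

The entry only summarizes the query and its outcome, which is consistent with no smaller maxAllocation value turning up in it.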

-----Original Message-----
From: Paul Rogers [mailto:progers@mapr.com] 
Sent: Friday, November 3, 2017 6:45 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Thanks for the info. Clearly you are way ahead of me.

In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.

You mentioned each record is 64KB and 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128MB each, then each batch of data is 400MB, which is pretty large.

If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.

So, we are dealing with jumbo records and you really want the “batch size control” feature that we are working on, but have not yet shipped.

Let’s work out the math. How many sorts in your query? What other operators does the query include? Let’s assume a single sort.

Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.

Since things don’t work, I suspect that we’re missing something.  (Note that the memory size we just calculated does not match the numbers shown in an earlier post in which the sort got just ~40 MB of memory…)

Try this:

* With your current settings, enable debug-level logging. Run your query.

* Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort: 

maxAllocation: (some number)

* Ignore the one that says 10,000,000, find the one with a smaller number. What is that number?

* Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?

With those, we can tell if the settings and sizes we think we are using are, in fact, correct.

Thanks,

- Paul

> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
> 
> Hi Paul,
> 
> Thanks for you detailed explanation. First off- I have 2 issues and I wanted to clear it out before continuing.
> 
> Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP 
> = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node 
> = 5
> 
> Issue # 1:
> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
> With all the recommendations provided by various experts, nothing has worked.
> 
> Issue 2#:
> While processing a query with is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. These 2 json files alone process fine but when joined together, Drill throws me that error.
> Json#1 is 11k KB, has 8 fields with 74091 rows
> Json#2 is 752kb, has 8 fields with 4245 rows
> 
> Besides breaking them up to smaller files, not sure what else I could do.
> 
> Thanks for the help so far!
> 
> Yun
> 
> -----Original Message-----
> From: Paul Rogers [mailto:progers@mapr.com]
> Sent: Thursday, November 2, 2017 11:06 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll start with what you’ve provided thus far.
> 
> How large is each record in your file? How many fields? How many 
> bytes? (Alternatively, how big is a single input file and how many 
> records does it contain?)
> 
> You mention the limit of 64K columns in CSV. This makes me wonder if you have a “jumbo” record. If each individual record is large, then there won’t be enough space in the sort to take even a single batch of records, and you’ll get the sv2 error that you saw.
> 
> We can guess the size, however, from the info you provided:
> 
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 42768000
> allocator limit 41943040
> 
> This says you have a batch in memory and are trying to allocate some memory (the “sv2”). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
> 
> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the “managed” version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
> 
> (If you do want to use the “managed” version of the sort, I suggest 
> you try Drill 1.12 when it is released as that version contains 
> additional fixes to handle constrained memory.)
> 
> Also, at present, The JSON record reader loads 4096 records into each 
> batch. If your file has at least that many records, then we can guess 
> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as 
> noted above, by dividing total file size by record count.)
> 
> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don’t let you control the batch size. But, we can provide another solution.
> 
> Let's explain why the message you provided said that the “allocator limit” was 42 MB. Drill does the following to allocate memory to the sort:
> 
> * Take the “max query memory per node” (default of 2 GB regardless of 
> actual direct memory),
> * Divide by the number of sort operators in the plan (as shown in the 
> visualized query profile)
> * Divide by the “planner width” which is, by default, 70% of the number of cores on your system.
> 
> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
> 
> We can’t reduce the number of sorts; that’s determined by your query. But, we can play with the other numbers.
> 
> First, we can increase the memory per query:
> 
> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 
> 4,294,967,296
> 
> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
> 
> And/or, we can reduce the number of fragments:
> 
> ALTER SESSION SET `planner.width.max_per_node` = <a number>
> 
> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let’s say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the “jumbo” batches in your query. Pick a number based on your actual number of cores.
> 
> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
> 
> That is three separate possible solutions. Try them one by one or (carefully) together.
> 
> - Paul
> 
>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>> 
>>>   Hi Kunal and Andries,
>>> 
>>>   Thanks for your reply. We need json in this case because Drill 
>>> only supports up to 65536 columns in a csv file.

Re: Drill Capacity

Posted by Paul Rogers <pr...@mapr.com>.

Thanks for the info. Clearly you are way ahead of me.

In issue 1, although you have only four (top level) fields, your example shows that you have many nested fields. It is the total field count (across all maps) that drives total width. And, it is the total amount of data that drives memory consumption.

You mentioned that each record is about 64KB and that there are about 3K rows. That suggests a total size of around 200MB. But, you mention the total file size is 400MB. So, either the rows are twice as large, or there are twice as many. If you have 3K rows of 128KB each, then each batch of data is 400MB, which is pretty large.

If your records are 64K in size, and we read 4K per batch, then the total size is 256MB, which is also large.

So, we are dealing with jumbo records and you really want the “batch size control” feature that we are working on, but have not yet shipped.

Let’s work out the math. How many sorts in your query? What other operators does the query include? Let’s assume a single sort.

Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort. Since your batches are ~400 MB, things should work.
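The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row size in the first case is read as 128 KB (which is what makes 3K rows come to roughly 400 MB), the 6,000-row count in the second case is a made-up figure that is simply larger than one batch, and the "two batches must fit" rule is taken from the earlier message in this thread.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

READER_BATCH_ROWS = 4096  # rows the JSON reader puts in one batch, per the thread


def batch_bytes(total_rows, row_bytes, batch_rows=READER_BATCH_ROWS):
    """Estimate the largest batch: at most batch_rows rows of row_bytes each."""
    return min(total_rows, batch_rows) * row_bytes


# 10 GB max query memory / 1 sort / max width of 5
per_sort = 10 * GB // 1 // 5

# Case 1: about 3K rows of 128 KB each, so the whole file lands in one batch.
whole_file_batch = batch_bytes(3_000, 128 * KB)

# Case 2: 64 KB rows and more than 4096 of them (6,000 is hypothetical).
full_batch = batch_bytes(6_000, 64 * KB)

for label, size in [("3K x 128 KB", whole_file_batch), ("4096 x 64 KB", full_batch)]:
    fits = 2 * size <= per_sort  # the sort needs at least two batches in memory
    print(f"{label}: {size // MB} MB per batch, two fit in {per_sort // MB} MB: {fits}")
```

With these numbers both cases fit comfortably, which is why the failure points to a setting or size that is not what we think it is.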

Since things don’t work, I suspect that we’re missing something.  (Note that the memory size we just calculated does not match the numbers shown in an earlier post in which the sort got just ~40 MB of memory…)

Try this:

* With your current settings, enable debug-level logging. Run your query.

* Open the Drillbit log. Look for the JSON version of the query plan (there will be two). One will tell you how much memory is given to the sort: 

maxAllocation: (some number)

* Ignore the one that shows the full query limit (about 10,000,000,000 for your 10 GB setting); find the one with a smaller number. What is that number?

* Then, look in the query profile for your query. Look at the peak memory for your JSON reader scan operator. The peak memory more-or-less reflects the batch size. What is that number?

With those, we can tell if the settings and sizes we think we are using are, in fact, correct.
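Searching the log by hand can be tedious, so here is a hedged Python sketch that pulls every `maxAllocation` value out of a block of log text. The sample text and its two values are hypothetical, and the exact layout of the plan JSON in a real log may differ; the regular expression only assumes the name is followed by `:` or `=` and a number.

```python
import re

# Hypothetical excerpt; a real Drillbit log will contain much more around these lines.
sample_log = '''
    "maxAllocation" : 10000000000,
    "maxAllocation" : 2147483648,
    "maxAllocation" : 10000000000,
'''


def max_allocations(text):
    """Return each distinct maxAllocation value found in text, smallest first."""
    pattern = r'"?maxAllocation"?\s*[:=]\s*(\d+)'
    return sorted({int(value) for value in re.findall(pattern, text)})


found = max_allocations(sample_log)
print("distinct maxAllocation values:", found)
if len(found) > 1:
    print("smallest (the one of interest):", found[0])
```

To use it on a real log, read the file into a string and pass that string to `max_allocations`.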

Thanks,

- Paul

> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
> 
> Hi Paul,
> 
> Thanks for you detailed explanation. First off- I have 2 issues and I wanted to clear it out before continuing.
> 
> Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node = 5
> 
> Issue # 1:
> When loading a json file with 400MB I keep getting a DATA_READ ERROR.
> Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per each record. Not sure how many records this file contains as it's too large to open with any tools, but I am guessing about 3k rows.
> With all the recommendations provided by various experts, nothing has worked.
> 
> Issue 2#:
> While processing a query with is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. These 2 json files alone process fine but when joined together, Drill throws me that error.
> Json#1 is 11k KB, has 8 fields with 74091 rows
> Json#2 is 752kb, has 8 fields with 4245 rows
> 
> Besides breaking them up to smaller files, not sure what else I could do.
> 
> Thanks for the help so far!
> 
> Yun
> 
> -----Original Message-----
> From: Paul Rogers [mailto:progers@mapr.com] 
> Sent: Thursday, November 2, 2017 11:06 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
> 
> Hi Yun,
> 
> I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll start with what you’ve provided thus far.
> 
> How large is each record in your file? How many fields? How many bytes? (Alternatively, how big is a single input file and how many records does it contain?)
> 
> You mention the limit of 64K columns in CSV. This makes me wonder if you have a “jumbo” record. If each individual record is large, then there won’t be enough space in the sort to take even a single batch of records, and you’ll get the sv2 error that you saw.
> 
> We can guess the size, however, from the info you provided:
> 
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 42768000
> allocator limit 41943040
> 
> This says you have a batch in memory and are trying to allocate some memory (the “sv2”). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.
> 
> It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the “managed” version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.
> 
> (If you do want to use the “managed” version of the sort, I suggest you try Drill 1.12 when it is released as that version contains additional fixes to handle constrained memory.)
> 
> Also, at present, The JSON record reader loads 4096 records into each batch. If your file has at least that many records, then we can guess each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as noted above, by dividing total file size by record count.)
> 
> We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don’t let you control the batch size. But, we can provide another solution.
> 
> Let's explain why the message you provided said that the “allocator limit” was 42 MB. Drill does the following to allocate memory to the sort:
> 
> * Take the “max query memory per node” (default of 2 GB regardless of actual direct memory),
> * Divide by the number of sort operators in the plan (as shown in the visualized query profile)
> * Divide by the “planner width” which is, by default, 70% of the number of cores on your system.
> 
> In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
> 
> We can’t reduce the number of sorts; that’s determined by your query. But, we can play with the other numbers.
> 
> First, we can increase the memory per query:
> 
> ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4,294,967,296
> 
> That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.
> 
> And/or, we can reduce the number of fragments:
> 
> ALTER SESSION SET `planner.width.max_per_node` = <a number>
> 
> The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let’s say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the “jumbo” batches in your query. Pick a number based on your actual number of cores.
> 
> As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.
> 
> That is three separate possible solutions. Try them one by one or (carefully) together.
> 
> - Paul
> 
>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>> 
>>>   Hi Kunal and Andries,
>>> 
>>>   Thanks for your reply. We need json in this case because Drill 
>>> only supports up to 65536 columns in a csv file.

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Paul,

Thanks for your detailed explanation. First off, I have 2 issues and I wanted to clarify them before continuing.

Current setting: planner.memory.max_query_memory_per_node = 10GB, HEAP = 12G, Direct memory = 32G, Perm 1024M, and planner.width.max_per_node = 5

Issue # 1:
When loading a 400MB json file I keep getting a DATA_READ ERROR.
Each record in the file is about 64KB. Since it's a json file, there are only 4 fields per record. I am not sure how many records this file contains, as it's too large to open with any tools, but I am guessing about 3k rows.
I have tried all the recommendations provided by various experts, and nothing has worked.

Issue # 2:
While processing a query which is a join of 2 functional .json files, I am getting a RESOURCE ERROR: One or more nodes ran out of memory while executing the query. These 2 json files process fine on their own, but when they are joined together, Drill throws that error.
Json#1 is 11k KB, has 8 fields with 74091 rows
Json#2 is 752KB, has 8 fields with 4245 rows

Besides breaking them up into smaller files, I am not sure what else I could do.

Thanks for the help so far!

Yun

-----Original Message-----
From: Paul Rogers [mailto:progers@mapr.com] 
Sent: Thursday, November 2, 2017 11:06 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll start with what you’ve provided thus far.

How large is each record in your file? How many fields? How many bytes? (Alternatively, how big is a single input file and how many records does it contain?)

You mention the limit of 64K columns in CSV. This makes me wonder if you have a “jumbo” record. If each individual record is large, then there won’t be enough space in the sort to take even a single batch of records, and you’ll get the sv2 error that you saw.

We can guess the size, however, from the info you provided:

batchGroups.size 1
spilledBatchGroups.size 0
allocated memory 42768000
allocator limit 41943040

This says you have a batch in memory and are trying to allocate some memory (the “sv2”). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.

It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the “managed” version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.

(If you do want to use the “managed” version of the sort, I suggest you try Drill 1.12 when it is released as that version contains additional fixes to handle constrained memory.)

Also, at present, The JSON record reader loads 4096 records into each batch. If your file has at least that many records, then we can guess each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as noted above, by dividing total file size by record count.)

We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don’t let you control the batch size. But, we can provide another solution.

Let's explain why the message you provided said that the “allocator limit” was 42 MB. Drill does the following to allocate memory to the sort:

* Take the “max query memory per node” (default of 2 GB regardless of actual direct memory),
* Divide by the number of sort operators in the plan (as shown in the visualized query profile)
* Divide by the “planner width” which is, by default, 70% of the number of cores on your system.

In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.

We can’t reduce the number of sorts; that’s determined by your query. But, we can play with the other numbers.

First, we can increase the memory per query:

ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4,294,967,296

That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.

And/or, we can reduce the number of fragments:

ALTER SESSION SET `planner.width.max_per_node` = <a number>

The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let’s say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the “jumbo” batches in your query. Pick a number based on your actual number of cores.

As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.

That is three separate possible solutions. Try them one by one or (carefully) together.

- Paul

>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>> 
>>    Hi Kunal and Andries,
>> 
>>    Thanks for your reply. We need json in this case because Drill 
>> only supports up to 65536 columns in a csv file.

Re: Drill Capacity

Posted by Paul Rogers <pr...@mapr.com>.

Hi Yun,

I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll start with what you’ve provided thus far.

How large is each record in your file? How many fields? How many bytes? (Alternatively, how big is a single input file and how many records does it contain?)

You mention the limit of 64K columns in CSV. This makes me wonder if you have a “jumbo” record. If each individual record is large, then there won’t be enough space in the sort to take even a single batch of records, and you’ll get the sv2 error that you saw.

We can guess the size, however, from the info you provided:

batchGroups.size 1
spilledBatchGroups.size 0
allocated memory 42768000
allocator limit 41943040

This says you have a batch in memory and are trying to allocate some memory (the “sv2”). The allocated memory number tells us that each batch size is probably ~43 MB. But, the sort only has 42 MB to play with. The sort needs at least two batches in memory to make progress, hence the out-of-memory errors.

It would be nice to confirm this from the logs, but unfortunately, Drill does not normally log the size of each batch. As it turns out, however, the “managed” version that Boaz mentioned added more logging around this problem: it will tell you how large it thinks each batch is, and will warn if you have, say, a 43 MB batch but only 42 MB in which to sort.

(If you do want to use the “managed” version of the sort, I suggest you try Drill 1.12 when it is released as that version contains additional fixes to handle constrained memory.)

Also, at present, the JSON record reader loads 4096 records into each batch. If your file has at least that many records, then we can guess each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as noted above, by dividing total file size by record count.)
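The estimate can be checked with the numbers from the error message. A small Python sketch, assuming (as above) that the batch in memory really held 4096 records:

```python
allocated_bytes = 42_768_000  # "allocated memory" from the error message
limit_bytes = 41_943_040      # "allocator limit" from the error message
rows_per_batch = 4096         # assumed batch size of the JSON reader

bytes_per_row = allocated_bytes / rows_per_batch
print(f"about {bytes_per_row:.0f} bytes per record")

# The sort needs at least two batches in memory to make progress.
print("one batch already over the limit:", allocated_bytes > limit_bytes)
print("room for two batches:", 2 * allocated_bytes <= limit_bytes)
```

Note that the limit is exactly 40 * 1024 * 1024 bytes, so the "42 MB" quoted above is the same figure expressed in decimal megabytes.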

We are doing work to handle such large batches, but the work is not yet available in a release. Unfortunately, in the meanwhile, we also don’t let you control the batch size. But, we can provide another solution.

Let's explain why the message you provided said that the “allocator limit” was 42 MB. Drill does the following to allocate memory to the sort:

* Take the “max query memory per node” (default of 2 GB regardless of actual direct memory),
* Divide by the number of sort operators in the plan (as shown in the visualized query profile)
* Divide by the “planner width” which is, by default, 70% of the number of cores on your system.

In your case, if you are using the default 2 GB total, but getting 41 MB per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores? (2 * 32 * 70% =~ 45.) Or some other combination.
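The three steps can be written out as a short Python sketch. The function name `memory_per_sort` is made up for illustration, and the "2 sorts, 32 cores" combination is only the guess from the paragraph above, not a measured value.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def memory_per_sort(max_query_memory, sort_count, width):
    """Max query memory per node, divided by sorts, divided by planner width."""
    return max_query_memory / (sort_count * width)


# Working backwards from the limit in the error message.
observed_limit = 41_943_040
implied_divisor = 2 * GIB / observed_limit
print(f"implied divisor: {implied_divisor:.1f}")

# The guess from the text: 2 sorts, width of 70% of 32 cores.
guessed_divisor = 2 * (0.70 * 32)
print(f"2 sorts x 70% of 32 cores: {guessed_divisor:.1f}")

# Effect of raising the memory to 4 GB and cutting the width to 5, still with 2 sorts.
tuned = memory_per_sort(4 * GIB, sort_count=2, width=5)
print(f"per sort after tuning: {tuned / MIB:.1f} MiB")
```

The guessed divisor is close to, but not the same as, the implied one, which is why the text leaves room for "some other combination".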

We can’t reduce the number of sorts; that’s determined by your query. But, we can play with the other numbers.

First, we can increase the memory per query:

ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296

That is, 4 GB. This obviously means you must have at least 6 GB of direct memory; more is better.

And/or, we can reduce the number of fragments:

ALTER SESSION SET `planner.width.max_per_node` = <a number>

The value is a bit tricky. Drill normally creates a number of fragments equal to 70% of the number of CPUs on your system. Let’s say you have 32 cores. If so, change the max_per_node to, say, 10 or even 5. This will mean fewer sorts and so more memory per sort, helping compensate for the “jumbo” batches in your query. Pick a number based on your actual number of cores.

As an alternative, as Ted suggested, you could create a larger number of smaller files as this would solve the batch size problem while also getting the parallelization benefits that Kunal mentioned.

Those are three separate possible solutions. Try them one by one or (carefully) together.

- Paul

>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>> 
>>    Hi Kunal and Andries,
>> 
>>    Thanks for your reply. We need json in this case because Drill only
>> supports up to 65536 columns in a csv file.

Re: Drill Capacity

Posted by Ted Dunning <te...@gmail.com>.

What happens if you split your large file into 5 smaller files?
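One way to try this is sketched below in Python. It assumes the file holds JSON objects written one after another (with or without newlines between them) and that the whole file fits in memory while splitting; the function names and the `part_N.json` file names are made up for illustration.

```python
import json


def read_records(text):
    """Parse a string holding JSON objects written one after another."""
    decoder = json.JSONDecoder()
    records, pos = [], 0
    while True:
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return records
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)


def split_evenly(records, parts):
    """Cut records into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(records), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def to_json_lines(records):
    """Serialize records as one JSON object per line."""
    return "".join(json.dumps(r) + "\n" for r in records)


# Tiny stand-in for the real file contents.
sample = '{"id": 1}\n{"id": 2} {"id": 3}{"id": 4}\n{"id": 5}\n{"id": 6}{"id": 7}'
chunks = split_evenly(read_records(sample), 5)
for n, chunk in enumerate(chunks):
    print(f"part_{n}.json would hold {len(chunk)} record(s)")
```

Writing each chunk with `to_json_lines` into its own file in one directory would then let the directory be queried as a single table, with one file per fragment.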



On Thu, Nov 2, 2017 at 12:52 PM, Yun Liu <y....@castsoftware.com> wrote:

> Yes- I increased planner.memory.max_query_memory_per_node to 10GB
> HEAP to 12G
> Direct memory to 16G
> And Perm to 1024M
>
> It didn't have any schema changes. As with the same file format but less
> data- it works perfectly ok. I am unable to tell if there's corruption.
>
> Yun
>
> -----Original Message-----
> From: Andries Engelbrecht [mailto:aengelbrecht@mapr.com]
> Sent: Thursday, November 2, 2017 3:35 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
>
> What memory setting did you increase? Have you tried 6 or 8GB?
>
> How much memory is allocated to Drill Heap and Direct memory for the
> embedded Drillbit?
>
> Also did you check the larger document doesn’t have any schema changes or
> corruption?
>
> --Andries
>
>
>
> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>
>     Hi Kunal and Andries,
>
>     Thanks for your reply. We need json in this case because Drill only
> supports up to 65536 columns in a csv file. I also tried increasing the
> memory size to 4GB but I am still experiencing same issues. Drill is
> installed in Embedded Mode.
>
>     Thanks,
>     Yun
>
>     -----Original Message-----
>     From: Kunal Khatua [mailto:kkhatua@mapr.com]
>     Sent: Thursday, November 2, 2017 2:01 PM
>     To: user@drill.apache.org
>     Subject: RE: Drill Capacity
>
>     Hi Yun
>
>     Andries solution should address your problem. However, do understand
> that, unlike CSV files, a JSON file cannot be processed in parallel,
> because there is no clear record delimiter (CSV data usually has a new-line
> character to indicate the end of a record). So, the larger a file gets, the
> more work a single minor fragment has to do in processing it, including
> maintaining internal data-structures to represent the complex JSON document.
>
>     The preferable way would be to create more JSON files so that the
> files can be processed in parallel.
>
>     Hope that helps.
>
>     ~ Kunal
>
>     -----Original Message-----
>     From: Andries Engelbrecht [mailto:aengelbrecht@mapr.com]
>     Sent: Thursday, November 02, 2017 10:26 AM
>     To: user@drill.apache.org
>     Subject: Re: Drill Capacity
>
>     How much memory is allocated to the Drill environment?
>     Embedded or in a cluster?
>
>     I don’t think there is a particular limit, but a single JSON file will
> be read by a single minor fragment, in general it is better to match the
> number/size of files to the Drill environment.
>
>     In the short term try to bump up planner.memory.max_query_memory_per_node
> in the options and see if that works for you.
>
>     --Andries
>
>
>
>     On 11/2/17, 7:46 AM, "Yun Liu" <y....@castsoftware.com> wrote:
>
>         Hi,
>
>         I've been using Apache Drill actively and just wondering what is
> the capacity of Drill? I have a json file which is 390MB and it keeps
> throwing me an DATA_READ ERROR. I have another json file with exact same
> format but only 150MB and it's processing fine. When I did a *select* on
> the large json, it returns successfully for some of the fields. None of
> these errors really apply to me. So I am trying to understand the capacity
> of the json files Drill supports up to. Or if there's something else I
> missed.
>
>         Thanks,
>
>         Yun Liu
>         Solutions Delivery Consultant
>         321 West 44th St | Suite 501 | New York, NY 10036
>         +1 212.871.8355 office | +1 646.752.4933 mobile
>
>         CAST, Leader in Software Analysis and Measurement
>         Achieve Insight. Deliver Excellence.
>         Join the discussion http://blog.castsoftware.com/
>         LinkedIn<http://www.linkedin.com/companies/162909> | Twitter<
> http://twitter.com/onquality> | Facebook<http://www.facebook.
> com/pages/CAST/105668942817177>
>
>
>
>
>
>

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Boaz,



Seems I've already had those set to false. So it didn't help ☹



Thanks,

Yun



-----Original Message-----
From: Boaz Ben-Zvi [mailto:bben-zvi@mapr.com]
Sent: Thursday, November 2, 2017 6:14 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity



Hi Yun,



     Can you try using the “managed” version of the external sort – either change this option to false:



0: jdbc:drill:zk=local> select * from sys.options where name like '%man%';
+----------------------------+----------+-------------------+--------------+----------+----------+-------------+-----------+------------+
|            name            |   kind   | accessibleScopes  | optionScope  |  status  | num_val  | string_val  | bool_val  | float_val  |
+----------------------------+----------+-------------------+--------------+----------+----------+-------------+-----------+------------+
| exec.sort.disable_managed  | BOOLEAN  | ALL               | BOOT         | DEFAULT  | null     | null        | false     | null       |
+----------------------------+----------+-------------------+--------------+----------+----------+-------------+-----------+------------+



Or override it to ‘false’ in the configuration:



0: jdbc:drill:zk=local> select * from sys.boot where name like '%managed%';
+-----------------------------------------------+----------+-------------------+--------------+---------+----------+-------------+-----------+------------+
|                     name                      |   kind   | accessibleScopes  | optionScope  | status  | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------------------+----------+-------------------+--------------+---------+----------+-------------+-----------+------------+
| drill.exec.options.exec.sort.disable_managed  | BOOLEAN  | BOOT              | BOOT         | BOOT    | null     | null        | false     | null       |
+-----------------------------------------------+----------+-------------------+--------------+---------+----------+-------------+-----------+------------+



i.e., in the drill-override.conf file:



  sort: {
     external: {
         disable_managed: false
      }
  }



  Please let us know if this change helped,



         -- Boaz





On 11/2/17, 1:12 PM, "Yun Liu" <y....@castsoftware.com> wrote:



    Please help me as to what further information I could provide to get this going. I am also experiencing a separate issue:



    RESOURCE ERROR: One or more nodes ran out of memory while executing the query.



    Unable to allocate sv2 for 8501 records, and not enough batchGroups to spill.
    batchGroups.size 1
    spilledBatchGroups.size 0
    allocated memory 42768000
    allocator limit 41943040



    Current setting is:
    planner.memory.max_query_memory_per_node= 10GB
    HEAP to 12G
    Direct memory to 32G
    Perm to 1024M



    What is the issue here?



    Thanks,

    Yun



    -----Original Message-----
    From: Yun Liu [mailto:y.liu@castsoftware.com]
    Sent: Thursday, November 2, 2017 3:52 PM
    To: user@drill.apache.org
    Subject: RE: Drill Capacity



    Yes- I increased planner.memory.max_query_memory_per_node to 10GB HEAP to 12G Direct memory to 16G And Perm to 1024M



    It didn't have any schema changes. As with the same file format but less data- it works perfectly ok. I am unable to tell if there's corruption.



    Yun



    -----Original Message-----

    From: Andries Engelbrecht [mailto:aengelbrecht@mapr.com]

    Sent: Thursday, November 2, 2017 3:35 PM

    To: user@drill.apache.org<ma...@drill.apache.org>

    Subject: Re: Drill Capacity



    What memory setting did you increase? Have you tried 6 or 8GB?



    How much memory is allocated to Drill Heap and Direct memory for the embedded Drillbit?



    Also did you check the larger document doesn’t have any schema changes or corruption?



    --Andries







    On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com>> wrote:



        Hi Kunal and Andries,



        Thanks for your reply. We need json in this case because Drill only supports up to 65536 columns in a csv file. I also tried increasing the memory size to 4GB but I am still experiencing same issues. Drill is installed in Embedded Mode.



        Thanks,

        Yun



        -----Original Message-----

        From: Kunal Khatua [mailto:kkhatua@mapr.com]

        Sent: Thursday, November 2, 2017 2:01 PM

        To: user@drill.apache.org<ma...@drill.apache.org>

        Subject: RE: Drill Capacity



        Hi Yun



        Andries solution should address your problem. However, do understand that, unlike CSV files, a JSON file cannot be processed in parallel, because there is no clear record delimiter (CSV data usually has a new-line character to indicate the end of a record). So, the larger a file gets, the more work a single minor fragment has to do in processing it, including maintaining internal data-structures to represent the complex JSON document.



        The preferable way would be to create more JSON files so that the files can be processed in parallel.



        Hope that helps.



        ~ Kunal



        -----Original Message-----

        From: Andries Engelbrecht [mailto:aengelbrecht@mapr.com]

        Sent: Thursday, November 02, 2017 10:26 AM

        To: user@drill.apache.org<ma...@drill.apache.org>

        Subject: Re: Drill Capacity



        How much memory is allocated to the Drill environment?

        Embedded or in a cluster?



        I don’t think there is a particular limit, but a single JSON file will be read by a single minor fragment, in general it is better to match the number/size of files to the Drill environment.



        In the short term try to bump up planner.memory.max_query_memory_per_node in the options and see if that works for you.



        --Andries







        On 11/2/17, 7:46 AM, "Yun Liu" <y....@castsoftware.com>> wrote:



            Hi,



            I've been using Apache Drill actively and just wondering what is the capacity of Drill? I have a json file which is 390MB and it keeps throwing me an DATA_READ ERROR. I have another json file with exact same format but only 150MB and it's processing fine. When I did a *select* on the large json, it returns successfully for some of the fields. None of these errors really apply to me. So I am trying to understand the capacity of the json files Drill supports up to. Or if there's something else I missed.



            Thanks,



            Yun Liu

            Solutions Delivery Consultant

            321 West 44th St | Suite 501 | New York, NY 10036

            +1 212.871.8355 office | +1 646.752.4933 mobile



            CAST, Leader in Software Analysis and Measurement

            Achieve Insight. Deliver Excellence.

            Join the discussion http://blog.castsoftware.com/

            LinkedIn<http://www.linkedin.com/companies/162909> | Twitter<http://twitter.com/onquality> | Facebook<http://www.facebook.com/pages/CAST/105668942817177>

RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Boaz,

It looks like I already had those set to "false", so it didn't change much.

Thanks,
Yun


Re: Drill Capacity

Posted by Boaz Ben-Zvi <bb...@mapr.com>.

 Hi Yun,

     Can you try using the “managed” version of the external sort – either change this option to false:

0: jdbc:drill:zk=local> select * from sys.options where name like '%man%';
+----------------------------+----------+-------------------+--------------+----------+----------+-------------+-----------+------------+
|            name            |   kind   | accessibleScopes  | optionScope  |  status  | num_val  | string_val  | bool_val  | float_val  |
+----------------------------+----------+-------------------+--------------+----------+----------+-------------+-----------+------------+
| exec.sort.disable_managed  | BOOLEAN  | ALL               | BOOT         | DEFAULT  | null     | null        | false     | null       |
+----------------------------+----------+-------------------+--------------+----------+----------+-------------+-----------+------------+

Or override it to ‘false’ in the configuration:

0: jdbc:drill:zk=local> select * from sys.boot where name like '%managed%';
+-----------------------------------------------+----------+-------------------+--------------+---------+----------+-------------+-----------+------------+
|                     name                      |   kind   | accessibleScopes  | optionScope  | status  | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------------------+----------+-------------------+--------------+---------+----------+-------------+-----------+------------+
| drill.exec.options.exec.sort.disable_managed  | BOOLEAN  | BOOT              | BOOT         | BOOT    | null     | null        | false     | null       |
+-----------------------------------------------+----------+-------------------+--------------+---------+----------+-------------+-----------+------------+

i.e., in the drill-override.conf file:

  sort: {
     external: {
         disable_managed: false
      }
  }

  Please let us know if this change helped,

         -- Boaz 



RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Please let me know what further information I could provide to get this going. I am also experiencing a separate issue:

RESOURCE ERROR: One or more nodes ran out of memory while executing the query.

Unable to allocate sv2 for 8501 records, and not enough batchGroups to spill.
batchGroups.size 1
spilledBatchGroups.size 0
allocated memory 42768000
allocator limit 41943040
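
For scale, the two figures in the error can be converted to MiB and compared with the 10GB configured for planner.memory.max_query_memory_per_node. A minimal Python sketch; the helper name `to_mib` is made up for illustration, and the byte counts are copied from the error above:

```python
# Byte counts copied from the RESOURCE ERROR above.
ALLOCATED_BYTES = 42768000
ALLOCATOR_LIMIT_BYTES = 41943040

MIB = 1024 * 1024


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (hypothetical helper)."""
    return n_bytes / MIB


# How far the attempted allocation went past the limit.
overshoot = ALLOCATED_BYTES - ALLOCATOR_LIMIT_BYTES

print(f"allocator limit : {to_mib(ALLOCATOR_LIMIT_BYTES):.2f} MiB")
print(f"allocated memory: {to_mib(ALLOCATED_BYTES):.2f} MiB")
print(f"over the limit  : {overshoot} bytes")
```

The limit works out to exactly 40 MiB, so the operator that failed was working under a budget far smaller than the 10GB configured per node. Why it was given only that much is not something the error message itself explains.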

Current settings are:
planner.memory.max_query_memory_per_node = 10GB
Heap: 12G
Direct memory: 32G
Perm: 1024M

What is the issue here?

Thanks,
Yun


RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Yes, I increased planner.memory.max_query_memory_per_node to 10GB, heap to 12G, direct memory to 16G, and Perm to 1024M.

It didn't have any schema changes: with the same file format but less data, it works perfectly fine. I am unable to tell if there's corruption.

Yun
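
One way to check for schema changes outside Drill is to scan the records and report any field that appears with more than one JSON type. A minimal Python sketch, assuming the file holds one top-level JSON object per record, concatenated or newline-separated. `iter_records` and `find_type_conflicts` are names made up here, and passing the whole file in as a single string is a simplification:

```python
import json
from collections import defaultdict


def iter_records(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between records.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode raises json.JSONDecodeError on malformed input,
        # which would point at a corrupt region of the file.
        value, pos = decoder.raw_decode(text, pos)
        yield value


def find_type_conflicts(records):
    """Map each top-level field seen with more than one type to its type names."""
    seen = defaultdict(set)
    for record in records:
        # Assumes every record is a JSON object (a Python dict).
        for field, value in record.items():
            seen[field].add(type(value).__name__)
    return {f: sorted(t) for f, t in seen.items() if len(t) > 1}


if __name__ == "__main__":
    sample = '{"id": 1, "size": 10}\n{"id": 2, "size": "10"}\n{"id": 3, "size": 7}'
    print(find_type_conflicts(iter_records(sample)))
```

In this sketch a JSON null counts as its own type, so a field that is sometimes null and sometimes a number is reported as a conflict. Only top-level fields are compared; nested objects would need a recursive walk.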


Re: Drill Capacity

Posted by Andries Engelbrecht <ae...@mapr.com>.

What memory setting did you increase? Have you tried 6 or 8GB?

How much memory is allocated to Drill Heap and Direct memory for the embedded Drillbit?

Also, did you check that the larger document doesn’t have any schema changes or corruption?

--Andries


RE: Drill Capacity

Posted by Yun Liu <y....@castsoftware.com>.

Hi Kunal and Andries,

Thanks for your replies. We need JSON in this case because Drill only supports up to 65536 columns in a CSV file. I also tried increasing the memory size to 4GB, but I am still experiencing the same issues. Drill is installed in embedded mode.

Thanks,
Yun


RE: Drill Capacity

Posted by Kunal Khatua <kk...@mapr.com>.

Hi Yun

Andries' solution should address your problem. However, do understand that, unlike CSV files, a JSON file cannot be processed in parallel, because there is no clear record delimiter (CSV data usually has a new-line character to indicate the end of a record). So, the larger a file gets, the more work a single minor fragment has to do in processing it, including maintaining internal data structures to represent the complex JSON document.

The preferable approach would be to split the data into more JSON files so that the files can be processed in parallel.
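
[Editor's sketch, not part of the original message] Splitting one large file into several smaller ones, as suggested above, could look roughly like the following. It assumes the source file is a single top-level JSON array of records that fits in memory; the function names, the output file naming, and the one-object-per-line output layout are illustrative choices, not anything prescribed in this thread.

```python
import json
from pathlib import Path


def split_records(records, n_parts):
    """Distribute records round-robin into n_parts lists."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    parts = [[] for _ in range(n_parts)]
    for i, record in enumerate(records):
        parts[i % n_parts].append(record)
    return parts


def write_parts(parts, out_dir, stem="part"):
    """Write each non-empty part to its own file, one JSON object per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, part in enumerate(parts):
        if not part:
            continue  # skip empty parts when there are fewer records than parts
        path = out_dir / f"{stem}_{i:03d}.json"
        with path.open("w", encoding="utf-8") as f:
            for record in part:
                f.write(json.dumps(record) + "\n")
        paths.append(path)
    return paths


def split_json_file(src, out_dir, n_parts):
    """Assumption: src holds one top-level JSON array and fits in memory."""
    with open(src, encoding="utf-8") as f:
        records = json.load(f)
    return write_parts(split_records(records, n_parts), out_dir)
```

For example, split_json_file("big.json", "big_split", 8) would write up to eight files into the big_split directory, which can then be queried as a single table by pointing the query at the directory. The number of parts is a guess to tune against the cores and memory available to Drill.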

Hope that helps.

~ Kunal

-----Original Message-----
From: Andries Engelbrecht [mailto:aengelbrecht@mapr.com] 
Sent: Thursday, November 02, 2017 10:26 AM
To: user@drill.apache.org
Subject: Re: Drill Capacity

How much memory is allocated to the Drill environment?
Embedded or in a cluster?

I don’t think there is a particular limit, but a single JSON file will be read by a single minor fragment. In general, it is better to match the number/size of files to the Drill environment.

In the short term, try increasing planner.memory.max_query_memory_per_node in the options and see if that works for you.

--Andries



On 11/2/17, 7:46 AM, "Yun Liu" <y....@castsoftware.com> wrote:

    Hi,
    
    I've been using Apache Drill actively and am wondering what the capacity of Drill is. I have a JSON file which is 390MB, and it keeps throwing a DATA_READ ERROR. I have another JSON file with the exact same format but only 150MB, and it processes fine. When I did a *select* on the large JSON, it returned successfully for some of the fields. None of these errors really apply to me. So I am trying to understand the maximum size of JSON file that Drill supports, or whether there's something else I missed.
    
    Thanks,
    
    Yun Liu
    Solutions Delivery Consultant
    321 West 44th St | Suite 501 | New York, NY 10036
    +1 212.871.8355 office | +1 646.752.4933 mobile
    
    CAST, Leader in Software Analysis and Measurement
    Achieve Insight. Deliver Excellence.
    Join the discussion http://blog.castsoftware.com/
    LinkedIn<http://www.linkedin.com/companies/162909> | Twitter<http://twitter.com/onquality> | Facebook<http://www.facebook.com/pages/CAST/105668942817177>

Re: Drill Capacity

Posted by Andries Engelbrecht <ae...@mapr.com>.

How much memory is allocated to the Drill environment?
Embedded or in a cluster?

I don’t think there is a particular limit, but a single JSON file will be read by a single minor fragment. In general, it is better to match the number/size of files to the Drill environment.

In the short term, try increasing planner.memory.max_query_memory_per_node in the options and see if that works for you.
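
[Editor's sketch, not part of the original message] One way to raise the option mentioned above is to send an ALTER SYSTEM statement to Drill's REST endpoint. The snippet below only builds the statement and the HTTP request. The base URL (http://localhost:8047), the absence of authentication, the helper names, and the 4 GB value in the usage note are all assumptions for illustration; pick a value that fits the memory actually available to the Drillbit.

```python
import json
import urllib.request

OPTION = "planner.memory.max_query_memory_per_node"


def build_option_query(max_bytes):
    """Return the SQL that sets the per-node query memory limit, in bytes."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return f"ALTER SYSTEM SET `{OPTION}` = {int(max_bytes)}"


def build_request(sql, base_url="http://localhost:8047"):
    """Build a POST to /query.json; base_url is an assumed local Drillbit."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To apply it, pass the request to urllib.request.urlopen, for example urllib.request.urlopen(build_request(build_option_query(4 * 1024**3))). ALTER SYSTEM is used here rather than ALTER SESSION because separate REST calls may not share a session. The same statement can also be typed directly into sqlline or the web console.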

--Andries



On 11/2/17, 7:46 AM, "Yun Liu" <y....@castsoftware.com> wrote:

    Hi,
    
    I've been using Apache Drill actively and am wondering what the capacity of Drill is. I have a JSON file which is 390MB, and it keeps throwing a DATA_READ ERROR. I have another JSON file with the exact same format but only 150MB, and it processes fine. When I did a *select* on the large JSON, it returned successfully for some of the fields. None of these errors really apply to me. So I am trying to understand the maximum size of JSON file that Drill supports, or whether there's something else I missed.
    
    Thanks,
    
    Yun Liu
    Solutions Delivery Consultant
    321 West 44th St | Suite 501 | New York, NY 10036
    +1 212.871.8355 office | +1 646.752.4933 mobile
    
    CAST, Leader in Software Analysis and Measurement
    Achieve Insight. Deliver Excellence.
    Join the discussion http://blog.castsoftware.com/
    LinkedIn<http://www.linkedin.com/companies/162909> | Twitter<http://twitter.com/onquality> | Facebook<http://www.facebook.com/pages/CAST/105668942817177>