Posted to issues@drill.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2016/06/06 19:33:20 UTC

[jira] [Commented] (DRILL-4709) Document the included Foodmart sample data

    [ https://issues.apache.org/jira/browse/DRILL-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317059#comment-15317059 ] 

Julian Hyde commented on DRILL-4709:
------------------------------------

Microsoft have never objected to this use of the data, nor have they shown much interest in curating it. The de facto home of the data is my foodmart-data-hsqldb project:

https://github.com/julianhyde/foodmart-data-hsqldb

You will see that on that page I make some effort to describe the schema, etc. Maybe you could help improve that site, and include a reference to it in the Drill documentation.

Under ASL you could of course copy that site into Drill's documentation but please, for heaven's sake, don't fork.

> Document the included Foodmart sample data
> ------------------------------------------
>
>                 Key: DRILL-4709
>                 URL: https://issues.apache.org/jira/browse/DRILL-4709
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.6.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> Drill includes a JSON version of the Mondrian FoodMart sample data. This data ships in the $DRILL_HOME/jars/3rdparty/foodmart-data-json-0.4.jar file and is accessible through the classpath (cp) storage plugin.
> The documentation mentions using the cp plugin to access customers.json. However, the FoodMart data set is quite rich, with many example files.
> As it is, unless someone is a curious developer, and good with Google, they won't be able to find the other data sets or the source of the FoodMart data.
> The data appears to be a JSON version of the SQL sample data for the Mondrian project. A schema description is here: https://github.com/pentaho/mondrian/blob/master/demo/FoodMart.xml
> The Mondrian data appears to have originated at Microsoft to highlight their circa 2000 OLAP projects, but has since been discontinued. See
> * http://sqlmag.com/development/dts-2000-action
> * https://technet.microsoft.com/en-us/library/aa217032(v=sql.80).aspx
> * http://sqlmag.com/sql-server/desperately-seeking-samples
> Or do a Google search for "microsoft foodmart database".
> The request is to:
> 1. Credit MS and Mondrian for the data.
> 2. Either explain the data (which is quite a bit of work), or
> 3. Explain how to extract the files from the jar file to explore manually.
> 4. Provide a pointer to a description of the schema (if one can be found).
> For option 3:
> cd $DRILL_HOME/jars/3rdparty
> unzip foodmart-data-json-0.4.jar -d ~/foodmart
> cd ~/foodmart
> ls
> Looking at the data, it is clear that SOME description is needed to understand the many tables and how they might work with Drill.
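
As an alternative to unzipping the jar (option 3 above), the names of the data files can be listed in place. The following is only a minimal Python sketch, assuming the jar is an ordinary zip archive at the path given in the issue and that the data files end in .json. The helper name list_json_entries is hypothetical and is not part of Drill.

```python
import os
import zipfile
from pathlib import Path


def list_json_entries(jar_path):
    """Return the sorted names of all .json entries in a jar (zip) archive."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(name for name in jar.namelist() if name.endswith(".json"))


if __name__ == "__main__":
    # Assumption: DRILL_HOME is set as in the issue text; fall back to the
    # current directory if it is not.
    drill_home = os.environ.get("DRILL_HOME", ".")
    jar = Path(drill_home) / "jars" / "3rdparty" / "foodmart-data-json-0.4.jar"
    if jar.exists():
        for name in list_json_entries(jar):
            print(name)
    else:
        print(f"jar not found: {jar}")
```

Each name printed this way should be usable as a file name with the cp plugin, in the same way the documentation uses customers.json.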



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)