Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/07/09 19:24:22 UTC

[GitHub] [incubator-superset] villebro commented on a change in pull request #7773: Improve examples & related tests

villebro commented on a change in pull request #7773: Improve examples & related tests
URL: https://github.com/apache/incubator-superset/pull/7773#discussion_r301754578
 
 

 ##########
 File path: superset/data/bart_lines.py
 ##########
 @@ -21,37 +21,41 @@
 from sqlalchemy import String, Text
 
 from superset import db
-from superset.utils.core import get_or_create_main_db
-from .helpers import TBL, get_example_data
+from superset.utils.core import get_example_database
+from .helpers import get_example_data, TBL
 
 
-def load_bart_lines():
+def load_bart_lines(only_metadata=False):
     tbl_name = "bart_lines"
-    content = get_example_data("bart-lines.json.gz")
-    df = pd.read_json(content, encoding="latin-1")
-    df["path_json"] = df.path.map(json.dumps)
-    df["polyline"] = df.path.map(polyline.encode)
-    del df["path"]
+    database = get_example_database()
+
+    if not only_metadata:
+        content = get_example_data("bart-lines.json.gz")
+        df = pd.read_json(content, encoding="latin-1")
+        df["path_json"] = df.path.map(json.dumps)
+        df["polyline"] = df.path.map(polyline.encode)
+        del df["path"]
+
+        df.to_sql(
+            tbl_name,
+            database.get_sqla_engine(),
+            if_exists="replace",
+            chunksize=500,
+            dtype={
+                "color": String(255),
+                "name": String(255),
+                "polyline": Text,
+                "path_json": Text,
+            },
+            index=False,
+        )
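
The new `only_metadata` flag lets the loader skip downloading and inserting the example data, presumably leaving the rest of the function (not shown in this hunk) to register the table metadata as before. A usage sketch, where the call sites below are illustrative and not part of this diff:

```python
from superset.data.bart_lines import load_bart_lines

# Full load: fetch bart-lines.json.gz, build the DataFrame and write it to
# the example database returned by get_example_database().
load_bart_lines()

# Metadata-only load: skip the download and insert entirely, which keeps
# test setups fast when the actual rows are not needed.
load_bart_lines(only_metadata=True)
```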
 
 Review comment:
   Oh, one more thing: when I added CSV import functionality for BigQuery, I refactored `db_engine_specs` so that one can call `db_engine_spec.df_to_sql(df, **kwargs)` in place of `df.to_sql(**kwargs)` for engines that don't support `df.to_sql()`. So to make this work universally here, one would write
   ```python
           database.db_engine_spec.df_to_sql(
               df,
               tbl_name,
               database.get_sqla_engine(),
               if_exists="replace",
               chunksize=500,
               dtype={
                   "color": String(255),
                   "name": String(255),
                   "polyline": Text,
                   "path_json": Text,
               },
               index=False,
           )
   ```
   This doesn't necessarily have to be addressed in this PR; I can do that later, too, as I have a good test rig for that.
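
For context, here is a minimal sketch of the kind of `df_to_sql` hook described above. The class names, the exact signature, and the BigQuery override are assumptions for illustration only, not the actual `db_engine_specs` code:

```python
import pandas as pd


class BaseEngineSpec:
    """Hypothetical base spec; the real db_engine_specs module may differ."""

    @classmethod
    def df_to_sql(cls, df: pd.DataFrame, *args, **kwargs) -> None:
        # Default behaviour: delegate to pandas; positional arguments such as
        # the table name and the SQLAlchemy engine are passed straight through.
        df.to_sql(*args, **kwargs)


class HypotheticalBigQueryEngineSpec(BaseEngineSpec):
    @classmethod
    def df_to_sql(cls, df: pd.DataFrame, *args, **kwargs) -> None:
        # Engines that cannot use df.to_sql() override this and load the
        # DataFrame through their own client library instead (details omitted).
        raise NotImplementedError("Use the engine's native upload API")
```

With a hook like this, the loader call stays engine-agnostic: SQLAlchemy-backed databases fall through to pandas, while engines such as BigQuery can substitute their own upload path.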
