You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by "miomiocat (via GitHub)" <gi...@apache.org> on 2023/04/27 03:04:14 UTC

[GitHub] [superset] miomiocat opened a new pull request, #23209: feat: Add StarRocks support

miomiocat opened a new pull request, #23209:
URL: https://github.com/apache/superset/pull/23209

   <!---
   Please write the PR title following the conventions at https://www.conventionalcommits.org/en/v1.0.0/
   Example:
   fix(dashboard): load charts correctly
   -->
   
   ### SUMMARY
   
   Add StarRocks support
   
   Starrocks database is based on mysql protocal but introduce the concept of catalog and other differences, so there will be incompatibility when directly using mysqldb engine spec and it is necessary to introduce a separate Starrocks engine spec to fix various issues and use starrocks python client directly
   
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   <!--- Skip this if not applicable -->
   
   ### TESTING INSTRUCTIONS
   <!--- Required! What steps can be taken to manually verify the changes? -->
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [x] Changes UI
   - [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [x] Introduces new feature or API
   - [ ] Removes existing feature or API
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1198240502


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,235 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import logging
+import re
+from typing import Any, Dict, List, Optional, Pattern, Tuple, Type
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import Integer, Numeric, types
+from sqlalchemy.engine import Inspector
+from sqlalchemy.engine.result import Row as ResultRow
+from sqlalchemy.engine.url import URL
+from sqlalchemy.sql.type_api import TypeEngine
+
+from superset.db_engine_specs.mysql import MySQLEngineSpec
+from superset.errors import SupersetErrorType
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+logger = logging.getLogger(__name__)
+
+
+class TINYINT(Integer):
+    __visit_name__ = "TINYINT"
+
+
+class DOUBLE(Numeric):
+    __visit_name__ = "DOUBLE"
+
+
+class ARRAY(TypeEngine):
+    __visit_name__ = "ARRAY"
+
+
+class MAP(TypeEngine):
+    __visit_name__ = "MAP"
+
+
+class STRUCT(TypeEngine):
+    __visit_name__ = "STRUCT"
+
+
+class StarRocksEngineSpec(MySQLEngineSpec):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/catalog.db[?key=value&key=value...]"
+    )
+
+    column_type_mappings = (
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^smallint", re.IGNORECASE),
+            types.SMALLINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            types.INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bigint.*", re.IGNORECASE),
+            types.BIGINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal.*", re.IGNORECASE),
+            types.DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            types.FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^varchar(\((\d+)\))*$", re.IGNORECASE),
+            types.VARCHAR(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^char(\((\d+)\))*$", re.IGNORECASE),
+            types.CHAR(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^string", re.IGNORECASE),
+            types.String(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^datetime", re.IGNORECASE),
+            types.TIMESTAMP(),
+            GenericDataType.TEMPORAL,
+        ),
+        (
+            re.compile(r"^date", re.IGNORECASE),
+            types.Date(),
+            GenericDataType.TEMPORAL,
+        ),
+        (
+            re.compile(r"^binary.*", re.IGNORECASE),
+            types.String(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^array.*", re.IGNORECASE),
+            ARRAY(),
+            GenericDataType.STRING
+        ),
+        (
+            re.compile(r"^map.*", re.IGNORECASE),
+            MAP(),
+            GenericDataType.STRING
+        ),
+        (
+            re.compile(r"^struct.*", re.IGNORECASE),
+            STRUCT(),
+            GenericDataType.STRING
+        ),

Review Comment:
   ```suggestion
           (re.compile(r"^array.*", re.IGNORECASE), ARRAY(), GenericDataType.STRING),
           (re.compile(r"^map.*", re.IGNORECASE), MAP(), GenericDataType.STRING),
           (re.compile(r"^struct.*", re.IGNORECASE), STRUCT(), GenericDataType.STRING),
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1553612491

   Now getting an error from mypy:
   
   ```
   superset/db_engine_specs/starrocks.py:71: error: Incompatible types in assignment (expression has type Tuple[Tuple[Pattern[str], TINYINT, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], ... <14 more items>], base class "MySQLEngineSpec" defined the type as Tuple[Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType]])  [assignment]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "villebro (via GitHub)" <gi...@apache.org>.
villebro commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1198769805


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,239 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: no-init
+import logging
+import re
+from typing import Any, Dict, List, Optional, Pattern, Tuple, Type
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import Integer, Numeric, types
+from sqlalchemy.engine import Inspector
+from sqlalchemy.engine.result import Row as ResultRow
+from sqlalchemy.engine.url import URL
+from sqlalchemy.sql.type_api import TypeEngine
+
+from superset.db_engine_specs.mysql import MySQLEngineSpec
+from superset.errors import SupersetErrorType
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+logger = logging.getLogger(__name__)
+
+
+class TINYINT(Integer):
+    __visit_name__ = "TINYINT"
+
+
+class DOUBLE(Numeric):
+    __visit_name__ = "DOUBLE"
+
+
+class ARRAY(TypeEngine):
+    __visit_name__ = "ARRAY"
+
+    @property
+    def python_type(self) -> Optional[Type[List[Any]]]:
+        return list
+
+
+class MAP(TypeEngine):
+    __visit_name__ = "MAP"
+
+    @property
+    def python_type(self) -> Optional[Type[Dict[Any, Any]]]:
+        return dict
+
+
+class STRUCT(TypeEngine):
+    __visit_name__ = "STRUCT"
+
+    @property
+    def python_type(self) -> Optional[Type[Any]]:
+        return None
+
+
+class StarRocksEngineSpec(MySQLEngineSpec):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/catalog.db[?key=value&key=value...]"
+    )
+
+    _default_column_type_mappings = (

Review Comment:
   this should be 
   ```suggestion
       column_type_mappings = (
   ```
   
   Also note, that you don't need to reimplement all types - Any types that are already covered by `_default_column_type_mappings` can be omitted. In addition, I would suggest adding unit tests to ensure that the main types are identified correctly (see `tests/unit_tests/db_engine_specs` for other examples of how this is done).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1143043758


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:

Review Comment:
   Thanks for your advice
   
   I refactored these codes, inherit from `MySQLEngineSpec`. Only the necessary code is kept now.
   
   PTAL
   
   @john-bodley 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1449257641

   > Thanks @miomiocat for the contribution. We (Airbnb) are using StarRocks within Superset, but given StarRocks is [compatible with the MySQL protocol](https://docs.starrocks.io/en-us/latest/introduction/StarRocks_intro) we've had good mileage by simply extending the `MySQLEngineSpec`, i.e.,
   > 
   > ```python
   > class StarRocksEngineSpec(MySQLEngineSpec):
   >     """
   >     The StarRocks engine specification.
   >     """
   > 
   >     engine = "starrocks"
   >     engine_name = "StarRocks"
   > ```
   > 
   > I was wondering whether you considered this option as this adheres to the DRY principle.
   
   Thanks for your reply.
   
   Actually, StarRocks is mostly compatible with mysql but not all.
   If you are using internal table in StarRocks only, this implementation can work.
   But for [external catalog](https://docs.starrocks.io/zh-cn/latest/data_source/catalog/hive_catalog), it will throw `Database not found` exception in Superset.
   
   for example, you can reproduce this issue with following steps:
   1. create a [external catalog](https://docs.starrocks.io/zh-cn/latest/data_source/catalog/hive_catalog)
   2. connect this catalog in superset using uri : `starrocks://user:pass@host:port/<catalog name>.<database name> `
   3. select a schema from superset UI and execute any SQL in SQL editor like `show tables`
   
   The reason for this exception is that mysql does not handle the catalog in StarRocks so I handle this issue from `adjust_database_uri`.
   
   
   We are considering changing the format of the current URI and other things. For future expansion, we need a separate StarRocks engine spec.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1559541454

   Looks like we're good! Thanks so much, @miomiocat!!!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas merged pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas merged PR #23209:
URL: https://github.com/apache/superset/pull/23209


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] codecov[bot] commented on pull request #23209: feat: Add StarRocks support

Posted by "codecov[bot] (via GitHub)" <gi...@apache.org>.
codecov[bot] commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1445960579

   # [Codecov](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#23209](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (951d8ee) into [master](https://codecov.io/gh/apache/superset/commit/64ad70cc8d3b0cbf659ee9dfdf8e7db4ae83fa7a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (64ad70c) will **decrease** coverage by `10.64%`.
   > The diff coverage is `50.60%`.
   
   > :exclamation: Current head 951d8ee differs from pull request most recent head 4a03a1b. Consider uploading reports for the commit 4a03a1b to get more accurate results
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #23209       +/-   ##
   ===========================================
   - Coverage   66.81%   56.17%   -10.64%     
   ===========================================
     Files        1900     1901        +1     
     Lines       73318    73401       +83     
     Branches     7935     7935               
   ===========================================
   - Hits        48986    41236     -7750     
   - Misses      22299    30132     +7833     
     Partials     2033     2033               
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hive | `?` | |
   | presto | `52.65% <50.60%> (-0.01%)` | :arrow_down: |
   | python | `58.82% <50.60%> (-22.00%)` | :arrow_down: |
   | sqlite | `?` | |
   | unit | `52.52% <50.60%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [superset/db\_engine\_specs/starrocks.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZGJfZW5naW5lX3NwZWNzL3N0YXJyb2Nrcy5weQ==) | `50.60% <50.60%> (ø)` | |
   | [superset/utils/dashboard\_import\_export.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvdXRpbHMvZGFzaGJvYXJkX2ltcG9ydF9leHBvcnQucHk=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [superset/tags/core.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvdGFncy9jb3JlLnB5) | `4.54% <0.00%> (-95.46%)` | :arrow_down: |
   | [superset/key\_value/commands/update.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQva2V5X3ZhbHVlL2NvbW1hbmRzL3VwZGF0ZS5weQ==) | `0.00% <0.00%> (-90.91%)` | :arrow_down: |
   | [superset/key\_value/commands/delete.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQva2V5X3ZhbHVlL2NvbW1hbmRzL2RlbGV0ZS5weQ==) | `0.00% <0.00%> (-87.88%)` | :arrow_down: |
   | [superset/key\_value/commands/delete\_expired.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQva2V5X3ZhbHVlL2NvbW1hbmRzL2RlbGV0ZV9leHBpcmVkLnB5) | `0.00% <0.00%> (-84.00%)` | :arrow_down: |
   | [superset/dashboards/commands/importers/v0.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZGFzaGJvYXJkcy9jb21tYW5kcy9pbXBvcnRlcnMvdjAucHk=) | `15.62% <0.00%> (-76.25%)` | :arrow_down: |
   | [superset/reports/commands/execute.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvcmVwb3J0cy9jb21tYW5kcy9leGVjdXRlLnB5) | `23.07% <0.00%> (-68.57%)` | :arrow_down: |
   | [superset/views/datasource/utils.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvdmlld3MvZGF0YXNvdXJjZS91dGlscy5weQ==) | `26.66% <0.00%> (-66.67%)` | :arrow_down: |
   | [superset/datasets/commands/importers/v0.py](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZGF0YXNldHMvY29tbWFuZHMvaW1wb3J0ZXJzL3YwLnB5) | `24.03% <0.00%> (-66.67%)` | :arrow_down: |
   | ... and [280 more](https://codecov.io/gh/apache/superset/pull/23209?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas closed pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas closed pull request #23209: feat: Add StarRocks support
URL: https://github.com/apache/superset/pull/23209


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1467753971

   > Thanks @miomiocat for clarifying. I've left some comments.
   
   Thanks @john-bodley 
   I have fixed your comments, PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] john-bodley commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "john-bodley (via GitHub)" <gi...@apache.org>.
john-bodley commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1134791462


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian

Review Comment:
   What's the context of this TODO?



##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:

Review Comment:
   Setting the `type_code_map` in a get method seems somewhat atypical. Could this logic reside elsewhere?



##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:
+            # only import and store if needed at least once
+            # pylint: disable=import-outside-toplevel
+            import MySQLdb
+
+            ft = MySQLdb.constants.FIELD_TYPE
+            cls.type_code_map = {
+                getattr(ft, k): k for k in dir(ft) if not k.startswith("_")
+            }
+        datatype = type_code
+        if isinstance(type_code, int):
+            datatype = cls.type_code_map.get(type_code)
+        if datatype and isinstance(datatype, str) and datatype:

Review Comment:
   You have `datatype` twice here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1552960864

   Fixed field type issues because mysql does not have STRUCT/MAP/ARRAY types.
   Please help me review again.
   
   @john-bodley @rusackas 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] john-bodley commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "john-bodley (via GitHub)" <gi...@apache.org>.
john-bodley commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1140527822


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:

Review Comment:
   I understand the concept of lazy loading, but wondered if this could be defined elsewhere. That said it does seem like this pattern is used for other connectors. This code block (and others) does feel like a copy-and-paste from the MySQL connector and thus (per my previous comment) I wonder if it makes more sense to inherit from `MySQLEngineSpec` and just override what is necessary.
   
   @villebro this is somewhat akin to the Presto/Trino refactor where one could debate they're either one in the same or completely different.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1542296165

   I've kicked CI a couple of times, but the DB tests still seem to be failing, so those need some 👀. @villebro do you know if this is victim to a flaky test, or if this looks legit?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1558750734

   > Now getting an error from mypy:
   > 
   > ```
   > superset/db_engine_specs/starrocks.py:71: error: Incompatible types in assignment (expression has type Tuple[Tuple[Pattern[str], TINYINT, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], ... <14 more items>], base class "MySQLEngineSpec" defined the type as Tuple[Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType]])  [assignment]
   > ```
   
   @rusackas 
   Thanks for your reply
   
   I've fixed this issue and passed all check except one error in `pre-commit` stage with following message.
   can you give me some help?
   
   ```
   [INFO] This may take a few minutes...
   isort....................................................................Failed
   - hook id: isort
   - files were modified by this hook
   
   Fixing /home/runner/work/superset/superset/tests/unit_tests/db_engine_specs/test_starrocks.py
   Skipped [49](https://github.com/miomiocat/superset/actions/runs/5054598952/jobs/9069709016#step:6:50) files
   
   mypy.....................................................................Passed
   pip-compile-multi verify.................................................Passed
   check docstring is first.................................................Passed
   check for added large files..............................................Passed
   check yaml...............................................................Passed
   debug statements (python)................................................Passed
   fix end of files.........................................................Passed
   trim trailing whitespace.................................................Passed
   black....................................................................Failed
   - hook id: black
   - files were modified by this hook
   
   reformatted tests/unit_tests/db_engine_specs/test_starrocks.py
   
   All done! ✨ 🍰 ✨
   1 file reformatted, 1271 files left unchanged.
   
   prettier.................................................................Passed
   Blacklist................................................................Passed
   Helm Docs................................................................Passed
   Error: Process completed with exit code 1.
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1201695541


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,239 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: no-init
+import logging
+import re
+from typing import Any, Dict, List, Optional, Pattern, Tuple, Type
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import Integer, Numeric, types
+from sqlalchemy.engine import Inspector
+from sqlalchemy.engine.result import Row as ResultRow
+from sqlalchemy.engine.url import URL
+from sqlalchemy.sql.type_api import TypeEngine
+
+from superset.db_engine_specs.mysql import MySQLEngineSpec
+from superset.errors import SupersetErrorType
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+logger = logging.getLogger(__name__)
+
+
+class TINYINT(Integer):
+    __visit_name__ = "TINYINT"
+
+
+class DOUBLE(Numeric):
+    __visit_name__ = "DOUBLE"
+
+
+class ARRAY(TypeEngine):
+    __visit_name__ = "ARRAY"
+
+    @property
+    def python_type(self) -> Optional[Type[List[Any]]]:
+        return list
+
+
+class MAP(TypeEngine):
+    __visit_name__ = "MAP"
+
+    @property
+    def python_type(self) -> Optional[Type[Dict[Any, Any]]]:
+        return dict
+
+
+class STRUCT(TypeEngine):
+    __visit_name__ = "STRUCT"
+
+    @property
+    def python_type(self) -> Optional[Type[Any]]:
+        return None
+
+
+class StarRocksEngineSpec(MySQLEngineSpec):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/catalog.db[?key=value&key=value...]"
+    )
+
+    _default_column_type_mappings = (

Review Comment:
   thanks:)
   added unit tests and updated codes, please review it again. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1135268095


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian

Review Comment:
   removed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1143043758


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:

Review Comment:
   Thanks for your suggestion.
   
   I refactored these codes, inherit from `MySQLEngineSpec`. Only the necessary code is kept now.
   
   PTAL
   
   @john-bodley 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on pull request #23209: feat: Add StarRocks support

Posted by "villebro (via GitHub)" <gi...@apache.org>.
villebro commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1559083989

   @miomiocat if you check the `CONTRIBUTING.md` guide, it shows how to enable pre-commit hooks that automatically check and fix linting issues like this. I pushed a fix to resolve the linting issue, should be good to go after that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] john-bodley commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "john-bodley (via GitHub)" <gi...@apache.org>.
john-bodley commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1134793887


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:
+            # only import and store if needed at least once
+            # pylint: disable=import-outside-toplevel
+            import MySQLdb
+
+            ft = MySQLdb.constants.FIELD_TYPE
+            cls.type_code_map = {
+                getattr(ft, k): k for k in dir(ft) if not k.startswith("_")
+            }
+        datatype = type_code
+        if isinstance(type_code, int):
+            datatype = cls.type_code_map.get(type_code)
+        if datatype and isinstance(datatype, str) and datatype:

Review Comment:
   `datatype` is specified twice here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] john-bodley commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "john-bodley (via GitHub)" <gi...@apache.org>.
john-bodley commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1134793362


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:

Review Comment:
   Setting the `type_code_map` in a get method seems somewhat atypical. Could this logic reside elsewhere, i.e., upon object instantiation or within the class definition?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1524557894

   Closing and re-opening to kick-start CI (not sure why it's stuck in some eternal waiting state).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] john-bodley commented on pull request #23209: feat: Add StarRocks support

Posted by "john-bodley (via GitHub)" <gi...@apache.org>.
john-bodley commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1447713516

   Thanks @miomiocat for the contribution. We (Airbnb) are using StarRocks within Superset, but given StarRocks is [compatible with the MySQL protocol](https://docs.starrocks.io/en-us/latest/introduction/StarRocks_intro) we've had good mileage by simply extending the `MySQLEngineSpec`, i.e., 
   
   ```python
   class StarRocksEngineSpec(MySQLEngineSpec):
       """
       The StarRocks engine specification.
       """
   
       engine = "starrocks"
       engine_name = "StarRocks"
   ```
   
   I was wondering whether you considered this option as this adheres to the DRY principle.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "villebro (via GitHub)" <gi...@apache.org>.
villebro commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1198607949


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,236 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import logging
+import re
+from typing import Any, Dict, List, Optional, Pattern, Tuple, Type
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import Integer, Numeric, types
+from sqlalchemy.engine import Inspector
+from sqlalchemy.engine.result import Row as ResultRow
+from sqlalchemy.engine.url import URL
+from sqlalchemy.sql.type_api import TypeEngine
+
+from superset.db_engine_specs import BaseEngineSpec
+from superset.db_engine_specs.mysql import MySQLEngineSpec
+from superset.errors import SupersetErrorType
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+logger = logging.getLogger(__name__)
+
+
+class TINYINT(Integer):
+    __visit_name__ = "TINYINT"
+
+
+class DOUBLE(Numeric):
+    __visit_name__ = "DOUBLE"
+
+
+class ARRAY(TypeEngine):
+    __visit_name__ = "ARRAY"
+
+    @property
+    def python_type(self) -> Optional[Type[List[Any]]]:
+        return list
+
+
+class MAP(TypeEngine):
+    __visit_name__ = "MAP"
+
+    @property
+    def python_type(self) -> Optional[Type[Dict[Any, Any]]]:
+        return dict
+
+
+class STRUCT(TypeEngine):
+    __visit_name__ = "STRUCT"
+
+    @property
+    def python_type(self) -> Optional[Type[Any]]:
+        return None
+
+
+class StarRocksEngineSpec(MySQLEngineSpec):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/catalog.db[?key=value&key=value...]"
+    )
+
+    super().column_type_mappings = (

Review Comment:
   ```suggestion
       column_type_mappings = (
   ```
   I don't think you can have `super()` here, check how the other db engine specs do it. The way this works, is `column_type_mappings` is checked first, and if there are no hits, it proceeds to `_default_column_type_mappings`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1553579269

   PR was failing due to some pre-commit hook linting errors. Apologies for the churn as I try to get those ironed out ;) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1198224144


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,235 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import logging
+import re
+from typing import Any, Dict, Optional, Pattern, Tuple, List, Type
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types, Integer, Numeric

Review Comment:
   ```suggestion
   from sqlalchemy import, Integer, Numeric, types
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1457424782

   @etr2460 @john-bodley Could you help review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1198223572


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,235 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import logging
+import re
+from typing import Any, Dict, Optional, Pattern, Tuple, List, Type

Review Comment:
   ```suggestion
   from typing import Any, Dict, List, Optional, Pattern, Tuple, Type
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1198658284


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,236 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import logging
+import re
+from typing import Any, Dict, List, Optional, Pattern, Tuple, Type
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import Integer, Numeric, types
+from sqlalchemy.engine import Inspector
+from sqlalchemy.engine.result import Row as ResultRow
+from sqlalchemy.engine.url import URL
+from sqlalchemy.sql.type_api import TypeEngine
+
+from superset.db_engine_specs import BaseEngineSpec
+from superset.db_engine_specs.mysql import MySQLEngineSpec
+from superset.errors import SupersetErrorType
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+logger = logging.getLogger(__name__)
+
+
+class TINYINT(Integer):
+    __visit_name__ = "TINYINT"
+
+
+class DOUBLE(Numeric):
+    __visit_name__ = "DOUBLE"
+
+
+class ARRAY(TypeEngine):
+    __visit_name__ = "ARRAY"
+
+    @property
+    def python_type(self) -> Optional[Type[List[Any]]]:
+        return list
+
+
+class MAP(TypeEngine):
+    __visit_name__ = "MAP"
+
+    @property
+    def python_type(self) -> Optional[Type[Dict[Any, Any]]]:
+        return dict
+
+
+class STRUCT(TypeEngine):
+    __visit_name__ = "STRUCT"
+
+    @property
+    def python_type(self) -> Optional[Type[Any]]:
+        return None
+
+
+class StarRocksEngineSpec(MySQLEngineSpec):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/catalog.db[?key=value&key=value...]"
+    )
+
+    super().column_type_mappings = (

Review Comment:
   Yes, actually I didn't write `super()` originally, but mypy reports `superset/db_engine_specs/starrocks.py:71: error: Incompatible types in assignment (expression has type Tuple[Tuple[Pattern[str], TINYINT, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], ... <14 more items>], base class "MySQLEngineSpec" defined the type as Tuple[Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType], Tuple[Pattern[str], Any, GenericDataType]])  [assignment]` 
   
   I don't know how to fix this check, can you give some suggestion?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rusackas commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1198232045


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,235 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import logging
+import re
+from typing import Any, Dict, List, Optional, Pattern, Tuple, Type
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import, Integer, Numeric, types

Review Comment:
   ```suggestion
   from sqlalchemy import Integer, Numeric, types
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] miomiocat commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "miomiocat (via GitHub)" <gi...@apache.org>.
miomiocat commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1135268790


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:

Review Comment:
   for lazy load here



##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:
+            # only import and store if needed at least once
+            # pylint: disable=import-outside-toplevel
+            import MySQLdb
+
+            ft = MySQLdb.constants.FIELD_TYPE
+            cls.type_code_map = {
+                getattr(ft, k): k for k in dir(ft) if not k.startswith("_")
+            }
+        datatype = type_code
+        if isinstance(type_code, int):
+            datatype = cls.type_code_map.get(type_code)
+        if datatype and isinstance(datatype, str) and datatype:

Review Comment:
   fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] john-bodley commented on a diff in pull request #23209: feat: Add StarRocks support

Posted by "john-bodley (via GitHub)" <gi...@apache.org>.
john-bodley commented on code in PR #23209:
URL: https://github.com/apache/superset/pull/23209#discussion_r1134793362


##########
superset/db_engine_specs/starrocks.py:
##########
@@ -0,0 +1,269 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import re
+from datetime import datetime
+from typing import Any, Dict, Optional, Pattern, Tuple
+from urllib import parse
+
+from flask_babel import gettext as __
+from sqlalchemy import types
+from sqlalchemy.dialects.mysql import (
+    BIT,
+    DECIMAL,
+    DOUBLE,
+    FLOAT,
+    INTEGER,
+    LONGTEXT,
+    MEDIUMINT,
+    MEDIUMTEXT,
+    TINYINT,
+    TINYTEXT,
+)
+from sqlalchemy.engine.url import URL
+
+from superset.db_engine_specs.base import BaseEngineSpec, BasicParametersMixin
+from superset.errors import SupersetErrorType
+from superset.models.sql_lab import Query
+from superset.utils.core import GenericDataType
+
+# Regular expressions to catch custom errors
+CONNECTION_ACCESS_DENIED_REGEX = re.compile(
+    "Access denied for user '(?P<username>.*?)'@'(?P<hostname>.*?)'"
+)
+CONNECTION_INVALID_HOSTNAME_REGEX = re.compile(
+    "Unknown MySQL server host '(?P<hostname>.*?)'"
+)
+CONNECTION_HOST_DOWN_REGEX = re.compile(
+    "Can't connect to MySQL server on '(?P<hostname>.*?)'"
+)
+CONNECTION_UNKNOWN_DATABASE_REGEX = re.compile("Unknown database '(?P<database>.*?)'")
+
+SYNTAX_ERROR_REGEX = re.compile(
+    "check the manual that corresponds to your MySQL server "
+    "version for the right syntax to use near '(?P<server_error>.*)"
+)
+
+
+class StarRocksEngineSpec(BaseEngineSpec, BasicParametersMixin):
+    engine = "starrocks"
+    engine_name = "StarRocks"
+    max_column_name_length = 64
+
+    default_driver = "starrocks"
+    sqlalchemy_uri_placeholder = (
+        "starrocks://user:password@host:port/dbname[?key=value&key=value...]"
+    )
+    encryption_parameters = {"ssl": "1"}
+
+    column_type_mappings = (
+        (
+            re.compile(r"^int.*", re.IGNORECASE),
+            INTEGER(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinyint", re.IGNORECASE),
+            TINYINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^mediumint", re.IGNORECASE),
+            MEDIUMINT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^decimal", re.IGNORECASE),
+            DECIMAL(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^float", re.IGNORECASE),
+            FLOAT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^double", re.IGNORECASE),
+            DOUBLE(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^bit", re.IGNORECASE),
+            BIT(),
+            GenericDataType.NUMERIC,
+        ),
+        (
+            re.compile(r"^tinytext", re.IGNORECASE),
+            TINYTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^mediumtext", re.IGNORECASE),
+            MEDIUMTEXT(),
+            GenericDataType.STRING,
+        ),
+        (
+            re.compile(r"^longtext", re.IGNORECASE),
+            LONGTEXT(),
+            GenericDataType.STRING,
+        ),
+    )
+
+    _time_grain_expressions = {
+        None: "{col}",
+        "PT1S": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60*60 + MINUTE({col})*60"
+        " + SECOND({col})) SECOND)",
+        "PT1M": "DATE_ADD(DATE({col}), "
+        "INTERVAL (HOUR({col})*60 + MINUTE({col})) MINUTE)",
+        "PT1H": "DATE_ADD(DATE({col}), " "INTERVAL HOUR({col}) HOUR)",
+        "P1D": "DATE({col})",
+        "P1W": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFWEEK({col}) - 1 DAY))",
+        "P1M": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFMONTH({col}) - 1 DAY))",
+        "P3M": "MAKEDATE(YEAR({col}), 1) "
+        "+ INTERVAL QUARTER({col}) QUARTER - INTERVAL 1 QUARTER",
+        "P1Y": "DATE(DATE_SUB({col}, " "INTERVAL DAYOFYEAR({col}) - 1 DAY))",
+        "1969-12-29T00:00:00Z/P1W": "DATE(DATE_SUB({col}, "
+        "INTERVAL DAYOFWEEK(DATE_SUB({col}, "
+        "INTERVAL 1 DAY)) - 1 DAY))",
+    }
+
+    type_code_map: Dict[int, str] = {}  # loaded from get_datatype only if needed
+
+    custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = {
+        CONNECTION_ACCESS_DENIED_REGEX: (
+            __('Either the username "%(username)s" or the password is incorrect.'),
+            SupersetErrorType.CONNECTION_ACCESS_DENIED_ERROR,
+            {"invalid": ["username", "password"]},
+        ),
+        CONNECTION_INVALID_HOSTNAME_REGEX: (
+            __('Unknown MySQL server host "%(hostname)s".'),
+            SupersetErrorType.CONNECTION_INVALID_HOSTNAME_ERROR,
+            {"invalid": ["host"]},
+        ),
+        CONNECTION_HOST_DOWN_REGEX: (
+            __('The host "%(hostname)s" might be down and can\'t be reached.'),
+            SupersetErrorType.CONNECTION_HOST_DOWN_ERROR,
+            {"invalid": ["host", "port"]},
+        ),
+        CONNECTION_UNKNOWN_DATABASE_REGEX: (
+            __('Unable to connect to database "%(database)s".'),
+            SupersetErrorType.CONNECTION_UNKNOWN_DATABASE_ERROR,
+            {"invalid": ["database"]},
+        ),
+        SYNTAX_ERROR_REGEX: (
+            __(
+                'Please check your query for syntax errors near "%(server_error)s". '
+                "Then, try running your query again."
+            ),
+            SupersetErrorType.SYNTAX_ERROR,
+            {},
+        ),
+    }
+
+    @classmethod
+    def convert_dttm(
+        cls, target_type: str, dttm: datetime, db_extra: Optional[Dict[str, Any]] = None
+    ) -> Optional[str]:
+        sqla_type = cls.get_sqla_column_type(target_type)
+
+        if isinstance(sqla_type, types.Date):
+            return f"STR_TO_DATE('{dttm.date().isoformat()}', '%Y-%m-%d')"
+        if isinstance(sqla_type, types.DateTime):
+            datetime_formatted = dttm.isoformat(sep=" ", timespec="microseconds")
+            return f"""STR_TO_DATE('{datetime_formatted}', '%Y-%m-%d %H:%i:%s.%f')"""
+        return None
+
+    @classmethod
+    def adjust_database_uri(
+        cls, uri: URL, selected_schema: Optional[str] = None
+    ) -> URL:
+        # # TODO: jiqian
+        # uri = uri.set(drivername="mysql")
+        database = uri.database
+        if selected_schema and database:
+            selected_schema = parse.quote(selected_schema, safe="")
+            if "." in database:
+                database = database.split(".")[0] + "." + selected_schema
+            else:
+                database += "." + selected_schema
+            uri = uri.set(database=database)
+
+        return uri
+
+    @classmethod
+    def get_datatype(cls, type_code: Any) -> Optional[str]:
+        if not cls.type_code_map:

Review Comment:
   Setting the `type_code_map` in a get method seems somewhat atypical. Could this logic reside elsewhere, i.e., upon object instantiation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] john-bodley commented on pull request #23209: feat: Add StarRocks support

Posted by "john-bodley (via GitHub)" <gi...@apache.org>.
john-bodley commented on PR #23209:
URL: https://github.com/apache/superset/pull/23209#issuecomment-1467214978

   Thanks @miomiocat for clarifying. I've left some comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org