Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/11/12 09:28:45 UTC

[GitHub] [incubator-superset] Sascha-Gschwind opened a new issue #8544: [SIP] Proposal for Geocoding

URL: https://github.com/apache/incubator-superset/issues/8544
 
 
   ## [SIP] Proposal for Geocoding
   
   ### Motivation
   
   A Superset user who wants to build charts such as `deck.gl Scatterplot` from address-based data must first convert the addresses with an external tool, because these charts require `latitude/longitude` columns.
   
   Many other BI tools can convert addresses automatically.
   
   ### Proposed Change
   
   We want to implement a feature using the `GeoPy` package with the `Mapbox Geocoding API` that can convert addresses to `latitude/longitude` and save those values as additional columns or overwrite certain columns in the same table.
   
   To make the API calls we plan to use the same API-Key that is already used for the background maps (`Mapbox API Key`).
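
As a minimal sketch of the conversion step described above, the following shows how address columns might be turned into `latitude/longitude` columns. The function `geocode_rows`, the `Geocoder` type, and the lookup table in `fake_geocode` are hypothetical and not part of the proposal. In practice the callable could wrap GeoPy's `geopy.geocoders.MapBox(api_key=...).geocode(address)`, which returns a location object exposing `.latitude` and `.longitude`, or `None` when nothing is found.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: any callable mapping an address string to (lat, lon),
# or None when the address cannot be resolved.
Geocoder = Callable[[str], Optional[Tuple[float, float]]]


def geocode_rows(
    rows: List[Dict[str, object]],
    address_columns: List[str],
    geocode: Geocoder,
    lat_column: str = "latitude",
    lon_column: str = "longitude",
) -> List[Dict[str, object]]:
    """Return copies of rows with latitude/longitude columns added or overwritten."""
    result = []
    for row in rows:
        # Join the chosen columns into one query string, skipping empty parts.
        address = ", ".join(str(row[c]) for c in address_columns if row.get(c))
        coords = geocode(address) if address else None
        updated = dict(row)
        updated[lat_column] = coords[0] if coords else None
        updated[lon_column] = coords[1] if coords else None
        result.append(updated)
    return result


def fake_geocode(address: str) -> Optional[Tuple[float, float]]:
    """Stand-in for a real geocoder so the sketch runs offline (made-up data)."""
    known = {"Bahnhofstrasse 1, Zurich": (47.3769, 8.5417)}
    return known.get(address)
```

Passing the geocoder in as a parameter keeps the column handling testable without network access or an API key.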
   
   The feature will be available under the menu "Sources" > "Geocode Addresses" and will run asynchronously. However, only one geocoding job can be in progress at a time. There are several reasons for this decision:
   * Most geocoding APIs limit the number of requests per second
   * Most geocoding APIs limit the number of requests that can be made over a certain time period
   * Depending on the amount of data in the table, the process can take a very long time
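
The two constraints above (one job at a time, limited requests per second) could be sketched as follows. Both `SingleJobGuard` and `Throttle` are hypothetical helpers, not part of the proposal. The sketch also assumes a single server process: a `threading.Lock` would not coordinate several workers, which would need shared state instead.

```python
import threading
import time
from typing import Callable, Optional


class SingleJobGuard:
    """Allows at most one geocoding job at a time (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def try_start(self) -> bool:
        # Non-blocking acquire: returns False if a job is already running.
        return self._lock.acquire(blocking=False)

    def finish(self) -> None:
        self._lock.release()

    def in_progress(self) -> bool:
        return self._lock.locked()


class Throttle:
    """Spaces out calls so that at most max_per_second are made."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self._min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

A job would call `try_start()` once before beginning, `wait()` before each API request, and `finish()` when done or interrupted.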
   
   If a geocoding job is in progress and the user navigates to the "Geocode Addresses" URL, they will see a progress bar and will be able to cancel the process. If no job is running, the geocoding form will be shown.
   
   The user can decide what happens if something goes wrong (e.g. call limit reached, connection issues) or the process is interrupted: they can choose to save the data converted so far or discard it.
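
One way the save-or-discard choice might look in code is sketched below. The class `GeocodingJob` and its `save_on_interrupt` flag are hypothetical names, and catching every `Exception` is a simplification for illustration.

```python
import threading
from typing import Callable, List, Optional, Tuple

Coordinates = Optional[Tuple[float, float]]


class GeocodingJob:
    """Hypothetical job object tracking progress and cancellation."""

    def __init__(self, addresses: List[str], save_on_interrupt: bool) -> None:
        self.addresses = addresses
        self.save_on_interrupt = save_on_interrupt
        self.done = 0
        self._cancelled = threading.Event()

    def cancel(self) -> None:
        self._cancelled.set()

    def progress_percent(self) -> float:
        if not self.addresses:
            return 100.0
        return 100.0 * self.done / len(self.addresses)

    def run(self, geocode: Callable[[str], Coordinates]) -> List[Coordinates]:
        """Geocode every address; on error or cancel, keep or discard partial results."""
        results: List[Coordinates] = []
        try:
            for address in self.addresses:
                if self._cancelled.is_set():
                    raise InterruptedError("geocoding cancelled by user")
                results.append(geocode(address))
                self.done += 1
        except Exception:
            # Covers call-limit errors, connection issues, and cancellation.
            if not self.save_on_interrupt:
                results = []
        return results
```

With `save_on_interrupt=True` the caller receives the coordinates obtained before the failure; with `False` it receives an empty list.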
   
   ### New or Changed Public Interfaces
   
   * There will be a new form for the Geocoding in React
   * There will be a new REST API that geocodes the address-based data in a specific table and adds or overwrites columns
   * There will be a new REST API that informs the caller if a geocoding is already in progress
   * There will be a new REST API that informs the caller about the progress (in %)
   * There will be a new REST API with which a user can interrupt a geocoding process
   * There will be a new REST API where you can get a list of columns for a selected table
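
To illustrate the status-related endpoints in the list above, here is a framework-free sketch. The proposal does not name any paths, payload fields, or storage, so the routes, the JSON keys, and the in-memory `state` dictionary are all assumptions made for illustration only.

```python
import json
from typing import Callable, Dict, Tuple

# Hypothetical in-memory state; the proposal does not specify storage.
state = {"in_progress": False, "done": 0, "total": 0, "interrupted": False}


def is_in_progress() -> dict:
    return {"in_progress": state["in_progress"]}


def progress() -> dict:
    total = state["total"]
    percent = 100.0 * state["done"] / total if total else 0.0
    return {"progress": percent}


def interrupt() -> dict:
    # Only a running job can be interrupted.
    state["interrupted"] = state["in_progress"]
    return {"interrupted": state["interrupted"]}


# Hypothetical paths, chosen only for illustration.
ROUTES: Dict[Tuple[str, str], Callable[[], dict]] = {
    ("GET", "/geocoding/is_in_progress"): is_in_progress,
    ("GET", "/geocoding/progress"): progress,
    ("POST", "/geocoding/interrupt"): interrupt,
}


def handle(method: str, path: str) -> Tuple[int, str]:
    """Return an (HTTP status, JSON body) pair for a request."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(handler())
```

The React form could poll the progress route to drive the progress bar and call the interrupt route when the user cancels.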
   
   ### New dependencies
   
   * We **do not need** a new dependency, because Superset is already using GeoPy
   
   ### Migration Plan and Compatibility
   
   Documentation describing how to use this feature will likely need to be added once it is merged into master.
   
   ### Rejected Alternatives
   
   * We considered on-the-fly geocoding, i.e. accepting address data directly in certain charts like the `deck.gl Scatterplot`, but rejected the idea because geocoding is an expensive operation and should be done only once per dataset.
   
