Posted to commits@echarts.apache.org by GitBox <gi...@apache.org> on 2020/10/06 17:06:30 UTC
[GitHub] [incubator-echarts] 100pah commented on pull request #13358: Custom morph2

100pah commented on pull request #13358:
URL: https://github.com/apache/incubator-echarts/pull/13358#issuecomment-704408035


   # How should a user express data mapping for transition?
   
   
   ## Issues
   
   First and foremost, we need to consider the issues below:
   
   ### ISSUE_I: If we need to "auto detect the change of dimensions" between old data and new data, how should we implement it?
   We should consider:
   + We have never forced users to specify dimension names. Users can refer to a dimension by its dimension index alone, which is probably convenient in some scenarios in practice.
   + If we implement "data mapping for transition animation" via "auto detection of the change of dimensions", we can probably require users to specify dimension names if they want a "correct transition animation", and perform mapping by the rule `MAPPING_ON_THE_SAME_DIMENSION_NAME`. The rule means that if `oldData.dimensions[i].name` equals `newData.dimensions[j].name`, we can map data items by their values on `oldData.dimensions[i]` and `newData.dimensions[j]`. **Is there any flaw in applying that rule?**
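   
   A minimal sketch of how the candidate dimensions for `MAPPING_ON_THE_SAME_DIMENSION_NAME` could be collected. `findSharedDimensions` is a hypothetical helper, not an existing ECharts API, and dimensions are modeled as plain arrays of names where an unnamed (index-only) dimension is `null`. The three-dimension array passed as the second argument is an assumed aggregate made up for the illustration.
   
   ```javascript
   // Sketch only: hypothetical helper, not part of ECharts.
   // Returns every pair of dimensions that share the same name.
   function findSharedDimensions(oldDims, newDims) {
       const pairs = [];
       oldDims.forEach((name, i) => {
           if (name == null) {
               return; // Index-only dimensions cannot take part in name matching.
           }
           const j = newDims.indexOf(name);
           if (j >= 0) {
               pairs.push({ name: name, from: i, to: j });
           }
       });
       return pairs;
   }
   
   const sharedDims = findSharedDimensions(
       ['Year', 'Income', 'Population', 'Sex', 'Country'],
       ['Population', 'Income', 'Country'] // Assumed dimensions of an aggregate.
   );
   ```
   
   In this example more than one dimension name is shared (`Income`, `Population`, `Country`), so name equality alone does not say which dimension should drive the mapping. That may be one flaw of the rule.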
   
   
   ### ISSUE_II: The issues of "mapping by index":
   The default data mapping implementation is provided by `List['diff']`: if the names of data items are not specified, they are mapped by data index. "Mapping by index" is not a big deal in scenarios where the meaning of the transition goes unnoticed. But in scenarios where the meaning of the transition matters, like storytelling, any incorrect data mapping is probably inappropriate. For example:
   
   `dataA` is the raw data, where the dimensions are `['Year', 'Income', 'Population', 'Sex', 'Country']`.
   `dataB` is calculated by:
   ```sql
   select avg(`Population`), avg(`Income`) from `dataA` group by `Sex`;
   ```
   `dataC` is calculated by:
   ```sql
   select avg(`Population`), avg(`Income`) from `dataA` group by `Country`;
   ```
   Suppose there are only two values in dimension `Country` (`'France'`, `'Germany'`), which is the same as the number of values in dimension `Sex` (`'Woman'`, `'Man'`).
   Consequently the item counts of `dataB` and `dataC` are exactly the same.
   Given the data above, when `dataB` is switched to `dataC` via `setOption`, the data mapping should not be performed by index. Otherwise there will be misleading mappings from `'Man'` to `'France'` or from `'Woman'` to `'Germany'`. In this case, no transition animation is probably better than a misleading transition animation.
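   
   The mismatch can be sketched with two tiny stand-in functions. They are simplified illustrations written for this comment, not the actual `List['diff']` implementation, and the item objects carry only a hypothetical `key` field.
   
   ```javascript
   // Sketch only: simplified stand-ins, not the real List['diff'] logic.
   const itemsBySex = [{ key: 'Man' }, { key: 'Woman' }];
   const itemsByCountry = [{ key: 'France' }, { key: 'Germany' }];
   
   // Pairs the i-th old item with the i-th new item, whatever they mean.
   function pairByIndex(oldItems, newItems) {
       const count = Math.min(oldItems.length, newItems.length);
       const pairs = [];
       for (let i = 0; i < count; i++) {
           pairs.push([oldItems[i].key, newItems[i].key]);
       }
       return pairs;
   }
   
   // Pairs only the items whose key exists on both sides.
   function pairByKey(oldItems, newItems) {
       const newKeys = new Set(newItems.map(item => item.key));
       return oldItems
           .filter(item => newKeys.has(item.key))
           .map(item => [item.key, item.key]);
   }
   
   const indexPairs = pairByIndex(itemsBySex, itemsByCountry);
   const keyPairs = pairByKey(itemsBySex, itemsByCountry);
   ```
   
   Here `indexPairs` links `'Man'` to `'France'` and `'Woman'` to `'Germany'`, while `keyPairs` is empty, which corresponds to "no transition animation".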
   
   
   ### ISSUE_III: The issues of "when dimensions not changed":
   Suppose the dimensions do not change before and after `setOption` is called:
   The dimensions of `dataA` are `['Income', 'Population', 'Country']`,
   The dimensions of `dataB` are `['Income', 'Population', 'Country']`, exactly the same.
   But `dataB` is calculated by:
   ```sql
   select sum(`Income`), avg(`Population`) from `dataA` group by `Country`;
   ```
   Given the data above, the dimensions have not changed, but obviously the data should be mapped neither by index nor by the first shared dimension (`Income`). The appropriate mapping should be performed on dimension `Country`, which, nevertheless, cannot be auto-detected.
   
   That is, even when the dimensions are unchanged, it is hard to auto-detect a totally correct data mapping. User input about the transition is still needed in this case.
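   
   Once the user has named the dimension, the mapping itself is straightforward. A rough sketch follows; `mapByDimension` and the row objects are hypothetical, and only the `Country` field is filled in because the other values do not matter for the mapping.
   
   ```javascript
   // Sketch only: hypothetical helper, not part of ECharts.
   // Maps each old row to the new row that has the same value on `dim`,
   // or to -1 when no such row exists.
   function mapByDimension(oldRows, newRows, dim) {
       const targetIndexByValue = new Map();
       newRows.forEach((row, index) => {
           targetIndexByValue.set(row[dim], index);
       });
       return oldRows.map((row, index) => ({
           from: index,
           to: targetIndexByValue.has(row[dim]) ? targetIndexByValue.get(row[dim]) : -1
       }));
   }
   
   // Raw rows (like dataA) and rows grouped by Country (like dataB).
   const rawRows = [{ Country: 'France' }, { Country: 'Germany' }, { Country: 'France' }];
   const groupedRows = [{ Country: 'France' }, { Country: 'Germany' }];
   const countryMapping = mapByDimension(rawRows, groupedRows, 'Country');
   ```
   
   Note that the result is many-to-one (two raw rows point at the `'France'` group), which mapping by index cannot express.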
   
   
   ### ISSUE_IV: Issues about "user specifies a dimension (also called `key` below) to perform mapping":
   Suppose there are these requirements:
   1. `dataB`(`seriesB`)  ---transition1(on `'Country'`)--->  `dataA`(`seriesA`)
   2. `dataC`(`seriesC`)  ---transition2(on `'Income'`)--->  `dataA`(`seriesA`)
   
   We call the data before the "transition arrow" `from`, and the data after the arrow `to`.
   `transition1` needs the user to input the key `'Country'`, and `transition2` needs the user to input the key `'Income'`.
   That is, the "user specified key" is related not only to `to` but also to `from`.
   That is, the "user specified key" only works for this call of `setOption`, and should be discarded after `setOption` is called.
   That is, the "user specified key" had better be set in the params of `setOption` rather than in the series option.
   
   If we intend to put the "user specified key" in the series option, we probably need to lift the concept of that "key", so that it describes a feature of the data itself rather than this particular transition. For example, it could declare the unique key of the data, and we could make an auto-mapping rule based on the unique key. We will discuss it below in detail.
   
   
   ### ISSUE_V: Issues about "data totally not changed but need transition animation".
   For example, a transition from a bar chart to a pie chart with the same data.
   Suppose there is `dataA`, whose dimensions are `['Income', 'Population', 'Country', 'Sex']`, and no dimension is suitable for `itemName`.
   The current default rule of mapping by index can handle that.
   But if we disable the "mapping by index" rule for the transition animation scenario, how do we handle it?
   A possible solution could be:
   ```js
   option = {
       dataset: [{
           dimensions: ['Income', 'Population', 'Country', 'Sex'],
           source: dataA
       }, {
           // Generate an extra dimension as id.
           transform: {
               type: 'id',
               dimensionIndex: 4,
               dimensionName: 'Id'
           }
       }],
       series: {
           type: 'custom',
           encode: { itemName: 'Id' },
           datasetIndex: 1
       }
   };
   ```
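   
   The `'id'` transform above does not exist yet. As a rough sketch of what it might do, assuming that the generated id is simply the row index (nothing above fixes that choice), a hypothetical `appendIdDimension` could look like this. The string values in the rows are placeholders.
   
   ```javascript
   // Sketch only: hypothetical helper; assumes the id is the row index.
   // Returns a copy of `source` where each row has its id written at `dimensionIndex`.
   function appendIdDimension(source, dimensionIndex) {
       return source.map((row, rowIndex) => {
           const withId = row.slice(); // Do not mutate the upstream dataset.
           withId[dimensionIndex] = rowIndex;
           return withId;
       });
   }
   
   const sourceWithId = appendIdDimension([
       ['income0', 'population0', 'France', 'Woman'],
       ['income1', 'population1', 'Germany', 'Man']
   ], 4);
   ```
   
   Because the ids depend only on row order, they stay stable as long as the source is unchanged, which is all the bar-to-pie case needs.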
   
   <br>
   
   ## Solutions
   
   Based on the scenarios listed above, I have summarized two designs for **how a user expresses data mapping for transition**.
   
   
   ### SOLUTION_A: The dimension key for data mapping is set in the parameters of `setOption`.
   That is, the user is responsible for setting the "from dimension" and the "to dimension" of the data mapping when intending to have a transition animation.
   
   The advantages:
   + The API is relatively more "atomic". Users can control everything about the transition, which might avoid some bad cases that we haven't thought of.
   + It's not hard for users to configure it in "linear scene changing" (that is, optionA -> optionB -> optionC, a linked list rather than a directed graph).
   
   The disadvantages:
   + It's not easy for users to configure it in "directed-graph scene changing", where users might need an upper layer to manage transition settings.
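   
   To illustrate the "upper layer" mentioned in the disadvantage, here is a rough sketch of the bookkeeping a user might have to maintain. Everything in it is hypothetical (the scene names, the `'from->to'` edge key format and `lookupTransitionKey`), and it does not call any ECharts API. The two entries mirror `transition1` and `transition2` from ISSUE_IV.
   
   ```javascript
   // Sketch only: hypothetical user-side bookkeeping, one entry per edge of the scene graph.
   const transitionKeys = new Map([
       ['sceneB->sceneA', { from: 'Country', to: 'Country' }],
       ['sceneC->sceneA', { from: 'Income', to: 'Income' }]
   ]);
   
   // Returns the declared mapping dimensions for an edge, or undefined
   // when the user has not declared how to map between the two scenes.
   function lookupTransitionKey(fromScene, toScene) {
       return transitionKeys.get(fromScene + '->' + toScene);
   }
   
   const keyForBToA = lookupTransitionKey('sceneB', 'sceneA');
   const keyForAToB = lookupTransitionKey('sceneA', 'sceneB');
   ```
   
   In a linked list of scenes this table has one entry per step, but in a directed graph it needs one entry per edge, which is where the configuration burden comes from.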
   
   
   
   ### SOLUTION_B: The "key" about transition is set in the series option.
   
   The key points of this strategy:
   + Apply `MAPPING_ON_THE_SAME_DIMENSION_NAME`.
   + The user is responsible for specifying the "unique key" of the data, which is subsequently used to select the transition key.
       + The term "unique key" follows the same concept as a unique key in a database.
       + `PENDING_I`: how to specify the unique key?
           + We can use the existing setting `series.encode.itemName` to specify the unique key. At present, `series.encode.itemName` specifies which dimension's values are used as the name of each data item, and `List['diff']` uses it to perform data mapping. But `series.encode.itemName` is also responsible for what is displayed in the default tooltip. If we generate an extra dimension containing unintelligible ids (see ISSUE_V), it had better not be displayed in the default tooltip. Considering this case, should we avoid using `series.encode.itemName` to express the unique key? ![image](https://user-images.githubusercontent.com/1956569/95232028-8c4c5100-0836-11eb-9a82-8a560ff6e6a7.png)
           + Or we can add a new setting `series.encode.itemId` (or `series.encode.uniqueKey`?) to take charge of this job, whose only difference from `series.encode.itemName` is that it will not be displayed in the default tooltip. It might also be more semantically correct.
   
   Considering compatibility with the current mapping strategy, when `setOption` is called, we apply the following rules:
   + Get `UNIQUE_KEY_DIMENSION_NAME`: if `series.encode.itemId` (or `series.encode.itemName`, see `PENDING_I`) is specified and its dimension has a name, we have `UNIQUE_KEY_DIMENSION_NAME`.
   + If there is `newData`.`UNIQUE_KEY_DIMENSION_NAME`, check it in `oldData`. If any dimension has the same name, we have the transition mapping dimensions `from` and `to`.
   + Else if there is `oldData`.`UNIQUE_KEY_DIMENSION_NAME`, check it in `newData`. If any dimension has the same name, we have the transition mapping dimensions `from` and `to`.
   + Else if there is `newData`.`UNIQUE_KEY_DIMENSION_NAME`, do not apply transition animation.
       + This is to provide a way to disable an unexpected transition.
   + Else apply the existing mapping rule.
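   
   One possible reading of the rule list above, sketched as a plain function. `resolveTransitionMapping` and the `{ dimensions, uniqueKey }` shape are hypothetical, and the sketch assumes that each "else if" is also reached when the previous check found no dimension with the same name. For the example data it further assumes that the key dimension is listed in `dimensions`.
   
   ```javascript
   // Sketch only: hypothetical function and data shape, not part of ECharts.
   // `uniqueKey` is the UNIQUE_KEY_DIMENSION_NAME of a data set, or null if there is none.
   function resolveTransitionMapping(oldData, newData) {
       const hasDimension = (data, name) => data.dimensions.indexOf(name) >= 0;
   
       if (newData.uniqueKey != null && hasDimension(oldData, newData.uniqueKey)) {
           return { type: 'byDimension', dimension: newData.uniqueKey };
       }
       if (oldData.uniqueKey != null && hasDimension(newData, oldData.uniqueKey)) {
           return { type: 'byDimension', dimension: oldData.uniqueKey };
       }
       if (newData.uniqueKey != null) {
           return { type: 'none' }; // A key is declared but cannot be matched: no transition.
       }
       return { type: 'existing' }; // Fall back to the existing mapping rule.
   }
   
   // ISSUE_II: aggregate by Sex, then aggregate by Country.
   const resultIssue2 = resolveTransitionMapping(
       { dimensions: ['Population', 'Income', 'Sex'], uniqueKey: 'Sex' },
       { dimensions: ['Population', 'Income', 'Country'], uniqueKey: 'Country' }
   );
   
   // ISSUE_III: raw data without a unique key, then aggregate by Country.
   const resultIssue3 = resolveTransitionMapping(
       { dimensions: ['Income', 'Population', 'Country'], uniqueKey: null },
       { dimensions: ['Income', 'Population', 'Country'], uniqueKey: 'Country' }
   );
   ```
   
   Under these assumptions `resultIssue2` asks for no transition and `resultIssue3` maps on `Country`, matching the expectations in the usage hints below.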
   
   
   **User usage hints:**
   
   Scenario in (ISSUE_II):
   Expect no transition animation.
   ```js
   chart.setOption({
       series: {
           encode: { itemId: 'Sex' },
           dimensions: ['Population', 'Income'],
           data: dataB_aggregate_by_Sex_from_dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Country' },
           dimensions: ['Population', 'Income'],
           data: dataC_aggregate_by_Country_from_dataA
       }
   });
   ```
   
   Scenario in (ISSUE_III):
   Expect mapping by dimension `Country`.
   ```js
   chart.setOption({
       series: {
           encode: { itemId: -1 }, // Means no item id.
           dimensions: ['Income', 'Population', 'Country'],
           data: dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Country' },
           dimensions: ['Income', 'Population', 'Country'],
           data: dataB_aggregate_by_Country_from_dataA
       }
   });
   ```
   
   Scenario in (ISSUE_IV):
   ```js
   chart.setOption({
       series: {
           encode: { itemId: -1 }, // Means no item id.
           dimensions: ['Income', 'Population', 'Country', 'Sex'],
           data: dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Country' },
           dimensions: ['Income', 'Population', 'Country'],
           data: dataB_aggregate_by_Country_from_dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Sex' },
           dimensions: ['Income', 'Population', 'Sex'],
           data: dataC_aggregate_by_Sex_from_dataA
       }
   });
   ```
   
   Scenario in (ISSUE_V):
   Expect transition between bar and pie with the same data.
   ```js
   chart.setOption({
       dataset: [{
           dimensions: ['Income', 'Population', 'Country', 'Sex'],
           source: dataA
       }, {
           // Generate an extra dimension as id.
           transform: {
               type: 'id',
               dimensionIndex: 4,
               dimensionName: 'Id'
           }
       }]
   }, { lazyUpdate: true });
   
   chart.setOption({
       series: {
           // render bar
           type: 'custom',
           renderItem: renderBar,
           encode: { itemId: 'Id' },
           datasetIndex: 1
       }
   });
   chart.setOption({
       series: {
           // render pie
           type: 'custom',
           renderItem: renderPie,
           encode: { itemId: 'Id' },
           datasetIndex: 1
       }
   });
   ```
   
   <br>
   
   
   ## Summary
   
   At present I think `SOLUTION_B` is probably better.
   But it might offer less capability than `SOLUTION_A`. I am not sure whether there is any meaningful scenario that `SOLUTION_B` does not cover.
   And I am not sure about `PENDING_I`.
   
   
   What's your opinion, @pissang?
   
   
   
   
   
   

