Posted to dev@streampipes.apache.org by Philipp Zehnder <ze...@apache.org> on 2020/11/09 17:10:31 UTC

Re: Time Series Data Labeling Tool

Hi all,

we have a first running version of the new labeling tools for images and time-series data.
Special thanks go to Marco, who helped a lot with the backend implementation to persist the labels.

Users can now configure custom labels that can be used for labeling data.
Currently, the functionality to add new labels is located in the overview of the data explorer.
Any ideas where we should put this functionality?
At the moment it is just a button, which we should replace soon.

I merged the current state of branch STREAMPIPES-234 into dev so that everyone can try the new features.
If you have any feedback, please feel free to contact us.

Philipp


> On 3. Oct 2020, at 09:04, Philipp Zehnder <ze...@apache.org> wrote:
> 
> Hi Marco,
> 
> I implemented the UI for the label configuration functionality in branch STREAMPIPES-234.
> 
> Currently there is an “Edit Labels” button in the data explorer that opens the configuration menu.
> The button is not final; for now it only serves to open the menu, and we will change its appearance in the future.
> 
> The code for the UI is in src/app/core-ui/labels. This directory contains a service that currently stores the categories locally.
> We will integrate the REST API that you are developing into this service.
> 
> So far I implemented the functionality to add, edit, and remove categories.
> One category can have multiple labels, and each label has a name and a color.
> I already used the data model you provided. The next step would be to persist the user's changes in the database.
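
As a rough illustration of the model described above (a category holding several labels, each with a name and a color), here is a minimal Python sketch. All class and method names are hypothetical and only mirror the add, edit, and remove operations mentioned in the message; the real service lives in the UI code and is meant to talk to the REST API rather than keep state in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Label:
    name: str
    color: str  # assumed to be a hex string such as "#00ff00"


@dataclass
class Category:
    name: str
    labels: List[Label] = field(default_factory=list)


class LocalCategoryStore:
    """Hypothetical in-memory stand-in for a service that stores categories locally."""

    def __init__(self) -> None:
        self._categories: Dict[str, Category] = {}

    def add(self, category: Category) -> None:
        # Category names are used as keys here, so they must be unique (an assumption).
        if category.name in self._categories:
            raise ValueError(f"category {category.name!r} already exists")
        self._categories[category.name] = category

    def rename(self, old_name: str, new_name: str) -> None:
        if new_name != old_name and new_name in self._categories:
            raise ValueError(f"category {new_name!r} already exists")
        category = self._categories.pop(old_name)  # KeyError if unknown
        category.name = new_name
        self._categories[new_name] = category

    def remove(self, name: str) -> None:
        del self._categories[name]  # KeyError if unknown

    def names(self) -> List[str]:
        return sorted(self._categories)
```

Keying the store by category name keeps the sketch short; a persisted version would more likely key by a generated ID so that renaming does not change identity.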
> 
> Please tell me if you have any suggestions for improvement.
> 
> Philipp
> 
> 
> 
>> On 2. Oct 2020, at 09:11, Philipp Zehnder <ze...@apache.org> wrote:
>> 
>> Sorry, I just saw your PR. Thank you.
>> 
>>> On 2. Oct 2020, at 09:09, Philipp Zehnder <ze...@apache.org> wrote:
>>> 
>>> Hi Marco,
>>> 
>>> very cool. I started working on the front-end, which allows users to manage their labels via the GUI in branch STREAMPIPES-234.
>>> 
>>> When your model is ready, it would be good to open a pull request. Currently I only use a JSON object for the labels, but it would be good to have the same model as in the backend.
>>>
>>> My suggestion would be that a label has a unique ID, a name, a color, and a category, and that a category has an ID and a name.
>>> 
>>> Is there any other information that we need?
>>> 
>>> Philipp
>>> 
>>>> On 1. Oct 2020, at 09:41, Marco Heyden <he...@gmail.com> wrote:
>>>> 
>>>> Hi Philipp,
>>>> 
>>>> I started working on the backend of the labeling tool and implemented two classes, Label and LabelCategory. I also started implementing the REST API.
>>>>
>>>> I am also thinking about how to persist the labels and categories in the database: should a label belong to a certain category (or to multiple categories), or should a category have certain labels?
>>>> 
>>>> What are your thoughts on that?
>>>> 
>>>> Best
>>>> Marco
>>>> 
>>>> On 13.09.20, 00:01, "Philipp Zehnder" <ze...@apache.org> wrote:
>>>> 
>>>> Hi Marco,
>>>> 
>>>> I am happy to hear that. I created a branch [1] that contains the current prototype implementation.
>>>> I tried to improve the usability: users can now use keyboard shortcuts to select the label, as in the image label editor.
>>>>
>>>> Everything should work, but it is still a bit buggy. If you want, you can try it out; I would be happy to get some feedback.
>>>> 
>>>> Philipp
>>>> 
>>>> 
>>>> [1] https://github.com/apache/incubator-streampipes/tree/STREAMPIPES-234
>>>> 
>>>>> On 11. Sep 2020, at 10:17, Marco Heyden <he...@gmail.com> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> this is a very cool feature which would help me a lot in my work.
>>>>> Let me know where I can support you!
>>>>> 
>>>>> Best 
>>>>> Marco
>>>>> 
>>>>> On 11.09.20, 09:16, "Philipp Zehnder" <ze...@apache.org> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I needed the label editor to label some time-series data.
>>>>> I used the current version and it works, but so far it is only a prototype, and you have to press many buttons to label data.
>>>>> I started to refactor the component in [1], with the goal of harmonizing the labeling function with the image labeling components.
>>>>> 
>>>>> Here is the link to the issue [2].
>>>>> 
>>>>> Philipp
>>>>> 
>>>>> [1] https://github.com/apache/incubator-streampipes/tree/STREAMPIPES-234
>>>>> [2] https://issues.apache.org/jira/projects/STREAMPIPES/issues/STREAMPIPES-234?filter=allopenissues
>>>>> 
>>>>> On 2020/04/02 06:11:44, Philipp Zehnder <z....@apache.org> wrote:
>>>>>> Hi Daniel,
>>>>>>
>>>>>> thanks, your pull request looks very good and I merged it directly.
>>>>>> If I understood it correctly, the labels are currently set only in the UI and not persisted to the database, right?
>>>>>>
>>>>>> So I think the next step would be to implement the backend to update the labels in the database.
>>>>>>
>>>>>> The endpoint for the data lake is defined in the class DataLakeResourceV3 [1]. Here we would need a new POST endpoint with the parameters startDate, endDate, and label to update the database values.
>>>>>> My suggestion would be not to reload the data directly after a new label is added, but instead to update both the UI and the backend, to ensure a better user experience.
>>>>>>
>>>>>> Regarding the naming of the label column in the database:
>>>>>> We could add a new column called 'sp_internal_label' to each data stream, containing a string value with the label for each data point.
>>>>>> Alternatively, we could add a new table to a database (e.g. CouchDB) with just the label information and then merge the information before it is sent to the browser.
>>>>>> This would ease the insertion of new labels, but make requesting the data more complex.
>>>>>>
>>>>>> I would prefer to add a label to each data point. This solution makes the creation of labels a bit more challenging, but should perform better when querying the data.
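
To make the per-data-point option concrete, here is a minimal Python sketch of what such an update would do with the three proposed parameters (startDate, endDate, label). It works on an in-memory list rather than the data lake, and the function name, the `timestamp` field, epoch-millisecond timestamps, and inclusive range bounds are all assumptions made for illustration; only the column name 'sp_internal_label' comes from the message above.

```python
from typing import Any, Dict, List

LABEL_COLUMN = "sp_internal_label"  # column name proposed in the thread


def apply_label(
    points: List[Dict[str, Any]],
    start_date: int,
    end_date: int,
    label: str,
    time_field: str = "timestamp",  # hypothetical field name
) -> int:
    """Write `label` to every point whose timestamp lies in [start_date, end_date].

    Timestamps are assumed to be epoch milliseconds and both bounds inclusive.
    Returns the number of points that were updated.
    """
    if start_date > end_date:
        raise ValueError("start_date must not be after end_date")
    updated = 0
    for point in points:
        if start_date <= point[time_field] <= end_date:
            point[LABEL_COLUMN] = label
            updated += 1
    return updated
```

The loop shows why this option makes writing labels more work (every point in the range is touched), while a later query gets the label together with the measurement without any merge step.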
>>>>>> 
>>>>>> What do you think about that?
>>>>>>
>>>>>> Kind regards,
>>>>>> Philipp
>>>>>>
>>>>>>
>>>>>> [1] https://github.com/apache/incubator-streampipes/blob/9e5cafd11f530d5212f973da7b532b0dd102c22d/streampipes-rest/src/main/java/org/apache/streampipes/rest/impl/datalake/DataLakeResourceV3.java
>>>>>> 
>>>>>> On 2020/04/01 21:33:48, Daniel Ebi <E....@fzi.de> wrote:
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I have extended the time series data labeling tool so that the labeling mode now remains active after clicking the 'Labeling' button, until it is deactivated by clicking the button again or by switching to the Plotly-specific 'Pan' or 'Zoom' mode. The active labeling mode is highlighted by a color change of the 'Labeling' button icon.
>>>>>>>
>>>>>>> I have just created a pull request on GitHub (pull request #13).
>>>>>>>
>>>>>>> Daniel
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Philipp Zehnder <ze...@apache.org>
>>>>>>> Sent: Tuesday, 31 March 2020 17:54
>>>>>>> To: dev@streampipes.apache.org
>>>>>>> Subject: Re: AW: Time Series Data Labeling Tool
>>>>>>>
>>>>>>> Hi Daniel,
>>>>>>>
>>>>>>> the time series labeling tool looks very cool. I checked it out and it worked right away.
>>>>>>>
>>>>>>> One question: currently I have to click on the button ('label', in the top right corner) for each label I want to select.
>>>>>>> Can we change that behavior? I would expect that the user enters the labeling mode once the button is clicked and exits it when the button is clicked again.
>>>>>>>
>>>>>>> Philipp
>>>>>>>
>>>>>>> On 2020/03/27 13:31:20, Daniel Ebi <E....@fzi.de> wrote:
>>>>>>>> Hi there,
>>>>>>>>
>>>>>>>> I just wanted to let you know that I created the pull request for the time series labeling tool today (pull request #12).
>>>>>>>>
>>>>>>>> Over the last few days, I have been working on the main components of the time series labeling tool, and I have already implemented a working version.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Daniel Ebi
>>>>>>>> Sent: Tuesday, 24 March 2020 17:35
>>>>>>>> To: 'dev@streampipes.apache.org' <de...@streampipes.apache.org>
>>>>>>>> Subject: AW: New Contributer | Time Series Data Labeling Tool
>>>>>>>>
>>>>>>>> Hi Philipp, hi Johannes,
>>>>>>>>
>>>>>>>> thanks for your short guidance and the warm welcome.
>>>>>>>>
>>>>>>>> @Philipp: It all worked out well.
>>>>>>>>
>>>>>>>> @Johannes: I will look into it and check which components I can reuse.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>> ________________
>>>>>>>>
>>>>>>>> On 2020/03/24 16:43:33, Johannes Tex wrote:
>>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>> welcome :)
>>>>>>>>
>>>>>>>> As you mentioned, I have started to develop an image labeling tool.
>>>>>>>> I think some of the components can be reused, for example the "image-labels" component [1].
>>>>>>>> It is a component for selecting labels received from the backend (at the moment it is only mocked). You can also use the num pad to change labels; that's for the power users ;) Have a look, and we can discuss it if you have comments or questions.
>>>>>>>>
>>>>>>>> Johannes
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/incubator-streampipes/tree/b2e92c667d77e7b7e724112463bb25141881ffd6/ui/src/app/core-ui/image/components/image-labels
>>>>>>>>
>>>>>>>> ________________
>>>>>>>>
>>>>>>>> On 2020/03/24 16:17:16, Philipp Zehnder wrote:
>>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>> a warm welcome to the mailing list!
>>>>>>>>
>>>>>>>> You can start developing on the UI using the CLI tool in our installer [1] (currently in the dev branch). I'd suggest you use the lite version ('sp set-template lite') and run a local StreamPipes instance on your computer ('sp set-template lite').
>>>>>>>> The code for the UI is in the ui folder of [2]. Check out the repository and navigate to that folder.
>>>>>>>> Then run 'npm install' and 'npm start' (this might take a while). Afterwards you can navigate to 'localhost:8082' in your browser and start developing.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Philipp
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/incubator-streampipes-installer
>>>>>>>> [2] https://github.com/apache/incubator-streampipes
>>>>>>>>
>>>>>>>> _______________________
>>>>>>>>
>>>>>>>> On 2020/03/24 10:54:12, Daniel Ebi <Eb...@fzi.de> wrote:
>>>>>>>>> Hey there,
>>>>>>>>>
>>>>>>>>> my name is Daniel and I would like to contribute to the Apache StreamPipes project. Therefore I recently joined this dev mailing list.
>>>>>>>>>
>>>>>>>>> About me
>>>>>>>>> I am studying Information Engineering and Management in the Master's program, with a focus on Machine Learning and Artificial Intelligence, at the Karlsruhe Institute of Technology (KIT). Additionally, I work as a student assistant at the FZI in Karlsruhe in order to gain some practical experience in research. There I came into contact with StreamPipes and became interested in the project. In the course of my studies I have already gathered considerable experience in the Machine Learning and Data Science domain.
>>>>>>>>>
>>>>>>>>> What are my current ideas for Apache StreamPipes?
>>>>>>>>> Specifically, I would like to develop a labeling feature for Apache StreamPipes that allows users to label time series data easily and intuitively (like the image labeling tool of Johannes [1]). My idea is to extend the line chart of the data explorer in such a way that one or more labels can be added by simply marking the data points of interest. The visualization of the data thus serves as the basis for labeling, and the labeled data can afterwards be shown with colored overlays as well. The long-term goal is to ensure that the data labeled in this way can be used as a starting point for machine learning procedures.
>>>>>>>>>
>>>>>>>>> I would be very happy to hear your suggestions regarding my plans. Thanks in advance.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Daniel Ebi
>>>>>>>>>
>>>>>>>>> ________________________________
>>>>>>>>> [1] https://issues.apache.org/jira/projects/STREAMPIPES/issues/STREAMPIPES-78?filter=allopenissues