You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Mohammed Ait Haddou <mo...@gmail.com> on 2020/06/05 00:48:43 UTC
count total & percentage in ksqldb
I have a stream with an event_type field, possible values are (view, cart,
purchase).
CREATE STREAM events(event_type STRING)
WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');
I want to count the total number of all events and the number of events for
each event_type into a single table.
Re: count total & percentage in ksqldb
Posted by Ricardo Ferreira <ri...@riferrei.com>.
Mohammed,
The `events_per_type` table example that I provided before produces the
same output shown below; except of course for the percentage which can
be easily computed as well.
Thanks,
-- Ricardo
On 6/5/20 10:46 AM, Mohammed Ait Haddou wrote:
> Thanks a lot for the reply.
> But, I want total number of all events and the count for each event
> type into a single table as I mentioned.
> Similar results :
> +------------+-------+------------+
> | Event_type | Count | Percentage |
> +------------+-------+------------+
> | view | 6 | 0.5 |
> +------------+-------+------------+
> | cart | 3 | 0.25 |
> +------------+-------+------------+
> | purchase | 3 | 0.25 |
> +------------+-------+------------+
>
> Percentage or total is the same thing for me :)
> Thank you
>
> On Fri, Jun 5, 2020 at 3:13 PM Ricardo Ferreira <riferrei@riferrei.com
> <ma...@riferrei.com>> wrote:
>
> Mohammed,
>
> The first thing you need to do is making sure to set a key for
> this stream. This can be accomplished either in the creation
> statement or creating a new stream and using the *PARTITION BY*
> clause. For the sake of simplicity; the example below uses the
> creation statement strategy:
>
> ```
>
> CREATE STREAM events(event_type STRING)
>
> WITH (KAFKA_TOPIC='events', *KEY='event_type'*,
> VALUE_FORMAT='DELIMITED');
>
> ```
>
> This will make sure that each record in the topic will have a key
> associated. Then, you will need to create two tables:
>
> ### One to aggregate the sum of all event types
>
> ```
>
> CREATE TABLE number_of_events AS
>
> SELECT COUNT(event_type) AS number_of_events
>
> FROM EVENTS GROUP BY 'number_of_events';
>
> ```
>
> That you can easily query the result using a pull query:
>
> ```
>
> SELECT number_of_events
>
> FROM NUMBER_OF_EVENTS
>
> WHERE ROWKEY = 'number_of_events';
>
> ```
>
> ### One to aggregate the sum of all event types per event
>
> ```
>
> CREATE TABLE events_per_type AS
>
> SELECT event_type as event_type, COUNT(event_type) AS total
>
> FROM EVENTS GROUP BY event_type;
>
> ```
>
> That you can query using a push query:
>
> ```
>
> SELECT * FROM events_per_type EMIT CHANGES;
>
> ```
>
> Thanks,
>
> -- Ricardo
>
> On 6/4/20 8:48 PM, Mohammed Ait Haddou wrote:
>> I have a stream with an event_type field, possible values are (view, cart,
>> purchase).
>>
>> CREATE STREAM events(event_type STRING)
>> WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');
>>
>> I want to count the total number of all events and the number of events for
>> each event_type into a single table.
>>
>
>
> --
> Mohammed Ait Haddou
> Linkedin.com/in/medait <http://linkedin.com/in/medait>
> +212.697.93.71.89 <tel:+212697937189>
Re: count total & percentage in ksqldb
Posted by Mohammed Ait Haddou <mo...@gmail.com>.
Thanks a lot for the reply.
But, I want total number of all events and the count for each event type
into a single table as I mentioned.
Similar results :
+------------+-------+------------+
| Event_type | Count | Percentage |
+------------+-------+------------+
| view | 6 | 0.5 |
+------------+-------+------------+
| cart | 3 | 0.25 |
+------------+-------+------------+
| purchase | 3 | 0.25 |
+------------+-------+------------+
Percentage or total is the same thing for me :)
Thank you
On Fri, Jun 5, 2020 at 3:13 PM Ricardo Ferreira <ri...@riferrei.com>
wrote:
> Mohammed,
>
> The first thing you need to do is making sure to set a key for this
> stream. This can be accomplished either in the creation statement or
> creating a new stream and using the *PARTITION BY* clause. For the sake
> of simplicity; the example below uses the creation statement strategy:
>
> ```
>
> CREATE STREAM events(event_type STRING)
>
> WITH (KAFKA_TOPIC='events', *KEY='event_type'*,
> VALUE_FORMAT='DELIMITED');
>
> ```
>
> This will make sure that each record in the topic will have a key
> associated. Then, you will need to create two tables:
>
> ### One to aggregate the sum of all event types
>
> ```
>
> CREATE TABLE number_of_events AS
>
> SELECT COUNT(event_type) AS number_of_events
>
> FROM EVENTS GROUP BY 'number_of_events';
>
> ```
>
> That you can easily query the result using a pull query:
>
> ```
>
> SELECT number_of_events
>
> FROM NUMBER_OF_EVENTS
>
> WHERE ROWKEY = 'number_of_events';
>
> ```
>
> ### One to aggregate the sum of all event types per event
>
> ```
>
> CREATE TABLE events_per_type AS
>
> SELECT event_type as event_type, COUNT(event_type) AS total
>
> FROM EVENTS GROUP BY event_type;
>
> ```
>
> That you can query using a push query:
>
> ```
>
> SELECT * FROM events_per_type EMIT CHANGES;
>
> ```
>
> Thanks,
>
> -- Ricardo
> On 6/4/20 8:48 PM, Mohammed Ait Haddou wrote:
>
> I have a stream with an event_type field, possible values are (view, cart,
> purchase).
>
> CREATE STREAM events(event_type STRING)
> WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');
>
> I want to count the total number of all events and the number of events for
> each event_type into a single table.
>
>
>
--
Mohammed Ait Haddou
Linkedin.com/in/medait <http://linkedin.com/in/medait>
+212.697.93.71.89 <+212697937189>
Re: count total & percentage in ksqldb
Posted by Ricardo Ferreira <ri...@riferrei.com>.
Mohammed,
The first thing you need to do is making sure to set a key for this
stream. This can be accomplished either in the creation statement or
creating a new stream and using the *PARTITION BY* clause. For the sake
of simplicity; the example below uses the creation statement strategy:
```
CREATE STREAM events(event_type STRING)
WITH (KAFKA_TOPIC='events', *KEY='event_type'*,
VALUE_FORMAT='DELIMITED');
```
This will make sure that each record in the topic will have a key
associated. Then, you will need to create two tables:
### One to aggregate the sum of all event types
```
CREATE TABLE number_of_events AS
SELECT COUNT(event_type) AS number_of_events
FROM EVENTS GROUP BY 'number_of_events';
```
That you can easily query the result using a pull query:
```
SELECT number_of_events
FROM NUMBER_OF_EVENTS
WHERE ROWKEY = 'number_of_events';
```
### One to aggregate the sum of all event types per event
```
CREATE TABLE events_per_type AS
SELECT event_type as event_type, COUNT(event_type) AS total
FROM EVENTS GROUP BY event_type;
```
That you can query using a push query:
```
SELECT * FROM events_per_type EMIT CHANGES;
```
Thanks,
-- Ricardo
On 6/4/20 8:48 PM, Mohammed Ait Haddou wrote:
> I have a stream with an event_type field, possible values are (view, cart,
> purchase).
>
> CREATE STREAM events(event_type STRING)
> WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');
>
> I want to count the total number of all events and the number of events for
> each event_type into a single table.
>