You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Mohammed Ait Haddou <mo...@gmail.com> on 2020/06/05 00:48:43 UTC

count total & percentage in ksqldb

I have a stream with an event_type field, possible values are (view, cart,
purchase).

CREATE STREAM events(event_type STRING)
WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');

I want to count the total number of all events and the number of events for
each event_type into a single table.

Re: count total & percentage in ksqldb

Posted by Ricardo Ferreira <ri...@riferrei.com>.
Mohammed,

The `events_per_type` table example that I provided before produces the 
same output shown below; except of course for the percentage which can 
be easily computed as well.

Thanks,

-- Ricardo

On 6/5/20 10:46 AM, Mohammed Ait Haddou wrote:
> Thanks a lot for the reply.
> But, I want total number of all events and the count for each event 
> type into a single table as I mentioned.
> Similar results :
> +------------+-------+------------+
> | Event_type | Count | Percentage |
> +------------+-------+------------+
> |    view    | 6     | 0.5        |
> +------------+-------+------------+
> |    cart    | 3     | 0.25       |
> +------------+-------+------------+
> | purchase   | 3     | 0.25       |
> +------------+-------+------------+
>
> Percentage or total is the same thing for me :)
> Thank you
>
> On Fri, Jun 5, 2020 at 3:13 PM Ricardo Ferreira <riferrei@riferrei.com 
> <ma...@riferrei.com>> wrote:
>
>     Mohammed,
>
>     The first thing you need to do is making sure to set a key for
>     this stream. This can be accomplished either in the creation
>     statement or creating a new stream and using the *PARTITION BY*
>     clause. For the sake of simplicity; the example below uses the
>     creation statement strategy:
>
>     ```
>
>     CREATE STREAM events(event_type STRING)
>
>        WITH (KAFKA_TOPIC='events', *KEY='event_type'*,
>     VALUE_FORMAT='DELIMITED');
>
>     ```
>
>     This will make sure that each record in the topic will have a key
>     associated. Then, you will need to create two tables:
>
>     ### One to aggregate the sum of all event types
>
>     ```
>
>     CREATE TABLE number_of_events AS
>
>        SELECT COUNT(event_type) AS number_of_events
>
>        FROM EVENTS GROUP BY 'number_of_events';
>
>     ```
>
>     That you can easily query the result using a pull query:
>
>     ```
>
>     SELECT number_of_events
>
>     FROM NUMBER_OF_EVENTS
>
>     WHERE ROWKEY = 'number_of_events';
>
>     ```
>
>     ### One to aggregate the sum of all event types per event
>
>     ```
>
>     CREATE TABLE events_per_type AS
>
>        SELECT event_type as event_type, COUNT(event_type) AS total
>
>        FROM EVENTS GROUP BY event_type;
>
>     ```
>
>     That you can query using a push query:
>
>     ```
>
>     SELECT * FROM events_per_type EMIT CHANGES;
>
>     ```
>
>     Thanks,
>
>     -- Ricardo
>
>     On 6/4/20 8:48 PM, Mohammed Ait Haddou wrote:
>>     I have a stream with an event_type field, possible values are (view, cart,
>>     purchase).
>>
>>     CREATE STREAM events(event_type STRING)
>>     WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');
>>
>>     I want to count the total number of all events and the number of events for
>>     each event_type into a single table.
>>
>
>
> -- 
> Mohammed Ait Haddou
> Linkedin.com/in/medait <http://linkedin.com/in/medait>
> +212.697.93.71.89 <tel:+212697937189>

Re: count total & percentage in ksqldb

Posted by Mohammed Ait Haddou <mo...@gmail.com>.
Thanks a lot for the reply.
But, I want total number of all events and the count for each event type
into a single table as I mentioned.
Similar results :

+------------+-------+------------+
| Event_type | Count | Percentage |
+------------+-------+------------+
|    view    | 6     | 0.5        |
+------------+-------+------------+
|    cart    | 3     | 0.25       |
+------------+-------+------------+
| purchase   | 3     | 0.25       |
+------------+-------+------------+

Percentage or total is the same thing for me :)

Thank you


On Fri, Jun 5, 2020 at 3:13 PM Ricardo Ferreira <ri...@riferrei.com>
wrote:

> Mohammed,
>
> The first thing you need to do is making sure to set a key for this
> stream. This can be accomplished either in the creation statement or
> creating a new stream and using the *PARTITION BY* clause. For the sake
> of simplicity; the example below uses the creation statement strategy:
>
> ```
>
> CREATE STREAM events(event_type STRING)
>
>    WITH (KAFKA_TOPIC='events', *KEY='event_type'*,
> VALUE_FORMAT='DELIMITED');
>
> ```
>
> This will make sure that each record in the topic will have a key
> associated. Then, you will need to create two tables:
>
> ### One to aggregate the sum of all event types
>
> ```
>
> CREATE TABLE number_of_events AS
>
>    SELECT COUNT(event_type) AS number_of_events
>
>    FROM EVENTS GROUP BY 'number_of_events';
>
> ```
>
> That you can easily query the result using a pull query:
>
> ```
>
> SELECT number_of_events
>
> FROM NUMBER_OF_EVENTS
>
> WHERE ROWKEY = 'number_of_events';
>
> ```
>
> ### One to aggregate the sum of all event types per event
>
> ```
>
> CREATE TABLE events_per_type AS
>
>    SELECT event_type as event_type, COUNT(event_type) AS total
>
>    FROM EVENTS GROUP BY event_type;
>
> ```
>
> That you can query using a push query:
>
> ```
>
> SELECT * FROM events_per_type EMIT CHANGES;
>
> ```
>
> Thanks,
>
> -- Ricardo
> On 6/4/20 8:48 PM, Mohammed Ait Haddou wrote:
>
> I have a stream with an event_type field, possible values are (view, cart,
> purchase).
>
> CREATE STREAM events(event_type STRING)
> WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');
>
> I want to count the total number of all events and the number of events for
> each event_type into a single table.
>
>
>

-- 
Mohammed Ait Haddou
Linkedin.com/in/medait <http://linkedin.com/in/medait>
+212.697.93.71.89 <+212697937189>

Re: count total & percentage in ksqldb

Posted by Ricardo Ferreira <ri...@riferrei.com>.
Mohammed,

The first thing you need to do is making sure to set a key for this 
stream. This can be accomplished either in the creation statement or 
creating a new stream and using the *PARTITION BY* clause. For the sake 
of simplicity; the example below uses the creation statement strategy:

```

CREATE STREAM events(event_type STRING)

    WITH (KAFKA_TOPIC='events', *KEY='event_type'*, 
VALUE_FORMAT='DELIMITED');

```

This will make sure that each record in the topic will have a key 
associated. Then, you will need to create two tables:

### One to aggregate the sum of all event types

```

CREATE TABLE number_of_events AS

    SELECT COUNT(event_type) AS number_of_events

    FROM EVENTS GROUP BY 'number_of_events';

```

That you can easily query the result using a pull query:

```

SELECT number_of_events

FROM NUMBER_OF_EVENTS

WHERE ROWKEY = 'number_of_events';

```

### One to aggregate the sum of all event types per event

```

CREATE TABLE events_per_type AS

    SELECT event_type as event_type, COUNT(event_type) AS total

    FROM EVENTS GROUP BY event_type;

```

That you can query using a push query:

```

SELECT * FROM events_per_type EMIT CHANGES;

```

Thanks,

-- Ricardo

On 6/4/20 8:48 PM, Mohammed Ait Haddou wrote:
> I have a stream with an event_type field, possible values are (view, cart,
> purchase).
>
> CREATE STREAM events(event_type STRING)
> WITH (KAFKA_TOPIC='events', VALUE_FORMAT='DELIMITED');
>
> I want to count the total number of all events and the number of events for
> each event_type into a single table.
>