Posted to user@flink.apache.org by Niels Basjes <Ni...@basjes.nl> on 2020/02/29 15:11:32 UTC

Giving useful names to the SQL steps/operators.

Hi,

I'm playing around with the streaming SQL engine in combination with the
UDF I wrote ( https://yauaa.basjes.nl/UDF-ApacheFlinkTable.html ).
I generated an SQL statement to extract all possible fields of my UDF (i.e.
many fields), and I found that the names of the steps in the logging and
the UI become very, very large.

In fact they become so large that it is hard to read what the step is
actually doing.

As an example, I get log messages like this (this is a single log line; I
added newlines for readability in this email).

2020-02-29 14:48:13,148 WARN org.apache.flink.metrics.MetricGroup - The
operator name
select: (EventTime, useragent,
ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceClass') AS DeviceClass,
ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceName') AS DeviceName,
ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceBrand') AS DeviceBrand,
ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceCpu') AS DeviceCpu,
ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceCpuBits') AS DeviceCpuBits,
ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceVersion') AS DeviceVersion,
ITEM(ParseUserAgent(useragent), _UTF-16LE'OperatingSystemClass') AS
OperatingSystemClass,
ITEM(ParseUserAgent(useragent), _UTF-16LE'OperatingSystemName') AS
OperatingSystemName,
ITEM(ParseUserAgent(useragent), _UTF-16LE'OperatingSystemNameVersion') AS
OperatingSystemNameVersion,
ITEM(ParseUserAgent(useragent), _UTF-16LE'LayoutEngineClass') AS
LayoutEngineClass,
ITEM(ParseUserAgent(useragent), _UTF-16LE'LayoutEngineName') AS
LayoutEngineName,
ITEM(ParseUserAgent(useragent), _UTF-16LE'LayoutEngineVersionMajor') AS
LayoutEngineVersionMajor,
ITEM(ParseUserAgent(useragent), _UTF-16LE'LayoutEngineNameVersionMajor') AS
LayoutEngineNameVersionMajor,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentClass') AS AgentClass,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentName') AS AgentName,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentVersionMajor') AS
AgentVersionMajor,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentNameVersionMajor') AS
AgentNameVersionMajor,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentLanguage') AS AgentLanguage,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentLanguageCode') AS
AgentLanguageCode,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentInformationEmail') AS
AgentInformationEmail,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentInformationUrl') AS
AgentInformationUrl,
ITEM(ParseUserAgent(useragent), _UTF-16LE'AgentSecurity') AS AgentSecurity,
ITEM(ParseUserAgent(useragent), _UTF-16LE'WebviewAppName') AS WebviewAppName,
ITEM(ParseUserAgent(useragent), _UTF-16LE'WebviewAppNameVersionMajor') AS
WebviewAppNameVersionMajor,
ITEM(ParseUserAgent(useragent), _UTF-16LE'Anonymized') AS Anonymized,
ITEM(ParseUserAgent(useragent), _UTF-16LE'HackerAttackVector') AS
HackerAttackVector,
ITEM(ParseUserAgent(useragent), _UTF-16LE'HackerToolkit') AS HackerToolkit,
ITEM(ParseUserAgent(useragent), _UTF-16LE'KoboAffiliate') AS KoboAffiliate,
ITEM(ParseUserAgent(useragent), _UTF-16LE'KoboPlatformId') AS KoboPlatformId,
ITEM(ParseUserAgent(useragent), _UTF-16LE'IECompatibilityNameVersionMajor')
AS IECompatibilityNameVersionMajor,
ITEM(ParseUserAgent(useragent), _UTF-16LE'Carrier') AS Carrier,
ITEM(ParseUserAgent(useragent), _UTF-16LE'NetworkType') AS NetworkType,
clicks, visitors)
exceeded the 80 characters length limit and was truncated.


As you can see this impacts not only the names of the steps but also the
metrics.
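
To make the effect on the metric names concrete, here is a minimal Java
sketch of what the warning above describes. It is only an illustration and
not Flink's actual code: the class OperatorNameSketch and the helper
truncate are hypothetical, and the limit of 80 is taken from the text of the
warning. The generated name is a shortened version of the one in the log.

```java
public final class OperatorNameSketch {

    // Assumption: the limit quoted in the warning message above.
    static final int MAX_LENGTH = 80;

    // Hypothetical helper: keeps at most maxLength characters of the name.
    static String truncate(String name, int maxLength) {
        return name.length() <= maxLength ? name : name.substring(0, maxLength);
    }

    public static void main(String[] args) {
        // Shortened version of the operator name generated from the SQL.
        String generated = "select: (EventTime, useragent, "
                + "ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceClass') AS DeviceClass, "
                + "ITEM(ParseUserAgent(useragent), _UTF-16LE'DeviceName') AS DeviceName, "
                + "clicks, visitors)";

        // Everything after the first MAX_LENGTH characters is lost, so the
        // name that ends up in the metrics says little about the step.
        String metricName = truncate(generated, MAX_LENGTH);
        System.out.println(metricName);
    }
}
```

With a cut-off like this, only the start of the select list survives, so two
different steps that begin with the same columns could end up with the same
truncated name.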

My question is whether it is possible to specify a name for the step,
similar to what I can do in the Java code.

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes

Re: Giving useful names to the SQL steps/operators.

Posted by Niels Basjes <Ni...@basjes.nl>.
Thanks.

On Sat, Feb 29, 2020 at 4:20 PM Yuval Itzchakov <yu...@gmail.com> wrote:

>
> Unfortunately, it isn't possible. You can't assign names to steps as you
> can with ordinary Java/Scala functions.

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes

Re: Giving useful names to the SQL steps/operators.

Posted by Yuval Itzchakov <yu...@gmail.com>.
Unfortunately, it isn't possible. You can't assign names to steps as you
can with ordinary Java/Scala functions.
