Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2020/02/05 00:46:23 UTC

Re: More publicly documenting the options under spark.sql.*

FYI, a PR was opened at https://github.com/apache/spark/pull/27459. Thanks,
Nicholas.
I hope you all can find some time to take a look.

On Tue, Jan 28, 2020 at 8:15 AM, Nicholas Chammas <ni...@gmail.com>
wrote:

> I am! Thanks for the reference.
>
> On Thu, Jan 16, 2020 at 9:53 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Nicholas, are you interested in taking a stab at this? You could refer to
>> https://github.com/apache/spark/commit/60472dbfd97acfd6c4420a13f9b32bc9d84219f3
>>
>> On Fri, Jan 17, 2020 at 8:48 AM, Takeshi Yamamuro <li...@gmail.com> wrote:
>>
>>> The idea looks nice. I think web documentation always helps end users.
>>>
>>> Bests,
>>> Takeshi
>>>
>>> On Fri, Jan 17, 2020 at 4:04 AM Shixiong(Ryan) Zhu <
>>> shixiong@databricks.com> wrote:
>>>
>>>> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL
>>>> configurations. Should be pretty easy to automatically generate a SQL
>>>> configuration page.
>>>>
>>>> Best Regards,
>>>> Ryan
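
A minimal sketch of that approach, assuming a running local SparkSession;
the output file name and the Markdown layout here are illustrative choices
only, not taken from this thread or from the eventual PR:

    import java.io.PrintWriter
    import org.apache.spark.sql.SparkSession

    object GenerateSqlConfigDocs {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("sql-config-docs")
          .getOrCreate()

        // "SET -v" yields one row per non-internal SQL configuration,
        // with columns (key, value, meaning).
        val rows = spark.sql("SET -v").collect()

        val body = rows.map { r =>
          s"| ${r.getString(0)} | ${r.getString(1)} | ${r.getString(2)} |"
        }.mkString("\n")

        val page =
          "| Property Name | Default | Meaning |\n" +
          "| --- | --- | --- |\n" +
          body

        // Write the table out as a Markdown page.
        new PrintWriter("sql-configs.md") { write(page); close() }
        spark.stop()
      }
    }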
>>>>
>>>>
>>>> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think automatically creating a configuration page isn't a bad idea,
>>>>> because we deprecate and then remove configurations that are not created
>>>>> via .internal() in SQLConf anyway.
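
For context, a minimal sketch of the SQLConf definition pattern in question;
the config name and doc text are hypothetical, though the builder calls
follow the pattern used in SQLConf.scala:

    // Hypothetical entry inside SQLConf.scala, where buildConf() is in
    // scope. The .internal() call marks the config as internal, which
    // would keep it off an auto-generated public page.
    val HYPOTHETICAL_FLAG = buildConf("spark.sql.hypothetical.flag")
      .internal()
      .doc("Illustrative internal flag, excluded from public docs.")
      .booleanConf
      .createWithDefault(false)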
>>>>>
>>>>> I already tried this kind of automatic generation from the code for the
>>>>> SQL built-in functions, and I'm pretty sure we can do the same thing for
>>>>> configurations as well.
>>>>>
>>>>> We could perhaps mimic what Hadoop does:
>>>>> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>>>>>
>>>>> On Wed, 15 Jan 2020, 10:46 Sean Owen, <sr...@gmail.com> wrote:
>>>>>
>>>>>> Some of it is intentionally undocumented, as far as I know: experimental
>>>>>> options that may change, legacy options, or safety-valve flags.
>>>>>> Certainly anything that's marked as an internal conf. (That does raise
>>>>>> the question of who it's for, if you have to read the source to find it.)
>>>>>>
>>>>>> I don't know if we need to overhaul the conf system, but there may
>>>>>> indeed be some confs that could legitimately be documented. I don't
>>>>>> know which.
>>>>>>
>>>>>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>>>>> <ni...@gmail.com> wrote:
>>>>>> >
>>>>>> > I filed SPARK-30510 thinking that we had forgotten to document an
>>>>>> option, but it turns out that there's a whole bunch of stuff under
>>>>>> SQLConf.scala that has no public documentation under
>>>>>> http://spark.apache.org/docs.
>>>>>> >
>>>>>> > Would it be appropriate to somehow automatically generate a
>>>>>> documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?
>>>>>> >
>>>>>> > Another thought that comes to mind is moving the config definitions
>>>>>> out of Scala and into a data format like YAML or JSON, and then sourcing
>>>>>> that both for SQLConf and for whatever documentation page we want to
>>>>>> generate (a rough sketch follows below). What do you think of that idea?
>>>>>> >
>>>>>> > Nick
>>>>>> >
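
A rough, purely illustrative sketch of that data-driven idea (for a Scala
REPL); the schema and names are hypothetical and do not reflect any decision
made in this thread:

    // Hypothetical: config definitions as plain data that both SQLConf
    // and a docs generator could consume from the same source file.
    case class ConfEntry(
        key: String,
        defaultValue: String,
        doc: String,
        internal: Boolean)

    val entries: Seq[ConfEntry] = Seq(
      ConfEntry(
        key = "spark.sql.hypothetical.flag",
        defaultValue = "false",
        doc = "Illustrative entry only.",
        internal = false))

    // A public docs page would keep only the non-internal entries.
    val documented: Seq[ConfEntry] = entries.filterNot(_.internal)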
>>>>>>
>>>>>>
>>>>>>
>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>>

Re: More publicly documenting the options under spark.sql.*

Posted by Hyukjin Kwon <gu...@gmail.com>.
The PR was merged. Now all external SQL configurations will be
automatically documented.
