Posted to dev@mxnet.apache.org by Pedro Larroy <pe...@gmail.com> on 2019/06/11 17:18:22 UTC

Re: Does internal quality matters to users?

Thanks for the good discussion.

I wasn't referring specifically to our conversations on GitHub about
the refactors, but it's nice of you to bring them up. It's OK to
disagree on small things; hopefully we can align on the big ones.

I understand that for TVM you might have different constraints on how
dynamic the graph needs to be for mutation, quick experimentation and
research, but please try to understand my perspective: I come from a
software engineering background and help maintain MXNet for thousands
of users and teams running it in production. Let's also consider how
many issues we have open and our bandwidth to deal with additional
complexity.

To your pros and cons I would like to add, and emphasize, that the
current heavy use of dynamic attributes in the graph via dmlc::any has
three very negative consequences, at least for MXNet:

1 - It makes the data structures that use dmlc::any almost impossible
to debug, as their contents are just binary.
2 - It makes the code more difficult to understand, because a data
structure no longer declares the data fields it uses or its
responsibilities. We are basically shoving all kinds of stuff into
dmlc::any.
3 - As a further consequence, you get no help from the IDE to navigate
and refactor.
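
To make the contrast concrete, here is a minimal sketch, not MXNet
code. It assumes C++17 std::any as a stand-in for dmlc::any, and the
names Shape, DynamicGraph, TypedGraph and GetShapes are hypothetical:

```cpp
#include <any>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor shape; not the MXNet type.
using Shape = std::vector<std::int64_t>;

// Style A: attributes live in a string-keyed, type-erased map.
// Nothing here tells a reader (or an IDE) which keys exist or what they hold.
struct DynamicGraph {
  std::map<std::string, std::any> attrs;  // std::any stands in for dmlc::any
};

// Style B: the always-needed attribute is a declared, typed member.
struct TypedGraph {
  std::vector<Shape> shapes;
};

// Reads the shapes from the dynamic graph. A misspelled key throws
// std::out_of_range and a wrong stored type throws std::bad_any_cast,
// both only at run time.
inline const std::vector<Shape>& GetShapes(const DynamicGraph& g) {
  return std::any_cast<const std::vector<Shape>&>(g.attrs.at("shape"));
}

// Same query on the typed graph: the member and its type are checked
// by the compiler and show up by name in a debugger.
inline const std::vector<Shape>& GetShapes(const TypedGraph& g) {
  return g.shapes;
}
```

In the typed variant the compiler rejects a misspelled member, while
in the dynamic variant the same mistake only surfaces when the code
path actually runs.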

I would really like you to give me solutions to these problems, or at
least acknowledge them and tell me why we have to pay those costs,
instead of just dismissing them as engineering taste.

The more I work with MXNet, the more I wish that debugging, reading
and refactoring the code were easier, and that those fields were
declared and typed in their corresponding data structures. For MXNet,
I don't think this would affect the Python bindings at all, since they
go through the typed C API anyway.

Maybe we can get some inspiration from LLVM, as they have bindings for
many languages to work with the AST and have very clean APIs for the
compilation steps. It's also OK to start with a dynamic codebase for
research and experimentation and then "cure" it into a solid,
maintainable one with more types and more robustness...

Pedro.





On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <tq...@cs.washington.edu> wrote:
>
> Good infrastructure design goes a long way and has a profound impact on the project itself. That is why we always want to rethink whether an interface can be done better, and to think about the next possible infrastructure to make things better. Refactoring is certainly part of that.
>
> There are usually two types of refactoring we refer to:
> 1) Major design changes, in terms of class relations and data structures (e.g. numpy support, adding compilation for new hardware)
> 2) Specific choices of API and programming style (more types or a type-erased program)
>
> (1) affects the long-term support of the project, introduces new features where necessary, and needs a lot of thought. I believe the general IR, compilation and numpy support belong to that category.
>
> I would particularly like to talk about (2).
> Because there is no single correct answer in software engineering, different developers may hold different views on a given problem.
> Some of these views come down to developers' taste. A change could favor one aspect of the project, but not necessarily another.
> Refactoring with respect to these sometimes requires a more thoughtful conversation and a reasonable compromise.
>
> For example, we had a recent discussion about whether to introduce more typing into the code base, to the extent that the base data structure could be templatized.
> - The pros of this approach:
>     - It introduces more typing and compile-time error messages (instead of runtime checking), which could help developers find problems earlier.
> - The cons of this approach:
>    - Having a template in the base data structure causes ABI problems (code generated by DLL A vs. DLL B) and may cause issues in the future.
>    - Templates sometimes confuse developers.
>    - For serialization, it is hard to anticipate all kinds of classes, and it is easier to have one class (any) that handles polymorphism.
>    - Because most frontends (e.g. Python) are dynamic, it is easier to interface them with a type-erased API.
>
> As we can see, there are pros and cons to bringing in more typing, and there is no single answer.
> One good example of a nice infrastructure design trade-off is DLPack https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h
> This is a base data structure adopted by MXNet, PyTorch, Chainer, and many other frameworks.
> It is a type-erased data structure that erases the data type and the memory allocator from the data structure, and it is designed to exchange tensors (coming from different memory allocators) across DLL boundaries.
> As you can see, this is a good example of a type-erased data structure.
>
> When we have this kind of question, it is important to have a good conversation. Sometimes we have to make tradeoffs rather than bend everyone else to our will. This is what open source is about.
> I would also like to give some examples of conversations and how design decisions get resolved. They come from the TVM community's recent discussion about VM design.
> I link the GitHub issue conversations directly here for the sake of clarity (note that all the conversations are also mirrored to dev@tvm).
> The background is that the community wants to bring in a virtual machine that can execute dynamic operations more effectively.
>
> - The initial proposal, made by one of the committers, gave a detailed design based on a stack VM https://github.com/dmlc/tvm/issues/2810
>    - As you can see, there was quite some discussion about whether to use a different design, in this case a register-based version.
>    - The conversation evolved, and while the community members disagreed on some points, they also agreed with each other on particular tradeoffs.
> - After some discussion, the committers brought a tradeoff design that tries to consolidate the needs of both sides, and this is the final solution being adopted  https://github.com/dmlc/tvm/issues/2915
> I would particularly like to highlight that: 1) there are disagreements in the development process; 2) developers work together to understand each other's needs and then reach consensus on a perhaps better design.
>
> There are two other conversations between Pedro and myself, which took place during his contributions.
> - https://github.com/dmlc/tvm/pull/3037 In this case, I raised a concern about API consistency, Pedro gave a reason why he thought his approach was the better idea, I agreed, and we merged the PR.
> - https://github.com/dmlc/tvm/pull/3108 In this other case, there were technical reasons for going either way in the case of MXNet. We listed the pros and cons of both sides and had a constructive conversation. Eventually, I decided not to merge the PR after weighing all the cases.
>
> I believe both were useful conversations, and while Pedro and I sometimes disagree, we do agree in many other cases. The most crucial part is having a constructive conversation.
> To summarize, I do think refactoring and making things better is certainly important for the project. And I do believe it is crucial to think through all the technical consequences and make good tradeoff decisions.
> Sometimes the decision may not make every developer entirely happy, but a good technical compromise can move the project forward and help the community in general.
>
> Tianqi
>
> On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <is...@apache.org> wrote:
>>
>>
>>
>> On May 31, 2019 at 14:13:30 CEST, Pedro Larroy <pe...@gmail.com> wrote:
>> > I think Martin does a very good job explaining why refactoring,
>> > reducing developer frustration and internal improvement is a crucial
>> > productivity multiplier, which includes lower cost to ship features,
>> > fewer bugs and less time spent debugging.
>>
>> There's one aspect that's special for open source projects: if a project wants to survive long term, it should make it easy for people to get started working on the project. In my experience, refactoring and cleanup play an important role in that. So thanks also for making recruiting of new contributors better.
>>
>> Isabel
>> --
>> This message was sent with K-9 from a mobile device with swipe to type enabled. I'm sorry for any embarrassing typos that slipped through.

Re: Does internal quality matters to users?

Posted by Tianqi Chen <tq...@cs.washington.edu>.
We thought very carefully when introducing type erasure, including
about the concerns you raised, and nevertheless made the decision that
resulted in the current design, which is intended to strike a balance
between the need for typing and the need for dynamism.
Type erasure brings the benefit of pluggable attributes (you don't have
to enumerate them beforehand) and, as a result, the pluggable operator
and pass optimization system.
The fact that it is map<str, any>, where any can be vector<T>, reflects
the need to use typed data structures such as vector<T> when necessary,
while only having to fetch them once.
Note that we did not go as far as vector<any>, so that most of the
processing can be typed.
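
As a rough illustration of the "fetch once, then stay typed" pattern,
here is a minimal sketch. It assumes C++17 std::any in place of
dmlc::any, and Shape, AttrMap and NumElements are hypothetical names
rather than anything from the codebase:

```cpp
#include <any>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Shape = std::vector<std::int64_t>;          // hypothetical shape type
using AttrMap = std::map<std::string, std::any>;  // std::any stands in for dmlc::any

// Counts the elements of every tensor. The only type-erased step is the
// single any_cast below; the loops afterwards work on ordinary typed vectors.
inline std::vector<std::int64_t> NumElements(const AttrMap& attrs) {
  const auto& shapes =
      std::any_cast<const std::vector<Shape>&>(attrs.at("shape"));
  std::vector<std::int64_t> sizes;
  sizes.reserve(shapes.size());
  for (const Shape& s : shapes) {
    std::int64_t n = 1;
    for (std::int64_t dim : s) n *= dim;
    sizes.push_back(n);
  }
  return sizes;
}
```

Had the map held vector<any> instead, every element access inside the
loop would need its own cast.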

The advantage of map<str, any> is that it enables uniform processing
of all attributes. You can find the same design in cases like DataFrame.

To inspect a dynamic type, you can likely create an auxiliary function
to do so, and many of these are still one-liners. Or
just use ```DLOG(INFO) << PrintInfo(graph, fields)```
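
Such a helper could look roughly like the sketch below. This is an
assumption about its shape, not an existing function: it takes a plain
attribute map rather than a graph, uses C++17 std::any in place of
dmlc::any, and only knows the two types it explicitly lists:

```cpp
#include <any>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <typeinfo>
#include <vector>

using Shape = std::vector<std::int64_t>;          // hypothetical shape type
using AttrMap = std::map<std::string, std::any>;  // std::any stands in for dmlc::any

// Hypothetical helper: renders the named fields as "key=value;" text.
// Types it does not know are reported as opaque, with the
// implementation-defined name from typeid.
inline std::string PrintInfo(const AttrMap& attrs,
                             const std::vector<std::string>& fields) {
  std::ostringstream os;
  for (const std::string& key : fields) {
    os << key << "=";
    auto it = attrs.find(key);
    if (it == attrs.end()) {
      os << "<missing>";
    } else if (const auto* shapes =
                   std::any_cast<std::vector<Shape>>(&it->second)) {
      for (const Shape& s : *shapes) {
        os << "(";
        for (std::size_t i = 0; i < s.size(); ++i) os << (i ? "," : "") << s[i];
        os << ")";
      }
    } else if (const auto* text = std::any_cast<std::string>(&it->second)) {
      os << *text;
    } else {
      os << "<opaque " << it->second.type().name() << ">";
    }
    os << ";";
  }
  return os.str();
}
```

The pointer form of any_cast returns nullptr on a type mismatch, so
the helper never throws, but every type that should print legibly has
to be added to it by hand.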

On the other hand, having a separate attribute field means we need
separate logic for serialization and manipulation of these data
structures.
The general benefit of type erasure is the dynamism it brings and the
possibility of backend registration.
While I totally understand where you are coming from, we do need to
look at both sides of the spectrum again. Data structure design, like
any infra design, is about trade-offs.

To sum up, I certainly agree that some level of strong typing can be
helpful, and we did do that in our codebase.
I do think it is wrong to dismiss dynamic types as simply bad and "not
for production".

Dynamic types, when used properly, can simplify the code's assumptions
and make the system more extensible/pluggable and interoperable with
frontends.
All these features are important and necessary for a successful deep
learning system.
Admittedly they might make life a bit harder (e.g. not being able to
auto-complete in your IDE), but again this is a tradeoff we need to make.

I would recommend taking another look at DLPack
https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h as a
good example of how such a balance can be achieved.
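
To illustrate the idea only, here is a simplified sketch of a tensor
descriptor that erases the element type and the allocator. It is
loosely modeled on the concept behind DLPack; the names ErasedTensor,
TypeCode and ByteSize and the field layout are assumptions, not a copy
of dlpack.h:

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative only: the element kind is a run-time value, not a
// template parameter, so the struct layout is the same for every tensor.
enum class TypeCode : std::uint8_t { kInt, kUInt, kFloat };

struct ErasedTensor {
  void* data;                 // raw buffer; its owner and allocator are not recorded
  TypeCode code;              // element kind, chosen at run time
  std::uint8_t bits;          // element width in bits, e.g. 32 for float
  int ndim;                   // number of dimensions
  const std::int64_t* shape;  // ndim extents, borrowed from the producer
};

// Total payload size in bytes, computed without knowing the element
// type statically.
inline std::size_t ByteSize(const ErasedTensor& t) {
  std::size_t count = 1;
  for (int i = 0; i < t.ndim; ++i) count *= static_cast<std::size_t>(t.shape[i]);
  return count * static_cast<std::size_t>((t.bits + 7) / 8);
}
```

Because the struct contains no templates, two libraries built
separately can pass it across a DLL boundary as long as they agree on
this one layout.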

Tianqi


On Tue, Jun 11, 2019 at 1:26 PM Pedro Larroy <pe...@gmail.com>
wrote:

> Another data point. A contributor I'm working with is asking for
> access to the graph and the values of the NDArray (me too), to be able
> to reason more effectively about an enhancement to the operators:
>
> https://github.com/apache/incubator-mxnet/pull/15120
>
> I think gathering in the wiki the design constraints for the different
> activities, the design proposals that use the graph, and pain points
> like the one we are discussing would be a constructive way to move
> forward, unless you think everything is as good as it gets right now,
> which is what I understand from your responses.
>
> Re the shape vector: I know it is one line of code to get it, but you
> can't get the values with a debugger from other points in the code,
> and that compounds across the many other dynamic attributes that you
> can also fetch. It's a bit like dying from a thousand cuts. In the
> MXNet codebase we are paying a price for no benefit in flexibility, as
> we always use those attributes; hence my point that they should be
> typed in the graph. Please try to understand my point and help provide
> solutions, or at least a reason why it can't be improved, instead of
> saying there's no problem. There are still three problems, in order of
> importance: debuggability, clarity of definition of data structures,
> and navigability with an IDE, which breaks with an untyped field keyed
> by a string.
>
> Pedro.
>
> On Tue, Jun 11, 2019 at 11:07 AM Pedro Larroy
> <pe...@gmail.com> wrote:
> >
> > To give a recent, specific example and focus the discussion (there
> > are as many as there are attributes): the shapes in the graph are a
> > vector of Shape set as an attribute using dmlc::any, which makes it
> > very difficult to debug the shapes when you have a graph object. I
> > would make it a typed attribute of the graph, as we always need the
> > vector of shapes and operate on it while doing shape inference.

Re: Does internal quality matters to users?

Posted by Pedro Larroy <pe...@gmail.com>.
Another data point. A contributor I'm working with is asking for
access to the graph and the values of the NDArray (me too), to be able
to reason more effectively about an enhancement to the operators:

https://github.com/apache/incubator-mxnet/pull/15120

I think gathering in the wiki the design constraints for the different
activities, the design proposals that use the graph, and pain points
like the one we are discussing would be a constructive way to move
forward, unless you think everything is as good as it gets right now,
which is what I understand from your responses.

Re the shape vector: I know it is one line of code to get it, but you
can't get the values with a debugger from other points in the code,
and that compounds across the many other dynamic attributes that you
can also fetch. It's a bit like dying from a thousand cuts. In the
MXNet codebase we are paying a price for no benefit in flexibility, as
we always use those attributes; hence my point that they should be
typed in the graph. Please try to understand my point and help provide
solutions, or at least a reason why it can't be improved, instead of
saying there's no problem. There are still three problems, in order of
importance: debuggability, clarity of definition of data structures,
and navigability with an IDE, which breaks with an untyped field keyed
by a string.

Pedro.

On Tue, Jun 11, 2019 at 11:07 AM Pedro Larroy
<pe...@gmail.com> wrote:
>
> To put a recent specific example and focus the discussion (there are
> many as there are attributes), the shapes in the graph are a vector of
> Shape set as an attribute using dmlc::any so this makes it very
> difficult to debug the shapes when you have a graph object. I would
> have it as a typed attribute to the graph, as we always need the
> vector of shapes and operate on it while doing shape inference.
>
> On Tue, Jun 11, 2019 at 10:18 AM Pedro Larroy
> <pe...@gmail.com> wrote:
> >
> > Thanks for the good discussion.
> >
> > I actually wasn't referring particularly to our conversations in
> > github with respect of the refactors, but it's nice from you to bring
> > them up. And it's ok to disagree in small things, hopefully we can
> > align in the big ones.
> >
> > I understand that for TVM you might have different constraints with
> > how dynamic you want to be for mutating the graph and doing quick
> > experimentation and research but please try to understand my
> > perspectives coming from a software engineering background and helping
> > maintain MXNet for thousands of users and teams using it in
> > production. Let's also consider how many issues we have open and our
> > bandwidth to deal with additional complexity.
> >
> > To your pros and cons I would like to add and emphasize that currently
> > the heavy use of dynamic attributes in the graph using dmlc::any has
> > two very negative consequences, at least for MXNet:
> >
> > 1 - Makes the data structures using dmlc::any almost impossible to
> > debug, as they are just binary.
> > 2 - Makes the code more difficult to understand because there's no
> > declaration in a data structure of the data fields it uses and its
> > responsibilities. We are basically shoving all kinds of stuff using
> > dmlc::any.
> > 3 - You get no help from the IDE to navigate and refactor as another
> > consequence.
> >
> > I would really like you to give me solutions to these problems or at
> > least acknowledge them and tell me why do we have to pay those
> > tradeoffs instead of just dismissing them as engineering taste.
> >
> > The more I work with MXNet the more I wish debugging was easier, and
> > reading and refactoring the code, and those fields would be declared
> > and typed in their corresponding data structures, for MXNet I don't
> > think this would affect anything in regards the python bindings since
> > they go through the typed C API anyway.
> >
> > Maybe we can get some inspiration from LLVM as they have bindings for
> > many languages to work with the AST and have very clean APIs for the
> > compilation steps. It's also OK to have an initial dynamic codebase
> > for research and experimentation and then "cure" them into a solid
> > maintainable one with more types and more robust...
> >
> > Pedro.
> >
> >
> >
> >
> >
> > On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <tq...@cs.washington.edu> wrote:
> > >
> > > A good infrastructure design has a long way to go and has a profound impact on the project itself. That is why we always want to rethink if the interface can be better done, and think about the next possible infrastructure to make things better, Refactoring is certainly part of it.
> > >
> > > There are usually two types of refactoring we refers to :
> > > 1) The major design change, in terms of class relations, data structures (e.g. numpy support, adding compilation to new hardware)
> > > 2) The specific choice of API, programming style(more types or type-erased program)
> > >
> > > (1) affects the long term support of the project, introduces new features if necessary and need a lot of thoughts into that. I believe the general IR, compilation and numpy support belongs to that category.
> > >
> > > I would particularly like to talk about (2).
> > > Because there is no unified correct answer in software engineering, different developers may prefer different views on a certain problem.
> > > Some of them have things to do with the taste developers. The change could favor certain aspect of the project, but not necessarily another part.
> > > Refactoring wrt these sometimes does require a more thoughtful conversation and make a reasonable compromise.
> > >
> > > For example, we have a recent discussion about whether to introduce more typing into the code base, to the extent that the base data structure could be templatized.
> > > - The Pros of this approach
> > >     - It introduces more typing and compile-time error message(instead of runtime checking), could help developers to find problem earlier.
> > > - The Cons of the approach:
> > >    - Having a template in the base data structure causes ABI problem(which code generated by DLL A vs DLL B) and will have potential future issues.
> > >    - Template sometimes confuses some developers.
> > >    - For serialization, it is hard to anticipate all kinds of classes and it is easier to have one class(any) that handles polymorphism.
> > >    - Because of most frontends(python) are dynamic, it is easier to interface them with a type-erased API.
> > >
> > > As we can see there are pros and cons of bringing in more typing to the change, and there is no unified answer.
> > > One good example of a nice infrastructure design trade-off is DLPack https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h
> > > This is a base data structure adopted by MXNet, Pytorch, Chainer, and many other frameworks unanimously.
> > > It is a type-erased data structure that erases the data type, and memory allocator from the data structure and is designed to exchange tensor(coming from different memory allocators) across DLL boundaries.
> > > As you can see this is a good example of type-erased data structures.
> > >
> > > When we are having this kind of questions. It is important to have a good conversation. Sometimes we have to make tradeoffs rather than bend everyone-else to our will. This is what open source is about.
> > > I would also like to give some examples of conversations and how design decisions are resolved. It comes from the TVM community's recent discussion about VM design.
> > > I directly paste the github issue conversation here for the sake of clarity(note that all the conversations are also mirrored to dev@tvm).
> > > The background is that the community want to bring a virtual machine that can execute dynamic operations more effectively.
> > >
> > > - The initial proposal, made by one of the committers gave a detailed design based on Stack VM https://github.com/dmlc/tvm/issues/2810
> > >    - As you can see that there are quite some discussions about whether we want to use a different set of design, in this case, a register-based version.
> > >    - The conversation evolves, and while the community members disagree on some cases, also agrees with each other on the particular tradeoffs.
> > > - After some discussions, the committers bring a tradeoff design that tries to consolidate the needs of both sides and this is the final solution being adopted  https://github.com/dmlc/tvm/issues/2915
> > > I would like to particularly highlight the fact that: 1) there are disagreements in the development process. 2) developers work together to understand each others' needs and then make consensus on a perhaps better design.
> > >
> > > There are two other particular conversations between Pedro and myself, which are during his contributions.
> > > - https://github.com/dmlc/tvm/pull/3037 In this case, I raised the concern about API consistency, and Pedro brings up a reason why he thinks it is a better idea, I agreed and we merged the PR
> > > - https://github.com/dmlc/tvm/pull/3108 In this other case, there are technical reasons for going both sides for the case of MXNet, we have listed pros/cons about both sides and have a constructive conversation. Eventually, I decided to not merge the PR after weighing in all the cases.
> > >
> > > I believe both are useful conversations, and while Pedro and I disagree sometimes, we do agree in many other cases. The most crucial part is having a constructive conversation.
> > > To summarize, I do think refactoring and making things better is certainly important for the project. And I do believe it is crucial to think about all the technical consequences and make good tradeoff decisions.
> > > Sometimes the decision may not make every developer entirely happy, but a good technical compromise can move the project forward and help the community in general.
> > >
> > > Tianqi
> > >
> > > On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <is...@apache.org> wrote:
> > >>
> > >>
> > >>
> > >> On May 31, 2019 at 14:13:30 CEST, Pedro Larroy <pe...@gmail.com> wrote:
> > >> > I think Martin does a very good job explaining why refactoring,
> > >> > reducing developer frustration, and internal improvement are a crucial
> > >> > productivity multiplier, which includes lower cost to ship features,
> > >> > fewer bugs, and less time spent debugging.
> > >>
> > >> There's one aspect that's special for open source projects: if a project wants to survive long term, it should make it easy for people to get started working on the project. In my experience, refactoring and cleanup play an important role in that. So thanks also for making recruiting of new contributors better.
> > >>
> > >> Isabel
> > >> --
> > >> This message was sent with K-9 from a mobile device with swipe to type enabled. I'm sorry for any embarrassing typos that slipped through.

Re: Does internal quality matters to users?

Posted by Tianqi Chen <tq...@cs.washington.edu>.
> Regarding that particular case:
>
> The shape vector is typed once it has been fetched, so it does not affect
> the general programming effort. Getting the shape vector out takes about
> one line of code.
>
> The string-to-any map is defined to keep the general set of attributes
> forward compatible. While it is possible to specialize this kind of
> attribute, doing so would likely make the code that processes one kind of
> attribute differ from the code that processes another.
>
> In summary, there is no real problem with keeping the shape vector in any
> storage, as long as it is properly documented.
>
> Tianqi
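
For readers following the thread, here is a minimal sketch of the pattern being discussed: a graph that keeps its attributes in a string-to-any map, with the shape vector fetched and typed in one line. It is not MXNet or NNVM code. std::any (C++17) stands in for dmlc::any, and ToyGraph, GetAttr, Shape, ShapeVector, MakeGraphWithShapes, and CountShapes are hypothetical names chosen for illustration.

```cpp
#include <any>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration; not the MXNet/NNVM types.
using Shape = std::vector<std::int64_t>;
using ShapeVector = std::vector<Shape>;

// Toy graph whose attributes live in a string-to-any map.
// Assumption: std::any (C++17) stands in for dmlc::any here.
struct ToyGraph {
  std::unordered_map<std::string, std::shared_ptr<std::any>> attrs;

  // Returns the attribute as a typed reference.
  // Throws std::out_of_range if the key is missing and
  // std::bad_any_cast if the stored type is not T.
  template <typename T>
  const T& GetAttr(const std::string& name) const {
    return std::any_cast<const T&>(*attrs.at(name));
  }
};

// Builds a graph carrying a "shape" attribute, as a shape-inference
// pass might leave it behind.
inline ToyGraph MakeGraphWithShapes() {
  ToyGraph g;
  g.attrs["shape"] =
      std::make_shared<std::any>(ShapeVector{{1, 3, 224, 224}, {1, 1000}});
  return g;
}

// Fetching is one line; from then on `shapes` is an ordinary typed vector.
inline std::size_t CountShapes(const ToyGraph& g) {
  const ShapeVector& shapes = g.GetAttr<ShapeVector>("shape");
  return shapes.size();
}
```

In this sketch the fetch is indeed one line, but the struct definition says nothing about which keys exist or which types they hold, and a wrong key or type is only caught at run time.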
>
> On Tue, Jun 11, 2019 at 11:07 AM Pedro Larroy <
> pedro.larroy.lists@gmail.com> wrote:
>
>> To give a recent, specific example and focus the discussion (there are
>> as many examples as there are attributes): the shapes in the graph are a
>> vector of Shape set as an attribute using dmlc::any, which makes it very
>> difficult to debug the shapes when you have a graph object. I would
>> make it a typed attribute of the graph, since we always need the
>> vector of shapes and operate on it during shape inference.

Re: Does internal quality matters to users?

Posted by Pedro Larroy <pe...@gmail.com>.
To give a recent, specific example and focus the discussion (there are
as many examples as there are attributes): the shapes in the graph are a
vector of Shape set as an attribute using dmlc::any, which makes it very
difficult to debug the shapes when you have a graph object. I would
make it a typed attribute of the graph, since we always need the
vector of shapes and operate on it during shape inference.
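
A minimal sketch of the typed alternative suggested above, assuming a hypothetical TypedGraph struct (not an existing MXNet or NNVM type). Shape, SetShapes, and NumElements are also illustrative names, and the real graph class would of course carry much more state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor shape: a list of dimensions.
using Shape = std::vector<std::int64_t>;

// Hypothetical graph with the shapes declared as a typed member, so the
// field shows up in the struct definition, in a debugger, and in IDE tooling.
struct TypedGraph {
  std::size_t num_entries = 0;  // number of node outputs in the graph
  std::vector<Shape> shapes;    // one shape per entry; empty before inference
};

// Stand-in for the end of a shape-inference pass: stores the inferred shapes.
inline void SetShapes(TypedGraph& g, std::vector<Shape> inferred) {
  assert(inferred.size() == g.num_entries);
  g.shapes = std::move(inferred);
}

// Example consumer: total number of elements described by a shape.
inline std::int64_t NumElements(const Shape& s) {
  std::int64_t n = 1;
  for (std::int64_t d : s) n *= d;
  return n;
}
```

Here the shapes are visible in the declaration and misuse of the field is a compile-time error, at the cost raised earlier in the thread: every new attribute means changing the struct instead of adding a key to a generic map.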

On Tue, Jun 11, 2019 at 10:18 AM Pedro Larroy
<pe...@gmail.com> wrote:
>
> Thanks for the good discussion.
>
> I actually wasn't referring particularly to our conversations on
> GitHub with respect to the refactors, but it's nice of you to bring
> them up. And it's OK to disagree on small things; hopefully we can
> align on the big ones.
>
> I understand that for TVM you might have different constraints on
> how dynamic you want to be for mutating the graph and doing quick
> experimentation and research, but please try to understand my
> perspective, coming from a software engineering background and helping
> maintain MXNet for thousands of users and teams using it in
> production. Let's also consider how many issues we have open and our
> bandwidth to deal with additional complexity.
>
> To your pros and cons I would like to add and emphasize that currently
> the heavy use of dynamic attributes in the graph using dmlc::any has
> three very negative consequences, at least for MXNet:
>
> 1 - Makes the data structures using dmlc::any almost impossible to
> debug, as they are just binary.
> 2 - Makes the code more difficult to understand because there's no
> declaration in a data structure of the data fields it uses and its
> responsibilities. We are basically shoving all kinds of stuff using
> dmlc::any.
> 3 - You get no help from the IDE to navigate and refactor as another
> consequence.
>
> I would really like you to give me solutions to these problems, or at
> least acknowledge them and tell me why we have to accept those
> tradeoffs, instead of just dismissing them as engineering taste.
>
> The more I work with MXNet, the more I wish that debugging, reading,
> and refactoring the code were easier, and that those fields were declared
> and typed in their corresponding data structures. For MXNet, I don't
> think this would affect the Python bindings, since they go through
> the typed C API anyway.
>
> Maybe we can get some inspiration from LLVM, as they have bindings for
> many languages to work with the AST and have very clean APIs for the
> compilation steps. It's also OK to have an initial dynamic codebase
> for research and experimentation and then "cure" it into a solid,
> maintainable one with more types that is more robust...
>
> Pedro.
>
>
>
>
>
> On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <tq...@cs.washington.edu> wrote:
> >
> > Good infrastructure design goes a long way and has a profound impact on the project itself. That is why we always want to rethink whether an interface can be done better, and to think about the next possible infrastructure to make things better. Refactoring is certainly part of that.
> >
> > There are usually two types of refactoring we refer to:
> > 1) A major design change, in terms of class relations and data structures (e.g. numpy support, adding compilation to new hardware)
> > 2) A specific choice of API or programming style (more types or type-erased programs)
> >
> > (1) affects the long-term support of the project, introduces new features where necessary, and needs a lot of thought. I believe the general IR, compilation, and numpy support belong to that category.
> >
> > I would particularly like to talk about (2).
> > Because there is no unified correct answer in software engineering, different developers may prefer different views on a certain problem.
> > Some of these have to do with developers' taste. A change could favor a certain aspect of the project, but not necessarily another.
> > Refactoring of this kind sometimes requires a more thoughtful conversation and a reasonable compromise.
> >
> > For example, we had a recent discussion about whether to introduce more typing into the code base, to the extent that the base data structure could be templatized.
> > - The pros of this approach:
> >    - It introduces more typing and compile-time error messages (instead of runtime checking), which could help developers find problems earlier.
> > - The cons of this approach:
> >    - Having a template in the base data structure causes ABI problems (code generated by DLL A vs. DLL B) and may cause issues in the future.
> >    - Templates sometimes confuse developers.
> >    - For serialization, it is hard to anticipate all kinds of classes, and it is easier to have one class (any) that handles polymorphism.
> >    - Because most frontends (Python) are dynamic, it is easier to interface them with a type-erased API.
> >
> > As we can see, there are pros and cons to bringing more typing into the code base, and there is no unified answer.
> > One good example of a nice infrastructure design tradeoff is DLPack https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h
> > This is a base data structure adopted by MXNet, PyTorch, Chainer, and many other frameworks.
> > It is a type-erased data structure that erases the data type and memory allocator from the data structure, and it is designed to exchange tensors (coming from different memory allocators) across DLL boundaries.
> > As you can see, this is a good example of a type-erased data structure.
> >
> > When we have this kind of question, it is important to have a good conversation. Sometimes we have to make tradeoffs rather than bend everyone else to our will. This is what open source is about.
> > I would also like to give some examples of conversations and how design decisions are resolved. They come from the TVM community's recent discussion about VM design.
> > I link the GitHub issue conversations directly here for the sake of clarity (note that all the conversations are also mirrored to dev@tvm).
> > The background is that the community wants to bring in a virtual machine that can execute dynamic operations more effectively.
> >
> > - The initial proposal, made by one of the committers, gave a detailed design based on a stack VM https://github.com/dmlc/tvm/issues/2810
> >    - As you can see, there was quite some discussion about whether we wanted to use a different design, in this case a register-based version.
> >    - The conversation evolved, and while the community members disagreed on some points, they also agreed with each other on the particular tradeoffs.
> > - After some discussion, the committers brought a tradeoff design that tries to consolidate the needs of both sides, and this is the final solution being adopted https://github.com/dmlc/tvm/issues/2915
> > I would like to particularly highlight the fact that: 1) there are disagreements in the development process; 2) developers work together to understand each other's needs and then reach consensus on a perhaps better design.
> >
> > There are two other particular conversations between Pedro and myself, which took place during his contributions.
> > - https://github.com/dmlc/tvm/pull/3037 In this case, I raised a concern about API consistency, and Pedro brought up a reason why he thought his approach was a better idea; I agreed and we merged the PR.
> > - https://github.com/dmlc/tvm/pull/3108 In this other case, there were technical reasons for going either way in the case of MXNet; we listed pros/cons for both sides and had a constructive conversation. Eventually, I decided not to merge the PR after weighing all the cases.
> >
> > I believe both are useful conversations, and while Pedro and I disagree sometimes, we do agree in many other cases. The most crucial part is having a constructive conversation.
> > To summarize, I do think refactoring and making things better is certainly important for the project. And I do believe it is crucial to think about all the technical consequences and make good tradeoff decisions.
> > Sometimes the decision may not make every developer entirely happy, but a good technical compromise can move the project forward and help the community in general.
> >
> > Tianqi
> >
> > On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <is...@apache.org> wrote:
> >>
> >>
> >>
> >> On May 31, 2019 at 14:13:30 CEST, Pedro Larroy <pe...@gmail.com> wrote:
> >> > I think Martin does a very good job explaining why refactoring,
> >> > reducing developer frustration, and internal improvement are a crucial
> >> > productivity multiplier, which includes lower cost to ship features,
> >> > fewer bugs, and less time spent debugging.
> >>
> >> There's one aspect that's special for open source projects: if a project wants to survive long term, it should make it easy for people to get started working on the project. In my experience, refactoring and cleanup play an important role in that. So thanks also for making recruiting of new contributors better.
> >>
> >> Isabel
> >> --
> >> This message was sent with K-9 from a mobile device with swipe to type enabled. I'm sorry for any embarrassing typos that slipped through.