You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pegasus.apache.org by GitBox <gi...@apache.org> on 2021/05/13 07:41:18 UTC

[GitHub] [incubator-pegasus] levy5307 edited a comment on pull request #722: refactor: implement value_shcema_v0

levy5307 edited a comment on pull request #722:
URL: https://github.com/apache/incubator-pegasus/pull/722#issuecomment-827254982


   > If your intention of this PR is to add new value schema more conveniently, I can give a simple solution:
   > 
   > ```c++
   > struct value_schema_v2
   > {
   >     static constexpr size_t version_field_offset = 0;
   >     typedef version_type = uint8_t;
   > 
   >     static constexpr size_t expire_ts_field_offset = 1;
   >     typedef expire_ts_type = uint32_t;
   > 
   >     static constexpr size_t timetag_field_offset = 5;
   >     typedef timetag_type = uint64_t;
   > 
   >     value_schema_v2() {
   >       _write_buf.resize(sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint64_t));
   >       _write_slices.resize(2);
   >     }
   > 
   >     rocksdb::SliceParts generate_value(uint8_t version, uint32_t expire_ts, uint64_t timetag, dsn::string_view user_data) {
   >       data_output out(_write_buf);
   >       out.write<uint8_t>(version);
   >       out.write<uint32_t>(expire_ts);
   >       out.write<uint64_t>(user_data);
   > 
   >       _write_buf[0] = rocksdb::Slice(_write_buf.data, )
   >       return {&_write_slices[0], static_cast<int>(_write_slices.size())};
   >     }
   > 
   >     static uint8_t extract_version(string_view val);
   >     static uint32_t extract_expire_ts(string_view val);
   >     static uint64_t extract_timetag(string_view val);
   >     static blob:: extract_user_data(std::string&& val);
   > 
   > private:
   >     std::string _write_buf;
   >     std::vector<rocksdb::Slice> _write_slices;
   > };
   > 
   > void extract_expire_ts_from_value(int version, string_view user_data);
   > void extract_timetag_from_value(int version, string_view user_data);
   > void extract_version_from_value(int version, string_view user_data);
   > ```
   > 
   > As you can see, this code is very similar to our thrift-generated code. It can be generated with a very simple template language, for example, yaml.
   > 
   > ```yaml
   > 0:
   >   - expire_ts : 
   >       type: uint32_t
   >       size: 4
   > 
   > 1:
   >   - expire_ts : 
   >       type: uint32_t
   >       size: 4
   >   - timetag : 
   >       type: uint64_t
   >       size: 8
   > 
   > 2:
   >   - version : 
   >       type: uint8_t
   >       size: 1
   >   - expire_ts : 
   >       type: uint32_t
   >       size: 4
   >   - timetag : 
   >       type: uint64_t
   >       size: 8
   > ```
   > 
   > I would suggest implementing a tool that can generate the above code according to the template, instead of using class inheritance.
   > The value encoding/decoding is extremely critical and requires efficiency. I think we should avoid runtime type deduction as much as we can, for example:
   > 
   > ```c++
   > void extract_expire_ts_from_value(value_schema *schema, string_view user_data) {
   >     auto field = schema->extract_field(raw_value, pegasus::value_field_type::EXPIRE_TIMESTAMP); // type deduction #1
   >     auto expire_ts_field = static_cast<expire_timestamp_field *>(segment.get()); // type deduction #2
   >     return expire_ts_field->expire_ts;
   > ```
   > 
   > There're 2 deductions here. I'm not indicating that this code will definitely perform poorly. But technically, I would prefer to write code without potential slow points.
   > 
   > Using a code generator can also avoid bugs so long as the tool itself is stable. So when we add another field in the future, we only need to modify the template file, rather than introduce more code.
   
   I don't think it's a good idea
   1.  we should add a different interface for each filed, for example: `extract_version_from_value`. This means you should add a interface if you add a filed. And In this interface, you should deal with all of these version even if this version does't support it. The code is not elegant enough if we implement it like this. To be honest, that's what I did initially, but all of us thought it wasn't good enough.
   2. `static_cast` doesn't cost too much time. And there is only one type deduction, not two.
   3. It's too complex to implement a tool to generate these code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org