Posted to dev@lucy.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2013/09/13 06:12:55 UTC

[lucy-dev] Deserializing and trust

On Thu, Sep 5, 2013 at 3:11 PM,  <nw...@apache.org> wrote:
> Eliminate mutable String in Util::Freezer

> Project: http://git-wip-us.apache.org/repos/asf/lucy/repo
> Commit: http://git-wip-us.apache.org/repos/asf/lucy/commit/cfea9e61
> Tree: http://git-wip-us.apache.org/repos/asf/lucy/tree/cfea9e61
> Diff: http://git-wip-us.apache.org/repos/asf/lucy/diff/cfea9e61

>      Hash_init(hash, size);
>
>      // Read key-value pairs with String keys.
>      while (num_strings--) {
>          uint32_t len = InStream_Read_C32(instream);
> -        char *key_buf = Str_Grow(key, len);
> +        char *key_buf = (char*)MALLOCATE(len + 1);
>          InStream_Read_Bytes(instream, key_buf, len);
>          key_buf[len] = '\0';
> -        Str_Set_Size(key, len);
> +        String *key = Str_new_steal_from_trusted_str(key_buf, len, len + 1);
>          Hash_Store(hash, (Obj*)key, THAW(instream));
> +        DECREF(key);
>      }

When reading the key, we should use a constructor which validates incoming
UTF-8 rather than Str_new_steal_from_trusted_str because we don't know (and
therefore don't "trust") the origin of the bytes we're deserializing.
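For illustration, the kind of check a validating constructor needs to run
before taking ownership of the buffer might look like the sketch below. The
name utf8_is_valid is hypothetical and is not Lucy's actual API; it only
assumes the standard definition of well-formed UTF-8 (no overlong forms, no
surrogates, nothing above U+10FFFF).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Lucy: returns true if the `size` bytes
 * at `buf` are well-formed UTF-8. */
bool
utf8_is_valid(const char *buf, size_t size) {
    const uint8_t *p   = (const uint8_t*)buf;
    const uint8_t *end = p + size;

    while (p < end) {
        uint8_t lead  = *p++;
        size_t  trail = 0;      /* continuation bytes expected */
        uint8_t lo    = 0x80;   /* allowed range for the first one */
        uint8_t hi    = 0xBF;

        if (lead < 0x80) {
            continue;                          /* ASCII */
        }
        else if (lead >= 0xC2 && lead <= 0xDF) {
            trail = 1;
        }
        else if (lead >= 0xE0 && lead <= 0xEF) {
            trail = 2;
            if (lead == 0xE0) { lo = 0xA0; }   /* reject overlong forms */
            if (lead == 0xED) { hi = 0x9F; }   /* reject surrogates */
        }
        else if (lead >= 0xF0 && lead <= 0xF4) {
            trail = 3;
            if (lead == 0xF0) { lo = 0x90; }   /* reject overlong forms */
            if (lead == 0xF4) { hi = 0x8F; }   /* reject > U+10FFFF */
        }
        else {
            return false;                      /* 0x80-0xC1, 0xF5-0xFF */
        }

        /* Reject sequences truncated by the end of the buffer. */
        if ((size_t)(end - p) < trail) { return false; }

        /* The first continuation byte has a lead-dependent range. */
        if (*p < lo || *p > hi) { return false; }
        p++;

        /* Any remaining continuation bytes must be 10xxxxxx. */
        for (size_t i = 1; i < trail; i++, p++) {
            if ((*p & 0xC0) != 0x80) { return false; }
        }
    }
    return true;
}
```

In the loop from the commit, a check like this would run on key_buf after
InStream_Read_Bytes and before the String is constructed, with the buffer
freed and an error raised on failure.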

Marvin Humphrey

Re: [lucy-dev] Deserializing and trust

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Fri, Sep 13, 2013 at 3:38 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> Now that the code pattern above appears in quite a few places, we should
> also consider a new method like:
>
> incremented String*
> InStream_Read_Utf8(InStream *self, size_t len);

+1

I'd name that second parameter "size": we generally use "size" for memory
dimensions, and "length"/"len" for a count of logical characters or code
points (though not exclusively).
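To make the size/length distinction concrete, here is a small sketch that
counts code points in a UTF-8 buffer whose size is given in bytes. The
function name is hypothetical, and it assumes the input has already been
validated as UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: given `size` bytes of valid UTF-8, return the
 * length in code points.  Every code point contributes exactly one byte
 * that is not a continuation byte (10xxxxxx), so we count those. */
size_t
utf8_code_point_count(const char *buf, size_t size) {
    size_t length = 0;
    for (size_t i = 0; i < size; i++) {
        if (((unsigned char)buf[i] & 0xC0) != 0x80) {
            length++;
        }
    }
    return length;
}
```

For the 6-byte buffer "h\xC3\xA9llo" the size is 6 but the length is 5,
which is why the parameter that says how many bytes to read is better called
"size".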

Marvin Humphrey

Re: [lucy-dev] Deserializing and trust

Posted by Nick Wellnhofer <we...@aevum.de>.
On 13/09/2013 06:12, Marvin Humphrey wrote:
> On Thu, Sep 5, 2013 at 3:11 PM,  <nw...@apache.org> wrote:
>>       // Read key-value pairs with String keys.
>>       while (num_strings--) {
>>           uint32_t len = InStream_Read_C32(instream);
>> -        char *key_buf = Str_Grow(key, len);
>> +        char *key_buf = (char*)MALLOCATE(len + 1);
>>           InStream_Read_Bytes(instream, key_buf, len);
>>           key_buf[len] = '\0';
>> -        Str_Set_Size(key, len);
>> +        String *key = Str_new_steal_from_trusted_str(key_buf, len, len + 1);
>>           Hash_Store(hash, (Obj*)key, THAW(instream));
>> +        DECREF(key);
>>       }
>
> When reading the key, we should use a constructor which validates incoming
> UTF-8 rather than Str_new_steal_from_trusted_str because we don't know (and
> therefore don't "trust") the origin of the bytes we're deserializing.

+1

Now that the code pattern above appears in quite a few places, we should 
also consider a new method like:

incremented String*
InStream_Read_Utf8(InStream *self, size_t len);
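As a rough sketch of what such a method could do, the example below uses a
plain in-memory cursor in place of InStream. Everything in it (ByteReader,
reader_read_utf8, ascii_only) is a hypothetical stand-in rather than Lucy
code, and ascii_only is only a placeholder for a real UTF-8 validator. The
point is the order of operations: check the untrusted size against what is
actually available, copy, NUL-terminate, and validate before handing the
string back.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for InStream: a cursor over an in-memory buffer.
 * Invariant: pos <= size. */
typedef struct {
    const char *buf;
    size_t      size;
    size_t      pos;
} ByteReader;

/* Placeholder check so the sketch compiles on its own.  Real code would
 * plug in a full UTF-8 validator here. */
bool
ascii_only(const char *buf, size_t size) {
    for (size_t i = 0; i < size; i++) {
        if ((unsigned char)buf[i] >= 0x80) { return false; }
    }
    return true;
}

/* Reads `size` bytes and returns a NUL-terminated, malloc'd copy that the
 * caller must free().  Returns NULL, leaving the cursor untouched, if fewer
 * than `size` bytes remain, if allocation fails, or if `is_valid` rejects
 * the bytes. */
char*
reader_read_utf8(ByteReader *self, size_t size,
                 bool (*is_valid)(const char*, size_t)) {
    /* Never trust a size read from the stream: bound it first. */
    if (size > self->size - self->pos) { return NULL; }

    char *str = (char*)malloc(size + 1);
    if (str == NULL) { return NULL; }

    memcpy(str, self->buf + self->pos, size);
    str[size] = '\0';

    /* Validate before the bytes are treated as a string. */
    if (!is_valid(str, size)) {
        free(str);
        return NULL;
    }

    self->pos += size;
    return str;
}
```

A real InStream_Read_Utf8 would presumably return an incremented String*
and report failures through Lucy's own error handling instead of returning
NULL.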

Nick