Posted to dev@avro.apache.org by "Jack Stockdale (JIRA)" <ji...@apache.org> on 2016/06/04 13:05:59 UTC

[jira] [Updated] (AVRO-1859) Potential Invalid memory allocation

     [ https://issues.apache.org/jira/browse/AVRO-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jack Stockdale updated AVRO-1859:
---------------------------------
    Description: 
The (potential) memory issue is illustrated by the following code, a minimal set of steps reproducing a problem found in a much larger real application. In that application, Avro buffers are read from a third-party producer using pre-configured reader and writer schemas; the schemas are not attached to each individual buffer.

The original problem turned out to be down to a change in one of these schemas, but the sample code below shows the same effect and illustrates that, without additional validation, a crafted input can cause unwanted behaviour in the Avro C library.

It may be that required validation steps are missing here; if so, are there any examples of how the application should perform that validation?

Essentially, a particular Avro input buffer causes the library to attempt to allocate 18446744073709551577 bytes in realloc.

This was caught while testing the real application because the jemalloc allocator's xmalloc setting was in use, which aborts on any failed memory allocation.
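(For context, that abort-on-failure behaviour is jemalloc's opt.xmalloc option, typically enabled through the environment, for example:

{code}
# jemalloc: abort the process instead of returning NULL on allocation failure
export MALLOC_CONF="xmalloc:true"
{code}
)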

{code}
#include <string.h>  /* memset, strlen */

#include "avro.h"

const char* schemaBuffer = "\
\
{\"namespace\": \"example.avro\",\
 \"type\":\"record\",\
 \"name\":\"example\",\
 \"fields\": [\
	  {\"name\": \"s1\", \"type\": \"string\"}\
 ]\
}";

int main(int argc, char* argv[])
{
  char buffer[4];
  memset(buffer, 0, sizeof(buffer));

  buffer[0] = 0x4f;  /* a one-byte zigzag varint that decodes to a negative string length */

  avro_reader_t reader = avro_reader_memory(buffer, 1);
  avro_schema_t schema;

  if(!avro_schema_from_json_length(schemaBuffer, strlen(schemaBuffer), &schema) && reader)
  {
    avro_value_iface_t* iface = avro_generic_class_from_schema(schema);

    if(iface)
    {
      avro_value_t row;

      avro_generic_value_new(iface, &row);

      // The following attempts to allocate memory for a string of size -39 bytes,
      // represented as an int64_t, in:
      //     static int read_string(avro_reader_t reader, char **s, int64_t *len)

      // -39 is cast to a size_t of 18446744073709551577, which is passed as the
      // size to realloc in:
      //     avro_default_allocator(void *ud, void *ptr, size_t osize, size_t nsize)

      // An attempt is then made to allocate 18446744073709551577 bytes.

      avro_value_read(reader, &row);
    }
  }

  return 0;
}

{code}
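For reference, here is the arithmetic behind the comments above: a sketch of the standard Avro zigzag/varint decode, assuming (per read_string) that the library adds one byte for a terminating NUL before allocating.

{code}
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

int main(void)
{
  /* 0x4f has its high bit clear, so it is a complete one-byte varint. */
  uint64_t n = 0x4f;                                    /* 79 */

  /* Standard Avro zigzag decode: (n >> 1) ^ -(n & 1). */
  int64_t len = (int64_t)(n >> 1) ^ -(int64_t)(n & 1);  /* -40 */

  /* read_string allocates len + 1 for the terminating NUL, which is
     where the -39 in the comments above comes from. */
  int64_t alloc = len + 1;                              /* -39 */

  /* Converted to size_t for the allocator, -39 wraps to 2^64 - 39. */
  printf("%llu\n", (unsigned long long)(size_t)alloc);
  /* prints 18446744073709551577 */
  return 0;
}
{code}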
Should there be additional validation steps in application code to ensure the buffer is valid for the schema before reading it?
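In the meantime, one mitigation available at the application level is to install a bounding allocator via avro_set_allocator, so that a corrupt length becomes a failed read rather than an attempted 18-exabyte realloc, and to check the return code of avro_value_read. A minimal sketch follows; the 64 MiB cap is an arbitrary application-chosen value, not anything mandated by Avro:

{code}
#include <stdio.h>
#include <stdlib.h>

#include "avro.h"

/* Arbitrary sanity cap on a single allocation; pick a value comfortably
   larger than the biggest legitimate record the application expects. */
#define MAX_SINGLE_ALLOC ((size_t)64 * 1024 * 1024)

/* Same contract as avro_default_allocator, but refuses absurd sizes.
   Returning NULL makes the library report an allocation failure instead
   of handing 18446744073709551577 bytes to realloc (or to jemalloc). */
static void *bounded_allocator(void *ud, void *ptr, size_t osize, size_t nsize)
{
  (void) ud;
  (void) osize;
  if (nsize == 0) {
    free(ptr);
    return NULL;
  }
  if (nsize > MAX_SINGLE_ALLOC) {
    return NULL;
  }
  return realloc(ptr, nsize);
}

int main(void)
{
  avro_set_allocator(bounded_allocator, NULL);

  /* ... build the reader, schema, iface and row exactly as in the
     reproduction above, then check every read instead of assuming success:

       if (avro_value_read(reader, &row) != 0)
         fprintf(stderr, "decode failed: %s\n", avro_strerror());
  */
  return 0;
}
{code}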




> Potential Invalid memory allocation
> -----------------------------------
>
>                 Key: AVRO-1859
>                 URL: https://issues.apache.org/jira/browse/AVRO-1859
>             Project: Avro
>          Issue Type: Bug
>          Components: c
>    Affects Versions: 1.8.1
>         Environment: linux 64 bit
>            Reporter: Jack Stockdale
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)