Posted to dev@quickstep.apache.org by LOKANANDA DHAGE MUNISAMAPPA <dh...@wisc.edu> on 2017/12/10 03:45:16 UTC

Bulk Insert in Quickstep creates a new block for every row?

Hi Quickstep Devs,

I am trying to bulk insert tuples into a Quickstep table created with the BLOCKPROPERTIES option BLOCKSIZEMB set to 2 (that is, 2 MB blocks), as shown below.

Create table query:
CREATE TABLE lineorder (
  lo_orderkey      INT NOT NULL,
  lo_linenumber    INT NOT NULL,
  lo_custkey       INT NOT NULL,
  lo_orderpriority CHAR(1024) NOT NULL
) WITH BLOCKPROPERTIES (
TYPE split_rowstore,
BLOCKSIZEMB 2);

Insert query:

INSERT INTO lineorder VALUES (22, 6, 0, 'high'),(20, 6, 1, 'high'),(20, 6, 2, 'high'),(20, 6, 3, 'high'),(20, 6, 4, 'high'),(20, 6, 5, 'high'),(20, 6, 6, 'high'),(20, 6, 7, 'high'),(20, 6, 8, 'high')….;

As a result, I observe that Quickstep creates a separate 2 MB block for every tuple, leaving many partially filled 2 MB blocks. Is this the expected behavior for bulk insert in Quickstep?
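To put the cost of one block per tuple in perspective, here is a rough back-of-the-envelope calculation for the schema above. It is only a sketch: it assumes INT occupies 4 bytes and CHAR(1024) occupies 1024 bytes, and it ignores any per-block header or index overhead, so the real capacity of a block will be somewhat lower.

```python
# Rough capacity estimate for one 2 MB block of the lineorder table.
# Assumptions (not confirmed by the thread): INT is 4 bytes,
# CHAR(1024) is 1024 bytes, and block metadata overhead is ignored.
INT_BYTES = 4
CHAR_BYTES = 1024
BLOCK_BYTES = 2 * 1024 * 1024  # BLOCKSIZEMB 2

# Three INT columns plus one CHAR(1024) column per tuple.
tuple_bytes = 3 * INT_BYTES + CHAR_BYTES

# How many tuples would fit if the block were packed densely.
tuples_per_block = BLOCK_BYTES // tuple_bytes

# Fraction of a block used when it holds a single tuple.
single_tuple_fill = tuple_bytes / BLOCK_BYTES

print(f"bytes per tuple: {tuple_bytes}")
print(f"tuples per block (upper bound): {tuples_per_block}")
print(f"fill with one tuple: {single_tuple_fill:.4%}")
```

Under these assumptions a block could hold on the order of two thousand tuples, so a block holding a single tuple is almost entirely unused.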

Is there a way to bulk insert without creating a new block for every row in Quickstep? This would help us create the workload for our project, because inserting with multiple separate queries takes a long time, although in that case Quickstep does not create a separate block for every tuple and seems to work as expected.

Best,
Lokananda

Re: Bulk Insert in Quickstep creates a new block for every row?

Posted by Harshad Deshmukh <ha...@cs.wisc.edu>.
Hi Lokananda,


Quickstep supports bulk insert. If you have a CSV file in which the columns are delimited by the symbol '|' (without the single quotes), the bulk insert syntax is:



COPY table_name FROM '/full/path/to/csvfile' WITH (DELIMITER '|');


An example of the COPY command can be found in the README file on GitHub (https://github.com/apache/incubator-quickstep).
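If the tuples are currently generated in a script rather than stored in a file, one way to prepare input for COPY is to write them out with '|' between columns. The following is a minimal sketch, not part of Quickstep: the helper names `format_row` and `write_rows` and the file name `lineorder.tbl` are made up for illustration, and it assumes no value contains the delimiter or a newline, since how Quickstep treats quoting or escaping is not covered here.

```python
def format_row(row, delimiter="|"):
    """Join one tuple's values into a single delimited line.

    Hypothetical helper: rejects values that contain the delimiter or a
    newline instead of guessing how the loader handles quoting.
    """
    fields = [str(value) for value in row]
    for field in fields:
        if delimiter in field or "\n" in field:
            raise ValueError(f"value {field!r} contains a delimiter or newline")
    return delimiter.join(fields)


def write_rows(rows, path, delimiter="|"):
    """Write every tuple in `rows` to `path`, one delimited line per tuple."""
    with open(path, "w", encoding="utf-8", newline="\n") as out:
        for row in rows:
            out.write(format_row(row, delimiter) + "\n")


# Sample rows shaped like the lineorder tuples from the question.
rows = [(22, 6, 0, "high")] + [(20, 6, i, "high") for i in range(1, 9)]

# Illustrative output file name; COPY expects the full path to this file.
write_rows(rows, "lineorder.tbl")
```

The resulting file can then be loaded with the COPY statement above, substituting lineorder for table_name and the file's full path for the path placeholder.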


Thanks,

Harshad

________________________________
From: LOKANANDA DHAGE MUNISAMAPPA <dh...@wisc.edu>
Sent: Saturday, December 9, 2017 9:45:16 PM
To: dev@quickstep.incubator.apache.org
Cc: Om Jadhav
Subject: Bulk Insert in Quickstep creates a new block for every row?