Posted to dev@parquet.apache.org by Antoine Pitrou <an...@python.org> on 2024/01/18 10:08:11 UTC

Error building with IntelliJ

Hello all,

Thank you for the suggestions. I am trying to build parquet-mr from
IntelliJ now ("Build" -> "Build Project"), but I get the following
error:

/home/antoine/parquet/mr/parquet-common/src/test/java/org/apache/parquet/VersionTest.java:46:24
java: cannot find symbol
  symbol:   variable Version
  location: class org.apache.parquet.VersionTest


Am I missing something obvious? Does a separate step need to be run
first?

Regards

Antoine.



On Thu, 11 Jan 2024 18:48:20 +0100
Antoine Pitrou <an...@python.org> wrote:

> Hello,
> 
> I'm trying to build parquet-mr and I'm unsure how to make the
> experience smooth enough for development. This is what I observe:
> 
> 1) running the tests takes an extremely long time (they have been
> running for 10 minutes already, with no sign of nearing completion)
> 
> 2) the output logs are a true firehose; there's a ton of extremely
> detailed (and probably superfluous) information being output, such as:
> 
> 2024-01-11 18:45:33 INFO CodecPool - Got brand-new compressor [.zstd]
> 2024-01-11 18:45:33 INFO CodecPool - Got brand-new decompressor [.gz]
> 2024-01-11 18:45:33 INFO CodecPool - Got brand-new compressor [.zstd]
> 2024-01-11 18:45:33 INFO CodecPool - Got brand-new decompressor [.gz]
> 2024-01-11 18:45:33 INFO CodecPool - Got brand-new compressor [.zstd]
> 2024-01-11 18:45:33 INFO ParquetRewriter - Finish rewriting input file: file:/tmp/test12306662267168473656/test.parquet
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - RecordReader initialized will read a total of 100000 records.
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - at row 0. reading next block
> 2024-01-11 18:45:33 INFO CodecPool - Got brand-new decompressor [.zstd]
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - block read in memory in 1 ms. row count = 100
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - Assembled and processed 100 records from 6 columns in 0 ms: Infinity rec/ms, Infinity cell/ms
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - time spent so far 100% reading (1 ms) and 0% processing (0 ms)
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - at row 100. reading next block
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - block read in memory in 0 ms. row count = 100
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - Assembled and processed 200 records from 6 columns in 1 ms: 200.0 rec/ms, 1200.0 cell/ms
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - time spent so far 50% reading (1 ms) and 50% processing (1 ms)
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - at row 200. reading next block
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - block read in memory in 0 ms. row count = 100
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - Assembled and processed 300 records from 6 columns in 1 ms: 300.0 rec/ms, 1800.0 cell/ms
> 2024-01-11 18:45:33 INFO InternalParquetRecordReader - time spent so far 50% reading (1 ms) and 50% processing (1 ms)
> 
> [etc.]
> 
> 
> 3) it seems the tests are leaving a lot of generated data files behind
> in /tmp/test..., though of course they might ultimately clean up at the
> end?
> 
> 
> How do people typically develop on parquet-mr? Do they have dedicated
> shell scripts that only build and test parts of the project? Do they
> use an IDE and select specific options there?
> 
> Regards
> 
> Antoine.
> 
> 
> 




Re: Error building with IntelliJ

Posted by Gang Wu <us...@gmail.com>.
Usually I reload the Maven project [1] after a full build.

If that doesn't work, my last resort is to run the following steps:
- mvn install -DskipTests
   All built jars will be installed in the local ~/.m2/repository/xxx
- mvn dependency:copy-dependencies
   All dependencies will be copied to the project's target/dependency
   directory.
- Add target/dependency as a dependency directory in IntelliJ IDEA [2]
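
The two Maven steps above can be sketched as a small shell function. This is
only a sketch: the function name prepare_for_intellij and the MVN and DRY_RUN
variables are made up for illustration and are not part of parquet-mr. The two
Maven invocations are exactly the ones listed above; the last step (adding
target/dependency in the IDE) stays manual.

```shell
#!/bin/sh
# Hypothetical helper; the function names and the MVN / DRY_RUN variables
# are illustrative assumptions, not something shipped with parquet-mr.
# Run it from the root of the parquet-mr checkout.

MVN="${MVN:-mvn}"

# Print the command, then run it unless DRY_RUN=1 is set.
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

prepare_for_intellij() {
    # Step 1: build all modules and install the jars into the local
    # repository, without running the tests.
    run_step "$MVN" install -DskipTests || return 1
    # Step 2: copy each module's dependencies into its target/dependency
    # directory.
    run_step "$MVN" dependency:copy-dependencies || return 1
    echo "Now add target/dependency as a dependency directory in IntelliJ IDEA."
}
```

Setting DRY_RUN=1 before calling prepare_for_intellij only prints the commands,
which is a cheap way to check what would run before starting a full build.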

[1]
https://www.jetbrains.com/help/idea/delegate-build-and-run-actions-to-maven.html#maven_reimport
[2]
https://www.jetbrains.com/help/idea/working-with-module-dependencies.html#add-a-new-dependency

Best,
Gang


On Fri, Jan 19, 2024 at 1:34 AM Claire McGinty <cl...@gmail.com>
wrote:

> Hi Antoine,
>
> I've run into this problem before and can usually fix it by running `mvn
> clean` on the command line before building in IntelliJ.
>
> Hope that helps,
> Claire

Re: Error building with IntelliJ

Posted by Claire McGinty <cl...@gmail.com>.
Hi Antoine,

I've run into this problem before and can usually fix it by running `mvn
clean` on the command line before building in IntelliJ.
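
One plausible reason this helps (an assumption, not something confirmed in
this thread) is that `mvn clean` deletes each module's `target` directory, so
IntelliJ no longer sees stale build output. A minimal sketch for checking that
nothing was left behind; the function name `leftover_target_dirs` is
hypothetical, and it assumes the default Maven layout where build output lives
in directories named `target`.

```shell
#!/bin/sh
# Hypothetical helper (not part of parquet-mr): list build-output
# directories still present under a checkout, e.g. after `mvn clean`.
# Assumes the default Maven layout, where build output goes to `target`.
leftover_target_dirs() {
    root="${1:-.}"
    # -prune stops find from descending into a matched target directory.
    find "$root" -type d -name target -prune -print
}
```

After running `mvn clean` in the checkout, calling `leftover_target_dirs` with
the checkout path should print nothing if every `target` directory was removed.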

Hope that helps,
Claire
