Posted to user@drill.apache.org by "Grant Overby (groverby)" <gr...@cisco.com> on 2015/09/08 17:17:14 UTC

BlockMissingException

Drill is throwing a block missing exception; however, hdfs seems healthy. Thoughts?

From Drill's web UI after executing a query:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: BlockMissingException: Could not obtain block: BP-1605794487-10.0.1.3-1435700184285:blk_1073828756_89484 file=/warehouse2/completed/events/connection_events/1441707300/1441707312267-9-f98c71d9-dff0-4dad-8220-01b7a8a5e1d5.parquet Fragment 1:19 [Error Id: 9a0b442a-e5e1-44f9-9200-fff9f59b990a on twig03.twigs:31010]
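
For reference, the full stack trace behind a SYSTEM ERROR like this can usually be surfaced by enabling verbose errors for the session before re-running the query (a sketch; assumes the exec.errors.verbose option is available in this Drill build):

0: jdbc:drill:> ALTER SESSION SET `exec.errors.verbose` = true;

Re-running the failing query afterwards makes the error text include the underlying stack trace.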

Retrieving the file from hdfs:

root@twig03:~# hdfs dfs -get /warehouse2/completed/events/connection_events/1441707300/1441707312267-9-f98c71d9-dff0-4dad-8220-01b7a8a5e1d5.parquet /tmp/.

root@twig03:~# ls /tmp/*.parquet

/tmp/1441707312267-9-f98c71d9-dff0-4dad-8220-01b7a8a5e1d5.parquet

hdfs report:

root@twig03:~# hdfs dfsadmin -report

Configured Capacity: 7856899358720 (7.15 TB)

Present Capacity: 7856899358720 (7.15 TB)

DFS Remaining: 4567228003960 (4.15 TB)

DFS Used: 3289671354760 (2.99 TB)

DFS Used%: 41.87%

Under replicated blocks: 22108

Blocks with corrupt replicas: 0

Missing blocks: 0


-------------------------------------------------

Live datanodes (2):


Name: 10.0.1.4:50010 (twig04.twigs)

Hostname: twig04.twigs

Decommission Status : Normal

Configured Capacity: 3928449679360 (3.57 TB)

DFS Used: 1644836539588 (1.50 TB)

Non DFS Used: 0 (0 B)

DFS Remaining: 2283613139772 (2.08 TB)

DFS Used%: 41.87%

DFS Remaining%: 58.13%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 3

Last contact: Tue Sep 08 11:15:47 EDT 2015



Name: 10.0.1.3:50010 (twig03.twigs)

Hostname: twig03.twigs

Decommission Status : Normal

Configured Capacity: 3928449679360 (3.57 TB)

DFS Used: 1644834815172 (1.50 TB)

Non DFS Used: 0 (0 B)

DFS Remaining: 2283614864188 (2.08 TB)

DFS Used%: 41.87%

DFS Remaining%: 58.13%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 2

Last contact: Tue Sep 08 11:15:47 EDT 2015
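
Block placement and replication for the specific file named in the error can also be checked with fsck (a sketch, using the path from the error above):

root@twig03:~# hdfs fsck /warehouse2/completed/events/connection_events/1441707300/1441707312267-9-f98c71d9-dff0-4dad-8220-01b7a8a5e1d5.parquet -files -blocks -locations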








Re: BlockMissingException

Posted by "Grant Overby (groverby)" <gr...@cisco.com>.
Some new observations:

This consistently fails:
SELECT * FROM dfs.events.`connection_events` WHERE dir0 = 1441747800 LIMIT 25;

This is consistently successful:
SELECT count(*) FROM dfs.events.`connection_events` WHERE dir0 = 1441747800;


The error message shows varying fragment ids, such as:
1:22
1:7
1:24
1:1

The error message also references various files.
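
One further way to narrow it down (a sketch, not something tried above) would be to project a single real column instead of *, so that Parquet column data is still read but the projection stays small; initiator_ip is one of the columns referenced later in this thread:

SELECT initiator_ip FROM dfs.events.`connection_events` WHERE dir0 = 1441747800 LIMIT 25;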







Re: BlockMissingException

Posted by Steven Phillips <st...@dremio.com>.
The errors like: "java.io.FileNotFoundException: Path is not a file:
/warehouse2/completed/events/connection_events/1441290600"
are really just noise, and aren't related to the failure. We should
probably clean them up so that we aren't attempting to open directories,
but they are not causing the queries to fail.

The problem is the BlockMissing exception. I don't know what would cause
that exception, other than the block simply being missing, which appears to
not be the case since the filesystem seems healthy.
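
One way to dig further on the HDFS side (a sketch; the log location is an assumption and varies by install) would be to grep the datanode logs on both nodes for the block id from the error, to see whether the datanodes logged any refused or failed reads around the time of the query:

root@twig03:~# grep blk_1073829563 /var/log/hadoop-hdfs/*datanode*.log   # log path is an assumption, adjust for the local install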

On Wed, Sep 9, 2015 at 12:51 PM, Grant Overby (groverby) <groverby@cisco.com
> wrote:

> This failure prints more to the log than the first failure from my
> original email. The log from this failure is here:
> http://pastebin.com/GQMxdckJ
>
>
>
>
> 0: jdbc:drill:> SELECT *
>
> . . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
>
> . . . . . . . > where dir0 = 1441747800
>
> . . . . . . . >    LIMIT 25;
>
> Error: SYSTEM ERROR: BlockMissingException: Could not obtain block:
> BP-1605794487-10.0.1.3-1435700184285:blk_1073829563_90291
> file=/warehouse2/completed/events/connection_events/1441747800/1441747805646-9-e4e1c8bd-a53a-46c1-8969-ea0ddaa74c16.parquet
>
>
> Fragment 1:7
>
>
> [Error Id: f26a3f0b-b52b-45a0-bc1c-8172606be740 on twig03.twigs:31010]
> (state=,code=0)
>
> 0: jdbc:drill:>
>
>
>
>
>
>
>
> From: "Grant Overby (groverby)" <groverby@cisco.com<mailto:
> groverby@cisco.com>>
> Date: Wednesday, September 9, 2015 at 3:44 PM
> To: Abdel Hakim Deneche <adeneche@maprtech.com<mailto:
> adeneche@maprtech.com>>, user <user@drill.apache.org<mailto:
> user@drill.apache.org>>
> Subject: Re: BlockMissingException
>
> I didn’t see the previous error in log before, but that could be my fault.
>
> The exception in the log is correct.
> /warehouse2/completed/events/connection_events/1441290600 is indeed a
> directory, not a regular file; however, I don’t understand why Drill would
> expect it to be a regular file.
>
> From CLI:
>
> 0: jdbc:drill:> SELECT *
>
> . . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
>
> . . . . . . . > where dir0 = 1441747800
>
> . . . . . . . >  ORDER BY rna_flow_stats.initiator_ip DESC,
> rna_flow_stats.first_packet_second DESC, rna_flow_stats.last_packet_second
> DESC
>
> . . . . . . . >    LIMIT 25;
>
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR:
> BlockMissingException: Could not obtain block:
> BP-1605794487-10.0.1.3-1435700184285:blk_1073829562_90290
> file=/warehouse2/completed/events/connection_events/1441747800/1441747805713-13-eff51ede-e264-4a77-b1dd-4813f3f4d518.parquet
>
>
> Fragment 3:29
>
>
> [Error Id: 79c00987-6de8-41a9-a666-4af771f32c7e on twig03.twigs:31010]
>
> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>
> at
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>
> at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>
> at sqlline.SqlLine.print(SqlLine.java:1583)
>
> at sqlline.Commands.execute(Commands.java:852)
>
> at sqlline.Commands.sql(Commands.java:751)
>
> at sqlline.SqlLine.dispatch(SqlLine.java:738)
>
> at sqlline.SqlLine.begin(SqlLine.java:612)
>
> at sqlline.SqlLine.start(SqlLine.java:366)
>
> at sqlline.SqlLine.main(SqlLine.java:259)
>
> 0: jdbc:drill:>
>
>
> From log:
>
> 2015-09-09 15:35:00,579 [2a0f761a-eac6-5399-a93e-f382cfd00cd1:foreman]
> INFO  o.a.d.e.store.mock.MockStorageEngine - Failure while attempting to
> check for Parquet metadata file.
>
> java.io.FileNotFoundException: Path is not a file:
> /warehouse2/completed/events/connection_events/1441290600
>
>         at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72)
>
>         at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:58)
>
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1903)
>
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1844)
>
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1824)
>
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1796)
>
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:554)
>
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:364)
>
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>
>
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) ~[na:1.7.0_79]
>
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> ~[na:1.7.0_79]
>
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> ~[na:1.7.0_79]
>
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> ~[na:1.7.0_79]
>
>         at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> ~[hadoop-common-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
> ~[hadoop-common-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1144)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1132)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1122)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> ~[hadoop-common-2.4.1.jar:na]
>
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
> ~[hadoop-hdfs-2.4.1.jar:na]
>
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
> ~[hadoop-common-2.4.1.jar:na]
>
>         at
> org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:128)
> ~[drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:139)
> ~[drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.store.dfs.BasicFormatMatcher.isReadable(BasicFormatMatcher.java:108)
> ~[drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:226)
> [drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:206)
> [drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:291)
> [drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:118)
> [drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96)
> [drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90)
> [drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:241)
> [drill-java-exec-1.1.0.jar:1.1.0]
>
>         at
> org.apache.calcite.jdbc.SimpleCalciteSchema.getTable(SimpleCalciteSchema.java:117)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom(CalciteCatalogReader.java:117)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:100)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:1)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.EmptyScope.getTableNamespace(EmptyScope.java:75)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace(DelegatingScope.java:124)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:104)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:874)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2745)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2730)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2953)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:874)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:837)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:552)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:174)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>         at
> org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:185)
> [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
>
>
>
>
>
>
> I tried the following similar queries:
>
> This was successful:
>
> 0: jdbc:drill:> SELECT count(*)
>
> . . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
>
> . . . . . . . > where dir0 = 1441747800;
>
>
> This failed:
>
> 0: jdbc:drill:> SELECT *
>
> . . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
>
> . . . . . . . > where dir0 = 1441747800
>
> . . . . . . . >    LIMIT 25;
>
> Error: SYSTEM ERROR: BlockMissingException: Could not obtain block:
> BP-1605794487-10.0.1.3-1435700184285:blk_1073829567_90295
> file=/warehouse2/completed/events/connection_events/1441747800/1441747810272-4-609d62fe-b81b-4a35-acc9-42959a84a78d.parquet
>
>
> Fragment 1:22
>
>
> [Error Id: fbfb18e4-67d8-44c9-8a60-9f48d3798f4e on twig04.twigs:31010]
> (state=,code=0)
>
>
>
>
>
>
>
>
>
>
>
>
>
> From: Abdel Hakim Deneche <adeneche@maprtech.com<mailto:
> adeneche@maprtech.com>>
> Date: Wednesday, September 9, 2015 at 1:36 PM
> To: user <us...@drill.apache.org>>
> Cc: "Grant Overby (groverby)" <groverby@cisco.com<mailto:
> groverby@cisco.com>>
> Subject: Re: BlockMissingException
>
> Hi Grant,
>
> Do you see any other errors in the logs ?
>
> I don't think the WorkEventBus warning has anything to do with the issue.
> It's a warning you can expect to see for failed/cancelled queries.
>
> Thanks
>
> On Wed, Sep 9, 2015 at 10:32 AM, Grant Overby (groverby) <
> groverby@cisco.com<ma...@cisco.com>> wrote:
> I'm still getting this, but it seems to go away after a while. It happens
> with multiple blocks.
>
> I'm seeing the following lines in logs. I suspect it's related:
>
>
> 2015-09-09 12:38:12,556 [WorkManager-1147] WARN
> o.a.d.exec.rpc.control.WorkEventBus - Fragment
> 2a0f9fbf-f85b-39a3-dce2-891d7f62b385:1:40 not found in the work bus.
>
> Any help or pointers would be greatly appreciated.
>
>
>
>
>
>
> From: "Grant Overby (groverby)" <groverby@cisco.com<mailto:
> groverby@cisco.com><ma...@cisco.com>>>
> Date: Tuesday, September 8, 2015 at 11:17 AM
> To: "user@drill.apache.org<ma...@drill.apache.org><mailto:
> user@drill.apache.org<ma...@drill.apache.org>>" <
> user@drill.apache.org<ma...@drill.apache.org><mailto:
> user@drill.apache.org<ma...@drill.apache.org>>>
> Subject: BlockMissingException
>
> Drill is throwing a block missing exception; however, hdfs seems healthy.
> Thoughts?
>
> From Drill's web ui after executing a query:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> BlockMissingException: Could not obtain block:
> BP-1605794487-10.0.1.3-1435700184285:blk_1073828756_89484
> file=/warehouse2/completed/events/connection_events/1441707300/1441707312267-9-f98c71d9-dff0-4dad-8220-01b7a8a5e1d5.parquet
> Fragment 1:19 [Error Id: 9a0b442a-e5e1-44f9-9200-fff9f59b990a on
> twig03.twigs:31010]
>
> Retrieving the file from hdfs:
>
> root@twig03:~# hdfs dfs -get
> /warehouse2/completed/events/connection_events/1441707300/1441707312267-9-f98c71d9-dff0-4dad-8220-01b7a8a5e1d5.parquet
> /tmp/.
>
> root@twig03:~# ls /tmp/*.parquet
>
> /tmp/1441707312267-9-f98c71d9-dff0-4dad-8220-01b7a8a5e1d5.parquet
>
> hdfs report:
>
> root@twig03:~# hdfs dfsadmin -report
>
> Configured Capacity: 7856899358720 (7.15 TB)
>
> Present Capacity: 7856899358720 (7.15 TB)
>
> DFS Remaining: 4567228003960 (4.15 TB)
>
> DFS Used: 3289671354760 (2.99 TB)
>
> DFS Used%: 41.87%
>
> Under replicated blocks: 22108
>
> Blocks with corrupt replicas: 0
>
> Missing blocks: 0
>
>
> -------------------------------------------------
>
> Live datanodes (2):
>
>
> Name: 10.0.1.4:50010<http://10.0.1.4:50010> (twig04.twigs)
>
> Hostname: twig04.twigs
>
> Decommission Status : Normal
>
> Configured Capacity: 3928449679360 (3.57 TB)
>
> DFS Used: 1644836539588 (1.50 TB)
>
> Non DFS Used: 0 (0 B)
>
> DFS Remaining: 2283613139772 (2.08 TB)
>
> DFS Used%: 41.87%
>
> DFS Remaining%: 58.13%
>
> Configured Cache Capacity: 0 (0 B)
>
> Cache Used: 0 (0 B)
>
> Cache Remaining: 0 (0 B)
>
> Cache Used%: 100.00%
>
> Cache Remaining%: 0.00%
>
> Xceivers: 3
>
> Last contact: Tue Sep 08 11:15:47 EDT 2015
>
>
>
> Name: 10.0.1.3:50010<http://10.0.1.3:50010> (twig03.twigs)
>
> Hostname: twig03.twigs
>
> Decommission Status : Normal
>
> Configured Capacity: 3928449679360 (3.57 TB)
>
> DFS Used: 1644834815172 (1.50 TB)
>
> Non DFS Used: 0 (0 B)
>
> DFS Remaining: 2283614864188 (2.08 TB)
>
> DFS Used%: 41.87%
>
> DFS Remaining%: 58.13%
>
> Configured Cache Capacity: 0 (0 B)
>
> Cache Used: 0 (0 B)
>
> Cache Remaining: 0 (0 B)
>
> Cache Used%: 100.00%
>
> Cache Remaining%: 0.00%
>
> Xceivers: 2
>
> Last contact: Tue Sep 08 11:15:47 EDT 2015
>
>
>
>
>
>
>
>
>
>
>

Re: BlockMissingException

Posted by "Grant Overby (groverby)" <gr...@cisco.com>.
This failure prints more to the log than the first failure from my original email. The log from this failure is here: http://pastebin.com/GQMxdckJ




0: jdbc:drill:> SELECT *

. . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats

. . . . . . . > where dir0 = 1441747800

. . . . . . . >    LIMIT 25;

Error: SYSTEM ERROR: BlockMissingException: Could not obtain block: BP-1605794487-10.0.1.3-1435700184285:blk_1073829563_90291 file=/warehouse2/completed/events/connection_events/1441747800/1441747805646-9-e4e1c8bd-a53a-46c1-8969-ea0ddaa74c16.parquet


Fragment 1:7


[Error Id: f26a3f0b-b52b-45a0-bc1c-8172606be740 on twig03.twigs:31010] (state=,code=0)

0: jdbc:drill:>







From: "Grant Overby (groverby)" <gr...@cisco.com>>
Date: Wednesday, September 9, 2015 at 3:44 PM
To: Abdel Hakim Deneche <ad...@maprtech.com>>, user <us...@drill.apache.org>>
Subject: Re: BlockMissingException

I didn’t see the previous error in log before, but that could be my fault.

The exception in the log is correct. /warehouse2/completed/events/connection_events/1441290600 is indeed a directory, not a regular file; however, I don’t understand why Drill would expect it to be a regular file.
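
Listing the path shows what it actually contains (a sketch):

root@twig03:~# hdfs dfs -ls /warehouse2/completed/events/connection_events/1441290600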

From CLI:

0: jdbc:drill:> SELECT *

. . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats

. . . . . . . > where dir0 = 1441747800

. . . . . . . >  ORDER BY rna_flow_stats.initiator_ip DESC, rna_flow_stats.first_packet_second DESC, rna_flow_stats.last_packet_second DESC

. . . . . . . >    LIMIT 25;

java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: BlockMissingException: Could not obtain block: BP-1605794487-10.0.1.3-1435700184285:blk_1073829562_90290 file=/warehouse2/completed/events/connection_events/1441747800/1441747805713-13-eff51ede-e264-4a77-b1dd-4813f3f4d518.parquet


Fragment 3:29


[Error Id: 79c00987-6de8-41a9-a666-4af771f32c7e on twig03.twigs:31010]

at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)

at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)

at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)

at sqlline.SqlLine.print(SqlLine.java:1583)

at sqlline.Commands.execute(Commands.java:852)

at sqlline.Commands.sql(Commands.java:751)

at sqlline.SqlLine.dispatch(SqlLine.java:738)

at sqlline.SqlLine.begin(SqlLine.java:612)

at sqlline.SqlLine.start(SqlLine.java:366)

at sqlline.SqlLine.main(SqlLine.java:259)

0: jdbc:drill:>


From log:

2015-09-09 15:35:00,579 [2a0f761a-eac6-5399-a93e-f382cfd00cd1:foreman] INFO  o.a.d.e.store.mock.MockStorageEngine - Failure while attempting to check for Parquet metadata file.

java.io.FileNotFoundException: Path is not a file: /warehouse2/completed/events/connection_events/1441290600

        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72)

        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:58)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1903)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1844)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1824)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1796)

        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:554)

        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:364)

        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)


        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.7.0_79]

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) ~[na:1.7.0_79]

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.7.0_79]

        at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[na:1.7.0_79]

        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1144) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1132) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1122) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:128) ~[drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:139) ~[drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.BasicFormatMatcher.isReadable(BasicFormatMatcher.java:108) ~[drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:226) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:206) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:291) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:118) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:241) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.calcite.jdbc.SimpleCalciteSchema.getTable(SimpleCalciteSchema.java:117) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom(CalciteCatalogReader.java:117) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:100) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:1) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.EmptyScope.getTableNamespace(EmptyScope.java:75) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace(DelegatingScope.java:124) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:104) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:874) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2745) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2730) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2953) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:874) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:837) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:552) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:174) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:185) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]






I tried the following similar queries:

This was successful:

0: jdbc:drill:> SELECT count(*)
. . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
. . . . . . . > where dir0 = 1441747800;


This failed:

0: jdbc:drill:> SELECT *
. . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
. . . . . . . > where dir0 = 1441747800
. . . . . . . >    LIMIT 25;

Error: SYSTEM ERROR: BlockMissingException: Could not obtain block: BP-1605794487-10.0.1.3-1435700184285:blk_1073829567_90295 file=/warehouse2/completed/events/connection_events/1441747800/1441747810272-4-609d62fe-b81b-4a35-acc9-42959a84a78d.parquet


Fragment 1:22


[Error Id: fbfb18e4-67d8-44c9-8a60-9f48d3798f4e on twig04.twigs:31010] (state=,code=0)
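
For completeness, running fsck against the exact file named in the error should show whether the namenode can still account for that block and where its replicas are supposed to live (path copied from the error above; output omitted):

root@twig03:~# hdfs fsck /warehouse2/completed/events/connection_events/1441747800/1441747810272-4-609d62fe-b81b-4a35-acc9-42959a84a78d.parquet -files -blocks -locations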









Re: BlockMissingException

Posted by "Grant Overby (groverby)" <gr...@cisco.com>.
I didn’t see the previous error in the log before, but that could be my fault.

The exception in the log is correct. /warehouse2/completed/events/connection_events/1441290600 is indeed a directory, not a regular file; however, I don’t understand why Drill would expect it to be a regular file.
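
The directory status is easy to confirm from the shell, and listing the first few entries under it shows the parquet files a planner-side format check would be sampling (that reading of the planner behaviour is my interpretation of the isDirReadable frames in the trace below):

root@twig03:~# hdfs dfs -test -d /warehouse2/completed/events/connection_events/1441290600 && echo "is a directory"

root@twig03:~# hdfs dfs -ls /warehouse2/completed/events/connection_events/1441290600 | head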

From CLI:

0: jdbc:drill:> SELECT *
. . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
. . . . . . . > where dir0 = 1441747800
. . . . . . . >  ORDER BY rna_flow_stats.initiator_ip DESC, rna_flow_stats.first_packet_second DESC, rna_flow_stats.last_packet_second DESC
. . . . . . . >    LIMIT 25;

java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: BlockMissingException: Could not obtain block: BP-1605794487-10.0.1.3-1435700184285:blk_1073829562_90290 file=/warehouse2/completed/events/connection_events/1441747800/1441747805713-13-eff51ede-e264-4a77-b1dd-4813f3f4d518.parquet


Fragment 3:29


[Error Id: 79c00987-6de8-41a9-a666-4af771f32c7e on twig03.twigs:31010]

at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)

at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)

at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)

at sqlline.SqlLine.print(SqlLine.java:1583)

at sqlline.Commands.execute(Commands.java:852)

at sqlline.Commands.sql(Commands.java:751)

at sqlline.SqlLine.dispatch(SqlLine.java:738)

at sqlline.SqlLine.begin(SqlLine.java:612)

at sqlline.SqlLine.start(SqlLine.java:366)

at sqlline.SqlLine.main(SqlLine.java:259)

0: jdbc:drill:>
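
The error id should also be written to drillbit.log on the node named in the error, so grepping for it on twig03 ought to pull up the full server-side stack (the log path below is just an example; it depends on how the drillbits were installed):

root@twig03:~# grep -A 50 79c00987-6de8-41a9-a666-4af771f32c7e /var/log/drill/drillbit.log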


From log:

2015-09-09 15:35:00,579 [2a0f761a-eac6-5399-a93e-f382cfd00cd1:foreman] INFO  o.a.d.e.store.mock.MockStorageEngine - Failure while attempting to check for Parquet metadata file.

java.io.FileNotFoundException: Path is not a file: /warehouse2/completed/events/connection_events/1441290600

        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72)

        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:58)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1903)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1844)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1824)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1796)

        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:554)

        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:364)

        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)


        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.7.0_79]

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) ~[na:1.7.0_79]

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.7.0_79]

        at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[na:1.7.0_79]

        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1144) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1132) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1122) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296) ~[hadoop-hdfs-2.4.1.jar:na]

        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764) ~[hadoop-common-2.4.1.jar:na]

        at org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:128) ~[drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:139) ~[drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.BasicFormatMatcher.isReadable(BasicFormatMatcher.java:108) ~[drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:226) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:206) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:291) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:118) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:241) [drill-java-exec-1.1.0.jar:1.1.0]

        at org.apache.calcite.jdbc.SimpleCalciteSchema.getTable(SimpleCalciteSchema.java:117) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom(CalciteCatalogReader.java:117) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:100) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:1) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.EmptyScope.getTableNamespace(EmptyScope.java:75) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace(DelegatingScope.java:124) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:104) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:874) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2745) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2730) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2953) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:874) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:837) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:552) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:174) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]

        at org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:185) [calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
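
Reading this trace, the failure seems to happen during planning-side format detection (the MagicStringMatcher opening a file to look at its header bytes) rather than during the scan itself. If it helps, the Parquet magic can be checked by hand on one of the files from the failing partition; the first four bytes of a healthy parquet file should print PAR1 (file path copied from the error earlier in this message):

root@twig03:~# hdfs dfs -cat /warehouse2/completed/events/connection_events/1441747800/1441747805713-13-eff51ede-e264-4a77-b1dd-4813f3f4d518.parquet | head -c 4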






I tried the following similar queries:

This was successful:

0: jdbc:drill:> SELECT count(*)
. . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
. . . . . . . > where dir0 = 1441747800;


This failed:

0: jdbc:drill:> SELECT *
. . . . . . . >    FROM dfs.events.`connection_events` AS rna_flow_stats
. . . . . . . > where dir0 = 1441747800
. . . . . . . >    LIMIT 25;

Error: SYSTEM ERROR: BlockMissingException: Could not obtain block: BP-1605794487-10.0.1.3-1435700184285:blk_1073829567_90295 file=/warehouse2/completed/events/connection_events/1441747800/1441747810272-4-609d62fe-b81b-4a35-acc9-42959a84a78d.parquet


Fragment 1:22


[Error Id: fbfb18e4-67d8-44c9-8a60-9f48d3798f4e on twig04.twigs:31010] (state=,code=0)









Re: BlockMissingException

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
Hi Grant,

Do you see any other errors in the logs?

I don't think the WorkEventBus warning has anything to do with the issue.
It's a warning you can expect to see for failed/cancelled queries.

Thanks


Re: BlockMissingException

Posted by "Grant Overby (groverby)" <gr...@cisco.com>.
I'm still getting this, but it seems to go away after a while. It happens with multiple blocks.

I'm seeing the following lines in the logs. I suspect they're related:


2015-09-09 12:38:12,556 [WorkManager-1147] WARN  o.a.d.exec.rpc.control.WorkEventBus - Fragment 2a0f9fbf-f85b-39a3-dce2-891d7f62b385:1:40 not found in the work bus.

Any help or pointers would be greatly appreciated.
