You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Thomas Brugiere (JIRA)" <ji...@apache.org> on 2018/10/01 14:01:00 UTC
[jira] [Created] (SPARK-25582) Error in Spark logs when using the
org.apache.spark:spark-sql_2.11:2.2.0 Java library
Thomas Brugiere created SPARK-25582:
---------------------------------------
Summary: Error in Spark logs when using the org.apache.spark:spark-sql_2.11:2.2.0 Java library
Key: SPARK-25582
URL: https://issues.apache.org/jira/browse/SPARK-25582
Project: Spark
Issue Type: Bug
Components: Java API
Affects Versions: 2.2.0
Reporter: Thomas Brugiere
I have noticed an error that appears in the Spark logs when using the Spark SQL library in a Java 8 project.
When I run the code below with the attached files as input, I can see the ERROR below in the application logs:
*Code*
{code:java}
SparkConf conf = new SparkConf().setAppName("SparkBug").setMaster("local");
SparkSession sparkSession = SparkSession.builder().config(conf).getOrCreate();
Dataset<Row> df_a = sparkSession.read().option("header", true).csv("local/fileA.csv").dropDuplicates();
Dataset<Row> df_b = sparkSession.read().option("header", true).csv("local/fileB.csv").dropDuplicates();
Dataset<Row> df_c = sparkSession.read().option("header", true).csv("local/fileC.csv").dropDuplicates();
String[] key_join_1 = new String[]{"colA", "colB", "colC", "colD", "colE", "colF"};
String[] key_join_2 = new String[]{"colA", "colB", "colC", "colD", "colE"};
Dataset<Row> df_inventory_1 = df_a.join(df_b, arrayToSeq(key_join_1), "left");
Dataset<Row> df_inventory_2 = df_inventory_1.join(df_c, arrayToSeq(key_join_2), "left");
df_inventory_2.show();
{code}
*Error message*
{code:java}
18/10/01 09:58:07 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 202, Column 18: Expression "agg_isNull_28" is not an rvalue
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 202, Column 18: Expression "agg_isNull_28" is not an rvalue
at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)
at org.codehaus.janino.UnitCompiler.toRvalueOrCompileException(UnitCompiler.java:7170)
at org.codehaus.janino.UnitCompiler.getConstantValue2(UnitCompiler.java:5332)
at org.codehaus.janino.UnitCompiler.access$9400(UnitCompiler.java:212)
at org.codehaus.janino.UnitCompiler$13$1.visitAmbiguousName(UnitCompiler.java:5287)
at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:4053)
at org.codehaus.janino.UnitCompiler$13.visitLvalue(UnitCompiler.java:5284)
at org.codehaus.janino.Java$Lvalue.accept(Java.java:3977)
at org.codehaus.janino.UnitCompiler.getConstantValue(UnitCompiler.java:5280)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2391)
at org.codehaus.janino.UnitCompiler.access$1900(UnitCompiler.java:212)
at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1474)
at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1466)
at org.codehaus.janino.Java$IfStatement.accept(Java.java:2926)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1466)
at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1546)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3075)
at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1336)
at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1309)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:799)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:958)
at org.codehaus.janino.UnitCompiler.access$700(UnitCompiler.java:212)
at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:393)
at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:385)
at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1286)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:385)
at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1285)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:825)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:411)
at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:212)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:390)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:385)
at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1405)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:385)
at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:357)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:234)
at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:446)
at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:313)
at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:235)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:204)
at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:1417)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1493)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1490)
at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
at org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1365)
at org.apache.spark.sql.execution.WholeStageCodegenExec.liftedTree1$1(WholeStageCodegenExec.scala:579)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:578)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:337)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3278)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2489)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2703)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
at org.apache.spark.sql.Dataset.show(Dataset.scala:723)
at org.apache.spark.sql.Dataset.show(Dataset.scala:682)
at org.apache.spark.sql.Dataset.show(Dataset.scala:691)
at SparkBug.main(SparkBug.java:30)
18/10/01 09:58:07 INFO CodeGenerator:
/* 001 */ public Object generate(Object[] references) {
/* 002 */ return new GeneratedIteratorForCodegenStage6(references);
/* 003 */ }
/* 004 */
/* 005 */ final class GeneratedIteratorForCodegenStage6 extends org.apache.spark.sql.execution.BufferedRowIterator {
/* 006 */ private Object[] references;
/* 007 */ private scala.collection.Iterator[] inputs;
/* 008 */ private boolean agg_initAgg_0;
/* 009 */ private org.apache.spark.unsafe.KVIterator agg_mapIter_0;
/* 010 */ private org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap agg_hashMap_0;
/* 011 */ private org.apache.spark.sql.execution.UnsafeKVExternalSorter agg_sorter_0;
/* 012 */ private scala.collection.Iterator inputadapter_input_0;
/* 013 */ private org.apache.spark.sql.execution.joins.UnsafeHashedRelation bhj_relation_0;
/* 014 */ private org.apache.spark.sql.execution.joins.UnsafeHashedRelation bhj_relation_1;
/* 015 */ private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder[] agg_mutableStateArray_1 = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder[8];
/* 016 */ private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[] agg_mutableStateArray_2 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[8];
/* 017 */ private UnsafeRow[] agg_mutableStateArray_0 = new UnsafeRow[8];
/* 018 */
/* 019 */ public GeneratedIteratorForCodegenStage6(Object[] references) {
/* 020 */ this.references = references;
/* 021 */ }
/* 022 */
/* 023 */ public void init(int index, scala.collection.Iterator[] inputs) {
/* 024 */ partitionIndex = index;
/* 025 */ this.inputs = inputs;
/* 026 */ wholestagecodegen_init_0_0();
/* 027 */ wholestagecodegen_init_0_1();
/* 028 */ wholestagecodegen_init_0_2();
/* 029 */
/* 030 */ }
/* 031 */
/* 032 */ private void wholestagecodegen_init_0_2() {
/* 033 */ agg_mutableStateArray_0[5] = new UnsafeRow(5);
/* 034 */ agg_mutableStateArray_1[5] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[5], 160);
/* 035 */ agg_mutableStateArray_2[5] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[5], 5);
/* 036 */ agg_mutableStateArray_0[6] = new UnsafeRow(23);
/* 037 */ agg_mutableStateArray_1[6] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[6], 736);
/* 038 */ agg_mutableStateArray_2[6] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[6], 23);
/* 039 */ agg_mutableStateArray_0[7] = new UnsafeRow(18);
/* 040 */ agg_mutableStateArray_1[7] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[7], 576);
/* 041 */ agg_mutableStateArray_2[7] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[7], 18);
/* 042 */
/* 043 */ }
/* 044 */
/* 045 */ private void agg_doAggregateWithKeysOutput_0(UnsafeRow agg_keyTerm_0, UnsafeRow agg_bufferTerm_0)
/* 046 */ throws java.io.IOException {
/* 047 */ ((org.apache.spark.sql.execution.metric.SQLMetric) references[4] /* numOutputRows */).add(1);
/* 048 */
/* 049 */ boolean agg_isNull_22 = agg_keyTerm_0.isNullAt(1);
/* 050 */ UTF8String agg_value_22 = agg_isNull_22 ? null : (agg_keyTerm_0.getUTF8String(1));
/* 051 */ boolean agg_isNull_23 = agg_keyTerm_0.isNullAt(6);
/* 052 */ UTF8String agg_value_23 = agg_isNull_23 ? null : (agg_keyTerm_0.getUTF8String(6));
/* 053 */ boolean agg_isNull_24 = agg_keyTerm_0.isNullAt(3);
/* 054 */ UTF8String agg_value_24 = agg_isNull_24 ? null : (agg_keyTerm_0.getUTF8String(3));
/* 055 */ boolean agg_isNull_25 = agg_keyTerm_0.isNullAt(4);
/* 056 */ UTF8String agg_value_25 = agg_isNull_25 ? null : (agg_keyTerm_0.getUTF8String(4));
/* 057 */ boolean agg_isNull_26 = agg_keyTerm_0.isNullAt(2);
/* 058 */ UTF8String agg_value_26 = agg_isNull_26 ? null : (agg_keyTerm_0.getUTF8String(2));
/* 059 */ boolean agg_isNull_27 = agg_keyTerm_0.isNullAt(0);
/* 060 */ UTF8String agg_value_27 = agg_isNull_27 ? null : (agg_keyTerm_0.getUTF8String(0));
/* 061 */
/* 062 */ // generate join key for stream side
/* 063 */
/* 064 */ agg_mutableStateArray_1[2].reset();
/* 065 */
/* 066 */ agg_mutableStateArray_2[2].zeroOutNullBytes();
/* 067 */
/* 068 */ if (agg_isNull_22) {
/* 069 */ agg_mutableStateArray_2[2].setNullAt(0);
/* 070 */ } else {
/* 071 */ agg_mutableStateArray_2[2].write(0, agg_value_22);
/* 072 */ }
/* 073 */
/* 074 */ if (agg_isNull_23) {
/* 075 */ agg_mutableStateArray_2[2].setNullAt(1);
/* 076 */ } else {
/* 077 */ agg_mutableStateArray_2[2].write(1, agg_value_23);
/* 078 */ }
/* 079 */
/* 080 */ if (agg_isNull_24) {
/* 081 */ agg_mutableStateArray_2[2].setNullAt(2);
/* 082 */ } else {
/* 083 */ agg_mutableStateArray_2[2].write(2, agg_value_24);
/* 084 */ }
/* 085 */
/* 086 */ if (agg_isNull_25) {
/* 087 */ agg_mutableStateArray_2[2].setNullAt(3);
/* 088 */ } else {
/* 089 */ agg_mutableStateArray_2[2].write(3, agg_value_25);
/* 090 */ }
/* 091 */
/* 092 */ if (agg_isNull_26) {
/* 093 */ agg_mutableStateArray_2[2].setNullAt(4);
/* 094 */ } else {
/* 095 */ agg_mutableStateArray_2[2].write(4, agg_value_26);
/* 096 */ }
/* 097 */
/* 098 */ if (agg_isNull_27) {
/* 099 */ agg_mutableStateArray_2[2].setNullAt(5);
/* 100 */ } else {
/* 101 */ agg_mutableStateArray_2[2].write(5, agg_value_27);
/* 102 */ }
/* 103 */ agg_mutableStateArray_0[2].setTotalSize(agg_mutableStateArray_1[2].totalSize());
/* 104 */
/* 105 */ // find matches from HashedRelation
/* 106 */ UnsafeRow bhj_matched_0 = agg_mutableStateArray_0[2].anyNull() ? null: (UnsafeRow)bhj_relation_0.getValue(agg_mutableStateArray_0[2]);
/* 107 */ final boolean bhj_conditionPassed_0 = true;
/* 108 */ if (!bhj_conditionPassed_0) {
/* 109 */ bhj_matched_0 = null;
/* 110 */ // reset the variables those are already evaluated.
/* 111 */
/* 112 */ }
/* 113 */ ((org.apache.spark.sql.execution.metric.SQLMetric) references[7] /* numOutputRows */).add(1);
/* 114 */
/* 115 */ // generate join key for stream side
/* 116 */
/* 117 */ agg_mutableStateArray_1[5].reset();
/* 118 */
/* 119 */ agg_mutableStateArray_2[5].zeroOutNullBytes();
/* 120 */
/* 121 */ if (agg_isNull_22) {
/* 122 */ agg_mutableStateArray_2[5].setNullAt(0);
/* 123 */ } else {
/* 124 */ agg_mutableStateArray_2[5].write(0, agg_value_22);
/* 125 */ }
/* 126 */
/* 127 */ if (agg_isNull_23) {
/* 128 */ agg_mutableStateArray_2[5].setNullAt(1);
/* 129 */ } else {
/* 130 */ agg_mutableStateArray_2[5].write(1, agg_value_23);
/* 131 */ }
/* 132 */
/* 133 */ if (agg_isNull_24) {
/* 134 */ agg_mutableStateArray_2[5].setNullAt(2);
/* 135 */ } else {
/* 136 */ agg_mutableStateArray_2[5].write(2, agg_value_24);
/* 137 */ }
/* 138 */
/* 139 */ if (agg_isNull_25) {
/* 140 */ agg_mutableStateArray_2[5].setNullAt(3);
/* 141 */ } else {
/* 142 */ agg_mutableStateArray_2[5].write(3, agg_value_25);
/* 143 */ }
/* 144 */
/* 145 */ if (agg_isNull_26) {
/* 146 */ agg_mutableStateArray_2[5].setNullAt(4);
/* 147 */ } else {
/* 148 */ agg_mutableStateArray_2[5].write(4, agg_value_26);
/* 149 */ }
/* 150 */ agg_mutableStateArray_0[5].setTotalSize(agg_mutableStateArray_1[5].totalSize());
/* 151 */
/* 152 */ // find matches from HashedRelation
/* 153 */ UnsafeRow bhj_matched_1 = agg_mutableStateArray_0[5].anyNull() ? null: (UnsafeRow)bhj_relation_1.getValue(agg_mutableStateArray_0[5]);
/* 154 */ final boolean bhj_conditionPassed_1 = true;
/* 155 */ if (!bhj_conditionPassed_1) {
/* 156 */ bhj_matched_1 = null;
/* 157 */ // reset the variables those are already evaluated.
/* 158 */
/* 159 */ }
/* 160 */ ((org.apache.spark.sql.execution.metric.SQLMetric) references[10] /* numOutputRows */).add(1);
/* 161 */
/* 162 */ agg_mutableStateArray_1[7].reset();
/* 163 */
/* 164 */ agg_mutableStateArray_2[7].zeroOutNullBytes();
/* 165 */
/* 166 */ if (agg_isNull_22) {
/* 167 */ agg_mutableStateArray_2[7].setNullAt(0);
/* 168 */ } else {
/* 169 */ agg_mutableStateArray_2[7].write(0, agg_value_22);
/* 170 */ }
/* 171 */
/* 172 */ if (agg_isNull_23) {
/* 173 */ agg_mutableStateArray_2[7].setNullAt(1);
/* 174 */ } else {
/* 175 */ agg_mutableStateArray_2[7].write(1, agg_value_23);
/* 176 */ }
/* 177 */
/* 178 */ if (agg_isNull_24) {
/* 179 */ agg_mutableStateArray_2[7].setNullAt(2);
/* 180 */ } else {
/* 181 */ agg_mutableStateArray_2[7].write(2, agg_value_24);
/* 182 */ }
/* 183 */
/* 184 */ if (agg_isNull_25) {
/* 185 */ agg_mutableStateArray_2[7].setNullAt(3);
/* 186 */ } else {
/* 187 */ agg_mutableStateArray_2[7].write(3, agg_value_25);
/* 188 */ }
/* 189 */
/* 190 */ if (agg_isNull_26) {
/* 191 */ agg_mutableStateArray_2[7].setNullAt(4);
/* 192 */ } else {
/* 193 */ agg_mutableStateArray_2[7].write(4, agg_value_26);
/* 194 */ }
/* 195 */
/* 196 */ if (agg_isNull_27) {
/* 197 */ agg_mutableStateArray_2[7].setNullAt(5);
/* 198 */ } else {
/* 199 */ agg_mutableStateArray_2[7].write(5, agg_value_27);
/* 200 */ }
/* 201 */
/* 202 */ if (agg_isNull_28) {
/* 203 */ agg_mutableStateArray_2[7].setNullAt(6);
/* 204 */ } else {
/* 205 */ agg_mutableStateArray_2[7].write(6, agg_value_28);
/* 206 */ }
/* 207 */
/* 208 */ if (bhj_isNull_19) {
/* 209 */ agg_mutableStateArray_2[7].setNullAt(7);
/* 210 */ } else {
/* 211 */ agg_mutableStateArray_2[7].write(7, bhj_value_19);
/* 212 */ }
/* 213 */
/* 214 */ if (bhj_isNull_21) {
/* 215 */ agg_mutableStateArray_2[7].setNullAt(8);
/* 216 */ } else {
/* 217 */ agg_mutableStateArray_2[7].write(8, bhj_value_21);
/* 218 */ }
/* 219 */
/* 220 */ if (bhj_isNull_23) {
/* 221 */ agg_mutableStateArray_2[7].setNullAt(9);
/* 222 */ } else {
/* 223 */ agg_mutableStateArray_2[7].write(9, bhj_value_23);
/* 224 */ }
/* 225 */
/* 226 */ if (bhj_isNull_25) {
/* 227 */ agg_mutableStateArray_2[7].setNullAt(10);
/* 228 */ } else {
/* 229 */ agg_mutableStateArray_2[7].write(10, bhj_value_25);
/* 230 */ }
/* 231 */
/* 232 */ if (bhj_isNull_27) {
/* 233 */ agg_mutableStateArray_2[7].setNullAt(11);
/* 234 */ } else {
/* 235 */ agg_mutableStateArray_2[7].write(11, bhj_value_27);
/* 236 */ }
/* 237 */
/* 238 */ if (bhj_isNull_29) {
/* 239 */ agg_mutableStateArray_2[7].setNullAt(12);
/* 240 */ } else {
/* 241 */ agg_mutableStateArray_2[7].write(12, bhj_value_29);
/* 242 */ }
/* 243 */
/* 244 */ if (bhj_isNull_31) {
/* 245 */ agg_mutableStateArray_2[7].setNullAt(13);
/* 246 */ } else {
/* 247 */ agg_mutableStateArray_2[7].write(13, bhj_value_31);
/* 248 */ }
/* 249 */
/* 250 */ if (bhj_isNull_68) {
/* 251 */ agg_mutableStateArray_2[7].setNullAt(14);
/* 252 */ } else {
/* 253 */ agg_mutableStateArray_2[7].write(14, bhj_value_68);
/* 254 */ }
/* 255 */
/* 256 */ if (bhj_isNull_70) {
/* 257 */ agg_mutableStateArray_2[7].setNullAt(15);
/* 258 */ } else {
/* 259 */ agg_mutableStateArray_2[7].write(15, bhj_value_70);
/* 260 */ }
/* 261 */
/* 262 */ if (bhj_isNull_72) {
/* 263 */ agg_mutableStateArray_2[7].setNullAt(16);
/* 264 */ } else {
/* 265 */ agg_mutableStateArray_2[7].write(16, bhj_value_72);
/* 266 */ }
/* 267 */
/* 268 */ if (bhj_isNull_74) {
/* 269 */ agg_mutableStateArray_2[7].setNullAt(17);
/* 270 */ } else {
/* 271 */ agg_mutableStateArray_2[7].write(17, bhj_value_74);
/* 272 */ }
/* 273 */ agg_mutableStateArray_0[7].setTotalSize(agg_mutableStateArray_1[7].totalSize());
/* 274 */ append(agg_mutableStateArray_0[7]);
/* 275 */
/* 276 */ }
/* 277 */
/* 278 */ private void wholestagecodegen_init_0_1() {
/* 279 */ agg_mutableStateArray_0[2] = new UnsafeRow(6);
/* 280 */ agg_mutableStateArray_1[2] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[2], 192);
/* 281 */ agg_mutableStateArray_2[2] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[2], 6);
/* 282 */ agg_mutableStateArray_0[3] = new UnsafeRow(20);
/* 283 */ agg_mutableStateArray_1[3] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[3], 640);
/* 284 */ agg_mutableStateArray_2[3] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[3], 20);
/* 285 */ agg_mutableStateArray_0[4] = new UnsafeRow(14);
/* 286 */ agg_mutableStateArray_1[4] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[4], 448);
/* 287 */ agg_mutableStateArray_2[4] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[4], 14);
/* 288 */
/* 289 */ bhj_relation_1 = ((org.apache.spark.sql.execution.joins.UnsafeHashedRelation) ((org.apache.spark.broadcast.TorrentBroadcast) references[8] /* broadcast */).value()).asReadOnlyCopy();
/* 290 */ incPeakExecutionMemory(bhj_relation_1.estimatedSize());
/* 291 */
/* 292 */ org.apache.spark.TaskContext$.MODULE$.get().addTaskCompletionListener(new org.apache.spark.util.TaskCompletionListener() {
/* 293 */ @Override
/* 294 */ public void onTaskCompletion(org.apache.spark.TaskContext context) {
/* 295 */ ((org.apache.spark.sql.execution.metric.SQLMetric) references[9] /* avgHashProbe */).set(bhj_relation_1.getAverageProbesPerLookup());
/* 296 */ }
/* 297 */ });
/* 298 */
/* 299 */ }
/* 300 */
/* 301 */ private void agg_doConsume_0(InternalRow inputadapter_row_0, UTF8String agg_expr_0_0, boolean agg_exprIsNull_0_0, UTF8String agg_expr_1_0, boolean agg_exprIsNull_1_0, UTF8String agg_expr_2_0, boolean agg_exprIsNull_2_0, UTF8String agg_expr_3_0, boolean agg_exprIsNull_3_0, UTF8String agg_expr_4_0, boolean agg_exprIsNull_4_0, UTF8String agg_expr_5_0, boolean agg_exprIsNull_5_0, UTF8String agg_expr_6_0, boolean agg_exprIsNull_6_0) throws java.io.IOException {
/* 302 */ UnsafeRow agg_unsafeRowAggBuffer_0 = null;
/* 303 */
/* 304 */ // generate grouping key
/* 305 */ agg_mutableStateArray_1[0].reset();
/* 306 */
/* 307 */ agg_mutableStateArray_2[0].zeroOutNullBytes();
/* 308 */
/* 309 */ if (agg_exprIsNull_0_0) {
/* 310 */ agg_mutableStateArray_2[0].setNullAt(0);
/* 311 */ } else {
/* 312 */ agg_mutableStateArray_2[0].write(0, agg_expr_0_0);
/* 313 */ }
/* 314 */
/* 315 */ if (agg_exprIsNull_1_0) {
/* 316 */ agg_mutableStateArray_2[0].setNullAt(1);
/* 317 */ } else {
/* 318 */ agg_mutableStateArray_2[0].write(1, agg_expr_1_0);
/* 319 */ }
/* 320 */
/* 321 */ if (agg_exprIsNull_2_0) {
/* 322 */ agg_mutableStateArray_2[0].setNullAt(2);
/* 323 */ } else {
/* 324 */ agg_mutableStateArray_2[0].write(2, agg_expr_2_0);
/* 325 */ }
/* 326 */
/* 327 */ if (agg_exprIsNull_3_0) {
/* 328 */ agg_mutableStateArray_2[0].setNullAt(3);
/* 329 */ } else {
/* 330 */ agg_mutableStateArray_2[0].write(3, agg_expr_3_0);
/* 331 */ }
/* 332 */
/* 333 */ if (agg_exprIsNull_4_0) {
/* 334 */ agg_mutableStateArray_2[0].setNullAt(4);
/* 335 */ } else {
/* 336 */ agg_mutableStateArray_2[0].write(4, agg_expr_4_0);
/* 337 */ }
/* 338 */
/* 339 */ if (agg_exprIsNull_5_0) {
/* 340 */ agg_mutableStateArray_2[0].setNullAt(5);
/* 341 */ } else {
/* 342 */ agg_mutableStateArray_2[0].write(5, agg_expr_5_0);
/* 343 */ }
/* 344 */
/* 345 */ if (agg_exprIsNull_6_0) {
/* 346 */ agg_mutableStateArray_2[0].setNullAt(6);
/* 347 */ } else {
/* 348 */ agg_mutableStateArray_2[0].write(6, agg_expr_6_0);
/* 349 */ }
/* 350 */ agg_mutableStateArray_0[0].setTotalSize(agg_mutableStateArray_1[0].totalSize());
/* 351 */ int agg_value_14 = 42;
/* 352 */
/* 353 */ if (!agg_exprIsNull_0_0) {
/* 354 */ agg_value_14 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(agg_expr_0_0.getBaseObject(), agg_expr_0_0.getBaseOffset(), agg_expr_0_0.numBytes(), agg_value_14);
/* 355 */ }
/* 356 */
/* 357 */ if (!agg_exprIsNull_1_0) {
/* 358 */ agg_value_14 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(agg_expr_1_0.getBaseObject(), agg_expr_1_0.getBaseOffset(), agg_expr_1_0.numBytes(), agg_value_14);
/* 359 */ }
/* 360 */
/* 361 */ if (!agg_exprIsNull_2_0) {
/* 362 */ agg_value_14 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(agg_expr_2_0.getBaseObject(), agg_expr_2_0.getBaseOffset(), agg_expr_2_0.numBytes(), agg_value_14);
/* 363 */ }
/* 364 */
/* 365 */ if (!agg_exprIsNull_3_0) {
/* 366 */ agg_value_14 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(agg_expr_3_0.getBaseObject(), agg_expr_3_0.getBaseOffset(), agg_expr_3_0.numBytes(), agg_value_14);
/* 367 */ }
/* 368 */
/* 369 */ if (!agg_exprIsNull_4_0) {
/* 370 */ agg_value_14 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(agg_expr_4_0.getBaseObject(), agg_expr_4_0.getBaseOffset(), agg_expr_4_0.numBytes(), agg_value_14);
/* 371 */ }
/* 372 */
/* 373 */ if (!agg_exprIsNull_5_0) {
/* 374 */ agg_value_14 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(agg_expr_5_0.getBaseObject(), agg_expr_5_0.getBaseOffset(), agg_expr_5_0.numBytes(), agg_value_14);
/* 375 */ }
/* 376 */
/* 377 */ if (!agg_exprIsNull_6_0) {
/* 378 */ agg_value_14 = org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeBytes(agg_expr_6_0.getBaseObject(), agg_expr_6_0.getBaseOffset(), agg_expr_6_0.numBytes(), agg_value_14);
/* 379 */ }
/* 380 */ if (true) {
/* 381 */ // try to get the buffer from hash map
/* 382 */ agg_unsafeRowAggBuffer_0 =
/* 383 */ agg_hashMap_0.getAggregationBufferFromUnsafeRow(agg_mutableStateArray_0[0], agg_value_14);
/* 384 */ }
/* 385 */ // Can't allocate buffer from the hash map. Spill the map and fallback to sort-based
/* 386 */ // aggregation after processing all input rows.
/* 387 */ if (agg_unsafeRowAggBuffer_0 == null) {
/* 388 */ if (agg_sorter_0 == null) {
/* 389 */ agg_sorter_0 = agg_hashMap_0.destructAndCreateExternalSorter();
/* 390 */ } else {
/* 391 */ agg_sorter_0.merge(agg_hashMap_0.destructAndCreateExternalSorter());
/* 392 */ }
/* 393 */
/* 394 */ // the hash map had be spilled, it should have enough memory now,
/* 395 */ // try to allocate buffer again.
/* 396 */ agg_unsafeRowAggBuffer_0 = agg_hashMap_0.getAggregationBufferFromUnsafeRow(
/* 397 */ agg_mutableStateArray_0[0], agg_value_14);
/* 398 */ if (agg_unsafeRowAggBuffer_0 == null) {
/* 399 */ // failed to allocate the first page
/* 400 */ throw new OutOfMemoryError("No enough memory for aggregation");
/* 401 */ }
/* 402 */ }
/* 403 */
/* 404 */ // common sub-expressions
/* 405 */
/* 406 */ // evaluate aggregate function
/* 407 */
/* 408 */ // update unsafe row buffer
/* 409 */
/* 410 */ }
/* 411 */
/* 412 */ private void agg_doAggregateWithKeys_0() throws java.io.IOException {
/* 413 */ while (inputadapter_input_0.hasNext() && !stopEarly()) {
/* 414 */ InternalRow inputadapter_row_0 = (InternalRow) inputadapter_input_0.next();
/* 415 */ boolean inputadapter_isNull_0 = inputadapter_row_0.isNullAt(0);
/* 416 */ UTF8String inputadapter_value_0 = inputadapter_isNull_0 ? null : (inputadapter_row_0.getUTF8String(0));
/* 417 */ boolean inputadapter_isNull_1 = inputadapter_row_0.isNullAt(1);
/* 418 */ UTF8String inputadapter_value_1 = inputadapter_isNull_1 ? null : (inputadapter_row_0.getUTF8String(1));
/* 419 */ boolean inputadapter_isNull_2 = inputadapter_row_0.isNullAt(2);
/* 420 */ UTF8String inputadapter_value_2 = inputadapter_isNull_2 ? null : (inputadapter_row_0.getUTF8String(2));
/* 421 */ boolean inputadapter_isNull_3 = inputadapter_row_0.isNullAt(3);
/* 422 */ UTF8String inputadapter_value_3 = inputadapter_isNull_3 ? null : (inputadapter_row_0.getUTF8String(3));
/* 423 */ boolean inputadapter_isNull_4 = inputadapter_row_0.isNullAt(4);
/* 424 */ UTF8String inputadapter_value_4 = inputadapter_isNull_4 ? null : (inputadapter_row_0.getUTF8String(4));
/* 425 */ boolean inputadapter_isNull_5 = inputadapter_row_0.isNullAt(5);
/* 426 */ UTF8String inputadapter_value_5 = inputadapter_isNull_5 ? null : (inputadapter_row_0.getUTF8String(5));
/* 427 */ boolean inputadapter_isNull_6 = inputadapter_row_0.isNullAt(6);
/* 428 */ UTF8String inputadapter_value_6 = inputadapter_isNull_6 ? null : (inputadapter_row_0.getUTF8String(6));
/* 429 */
/* 430 */ agg_doConsume_0(inputadapter_row_0, inputadapter_value_0, inputadapter_isNull_0, inputadapter_value_1, inputadapter_isNull_1, inputadapter_value_2, inputadapter_isNull_2, inputadapter_value_3, inputadapter_isNull_3, inputadapter_value_4, inputadapter_isNull_4, inputadapter_value_5, inputadapter_isNull_5, inputadapter_value_6, inputadapter_isNull_6);
/* 431 */ if (shouldStop()) return;
/* 432 */ }
/* 433 */
/* 434 */ agg_mapIter_0 = ((org.apache.spark.sql.execution.aggregate.HashAggregateExec) references[0] /* plan */).finishAggregate(agg_hashMap_0, agg_sorter_0, ((org.apache.spark.sql.execution.metric.SQLMetric) references[1] /* peakMemory */), ((org.apache.spark.sql.execution.metric.SQLMetric) references[2] /* spillSize */), ((org.apache.spark.sql.execution.metric.SQLMetric) references[3] /* avgHashProbe */));
/* 435 */ }
/* 436 */
/* 437 */ protected void processNext() throws java.io.IOException {
/* 438 */ if (!agg_initAgg_0) {
/* 439 */ agg_initAgg_0 = true;
/* 440 */ long wholestagecodegen_beforeAgg_0 = System.nanoTime();
/* 441 */ agg_doAggregateWithKeys_0();
/* 442 */ ((org.apache.spark.sql.execution.metric.SQLMetric) references[11] /* aggTime */).add((System.nanoTime() - wholestagecodegen_beforeAgg_0) / 1000000);
/* 443 */ }
/* 444 */
/* 445 */ // output the result
/* 446 */
/* 447 */ while (agg_mapIter_0.next()) {
/* 448 */ UnsafeRow agg_aggKey_0 = (UnsafeRow) agg_mapIter_0.getKey();
/* 449 */ UnsafeRow agg_aggBuffer_0 = (UnsafeRow) agg_mapIter_0.getValue();
/* 450 */ agg_doAggregateWithKeysOutput_0(agg_aggKey_0, agg_aggBuffer_0);
/* 451 */
/* 452 */ if (shouldStop()) return;
/* 453 */ }
/* 454 */
/* 455 */ agg_mapIter_0.close();
/* 456 */ if (agg_sorter_0 == null) {
/* 457 */ agg_hashMap_0.free();
/* 458 */ }
/* 459 */ }
/* 460 */
/* 461 */ private void wholestagecodegen_init_0_0() {
/* 462 */ agg_hashMap_0 = ((org.apache.spark.sql.execution.aggregate.HashAggregateExec) references[0] /* plan */).createHashMap();
/* 463 */ inputadapter_input_0 = inputs[0];
/* 464 */ agg_mutableStateArray_0[0] = new UnsafeRow(7);
/* 465 */ agg_mutableStateArray_1[0] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[0], 224);
/* 466 */ agg_mutableStateArray_2[0] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[0], 7);
/* 467 */ agg_mutableStateArray_0[1] = new UnsafeRow(7);
/* 468 */ agg_mutableStateArray_1[1] = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(agg_mutableStateArray_0[1], 224);
/* 469 */ agg_mutableStateArray_2[1] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(agg_mutableStateArray_1[1], 7);
/* 470 */
/* 471 */ bhj_relation_0 = ((org.apache.spark.sql.execution.joins.UnsafeHashedRelation) ((org.apache.spark.broadcast.TorrentBroadcast) references[5] /* broadcast */).value()).asReadOnlyCopy();
/* 472 */ incPeakExecutionMemory(bhj_relation_0.estimatedSize());
/* 473 */
/* 474 */ org.apache.spark.TaskContext$.MODULE$.get().addTaskCompletionListener(new org.apache.spark.util.TaskCompletionListener() {
/* 475 */ @Override
/* 476 */ public void onTaskCompletion(org.apache.spark.TaskContext context) {
/* 477 */ ((org.apache.spark.sql.execution.metric.SQLMetric) references[6] /* avgHashProbe */).set(bhj_relation_0.getAverageProbesPerLookup());
/* 478 */ }
/* 479 */ });
/* 480 */
/* 481 */ }
/* 482 */
/* 483 */ }
18/10/01 09:58:07 WARN WholeStageCodegenExec: Whole-stage codegen disabled for plan (id=6):
*(6) Project [colA#10, colB#11, colC#12, colD#13, colE#14, colF#15, colG#16, colH#41, ColI#42, colJ#43, colK#44, colL#45, colM#46, colN#47, colP#77, colQ#78, colR#79, colS#80]
+- *(6) BroadcastHashJoin [colA#10, colB#11, colC#12, colD#13, colE#14], [colA#72, colB#73, colC#74, colD#75, colE#76], LeftOuter, BuildRight
:- *(6) Project [colA#10, colB#11, colC#12, colD#13, colE#14, colF#15, colG#16, colH#41, ColI#42, colJ#43, colK#44, colL#45, colM#46, colN#47]
: +- *(6) BroadcastHashJoin [colA#10, colB#11, colC#12, colD#13, colE#14, colF#15], [colA#35, colB#36, colC#37, colD#38, colE#39, colF#40], LeftOuter, BuildRight
: :- *(6) HashAggregate(keys=[colF#15, colA#10, colE#14, colC#12, colD#13, colG#16, colB#11], functions=[], output=[colA#10, colB#11, colC#12, colD#13, colE#14, colF#15, colG#16])
: : +- Exchange hashpartitioning(colF#15, colA#10, colE#14, colC#12, colD#13, colG#16, colB#11, 200)
: : +- *(1) HashAggregate(keys=[colF#15, colA#10, colE#14, colC#12, colD#13, colG#16, colB#11], functions=[], output=[colF#15, colA#10, colE#14, colC#12, colD#13, colG#16, colB#11])
: : +- *(1) FileScan csv [colA#10,colB#11,colC#12,colD#13,colE#14,colF#15,colG#16] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/tbrugier/projects/analytics/dev/spark-bug/local/fileA.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<colA:string,colB:string,colC:string,colD:string,colE:string,colF:string,colG:string>
: +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, string, true], input[1, string, true], input[2, string, true], input[3, string, true], input[4, string, true], input[5, string, true]))
: +- *(3) HashAggregate(keys=[colF#40, colL#45, colA#35, colH#41, colE#39, colM#46, colC#37, colN#47, colD#38, colJ#43, ColI#42, colB#36, colK#44], functions=[], output=[colA#35, colB#36, colC#37, colD#38, colE#39, colF#40, colH#41, ColI#42, colJ#43, colK#44, colL#45, colM#46, colN#47])
: +- Exchange hashpartitioning(colF#40, colL#45, colA#35, colH#41, colE#39, colM#46, colC#37, colN#47, colD#38, colJ#43, ColI#42, colB#36, colK#44, 200)
: +- *(2) HashAggregate(keys=[colF#40, colL#45, colA#35, colH#41, colE#39, colM#46, colC#37, colN#47, colD#38, colJ#43, ColI#42, colB#36, colK#44], functions=[], output=[colF#40, colL#45, colA#35, colH#41, colE#39, colM#46, colC#37, colN#47, colD#38, colJ#43, ColI#42, colB#36, colK#44])
: +- *(2) FileScan csv [colA#35,colB#36,colC#37,colD#38,colE#39,colF#40,colH#41,ColI#42,colJ#43,colK#44,colL#45,colM#46,colN#47] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/tbrugier/projects/analytics/dev/spark-bug/local/fileB.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<colA:string,colB:string,colC:string,colD:string,colE:string,colF:string,colH:string,ColI:s...
+- BroadcastExchange HashedRelationBroadcastMode(List(input[0, string, true], input[1, string, true], input[2, string, true], input[3, string, true], input[4, string, true]))
+- *(5) HashAggregate(keys=[colR#79, colA#72, colE#76, colP#77, colC#74, colQ#78, colD#75, colS#80, colB#73], functions=[], output=[colA#72, colB#73, colC#74, colD#75, colE#76, colP#77, colQ#78, colR#79, colS#80])
+- Exchange hashpartitioning(colR#79, colA#72, colE#76, colP#77, colC#74, colQ#78, colD#75, colS#80, colB#73, 200)
+- *(4) HashAggregate(keys=[colR#79, colA#72, colE#76, colP#77, colC#74, colQ#78, colD#75, colS#80, colB#73], functions=[], output=[colR#79, colA#72, colE#76, colP#77, colC#74, colQ#78, colD#75, colS#80, colB#73])
+- *(4) FileScan csv [colA#72,colB#73,colC#74,colD#75,colE#76,colP#77,colQ#78,colR#79,colS#80] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/Users/tbrugier/projects/analytics/dev/spark-bug/local/fileC.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<colA:string,colB:string,colC:string,colD:string,colE:string,colP:string,colQ:string,colR:s...
{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org