ldbc_snb_datagen_spark: Error: java.lang.NullPointerException during "Serializing persons"
While executing the data generator with scale factor 1 on a Hadoop cluster, I get the following error during the "Serializing persons" step:
17/09/28 11:39:11 INFO client.RMProxy: Connecting to ResourceManager at siti-rack.crumb.disco.it/172.24.1.201:8032
17/09/28 11:39:11 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/09/28 11:39:12 INFO input.FileInputFormat: Total input paths to process : 1
17/09/28 11:39:12 INFO mapreduce.JobSubmitter: number of splits:1
17/09/28 11:39:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1505825335849_0082
17/09/28 11:39:12 INFO impl.YarnClientImpl: Submitted application application_1505825335849_0082
17/09/28 11:39:12 INFO mapreduce.Job: The url to track the job: http://siti-rack.crumb.disco.it:8088/proxy/application_1505825335849_0082/
17/09/28 11:39:12 INFO mapreduce.Job: Running job: job_1505825335849_0082
17/09/28 11:39:16 INFO mapreduce.Job: Job job_1505825335849_0082 running in uber mode : false
17/09/28 11:39:16 INFO mapreduce.Job: map 0% reduce 0%
17/09/28 11:39:20 INFO mapreduce.Job: map 100% reduce 0%
17/09/28 11:39:29 INFO mapreduce.Job: Task Id : attempt_1505825335849_0082_r_000000_0, Status : FAILED
Error: java.lang.NullPointerException
at ldbc.snb.datagen.hadoop.HadoopPersonSortAndSerializer$HadoopPersonSerializerReducer.cleanup(HadoopPersonSortAndSerializer.java:123)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
17/09/28 11:39:36 INFO mapreduce.Job: Task Id : attempt_1505825335849_0082_r_000000_1, Status : FAILED
Error: java.lang.NullPointerException
at ldbc.snb.datagen.hadoop.HadoopPersonSortAndSerializer$HadoopPersonSerializerReducer.cleanup(HadoopPersonSortAndSerializer.java:123)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
17/09/28 11:39:43 INFO mapreduce.Job: Task Id : attempt_1505825335849_0082_r_000000_2, Status : FAILED
Error: java.lang.NullPointerException
at ldbc.snb.datagen.hadoop.HadoopPersonSortAndSerializer$HadoopPersonSerializerReducer.cleanup(HadoopPersonSortAndSerializer.java:123)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
17/09/28 11:39:51 INFO mapreduce.Job: map 100% reduce 100%
17/09/28 11:39:51 INFO mapreduce.Job: Job job_1505825335849_0082 failed with state FAILED due to: Task failed task_1505825335849_0082_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1
17/09/28 11:39:51 INFO mapreduce.Job: Counters: 37
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=169318
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=59137
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Failed reduce tasks=4
Launched map tasks=1
Launched reduce tasks=4
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=4420
Total time spent by all reduces in occupied slots (ms)=79484
Total time spent by all map tasks (ms)=2210
Total time spent by all reduce tasks (ms)=19871
Total vcore-seconds taken by all map tasks=2210
Total vcore-seconds taken by all reduce tasks=19871
Total megabyte-seconds taken by all map tasks=4526080
Total megabyte-seconds taken by all reduce tasks=81391616
Map-Reduce Framework
Map input records=100
Map output records=100
Map output bytes=59199
Map output materialized bytes=29575
Input split bytes=166
Combine input records=0
Spilled Records=100
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=7
CPU time spent (ms)=380
Physical memory (bytes) snapshot=519487488
Virtual memory (bytes) snapshot=2384412672
Total committed heap usage (bytes)=501219328
File Input Format Counters
Bytes Read=58971
Error during execution
null
java.lang.Exception
at ldbc.snb.datagen.hadoop.HadoopPersonSortAndSerializer.run(HadoopPersonSortAndSerializer.java:166)
at ldbc.snb.datagen.generator.LDBCDatagen.runGenerateJob(LDBCDatagen.java:155)
at ldbc.snb.datagen.generator.LDBCDatagen.main(LDBCDatagen.java:340)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Exception in thread "main" java.lang.Exception
at ldbc.snb.datagen.hadoop.HadoopPersonSortAndSerializer.run(HadoopPersonSortAndSerializer.java:166)
at ldbc.snb.datagen.generator.LDBCDatagen.runGenerateJob(LDBCDatagen.java:155)
at ldbc.snb.datagen.generator.LDBCDatagen.main(LDBCDatagen.java:340)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Does anyone have a possible solution? I can't figure it out by myself.
Edit: I solved the problem by adding the following setting to the params.ini file:
ldbc.snb.datagen.serializer.updateStreams:false
Am I still able to run a benchmark with that option set to false? Perhaps by generating the update stream parameters after the data generation?
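For reference, this is a sketch of the params.ini I ended up with. Only the updateStreams line is the actual workaround; the other key and its value format are assumptions based on the Hadoop-era datagen documentation and may differ in other versions, so verify them against the README of the datagen release you are running:

```ini
; Hedged sketch of a params.ini for scale factor 1.
; The scaleFactor key/value format below is an assumption
; from the Hadoop-based datagen docs; check your version.
ldbc.snb.datagen.generator.scaleFactor:snb.interactive.1

; The workaround described above: disable update-stream
; serialization, which avoided the NullPointerException in
; HadoopPersonSortAndSerializer$HadoopPersonSerializerReducer.cleanup.
ldbc.snb.datagen.serializer.updateStreams:false
```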
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 15 (7 by maintainers)
OK, I'm currently using Hadoop 2.7.2; I'll try switching to 2.6.0 first.