You may run into the following problems when running SLS (the YARN Scheduler Load Simulator):
Command:
sh $HADOOP_HOME/share/hadoop/tools/sls/bin/slsrun.sh --input-sls=/home/c/sls/output2/sls-jobs.json --nodes=/home/c/sls/output2/sls-nodes.json --output-dir=/home/c/sls/output1 --print-simulation
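It is best to give absolute paths for the --input-sls and --nodes files. If only a file name is given, the file is read from the current working directory.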
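One way to avoid surprises with relative paths is to resolve each file to an absolute path before passing it to slsrun.sh. The following is only a minimal sketch; the helper name abs_path is made up for illustration and is not part of Hadoop or SLS.

```shell
# abs_path is a hypothetical helper (not part of Hadoop/SLS).
# It prints an absolute path for $1, provided the containing directory exists.
abs_path() {
    # Resolve the containing directory inside a command substitution so the
    # caller's working directory is left unchanged.
    dir=$(cd "$(dirname "$1")" && pwd -P) || return 1
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Example (assumes sls-jobs.json and sls-nodes.json are in the current directory):
#   sh "$HADOOP_HOME/share/hadoop/tools/sls/bin/slsrun.sh" \
#       --input-sls="$(abs_path sls-jobs.json)" \
#       --nodes="$(abs_path sls-nodes.json)" \
#       --output-dir=/home/c/sls/output1 --print-simulation
```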
1. Error:
Exception in thread "main" java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
    at org.apache.hadoop.yarn.sls.SLSRunner.startAMFromSLSTraces(SLSRunner.java:313)
    at org.apache.hadoop.yarn.sls.SLSRunner.startAM(SLSRunner.java:248)
    at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:145)
    at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:528)
Caused by: java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:123)
    ... 4 more
Cause: sls-runner.xml cannot be found. Only XML configuration files under /hadoop/etc/hadoop are picked up, but in the current Hadoop version sls-runner.xml lives in /hadoop/share/hadoop/tools/sls/sample-conf. Copying sls-runner.xml to /hadoop/etc/hadoop fixes the problem.
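The copy step can be sketched as a small shell function. This assumes the installation root is passed as the first argument (the text uses /hadoop, which is presumably the same directory as $HADOOP_HOME); the function name install_sls_runner_conf is hypothetical.

```shell
# install_sls_runner_conf is a hypothetical helper name.
# $1 is the Hadoop installation directory (the text above uses /hadoop).
install_sls_runner_conf() {
    src="$1/share/hadoop/tools/sls/sample-conf/sls-runner.xml"
    dest_dir="$1/etc/hadoop"
    if [ ! -f "$src" ]; then
        echo "sample config not found: $src" >&2
        return 1
    fi
    # Copy the sample config next to the other Hadoop XML configuration files.
    cp "$src" "$dest_dir/"
}

# Example: install_sls_runner_conf /hadoop
```

Checking that the source file exists first gives a clearer message than a bare cp failure if the sample-conf directory is somewhere else in a different Hadoop version.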
2. Error:
java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.web.SLSWebApp.<init>(SLSWebApp.java:86)
Cause: the html folder cannot be found. It is located under /hadoop/share/hadoop/tools/sls, so change into that directory and run the slsrun.sh script from there.
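Since the fix depends on the working directory, a thin wrapper can make it hard to forget. The sketch below assumes the directory layout described above; the name run_sls_from_tool_dir is hypothetical, and the wrapper only changes directory and forwards its arguments unchanged.

```shell
# run_sls_from_tool_dir is a hypothetical wrapper name.
# $1 is the Hadoop installation directory; the remaining arguments are
# passed to slsrun.sh as-is.
run_sls_from_tool_dir() {
    sls_dir="$1/share/hadoop/tools/sls"
    shift
    # Run inside a subshell so the caller's working directory is unchanged,
    # while slsrun.sh starts in the directory that contains the html folder.
    ( cd "$sls_dir" && sh bin/slsrun.sh "$@" )
}

# Example:
#   run_sls_from_tool_dir /hadoop \
#       --input-sls=/home/c/sls/output2/sls-jobs.json \
#       --nodes=/home/c/sls/output2/sls-nodes.json \
#       --output-dir=/home/c/sls/output1 --print-simulation
```

Because the wrapper changes directory, the input and output paths should be absolute, as recommended earlier.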
3. Error:
18/07/11 16:58:48 WARN capacity.CapacityScheduler: Couldn't find application application_1531299523163_0001
18/07/11 16:58:48 WARN resourcemanager.RMAuditLogger: USER=jenkins OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1531299523163_0001 submitted by user jenkins to unknown queue: sls_queue_1 APPID=application_1531299523163_0001
18/07/11 16:58:48 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1531299523163_0001,name=N/A,user=jenkins,queue=sls_queue_1,state=FAILED,trackingUrl=N/A,appMasterHost=N/A,startTime=1531299528010,finishTime=1531299528035,finalStatus=FAILED
Container startup fails.
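Cause: yarn-site.xml is not configured properly. There is an empty yarn-site.xml under /hadoop/etc/hadoop, which the system uses by default, hence the error. Besides the sls-runner.xml mentioned above, the sls/sample-conf folder also contains a yarn-site.xml prepared specifically for the SLS example. Replacing /hadoop/etc/hadoop/yarn-site.xml with that file fixes the problem.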
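The replacement can be sketched as follows, keeping a backup of the existing file in case it is needed later. The function name use_sls_yarn_site and the .bak suffix are arbitrary choices for illustration, and the installation root is again assumed to be passed as the first argument.

```shell
# use_sls_yarn_site is a hypothetical helper name.
# $1 is the Hadoop installation directory (the text above uses /hadoop).
use_sls_yarn_site() {
    sample="$1/share/hadoop/tools/sls/sample-conf/yarn-site.xml"
    target="$1/etc/hadoop/yarn-site.xml"
    if [ ! -f "$sample" ]; then
        echo "sample config not found: $sample" >&2
        return 1
    fi
    # Keep a copy of the current file; the .bak suffix is an arbitrary choice.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi
    # Overwrite the active yarn-site.xml with the SLS sample version.
    cp "$sample" "$target"
}

# Example: use_sls_yarn_site /hadoop
```

On a machine that also runs a real cluster, overwriting yarn-site.xml affects that cluster's configuration too, so restoring the backup after the simulation may be advisable.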