Most deployment or cube-build failures come down to environment problems or Hadoop/HBase version mismatches.
Start by deploying the versions listed in the documentation:
Recommended Hadoop Versions
- Hadoop: 2.4 – 2.7
- Hive: 0.13 – 0.14
- HBase: 0.98 – 0.99
- JDK: 1.7+
One pitfall: HBase 0.9x does not support Hadoop 2.7. If you run Hadoop 2.7.x you need HBase 1.x, and for HBase 1.x you must download the separately compiled Kylin binary package:
Binary Package (for running on HBase 1.1.3 or above)
Job errors when building a Cube
1.
native snappy library not available: SnappyCompressor has not been loaded.
The Hadoop native libraries are missing the snappy compression library. Install it and link it into Hadoop's native directory:
sudo yum install snappy snappy-devel
sudo ln -s /usr/lib64/libsnappy.so $HADOOP_HOME/lib/native/libsnappy.so
Then add the following to $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
export JAVA_LIBRARY_PATH="/usr/local/hadoop/lib/native"
Finally restart Hadoop:
$HADOOP_HOME/sbin/stop-all.sh
$HADOOP_HOME/sbin/start-all.sh
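After restarting, it is worth confirming that the library is actually where Hadoop will look for it. A minimal sketch of that check, assuming the install path used above (`hadoop checknative -a` gives a fuller native-library report if the `hadoop` command is on your PATH):

```shell
# Check whether libsnappy.so is visible in Hadoop's native lib directory.
# HADOOP_HOME defaults to the path used above; adjust to your installation.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
if [ -e "$HADOOP_HOME/lib/native/libsnappy.so" ]; then
  echo "libsnappy.so present"
else
  echo "libsnappy.so missing"
fi
```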
2.
2016-02-22 16:24:16,740 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 8 more
Solution (from Stack Overflow):
http://stackoverflow.com/questions/34449561/hadoop-map-reduce-job-class-org-apache-hive-hcatalog-mapreduce-hcatinputformat
The issue here is that Kylin assumes the same Hive jars exist on all Hadoop nodes. When a node is missing the Hive jars (or has them in a different location), you get the ClassNotFoundException on HCatInputFormat; the Yarn job console should show a clear error message. This is a known issue. Deploying Hive to all cluster nodes fixes the problem. A cleaner workaround is to manually configure Kylin to submit the Hive jars as additional job dependencies; see https://issues.apache.org/jira/browse/KYLIN-1021. There is also an open JIRA suggesting that Kylin should submit the Hive jars by default: https://issues.apache.org/jira/browse/KYLIN-1082
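For the KYLIN-1021 workaround, Kylin can be pointed at a directory of extra jars to ship with each MR job. A sketch of the kylin.properties entry (the key name and the Hive lib path are assumptions from that JIRA; verify against your Kylin release's documentation):

```
# Submit every jar under this directory as an additional MR job dependency
# (hypothetical path -- point it at your Hive installation's lib directory)
kylin.job.mr.lib.dir=/usr/local/hive/lib
```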
3.
org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.conf.YarnConfiguration.getServiceAddressConfKeys(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:111)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kylin.job.exception.ExecuteException: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.conf.YarnConfiguration.getServiceAddressConfKeys(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:111)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
... 4 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.conf.YarnConfiguration.getServiceAddressConfKeys(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
at org.apache.hadoop.yarn.conf.HAUtil.getConfKeyForRMInstance(HAUtil.java:239)
at org.apache.hadoop.yarn.conf.HAUtil.getConfValueForRMInstance(HAUtil.java:250)
at org.apache.hadoop.yarn.conf.HAUtil.getConfValueForRMInstance(HAUtil.java:262)
at org.apache.kylin.job.common.MapReduceExecutable.getRestStatusCheckUrl(MapReduceExecutable.java:191)
at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:135)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
... 6 more
The HBase version does not match Hadoop's. Copy the newer hadoop-yarn-api-*.jar from Hadoop into hbase/lib.
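A sketch of that jar swap, assuming a standard Hadoop/HBase layout (`HADOOP_HOME`, `HBASE_HOME`, and the jar locations are assumptions; adjust to your installation), keeping the old jar as a backup. Restart HBase afterwards:

```shell
# Hypothetical installation paths; override via environment if different.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
HBASE_HOME=${HBASE_HOME:-/usr/local/hbase}

# Move HBase's bundled hadoop-yarn-api jar aside, then copy in the
# cluster's own version so both sides agree on the YARN API.
mkdir -p "$HBASE_HOME/lib/backup"
mv "$HBASE_HOME"/lib/hadoop-yarn-api-*.jar "$HBASE_HOME/lib/backup/" 2>/dev/null
cp "$HADOOP_HOME"/share/hadoop/yarn/hadoop-yarn-api-*.jar "$HBASE_HOME/lib/"
```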
4.
Kylin error: org.apache.hadoop.hbase.TableNotFoundException: Table KYLIN_* is not currently available.
The "Load HFile to HBase Table" step failed.
Checking the HBase log showed it was again a snappy problem. After changing Kylin's compression codec to gzip and restarting, cube builds completed successfully.
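A sketch of that codec change in kylin.properties (the key name here follows Kylin 1.x and is an assumption; verify against your release, which may also set a MapReduce-side snappy codec in conf/kylin_job_conf.xml that needs changing or removing):

```
# Use gzip instead of snappy when Kylin creates the HTable for a cube
kylin.hbase.default.compression.codec=gzip
```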