
Using Apache Kylin


Most problems when deploying Kylin or building a cube come down to the environment or to Hadoop/HBase version mismatches.

First, deploy with the versions listed in the documentation:

  • Hadoop: 2.4 – 2.7
  • Hive: 0.13 – 0.14
  • HBase: 0.98 – 0.99
  • JDK: 1.7+

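One pitfall: the HBase 0.9* releases do not support Hadoop 2.7. If your Hadoop is 2.7.*, you need to deploy HBase 1.*, and for HBase 1.* you need to download the separately built Kylin binary package: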

Binary Package (for running on HBase 1.1.3 or above)
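The version rule above can be sketched as a small shell helper. This is only an illustration of the rule as stated in this post; `kylin_package_for` and its return labels are hypothetical names, not part of Kylin.

```shell
#!/usr/bin/env bash
# Sketch: map a Hadoop version string to the Kylin package to download.
# Encodes only the rule described in this post:
#   Hadoop 2.7.x      -> needs HBase 1.x -> separately built Kylin package
#   Hadoop 2.4 - 2.6  -> HBase 0.98 - 0.99 with the standard package
# kylin_package_for is a hypothetical helper name.
kylin_package_for() {
  local hadoop_version="$1"
  case "$hadoop_version" in
    2.7|2.7.*)         echo "hbase1" ;;       # "HBase 1.1.3 or above" package
    2.[4-6]|2.[4-6].*) echo "default" ;;      # standard package
    *)                 echo "unsupported" ;;  # outside the listed range
  esac
}
```

On a node with Hadoop installed, something like `kylin_package_for "$(hadoop version | head -n 1 | awk '{print $2}')"` should print the label for the local installation, assuming the first line of `hadoop version` has the form `Hadoop 2.7.1`.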

Job errors when creating a cube

1.
native snappy library not available: SnappyCompressor has not been loaded.

Cause: the Hadoop native lib directory is missing the snappy compression library.

sudo yum install snappy snappy-devel
sudo ln -s /usr/lib64/libsnappy.so $HADOOP_HOME/lib/native/libsnappy.so

Add the following to $HADOOP_HOME/etc/hadoop/hadoop-env.sh (adjust the path if your $HADOOP_HOME is not /usr/local/hadoop):

export JAVA_LIBRARY_PATH="/usr/local/hadoop/lib/native"

Restart Hadoop:

$HADOOP_HOME/sbin/stop-all.sh
$HADOOP_HOME/sbin/start-all.sh
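After the restart it may be worth confirming that Hadoop actually picks up the native snappy library. The sketch below parses the output of `hadoop checknative -a`; it assumes that output contains a line starting with `snappy:` followed by `true` or `false`. `snappy_loaded` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: succeed if `hadoop checknative -a` style output on stdin reports
# snappy as loaded. Assumes a line of the form "snappy:  true <path>".
# snappy_loaded is a hypothetical helper name.
snappy_loaded() {
  grep -Eq '^snappy:[[:space:]]+true'
}

# Usage on a Hadoop node:
#   hadoop checknative -a 2>/dev/null | snappy_loaded \
#     && echo "snappy native library loaded" \
#     || echo "snappy still missing"
```

If the check still fails, the symlink target and the JAVA_LIBRARY_PATH value are the first things to recheck.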

2.

2016-02-22 16:24:16,740 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
	at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
	... 8 more

Solution:
http://stackoverflow.com/questions/34449561/hadoop-map-reduce-job-class-org-apache-hive-hcatalog-mapreduce-hcatinputformat

The issue here is that Kylin assumes the same Hive jars are present on all Hadoop nodes. When a node is missing the Hive jars (or has them in a different location), you get the ClassNotFoundException on HCatInputFormat.

By the way, you should be able to get a clear error message from the Yarn job console. This is a known issue.

Deploying Hive to all cluster nodes can surely fix the problem, like you have tried.

Another (cleaner) workaround is to manually configure Kylin to submit the Hive jars as additional job dependencies. See https://issues.apache.org/jira/browse/KYLIN-1021

Finally, there is also an open JIRA suggesting that Kylin should submit the Hive jars by default. See https://issues.apache.org/jira/browse/KYLIN-1082
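To find which nodes are missing the Hive jars, a check along these lines can be run on each node. It is a minimal sketch: `has_hcatalog_jar` is a hypothetical helper, and it assumes the class lives in a jar named `hive-hcatalog-core*.jar` somewhere under the Hive installation directory.

```shell
#!/usr/bin/env bash
# Sketch: succeed if a hive-hcatalog-core jar exists anywhere under the
# given directory (for example "$HIVE_HOME").
# has_hcatalog_jar is a hypothetical helper name.
has_hcatalog_jar() {
  local dir="$1"
  [ -d "$dir" ] || return 1
  [ -n "$(find "$dir" -name 'hive-hcatalog-core*.jar' 2>/dev/null | head -n 1)" ]
}

# Usage on each cluster node:
#   has_hcatalog_jar "$HIVE_HOME" || echo "HCatalog jar missing on $(hostname)"
```

Nodes that report the jar missing (or that have no Hive directory at all) are the ones where map tasks will fail with the exception above.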

3.

org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.conf.YarnConfiguration.getServiceAddressConfKeys(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:111)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kylin.job.exception.ExecuteException: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.conf.YarnConfiguration.getServiceAddressConfKeys(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:111)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
        ... 4 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.yarn.conf.YarnConfiguration.getServiceAddressConfKeys(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
        at org.apache.hadoop.yarn.conf.HAUtil.getConfKeyForRMInstance(HAUtil.java:239)
        at org.apache.hadoop.yarn.conf.HAUtil.getConfValueForRMInstance(HAUtil.java:250)
        at org.apache.hadoop.yarn.conf.HAUtil.getConfValueForRMInstance(HAUtil.java:262)
        at org.apache.kylin.job.common.MapReduceExecutable.getRestStatusCheckUrl(MapReduceExecutable.java:191)
        at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:135)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
        ... 6 more

Cause: the HBase version does not match. Copy the newer hadoop-yarn-api.*.jar (the one matching your Hadoop installation) into hbase/lib.
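Before copying anything, it helps to see which hadoop-yarn-api version HBase currently ships. The sketch below lists the versions found in a lib directory by parsing jar file names; `yarn_api_versions` is a hypothetical helper, and it assumes the jars follow the `hadoop-yarn-api-<version>.jar` naming pattern.

```shell
#!/usr/bin/env bash
# Sketch: print the version of every hadoop-yarn-api jar in a lib directory,
# derived purely from the file name hadoop-yarn-api-<version>.jar.
# yarn_api_versions is a hypothetical helper name.
yarn_api_versions() {
  local lib_dir="$1"
  local jar
  for jar in "$lib_dir"/hadoop-yarn-api-*.jar; do
    [ -e "$jar" ] || continue       # glob matched nothing
    jar="${jar##*/}"                # strip the directory part
    jar="${jar#hadoop-yarn-api-}"   # strip the name prefix
    echo "${jar%.jar}"              # strip the extension, leaving the version
  done
}

# Usage:
#   yarn_api_versions "$HBASE_HOME/lib"
#   hadoop version | head -n 1
```

If the version printed for hbase/lib is older than the Hadoop version, that is consistent with the NoSuchMethodError above.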

4.

Kylin reports: org.apache.hadoop.hbase.TableNotFoundException: Table KYLIN_* is not currently available.
           Load HFile to HBase Table failed

The HBase log showed that this was a snappy problem again. After switching Kylin's compression codec to gzip and restarting, the cube built successfully.
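The post does not say which setting was changed. As a hedged starting point, the sketch below just lists the active compression-related lines in a Kylin properties file so they can be reviewed and edited. `show_compression_settings` is a hypothetical helper; the property name mentioned in the comment (`kylin.hbase.default.compression.codec`) is an assumption about Kylin 1.x and may differ in other versions.

```shell
#!/usr/bin/env bash
# Sketch: print the non-comment lines of a properties file that mention
# compression, so the codec setting can be located and changed to gzip.
# Assumption: in Kylin 1.x the HBase codec is set by a property such as
#   kylin.hbase.default.compression.codec
# show_compression_settings is a hypothetical helper name.
show_compression_settings() {
  local props="$1"
  grep -i 'compress' "$props" | grep -v '^[[:space:]]*#'
}

# Usage:
#   show_compression_settings "$KYLIN_HOME/conf/kylin.properties"
```

After editing the value, Kylin needs a restart for the change to take effect, as described above.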

