phoenix-user mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: CsvBulkLoadTool error with Phoenix 4.0
Date Mon, 11 Aug 2014 15:54:56 GMT
Hi Vadim,

It looks like this is due to a file permissions issue. The bulk import
tool creates HFiles in a temporary directory, and these files then get
moved into HBase. It's during this moving that things are going wrong
here.

The easiest (and worst) way of getting around this is to simply
disable permissions on HDFS. It's certainly the quickest fix, but it
has an obvious major security drawback.

I think that a couple of other options are:
* you could run the import job as the hbase user, i.e. sudo -u hbase
hadoop jar ....
* set a more permissive umask when running the import job. It might be
possible to do this just for the job itself, when starting it up, with
"hadoop jar phoenix-client.jar
org.apache.phoenix.mapreduce.CsvBulkLoadTool
-Dfs.permissions.umask-mode=0000 --table ..."
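
To make those two options concrete, here's a sketch of both
invocations. The jar name, table name, and input path below are
placeholders, and I haven't actually verified either of these on CDH
5.1:

```shell
# Option 1: run the whole import as the hbase user, so the HFiles
# written by the job are owned by hbase from the start.
sudo -u hbase hadoop jar phoenix-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE_TABLE --input /tmp/example.csv

# Option 2: relax the umask for just this job, so the temporary
# HFiles and directories end up readable/writable by the hbase user.
# Note the -D generic option goes before the tool-specific arguments.
hadoop jar phoenix-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dfs.permissions.umask-mode=0000 \
    --table EXAMPLE_TABLE --input /tmp/example.csv
```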

I actually haven't tried any of these yet (I'm currently running on a
cluster without permissions enabled), but the general idea is to make
sure that the files and directories being created by the bulk import
tool can be read and written by the hbase user.
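
For a run that has already failed, yet another (equally untested)
workaround along the same lines would be to open up the job's
temporary output directory by hand before retrying the load:

```shell
# The path below is a placeholder - use the directory named in the
# AccessControlException in your stack trace.
sudo -u hdfs hdfs dfs -chmod -R 777 /tmp/JOB_OUTPUT_DIR

# Or, alternatively, hand ownership of the directory to hbase:
sudo -u hdfs hdfs dfs -chown -R hbase /tmp/JOB_OUTPUT_DIR
```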

BTW, could you let me know what you needed to do in the end to get the
bulk import to run on CDH 5.1?

- Gabriel

On Mon, Aug 11, 2014 at 12:04 AM, Azarov, Vadim <Vadim.Azarov@teoco.com> wrote:
> Hi Gabriel,
> After a lot of playing around with the classpath, and after rebuilding the Phoenix source,
> the CSV bulk loading finally began,
> but during the stage where it should copy the tmp HFiles into HBase, I get the error
> attached below.
>
> Running it from a CDH 5.1 VM, with the cloudera user.
> Tried several suggestions from various forums - no effect.
>
> Is there something that should be configured before running the job?
>
> Thank you!
> Vadim
>
> Sun Aug 10 14:54:45 PDT 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@5c994959, java.io.IOException: java.io.IOException: Exception in rename
>         at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.rename(HRegionFileSystem.java:952)
>         at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:352)
>         at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:426)
>         at org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:666)
>         at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3621)
>         at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3527)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3262)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29499)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/tmp/df84b3c1-1b02-446c-bee7-e2776bdd9e8c/M":cloudera:supergroup:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:182)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5584)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInternal(FSNamesystem.java:3272)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3242)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3210)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:682)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:523)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
>
>         at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:1636)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:532)
>         at org.apache.hadoop.fs.FilterFileSystem.rename(FilterFileSystem.java:214)
>         at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.rename(HRegionFileSystem.java:944)
>         ... 13 more
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=hbase, access=WRITE, inode="/tmp/df84b3c1-1b02-446c-bee7-e2776bdd9e8c/M":cloudera:supergroup:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:182)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5584)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInternal(FSNamesystem.java:3272)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3242)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3210)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:682)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:523)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1362)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy16.rename(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor126.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy16.rename(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.rename(ClientNamenodeProtocolTranslatorPB.java:431)
>         at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
>         at com.sun.proxy.$Proxy17.rename(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:1634)
>         ...
>
> -----Original Message-----
> From: Gabriel Reid [mailto:gabriel.reid@gmail.com]
> Sent: Thursday, August 07, 2014 10:11 PM
> To: user@phoenix.apache.org
> Subject: Re: CsvBulkLoadTool error with Phoenix 4.0
>
> Hi Vadim,
>
> Sorry for the long delay on this.
>
> Just to be sure, can you confirm that you're using the hadoop-2 build of Phoenix 4.0 on the client when starting up the CsvBulkLoadTool?
>
> Even if you are, this may actually require a rebuild of Phoenix using CDH 5.1.0 dependencies.
>
> Could you post the full stack trace that you're getting?
>
> - Gabriel
>
> On Mon, Aug 4, 2014 at 11:08 AM, Azarov, Vadim <Vadim.Azarov@teoco.com> wrote:
>> Hi,
>>
>> I'm getting this error when trying to use the sample bulk loading with
>> MapReduce via CsvBulkLoadTool -
>>
>> java.lang.NoSuchMethodError:
>> org.apache.hadoop.net.NetUtils.getInputStream
>>
>>
>>
>> I'm using Phoenix 4.0, HBase 0.98.1, Hadoop 2.3.0 and Cloudera CDH 5.1.0
>>
>>
>>
>> I saw that others encountered this problem with older and possibly
>> mismatching versions –
>>
>> http://stackoverflow.com/questions/15363490/cascading-hbase-tap
>>
>>
>>
>> but thought that the latest ones should work ok.
>>
>>
>>
>> Could you suggest what seems to be the problem?
>>
>>
>>
>> Thank you,
>>
>> Vadim Azarov
>>
>>
>>
>> Information in this e-mail and its attachments is confidential and
>> privileged under the TEOCO confidentiality terms that can be reviewed here.
