flume-user mailing list archives

From kashif khan <drkashif8...@gmail.com>
Subject Re: Automatically upload files into HDFS
Date Tue, 27 Nov 2012 21:25:44 GMT
Dear Shekhar,

I am still struggling. I have written some Java code; it does work, but not
100%, as it sometimes sends files with 0KB of data. If you have any solution,
please let me know. I shall be very grateful to you.
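
One guard I am considering, on the assumption that the 0KB files come from
copying a file while the producer is still writing it; the paths and the
one-second stability check in this sketch are illustrative, not from my
actual setup:

import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StableUpload {
    public static void main(String[] args) throws IOException, InterruptedException {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);

        File local = new File("/usr/datastorage/Output.csv"); // illustrative path
        // Wait until the file is non-empty and its size has stopped changing,
        // so that a half-written file is not shipped as 0KB.
        long lastSize = -1;
        while (local.length() == 0 || local.length() != lastSize) {
            lastSize = local.length();
            Thread.sleep(1000);
        }
        fs.copyFromLocalFile(new Path(local.getAbsolutePath()),
                new Path("/user/root/" + local.getName()));
        fs.close();
    }
}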

Many thanks

Best regards

On Mon, Nov 26, 2012 at 4:42 PM, shekhar sharma <shekhar2581@gmail.com> wrote:

> Hello Kashif,
> Sorry for the late reply... Are you done, or are you still struggling?
>
> mail me: shekhar2581@gmail.com
> Regards,
> Som Shekhar Sharma
>
>
>
> On Wed, Nov 21, 2012 at 6:06 PM, kashif khan <drkashif8310@gmail.com> wrote:
>
>> Dear Shekhar Sharma,
>>
>> I am using Eclipse as my IDE. I don't have any idea how to create the
>> project as a Maven project. I downloaded Maven2, but it gave me some strange
>> errors. So if you can help me, I will try Maven. Actually, I am
>> trying to automatically upload files into HDFS and will then apply some
>> algorithms to analyze the data. The algorithms will be implemented in MapReduce.
>> So if you think Maven will be good for me, please let me know how I can
>> create the project as a Maven project.
>>
>>
>> Many thanks
>>
>> Best regards,
>>
>> KK
>>
>>
>>
>> On Tue, Nov 20, 2012 at 7:06 PM, shekhar sharma <shekhar2581@gmail.com> wrote:
>>
>>> By the way, how are you building and running your project? Are you running
>>> it from an IDE?
>>> The best practices you can follow:
>>>
>>> (1) Create your project as a Maven project and declare a dependency on
>>> hadoop-X.Y.Z (a pom sketch follows below). Your project will then
>>> automatically have all the necessary jars,
>>> and I am sure you will not face these kinds of errors.
>>> (2) In your HADOOP_CLASSPATH, provide the path for $HADOOP_LIB.
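>>>
>>> For reference, a minimal sketch of what that dependency could look like in
>>> the pom.xml; the Cloudera repository URL and the exact CDH4.1 version
>>> string are assumptions and should be checked against Cloudera's docs:
>>>
>>> <repositories>
>>>   <repository>
>>>     <id>cloudera</id>
>>>     <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
>>>   </repository>
>>> </repositories>
>>> <dependencies>
>>>   <dependency>
>>>     <groupId>org.apache.hadoop</groupId>
>>>     <artifactId>hadoop-client</artifactId>
>>>     <version>2.0.0-cdh4.1.0</version> <!-- assumed version for CDH4.1 -->
>>>   </dependency>
>>> </dependencies>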
>>>
>>> Regards,
>>> Som
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Nov 20, 2012 at 9:52 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>
>>>> Dear Tariq
>>>>
>>>> Many thanks, finally I have created the directory and uploaded the file.
>>>>
>>>> Once again many thanks
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 3:04 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>
>>>>> Dear Tariq, many thanks.
>>>>>
>>>>>
>>>>> I have downloaded the jar file and added it to the project. Now I am
>>>>> getting another error:
>>>>>
>>>>> log4j:WARN No appenders could be found for logger
>>>>> (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>>>>> log4j:WARN Please initialize the log4j system properly.
>>>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
>>>>> Exception in thread "main" java.io.IOException: No FileSystem for
>>>>> scheme: hdfs
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2206)
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
>>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
>>>>>     at
>>>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
>>>>>
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>
>>>>> Do you have any idea about this?
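>>>>>
>>>>> From what I have read, this usually means no implementation for the
>>>>> hdfs:// scheme was found on the classpath (the hadoop-hdfs jar is
>>>>> missing). A sketch of the workaround I am considering, naming the class
>>>>> explicitly; I have not verified it yet:
>>>>>
>>>>> Configuration conf = new Configuration();
>>>>> // Point the hdfs scheme at DistributedFileSystem explicitly; the usual
>>>>> // fix is simply having the hadoop-hdfs jar on the classpath.
>>>>> conf.set("fs.hdfs.impl",
>>>>>         org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
>>>>> FileSystem fs = FileSystem.get(conf);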
>>>>>
>>>>> Thanks again
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 20, 2012 at 2:53 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>
>>>>>> You can download the jar here :
>>>>>> http://search.maven.org/remotecontent?filepath=com/google/guava/guava/13.0.1/guava-13.0.1.jar
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 20, 2012 at 8:06 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>
>>>>>>> Could you please let me know the name of the jar file and its location?
>>>>>>>
>>>>>>> Many thanks
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 20, 2012 at 2:33 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>>
>>>>>>>> Download the required jar and include it in your project.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Nov 20, 2012 at 7:57 PM, kashif khan <
>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Dear Tariq, thanks.
>>>>>>>>>
>>>>>>>>> I have added the jar files from CDH, downloaded the CDH4 Eclipse
>>>>>>>>> plugin, and copied it into the Eclipse plugins folder. The previous error
>>>>>>>>> is sorted out, I think, but now I am getting another strange error:
>>>>>>>>>
>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>> com/google/common/collect/Maps
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.lib.MetricsRegistry.<init>(MetricsRegistry.java:42)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:87)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.<init>(MetricsSystemImpl.java:133)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:97)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:190)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2373)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2365)
>>>>>>>>>     at
>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2233)
>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:156)
>>>>>>>>>     at CopyFile.main(CopyFile.java:14)
>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>> com.google.common.collect.Maps
>>>>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>>>>     at
>>>>>>>>> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>>>>     ... 13 more
>>>>>>>>>
>>>>>>>>> Do you have any idea about this error?
>>>>>>>>>
>>>>>>>>> Many thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Nov 20, 2012 at 2:19 PM, Mohammad Tariq <
>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Kashif,
>>>>>>>>>>
>>>>>>>>>>      You are correct. This is because of a version mismatch. I am
>>>>>>>>>> not using CDH personally, but AFAIK CDH4 uses Hadoop 2.x.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Nov 20, 2012 at 4:10 PM, kashif khan <
>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi M Tariq,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am trying the following program to create a directory and
>>>>>>>>>>> copy a file to HDFS, but I am getting the following errors:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Program:
>>>>>>>>>>>
>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>
>>>>>>>>>>> public class CopyFile {
>>>>>>>>>>>
>>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>         // Note: the value needs the hdfs:// scheme
>>>>>>>>>>>         conf.set("fs.default.name", "hdfs://hadoop1.example.com:8020");
>>>>>>>>>>>         FileSystem dfs = FileSystem.get(conf);
>>>>>>>>>>>         String dirName = "Test1";
>>>>>>>>>>>         Path dir = new Path(dfs.getWorkingDirectory() + "/" + dirName);
>>>>>>>>>>>         dfs.mkdirs(dir);
>>>>>>>>>>>         // Copy the local file (not the new HDFS directory) to HDFS
>>>>>>>>>>>         Path src = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>         Path dst = new Path(dfs.getWorkingDirectory() + "/Test1/");
>>>>>>>>>>>         dfs.copyFromLocalFile(src, dst);
>>>>>>>>>>>     }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>     Exception in thread "main"
>>>>>>>>>>> org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot
>>>>>>>>>>> communicate with client version 4
>>>>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>>>>>>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>>>>>     at $Proxy1.getProtocolVersion(Unknown Source)
>>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>>>>>>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>>>>>>     at
>>>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>>>>>>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
>>>>>>>>>>>     at CopyFile.main(CopyFile.java:11)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am using CDH4.1. I have downloaded the source of
>>>>>>>>>>> hadoop-1.0.4 and imported its jar files into Eclipse. I think it is a
>>>>>>>>>>> version problem. Could you please let me know what the correct version
>>>>>>>>>>> for CDH4.1 would be?
>>>>>>>>>>>
>>>>>>>>>>> Many thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 19, 2012 at 3:41 PM, Mohammad Tariq <
>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It should work. The same code is working fine for me. Try creating
>>>>>>>>>>>> some other directory in your HDFS and use it as your output path. Also see
>>>>>>>>>>>> if you find something in the datanode logs.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 9:04 PM, kashif khan <
>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The input path is fine; the problem is in the output path. I just
>>>>>>>>>>>>> wonder why it copies the data onto the local disk (/user/root/), not into
>>>>>>>>>>>>> HDFS. Are we giving the correct statement to point to HDFS?
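>>>>>>>>>>>>>
>>>>>>>>>>>>> One quick check I am thinking of, my assumption being that an
>>>>>>>>>>>>> empty Configuration falls back to the local filesystem:
>>>>>>>>>>>>>
>>>>>>>>>>>>> FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>> // If this prints file:/// rather than hdfs://..., the cluster
>>>>>>>>>>>>> // config was never picked up and writes go to the local disk.
>>>>>>>>>>>>> System.out.println(fs.getUri());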
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 3:10 PM, Mohammad Tariq <
>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Try this as your input file path
>>>>>>>>>>>>>> Path inputFile = new Path("file:///usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:31 PM, kashif khan <
>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I apply the command
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> $ hadoop fs -put /usr/Eclipse/Output.csv
>>>>>>>>>>>>>>> /user/root/Output.csv
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> it works fine and the file is browsable in HDFS. But I don't
>>>>>>>>>>>>>>> know why it does not work in the program.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Many thanks for your cooperation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:53 PM, Mohammad Tariq <
>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It would be good if I could have a look at the files.
>>>>>>>>>>>>>>>> Meanwhile, try some other directories. Also, check the directory
>>>>>>>>>>>>>>>> permissions once.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <
>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have tried it through the root user and made the following
>>>>>>>>>>>>>>>>> changes:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>> Path outputFile = new Path("/user/root/Output1.csv");
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> No result. The following is the log output. The log shows
>>>>>>>>>>>>>>>>> the destination is null.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>>> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <
>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yeah, my cluster is running. When I browse
>>>>>>>>>>>>>>>>>> http://hadoop1.example.com:50070/dfshealth.jsp I get
>>>>>>>>>>>>>>>>>> the main page. Then I click on "Browse the filesystem" and get the
>>>>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> hbase
>>>>>>>>>>>>>>>>>> tmp
>>>>>>>>>>>>>>>>>> user
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And when I click on user, I get:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> beeswax
>>>>>>>>>>>>>>>>>> huuser (I have created)
>>>>>>>>>>>>>>>>>> root (I have created)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Would you like to see my configuration files? I did not
>>>>>>>>>>>>>>>>>> change anything; everything is at its default. I have installed CDH4.1
>>>>>>>>>>>>>>>>>> and am running it on VMs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is your cluster running fine? Are you able to browse
>>>>>>>>>>>>>>>>>>> HDFS through the HDFS web console at 50070?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <
>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Many thanks.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have changed the program accordingly. It does not
>>>>>>>>>>>>>>>>>>>> show any error, just one warning, but when I browse the HDFS folder,
>>>>>>>>>>>>>>>>>>>> the file is not copied.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>         //Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>>>>>>>>>>>>>>>>>> WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Do you have any idea?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If it is just copying the files without any processing
>>>>>>>>>>>>>>>>>>>>> or change, you can use something like this:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> public class CopyData {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>         Configuration configuration = new Configuration();
>>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>>>>>>>>>>>>>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>>>>>>>>>>>>>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>>>>>>>>>>>>>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>>>>>>>>>>>>>>>>>         fs.close();
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Obviously you have to modify it as per your
>>>>>>>>>>>>>>>>>>>>> requirements like continuously polling the targeted directory for new files.
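>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> A rough sketch of such a poller, assuming a 5-second interval and that
>>>>>>>>>>>>>>>>>>>>> tracking already-seen file names is enough for your case (both are
>>>>>>>>>>>>>>>>>>>>> assumptions to adjust):
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>>>> import java.util.HashSet;
>>>>>>>>>>>>>>>>>>>>> import java.util.Set;
>>>>>>>>>>>>>>>>>>>>> import java.util.concurrent.Executors;
>>>>>>>>>>>>>>>>>>>>> import java.util.concurrent.ScheduledExecutorService;
>>>>>>>>>>>>>>>>>>>>> import java.util.concurrent.TimeUnit;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> public class DirectoryPoller {
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws Exception {
>>>>>>>>>>>>>>>>>>>>>         Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>>         conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>>         final FileSystem fs = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>>         final File watched = new File("/usr/datastorage"); // illustrative
>>>>>>>>>>>>>>>>>>>>>         final Set<String> seen = new HashSet<String>();
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>         ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
>>>>>>>>>>>>>>>>>>>>>         pool.scheduleWithFixedDelay(new Runnable() {
>>>>>>>>>>>>>>>>>>>>>             public void run() {
>>>>>>>>>>>>>>>>>>>>>                 File[] files = watched.listFiles();
>>>>>>>>>>>>>>>>>>>>>                 if (files == null) return;
>>>>>>>>>>>>>>>>>>>>>                 for (File f : files) {
>>>>>>>>>>>>>>>>>>>>>                     // upload each file only once, keyed by name
>>>>>>>>>>>>>>>>>>>>>                     if (f.isFile() && seen.add(f.getName())) {
>>>>>>>>>>>>>>>>>>>>>                         try {
>>>>>>>>>>>>>>>>>>>>>                             fs.copyFromLocalFile(new Path(f.getAbsolutePath()),
>>>>>>>>>>>>>>>>>>>>>                                     new Path("/user/root/" + f.getName()));
>>>>>>>>>>>>>>>>>>>>>                         } catch (Exception e) {
>>>>>>>>>>>>>>>>>>>>>                             seen.remove(f.getName()); // retry on the next round
>>>>>>>>>>>>>>>>>>>>>                             e.printStackTrace();
>>>>>>>>>>>>>>>>>>>>>                         }
>>>>>>>>>>>>>>>>>>>>>                     }
>>>>>>>>>>>>>>>>>>>>>                 }
>>>>>>>>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>>>>>>>>         }, 0, 5, TimeUnit.SECONDS);
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>> }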
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks, M Tariq.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> As I am new to Java and Hadoop and do not have much
>>>>>>>>>>>>>>>>>>>>>> experience, I am trying to first write a simple program to upload data into
>>>>>>>>>>>>>>>>>>>>>> HDFS and gradually move forward. I have written the following simple
>>>>>>>>>>>>>>>>>>>>>> program to upload a file into HDFS, and I don't know why it is not working.
>>>>>>>>>>>>>>>>>>>>>> Could you please check it if you have time?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> import java.io.BufferedInputStream;
>>>>>>>>>>>>>>>>>>>>>> import java.io.File;
>>>>>>>>>>>>>>>>>>>>>> import java.io.FileInputStream;
>>>>>>>>>>>>>>>>>>>>>> import java.io.IOException;
>>>>>>>>>>>>>>>>>>>>>> import java.io.InputStream;
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>>>>>>>>>>> import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> public class hdfsdata {
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>     public static void main(String[] args) throws IOException {
>>>>>>>>>>>>>>>>>>>>>>         try {
>>>>>>>>>>>>>>>>>>>>>>             Configuration conf = new Configuration();
>>>>>>>>>>>>>>>>>>>>>>             conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>>>>>>>>>>>>>>>>>             conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>>>>>>>>>>>>>>>>>             FileSystem fileSystem = FileSystem.get(conf);
>>>>>>>>>>>>>>>>>>>>>>             String source = "/usr/Eclipse/Output.csv";
>>>>>>>>>>>>>>>>>>>>>>             String dest = "/user/hduser/input/";
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>             //String fileName = source.substring(source.lastIndexOf('/') + 1);
>>>>>>>>>>>>>>>>>>>>>>             String fileName = "Output1.csv";
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>             if (dest.charAt(dest.length() - 1) != '/') {
>>>>>>>>>>>>>>>>>>>>>>                 dest = dest + "/" + fileName;
>>>>>>>>>>>>>>>>>>>>>>             } else {
>>>>>>>>>>>>>>>>>>>>>>                 dest = dest + fileName;
>>>>>>>>>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>>>>>>>>>             Path path = new Path(dest);
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>             if (fileSystem.exists(path)) {
>>>>>>>>>>>>>>>>>>>>>>                 System.out.println("File " + dest + " already exists");
>>>>>>>>>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>             // create() overwrites the destination if it already exists
>>>>>>>>>>>>>>>>>>>>>>             FSDataOutputStream out = fileSystem.create(path);
>>>>>>>>>>>>>>>>>>>>>>             File myfile = new File(source);
>>>>>>>>>>>>>>>>>>>>>>             InputStream in = new BufferedInputStream(new FileInputStream(myfile));
>>>>>>>>>>>>>>>>>>>>>>             byte[] b = new byte[(int) myfile.length()];
>>>>>>>>>>>>>>>>>>>>>>             int numbytes = 0;
>>>>>>>>>>>>>>>>>>>>>>             while ((numbytes = in.read(b)) > 0) {
>>>>>>>>>>>>>>>>>>>>>>                 out.write(b, 0, numbytes);
>>>>>>>>>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>>>>>>>>>             in.close();
>>>>>>>>>>>>>>>>>>>>>>             out.close();
>>>>>>>>>>>>>>>>>>>>>>             fileSystem.close();
>>>>>>>>>>>>>>>>>>>>>>         } catch (Exception e) {
>>>>>>>>>>>>>>>>>>>>>>             System.out.println(e.toString());
>>>>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> KK
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <
>>>>>>>>>>>>>>>>>>>>>> dontariq@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> You can set up a cron job to execute the program
>>>>>>>>>>>>>>>>>>>>>>> periodically (note that cron fires at most once per minute; for a
>>>>>>>>>>>>>>>>>>>>>>> 5-second interval you would loop inside the job or keep the program running).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Well, I want to upload the files automatically, as
>>>>>>>>>>>>>>>>>>>>>>>> the files are generated about every 3-5 seconds and each file is about
>>>>>>>>>>>>>>>>>>>>>>>> 3MB in size.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Is it possible to automate the system using the put or
>>>>>>>>>>>>>>>>>>>>>>>> cp command?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I have read about Flume and WebHDFS, but I am not
>>>>>>>>>>>>>>>>>>>>>>>> sure whether they will work.
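>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> From the Flume user guide, I believe the agent configuration would look
>>>>>>>>>>>>>>>>>>>>>>>> roughly like this (the agent/channel names are mine, I have not tried it,
>>>>>>>>>>>>>>>>>>>>>>>> and the spooling directory source requires a recent Flume 1.x):
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> a1.sources = spool
>>>>>>>>>>>>>>>>>>>>>>>> a1.channels = ch
>>>>>>>>>>>>>>>>>>>>>>>> a1.sinks = hdfsSink
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> a1.sources.spool.type = spooldir
>>>>>>>>>>>>>>>>>>>>>>>> a1.sources.spool.spoolDir = /usr/datastorage
>>>>>>>>>>>>>>>>>>>>>>>> a1.sources.spool.channels = ch
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> a1.channels.ch.type = memory
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> a1.sinks.hdfsSink.type = hdfs
>>>>>>>>>>>>>>>>>>>>>>>> a1.sinks.hdfsSink.hdfs.path = hdfs://hadoop1.example.com:8020/user/root/data
>>>>>>>>>>>>>>>>>>>>>>>> a1.sinks.hdfsSink.hdfs.fileType = DataStream
>>>>>>>>>>>>>>>>>>>>>>>> a1.sinks.hdfsSink.channel = ch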
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Many thanks
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander
>>>>>>>>>>>>>>>>>>>>>>>> Alten-Lorenz <wget.null@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Why don't you use HDFS-related tools like put
>>>>>>>>>>>>>>>>>>>>>>>>> or cp?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> - Alex
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <
>>>>>>>>>>>>>>>>>>>>>>>>> drkashif8310@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > I am continuously generating files in a local folder on my base machine.
>>>>>>>>>>>>>>>>>>>>>>>>> > How can I now use Flume to stream the generated files from the local
>>>>>>>>>>>>>>>>>>>>>>>>> > folder to HDFS?
>>>>>>>>>>>>>>>>>>>>>>>>> > I don't know exactly how to configure the sources, the sinks, and HDFS.
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > 1) location of the folder where files are generated: /usr/datastorage/
>>>>>>>>>>>>>>>>>>>>>>>>> > 2) name node address: hdfs://hadoop1.example.com:8020
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > Please help me.
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > Many thanks
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > Best regards,
>>>>>>>>>>>>>>>>>>>>>>>>> > KK
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>>>>>>>>>>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>>>>>>>>>>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
