flume-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Automatically upload files into HDFS
Date Mon, 19 Nov 2012 14:53:22 GMT
It would help if I could have a look at the files. In the meantime, try some
other directories. Also, check the directory permissions.
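
For the local side of that permissions check, something like the sketch below may help. It is only a rough example using plain java.nio (no Hadoop classes), and the class name LocalPreflight is made up for illustration. It reports whether the source file from your program is readable and what the permissions on its parent directory are. Note that Path here is java.nio.file.Path, not org.apache.hadoop.fs.Path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical helper, not part of Hadoop: reports whether a local path can be
// read and written, and shows its POSIX permissions where available.
final class LocalPreflight {

    // Returns a one-line, human-readable summary for the given path.
    static String describe(Path p) {
        if (!Files.exists(p)) {
            return p + ": does not exist";
        }
        String kind = Files.isDirectory(p) ? "directory" : "file";
        String perms;
        try {
            perms = PosixFilePermissions.toString(Files.getPosixFilePermissions(p));
        } catch (UnsupportedOperationException | IOException e) {
            perms = "unknown"; // non-POSIX file system or attributes unavailable
        }
        return p + ": " + kind + ", readable=" + Files.isReadable(p)
                + ", writable=" + Files.isWritable(p) + ", perms=" + perms;
    }

    public static void main(String[] args) {
        // Default is the local path from the thread; pass another as args[0].
        Path source = Paths.get(args.length > 0 ? args[0] : "/usr/Eclipse/Output.csv");
        System.out.println(describe(source));
        Path parent = source.toAbsolutePath().getParent();
        if (parent != null) {
            System.out.println(describe(parent));
        }
    }
}
```

This covers only the local file system. For the HDFS side, `hadoop fs -ls /user` lists the owner and mode of each target directory.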

Regards,
    Mohammad Tariq



On Mon, Nov 19, 2012 at 8:13 PM, kashif khan <drkashif8310@gmail.com> wrote:

>
> I have tried as the root user and made the following changes:
>
>
> Path inputFile = new Path("/usr/Eclipse/Output.csv");
> Path outputFile = new Path("/user/root/Output1.csv");
>
> No result. The following is the log output. The log shows the destination
> is null.
>
>
> 2012-11-19 14:36:38,960 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user	dst=null	perm=null
> 2012-11-19 14:36:38,977 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user	dst=null	perm=null
> 2012-11-19 14:36:39,933 INFO FSNamesystem.audit: allowed=true	ugi=hbase (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/hbase/.oldlogs	dst=null	perm=null
> 2012-11-19 14:36:41,147 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=getfileinfo	src=/user/root	dst=null	perm=null
> 2012-11-19 14:36:41,229 INFO FSNamesystem.audit: allowed=true	ugi=dr.who (auth:SIMPLE)	ip=/134.91.36.41	cmd=listStatus	src=/user/root	dst=null	perm=null
>
>
> Thanks
>
>
>
>
>
>
> On Mon, Nov 19, 2012 at 2:29 PM, kashif khan <drkashif8310@gmail.com> wrote:
>
>> Yes, my cluster is running. When I browse to
>> http://hadoop1.example.com:50070/dfshealth.jsp, I get the main page. When I
>> then click on Browse file system, I get the following:
>>
>> hbase
>> tmp
>> user
>>
>> And when I click on user, I get:
>>
>> beeswax
>> huuser (I have created)
>> root (I have created)
>>
>> Would you like to see my configuration file? I did not change anything;
>> everything is at the defaults. I have installed CDH4.1 and am running it on VMs.
>>
>> Many thanks
>>
>>
>>
>>
>>
>> On Mon, Nov 19, 2012 at 2:04 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>> Is your cluster running fine? Are you able to browse HDFS through the
>>> HDFS web console on port 50070?
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Nov 19, 2012 at 7:31 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>
>>>> Many thanks.
>>>>
>>>> I have changed the program accordingly. It does not show any error, only
>>>> one warning, but when I browse the HDFS folder, the file has not been copied.
>>>>
>>>>
>>>> public class CopyData {
>>>> public static void main(String[] args) throws IOException{
>>>>         Configuration conf = new Configuration();
>>>>         //Configuration configuration = new Configuration();
>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>         //configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>
>>>>         conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>         conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>          FileSystem fs = FileSystem.get(conf);
>>>>         Path inputFile = new Path("/usr/Eclipse/Output.csv");
>>>>         Path outputFile = new Path("/user/hduser/Output1.csv");
>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>         fs.close();
>>>>     }
>>>> }
>>>>
>>>> 19-Nov-2012 13:50:32 org.apache.hadoop.util.NativeCodeLoader <clinit>
>>>> WARNING: Unable to load native-hadoop library for your platform...
>>>> using builtin-java classes where applicable
>>>>
>>>> Do you have any idea?
>>>>
>>>> Many thanks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 1:18 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>
>>>>> If it is just copying the files without any processing or change, you
>>>>> can use something like this :
>>>>>
>>>>> public class CopyData {
>>>>>
>>>>>     public static void main(String[] args) throws IOException{
>>>>>
>>>>>         Configuration configuration = new Configuration();
>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/core-site.xml"));
>>>>>         configuration.addResource(new Path("/home/mohammad/hadoop-0.20.205/conf/hdfs-site.xml"));
>>>>>         FileSystem fs = FileSystem.get(configuration);
>>>>>         Path inputFile = new Path("/home/mohammad/pc/work/FFT.java");
>>>>>         Path outputFile = new Path("/mapout/FFT.java");
>>>>>         fs.copyFromLocalFile(inputFile, outputFile);
>>>>>         fs.close();
>>>>>     }
>>>>> }
>>>>>
>>>>> Obviously you will have to adapt it to your requirements, for example by
>>>>> continuously polling the target directory for new files.
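
The polling mentioned above could be sketched roughly as follows. This is only an illustration in plain java.nio, not a tested Hadoop program: the class name NewFilePoller and the one-second interval are assumptions, and the copy step is left as a comment where the fs.copyFromLocalFile(...) call from CopyData would go. A loop inside the program also suits files arriving every 3-5 seconds, which is finer than the one-minute granularity of standard cron.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: remembers which files it has already reported so that
// each new file in the watched directory is handed out only once.
final class NewFilePoller {
    private final Path dir;
    private final Set<Path> seen = new HashSet<>();

    NewFilePoller(Path dir) {
        this.dir = dir;
    }

    // Returns the regular files that appeared since the previous call.
    List<Path> poll() throws IOException {
        List<Path> fresh = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry) && seen.add(entry)) {
                    fresh.add(entry);
                }
            }
        }
        return fresh;
    }

    public static void main(String[] args) throws Exception {
        // Default is the local folder named in the thread.
        NewFilePoller poller = new NewFilePoller(
                Paths.get(args.length > 0 ? args[0] : "/usr/datastorage"));
        while (true) {
            for (Path file : poller.poll()) {
                // The fs.copyFromLocalFile(...) call from CopyData would go here.
                System.out.println("new file: " + file);
            }
            Thread.sleep(1000); // poll once per second (arbitrary choice)
        }
    }
}
```

Two caveats: a file may be picked up while it is still being written, so the producer should ideally write under a temporary name and rename when done; and the set of seen files grows without bound unless uploaded files are moved or deleted.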
>>>>>
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 6:23 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>
>>>>>> Thanks M Tariq
>>>>>>
>>>>>> I am new to Java and Hadoop and do not have much experience. I am
>>>>>> trying to first write a simple program to upload data into HDFS and
>>>>>> gradually move forward. I have written the following simple program to
>>>>>> upload the file into HDFS, but I don't know why it is not working. Could
>>>>>> you please check it if you have time?
>>>>>>
>>>>>> import java.io.BufferedInputStream;
>>>>>> import java.io.BufferedOutputStream;
>>>>>> import java.io.File;
>>>>>> import java.io.FileInputStream;
>>>>>> import java.io.FileOutputStream;
>>>>>> import java.io.IOException;
>>>>>> import java.io.InputStream;
>>>>>> import java.io.OutputStream;
>>>>>> import java.nio.*;
>>>>>> //import java.nio.file.Path;
>>>>>>
>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>> import org.apache.hadoop.fs.FSDataInputStream;
>>>>>> import org.apache.hadoop.fs.FSDataOutputStream;
>>>>>> import org.apache.hadoop.fs.FileSystem;
>>>>>> import org.apache.hadoop.fs.Path;
>>>>>> public class hdfsdata {
>>>>>>
>>>>>>
>>>>>> public static void main(String [] args) throws IOException
>>>>>> {
>>>>>>     try{
>>>>>>
>>>>>>
>>>>>>     Configuration conf = new Configuration();
>>>>>>     conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
>>>>>>     conf.addResource(new Path ("/etc/hadoop/conf/hdfs-site.xml"));
>>>>>>     FileSystem fileSystem = FileSystem.get(conf);
>>>>>>     String source = "/usr/Eclipse/Output.csv";
>>>>>>     String dest = "/user/hduser/input/";
>>>>>>
>>>>>>     //String fileName = source.substring(source.lastIndexOf('/') + source.length());
>>>>>>     String fileName = "Output1.csv";
>>>>>>
>>>>>>     if (dest.charAt(dest.length() -1) != '/')
>>>>>>     {
>>>>>>         dest = dest + "/" +fileName;
>>>>>>     }
>>>>>>     else
>>>>>>     {
>>>>>>         dest = dest + fileName;
>>>>>>
>>>>>>     }
>>>>>>     Path path = new Path(dest);
>>>>>>
>>>>>>
>>>>>>     if(fileSystem.exists(path))
>>>>>>     {
>>>>>>         System.out.println("File" + dest + " already exists");
>>>>>>     }
>>>>>>
>>>>>>
>>>>>>    FSDataOutputStream out = fileSystem.create(path);
>>>>>>    InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
>>>>>>    // Fixed-size buffer: a buffer sized to the file would have length 0 for
>>>>>>    // an empty file, and read() would then return 0 forever.
>>>>>>    byte [] b = new byte [4096];
>>>>>>    int numbytes = 0;
>>>>>>    // read() returns -1 at end of stream
>>>>>>    while((numbytes = in.read(b)) != -1)
>>>>>>    {
>>>>>>        out.write(b,0,numbytes);
>>>>>>    }
>>>>>>    in.close();
>>>>>>    out.close();
>>>>>>    //bos.close();
>>>>>>    fileSystem.close();
>>>>>>     }
>>>>>>     catch(Exception e)
>>>>>>     {
>>>>>>
>>>>>>         System.out.println(e.toString());
>>>>>>     }
>>>>>>     }
>>>>>>
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Thanks again,
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> KK
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 12:41 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>
>>>>>>> You can set your cronjob to execute the program every 5 sec.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 19, 2012 at 6:05 PM, kashif khan <drkashif8310@gmail.com> wrote:
>>>>>>>
>>>>>>>> Well, I want to upload the files automatically, as the files are
>>>>>>>> generated about every 3-5 sec and each file is about 3MB in size.
>>>>>>>>
>>>>>>>> Is it possible to automate the system using the put or cp command?
>>>>>>>>
>>>>>>>> I read about Flume and WebHDFS, but I am not sure whether they will
>>>>>>>> work.
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 12:26 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Why don't you use HDFS-related tools like put or cp?
>>>>>>>>>
>>>>>>>>> - Alex
>>>>>>>>>
>>>>>>>>> On Nov 19, 2012, at 11:44 AM, kashif khan <drkashif8310@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> > Hi,
>>>>>>>>> >
>>>>>>>>> > I am generating files continuously in a local folder of my base machine. How
>>>>>>>>> > can I now use Flume to stream the generated files from the local folder to
>>>>>>>>> > HDFS?
>>>>>>>>> > I don't know exactly how to configure the sources, sinks and HDFS.
>>>>>>>>> >
>>>>>>>>> > 1) location of the folder where files are generated: /usr/datastorage/
>>>>>>>>> > 2) name node address: hdfs://hadoop1.example.com:8020
>>>>>>>>> >
>>>>>>>>> > Please help me.
>>>>>>>>> >
>>>>>>>>> > Many thanks
>>>>>>>>> >
>>>>>>>>> > Best regards,
>>>>>>>>> > KK
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Alexander Alten-Lorenz
>>>>>>>>> http://mapredit.blogspot.com
>>>>>>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
