Create hadoopguide in mysql
Mysql> CREATE DATABASE hadoopguide;
Mysql> GRANT ALL PRIVILEGES ON
hadoopguide.* TO ‘%’@’localhost’;
Mysql> GRANT ALL PRIVILEGES ON
haddopguide.* TO ‘’@’localhost’;
% mysql hadoopguide
Mysql> CREATE TABLE widgets(id INT NOT
NULL PRIMARY KEY AUTO_INCREMENT,
widget_name
VARCHAR(64) NOT NULL),
price
DECIMAL(10,2),
design_date
DATE,
version
INT,
design_comment
VARCHAR(100));
MYSQL> INSERT INTO widgets VALUES (NULL, 'sprocket', 0.25, '2010-02-10',
1,
'Connects two gizmos');
MYSQL> INSERT
INTO WIDGETS VALUES (NULL, 'gizmo', 4.00, '2009-11-30', 4,
NULL);
MYSQL> INSERT
INTO WIDGETS VALUES (NULL, 'gadget', 99.99, '1983-08-13',
13,
'Our flagship product');
mysql>
quit;
Lets
use Sqoop to import this table into hdfs
Sqoop import --connect
jdbc:mysql://localhost:hadoopguide --table widgets –m 1
Sqoop import tool runs a MR job that connects to mysql db and reads the
table. By default it runs 4 mappers in parallel creating 4 output files in the
same dir.
Since we know that we are importing only 3 rows, we have specified to use
only 1 mapper by using -m 1, so
we get a single file in HDFS.
% hadoop fs –cat widgets/part-m-00000
1,sprocket,0.25,2010-02-10,1,Connects
two gizmos
2,gizmo,4.00,2009-11-30,4,null
3,gadget,99.99,1983-08-13,13,Our
flagship product
The connect string jdbc:mysql//localhost:hadoopguide
will read from a local database. Do not mention localhost, if the Hadoop cluster
is used; specify the full hostname.
Generate code without import
% sqoop codegen --connect jdbc:mysql://localhost/hadoopguide
--table widgets --class-name Widget
Codegen tool simply generates code without
doing the full import. It generates a class Widget.java. If we are working with
records imported to SequenceFiles, we have to work with generated code to de-serialize
data from Sequence File storage. We can work with text based records without
using generated code, but it can handle some tedious aspects of data processing
for us.
No comments:
Post a Comment