LOAD DATA
說明
LOAD DATA
陳述式會從使用者指定的目錄或檔案載入資料到 Hive serde 表格。如果指定目錄,則會載入目錄中的所有檔案。如果指定檔案,則只會載入單一檔案。此外,LOAD DATA
陳述式會採用選用分割規格。當指定分割時,資料檔案 (當輸入來源是目錄時) 或單一檔案 (當輸入來源是檔案時) 會載入至目標表格的分割中。
如果表格已快取,則此指令會清除表格的快取資料及其所有參考它的依賴項。當下次存取表格或依賴項時,快取將會延遲填滿。
語法
LOAD DATA [ LOCAL ] INPATH path [ OVERWRITE ] INTO TABLE table_identifier [ partition_spec ]
參數
-
路徑
檔案系統的路徑。可以是絕對路徑或相對路徑。
-
表格識別碼
指定表格名稱,可以選擇加上資料庫名稱作為限定詞。
語法:
[ database_name. ] table_name
-
分割規格
選用參數,指定分割的鍵值對清單,以逗號分隔。
語法:
PARTITION ( partition_col_name = partition_col_val [ , ... ] )
-
LOCAL
如果指定,會導致
INPATH
相對於本機檔案系統解析,而不是預設檔案系統,預設檔案系統通常是分散式儲存。 -
OVERWRITE
預設情況下,新的資料會附加到表格。如果使用
OVERWRITE
,表格會被新的資料覆寫。
範例
-- Example without partition specification.
-- Assuming the students table has already been created and populated.
SELECT * FROM students;
+---------+----------------------+----------+
| name| address|student_id|
+---------+----------------------+----------+
|Amy Smith|123 Park Ave, San Jose| 111111|
+---------+----------------------+----------+
CREATE TABLE test_load (name VARCHAR(64), address VARCHAR(64), student_id INT) USING HIVE;
-- Assuming the students table is in '/user/hive/warehouse/'
LOAD DATA LOCAL INPATH '/user/hive/warehouse/students' OVERWRITE INTO TABLE test_load;
SELECT * FROM test_load;
+---------+----------------------+----------+
| name| address|student_id|
+---------+----------------------+----------+
|Amy Smith|123 Park Ave, San Jose| 111111|
+---------+----------------------+----------+
-- Example with partition specification.
CREATE TABLE test_partition (c1 INT, c2 INT, c3 INT) PARTITIONED BY (c2, c3);
INSERT INTO test_partition PARTITION (c2 = 2, c3 = 3) VALUES (1);
INSERT INTO test_partition PARTITION (c2 = 5, c3 = 6) VALUES (4);
INSERT INTO test_partition PARTITION (c2 = 8, c3 = 9) VALUES (7);
SELECT * FROM test_partition;
+---+---+---+
| c1| c2| c3|
+---+---+---+
| 1| 2| 3|
| 4| 5| 6|
| 7| 8| 9|
+---+---+---+
CREATE TABLE test_load_partition (c1 INT, c2 INT, c3 INT) USING HIVE PARTITIONED BY (c2, c3);
-- Assuming the test_partition table is in '/user/hive/warehouse/'
LOAD DATA LOCAL INPATH '/user/hive/warehouse/test_partition/c2=2/c3=3'
OVERWRITE INTO TABLE test_load_partition PARTITION (c2=2, c3=3);
SELECT * FROM test_load_partition;
+---+---+---+
| c1| c2| c3|
+---+---+---+
| 1| 2| 3|
+---+---+---+