Learning the Treasure Client Command-Line Reference with Real Data: 1. Data Import - doryokujin's blog
I basically followed the article linked above as is.
# Create the table
$ td table:create test shigemk2_bulk
Table 'test.shigemk2_bulk' is created.

# Create a bulk import session
$ td import:create session_shigemk2 test shigemk2_bulk
Bulk import session 'session_shigemk2' is created.

# Prepare the data, treating the first line as the header. The prepared files can be reused to import as many times as needed.
$ td import:prepare 101-2014-02.csv --format csv --column-header --time-column 'time' -o ./parts/
Preparing sources
  Output dir : ./parts/
  Source : 101-2014-02.csv (13842646 bytes)
Converting '101-2014-02.csv'...
sample row: {"time":0,"device":"1366x768","browser":"Mozilla\/5.0 (Windows NT 6.3; WOW64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/32.0.1700.102 Safari\/537.36","unknown":24,"language":"ja,en-US;q=0.8,en;q=0.6","referer":"http:\/\/zenback.itmedia.co.jp\/contents","ip":"xxx.xx.xxx.xxx"}
Prepare status:
  Source : 101-2014-02.csv
  Status : SUCCESS
  Read lines : 37881
  Valid rows : 37880
  Invalid rows : 0
  Converted Files : ./parts/101-2014-02_csv_0.msgpack.gz (2084235 bytes)
Next steps:
  => execute following 'td import:upload' command. if the bulk import session is not created yet, please create it with 'td import:create <session> <database> <table>' command.
  $ td import:upload <session> './parts/101-2014-02_csv_0.msgpack.gz'

# Upload the data. At this stage the data is only being uploaded.
$ td import:upload session_shigemk2 './parts/101-2014-02_csv_0.msgpack.gz'
Uploading prepared sources
  Session name : session_shigemk2
  Source : ./parts/101-2014-02_csv_0.msgpack.gz (2084235 bytes)
Uploading ./parts/101-2014-02_csv_0.msgpack.gz (2084235 bytes)...
Upload status:
  Source : ./parts/101-2014-02_csv_0.msgpack.gz
  Status : SUCCESS
  Part name : 101-2014-02_csv_0_msgpack_gz
  Size : 2084235
  Retry count : 0
Next Steps:
  => execute 'td import:perform session_shigemk2'.

# Store the data. This took quite a while (about 51 minutes, going by the start and finish times in the job log).
$ td import:perform session_shigemk2
Job 9279134 is queued.
Use 'td job:show [-w] 9279134' to show the status.
$ td job:show -w 9279134
JobID : 9279134
Status : running
Type : bulk_import_perform
Database : test
queued...
started at 2014-04-03T22:41:06Z
14/04/03 22:41:11 INFO log.MLog: MLog clients using log4j logging.
14/04/03 22:41:11 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
14/04/03 22:41:12 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/04/03 22:41:17 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
finished at 2014-04-03T23:32:15Z
Use '-v' option to show detailed messages.

# Commit the data
$ td import:commit session_shigemk2
Bulk import session 'session_shigemk2' started to commit.
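The steps in the transcript above could be chained into one script. The following is only a rough sketch, not something from the original article: the function names (`run`, `job_id_from_output`, `bulk_import`), the `DRY_RUN` switch, and the `sed` pattern that pulls the job ID out of the "Job ... is queued." message are my own assumptions. Only the `td` subcommands and flags that appear in the transcript are used. By default it prints the commands without running them.

```shell
#!/usr/bin/env bash
# Sketch of the bulk import workflow shown in the transcript.
# Hypothetical helpers: run, job_id_from_output, bulk_import, DRY_RUN.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the td commands, 0 = really run them

# Print a command, and execute it unless DRY_RUN is 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

# Read td output on stdin and print the job ID from a line that starts
# with "Job <digits> is queued." (message format taken from the transcript).
job_id_from_output() {
  sed -n 's/^Job \([0-9][0-9]*\) is queued\..*$/\1/p'
}

# bulk_import <database> <table> <session> <csv file> <parts dir>
bulk_import() {
  local db="$1" table="$2" session="$3" csv="$4" parts_dir="${5%/}"
  local part perform_out job_id

  run td table:create "$db" "$table" || return 1
  run td import:create "$session" "$db" "$table" || return 1
  run td import:prepare "$csv" --format csv --column-header \
      --time-column time -o "$parts_dir/" || return 1

  # Upload every prepared part file. In a dry run nothing has been
  # prepared, so the unexpanded glob pattern is printed instead.
  for part in "$parts_dir"/*.msgpack.gz; do
    run td import:upload "$session" "$part" || return 1
  done

  if [ "$DRY_RUN" = "1" ]; then
    run td import:perform "$session"
    echo "+ td job:show -w <job ID printed by import:perform>"
  else
    perform_out="$(td import:perform "$session")" || return 1
    echo "$perform_out"
    job_id="$(echo "$perform_out" | job_id_from_output)"
    if [ -z "$job_id" ]; then
      echo "could not find a job ID in the import:perform output" >&2
      return 1
    fi
    # Wait for the perform job to finish before committing.
    run td job:show -w "$job_id" || return 1
  fi

  run td import:commit "$session"
}

# Same values as in the transcript; prints the commands in dry-run mode.
bulk_import test shigemk2_bulk session_shigemk2 101-2014-02.csv ./parts/
```

Running it with `DRY_RUN=0` would execute the commands for real. Whether `td job:show -w` returns a non-zero exit status when the job fails is something I have not verified, so it is probably safer to check the job status by eye before relying on the automatic commit.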