cloud_computing_mrjob

A sandbox for cloud computing MapReduce jobs built with mrjob.

Run only the mapper from the command line

python3 mr_word_count.py -r local --mapper < data/pg27827.txt
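For orientation, here is a rough standard-library sketch of the kind of output a mapper-only run prints. It is not the contents of `mr_word_count.py`: the names `WORD_RE`, `map_line`, and `format_pair` are hypothetical, the word-splitting rule is an assumption, and the line format (JSON key, tab, JSON value) assumes the job keeps mrjob's default JSON protocol.

```python
import json
import re

# Assumption: a "word" is a run of word characters or apostrophes.
WORD_RE = re.compile(r"[\w']+")


def map_line(line):
    """Yield (word, 1) for each word in a line, as a word-count mapper would."""
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def format_pair(key, value):
    """Encode one pair as JSON key, tab, JSON value (assumed default protocol)."""
    return json.dumps(key) + "\t" + json.dumps(value)


if __name__ == "__main__":
    # Hypothetical sample input standing in for lines of data/pg27827.txt.
    for sample in ["The cat, the hat"]:
        for key, value in map_line(sample):
            print(format_pair(key, value))
```

Each word appears once per occurrence with a count of 1, because the reducer that would sum the counts never runs in a mapper-only invocation.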

Run from the command line and write output to a directory

python3 mr_count_freq_words.py data/pg27827.txt --output-dir=freq_word_count_out --no-output
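With `--output-dir` the results land in the given directory rather than on stdout. A minimal sketch for reading them back in Python follows, assuming the runner writes `part-*` files and the job uses mrjob's default JSON output (JSON key, tab, JSON value per line). The helper name `read_job_output` is hypothetical.

```python
import glob
import json
import os


def read_job_output(output_dir):
    """Yield (key, value) pairs from every part-* file in output_dir."""
    # Sort so the parts are read in a stable order.
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line:
                    continue  # skip blank lines
                # Assumed format: JSON-encoded key, a tab, JSON-encoded value.
                key, value = line.split("\t", 1)
                yield json.loads(key), json.loads(value)
```

For the command above, `list(read_job_output("freq_word_count_out"))` would collect the pairs. If the job sets a different output protocol, the parsing step needs to change to match.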

Run in Jupyter and write output to a directory

!python3 ipynb_map_reduce_cc1_top10.py -r local ./data/shortjokes.csv --output-dir=ipynb_map_reduce_cc1_top10_out --no-output

Run on AWS EMR

!python3 ipynb_map_reduce_cc1_top10.py -r emr s3://cc1top10/shortjokes.csv --output-dir=s3://cc1top10/output --no-output