In this assignment, you are given a dataset of approximately 20,000 news documents collected from a set of newsgroups (mailing lists). The documents (email messages) are partitioned almost evenly across 20 different topics, such as sport, electronics, and politics. The documents of each newsgroup are stored in one directory, and each document is stored in its own text file in a semi-structured format.
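The directory layout described above can be read with a short loader. This is only a sketch under stated assumptions: the posting does not specify the file format, so the code assumes each file looks like an email message (header lines of the form `Name: value`, a blank line, then the body), and the names `parse_message` and `load_newsgroups` are made up for illustration.

```python
import os


def parse_message(text):
    """Split a message into a header dict and a body string.

    Assumes (not confirmed by the posting) that headers and body are
    separated by the first blank line, as in email messages.
    """
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not "Name: value"
            headers[key.strip().lower()] = value.strip()
    return headers, body


def load_newsgroups(root):
    """Walk root/<newsgroup>/<file> and return one record per document."""
    docs = []
    for group in sorted(os.listdir(root)):
        group_dir = os.path.join(root, group)
        if not os.path.isdir(group_dir):
            continue
        for name in sorted(os.listdir(group_dir)):
            path = os.path.join(group_dir, name)
            # latin-1 decodes any byte sequence, so odd encodings do not crash the loader
            with open(path, encoding="latin-1") as f:
                headers, body = parse_message(f.read())
            docs.append({
                "topic": group,  # the directory name serves as the topic label
                "subject": headers.get("subject", ""),
                "body": body,
            })
    return docs
```

Calling `load_newsgroups("path/to/dataset")` (a placeholder path) returns a list of dictionaries, one per document, each labeled with the newsgroup directory it came from.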
17 freelancers are bidding on average $69 for this job
Hi, I am an expert in Java and big data. I can easily complete this project for you. We can discuss it in chat. Thanks. Relevant Skills and Experience: Java, big data. Proposed Milestones: $60 USD - all
Hi, how are you? I am interested in this job. You provided a description but not the details of the work to be done on the data. Can you please share more details? Cheers
Hi there, you will need my expertise on this project. I can deliver the job quickly and accurately. Please accept my bid. Thank you very much. Relevant Skills and Experience: Big data
I will use Hive with static partitioning to store the files according to topic (politics, sports, etc.). Relevant Skills and Experience: Hive, Sqoop, static partitioning. Proposed Milestones: $15 USD - I am well suited for this