Homework 5 (Chapter 9 and Chapter 10)
What are the main challenges of text analysis?
What is a corpus?
What are common words (such as a, and, of) called?
Why can’t we use term frequency (TF) alone to measure the usefulness of a word?
What is a caveat of inverse document frequency (IDF)? How does TFIDF address the problem?
Name three benefits of using TFIDF.
What methods can be used for sentiment analysis?
Research and document additional use cases and actual implementations for Hadoop.
Compare and contrast Hadoop, Pig, Hive, and HBase. List strengths and weaknesses of each tool set.
Research and summarize three published use cases for Hadoop, Pig, Hive, and HBase.