
Need help with a Python program. Please follow the instructions provided in the assignment.

Part C: Sentiment Analysis of Unlabelled Reviews
Once Part B is complete, save your program into a new file so that you can modify it for Part C.
Now, after creating and filling in the dictionary in Part A, open a new file containing unlabelled
movie reviews (unlabelledReviews.txt). The file will contain one movie review per line.
Read in this file one line at a time, and calculate the rating for that line using the dictionary you
created in Part A. The rating for each unlabelled review should be calculated as the average
rating of each word in that review (which you already know how to get from Part B, right?)
For each review in this new file, print out:
1. The review itself,
2. The rating for that review to 1 decimal place, and
3. Whether that review is considered POSITIVE, NEUTRAL, or NEGATIVE:
Reviews with a rating greater than 2 are POSITIVE
Reviews with a rating equal to 2 are NEUTRAL
Reviews with a rating less than 2 are NEGATIVE
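
The three rules above can be sketched as a small function. The names classify and describe are illustrative and not part of the assignment. This sketch assumes the comparison is made on the unrounded rating, so a review averaging 2.04 prints as 2.0 but still counts as POSITIVE; the assignment does not say which is intended.

```python
def classify(rating):
    """Map an average rating to a label using the thresholds above."""
    if rating > 2:
        return "POSITIVE"
    if rating < 2:
        return "NEGATIVE"
    return "NEUTRAL"


def describe(rating):
    """Build the summary line, showing the rating to 1 decimal place."""
    # Assumption: classify on the unrounded value, round only for display.
    return f"This review is rated at {rating:.1f} which is {classify(rating)}."


print(describe(1.9))
print(describe(2))
```

If the marker expects the label to agree with the printed value, round the rating first and pass the rounded number to classify instead.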
Hint: When debugging your code, it will be helpful to print out each word in the new review and
the word's rating – so you can make sure that everything is working correctly. Then remove
those print statements to get the final output, as below.
Example output for two unlabelled movie reviews from a file:
(Note: the file has more than 2 unlabelled movie reviews, and your program must process all of
them)
The movie review was:
"You could hate it for the same reason
This review is rated at 1.9 which is NEGATIVE.
The movie review was:
"The performances are an absolute joy
This review is rated at 2.2 which is POSITIVE.
End of Processing.
Requirements and Development notes:
You do not need to ask the user for the filenames in Q2; you can "hardcode" the file
names into your program.
Sample files are available on UM Learn.
You can assume that the Grader will put the input files into the same directory as
your code when marking.
Your program should display output in the format shown above.
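
A minimal sketch of the Part C loop, assuming word_scores is the Part A dictionary (word -> [total score, count]). The function names report and process_file are illustrative. The assignment does not say what to do with words that are missing from the dictionary; this sketch skips them and treats a review with no known words as neutral (2.0), both of which are assumptions.

```python
def report(review, word_scores):
    """Return the three output lines for one unlabelled review."""
    averages = []
    for word in review.lower().split():
        if word in word_scores:  # assumption: unknown words are skipped
            total, count = word_scores[word]
            averages.append(total / count)
    # Assumption: a review with no known words is treated as neutral.
    rating = sum(averages) / len(averages) if averages else 2.0
    if rating > 2:
        label = "POSITIVE"
    elif rating < 2:
        label = "NEGATIVE"
    else:
        label = "NEUTRAL"
    return (f"The movie review was:\n{review}\n"
            f"This review is rated at {rating:.1f} which is {label}.")


def process_file(unlabelled_path, word_scores):
    """Print a report for every non-blank line of the unlabelled file."""
    with open(unlabelled_path) as infile:
        for line in infile:
            review = line.strip()
            if review:
                print(report(review, word_scores))
    print("End of Processing.")
```

With the dictionary from Part A in hand, calling process_file("unlabelledReviews.txt", word_scores) would process every review in the hardcoded file.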
Question 2: Movie Reviews
Sentiment Analysis is a problem within the field of Artificial Intelligence which seeks to
determine the general attitude of a writer given some text they have written. For instance, we
would like the program to recognize that the text "My favourite film all year" is a positive
statement while "A giant waste of time" is negative.
One algorithm that we can use for this is to assign a number to each word based on how positive
or negative that word is, and then score the statement based on the values of the words. But, how
do we come up with our word scores in the first place?
That's what we will do in this assignment. You are going to search through a file containing
movie reviews from the Rotten Tomatoes website which have both a numeric score as well as
text. Your program will use this to learn which words are positive and which are negative. The
file is called movieReviews.txt and is available on UM Learn.
Notice that each review starts with a number 0 through 4 with the following meaning:
0: negative
1: somewhat negative
2: neutral
3: somewhat positive
4: positive
You are going to write a program that determines the score for each word in this file, and then
uses those word scores to decide if an unlabelled movie review is positive, negative, or neutral.
Part A: Learning from Labelled Movie Reviews
To begin, your program must compute the average sentiment score for each of the words in the
movieReviews.txt file. Download the text file and save it in the same folder where your
program will be. Then write a program to do the following:
• Set up a new, empty dictionary.
• Iterate over every review in the text file (there is one review per line).
• Examine every word in every review within the file.
• If the word is not yet in your dictionary:
  o Add a new entry into your dictionary for that word. The word itself is the key, and
    the value to store at this key is a list that contains two items: the sentiment score
    and the number 1 (meaning that you've seen this word 1 time).
• Otherwise (if the word is already in your dictionary):
  o Add the new sentiment score to the score that is already stored in the list, and
  o Increase the number of times that you have seen this word.
For example, if the file contained only two reviews:
4 It was great
1 It was terrible
Then we would have the following key-value pairs in the dictionary:
Key        Value
it         [5, 2]
was        [5, 2]
great      [4, 1]
terrible   [1, 1]
There is no console output for Part A.
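
The Part A steps can be sketched as follows. The function name build_word_scores is illustrative, and it takes a list of lines rather than a file so the two-review example can be checked directly; in the real program the lines would come from movieReviews.txt. The final print is only for checking the sketch, since Part A itself has no console output.

```python
def build_word_scores(lines):
    """Map each lowercased word to [total_score, times_seen]."""
    word_scores = {}
    for line in lines:
        tokens = line.lower().split()
        if not tokens:
            continue  # skip blank lines
        score = int(tokens[0])  # each review starts with a number 0 through 4
        for word in tokens[1:]:
            if word not in word_scores:
                word_scores[word] = [score, 1]
            else:
                word_scores[word][0] += score
                word_scores[word][1] += 1
    return word_scores


scores = build_word_scores(["4 It was great", "1 It was terrible"])
print(scores)
# {'it': [5, 2], 'was': [5, 2], 'great': [4, 1], 'terrible': [1, 1]}
```

Because a file object yields one line per iteration, passing an open file instead of the list should work without changing the function.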
What is a word? When designing a program like this, you need to make sure that you and the
program's end users agree on what counts as a unique word. For this assignment:
• Ignore capitalization: "And" and "and" should be counted as the same word.
• Do not worry about punctuation, symbols, or numbers within the review text - you do not
need to remove these. Just put everything you find into your dictionary. Your dictionary
will have entries that are just symbols (ex. "." or ",") and also entries that are numbers.
• Make sure to strip out all whitespace from each word. For example, you should not have
words in your dictionary that contain a space, tab character ("\t"), or newline ("\n").
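
One way to meet all three rules at once is to lowercase the line and call split() with no arguments, which splits on any run of whitespace and never returns empty strings. The helper name words_in is illustrative.

```python
def words_in(text):
    """Lowercase the text and split it on any run of whitespace."""
    # str.split() with no argument treats spaces, tabs and newlines alike
    # and drops leading/trailing whitespace, so no token contains any.
    return text.lower().split()


print(words_in("And\tAND  and\n"))
print(words_in("A giant waste of time .\n"))
```

Punctuation tokens such as "." are left in place, which matches the instruction not to remove them.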
Part B: Creating Word Scores
After your dictionary has been created and filled in Part A, ask the user to enter a word in the
console.
If that word exists in the dictionary, print out the number of times that word occurred in the
movie review file and the average rating for that word. The average rating is the total score
divided by the number of times that word occurred in the movie review file.
Imagine our movie review file contains the same two reviews from Part A.
Here are 3 example sessions (user input in blue):
Enter a word: Great
'great' appears 1 time(s) and has an average rating of 4.
End of Processing.
Enter a word: it
'it' appears 2 time(s) and has an average rating of 2.5.
End of Processing.
Enter a word: haberdashery
'haberdashery' does not appear in any movie reviews.
End of Processing.
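
A sketch of the Part B lookup, assuming word_scores is the Part A dictionary. The function name lookup_message is illustrative. The sample sessions show "4" rather than "4.0", so this sketch uses the "g" format to drop the trailing zero; that is an assumption about the expected formatting, and "g" keeps at most 6 significant digits.

```python
def lookup_message(word, word_scores):
    """Return the Part B message for one word typed by the user."""
    key = word.strip().lower()  # ignore capitalization and stray whitespace
    if key not in word_scores:
        return f"'{key}' does not appear in any movie reviews."
    total, count = word_scores[key]
    average = total / count
    return (f"'{key}' appears {count} time(s) "
            f"and has an average rating of {average:g}.")


example = {"it": [5, 2], "was": [5, 2], "great": [4, 1], "terrible": [1, 1]}
print(lookup_message("Great", example))
print(lookup_message("it", example))
print(lookup_message("haberdashery", example))
```

In the full program the word would come from input("Enter a word: "), and the message would be followed by the line End of Processing.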