DATA310_Public

Response for Class on 7/29

Using NLP to build a sarcasm classifier

Question 1: Pick two or three news sources and select a few news titles from their feed (about 5 is likely enough). For example, you could select CNN, Fox News, MSNBC, NPR, PBS, Al Jazeera, RT (Russia Today), Deutsche Welle, Facebook, BBC, France24, CCTV, NHK World, or another source you wish to analyze. Run your sarcasm model to predict whether the titles are interpreted as sarcastic or not. Analyze the results and comment on the different news sources you have selected.

Answer:

Articles Selected and Sarcasm Predictions:

CNN:

“Trump must win North Carolina. He’s losing there.”

“Biden clarifies he has not taken cognitive test”

NPR:

“Americans Back Trump On Immigration — But Only To Stop COVID-19, Poll Finds”

“Airline Food For Sale. No Plane Ticket Required”

“When Oil Prices Plummeted, So Did Oklahoma’s State Budget”

When the model assigns a numerical value to a title, it is predicting whether or not the title is sarcastic: the closer the score is to 1, the more likely the title is sarcastic, and inversely, the closer the score is to 0, the more likely the title is not sarcastic. The model did a better job than I thought it would. Honestly, with the state of everything in the world I sometimes have a hard time telling whether a title is sarcastic or not, so I thought the model did well. The only title I believe the model got wrong was the second one it classified as sarcastic, the Biden cognitive-test headline; based on the tone of the article itself, I do not think it is sarcastic. However, ten years ago in the world of politics, the need to know and to clarify whether or not a party's candidate for the Presidency had taken a cognitive test would have come across as so unnecessary that it would have read as sarcastic, so I do understand the mistake on the model's part.
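
A minimal sketch of how such per-title scores can be produced is shown below. It assumes the trained sarcasm model (`model`) and the fitted `tokenizer` from the class notebook; the padding settings are assumptions and should match whatever was used during training.

```python
# Sketch: scoring news headlines with an already-trained Keras sarcasm classifier.
# `model` and `tokenizer` are assumed to come from the class notebook.
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_length = 100        # assumed padding length used when the model was trained
padding_type = "post"
trunc_type = "post"

headlines = [
    "Trump must win North Carolina. He's losing there.",
    "Biden clarifies he has not taken cognitive test",
    "Americans Back Trump On Immigration — But Only To Stop COVID-19, Poll Finds",
    "Airline Food For Sale. No Plane Ticket Required",
    "When Oil Prices Plummeted, So Did Oklahoma's State Budget",
]

# Convert the raw titles into padded integer sequences the model expects.
sequences = tokenizer.texts_to_sequences(headlines)
padded = pad_sequences(sequences, maxlen=max_length,
                       padding=padding_type, truncating=trunc_type)

# The sigmoid output is a probability: near 1.0 = likely sarcastic, near 0.0 = not.
for title, score in zip(headlines, model.predict(padded)):
    print(f"{score[0]:.3f}  {title}")
```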

Text generation with an RNN

Question 1: Use the generate_text() command at the end of the exercise to produce synthetic output from your RNN model. Run it a second time and review the output. How has your RNN model been able to “learn” and “remember” the Shakespeare text in order to reproduce a similar output?

Answer:

Generated Shakespeare Text:

RNN models are able to learn the sequences of letters/strings in a text and then generate new, plausible sequences of letters/strings. First, the model represents each string/letter as a numerical value. Then it takes the first string/letter specified as a seed and predicts/generates the next string/letter based on the sequences it learned from the text provided to the model.
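
Conceptually, the generation step described above looks like the loop below. This is a minimal sketch that assumes the trained character-level model and the `char2idx` / `idx2char` lookups from the tutorial notebook (and that the generation model carries its recurrent state between calls); the temperature and output length are arbitrary choices.

```python
# Sketch: character-by-character text generation with a trained RNN.
# `model`, `char2idx`, and `idx2char` are assumed from the tutorial notebook.
import tensorflow as tf

def generate_text(model, start_string, num_generate=500, temperature=1.0):
    # Map the seed characters to their integer ids and add a batch dimension.
    input_ids = [char2idx[c] for c in start_string]
    input_ids = tf.expand_dims(input_ids, 0)

    generated = []
    model.reset_states()  # clear the recurrent state before a fresh generation run
    for _ in range(num_generate):
        predictions = model(input_ids)             # (1, seq_len, vocab_size) logits
        predictions = predictions[:, -1, :] / temperature
        # Sample the next character id instead of always taking the argmax.
        next_id = tf.random.categorical(predictions, num_samples=1)[0, 0].numpy()
        input_ids = tf.expand_dims([next_id], 0)   # feed the prediction back in
        generated.append(idx2char[next_id])

    return start_string + "".join(generated)

print(generate_text(model, start_string="ROMEO: "))
```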

Stretch Goal: Harry Potter Generated Text

The Harry Potter text for the most part does not make sense and contains words that do not even exist; however, an aspect that really surprised me was that, when generating dialogue between characters, the model was still able to get across the attitudes the characters have towards each other. The model should definitely be trained for more epochs, and the model layers should probably be adjusted a little, as sketched below.
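
One possible adjustment along those lines is an extra recurrent layer and a longer training run. The sketch below assumes the `dataset` pipeline and `vocab_size` from the tutorial notebook; the layer sizes and the epoch count are assumptions, not tuned values.

```python
# Sketch: a slightly deeper character-level model trained for more epochs.
# `dataset` (batches of input/target character sequences) and `vocab_size`
# are assumed from the tutorial notebook.
import tensorflow as tf

embedding_dim = 256
rnn_units = 1024

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.GRU(rnn_units, return_sequences=True),
    tf.keras.layers.GRU(rnn_units, return_sequences=True),  # extra recurrent layer
    tf.keras.layers.Dense(vocab_size),                       # logits over the vocabulary
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Train longer than the default run; 30 epochs is an arbitrary choice.
history = model.fit(dataset, epochs=30)
```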

Neural machine translation with attention

Question 1: Use the translate() command at the end of the exercise to translate three sentences from Spanish to English. How did your translations turn out?

Answer:

It did not go very well, which makes sense to me. Throughout my Arabic classes in college, it has always been painfully obvious when someone has tried to use Google Translate for assignments, because the translator does not understand the cultural aspects of a word, or the fact that words which appear to mean the same thing are used in different contexts.
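
For reference, the three test translations can be run with a loop like the one below, assuming the `translate()` helper defined at the end of the notebook; the Spanish sentences shown are just illustrative examples.

```python
# Sketch: translating a few Spanish sentences with the notebook's translate() helper,
# which is assumed to print the input, the predicted translation, and the attention plot.
sentences = [
    u"hace mucho frio aqui.",      # "it is very cold here."
    u"esta es mi vida.",           # "this is my life."
    u"todavia estan en casa?",     # "are you still at home?"
]

for s in sentences:
    translate(s)
```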