Due: Monday April 21
Consider the following data, namely a set of word unigrams and bigrams from a familiar text:
Note that <s> is the beginning-of-sentence marker and </s> is the end-of-sentence marker. The conditional probability P(w2|w1) can be computed as C(w1 w2)/C(w1), where C(x) is the number of times x occurs in the data.
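Under a bigram model, the probability of a sentence is the product of the conditional probabilities of each word given the word before it:
P(<s> w1 w2 w3 </s>) = P(w1 | <s>) * P(w2 | w1) * P(w3 | w2) * P(</s> | w3)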
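The calculation can be sketched in a few lines of Python. This is only an illustration: the counts below are hypothetical toy values, not the data for this assignment, and the names unigram_counts, bigram_counts, cond_prob, and sentence_prob are made up for the example. It assumes no smoothing, so an unseen bigram gets probability 0, and it assumes every context word appears in the unigram counts.

```python
from fractions import Fraction

# Hypothetical toy counts for illustration only (NOT the assignment's data).
unigram_counts = {"<s>": 4, "the": 4, "cat": 2, "sat": 2}
bigram_counts = {
    ("<s>", "the"): 2,
    ("the", "cat"): 1,
    ("cat", "sat"): 1,
    ("sat", "</s>"): 2,
}

def cond_prob(w1, w2):
    """P(w2 | w1) = C(w1 w2) / C(w1); unseen bigrams get 0 (no smoothing)."""
    return Fraction(bigram_counts.get((w1, w2), 0), unigram_counts[w1])

def sentence_prob(words):
    """Multiply the bigram probabilities over <s> w1 ... wn </s>."""
    tokens = ["<s>"] + words + ["</s>"]
    prob = Fraction(1)
    for w1, w2 in zip(tokens, tokens[1:]):
        prob *= cond_prob(w1, w2)
    return prob

p = sentence_prob(["the", "cat", "sat"])
print(p, float(p))  # 1/16 0.0625
```

With these toy counts the product is (2/4) * (1/4) * (1/2) * (2/2) = 1/16. Keeping the result as an exact fraction until the last step makes it easier to show each factor before converting to a floating point number.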
What probability would a bigram language model based on the data above assign to each of the following sentences? Assume that in each case the sentence is surrounded by the beginning-of-sentence and end-of-sentence markers:
Your answer should be a floating-point number in each case. Show how you calculated it so that, if you get the answer wrong, I will at least know that you know how to calculate the probability and just made a clerical error.