## NPTEL Natural Language Processing Week 4 Assignment Answers 2023

**1. Baum-Welch algorithm is an example of – [Marks 1]**

a. Forward-backward algorithm

b. Special case of the Expectation-maximization algorithm

c. Both A and B

d. None

Answer :-

**2. **

Answer :-

**3. **

Answer :-

**4. Let us define an HMM with K classes for hidden states and T data points as observations. The dataset is defined as X = {x1, x2, ..., xT} and the corresponding hidden states are Z = {z1, z2, ..., zT}. Please note that each xi is an observed variable and each zi can belong to one of K classes for the hidden state. What will be the sizes of the state transition matrix and the emission matrix, respectively, for this example?**

A) K × K, K × T

B) K × T, K × T

C) K × K, K × K

D) K × T, K × K

Answer :-

**5. You are building a model distribution for an infinite stream of word tokens. You know that the source of this stream has a vocabulary of size 1000. Out of these 1000 words, you know 100 words to be stop words, each of which has a probability of 0.0019. With only this knowledge, what is the maximum possible entropy of the modelled distribution? (Use log base 10 for entropy calculation) [Marks 2]**

a. 5.079

b. 0

c. 2.984

d. 12.871

Answer :-
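As a rough check on Question 5, entropy is maximized when the probability mass not claimed by the 100 stop words is spread uniformly over the remaining 900 words. The sketch below works through that arithmetic; the function name `max_entropy_base10` and its parameters are illustrative assumptions, not part of the assignment.

```python
import math

def max_entropy_base10(vocab_size, n_fixed, p_fixed):
    """Entropy (log base 10) when n_fixed words each have probability p_fixed
    and the leftover mass is spread uniformly over the remaining words."""
    n_free = vocab_size - n_fixed
    free_mass = 1.0 - n_fixed * p_fixed   # mass left for the non-stop words
    p_free = free_mass / n_free           # uniform share maximizes entropy
    fixed_part = -n_fixed * p_fixed * math.log10(p_fixed)
    free_part = -n_free * p_free * math.log10(p_free)
    return fixed_part + free_part

entropy = max_entropy_base10(vocab_size=1000, n_fixed=100, p_fixed=0.0019)
print(round(entropy, 3))  # 2.984
```

Under this assumption the result agrees with option c.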

**6. For an HMM with N hidden states and V observable states, what are the dimensions of the parameter matrices A, B and π? A: Transition matrix, B: Emission matrix, π: Initial probability matrix. [Marks 1]**

a. N × V, N × V, N × N

b. N × N, N × V, N × 1

c. N × N, V × V, N × 1

d. N × V, V × V, V × 1

Answer :-
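To make the shapes in Question 6 concrete, here is a minimal sketch that builds placeholder HMM parameters with uniform probabilities and prints their dimensions. The helper names `uniform_hmm` and `shape`, and the values N = 3 and V = 5, are assumptions chosen only for illustration.

```python
def uniform_hmm(n_states, n_symbols):
    """Build placeholder HMM parameters with uniform probabilities."""
    # A[i][j]: probability of moving from hidden state i to hidden state j.
    A = [[1.0 / n_states] * n_states for _ in range(n_states)]
    # B[i][k]: probability that hidden state i emits observable symbol k.
    B = [[1.0 / n_symbols] * n_symbols for _ in range(n_states)]
    # pi[i][0]: probability of starting in hidden state i (column vector).
    pi = [[1.0 / n_states] for _ in range(n_states)]
    return A, B, pi

def shape(matrix):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))

A, B, pi = uniform_hmm(n_states=3, n_symbols=5)
print(shape(A), shape(B), shape(pi))  # (3, 3) (3, 5) (3, 1)
```

With N hidden states and V observable symbols, the transition matrix is N × N, the emission matrix is N × V, and the initial distribution is N × 1.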

**7. **

Answer :-

**8. In Hidden Markov Models (HMMs), the joint likelihood of an observed sequence O with a hidden state sequence Q is written as P(O, Q; θ). In many applications, like POS tagging, one is interested in finding the hidden state sequence Q, for a given observation sequence, that maximizes P(O, Q; θ). What is the time required to compute the most likely Q using an exhaustive search? The required notations are N: the number of possible hidden states, and T: the length of the observed sequence. [Marks 1]**

a. Of the order of T·N^T

b. Of the order of N^2·T

c. Of the order of T^N

d. Of the order of N^2

Answer :-
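The cost of exhaustive search in Question 8 can be seen by simply writing it out: there are N^T candidate state sequences, and scoring each one takes on the order of T multiplications. The sketch below is illustrative only; the function name and the toy parameters (`pi`, `A`, `B`, `obs`) are hypothetical values, not taken from the assignment.

```python
from itertools import product

def exhaustive_best_path(obs, A, B, pi):
    """Score every possible hidden state sequence and keep the best one."""
    n_states = len(pi)
    best_path, best_prob, n_candidates = None, -1.0, 0
    # product(...) yields all N**T state sequences of length T.
    for path in product(range(n_states), repeat=len(obs)):
        n_candidates += 1
        # Joint probability P(O, Q): about T multiplications per sequence.
        prob = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            prob *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        if prob > best_prob:
            best_path, best_prob = path, prob
    return best_path, best_prob, n_candidates

# Hypothetical toy parameters: N = 2 states, 3 symbols, T = 3 observations.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

best_path, best_prob, n_candidates = exhaustive_best_path(obs, A, B, pi)
print(n_candidates)  # 8, i.e. N**T = 2**3 sequences scored
```

The loop body runs N^T times with roughly T steps each, which gives a total of the order of T·N^T. The Viterbi algorithm avoids this blow-up with dynamic programming.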

## NPTEL Natural Language Processing Week 3 Assignment Answers 2023

**1. Which of the following words contains both derivational as well as inflectional suffixes?**

- regularity
- carefully
- older
- availabilities

Answer :-

**2. Let’s assume the probability of rolling a 1 twice in a row with a die is p. Consider a sentence consisting of N random digits. A model assigns the probability p to each of the digits. Find the perplexity of the sentence.**

1. 10

2. 6

3. 36

4. 3

Answer :-
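For Question 2, perplexity is the inverse geometric mean of the per-token probabilities, so when every digit gets the same probability p the sentence length N cancels out. The sketch below assumes a fair six-sided die, so p = (1/6)² = 1/36; the helper name `perplexity` and the choice N = 10 are illustrative.

```python
import math

def perplexity(token_probs):
    """Inverse geometric mean of the per-token probabilities."""
    n = len(token_probs)
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / n)

p = (1 / 6) ** 2          # probability of rolling a 1 twice in a row (fair die)
pp = perplexity([p] * 10)  # N = 10 is arbitrary; any N gives the same value
print(round(pp, 6))        # 36.0
```

Under this assumption the perplexity is 1/p = 36, which corresponds to option 3.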

**3. Assume that “x” represents the input and “y” represents the tag/label. Which of the following mappings are correct?**

- Generative Models – learn Joint Probability p(x, y)
- Discriminative Models – learn Joint Probability p(x, y)
- Generative Models – learn Posterior Probability p(y | x) directly
- Discriminative Models – learn Posterior Probability p(y | x) directly

Answer :-

**4. Which one of the following is an example of the Generative model?**

- Conditional Random Fields
- Naive Bayes
- Support Vector Machine
- Logistic Regression

Answer :-

**5. Natural language processing is essentially the study of the meaning of the words a human says or writes. Natural language processing is all around us all the time, but it also happens to be a way to improve the chatbot or product we interact with on a regular basis. Natural language processing is all about mimicking our own language patterns. Natural language processing can also improve the efficiency of business transactions and customer care. Natural language processing is the application of computer technology.**

Suppose we want to check the probabilities of the final words that succeed the string “language processing” in the above paragraph. Assume d = 0; it is also given that the number of unigrams = 78, the number of bigrams = 122, and the number of trigrams = 130. Question 6 and Question 7 are related to the Question 5 corpus.

Solve the question with the help of Kneser-Ney backoff technique.

**What is the continuation probability of “is”?**

- 0.0078
- 0.0076
- 0.0307
- 0.0081

Answer :-

**6. What will be the value of P(is | language processing) using the Kneser-Ney backoff technique? Choose the correct answer below. Please follow the paragraph in Question 5.**

- 0.5
- 0.6
- 0.8
- 0.7

Answer :-

**7. What is the value of P(can | language processing)? Please follow the paragraph in Question 5.**

- 0.1
- 0.02
- 0.3
- 0.2

Answer :-

**8. Which of the following morphological processes is true for motor + hotel → motel?**

- Suppletion
- Compounding
- Blending
- Clipping

Answer :-

**9. Consider the HMM given below to solve the sequence labeling problem of POS tagging. With that HMM, calculate the probability that the sequence of words “free workers” will be assigned the following parts of speech;**

**The above table contains emission probability and the figure contains transition probability**

- 4.80 x 10^-8
- 9.80 x 10^-8
- 3.96 x 10^-7
- 4.96 x 10^-8

Answer :-

**10. Which of the following is/are true?**

- Only a few non-deterministic automata can be transformed into deterministic ones
- The recognition problem can be solved in linear time
- A deterministic FSA might contain empty (ε) transitions
- There exists an algorithm to transform each automaton into a unique equivalent automaton with the least number of states

Answer :-

## NPTEL Natural Language Processing Week 2 Assignment Answers 2023

**1. According to Zipf’s law, which statement(s) is/are correct?**

(i) A small number of words occur with high frequency.

(ii) A large number of words occur with low frequency.

a. Both (i) and (ii) are correct

b. Only (ii) is correct

c. Only (i) is correct

d. Neither (i) nor (ii) is correct

Answer :- a. Both (i) and (ii) are correct. Zipf's law is an empirical law that states that in a large text corpus, the frequency of a word is inversely proportional to its rank. In other words, a small number of words occur with high frequency, while a large number of words (rare words) occur with low frequency.

**2. Consider the following corpus C1 of 4 sentences. What is the total count of unique bi-grams for which the likelihood will be estimated? Assume we do not perform any pre-processing.**

today is Nayan’s birthday

she loves ice cream

she is also fond of cream cake

we will celebrate her birthday with ice cream cake

a. 24

b. 28

c. 27

d. 23

Answer :- a. 24
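The count for Question 2 can be reproduced with a short script. This is a sketch under one stated assumption: the listed answer of 24 is obtained only if each sentence is padded with boundary markers (written here as `<s>` and `</s>`, which are not shown in the question). Without the markers the same corpus yields fewer unique bigrams.

```python
corpus = [
    "today is Nayan's birthday",
    "she loves ice cream",
    "she is also fond of cream cake",
    "we will celebrate her birthday with ice cream cake",
]

def unique_bigrams(sentences, add_boundaries):
    """Collect the set of distinct adjacent word pairs, split on whitespace only."""
    seen = set()
    for sentence in sentences:
        tokens = sentence.split()
        if add_boundaries:
            # Assumed sentence-boundary markers; not part of the raw corpus.
            tokens = ["<s>"] + tokens + ["</s>"]
        seen.update(zip(tokens, tokens[1:]))
    return seen

with_markers = len(unique_bigrams(corpus, add_boundaries=True))
without_markers = len(unique_bigrams(corpus, add_boundaries=False))
print(with_markers, without_markers)  # 24 18
```

The repeated pairs "ice cream" and "cream cake" are each counted once, as are the repeated boundary pairs.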

**3. A 4-gram model is a _________ order Markov Model.**

a. Constant

b. Five

c. Four

d. Three

Answer :- d. Three

**4. Which one of these is a valid Markov assumption?**

a. The probability of a word depends only on the current word.

b. The probability of a word depends only on the previous word.

c. The probability of a word depends only on the next word.

d. The probability of a word depends only on the current and the previous word.

Answer :- b. The probability of a word depends only on the previous word. In a Markov model, the probability of a future event (in this case, the occurrence of a word) depends only on the current state (the previous word), and not on the sequence of events that preceded it. This is known as the Markov property, and it is the foundation of Markov models, including Markov chains and n-gram language models.

**5. For the string ‘mash’, identify which of the following set of strings have a Levenshtein distance of 1.**

a. smash, mas, lash, mushy, hash

b. bash, stash, lush, flash, dash

c. smash, mas, lash, mush, ash

d. None of the above

Answer :- c. smash, mas, lash, mush, ash
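The distances in Question 5 can be checked with the standard dynamic-programming formulation of Levenshtein distance, where insertion, deletion, and substitution each cost 1. The function name `levenshtein` is an illustrative choice; this is a minimal sketch rather than an optimized implementation.

```python
def levenshtein(a, b):
    """Edit distance with unit costs for insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                   # deleting i characters reaches the empty prefix of b
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]

option_c = ["smash", "mas", "lash", "mush", "ash"]
print([levenshtein("mash", w) for w in option_c])  # [1, 1, 1, 1, 1]
print(levenshtein("mash", "mushy"))                # 2, so option a fails
```

Every word in option c is one edit away from "mash", while "mushy" in option a needs a substitution plus an insertion.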

**6. Assume that we modify the costs incurred for operations in calculating Levenshtein distance, such that both the insertion and deletion operations incur a cost of 1 each, while substitution incurs a cost of 2. Now, for the string ‘lash’, which of the following sets of strings will have an edit distance of 1?**

a. ash, slash, clash, flush

b. flash, stash, lush, blush

c. slash, last, bash, ash

d. None of the above

Answer :-d. None of the above
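When substitution costs 2, it is never cheaper than a deletion followed by an insertion, so the modified distance in Question 6 equals len(a) + len(b) − 2·LCS(a, b), where LCS is the length of the longest common subsequence. The sketch below uses that identity; the helper names `lcs_length` and `indel_distance` are illustrative assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def indel_distance(a, b):
    """Edit distance with insertion = deletion = 1 and substitution = 2."""
    return len(a) + len(b) - 2 * lcs_length(a, b)

options = {
    "a": ["ash", "slash", "clash", "flush"],
    "b": ["flash", "stash", "lush", "blush"],
    "c": ["slash", "last", "bash", "ash"],
}
distances = {label: [indel_distance("lash", w) for w in words]
             for label, words in options.items()}
print(distances)
# {'a': [1, 1, 1, 3], 'b': [1, 3, 2, 3], 'c': [1, 2, 2, 1]}
```

Each option contains at least one word whose distance from "lash" is greater than 1, which is consistent with "None of the above".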

**7. Given a corpus C2, the Maximum Likelihood Estimation (MLE) for the bigram “dried berries” is 0.3 and the count of occurrences of the word “dried” is 580. For the same corpus C2, the likelihood of “dried berries” after applying add-one smoothing is 0.04. What is the vocabulary size of C2?**

a. 3585

b. 3795

c. 4955

d. 3995

Answer :-b. 3795
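The answer to Question 7 follows from two formulas: the MLE gives count(dried berries) = 0.3 × 580 = 174, and add-one smoothing gives (174 + 1) / (580 + V) = 0.04, which can be solved for V. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def vocab_size_from_add_one(mle, count_prev, smoothed):
    """Solve (c + 1) / (count_prev + V) = smoothed for V, where c = mle * count_prev."""
    bigram_count = mle * count_prev           # 0.3 * 580 = 174
    return (bigram_count + 1) / smoothed - count_prev

vocab = vocab_size_from_add_one(mle=0.3, count_prev=580, smoothed=0.04)
print(round(vocab))  # 3795
```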

For Question 8 to 10, consider the following corpus C3 of 3 sentences.

there is a big garden

children play in a garden

they play inside beautiful garden

**8. Calculate P(they play in a big garden) assuming a bi-gram language model.**

a. 1/8

b. 1/12

c. 1/24

d. None of the above

Answer :-b. 1/12
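The value 1/12 can be reproduced with an unsmoothed bigram model trained on C3. The sketch assumes each sentence is padded with `<s>` and `</s>` markers and that the end-of-sentence transition is included in the product; exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction

corpus = [
    "there is a big garden",
    "children play in a garden",
    "they play inside beautiful garden",
]

def train_bigram_counts(sentences):
    """Count contexts and bigrams, padding each sentence with <s> and </s>."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams

def sentence_probability(sentence, contexts, bigrams):
    """Product of MLE bigram probabilities count(prev, word) / count(prev)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = Fraction(1)
    for prev, word in zip(tokens, tokens[1:]):
        if contexts[prev] == 0:
            return Fraction(0)                   # unseen context
        prob *= Fraction(bigrams[(prev, word)], contexts[prev])
    return prob

contexts, bigrams = train_bigram_counts(corpus)
prob = sentence_probability("they play in a big garden", contexts, bigrams)
print(prob)  # 1/12
```

The only factors below 1 are P(they | `<s>`) = 1/3, P(in | play) = 1/2, and P(big | a) = 1/2.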

**9. Considering the same model as in Question 8, calculate the perplexity of `<s> they play in a big garden </s>`.**

a. 2.289

b. 1.426

c. 1.574

d. 2.178

Answer :-b. 1.426
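The listed perplexity can be recovered from the sentence probability of Question 8. This sketch assumes the sentence is "they play in a big garden" with probability 1/12, and that N = 7 predictions are counted (the six words plus the end marker); under a different counting convention the number would change.

```python
def perplexity_from_sentence_prob(prob, n_predictions):
    """Perplexity as the inverse probability normalized by the number of predictions."""
    return prob ** (-1.0 / n_predictions)

pp = perplexity_from_sentence_prob(1 / 12, 7)
print(round(pp, 3))  # 1.426
```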

**10. Assume that you are using a bi-gram language model with add one smoothing. Calculate P(they play in beautiful garden).**

a. 4.472 x 10^-6

b. 2.236 x 10^-6

c. 3.135 x 10^-6

d. None of the above

Answer :-b. 2.236 x 10^-6

## NPTEL Natural Language Processing Week 1 Assignment Answers 2023

**1. In a corpus, you found that the word with rank 4th has a frequency of 600. What can be the best guess for the rank of a word with frequency 300?**

- 2
- 4
- 8
- 6

Answer:-8
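Under Zipf's law the product of frequency and rank is roughly constant, so the rank for a given frequency can be estimated from one known (rank, frequency) pair. A minimal sketch, with an illustrative function name:

```python
def zipf_rank(known_rank, known_freq, target_freq):
    """Estimate rank from the Zipf relation frequency * rank ≈ constant."""
    constant = known_rank * known_freq   # 4 * 600 = 2400
    return constant / target_freq

rank = zipf_rank(known_rank=4, known_freq=600, target_freq=300)
print(rank)  # 8.0
```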

**2. In the sentence, “In Kolkata I took my hat off. But I can’t put it back on.”, the total number of word tokens and word types are:**

- 14, 13
- 13, 14
- 15, 14
- 14, 15

Answer:-14, 13
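The token and type counts can be checked with a simple tokenizer. The sketch assumes that punctuation is dropped, that "can't" is kept as one token, and that comparison is case-sensitive; other tokenization choices would give different counts.

```python
import re

sentence = "In Kolkata I took my hat off. But I can't put it back on."

# Runs of letters and apostrophes are tokens; periods are discarded.
tokens = re.findall(r"[A-Za-z']+", sentence)
types = set(tokens)

print(len(tokens), len(types))  # 14 13
```

The only repeated token is "I", which is why there is one fewer type than tokens.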

**3. Let the ranks of two words, w1 and w2, in a corpus be 1600 and 400, respectively. Let m1 and m2 represent the number of meanings of w1 and w2 respectively. The ratio m1 : m2 would tentatively be**

- 1:4
- 4:1
- 1:2
- 2:1

Answer:-1:2
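This answer relies on Zipf's law of meanings, under which the number of meanings of a word is roughly proportional to the square root of its frequency, and therefore to 1/√rank. A small sketch of the ratio under that assumption (the function name is illustrative):

```python
import math

def meaning_ratio(rank_1, rank_2):
    """m1 / m2 when the number of meanings is proportional to 1 / sqrt(rank)."""
    return math.sqrt(rank_2 / rank_1)

ratio = meaning_ratio(rank_1=1600, rank_2=400)
print(ratio)  # 0.5, i.e. m1 : m2 = 1 : 2
```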

**4. What is the valid range of type-token ratio of any text corpus?**

- TTR ∈ (0, 1] (excluding zero)
- TTR ∈ [0, 1]
- TTR ∈ [-1, 1]
- TTR ∈ [0, +∞) (any non-negative number)

Answer:- TTR ∈ (0, 1] (excluding zero)

**5. If the first corpus has TTR1 = 0.025 and the second corpus has TTR2 = 0.25, where TTR1 and TTR2 represent the type/token ratio in the first and second corpus respectively, then**

- First corpus has more tendency to use different words.
- Second corpus has more tendency to use different words.
- Both a and b
- None of these

Answer:- Second corpus has more tendency to use different words.

**6. Which of the following is/are true for the English Language?**

- Lemmatization works only on inflectional morphemes and Stemming works only on derivational morphemes.
- The outputs of lemmatization and stemming for the same word might differ.
- Outputs of lemmatization are always real words
- Outputs of stemming are always real words

Answer:- b. The outputs of lemmatization and stemming for the same word might differ. c. Outputs of lemmatization are always real words.

**7. What is an advantage of the Porter stemmer over a full morphological parser?**

- The stemmer is better justified from a theoretical point of view
- The output of a stemmer is always a valid word
- The stemmer does not require a detailed lexicon to implement
- None of the above

Answer:-The stemmer does not require a detailed lexicon to implement.

**8. Which of the following are instances of stemming? (as per Porter Stemmer)**

- are -> be
- plays -> play
- saw -> s
- university -> univers

Answer:- b. plays -> play, d. university -> univers

**9. What is natural language processing good for?**

- Summarize blocks of text
- Automatically generate keywords
- Identifying the type of entity extracted
- All of the above

Answer:-All of the above

**10. What is the number of unique words in a document where the total number of words = 12000, K = 3.71 and β = 0.69?**

- 2421
- 3367
- 5123
- 1529

Answer:-2421
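The answer follows from Heaps' law, which estimates vocabulary size as V = K · N^β for a text of N tokens. A minimal sketch of the calculation, with an illustrative function name:

```python
def heaps_vocabulary(n_tokens, k, beta):
    """Heaps' law estimate of the number of unique words: V = K * N**beta."""
    return k * n_tokens ** beta

vocab = heaps_vocabulary(n_tokens=12000, k=3.71, beta=0.69)
print(round(vocab))  # 2421
```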