A file concordance tracks the unique words in a file and their frequencies. Write a program that displays a concordance for a file. The program should output the unique words and their frequencies in alphabetical order. Variations are to track sequences of two words and their frequencies, or n words and their frequencies. Below is an example file, example.txt:

I AM SAM I AM SAM SAM I AM

Answer:

Here is the Python program:

filename = input('Enter the file name: ')  # prompt for the file name
file = open(filename, "r")  # open the file in read-only mode
dict = {}  # maps each word to its frequency (note: this name shadows the built-in dict)
for word in file.read().split():  # split the contents on whitespace and visit each word
    if word not in dict:  # first occurrence of this word
        dict[word] = 1  # start its count at 1
    else:  # the word has been seen before
        dict[word] += 1  # add 1 to its count
file.close()  # close the file
for key in sorted(dict):  # visit the words in sorted (character code) order
    print("{0} {1}".format(key, dict[key]))  # print each unique word and its frequency

   

Explanation:

I will explain the program with an example:

Let's say the file contains the following:

I AM SAM I AM SAM SAM I AM

The file is read with the read() method and split on whitespace into a list of words with the split() method, so the contents become:

['I', 'AM', 'SAM', 'I', 'AM', 'SAM', 'SAM', 'I', 'AM']      

for word in file.read().split():  

The for loop above iterates over the list of words. Inside the loop, the statement if word not in dict: checks whether the current word is already a key in dict. The first word is 'I'. The dictionary is empty at this point, so 'I' is not in it, and dict[word] = 1 runs, which adds 'I' to dict with a count of 1. The same happens the first time 'AM' and 'SAM' are seen, so each of the three words starts with a count of 1.

When 'I' appears again, the condition if word not in dict: evaluates to false because 'I' is already in dict. The else branch then runs dict[word] += 1, which adds 1 to the count of that word, so the frequency of 'I' becomes 2. Every later occurrence of 'I' increments the count again. Since 'I' appears three times, its final frequency is 3.
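As a side note, the same counting logic can be written more compactly with collections.Counter from the standard library. This is only an alternative sketch, not the program above; the helper name count_words is made up for illustration, and it works on a string instead of a file so it can be run directly.

```python
from collections import Counter


def count_words(text):
    """Return a Counter mapping each whitespace-separated word to its frequency.

    count_words is a hypothetical helper name used only for this illustration.
    """
    # Counter does the "set to 1 on first sight, otherwise add 1" bookkeeping.
    return Counter(text.split())


counts = count_words("I AM SAM I AM SAM SAM I AM")
for word in sorted(counts):
    print(word, counts[word])
```

For the example contents this prints AM 3, I 3 and SAM 3, one per line, the same result the explicit if/else loop produces.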

After all the words have been counted and the loop ends, the second loop, for key in sorted(dict):, uses the built-in sorted() function to go through the keys of dict in sorted order and prints each key along with its frequency. For the all-uppercase words in this example, that order is alphabetical. So the output of the entire program is:

AM 3
I 3
SAM 3
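One thing to keep in mind about the ordering: sorted() compares strings by character code, so every uppercase letter sorts before every lowercase letter. That makes no difference for the example file, but for mixed-case text a case-insensitive key may be closer to what "alphabetical" means. The word list below is made up for illustration.

```python
# Made-up sample words mixing upper and lower case.
words = ["sam", "I", "am", "SAM", "AM"]

# Default ordering: by character code, so uppercase words come first.
by_code_point = sorted(words)

# Case-insensitive ordering: compare lowercase versions of the words.
case_insensitive = sorted(words, key=str.lower)

print(by_code_point)     # ['AM', 'I', 'SAM', 'am', 'sam']
print(case_insensitive)  # ['am', 'AM', 'I', 'sam', 'SAM']
```

Note that this only changes the order of the output. The words 'am' and 'AM' are still counted as different words unless they are also converted to one case before counting.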

A screenshot of the program along with its output is attached.
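The question also mentions the variation of tracking sequences of two words, or n words. The program above does not cover that, so here is a minimal sketch of one way it might be done. The function name ngram_concordance and the choice to join each sequence with a single space are my own assumptions, and it takes a string instead of a file name to keep the example short.

```python
from collections import Counter


def ngram_concordance(text, n=2):
    """Count every run of n consecutive words in text.

    ngram_concordance is a hypothetical name; each sequence is stored
    as one string with its words joined by single spaces.
    """
    words = text.split()
    # Slide a window of n words across the list, one position at a time.
    # If the text has fewer than n words, the range is empty.
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(grams)


pairs = ngram_concordance("I AM SAM I AM SAM SAM I AM", n=2)
for gram in sorted(pairs):
    print(gram, pairs[gram])
```

For the example contents with n=2 this prints AM SAM 2, I AM 3, SAM I 2 and SAM SAM 1, one per line. Calling it with n=1 gives the same counts as the single-word concordance.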
