Removing words from a file
        Posted  
        
            by 
                user1765792
            
        on Stack Overflow
        
        See other posts from Stack Overflow
        
            or by user1765792
        
        
        
        Published on 2012-10-22T16:46:14Z
        Indexed on 
            2012/10/22
            17:01 UTC
        
        
        Read the original article
        Hit count: 184
        
I'm trying to take a regular text file and remove words identified in a separate file (stopwords) containing the words to be removed separated by carriage returns ("\n").
Right now I'm converting both files into lists so that the elements of each list can be compared. I got this function to work, but it doesn't remove all of the words I have specified in the stopwords file. Any help is greatly appreciated.
def elimstops(file_str): #takes as input a string for the stopwords file location
  stop_f = open(file_str, 'r')
  stopw = stop_f.read()
  stopw = stopw.split('\n')
  text_file = open('sample.txt') #Opens the file whose stop words will be eliminated
  prime = text_file.read()
  prime = prime.split(' ') #Splits the string into a list separated by a space
  tot_str = "" #total string
  i = 0
  while i < (len(stopw)):
    if stopw[i] in prime:
      prime.remove(stopw[i]) #removes the stopword from the text
    else:
      pass
    i += 1
  # Creates a new string from the compilation of list elements 
  # with the stop words removed
  for v in prime:
    tot_str = tot_str + str(v) + " " 
  return tot_str
        © Stack Overflow or respective owner