Unable to get set intersection to work
Posted by chavanak on Stack Overflow
Published on 2010-03-03T13:05:30Z
Indexed on 2010/03/08 11:06 UTC

Sorry for the double post, I will update this question if I can't get things to work :)
I am trying to compare two files. Their contents are listed below:
File 1                            File 2
d.complex.1                       d.complex.1
  1                                 4
  5                                 5
  48                                47
  65                                21
d.complex.10                      d.complex.10
  46                                6
  21                                46
  109                               121
  192                               192
I am trying to compare the contents of the two files, but not in a trivial way. Let me explain with an example. In the contents listed above, d.complex.1 in file_1 shares the number "5" with d.complex.1 in file_2, whereas the same d.complex.1 in file_1 has nothing in common with d.complex.10 in file_2. What I want is to print only those d.complex. entries that have nothing in common with each other. Think of each d.complex. as a heading if you want: I compare the numbers below a d.complex. heading in one file with the numbers below a d.complex. heading in the other file, and if nothing matches, I want that pair of d.complex. names (one from each file) to be printed. If even one number is present under both headings, I want the pair to be rejected.
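The rule above amounts to asking whether two sets of numbers are disjoint. Here is a minimal sketch of just that test, with the numbers from the listing above hard-coded as sets (the variable names are made up for this illustration and do not appear in my code):

```python
# Numbers under each heading, copied from the listing above.
# The variable names are hypothetical, chosen only for this illustration.
file1_complex_1 = {1, 5, 48, 65}
file2_complex_1 = {4, 5, 47, 21}
file2_complex_10 = {6, 46, 121, 192}

# isdisjoint() is True only when the two sets share no element at all.
same_heading_disjoint = file1_complex_1.isdisjoint(file2_complex_1)
other_heading_disjoint = file1_complex_1.isdisjoint(file2_complex_10)

print(same_heading_disjoint)   # False, because both sets contain 5
print(other_heading_disjoint)  # True, so this pair would be printed
```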
My code: the method I chose was to use sets and then take a difference. The code I wrote was:
import string

# Read both files and strip surrounding whitespace from every line
first_complex=open( "file1.txt", "r" )
first_complex_lines=first_complex.readlines()
first_complex_lines=map( string.strip, first_complex_lines )
first_complex.close()
second_complex=open( "file2.txt", "r" )
second_complex_lines=second_complex.readlines()
second_complex_lines=map( string.strip, second_complex_lines )
second_complex.close()
list_1=[]
list_2=[]
res_1=[]
for line in first_complex_lines:
    if line.startswith( "d.complex" ):
        res_1.append( [] )
    res_1[-1].append( line )
res_2=[]
for line in second_complex_lines:
    if line.startswith( "d.complex" ):
        res_2.append( [] )
    res_2[-1].append( line )
h=len( res_1 )
k=len( res_2 )
for i in res_1:
   for j in res_2:
       print i[0]
       print j[0]
       target_set=set ( i )
       target_set_1=set( j )
       for s in target_set:
           if s not in target_set_1:
               if s[0] != "d":
                   print s
The above code is giving an output like this (just an example):
d.complex.1.dssp
d.complex.1.dssp
1
48
65
d.complex.1.dssp
d.complex.10.dssp    
46
21
109
What I would like to have is:
d.complex.1
d.complex.1 (name from file2)
d.complex.1
d.complex.10 (name from file2)
I am sorry for confusing you guys, but this is all that is required.
I am very new to Python, so my approach above might be flawed. Also, I have never used sets before :(. Can someone give me a hand here?
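For reference, one possible way to do this is sketched below. It assumes each file is a sequence of heading lines starting with d.complex, each followed by its numbers, as in the listing above. The helper names read_blocks and disjoint_pairs are hypothetical, and the sample lines are inlined as lists so the sketch runs without file1.txt and file2.txt.

```python
def read_blocks(lines):
    """Group lines into (heading, set_of_numbers) pairs, in file order."""
    blocks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("d.complex"):
            blocks.append((line, set()))  # a heading opens a new block
        elif blocks:
            blocks[-1][1].add(line)  # a number belongs to the latest heading
    return blocks


def disjoint_pairs(blocks_1, blocks_2):
    """Return (heading_1, heading_2) for every pair sharing no number."""
    return [(name_1, name_2)
            for name_1, numbers_1 in blocks_1
            for name_2, numbers_2 in blocks_2
            if numbers_1.isdisjoint(numbers_2)]


# Sample data copied from the listing above; with real files, pass
# open("file1.txt") and open("file2.txt") to read_blocks instead.
file_1 = ["d.complex.1", "1", "5", "48", "65",
          "d.complex.10", "46", "21", "109", "192"]
file_2 = ["d.complex.1", "4", "5", "47", "21",
          "d.complex.10", "6", "46", "121", "192"]

pairs = disjoint_pairs(read_blocks(file_1), read_blocks(file_2))
for name_1, name_2 in pairs:
    print(name_1)
    print(name_2 + " (name from file2)")
```

With the sample data, the only pair that survives is d.complex.1 from file 1 with d.complex.10 from file 2, because every other combination shares at least one number (for instance, d.complex.1 in both files contains 5). The numbers are compared as stripped strings, which is enough for an equality test; converting them with int() would also work.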
© Stack Overflow or respective owner