Unable to get set intersection to work

Posted by chavanak on Stack Overflow
Published on 2010-03-03T13:05:30Z Indexed on 2010/03/08 11:06 UTC

Sorry for the double post, I will update this question if I can't get things to work :)

I am trying to compare two files. Their contents are listed below:

File 1                            File 2

"d.complex.1"                     "d.complex.1"
  1                                 4
  5                                 5
  48                                47
  65                                21

d.complex.10                      d.complex.10
  46                                6
  21                                46
  109                               121
  192                               192

I am trying to compare the contents of the two files, but not in a trivial way. I will explain what I want with an example. In the file contents above, d.complex.1 in file 1 has "5" in common with d.complex.1 in file 2, while the same d.complex.1 in file 1 has nothing in common with d.complex.10 in file 2. What I want is to print only those d.complex. pairs that have nothing in common with each other. Think of each d.complex. as a heading: I want to compare the numbers below each heading, and if nothing matches, print that d.complex. name from both files. If even one number is present under both headings, the pair should be rejected.
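
As a minimal sketch of the rule described above: two blocks qualify when their sets of numbers share no element, which Python sets can test with isdisjoint. The sketch uses Python 3 syntax rather than the Python 2 of this post, the sample numbers are typed in by hand instead of being read from the files, and the name disjoint_pairs is made up for illustration.

```python
# Sample data typed in by hand from the example above (an assumption:
# real input would be read from file1.txt and file2.txt).
file_1 = {
    "d.complex.1": {1, 5, 48, 65},
    "d.complex.10": {46, 21, 109, 192},
}
file_2 = {
    "d.complex.1": {4, 5, 47, 21},
    "d.complex.10": {6, 46, 121, 192},
}


def disjoint_pairs(blocks_1, blocks_2):
    """Return (name_1, name_2) pairs whose number sets share no element.

    Hypothetical helper, not part of the original code.
    """
    pairs = []
    for name_1, numbers_1 in blocks_1.items():
        for name_2, numbers_2 in blocks_2.items():
            # isdisjoint is True only when the two sets have nothing in common
            if numbers_1.isdisjoint(numbers_2):
                pairs.append((name_1, name_2))
    return pairs


for name_1, name_2 in disjoint_pairs(file_1, file_2):
    print(name_1)
    print(name_2, "(name from file2)")
    print()
```

With these sample numbers the only pair printed is d.complex.1 (file 1) with d.complex.10 (file 2); every other pair shares at least one number (5, 21, or 46 and 192).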

My Code: The method I chose to achieve this was to use sets and then take a difference. The code I wrote is:

import string  # needed for string.strip (Python 2)

# Read file 1 and strip whitespace from every line
first_complex=open( "file1.txt", "r" )
first_complex_lines=first_complex.readlines()
first_complex_lines=map( string.strip, first_complex_lines )
first_complex.close()

# Read file 2 the same way
second_complex=open( "file2.txt", "r" )
second_complex_lines=second_complex.readlines()
second_complex_lines=map( string.strip, second_complex_lines )
second_complex.close()

# Group the lines of file 1 into blocks, one per "d.complex" heading
res_1=[]
for line in first_complex_lines:
    if line.startswith( "d.complex" ):
        res_1.append( [] )
    res_1[-1].append( line )

# Group the lines of file 2 the same way
res_2=[]
for line in second_complex_lines:
    if line.startswith( "d.complex" ):
        res_2.append( [] )
    res_2[-1].append( line )

# Compare every block of file 1 with every block of file 2
for i in res_1:
    for j in res_2:
        print i[0]
        print j[0]
        target_set=set( i )
        target_set_1=set( j )
        # Print each entry of the file 1 block that is missing from the file 2 block
        for s in target_set:
            if s not in target_set_1:
                if s[0] != "d":
                    print s

The above code gives output like this (just an example):

d.complex.1.dssp
d.complex.1.dssp
1
48
65

d.complex.1.dssp
d.complex.10.dssp    
46
21

109
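
One possible reason the loop prints numbers even for blocks that overlap: checking which entries of one set are missing from the other is a set difference, and a difference can be non-empty while the two sets still share elements. The "nothing in common" question is about the intersection instead. A small illustration in Python 3 syntax (not the Python 2 of this post), using the two d.complex.1 blocks from the example; the names a and b are chosen only for this sketch.

```python
a = {1, 5, 48, 65}   # d.complex.1 in file 1
b = {4, 5, 47, 21}   # d.complex.1 in file 2

difference = a - b   # elements of a that are not in b
common = a & b       # elements present in both sets

print(sorted(difference))   # [1, 48, 65]
print(sorted(common))       # [5]
print(a.isdisjoint(b))      # False, because 5 is in both sets
```

The difference contains 1, 48 and 65 even though the blocks share 5, so a non-empty difference alone does not show that two blocks have nothing in common.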

What I would like to have is:

d.complex.1
d.complex.1 (name from file2)

d.complex.1
d.complex.10 (name from file2)

I am sorry for confusing you guys, but this is all that is required.

I am very new to Python, so my approach above might be flawed. I have also never used sets before :(. Can someone give me a hand here?
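
For reference, the grouping step could be sketched so that each heading maps directly to a set of numbers, which makes the later comparison a plain set operation. This is only a sketch in Python 3 syntax under several assumptions: the helper name parse_blocks is made up, every non-blank line under a heading is taken to be an integer, and quotes around a heading (as on the first heading in the listing) are stripped.

```python
def parse_blocks(lines):
    """Group numbers under the most recent "d.complex" heading.

    Hypothetical helper: returns a dict of heading -> set of ints.
    """
    blocks = {}
    current = None
    for raw in lines:
        # Assumption: surrounding whitespace and optional quotes are noise
        line = raw.strip().strip('"')
        if not line:
            continue  # skip blank lines instead of storing them
        if line.startswith("d.complex"):
            current = line
            blocks[current] = set()
        elif current is not None:
            # Assumption: every other line holds one integer
            blocks[current].add(int(line))
    return blocks


# Lines as they might come from readlines() on file 1 of the example
sample = [
    '"d.complex.1"\n', "\n", "  1\n", "  5\n", "  48\n", "  65\n", "\n",
    "d.complex.10\n", "\n", "  46\n", "  21\n", " 109\n", " 192\n",
]

for heading, numbers in parse_blocks(sample).items():
    print(heading, sorted(numbers))
```

Real input would come from iterating over the open file object, for example parse_blocks(open("file1.txt")).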

© Stack Overflow or respective owner
