Improve Efficiency in Array comparison in Ruby

Posted by user2985025 on Stack Overflow
Published on 2013-11-12T21:05:31Z Indexed on 2013/11/12 21:54 UTC

Hi, I am working with Ruby/Cucumber and have a requirement to develop a comparison module to compare two files.

Below are the requirements:

  1. The project is a migration project: data from one application is moved to another.

  2. We need to compare the data from the existing application against the data in the new one.

Solution:

I have developed a comparison engine in Ruby for the above requirement.

a) Get the data, de-duplicated and sorted, from both databases.

b) Put the data in a text file with "||" as the delimiter.

c) Use the key columns (identified by column number) that together identify a unique record in the database to compare the two files.
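To make steps b) and c) concrete, here is a minimal sketch of splitting one exported line into its key columns and its remaining columns. The names `DELIMITER`, `KEY_COUNT` and `split_record` are made up for illustration, and it assumes the key columns are the leading columns of each line, which may not match the real export.

```ruby
# Hypothetical layout: fields separated by "||", with the first KEY_COUNT
# columns forming the unique key (an assumption for illustration only).
DELIMITER = "||"
KEY_COUNT = 5

# Split one exported line into [key_columns, remaining_columns].
def split_record(line, key_count = KEY_COUNT)
  fields = line.chomp.split(DELIMITER, -1) # -1 keeps trailing empty fields
  [fields.first(key_count), fields.drop(key_count)]
end

key, values = split_record("1||2||3||4||5||6\n")
```

Here `key` holds the five key fields and `values` holds whatever is left to compare. Passing `-1` to `split` matters if trailing columns can be empty, since Ruby otherwise drops them.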

For example, File1 has 1,2,3,4,5,6 and File2 has 1,2,3,4,5,7, and columns 1,2,3,4,5 are the key columns. I match the records on these key columns and then compare the remaining values, 6 and 7, which results in a fail.
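The example above could be sketched roughly like this. This is not the actual engine, only one possible way to do a key-based comparison: each file is indexed in a Hash keyed by its key columns, so every record is found with a single lookup instead of a scan of the other file. The method names `index_by_key` and `compare_records` are hypothetical, and the sketch assumes the key columns come first in each line.

```ruby
# Build a Hash from key columns to the remaining (non-key) columns.
def index_by_key(lines, key_count)
  lines.each_with_object({}) do |line, index|
    fields = line.chomp.split("||", -1)
    index[fields.first(key_count)] = fields.drop(key_count)
  end
end

# Classify every key as matched, mismatched, or missing on one side.
def compare_records(old_lines, new_lines, key_count)
  old_index = index_by_key(old_lines, key_count)
  new_index = index_by_key(new_lines, key_count)
  result = { matched: [], mismatched: [], missing_in_new: [], missing_in_old: [] }

  old_index.each do |key, old_values|
    if !new_index.key?(key)
      result[:missing_in_new] << key
    elsif new_index[key] == old_values
      result[:matched] << key
    else
      result[:mismatched] << key
    end
  end

  # Keys that exist only in the new file.
  new_index.each_key do |key|
    result[:missing_in_old] << key unless old_index.key?(key)
  end
  result
end

result = compare_records(["1||2||3||4||5||6"], ["1||2||3||4||5||7"], 5)
```

With the sample data, the key `1,2,3,4,5` ends up under `:mismatched` because 6 and 7 differ. Since each record costs one Hash lookup, the amount of work in this sketch does not depend on how many records mismatch.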

Issue:

The major issue we are facing is that when more than 70% of the records mismatch, for 100,000 records or more, the comparison takes a long time. When fewer than 40% mismatch, the comparison time is acceptable.

Diff and Diff-LCS will not work in this case because we need the key columns to arrive at an accurate data comparison between the two applications.

Is there any other method to efficiently reduce the comparison time when the mismatches are more than 70% for 100,000 records or more?

Thanks

© Stack Overflow or respective owner
