How to use bzdiff to find difference between 2 bzipped files with diff -I option?

Posted by englebip on Server Fault See other posts from Server Fault or by englebip
Published on 2011-03-11T11:53:15Z Indexed on 2011/03/12 0:12 UTC
Read the original article Hit count: 570

Filed under:
|
|

I'm trying to do a diff on MySQL dumps (created with mysqldump and piped to bzip2), to see if there are changes between consecutive dumps. The followings are the tails of 2 dumps:

tmp1:

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2011-03-11  1:06:50

tmp2:

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2011-03-11  0:40:11

When I bzdiff their bzipped version:

$ bzdiff tmp?.bz2 
10c10
< -- Dump completed on 2011-03-11  1:06:50
---
> -- Dump completed on 2011-03-11  0:40:11

According to the manual of bzdiff, any option passed on to bzdiff is passed on to diff. I therefore looked at the -I option that allows to define a regexp; lines matching it are ignored in the diff. When I then try:

$ bzdiff -I'Dump' tmp1.bz2 tmp2.bz2

I get an empty diff. I would like to match as much as possible of the "Dump completed" line, though, but when I then try:

$ bzdiff -I'Dump completed' tmp1.bz2 tmp2.bz2
diff: extra operand `/tmp/bzdiff.miCJEvX9E8'
diff: Try `diff --help' for more information.

Same thing happens for some variations:

$ bzdiff '-IDump completed' tmp1.bz2 tmp2.bz2
$ bzdiff '-I"Dump completed"' tmp1.bz2 tmp2.bz2
$ bzdiff -'"IDump completed"' tmp1.bz2 tmp2.bz2

If I diff the un-bzipped files there is no problem:

$diff -I'^[-][-] Dump completed on' tmp1 tmp2

gives also an empty diff.

bzdiff is a shell script usually placed in /bin/bzdiff. Essentially, it parses the command options and passes them on to diff as follows:

OPTIONS=
FILES=
for ARG
do
    case "$ARG" in
    -*) OPTIONS="$OPTIONS $ARG";;
     *) if test -f "$ARG"; then
            FILES="$FILES $ARG"
        else
            echo "${prog}: $ARG not found or not a regular file"
            exit 1
        fi ;;
    esac
done
[...]
bzip2 -cdfq "$1" | $comp $OPTIONS - "$tmp"

I think the problem stems from escaping the spaces in the passing of $OPTIONS to diff, but I couldn't figure out how to get it interpreted correctly.

Any ideas?

EDIT @DerfK: Good point with the ., I had forgotten about them... I tried the suggestion with the multiple level of quotes, but that is still not recognized:

$ bzdiff "-I'\"Dump.completed.on\"'" tmp1.bz2 tmp2.bz2
diff: extra operand `/tmp/bzdiff.Di7RtihGGL'

© Server Fault or respective owner

Related posts about linux

Related posts about diff