Unconvert Text File from Binary Format

Posted by Hammer Bro. on Super User
Published on 2010-12-28T00:06:50Z Indexed on 2010/12/28 0:57 UTC

I've got a rather large CSV file (~700MB) which I know consists of lines of 27-character alphanumeric hashes; no commas or anything fancy. Somehow, during its migration from Windows to Linux (via WinSCP and then a few regular SCPs), it was converted into some kind of binary format I am unfamiliar with.

If I open the file in vi, everything appears fine, and it says [converted] at the bottom, although I know it's not a line-endings issue (dos2unix doesn't help). If I 'head' the file, it looks correct except for a "ÿþ" at the beginning of the first line. If I open the file in nano, however, I see the "ÿþ" at the start and then "^@" before every character (even newlines and EOF).
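For what it's worth, "ÿþ" is how the two bytes 0xFF 0xFE look when displayed as Latin-1, and that pair is the UTF-16 little-endian byte order mark (BOM); a NUL byte ("^@") paired with every ASCII character fits the same picture. Below is a minimal Python sketch for checking the first bytes of a file. The names `sniff_bom` and `sniff_file` are made up for illustration, and the sketch only recognises files that actually start with a BOM.

```python
import codecs

# Known byte order marks. The UTF-32 marks come first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(head: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None

def sniff_file(path):
    """Read only the first four bytes of the file and inspect them."""
    with open(path, "rb") as f:
        return sniff_bom(f.read(4))
```

Calling `sniff_file("file.csv")` on the file described here would be expected to report `utf-16-le` if the guess about the encoding is right.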

If I try to re-save or copy the file (say, via head file.csv > short.txt), this special encoding is preserved. I copied the first ten lines out of vi (which displays it properly) into my Windows clipboard via my SSH client, then pasted them into a new text file, test.txt. This file is visually identical when opened in vi (and looks the same through 'head', minus the "ÿþ"), although it's roughly half the file size. Additionally,

file test.txt
test.txt: ASCII text
file short.txt
short.txt:

I have no idea what format this once-text file got converted to (it's notoriously hard to search the internet for symbols), but surely there must be some way to convert it back. Any ideas?
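If the file really is UTF-16 with a BOM (an assumption based on the "ÿþ" prefix and on the pasted copy being about half the size), converting it back is a matter of decoding it as UTF-16 and writing it out as UTF-8, which is byte-identical to ASCII for alphanumeric content. Here is a rough Python sketch that streams the file in chunks so the ~700MB never has to sit in memory at once. The function name `utf16_to_utf8` and the output filename are hypothetical.

```python
def utf16_to_utf8(src, dst, chunk_chars=1 << 20):
    """Stream-convert a UTF-16 file (with BOM) to UTF-8."""
    # The "utf-16" codec uses the BOM to pick the byte order and drops it.
    # newline="" on both sides leaves the existing line endings untouched.
    with open(src, "r", encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        while True:
            chunk = fin.read(chunk_chars)  # up to chunk_chars characters
            if not chunk:
                break
            fout.write(chunk)

# Example call (filenames are placeholders):
# utf16_to_utf8("file.csv", "file-utf8.csv")
```

On the command line, `iconv -f UTF-16 -t UTF-8 file.csv > file-utf8.csv` should do the same job. If the original had Windows line endings, those survive either conversion, so dos2unix may still be needed afterwards.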

© Super User or respective owner
