When you move files between systems, for instance using CSV files to transfer data from one database to another, a program will sometimes refuse to process a file because of its line endings. This is especially common when a file is produced on one platform, say Mac, and consumed on another, say Windows. What can you do about it?
Even a file saved as CSV from Mac Excel 2008 will not necessarily be readable programmatically if the consuming program expects a particular type of line ending.
How Can We Avoid Line Terminator Problems and Troubles?
Let’s review how lines are terminated by default on Windows, Mac and Unix.
- Windows-style line endings are CRLF (\r\n or hex 0D0A)
- Unix-style line endings are LF (\n or hex 0A)
- Mac-style line endings were CR (\r or hex 0D) on classic Mac OS, but have been LF (\n or hex 0A) since Mac OS X
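To see these bytes for yourself, here is a minimal sketch that prints a one-character line with each terminator as hex. The helper name hex_bytes is made up for this illustration; it only wraps the standard printf, od and tr commands.

```shell
# hex_bytes is a hypothetical helper, not a standard command.
# It expands backslash escapes in its argument and prints the resulting
# bytes as hex, with od's column spacing stripped out.
hex_bytes() {
  printf '%b' "$1" | od -An -tx1 | tr -d '[:space:]'
  echo
}

hex_bytes 'a\r\n'   # Windows CRLF: prints 610d0a
hex_bytes 'a\n'     # Unix and modern Mac LF: prints 610a
hex_bytes 'a\r'     # classic Mac CR: prints 610d
```

The leading 61 is the letter a; everything after it is the line terminator.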
There are a number of ready-made command-line programs, such as dos2mac, that convert line endings. You can also use the tr or perl commands: tr is available by default on Macs and on almost any Unix, and Perl is pretty ubiquitous as well.
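If you want to find out whether a file has the expected line terminators, you can use the
_file_ command on *nix or Mac. For a plain text file with Windows line endings it typically reports something like "ASCII text, with CRLF line terminators"; for an LF-only file it does not mention line terminators at all.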
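A small sketch of checking files this way follows. The sample files and the demo_dir variable are made up for this illustration, and the exact wording of the report can vary between versions of file, so treat the comments as typical rather than guaranteed.

```shell
# Scratch directory with one sample file per convention (names are illustrative).
demo_dir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$demo_dir/crlf.txt"
printf 'one\rtwo\r' > "$demo_dir/cr.txt"
printf 'one\ntwo\n' > "$demo_dir/lf.txt"

# Typically mentions "with CRLF line terminators".
file "$demo_dir/crlf.txt"

# Typically mentions "with CR line terminators".
file "$demo_dir/cr.txt"

# Typically plain "ASCII text", with no mention of line terminators.
file "$demo_dir/lf.txt"
```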
You can also use the cat command with its -e switch to show line endings. Run man cat for more info; cat can also print line numbers, for instance. A file with CRLF endings shows ^M$ at the end of each line in cat's output, a Unix LF-only file shows just $, and a file with old Mac CR-only endings shows ^M with no $. What you need will depend upon the import program.
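Here is a minimal sketch of cat -e on a CRLF file and a CR-only file. The sample files and the cat_dir variable are made up for this illustration.

```shell
# Scratch directory and sample files (names are illustrative).
cat_dir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$cat_dir/crlf.txt"
printf 'one\rtwo\r' > "$cat_dir/cr.txt"

# CRLF file: each line ends in ^M (the CR) followed by $ (the LF).
cat -e "$cat_dir/crlf.txt"
# one^M$
# two^M$

# CR-only file: there is no LF, so cat prints everything as one line
# with ^M where each line should end and no $ at all.
cat -e "$cat_dir/cr.txt"
echo
# one^Mtwo^M
```

The extra echo after the second command only adds a newline so the shell prompt does not end up on the same line as the output.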
Besides line endings, there is also the text encoding of the file to watch out for. For instance, is the file saved as Mac Roman, as Unicode, or in some other encoding? In the end, take care to confirm that the file you have output is what the program needs for input. Enjoy!