TSV, better than CSV

Lately I’ve been getting a few questions on why one should use .tsv files since CSVs are “standard” and virtually everyone knows how they work — so I wanted to clarify this once and for all.

CSVs ARE A PAIN IN THE BACK. USE TSVs SINCE THEY LIKELY WON’T SCREW AROUND WITH YOUR SOURCE DATA.

To put it differently:

TSV is an alternative to the common comma-separated values (CSV) format, which often causes difficulties because of the need to escape commas – literal commas are very common in text data, but literal tab stops are infrequent in running text. The IANA standard for TSV achieves simplicity by simply disallowing tabs within fields.
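To make the escaping problem concrete, here is a minimal sketch using Python's standard csv module. The sample row and the serialize helper are made up for illustration. The same row has to be quoted when written as CSV, but not when written as TSV:

```python
import csv
import io

# Hypothetical sample row: the first field contains a literal comma.
row = ["Smith, John", "42", "Rome"]

def serialize(row, dialect):
    """Write a single row with the given csv dialect and return the text."""
    buf = io.StringIO()
    csv.writer(buf, dialect=dialect).writerow(row)
    return buf.getvalue()

as_csv = serialize(row, "excel")      # comma-delimited
as_tsv = serialize(row, "excel-tab")  # tab-delimited

print(repr(as_csv))  # the field with the comma gets wrapped in quotes
print(repr(as_tsv))  # nothing is quoted, since no field contains a tab
```

Any consumer that naively splits the CSV line on commas would see four fields instead of three, while splitting the TSV line on tabs gives back the original row.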

Do yourself a favor: stop using CSVs and transition over to TSVs, since they’re far less error-prone. Say goodbye to sleepless nights trying to figure out why funny data has been uploaded to the DB.

If you’re worried about converting old CSV documents, Python has the answer:

csv2tsv.sh
#!/usr/bin/env python
# Read CSV from stdin and write it to stdout as tab-separated values.
import csv, sys
csv.writer(sys.stdout, dialect='excel-tab').writerows(csv.reader(sys.stdin))

It’s then a matter of pipes: make the script executable (chmod +x csv2tsv.sh), run cat file.csv | ./csv2tsv.sh > file.tsv and you’re done.
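One caveat: the excel-tab dialect used above still wraps a field in quotes if it happens to contain a tab or a newline, which the stricter IANA flavour of TSV doesn't allow. If you need that stricter output, something along these lines might do. It's only a sketch, and it assumes that replacing stray tabs and line breaks with a space is acceptable for your data. The clean and csv_to_strict_tsv names and the sample input are made up for this example:

```python
import csv
import io

def clean(field):
    """Replace characters that a strict TSV field cannot contain with spaces."""
    return field.replace("\t", " ").replace("\r", " ").replace("\n", " ")

def csv_to_strict_tsv(src, dst):
    """Copy CSV rows from src to dst as tab-separated lines, with no quoting."""
    for row in csv.reader(src):
        dst.write("\t".join(clean(f) for f in row) + "\n")

# Hypothetical input: one field has a comma, another has an embedded tab.
sample = io.StringIO('name,note\r\n"Smith, John","likes\ttabs"\r\n')
out = io.StringIO()
csv_to_strict_tsv(sample, out)
print(repr(out.getvalue()))
```

To use it as a filter like the script above, import sys and call csv_to_strict_tsv(sys.stdin, sys.stdout) instead of the in-memory sample.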


