<div dir="ltr">Thank you. I read the W3C recommendation, as well as the referenced documents. I drafted a comparison here:<div><div> <a href="https://github.com/pmitros/tsvx/blob/master/doc_source/related_formats.md" target="_blank">https://github.com/pmitros/<wbr>tsvx/blob/master/doc_source/<wbr>related_formats.md</a></div><div><br><div>I think the two standards are trying to do somewhat different things, and are actually pretty complementary. tsvx is designed to facilitate compatibility between applications for internal data analysis and BI work. It is a prescriptive standard: it says how files ought to be escaped and formatted. The W3C CSV on the Web group appears to be doing exactly what its name implies -- providing descriptive metadata for public redistribution of datasets on the web, especially for use on the semantic web. It is a descriptive standard designed to work with essentially all tabular data files. A tsvx file could certainly be described with the W3C metadata if the intention were external distribution.</div><div><br></div><div>Just to give the types of use cases I have internally:</div><div><ul><li>I have pipelines where I might have a dozen TSV files generated by scripts working on data from MySQL, Vertica, and spreadsheets, all feeding back to create reports. Before I switched to tsvx, scripts were brittle to fairly modest format changes (e.g. 
adding a column), and carried a lot of unnecessary logic for parsing data types.</li><li>Each time I import something into a tool I didn't create, I need to click through a dialog to tell it what the delimiter is and, in LibreOffice, reformat the column types.</li></ul>Adding W3C metadata files would add overhead for this type of work, rather than reducing it, and would provide benefit only at the stage of the final results.</div><div><br></div><div>Piotr</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 3, 2016 at 10:35 AM, Eike Rathke <span dir="ltr">&lt;<a href="mailto:erack@redhat.com" target="_blank">erack@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Piotr,<br>
<span><br>
On Thursday, 2016-11-03 08:08:23 -0400, Piotr Mitros wrote:<br>
<br>
> I do a fair bit of work where I move data between LibreOffice, MySQL,<br>
> Vertica, Google Docs, Hadoop, Python, and a few other systems. The<br>
> formatting of TSV files is ad-hoc. Each system has little differences in<br>
> how strings are escaped, and similar. In addition, there is no way to<br>
> preserve metadata.<br>
><br>
> I drafted a modest proposed spec for standardizing TSV files by<br>
> standardizing types, and adding metadata, and was hoping to solicit<br>
> feedback on that proposal:<br>
><br>
> <a href="http://www.tsvx.org/" rel="noreferrer" target="_blank">http://www.tsvx.org/</a><br>
<br>
</span>It seems to me you're attempting to reinvent a wheel. I suggest you take<br>
a look at <a href="https://www.w3.org/standards/techs/csv" rel="noreferrer" target="_blank">https://www.w3.org/standards/t<wbr>echs/csv</a> and maybe<br>
<a href="https://www.w3.org/community/csvw/" rel="noreferrer" target="_blank">https://www.w3.org/community/c<wbr>svw/</a><br>
<span class="m_-3781693829328573136HOEnZb"><font color="#888888"><br>
Eike<br>
<br>
--<br>
LibreOffice Calc developer. Number formatter stricken i18n transpositionizer.<br>
GPG key "ID" 0x65632D3A - 2265 D7F3 A7B0 95CC 3918 630B 6A6C D5B7 6563 2D3A<br>
Better use 64-bit 0x6A6CD5B765632D3A here is why: <a href="https://evil32.com/" rel="noreferrer" target="_blank">https://evil32.com/</a><br>
Care about Free Software, support the FSFE <a href="https://fsfe.org/support/?erack" rel="noreferrer" target="_blank">https://fsfe.org/support/?erac<wbr>k</a><br>
</font></span></blockquote></div><br></div></div>