
> The problem had to do not just with data analysis per se, but with what database researchers call “provenance” — broadly, where did data arise, what inferences were drawn from the data, and how relevant are those inferences to the present situation?

Plug: I work at a company (https://www.pachyderm.com) whose product is designed for precisely this reason: to track data provenance across pipelines and through a company's larger data-processing operation.



You all should really play that up more in your messaging - "provenance" is one of the hardest and least-addressed parts of building AI/ML/data science systems that have measurable impact (rather than analysts making plots and speculating). In general, I'm sure a structured, centralized representation of business processes is super valuable.

If you write a blog post, with some examples, describing how critical that is to practical data science efficacy, I bet you'll end up in a bunch of VP inboxes.


It may be time for a refresh, but our CEO has such a blog post from about a year ago: https://medium.com/pachyderm-data/provenance-the-missing-fea...


Nice - I went to the blog page on your site and this wasn't there. Good stuff.



