> The problem had to do not just with data analysis per se, but with what database researchers call “provenance” — broadly, where did data arise, what inferences were drawn from the data, and how relevant are those inferences to the present situation?
Plug: I work at a company (https://www.pachyderm.com) whose product is designed for precisely this reason: to track data provenance across pipelines and through a company's larger data-processing operation.
You all should really play that up more in your messaging - "provenance" is one of the hardest and least-addressed components of building AI/ML/data science systems that actually have measurable impact (rather than analysts making plots and speculating). In general, having a structured, centralized representation of business processes is super valuable, I'm sure.
If you write a blog post, with some examples, describing how critical that is to practical data science efficacy, I bet you'll end up in a bunch of VP inboxes.