Version control and data provenance in charts, slides, and marketing materials that derive from code ouput

Posted by EMS on Programmers See other posts from Programmers or by EMS
Published on 2013-11-12T16:44:20Z Indexed on 2013/11/12 22:03 UTC
Read the original article Hit count: 336

I develop as part of a small team that mostly does research and statistics stuff. But from the output of our code, other teams often create promotional materials, slides, presentations, etc.

We run into a big problem because the marketing team (non-programmers) tend to use Excel, Adobe products, or other tools to carry out their work, and just want easy-to-use data formats from us. This leads to data provenance problems. We see email chains with attachments from 6 months ago and someone is saying "Hey, who generated this data. Can you generate more of it with the recent 6 months of results added in?"

I want to help the other teams effectively use version control (my team uses it reasonably well for the code, but every other team classically comes up with many excuses to avoid it).

For version controlling a software project where the participants are coders, I have some reasonable understanding of best practices and what to do.

But for getting a team of marketing professionals to version control marketing materials and associate metadata about the software used to generate the data for the charts, I'm a bit at a loss.

Some of the goals I'd like to achieve:

  1. Data that supported a material should never be associated with a person. As in, it should never be the case that someone says "Hey Person XYZ, I see you sent me this data as an attachment 6 months ago, can you update it for me?" Rather, data should be associated with the code and code-version of any code that was used to get it, and perhaps a team of many people who may maintain that code. Then references for data updates are about executing a specific piece of code, with a known version number.

  2. I'd like this to be a process that works easily with the tech that the marketing team already uses (e.g. Excel files, Adobe file, whatever). I don't want to burden them with needing to learn a bunch of new stuff just to use version control. They are capable folks, so learning something is fine. Ideally they could use our existing version control framework, but there are some issues around that. I think knowing some general best practices will be enough though, and I can handle patching that into the way our stuff works now.

Are there any goals I am failing to think about? What are the time-tested ways to do something like this?

© Programmers or respective owner

Related posts about version-control

Related posts about data