On Jul 7, 2021, at 6:29 AM, Robert Story <rstory@isi.edu> wrote:
On Wed 2021-07-07 08:33:06-0400 Ken wrote:
I'm thinking of this more at a higher layer: balancing user privacy against the desire to collect data for analysis. What level of control should the user have over the decision to publish or not publish? In option #3, publication of the data is almost an afterthought (active opt-in). It requires the user to do something extra, which they might skip or forget once they see their results. Option #2 means the user has to actively do something to NOT publish (active opt-out).
Why not choose the best (worst?) of both worlds? Don't have a default, and require that either --opt-in or --opt-out be specified on the command line or in a config file. Or default to opt-in, but prompt the user for confirmation if no config file or argument specifies a default.
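For illustration only, here is a minimal sketch of the first alternative (no default, explicit choice required). Only the --opt-in and --opt-out flag names come from the proposal above; the program name "measure", the function resolve_publish_choice, the JSON config format, and the "publish" key are all hypothetical. The prompt-for-confirmation variant is not shown.

```python
import argparse
import json
from pathlib import Path
from typing import Optional, Sequence


def resolve_publish_choice(argv: Sequence[str],
                           config_path: Optional[Path] = None) -> bool:
    """Return True to publish results, False to keep them local.

    There is deliberately no default: if neither the command line nor the
    config file states a choice, the program exits with an error.
    """
    parser = argparse.ArgumentParser(prog="measure")  # hypothetical tool name
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--opt-in", dest="publish", action="store_const",
                       const=True, default=None,
                       help="publish results to the data repository")
    group.add_argument("--opt-out", dest="publish", action="store_const",
                       const=False, default=None,
                       help="do not publish results")
    args = parser.parse_args(argv)

    # An explicit command-line flag takes precedence over the config file.
    if args.publish is not None:
        return args.publish

    # Hypothetical config format: a JSON object with a boolean "publish" key.
    if config_path is not None and config_path.is_file():
        config = json.loads(config_path.read_text())
        value = config.get("publish")
        if isinstance(value, bool):
            return value

    # No choice anywhere: refuse to run rather than assume one.
    parser.error('no publication choice made: pass --opt-in or --opt-out, '
                 'or set "publish" in the config file')
```

Giving the command line precedence over the config file is a design choice of this sketch, not something the thread has settled.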
I'm inclined to agree with Robert. There should not be a default. Each user should be required to choose whether or not to share the results, at least the first time it is run.

I also feel strongly that the data repository should require something like an API key to submit results. This is intended to maintain high data quality. I expect that how API keys are generated and distributed would be out of scope for our document.

DW