etcher: Opt in for error reports and usage statistics
NOTICE:
As a workaround, just remove etcher completely and use usbimager, which does the same as etcher (and a little more, as it can also read the content of a flash disk/card and save it as a compressed image file). But it comes without all the unnecessary tracking/ads/etc. which are included in etcher and turned on by default.
- Etcher version: 1.5.19
- Operating system and architecture: amd64, debian
- Image flashed: none
- Do you see any meaningful error information in the DevTools? no, why?
I just installed balena etcher via the debian repository. After installing I started with
balena-etcher-electron
got a
ready-to-show: 2810.265ms
and the UI was presented. When I clicked the settings wheel at the top right, I saw a ‘service’ called
Anonymously report errors and usage statistics to balena.io
which is/was activated by default.
I didn’t dig deeper, but I’m almost sure there was already a data leak before I was able to deactivate this ‘feature’. I’m also not a lawyer, but with the new European laws this is for sure not tolerable anymore.
Please make this option opt-in. Thanks
About this issue
- State: closed
- Created 5 years ago
- Reactions: 16
- Comments: 49 (15 by maintainers)
Nowadays I just don’t use software which comes bundled with adware and spyware (namely balena etcher).
A 250kb alternative (etcher uses more than 400MB, why?) is called usbimager. It does things right, respects your privacy and is true foss ❤️
It’s 100% GDPR compatible and comes without opt-in (or opt-out) because it doesn’t suck your data 😮
Hi everyone. I wanted to give an update to where we are with this issue. There are multiple issues raised so I will address them individually.
We should separate the discussion between what’s legal and required by GDPR and requests that go beyond what the law requires. Specifically, GDPR requires opt-in consent for personally identifiable data, not for anonymous data collection. It is not our intention, nor is it useful for us, to collect personally identifiable information (see the Purpose section below). So the first question is “Are we collecting personally identifiable information by mistake?” and the second question is “Is making the usage statistics opt-in the best decision for the project?”
Personal data collection
We conducted an extensive audit of all the data we collect from the Etcher application to make sure no personally identifiable data is collected by mistake. Collecting data by mistake might sound strange, but it can easily happen in a desktop application. For example, the mixpanel library will include information about the current system user by default when run in an Electron app. Whenever we became aware of such issues in the past we promptly fixed them.
The results of our investigation showed that Etcher will make connections to the following systems:
The large number of unintended connections happened as a side-effect of loading content from our balena.io website that includes these libraries automatically. Action item: We are removing all instances of those connections from Etcher
Furthermore, we audited all the data we collect to make sure none can be characterised as personally identifiable. To do this properly we are consulting our EU-based lawyers, who can provide an expert opinion on what the GDPR and EU law in general require. It is important to refrain from making legal claims unless someone is intimately familiar with the legislation. Unfortunately, there have been a number of legal claims in this and other threads with questionable validity.
To make this extremely clear, we are taking the law seriously and are investing time, money, and effort, to consult experts in the field to guide us on this matter. We do this because it is the right thing to do. We’ve done it before (for balenaCloud) and we’ll happily do it for all the products we offer.
Even though our conversation with our legal team is still ongoing we have identified a couple of cases where PII is sent to our data collection system. Sentry, our error collection tool, will log a stacktrace when Etcher hits a critical error that can potentially include a path in the system which includes the username of the user. The IP address of the event was also logged. Action item: We are fixing both of these problems and will remove or anonymise any data our legal team deems PII
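To make the two action items above concrete, here is a minimal sketch of what scrubbing an error event before it leaves the machine could look like. It is not Etcher's actual code and does not use Sentry's API: the function names, the event shape ({ message, stack, ip }) and the list of home-directory patterns are all assumptions made for illustration.

```javascript
// Hypothetical helper, not Etcher's or Sentry's actual code.
// Each pattern matches a home directory prefix plus the user name after it.
const HOME_PATTERNS = [
  /(\/home\/)[^\/\s:]+/g,            // Linux:   /home/<user>
  /(\/Users\/)[^\/\s:]+/g,           // macOS:   /Users/<user>
  /([A-Za-z]:\\Users\\)[^\\\s:]+/g,  // Windows: C:\Users\<user>
];

// Replace the user name in any matching path with a fixed placeholder.
function scrubPaths(text) {
  return HOME_PATTERNS.reduce(
    (out, pattern) => out.replace(pattern, '$1<user>'),
    text
  );
}

// Return a copy of the event with paths scrubbed and the IP field dropped.
// The event shape is an assumption for this sketch.
function anonymiseEvent(event) {
  const { ip, ...rest } = event; // the IP address is discarded, not masked
  return {
    ...rest,
    message: scrubPaths(rest.message || ''),
    stack: scrubPaths(rest.stack || ''),
  };
}
```

A pattern list like this only covers the default home locations; paths outside them (custom mount points, network shares) would need separate handling.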
Purpose of data collection
With the legal stuff out of the way, I wanted to touch on the reason we are collecting data which will hopefully help guide the discussion about whether it should be an opt-in or opt-out feature.
For most software engineers, writing an image flashing application sounds easy. After all, at the very core it is a simple block copy operation that we’ve known how to do for ages. It can’t possibly be that complex. However, this is far from the truth! After releasing etcher for the first time, and as the tool was gaining adoption, we were seeing it run in more and more obscure combinations of systems. This produced a (very) long tail of issues that we couldn’t have predicted or tested during development. It was through constantly sifting through error reports and measuring success rates across deployed versions that we managed to reach the level of quality that you see today.
When we say that usage data helps develop etcher we’re not talking about some abstract possibility. This is very real and has shaped the etcher we know and love. The list of bugs fixed is endless.
Discussion on making collection opt-in
With the full context fleshed out we can now re-engage in the discussion of making data collection opt-in. As mentioned above, we have to make the decision that is best for the project and somehow balance what the users expect from a privacy point of view with what the users expect from a robust piece of software point of view. Given the benefits we’ve already seen this is not a clear-cut decision. At the same time the userbase of Etcher has grown tremendously and one could argue that most issues have already been seen. Unfortunately I don’t have a concrete way forward to offer just yet, but we haven’t ruled it out as a possibility.
Finally, to further steer the discussion towards the right direction I will change the title of the issue to just the opt-in discussion. @rradar if you still think there is a legal issue please open a separate ticket clearly explaining the problem. Rest assured that we are working with our legal professionals to ensure we are not breaking the law.
It would be nice if you did not use that sort of language. Please edit your comment and be civilized.
It turns out that collecting user data without explicit consent means that you end up violating the consent of your users in a fraction of cases where that’s not what the user wants.
Doing things with a user’s computer that they don’t want makes your software malware.
It is only “not a clear-cut decision” if you don’t mind violating the consent of your users, which is a despicable stance, if indeed you hold it. Please default data collection to off. Ask users on a first launch with a modal, if you wish. But do not use the network without explicit permission.
Hiding behind an “is it illegal?” to mask the fact that you violate user consent is not something you should be doing. It is rude and immoral, and you should strive to conduct your business in an ethical and respectful fashion.
@rradar thank you for this information. Etcher does not seem to care about privacy…
Maybe aim for higher than “not breaking the law”…
Said @petrosagg:
There is an argument to be made that this is not a personal insult, but in fact an accurate objective description of the current state of affairs. If circumstances are such that failing to protect one’s own privacy would result in danger, then Etcher’s privacy-compromising default settings are indeed dangerous.
It is also clear that releasing the current, consent-violating-by-default version, is an unethical business practice, which would have to be undertaken by unethical people, necessarily. It wasn’t an accident or oversight, it was a clear and definitive choice made by Balena staff, to place bug acquisition data over that of user consent.
The only thing remaining for Balena to do is to remedy this failure.
…plus there’s zero chance the data being transmitted to Google, Mixpanel, etc. is anonymous. This is a dangerous product made by unethical people.
This thread has been dead for a few months, but I want to step in and add my voice to say that not all balena users feel so strongly about this.
I use etcher and balenaCloud on a regular basis, and am glad the Balena team is tracking crash reports and usage data. If they keep it anonymous, and it helps them make etcher faster and more stable, then I approve. As for the tutorial, well, that seems like a way to fund development. People who decide to build the projects will get an experience with the Balena platform and may decide to buy paid services at some point. I have read the devblogs, and it sounds like a surprisingly large amount of work went into making Balena stable and fully cross-platform. To me promoting actually useful tutorials seems like a pretty benign way to make money compared to some of the alternatives like targeted advertising.
For those of you asking why Etcher uses more than 400MB, it is because it is built with Electron, which means most of Chromium gets bundled into each app. In exchange for this size trade-off you get the ability to write the app with HTML, CSS, JavaScript, and familiar browser and Node.js APIs. This is what allows a small company like Balena, who are not really in the business of making desktop apps, to put out something of really high quality like Etcher. I long for the day when a more lightweight framework to build apps with web technologies comes on the scene, but until then I would rather take a bloated app than nothing at all.
To @rradar, @thefaj, and others. You make some good points, and I almost agree with you that the tracking should be opt-in. But I disagree with your tone in this discussion. You are demanding that the Balena team respect what you consider to be your “rights” in a very disrespectful manner. It seems like your verbal abuse is getting in the way of persuading people. I think you might have been able to push me, and possibly the balena team, over the fence into the “no tracking” camp if you had presented your viewpoint more tactfully.
Cheers!
It’s not that it’s bad. It’s that you should not use a user’s computer to do things that user does not want.
If you don’t know if the user wants it or not, ask. But don’t assume and proceed, because then in some set of cases you do what the user does not want, which is a universally bad thing, regardless of the benefits to you or to other users.
@petrosagg we can argue about the stupidity of GDPR as much as we like - but it is a reality we live in. Of course it can establish TCP/IP connections - but in the EU it now means that a lot of legal information has to be provided to the user for doing so.
Legal issues aside - why a fancy version of dd needs to load a webpage from a remote location is beyond me. Until this issue I never realized it won’t work offline.
@thefaj I have repeated this many times but for some reason you seem to ignore it. We’re not using Google Analytics. We’re only using Mixpanel and Sentry.
Secondly, we actually do send anonymous data and strip events from personal information. If you believe this is false you have to provide counter evidence. The code is there for you to inspect. Until then your claim means nothing.
Personal insults are not allowed in this community, please remove this comment. Next time there won’t be a warning.
@rradar I think we’re confusing what this issue is about. You can already disable usage statistics from your settings so if you want to flash completely privately by all means use this feature. It’s why we put it there in the first place.
The only current bug with the feature, which we are working on right now and we’ll release a fixed version in the following days, is that some libraries make a call to a remote server even if you merely require() them, without doing any API calls.
But please try to keep the discussion on point. What we’re discussing here is if anonymous usage statistics should be opt-in. User choice is and will continue to be a feature of Etcher.
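One common way to avoid the side effect described above (a library contacting a server as soon as it is loaded) is to defer loading until reporting is actually enabled. The sketch below shows that pattern only; it is not Etcher's implementation, and `createReporter`, `isEnabled` and `loadClient` are hypothetical names introduced for illustration.

```javascript
// Hypothetical wrapper: the analytics client is only loaded (and so can only
// open connections) after the user has enabled reporting.
function createReporter(isEnabled, loadClient) {
  let client = null; // nothing is loaded at start-up

  return {
    track(eventName, payload) {
      if (!isEnabled()) {
        return false; // reporting is off: do not load, do not send
      }
      if (client === null) {
        // In a real app this could be () => require('<analytics package>');
        // the package name is deliberately left as a placeholder.
        client = loadClient();
      }
      client.track(eventName, payload);
      return true;
    },
  };
}
```

Because the loader is passed in as a function, the module is never touched while the setting is off, which is the property the fix described above is after.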
@thundron wrote in #3006:
@bboc wrote in #3006:
@bboc wrote in #3006:
And ask for the user’s permission before exposing a user’s IP! That’s why we need opt-in! To comply with the laws! (The same applies to the ads shown while flashing.)
I still don’t get why balena is still having such a hard time with the laws even though they are aware of the situation (see the post from @petrosagg) 😞
@sneak we do not operate under a moral framework that includes “universally bad things”. To see why this makes no sense, imagine someone really not wanting to see the color blue on their screen. According to you, all software should ask for consent before displaying the color blue, otherwise “in some set of cases you do what the user does not want”. I expect that you’d assume and proceed in your software because the benefits of using colors in your product outweigh the annoyance of the people that don’t like blue.
In a similar manner, we are weighing the annoyance of some people having to opt out with the benefits to everyone else from improving the software and we think this is the right way to approach this.
To be clear, we’re fanatic on the way we approach this issue, not the outcome. In fact it’s very possible that we end up making it opt-in. If we were sure about the outcome the issue would be closed.
It’s just tiring having the same group of people shouting and shouting as if they are some sort of moral oracle bringing justice to the world.
The collection of so called “anonymous usage statistics” needs to be opt-in. Otherwise everyone (even people who prefer not to leak their data) will be forced to participate in the data collection.
The setting which is implemented right now leaks data before it can be turned off -> NO GO!
I guess there is a difference between typing in a URL and clicking a link, and just loading a resource from an application. And at least for every 3rd party one should provide information about what is happening with that kind of information. Even if it is “we are storing nothing” - but IANAL.
I like the idea of GDPR but I am not a fan of the implementation - so to speak. Anyway!
Thanks for clarifying about the offline support.
Just found the setting that was enabled. Will give that a try. Thanks!
Yeah, the benefit of doubt went out the window long ago. This app is a cesspool of spyware
Just came here because this fancy version of dd created connections to all these hosts:
and I just could not believe it. Not cool. Looking forward to the opt-in. Otherwise I am back to dd.
By the way: I did the opt out setting and etcher still wants to gain (again unintentionally? 😞) access to some cloud and tell about my presence… 👎
Thoughts how opt-in could look like:
First start modal:
“Hello we are balena team working hard… Can we have your data to make the world a better place?” Yes - No (There should really be no preference, by design or visuals, for what to choose)
-> If I say no I don’t want etcher to make any network connection. (Could ask a second one if etcher would be allowed to phone home to check if a new version is available…)
If an error happens during flashing or using the program, a modal could be presented to the user:
“We caught an error. To give us a chance of solving this you can upload the crash report to the balena cloud now” Yes - No
Settings: “Error Reports and usage statistics” (initially turned off)
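The flow proposed above can be sketched as a small piece of settings logic. This is only an illustration of the proposal, not something Etcher ships: the settings shape, the use of `null` for "not asked yet", and all function names are assumptions.

```javascript
// Hypothetical settings shape; `null` means "the user has not been asked yet".
const DEFAULT_SETTINGS = {
  errorReporting: null,
  updateCheck: null,
};

// Show the first-start modal only while a question is still unanswered.
function needsFirstStartModal(settings) {
  return settings.errorReporting === null || settings.updateCheck === null;
}

// Record the answers from the modal; anything but an explicit `true` is a "no".
function applyAnswers(settings, answers) {
  return {
    ...settings,
    errorReporting: answers.errorReporting === true,
    updateCheck: answers.updateCheck === true,
  };
}

// Network use is allowed only after an explicit "yes" for that purpose.
function mayUseNetwork(settings, purpose) {
  return settings[purpose] === true;
}
```

With these defaults, a fresh install makes no network connection for either purpose until the user has answered the modal, and the settings page can later flip either value.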
This affirms that the creators of this software have no respect for user consent.
This will need to be forked.