Can exfat be used in WMF production?
Closed, Resolved, Public

Description

Context here: WMF is engaged with an external party which has indicated only exfat will work for them for data transfer of a physical volume. Let's figure out if that's really ok.


From https://tools.wmflabs.org/sal/log/AWdbRQA5zCcrHSwq1eVM

17:01 <chasemp> stat1004:~# aptitude install exfat-fuse exfat-utils (elukey fyi)

exfat is patent encumbered, non-free software. From https://en.wikipedia.org/wiki/ExFAT:

exFAT is proprietary, and Microsoft owns patents on several elements of its design.[2]

Fedora has the same stance, from https://fedoraproject.org/wiki/Forbidden_items#exfat:

Implementations of exfat are currently patent-encumbered, thus not permitted in Fedora.

In addition, this was installed outside of puppet, in violation of the Wikimedia Server Access Responsibilities document (L3).

Event Timeline

Legoktm triaged this task as Unbreak Now! priority. Nov 28 2018, 9:08 PM
Legoktm created this task.
chasemp lowered the priority of this task from Unbreak Now! to Medium. Nov 28 2018, 9:13 PM

I am under the impression anything in Debian main is ok to install in prod, but this is based on adhoc conversations during the Ubuntu->Debian decision making process.

Ofc, let's remove and talk to legal ppls:

stat1004:~# aptitude remove -y exfat-fuse exfat-utils
The following packages will be REMOVED:
  exfat-fuse exfat-utils
0 packages upgraded, 0 newly installed, 2 to remove and 48 not upgraded.
Need to get 0 B of archives. After unpacking 295 kB will be freed.
No directory, logging in with HOME=/
INFO:debmonitor:Got 2 updates from dpkg hook version 3
INFO:debmonitor:Successfully sent the dpkg_hook update to the DebMonitor server
(Reading database ... 120661 files and directories currently installed.)
Removing exfat-fuse (1.2.5-2) ...
Removing exfat-utils (1.2.5-2) ...
Processing triggers for man-db (2.7.6.1-2) ...
chasemp renamed this task from Non-free software installed on stat1004 outside of puppet to Can exfat be used in WMF production?. Nov 28 2018, 9:14 PM
chasemp added a project: Analytics.

@fgiunchedi I need to sync up with you here for other reasons, but if you could take a look at this that would be great

exfat-fuse itself is free software (GPL) and part of Debian main. Debian's approach on patents is written up at https://www.debian.org/reports/patent-faq (TLDR; unless patents are actively enforced, they're ignored. Debian has been shipping ffmpeg which implements patent-encumbered algorithms for a long time as well).

Other distros (especially Red Hat/Fedora which are operating under US law) have taken a more rigid stance. Given that we're not distributing exfat that should also not pose a risk to WMF, though. (Obviously IANAL).

That said, the use of exfat is definitely non-ideal and we should strive to avoid it in favour of a more open filesystem (and especially one with a native, in-kernel implementation), but this is a one-off use case and not something for the day-to-day.

exfat-fuse itself is free software (GPL) and part of Debian main. Debian's approach on patents is written up at https://www.debian.org/reports/patent-faq (TLDR; unless patents are actively enforced, they're ignored. Debian has been shipping ffmpeg which implements patent-encumbered algorithms for a long time as well).

Other distros (especially Red Hat/Fedora which are operating under US law) have taken a more rigid stance. Given that we're not distributing exfat that should also not pose a risk to WMF, though. (Obviously IANAL).

If it's not re-distributable by us, then it doesn't meet the definition of free software (freedom #2). The ffmpeg example makes it pretty clear that the WMF does care about patents, given that previous legal opinion during the MP4 debate was that we couldn't use those patent encumbered libraries without signing the contract with MPEG-LA [1].

Small bit of background from my perspective, I had discussed this on hangout with a few folks who I will let acknowledge their own level of approval. I used !log and pinged @elukey with the intention of uninstalling post work-at-hand. Nothing here was me intending to take unilateral action or circumvent process. I really am under the impression that Debian main has nothing which would be incompatible with WMF prod infrastructure. I have no particular affection for exfat, and would much prefer to be able to use ext[34] but have been assured that is not a possibility. If there are legal issues here I'm glad @Legoktm flagged it.

Other distros (especially Red Hat/Fedora which are operating under US law) have taken a more rigid stance. Given that we're not distributing exfat that should also not pose a risk to WMF, though. (Obviously IANAL).

If it's not re-distributable by us, then it doesn't meet the definition of free software (freedom #2).

I have absolutely no skin in this game, but this statement is incorrect. Anyone living in a country without enforcement of those specific software patents can freely distribute exfat.

When we talk about software freedom, we refer to the freedoms the author gave everyone else, not the ones a bizarre law takes away from our organization.

Hi all, commenting on this from WMF Legal.

As I understand the question and context, the issue is using a proprietary format for some data transfer because an external party isn’t set up to handle other file formats that might more properly meet the definition of free software. If so, that’s okay in this case and probably okay in most other cases though maybe not all. The WMF is committed to making free software, and our mandate is to look for free software solutions where possible, but our guiding principles contemplate that the WMF as an organization will sometimes use proprietary tools where there is currently no open-source tool that will effectively meet our needs. Where, as appears to be the case here, a third party doing important work for the Foundation isn’t able to do that work without a transfer using a patent-encumbered file-format, that would fit the situation of having no alternative that effectively meets our needs, so using the proprietary file format to get work done is acceptable. This is likely true in similar situations with other third parties, although there could be situations where the third party could actually make a free software solution work with minimal trouble, so I can’t say it’s true in every case.

Hi all, commenting on this from WMF Legal.

As I understand the question and context, the issue is using a proprietary format for some data transfer because an external party isn’t set up to handle other file formats that might more properly meet the definition of free software.

The main legal issue here is whether it's OK to install proprietary, patent encumbered code on our production servers.

The secondary and probably more important philosophical one is whether we even should. The guiding principle says:

As an organization, we strive to use open source tools over proprietary ones, although we use proprietary or closed tools (such as software, operating systems, etc.) where there is currently no open-source tool that will effectively meet our needs.

Given the amount of free filesystem implementations available to us, I don't think we can justify using a proprietary one here.

If it's not re-distributable by us, then it doesn't meet the definition of free software (freedom #2).

I have absolutely no skin in this game, but this statement is incorrect. Anyone living in a country without enforcement of those specific software patents can freely distribute exfat.

Sure, if you live in a jurisdiction where these patents don't apply, then you can redistribute exfat.

When we talk about software freedom, we refer to the freedoms the author gave everyone else, not the ones a bizarre law takes away from our organization.

That's not really my understanding. https://www.gnu.org/philosophy/free-sw.html.en specifically outlines that free refers to the freedoms the user actually has. To quote:

A program is free software if it gives users adequately all of these freedoms. Otherwise, it is nonfree.

In this case, I, a potential exfat user, don't have all freedoms. So it's non-free. It's not as explicitly spelled out in the free software definition (it's buried in the explanation of freedom #0) that this consideration is for all users, but the open source definition explicitly says:

  5. No Discrimination Against Persons or Groups

The license must not discriminate against any person or group of persons.

In short, if the license terms aren't usable by everyone (including people who live under bizarre patent laws!), then it's not open source (aka free software). Semi-relatedly, this is why we have and use tools like CC-0, because "public domain" doesn't exist in all jurisdictions, and we care about making sure our work is (re)usable by everyone, regardless of where they live.

So I think this task raises a few different issues (and @Legoktm correct me if I'm wrong):

  1. Legal concerns about using this particular piece of software, and in general software in the same limbo status with regards to freedom-respecting copyright license, but enforced patents;
  2. Guiding principles / Wikimedia movement / free software movements concerns over using patent encumbered software
  3. Installing software outside of our regular processes (puppet, no code review etc.) and in contrast with the commitments we enumerate in L3.

(1) is for Foundation's Legal department and based on their response above it seems like this particular software is not an issue for us. That said, I think that when known-to-be-problematic cases come up we should proactively check with them before using such software, just in case. (I'm not sure if this happened here, but it doesn't matter now anyway)

While a complicated subject, I'm personally sympathetic to the arguments on (2) in general and of course would agree on the process issue around (3), broadly speaking. However, on both points, it's worth noting that this is a one-off thing, done under what is a highly unusual and relatively urgent situation, and the use of exFAT is temporary and not driven by internal requirements. As far as I know (and correct me if I'm wrong) this installation has an expiration date that's probably counted in days.

If that is indeed the case, it's probably not worth debating too much on the specifics of this task, but figure out what we can learn from this for future cases like it.

I think what may help here and in the future would be to document this situation in a private Phabricator task: the reasons behind this (and similar) asks, and the rationale for the actions taken. That would also help to communicate the context behind those actions and the limitations of the environment that we have to operate in, keep a central place where actions are logged, and help to ensure changes are peer-reviewed (which AIUI happened here, but without that being obvious to others). It would also help us not to forget about this, e.g. by neglecting to uninstall this package when the time comes.

I'm honestly not sure how much of that is happening already in tasks I haven't seen, nor how feasible it is in the first place to be transparent internally about all this, but… I think we should definitely try :)

Well, I don't think it even needs to be treated in a private task. There was a question about how to interpret the rules / how pure we wish to be, so people brought it up and discussed it to reach a consensus.

By the way, I would like to question the premise of the external party that "only exfat will work for them". They may think so, but it's probably not the case.

Options that may be available to them if WMF had indicated that due to patent issues they can't use exfat:

  • Place the files on a FAT32 filesystem. Probably not an option, since files larger than 4 GB are presumably involved, which is likely the source of their statement.
  • Use a FAT32 filesystem, splitting the large files as e.g. multi-file zips.
  • Boot from a Linux live CD, format the disk as ext3/4 and copy the files there.
  • Install an ext3 driver for the OS they use and format the disk with that.
  • Put an archive file (eg. a tar file) directly on the raw media
  • Convince the patent holder to free the format. (-:
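As a rough illustration of two of the options above (splitting large files to fit on FAT32, and writing an archive straight to the raw media), here is a minimal shell sketch. It assumes GNU coreutils and tar; the function names, the 2G default chunk size and the /dev/sdX device name are hypothetical choices for illustration, not something the vendor or WMF is known to use.

```sh
#!/bin/sh
# Sketch only: function names and defaults are illustrative assumptions.

# split_for_fat32 SRC OUT_PREFIX [CHUNK_SIZE]
# Split SRC into pieces that each stay below FAT32's per-file limit
# (just under 4 GiB). The 2G default assumes GNU split size suffixes.
split_for_fat32() {
    src=$1
    prefix=$2
    chunk=${3:-2G}
    split -b "$chunk" "$src" "$prefix"
}

# join_chunks OUT_PREFIX DEST
# Reassemble the pieces. split's default suffixes (aa, ab, ...) sort
# lexicographically, so a plain glob yields them in the right order.
join_chunks() {
    prefix=$1
    dest=$2
    cat "$prefix"* > "$dest"
}

# write_tar_to_media SRC_DIR TARGET
# Write a tar archive of SRC_DIR directly to TARGET, with no filesystem.
# TARGET would be a block device such as /dev/sdX (hypothetical name);
# a regular file behaves the same way and is safer for a dry run.
write_tar_to_media() {
    tar -cf "$2" -C "$1" .
}
```

The receiving side would run `cat` on the pieces (or `tar -xf` against the device) and should compare checksums, for example with `sha256sum`, before relying on the result.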

These have varying degrees of difficulty (obviously the easiest one is to have the WMF use exfat), and we should probably be more lenient with some third parties than others, but pushing for one of them would also be a form of evangelism.

Thanks @faidon for weighing in, I think you got right to the heart of it. Not responding to you necessarily but I'm going to steal the 3 point breakdown as it makes sense to me. I don't feel empowered to relate much of the detail for history here, but I do value this conversation and want to respond.

(1) Legal counsel has been involved along the way and I defer to them.

(2) The ideological component here is bigger than I feel I can encompass entirely in this response. I have argued on the side of FLOSS exclusively as the ideal in WMF production in meetings and threads over the last few years and feel comfortable saying I'm a believer in the necessity. There are exceptions in production that reflect compromises made. Proprietary firmware, routing and switching infrastructure, etc. Where those compromises make sense is pretty complicated. Suffice it to say, with enough time and money any obstacle can be overcome. In this situation specifically those are both limited resources. When it comes to external parties a post-mortem item here (in my opinion and documented in a place for it) is that a technical resource WMF side can be engaged early to communicate these values and the realities they manifest. When I've found myself in a position where I'm coming in down the line after initial engagement the hurdles regarding time constraint on both sides and money have been significant and impacting for practical outcomes.

(3) I don't understand that I've softened my commitment to L3, and I'm really open to walking it out to be better if I have. That probably makes sense somewhere not this task. It is a semi-regular (but not lovable) part of operational reality here that commands and changes are run in parallel to or outside of Puppet and thus !log and such exists. It's 100% not what I strive to do in the normal course of business, but in looking at apt in SAL it's not alien to normal life. In this case specifically, my thinking was that I had agreement and understanding with another Opsen, a manager in Tech, a director in Tech and a couple more knowledgeable and engaged parties in real time right before (as review of action). I installed the package with a !log so it would be recorded in the right place and a ping to one of the Opsen who works in that specific area. I spoke about it in channels where relevant people are involved, and that's how this task came to exist. The reason I did not push this through Puppet is I knew I don't want this package installed as part of the system build, and I knew we had agreed to remove it. I would agree the duration here is days (maybe a week-ish) and not months. I feel confident in my commitment and ability to complete the uninstall as described. I would totally have done it differently had I thought this was not sound reasoning. I did not consider that something in Main for Debian would be an issue, and I'm still under the impression even now that nothing I know of in main is out of bounds even if a particular package may not be something to encourage or perpetuate. I would never do something like sudo apt-add-repository non-free for example.

In this case specifically, my thinking was that I had agreement and understanding with another Opsen, a manager in Tech, a director in Tech and a couple more knowledgeable and engaged parties in real time right before (as review of action). I installed the package with a !log so it would be recorded in the right place and a ping to one of the Opsen who works in that specific area.

Totally -- and I didn't mean to imply that this didn't happen!

To clarify, I'm trying to say: a) generally speaking those L3 responsibilities still exist and make sense and we're not saying -implicitly or explicitly- that it's OK to bypass them (this isn't what happened here) and b) we should (collectively) strive to document these things in the future, for the benefit of others outside of the private email threads and Hangouts, to avoid the appearance that things have happened outside of those processes and/or without a good reason (when in fact they have been!). Does this make more sense?

Yep, completely thanks. For myself, next time I would rather do something like <chasemp> stat1004:~# aptitude install exfat-fuse exfat-utils (elukey fyi -- to be uninstalled shortly and approved via f2f) as an improvement to <chasemp> stat1004:~# aptitude install exfat-fuse exfat-utils (elukey fyi)

Reading the backlog only now, this was a good learning lesson for me too (I was aware of what Chase did, as mentioned, and didn't think that it would be flagged as an issue to review). Thanks a lot to all who contributed their thoughts and suggestions :)

So I think this task raises a few different issues (and @Legoktm correct me if I'm wrong):
<snip>

I think that was a good summary, thanks.

(1) is for Foundation's Legal department and based on their response above it seems like this particular software is not an issue for us. That said, I think that when known-to-be-problematic cases come up we should proactively check with them before using such software, just in case. (I'm not sure if this happened here, but it doesn't matter now anyway)

Yep.

While a complicated subject, I'm personally sympathetic to the arguments on (2) in general and of course would agree on the process issue around (3), broadly speaking. However, on both points, it's worth noting that this is a one-off thing, done under what is a highly unusual and relatively urgent situation, and the use of exFAT is temporary and not driven by internal requirements. As far as I know (and correct me if I'm wrong) this installation has an expiration date that's probably counted in days.

On #2 (the freedom issue), I don't think "a highly unusual and relatively urgent situation" is an exception to our guiding principles. Wikimedia itself is highly unusual, and we still always use free solutions. I don't know what the specific urgency in this case is, but I think we've been in plenty of other urgent issues (like outages!) and we don't use that as a reason to use non-free software.

I think the more important point is that there are plenty of free alternative filesystems available. And Platonides has helpfully provided even more alternative ideas as well.

On #3 (the process/puppet issue), I see no reason to bypass our processes of using puppet. How difficult would it have been to add the following few lines to puppet:

# needed by chasemp for TXXXXX, to be removed on 2018-12-XX
require_package( 'exfat...' )

Probably would have taken 5 minutes to write the change, and another few minutes to go through Gerrit, puppet-merge, and deploy. What's the downside? The upsides are pretty well documented...in our policies like L3. I don't think the fact that the software that was installed bypassing our policies turned out to be problematic was a coincidence.

If there are legitimate exceptions, then the language "All software should be installed via puppetized packages only" from L3 should be replaced with the actual practice that ops/SRE follows.

I want to acknowledge a few things:

  • @Legoktm I appreciate that you feel strongly about this
  • The use of exfat is not any sort of preference or personal judgement call on my part
  • No foundation servers will be formatted with exfat

To bring this back around to the breakdown above:

  1. Legal has expressly granted this exception
  2. With respect to the WMF charter and the values and manifestation thereof, it seems the exception process and/or the bar for each use case is at the discretion of WMF leadership and probably specifically most informed by WMF technical leadership. I'll defer to @JBennett who has more information.
  3. I defer to @faidon on any process issues to this point or improvements going forward. L3 could possibly use some language changes to bridge understanding.
  2. With respect to the WMF charter and the values and manifestation thereof, it seems the exception process and/or the bar for each use case is at the discretion of WMF leadership and probably specifically most informed by WMF technical leadership. I'll defer to @JBennett who has more information.

Thanks everyone for their thoughtful consideration. I have no issues nor do I see a conflict with temporarily allowing the use of this so we can complete our work with our 3rd party partners.

Thanks everyone for their thoughtful consideration. I have no issues nor do I see a conflict with temporarily allowing the use of this so we can complete our work with our 3rd party partners.

Thanks for commenting. To clarify, are you saying that there is no free filesystem software that will meet our needs? Given that @Platonides offered a decent amount of alternatives in T210667#4789273, it would be helpful if it could be explained why none of those alternatives are sufficient. That will help me when I reach out to various free software upstreams so they can improve their software so it will meet our needs - a goal that I think we all share.

This was reviewed with the vendor and there is not a practical approach outside of exfat for them. The main issues boil down to retention/the ability to reproduce, and expediency. This vendor has a very standard process for replicating and executing their work. Any deviation from this process has to be re-assessed and re-certified by the vendor. This takes time and will delay all of this work, which in turn adds overhead and cost on all ends of this project. This vendor will potentially need to retain this information for an extended period of time and, again, potentially reproduce or re-assess it. If the format provided does not meet their standard, the assessor will need to re-invent this one-off solution again.

So to answer your question: for this vendor, no, there is not a free file system that will meet their needs in a reasonable way.

chasemp claimed this task.
chasemp mentioned this in Unknown Object (Task).