The truly shocking VW emissions fraud should force us to think through how we can ensure the transparency that is needed in software. The general issue is excellently summarized in this recent NYT article:
“Intelligent public policy, as we all have learned since the early 20th century, is to require elevators to be inspectable, and to require manufacturers of elevators to build them so they can be inspected,” [Mr. Moglen, a lawyer, technologist and historian who founded the Software Freedom Law Center] said. “If Volkswagen knew that every customer who buys a vehicle would have a right to read the source code of all the software in the vehicle, they would never even consider the cheat, because the certainty of getting caught would terrify them.”
That is not how carmakers or even the E.P.A. see things. The code in automobiles is tightly protected under the Digital Millennium Copyright Act. Last year, several groups sought to have the code made available for “good-faith testing, identifying, disclosing and fixing of malfunctions, security flaws or vulnerabilities,” as Alex Davies reported last week in Wired.
A group of automobile manufacturers said that opening the code to scrutiny could create “serious threats to safety and security.” And two months ago, the E.P.A. said it, too, opposed such a move because people might try to reprogram their cars to beat emission rules.
At one level, the fact that organizations’ policies have to be expressed in software these days is good news: at least in theory, it should be much easier to find pricing discrimination buried in a website algorithm than to show a pattern among individuals engaged in intuitive price setting (see my earlier blog here).
But that depends on the software being open and accessible for review. The industry has argued successfully against this in terms of maintaining competitiveness and intellectual property, and in terms of the need to protect against hacking of various forms. The competitiveness argument carries little weight: we already have a robust, industry-dominated system for protecting intellectual property. Rather, it is an argument for maintaining barriers to entry, something we should regard with deep suspicion.
The hacking and security argument also cuts both ways. Given enough resources, hackers can always get in. Transparent software might be easier to hack, but its errors will also be found much more quickly by the good guys.
This is important for us not just because it raises all sorts of legal issues, but because very soon software-driven algorithms are going to be doing all kinds of things in the justice system. Probably most important in terms of transparency and credibility will be the emerging triage systems being developed in court and legal aid contexts. Similarly, as big data is used to drive decisions about how documents are assembled and cases are processed, the need for transparency is critical. See here the triage principles developed in 2012.
So the rules, and indeed how the rules are actually put into practice, both need to be transparent. The problem this creates is the risk of gaming. If, for example, it is known that saying you read badly makes it easier to get a lawyer in a particular triage system, then some people may claim that they do not read well. Sadly, therefore, the best triage factors, or rather the best proxies used to score those factors, may be objective data, like level of education. Best of all are those that can be obtained from other databases, such as whether a tax return was filed on one’s own or with help.
This may well make building triage systems harder, but if we ignore the risk, it is only a matter of time before we find that some Ferguson-like municipality has developed a fee-maximization algorithm for arrests, fines, fees, and assessments.
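A proxy-based triage score of the kind described above might look something like this. It is a purely hypothetical sketch: the fields, weights, and thresholds are invented for illustration, not drawn from any real triage system.

```python
# Hypothetical triage scoring on objective, hard-to-game proxies
# rather than self-reported answers. All fields and weights here
# are invented for illustration only.

def triage_score(applicant: dict) -> float:
    """Higher score = higher priority for legal assistance."""
    score = 0.0

    # Objective proxy for reading ability: years of schooling,
    # which is harder to misstate than "I do not read well".
    years = applicant.get("years_of_education", 12)
    score += max(0, 12 - years) * 0.5

    # Proxy drawn from another database: needing help to file a
    # tax return suggests difficulty handling complex documents.
    if not applicant.get("filed_tax_return_unassisted", True):
        score += 2.0

    return score

# Someone with 9 years of schooling who needed tax-filing help:
print(triage_score({"years_of_education": 9,
                    "filed_tax_return_unassisted": False}))  # 3.5
```

Because both inputs come from records rather than from the applicant’s own answers, the score is harder to game, which is exactly the trade-off described above.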
The irony of the EPA statement is that it places the agency clearly on the wrong side of history, at least on this issue. Although throwing in the Ferguson issue is inflammatory provocation, it fits. So what other areas also fit?
Try the issue of Phase III randomized trials of new medicines, in which the size of the population to be studied may be calculated from the size of the expected difference between the treatments being compared and where on the scale that difference occurs. Fine. So we are looking for efficacy in this calculus. As a matter of form, safety is also included. The FDA concludes that the new treatment works when compared to the former method and that no safety issues emerged in the trial. But was the trial powered to look at safety issues per se? Of course not. The drug companies are driven by sales and margin; the scientists are driven by wanting to make discoveries (and career building, maybe!); and the FDA wants to show that its public mission is successful on its watch. So what’s the problem? Who is speaking for the patients? Post-marketing surveillance is rarely mandated. More often, a voluntary collection of sporadic complaints is “encouraged”. And so serious complications, and even low but significant death rates, may go on for years!
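The efficacy-versus-safety asymmetry can be made concrete with the standard two-proportion sample-size approximation. This is a back-of-the-envelope sketch; the event rates below are invented for illustration and do not come from any actual trial.

```python
# Sketch: a trial sized to detect an efficacy difference is far too
# small to detect a rare safety harm. All rates are hypothetical.
from statistics import NormalDist

def n_per_arm(p1: float, p2: float,
              alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate patients per arm needed to distinguish event
    rates p1 vs p2 (two-sided test, normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_a + z_b) ** 2 * variance / (p1 - p2) ** 2

# Efficacy: response rate 30% vs 45% -> roughly 160 patients per arm.
print(round(n_per_arm(0.30, 0.45)))

# Safety: a serious harm doubling from 0.1% to 0.2% -> roughly
# 23,500 patients per arm, far beyond a typical Phase III trial.
print(round(n_per_arm(0.001, 0.002)))
```

A trial of a few hundred patients per arm can therefore declare that “no safety issues emerged” while being essentially blind to harms of this magnitude.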
At least the CEO of VW has resigned, which will give the company the opportunity to become transparent. The general principle should now be accepted: in these days of multiple interests, transparency of process and of data is essential to provide the checks and balances on which our knowledge and individual autonomy must now depend.