I’m an avid reader of engineering disasters since one of my primary roles in my day job is to avoid them. And, away from work, we are taking a small boat around the world with only two people on board, and that too needs to be done with care since an engineering or operational mistake could conceivably be terminal. In Why I enjoy reading about Engineering Accidents, Failures, and Disasters I talk about some of the advantages of reading about and learning from disasters across all domains. Past topics have included Studying the Costa Concordia Grounding, What Went Wrong at Fukushima Dai-1, Official Report of the Fukushima Nuclear Accident Independent Investigation Commission Executive, The Power Failure Seen Around the World, and an operational mistake from my personal experience, 69.1 degrees.
Almost all engineering or operational disasters have a large people-related component even if the system is largely automated. People, and less than ideal decisions made by people, are at the core of many of these complex system failures. Knowing this, I’m always looking to learn more about what can cause bad operational practices to set in over time, and how leadership can set the right values and audit sufficiently closely to ensure the right thing happens every day even when there have been no problems for years. How do we keep the team vigilant and actively avoiding mistakes?
With this backdrop, the Volkswagen emission fiasco is a highly interesting example. In this situation it appears that Volkswagen intentionally configured their emissions control system to special case the emissions tests such that they pass while the system actually is unable to attain the standard when not being tested.
It’s easy to see why a company might find this tempting. Emissions requirements force tough engineering compromises. Hitting them can reduce engine drivability and engine power, and can both consume more fuel and substantially increase costs. To be sure, many emissions requirements have actually improved engine efficiency, drivability, and power output but, as with all things in engineering, the more a single dimension is optimized, the harder it is to not give up ground in other dimensions. Modern emissions standards bring compromises and put high pressure on the engineering teams.
For those not fully up to speed on the VW Emissions Fiasco, the summary from Volkswagen Emissions Scandal is worth including here:
On 18 September 2015, the United States Environmental Protection Agency (EPA) issued a notice of violation of the Clean Air Act to German automaker Volkswagen Group, after it was found that the car maker had intentionally programmed turbocharged direct injection (TDI) diesel engines to activate certain emissions controls only during laboratory emissions testing. The programming caused the vehicles’ nitrogen oxide (NOx) output to meet US standards during regulatory testing, but emit up to 40 times more NOx in real-world driving. Volkswagen put this programming in about eleven million cars worldwide, and in 500,000 in the United States, during model years 2009 through 2015
Hitting automotive emissions requirements today is a very difficult engineering challenge and almost always forces tough compromises. It follows then that it could be a real competitive advantage to not meet the emission requirements. Whenever the challenges are great, the competitive pressures high, the temptation to do the wrong thing can be very large. This temptation is somewhat curtailed by the knowledge that there will be independent government tests that must be passed but still, what about special casing for these and still not meeting the emissions standards?
When there are many millions of dollars on the line, company leadership can be very tempted and individual engineers can feel enormous pressure to succeed, even to the point of saving their job by cheating. What usually prevents cheating is that big engineering projects, even though highly secretive, still involve 10s or even 100s of engineers. A single engineer simply can’t put a conditional test into the code that says, for example, if the OBDII connection is live (it will be during emissions tests) or if the car is not moving when under acceleration (on a chassis dynamometer), then comply with the emission standards but otherwise don’t. There are few secrets on large engineering projects. If company leadership asks the team to cheat, everyone on the team knows it. If a single engineer puts in a code change to do something like what I outlined above, it’ll need to be reviewed by other engineers and it’s close to impossible that nobody will notice the illegal code. There is a good chance that any substantial engineering project will have at least one person that feels personally committed to doing the right thing on emissions and not cheating their customers. Hopefully a lot more than one, but one is all it takes. On large teams, and most automotive engineering projects are quite large, it’s almost impossible that there will not be at least one honest or environmentally committed engineer.
This event really caught my interest. How did VW intentionally implement a non-complying emissions systems and yet it not be reported or detected for years? It just seems impossible that someone on the team would not at least anonymously report it. But, since nobody did, we all need to understand what went wrong and find ways to avoid similar failures on our own engineering projects.
Lawsuits and other legal action, both civil and criminal, make it very difficult to get information and learn from this event. Volkswagen as a company is facing an estimated $18B liability (Putting a Price on Volkswagen’s Emission-Fraud Mess) so it’s particularly difficult to get data on the event. If VW management asked the team to cheat, how did they keep the knowledge of this so tightly controlled? If the decision was made by a rogue engineer under enormous pressure to hit the emissions standards without giving up cost, drivability, fuel economy, or power, then how did the changes go into the firmware without being broadly seen by other engineers on the team? I’ve still not found a definitive answer for any of those questions but did find what appears to be a very credible explanation of exactly what actually happened at VW. Understanding what was done gives us some clues into how this escaped broad notice by the engineering team and why nobody reported it publicly.
In Inside the Volkswagen Emissions Cheating, Jake Edge reports on a talk given at the 32nd Chaos Communication Congress (32C3). From Edge’s posting:
The 32nd Chaos Communication Congress (32C3) held at the end of December, Daniel Lange and Felix Domke gave a detailed look into the Volkswagen emissions scandal—from the technical side. Lange gave an overview of the industry, the testing regime, and the regulatory side in the first half, while Domke presented the results of his reverse-engineering effort on the code in the engine electronic control unit (ECU), as well as tests he ran on his own affected VW car. The presentation and accompanying slides [PDF] provide far more detail than has previously been available.
One of the authors of the presentation, Lange, is a security researcher. These are folks that crack software and hardware systems looking for security weaknesses. Some of these problems are reported to the company that produced the system and get fixed, which is a service to the industry. Some are sold to the companies involved, which isn’t a business model I particularly like but it arguably also contributes to the industry. Some of these security flaws are sold on the open market and get used illegally. Again, this fringe aspect of the security research community is not my favorite but, whether or not we like all the business models, it’s still very important to stay current with the security research community if you work in the commercial hardware, software, or services world.
I really like this application of security research to understand what was actually done when a company isn’t being forthcoming, due to legal complications, on what went wrong and exactly what happened. What Lange and Domke found is super interesting and is the best source I have come across so far on what actually happened at VW. What these researchers found involved a component of the emission control system that injects controlled amounts of urea and water into the exhaust. This is used by modern Selective Catalytic Reduction (SCR) diesel engines to control nitrogen oxide (NOx) emissions. But, like many things in the control system world, choosing the right amount to inject can be difficult. Insufficient urea injection levels will allow excessive NOx emissions, which would fail the emission test. But excessive injection levels will produce high levels of ammonia which, of course, is highly undesirable.
Understanding that correct injection levels are incredibly difficult to achieve under all circumstances, some conditions are treated specially. From Jake Edge’s posting, an ECU is an Engine Control Unit and AdBlue is the German nomenclature for the urea that is injected into SCR diesel engines to meet emission requirements in many jurisdictions:
The SCR is also modeled in the ECU. It takes sensor readings and outputs from other models and produces an amount of AdBlue to use. Ideally, that would be the right amount to eliminate NOx, but emit no ammonia. There is also a separate monitoring function that will trigger an OBD-II error if the efficiency of the conversions is too low. That might cause a “check engine” condition so that the owner takes the car in for service.
It turns out that the standard SCR model does not work under all conditions (e.g. if the engine is too hot), so there is an alternative model that runs in parallel. It is a much simpler model, with fewer inputs, that has the goal of never adding too much AdBlue. There is code in the ECU that determines which model to use, and that code depends on the data provided by the car maker. In addition, the ECU stores information about which model is chosen at each ten-millisecond interval.
At this point we have an expected and accepted exception in the engine management systems so nobody will be surprised to see this second, more conservative urea injection curve in the ECU injection maps. And nobody will be surprised to see a complex set of conditions on whether to use the standard map or the exception map. Again, the existence of this code is unsurprising and normal.
What the researchers found is that their test car was consuming roughly 24% of the urea it would have been expected to consume under compliant emissions operations. So, being security researchers, they disassembled the engine management system’s code and went through the sets of conditions that select the more conservative urea injection model, and found these conditions were broader than they should be. Specifically, the alternative conservative urea injection curve should be used whenever any of a variety of operating conditions tests true, but one of the conditions being ORed was that engine temperature is above -3276.8K. For the non-physicists amongst you, that test will always be true. Essentially the alternative injection model is always used. This would clearly fail emissions tests so they knew it couldn’t be that simple.
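The odd-looking threshold is suggestive: -3276.8 is what the smallest signed 16-bit integer, -32768, becomes when divided by ten. The following is a minimal sketch of why such a threshold can never fail. It assumes the temperature is held as a signed 16-bit count at ten counts per kelvin, which is an inference from the number itself rather than something the talk or Edge’s article is quoted as saying here, and every name in it is hypothetical.

```python
INT16_MIN = -32768
INT16_MAX = 32767
COUNTS_PER_KELVIN = 10  # assumed encoding: raw count / 10 = kelvin


def raw_to_kelvin(raw: int) -> float:
    """Convert a raw signed 16-bit count to kelvin under the assumed scaling."""
    return raw / COUNTS_PER_KELVIN


def temperature_condition(raw_temp: int, raw_threshold: int) -> bool:
    """Hypothetical form of the check 'engine temperature is above threshold'."""
    return raw_temp > raw_threshold


# With the threshold at the bottom of the representable range, every
# physically meaningful reading (0 K and up) satisfies the condition.
threshold_kelvin = raw_to_kelvin(INT16_MIN)
always_true = all(
    temperature_condition(raw, INT16_MIN) for raw in range(0, INT16_MAX + 1)
)
```

Because this condition is ORed with the others, a single always-true term is enough to select the alternative model regardless of what the remaining conditions say.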
In digging deeper Lange and Domke found another set of conditions that would force the system back to the standard, emission complying urea injection model. These conditions included a complex set of linear curves that if all matched true would force the system back to the compliant model. As you could probably guess the emissions tests happen to just barely be contained in these curves while almost all normal driving will fall outside them.
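The selection logic described above can be sketched roughly as follows. This is an illustration of the structure only, not the ECU code: the band shapes, the choice of axes, the numbers, and all names are hypothetical. The point is that the override back to the compliant model fires only when a sample sits inside every band at once.

```python
from dataclasses import dataclass


def interpolate(points, x):
    """Piecewise-linear interpolation over (x, y) points sorted by x, clamped at the ends."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("points must be sorted by x")


@dataclass(frozen=True)
class Band:
    """A region bounded below and above by piecewise-linear curves."""
    lower: tuple
    upper: tuple

    def contains(self, x, y):
        return interpolate(self.lower, x) <= y <= interpolate(self.upper, x)


def select_model(alternative_requested, x, y, bands):
    """Return 'standard' when every band matches, else follow the ORed conditions."""
    if all(band.contains(x, y) for band in bands):
        return "standard"  # override: forced back to the compliant model
    return "alternative" if alternative_requested else "standard"


# Hypothetical bands, with x as seconds since start and y as distance in km.
BANDS = (
    Band(lower=((0, 0.0), (1200, 9.0)), upper=((0, 0.5), (1200, 12.0))),
    Band(lower=((0, 0.0), (1200, 8.0)), upper=((0, 1.0), (1200, 11.0))),
)

inside_all = select_model(True, 600, 5.0, BANDS)   # inside both bands
inside_one = select_model(True, 600, 6.1, BANDS)   # inside the first band only
outside = select_model(True, 600, 12.0, BANDS)     # outside both bands
```

With `alternative_requested` effectively always true, the vehicle runs the compliant model only while its trace stays inside every band, and a reviewer looking at any one band in isolation sees nothing unusual.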
Essentially the cheat was hidden in plain sight. An alternative curve was expected to be there. It’s unsurprising that the system of tests that selects the curve is complex and that, rather than discrete levels that are easy to think through, the tests are curves that all must be matched. It wouldn’t be surprising if nobody thought through the ANDing of all the required curves.
It seems conceivable that everyone on the team could see this code and yet not realize that it is non-compliant. Clearly we can’t know how many people were aware of the emissions tests optimization but it’s conceivable that the group was small. More detail will come out during the numerous civil and criminal actions that follow but in the interim I get two lessons:
- Any metric that is used by a jurisdictional body or that we use internally to monitor our systems is going to be incomplete. Metrics necessarily abstract away some of the complexity of reality to allow us to use a small number of numbers or curves to understand how a system is performing. Without some testing to ensure your metrics are complete and not being optimized around, there is risk they are missing important details. The application of this learning for jurisdictions wanting to do emission testing is that they need to randomize some component of the emissions test conditions to ensure that the results are close to expected after averaging. One rule we use at Amazon to help ensure our metrics are sufficiently inclusive is to say that no customer should have a bad day without it showing up in at least one metric. This rule forces us to have an incredibly dense mesh of metrics but, with fewer, important exceptions will be missed. The VW violation should have been caught. I’m not trying to relieve VW of responsibility but it certainly is the case that emissions tests are poor representatives of real-world automobile usage and there need to be more checks to ensure the real-world results match the legal intentions.
- It’s important for leadership to set aggressive goals for individual engineers and for teams. This is how great things are achieved and this tension helps deliver great products to customers. But what this shows is that very detailed auditing is needed. Leaders need to set aggressive goals but they need to be in the details asking lots of questions. There need to be strong metrics in place to detect quality, performance, and legal compliance issues early. These tests and metrics may run into the thousands of discrete data points in order to have the fidelity to prevent the tension of high expectations from allowing even a single engineer to take a shortcut. Without a real focus on company values, constant questions and auditing, and a dense web of metrics to detect problems early, these violations will certainly happen.
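One way to picture the randomized cross-check suggested in the first lesson is a metric that compares the fixed-procedure result against the average of results taken under varied conditions. The sketch below is only an illustration under stated assumptions: the function name, the tolerance, and the sample values are all hypothetical and are not measurements from any vehicle or test program.

```python
from statistics import mean


def exceeds_expected(fixed_test_value, varied_values, max_ratio=2.0):
    """Flag a system whose averaged results under varied conditions exceed
    its fixed-procedure result by more than max_ratio (a hypothetical tolerance)."""
    if fixed_test_value <= 0:
        raise ValueError("fixed_test_value must be positive")
    if not varied_values:
        raise ValueError("varied_values must not be empty")
    return mean(varied_values) / fixed_test_value > max_ratio


# Hypothetical readings in normalized units, where 1.0 is the fixed-test result.
consistent = exceeds_expected(1.0, [0.9, 1.1, 1.3])
divergent = exceeds_expected(1.0, [20.0, 35.0, 40.0])
```

A check like this does not say why the two figures diverge, only that the fixed procedure has stopped being representative, which is the cue to start asking detailed questions.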
I’m looking forward to learning far more about what happened in this case but the data already unearthed by Lange and Domke and reported by Edge gives several important lessons for anyone in an engineering or engineering leadership role. A mistake of this nature is enough to cause a great company to fail so it’s worth spending significantly to avoid the risk of these issues happening where we work. If the metrics are weak, even good people will get complacent and gaming will set in.
For more information:
- Volkswagen Emissions Scandal: https://en.wikipedia.org/wiki/Volkswagen_emissions_scandal
- Inside the Volkswagen Emissions Cheating: https://lwn.net/Articles/670488/
- The Exhaust Emissions Scandal (Dieselgate): https://events.ccc.de/congress/2015/Fahrplan/system/event_attachments/attachments/000/002/812/original/32C3_-_Dieselgate_FINAL_slides.pdf
The last set of slides is particularly worth studying.
Whats the big rush I just don’t see it really don’t emission fiasco their chasing their tails go back to green buy American & live healthier ever after.
This is not in pursuit of environmental goodness. Countries pass emission laws and, to sell vehicles in those countries, you must pass the emission laws or cheat. Current emission laws are hard to pass without negatively impacting fuel economy, power, and drivability so cheating is considerably easier and less expensive than engineering a solution. The desire to cheat is unsurprising. What is surprising is the hope of a large company that vast numbers of engineers will know what was done but nobody will leak. Big companies can’t keep secrets and, even if they could, something of this magnitude will be noticed and, when it does get detected, the costs will run to 10s of billions of dollars.
This case continues to deliver shock waves for VW. There is a useful account here, with links:
This will run and run.
OT and as an escape from the mosquitoes: reading of your interest in race cars, you may wish to explore last weekend’s Goodwood Revival via the extensive YouTube clips posted there by Goodwood Road and Racing:
This event is for several classes of older cars and motorcycles, including a sensational performance by a 1933 Rudge motorcycle which won one event in heavy rain against 1953 Norton Dominators no less; and it managed a third place in the dry on Sunday. There is some cracking racing to watch across the different classes from GP cars to humble saloons (sedans).
Great find David. That is exactly the detail I was expecting: http://www.thetruthaboutcars.com/2016/09/indictment-vw-updated-emissions-cheat-2014-hid-epa-carb/#more-1408857
Well worth reading especially for those in engineering leadership positions.
The latest report from Reuters suggests the software was developed by Audi in 1999, but never used.
Not much detail on that one but an interesting development. Good find Tom.
Excellent topic. Thanks for exploring and posting. Very interesting read. Keep the perspectives coming as almost all perspectives have been very interesting articles. We share the same industry so I’m biased in my opinion although you were here a bit earlier :p
Thanks for the write up. I too have been following this case with great interest. In my working life, my rule of thumb for business practice was that whatever you did as a business should not only comply with the prevailing laws and regulations, but also be capable of withstanding the test of public scrutiny. If the business operated world wide, that required paying close attention to local cultures. I do wonder if in this instance the VW engineers thought that compliance with test procedure (however achieved) was enough to get by and that the chances of getting caught out were very slim.
I should also be interested in your views on the difference between internally mandated metrics (such as you have in AWS) and externally mandated metrics, for example as in politically imposed emissions standards. These have been the bane of the automotive industry over the years as the politicians usually have, at best, a scant understanding of the engineering issues at stake. I also believe that the shipping industry is not required to adhere to these emission standards.
Your test is 100% the right one David. My version of it was what I called the Wall Street Journal test. What would we and our customers think if the current decision was written up in a WSJ article? It’s a way of getting past the letter of the law and ensuring we are representing our customers and shareholders properly.
As you allude to in your comment above, the best favor any industry can do for itself is to be sufficiently responsible that government agencies and lobby groups don’t feel a need to externally regulate. External regulation is a blunt instrument. It’s better than irresponsible corporate leadership but a poor substitute for an industry innovating with goals that match their customers and other important constituents.
Inside the EU there is no shortage of single issue pressure groups who seek to impose their views on the rest of us, often by lobbying the European Commission. This can lead to the imposition of rules and regulations that do not reflect the concerns of the general public but of those specific interest groups. Sometimes the EU itself pays or sets up lobby groups to promote ideas it wants to incorporate in regulations. If they get through the system, and many do, there is nothing much that can be done about it either by the industries affected or even by national governments. It is a key reason why many industries band together to keep a watch on what is being cooked up. If you are a small business you can be crushed and put out of business. Often these changes have little or nothing to do with economic efficiency or consumer benefit.
That regulatory environment sounds challenging. It happens to various degrees all over the world. Not all voters can know all there is to know about all topics and, for certain, the politicians they elect can’t either, so lobby groups gain power and the influence they gain is often out of whack with what is best for the voting population as a whole.
When I was young and more naive, I thought that the right answer would eventually win but, as much as I dislike the process, the only solution is to ensure that your interests are also represented and get time with political leaders. You really don’t want them to be exposed to only a single sided view.
I posted this article on Facebook on Sep 20, 2015. I thought it was cool because West Virginia University is my alma mater (technically, it’s an offshoot). Unfortunately, a couple of friends, including a couple in WV, had diesel Volkswagens….
“The cars were first found to produce too much nitrogen oxides, or NOx, by researchers at West Virginia University who were working with the International Council on Clean Transportation, the EPA says. After the WVU analysis found irregular NOx levels in diesel Volkswagens, the EPA and the California Air Resources Board took up their own study.”
Hey Cary. Thanks for the additional background on the early work to detect this one at West Virginia University.
Another point here: the unavoidable importance of human discretion. A similar issue last year was the “designated entities” spectrum caper. Yes, they’d fulfilled the requirements on the form. A machine would have been delighted.
No, the law was never intended to let huge national MNOs set up straw buyers. There is an obvious intention to deceive, and so, Chairman Wheeler used his discretion to tell them to bugger off. Similarly, the VW people came up with a solution that complied, only when it was tested.
This doesn’t look to me like a case of meeting the letter of the law but missing on the intent. California, US, and European regulators are all treating this as illegal activity and Volkswagen has recalled 11 million vehicles to be refit. The cost of this “mistake” to Volkswagen shareholders is estimated to be of the same order as the cost of NASA putting a person on the moon. At this point, this is looking to be one of the most expensive corporate management errors ever.
The code could be a candidate for an obfuscated code contest, like so: http://www.ioccc.org/
A couple trusted actors would be all that it takes to get away with this for years.
It’s hard for senior management to request that the test be worked around without people on the team knowing. On most larger teams, there will be someone who is unhappy with leadership, has an environmental conscience, or for other reasons chooses to be a whistleblower. If it was an individual decision, it’s hard to hide the change from the rest of the team and it will likely get reported to management or reported publicly. This particular example appears to have been hidden by it being normal for the code to have multiple injection tables, and by the code that chooses between them being complex, so it is not particularly obvious that the conditions are carefully boxing the test and only the test.
But, it still seems almost impossible that it wouldn’t be widely known on the engineering team. They will be running test engines and cars through 100s of hours and to not notice that the cars are consuming only 20% of the expected urea during tests seems, well, amazing at least to me.
Another interesting point: chances are that the cheat is actually legal – at least in some countries of the world. Law prescribes low emissions under a clearly defined test cycle. And that’s what the engines do. But does the same law say anything about other driving patterns? If it doesn’t, the cheat isn’t a cheat, but a clever interpretation of the law. Shady, no doubt about that, but legal.
The eventual definition of what is “legal” often comes down to what a judge or 12 of your peers think is “reasonable”. This issue looks more difficult than most to get over the line and I’ve seen some pretty reasonable cases fail.
From a management perspective I’d draw the following lesson: if you really care about some policy being followed, you need to have it audited by independent auditors. There’s a variety of ways people and groups of people can get out of line. It might be unintentional, such as group-think or sloppiness. You need to audit, and you can’t ask people to audit themselves.
If I was a CEO or leader I would find someone to audit myself (and keep the results secret :) ).
I think that CEOs that need to rely on separate management consultants and audits are in the wrong job. Great leaders are able to dig into the details, will force teams to build metrics that track actual progress, and will find ways to cross-check their facts.
This comment of yours implies that VW top management knew, or that they are incompetent. Not disagreeing, just pointing this out.
I saw your fantastic AWS talks. I know what you mean by “digging in”. You did that.
Maybe it’s more difficult to inspect everything important by yourself in a car company because cars are so complicated. At some point you have to stop and rely on others to a significant extent.
Yup, you’re right. I really do think that it’s a management responsibility to dig in, really know the details, ask lots of questions, and to cross-reference and audit everything. I agree the automotive business is complex but so is nuclear power, bridge construction, aircraft building, and numerous other engineering disciplines. In my opinion, it’s a senior management responsibility to drive a culture where the right thing happens, where there is a deep mesh of metrics to catch non-performance or shaved corners and, after that, to ask deep questions with many cross-references to detect problem areas, staffing problems, or poor leadership early.
I see you’re a fan of many metrics (as opposed to a few “KPIs” which is also sometimes advocated). Would be interesting to hear your thoughts on that in a blog post should the occasion arise.
Great content here, thanks!
Also, the regulatory testing made it very easy to cheat because they always used a fixed procedure and never tried a real drive. Makes you wonder if there were people there in on the scam.
I heard that it’s an open industry secret that all car makers cheat such metrics. If true that implies that the regulators are in.
Don’t forget that age old saying: “Never attribute to malice that which is adequately explained by stupidity.”
There is no question that regulators have far too narrow a measure but I suspect they are not part of a secret cover-up. Just poor testing methodology and the difficulty of getting government policy updated.