This is the second of two posts examining the key causal mechanisms behind these incidents, and the simple, basic systemic failures, common to all of the manufacturers, that allow weaknesses to escape into customers’ hands. The problem really has nothing to do with fires, but I’ve never noticed a news headline about a vehicle breaking down under warranty, or about a warning light coming on.
In the first of these posts, we saw that a large proportion of the vehicle fire incidents for which there is publicly reported information have been attributed to causes that have happened over and over again. These are not unique to electric vehicles, or indeed unique to vehicles. The same problems occur in any and every product, from medical devices to spacecraft, from consumer devices to industrial plant. Although errors in design and poor quality manufacturing happen frequently in every organization, most are quickly understood and fixed. There is a single common feature to those that are not detected until it is too late, and for which the problem is compounded because gaining the understanding needed to fix them takes much longer than it should. They may never be understood at all, and the consequences are dire.
The common feature to certain types of failure (such as leaks) is that they occur when a more fundamental or primary function has crossed a threshold. Characterizing how bad the failure is does not help much. Characterizing how well the primary function is being performed will transform these types of failure so that they can be detected in time, then understood, and fixed just like the majority of errors made in design and manufacturing where weaknesses are flushed out whenever they may occur in the product life cycle.
Efforts to prevent failure do work, but only when performance of the function is fully characterized. As long as a failure mode is the result of poor performance of an un-characterized function, failures will never be prevented. Chapter 7 of the book Diagnosing Performance and Reliability explains a methodology for identifying the at-risk functions of a system, and characterization approaches are covered in chapters 2 and 6 in great detail.
In this post I will look at the remaining causal mechanism that has been attributed to these incidents, one that is unique to energy storage using batteries. That mechanism is usually categorized as thermal runaway. I want to outline a common failing in the way manufacturing processes are characterized, the outcome being that many quality problems are never fully understood, and so never fixed. Sometimes, this means that no-one is even aware that any sort of problem exists with a manufacturing process. Again, fixing this is really easy, and the main difficulty is once again getting folks to see what they are doing wrong, and to change their mindsets or paradigms. This time they need to change how they look at manufacturing processes rather than the functionality of the output of those processes. Once again it is about characterization. Doing it is easy, but it seems like the world’s best kept secret. It’s the main topic of chapters 2 and 3 of Diagnosing Performance and Reliability. We will look at manufacturing batteries, but the same principles apply to every product and manufacturing process.
First, we need to quickly review the manufacturing process for a lithium battery. The smallest working unit in a battery is the electro-chemical cell, consisting of a cathode and an anode separated by an electrolyte. The electrolyte conducts ions but is an insulator to electrons. In a charged state, the anode contains a high concentration of lithium while the cathode is depleted of it. During the discharge, a lithium ion leaves the anode and migrates through the electrolyte to the cathode while its associated electron is collected by the current collector to be used to power an electric device. The electrolyte of a solid-state cell is actually a structural component and no additional separators between electrodes are needed to avoid short circuits. Cells can take various forms such as buttons and cylinders. For high-power applications a number of cells are packaged into a module, and a number of modules are packaged into a battery.
To manufacture cylindrical cells, for example, the electrode materials are first mixed into a paste made out of powdered active materials, binders, solvents and various additives. The paste is fed to coating machines where it is spread onto cathode and anode foils. The coated foils are squeezed between rolls (known as calendering) to control thickness and particle size, then they are slit into sections, and the sections into strips. Anode and cathode strips are stacked, then wound into cylinders. Cylinders are assembled into cases and filled with electrolyte. The mixing, coating, cutting, stacking sequence also happens for button and other types of cells. Once a cell has been finish-assembled with various insulators and seals, it can be charged for the first time and tested.
There is still considerable effort being expended on the development of battery technology to address safety, performance and useful life. Much of the cutting edge of battery technology is involved with the ingredients of the pastes. However, what we see is that the manufacturers are hindering themselves with a lack of attention to detail in the processes. This is in spite of employing literally hundreds of engineers for process development. They are expanding capacity at a phenomenal rate, but there are huge gaps in knowledge, and some equipment is not up to snuff.
Mixing, coating, calendering and slitting are well-established technologies, much older than we are, even. We come across them all the time. We see them being done well, and we see them being done where there is considerable room for improvement. This is a tale of what they were doing at one battery manufacturer, but it is where most folks go wrong when trying to improve and develop their processes, whatever they are producing, and however they are producing it.
For all processes, there are dozens of parameters that can be altered. Some of them involve turning knobs (actual knobs, or numbers dialed into a computer). Some of them may involve modifying the equipment or tooling (more expensive), or even the environment that the equipment sits in (very expensive). The simple process described above involves four major steps, which multiplies the number of possibilities. We find engineers designing experiments on processes like these, looking to see what happens when several different combinations of parameters (often called factors) are changed, and that is exactly what they were doing at one battery plant. It’s what they have been taught to do. I feel guilty when I think back 25 years and remember that I also taught people to do that in seminars (although I didn’t need to practice it in the plant). It was a trendy approach back then, but the world really should know better in 2020.
There are two problems with it. The first is that, for the most part, however well-educated and experienced the engineers are, they are playing poke and hope with the process, which is little better than “let’s see what happens when I press this button”. That’s the way it is, I’m afraid, even if you are not yet willing to admit that the emperor isn’t wearing any clothes. Poke and hope can be contrasted with an efficient and effective progressive search, in which about half of the search space is eliminated on every pass, using information generated from a test that is cheap and fast to execute.
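One way to picture a progressive search is as a halving search along the process chain. The following is only a minimal sketch, assuming that exactly one step introduces the problem, that the problem persists through later steps, and that a cheap test can say whether it is already present after a given step. The step list and the `defect_present_after` check are hypothetical, chosen for illustration, and are not the diagnosis used at any actual plant.

```python
from typing import Callable, Sequence, Tuple


def progressive_search(steps: Sequence[str],
                       defect_present_after: Callable[[int], bool]) -> Tuple[str, int]:
    """Locate the first step after which the defect is present.

    Assumes the defect, once introduced, persists through later steps,
    so each cheap test eliminates about half of the remaining candidates.
    Returns the suspect step and the number of tests used.
    """
    lo, hi = 0, len(steps) - 1   # the culprit lies somewhere in steps[lo..hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if defect_present_after(mid):
            hi = mid        # defect already there: culprit is at or before mid
        else:
            lo = mid + 1    # still clean: culprit is after mid
    return steps[lo], tests


# Hypothetical process chain, loosely following the steps described earlier.
STEPS = ["mixing", "coating", "calendering", "slitting",
         "stacking", "winding", "casing", "filling"]

# Illustrative stand-in for a cheap, fast test: the defect is assumed
# to be introduced at "calendering" (index 2).
culprit, n_tests = progressive_search(STEPS, lambda i: i >= 2)
print(culprit, n_tests)
```

With eight candidate steps, three tests are enough to isolate the culprit, whichever step it is. Poke and hope offers no such guarantee.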
The second problem is another basic trap that engineers fall into, and the battery folks were no exception. When considering the results of the various experiments they were conducting, they would evaluate performance, or safety and durability, for an entire battery. (Safety and durability take quite a long time to get answers on, and are themselves often subject to the problem highlighted in part 1 of this post.) But a battery is made up of modules, and modules are made up of individual cells. The cells are assembled out of coated components, the characteristics of which vary hugely depending upon where they were cut from in the calendered sheet. All the variation in any number of characteristics is aggregated and averaged by the time you get to look at an entire battery. Most batteries are as good or as bad as each other. A statistician might blame that on the central limit theorem.
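The averaging effect can be illustrated with a small simulation. This is a sketch under made-up assumptions: the pack size, the 20% share of edge-cut cells, and the numbers for the cell characteristic are all invented for illustration and do not describe any real battery.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

CELLS_PER_BATTERY = 400   # assumed pack size, for illustration only
N_BATTERIES = 200


def simulate_cell() -> float:
    """Hypothetical cell characteristic (arbitrary units).

    Assumes cells cut from the edge of the sheet (20% of them here)
    are systematically worse than cells cut from the center.
    """
    from_edge = random.random() < 0.2
    mean = 90.0 if from_edge else 100.0
    return random.gauss(mean, 2.0)


batteries = [[simulate_cell() for _ in range(CELLS_PER_BATTERY)]
             for _ in range(N_BATTERIES)]

all_cells = [c for battery in batteries for c in battery]
battery_means = [statistics.mean(battery) for battery in batteries]

# Spread seen when looking at individual cells versus whole batteries.
cell_spread = statistics.stdev(all_cells)
battery_spread = statistics.stdev(battery_means)

print(f"spread between cells:     {cell_spread:.2f}")
print(f"spread between batteries: {battery_spread:.2f}")
```

Under these assumptions the spread between batteries comes out far smaller than the spread between cells, even though every battery contains a mix of good and poor cells. Evaluating whole batteries hides exactly the variation that points to the cause.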
What if we knew exactly where a cell originated, and we characterized it in a way that tells us how well the process is performing at precisely that point, both spatially and in time? If we expanded our sample, carefully choosing the points in space and time we were looking at, we would start to understand the behavior and performance of the various process functions. It is possible that there is very little difference cell to cell. However, we found, as we do in most of the processes we look at in this way, that there were huge differences. In fact, that is where all of the action is, but as obvious as it may sound here, very few engineers look at their processes this way. That is why I jokingly refer to it as one of the manufacturing world’s best kept secrets.
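As a rough sketch of what looking at a process this way can mean in practice, the example below tags each measurement with where and when it was produced, then compares how much the group means differ across the sheet and over time. The tags (`lane`, `time slot`), the synthetic data and the built-in lane effect are assumptions made purely for illustration.

```python
import random
import statistics
from collections import defaultdict

random.seed(1)  # fixed seed so the sketch is repeatable

LANES = 8        # assumed number of slit lanes across the sheet
TIME_SLOTS = 12  # assumed sampling points along the run

# Hypothetical records: (lane, time_slot, measured characteristic).
# The synthetic data builds in a lane-to-lane effect and no time effect,
# purely so the sketch has something to find.
records = [
    (lane, t, 100.0 - 1.5 * abs(lane - 3.5) + random.gauss(0.0, 0.5))
    for lane in range(LANES)
    for t in range(TIME_SLOTS)
]


def spread_of_group_means(records, key_index):
    """Range (max - min) of the group means when grouping by one tag."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_index]].append(rec[2])
    means = [statistics.mean(values) for values in groups.values()]
    return max(means) - min(means)


lane_spread = spread_of_group_means(records, key_index=0)  # across the sheet
time_spread = spread_of_group_means(records, key_index=1)  # along the run

print(f"range of lane means: {lane_spread:.2f}")
print(f"range of time means: {time_spread:.2f}")
print("dominant direction:",
      "across the sheet" if lane_spread > time_spread else "over time")
```

None of this is possible unless each cell can be traced back to its position and time of origin, which is why that traceability comes first.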
That spatio-temporal performance information sets up the diagnosis for implementing a progressive search, some of which involves figuring out which process is driving the variation, and exactly which process function is not being performed as intended. I don’t need to go into any details here about how that was done. There are some further insights and connected principles covered in my short book The New Science of Fixing Things, and a complete explanation of how to exploit them, and how to execute the progressive search strategies in the book Diagnosing Performance and Reliability.
We can talk about the battery folks doing what everyone else does. As with most processes, all the action was originally hidden from them because it lived in the spatio-temporal framework of a single process cycle. However, it wouldn’t be ethical of us to publish the causal explanation for their actual process. The point is, it doesn’t matter. They know how to fix it, and that knowledge belongs to them. We are not claiming to understand what is wrong with every battery manufacturing plant or process. That’s symptomatic knowledge, and we will always have less symptomatic knowledge than the engineers we work with. There will be other unknown problems, in other parts of the process, when we look at a different plant in a different company. But the topographic approach outlined here will find them, and get to a causal explanation in every case. It takes no prisoners!
#electricvehicles #ProblemSolving #qualitymanagement #quality #qualityimprovement #ProblemSolvingSkills #ElectroMobility #BatteryCharging #rootcauseanalysis #troubleshooting #continuousimprovement #reliability