09 jan 2022
this page describes the operation and implementation of the closed-loop cooling system on my roadster, in daily use since early 2017. there's nothing rambler-specific in here, this scheme should be perfectly fine on any internal combustion engine.
the code described below is available on my github page; see here. the algorithms described below are in cooling.h on that page. an overall description of my coding strategy, fastCode, will appear eventually. finally, the electronics for all of the Roadster boxes are the same.
everything is datalogged once per second; from these logs i produce occasional plots for analysis. Here's a plot of the start of the July 2021 SoCal TT, from cafe parking lot, up the 2 and 210 freeways, exit Osborne Street, up Little Tejunga and stop. There's a lot of information in this plot. And here is the data log spreadsheet it's derived from.
Click the plot for a large, readable version.
for comparison purposes, BMW currently (2018) uses a system with a single
electric pump and a thermostat that diverts fluid between two outlets depending
on temperature. the pump is PWM controlled for speed and in more recent years
the thermostat itself has a variable set point. this system dates back to the
2002 models (not year). if i had known about this system before i started on
this one i might have adapted it. it has constant intra-engine circulation and
proportional control. the operation of the system with the variable-setting
thermostat can be seen in a 2:40 video that is just great, showing components
and system very succinctly. it can be found on this autologic.com
page. (the variable-setpoint thermostat is however very complex to operate;
there is a 2-D map for input power since its set point depends on power-input
plus ambient coolant temperature).
this system completely replaces the "traditional" auto cooling system components: belt-driven pump, thermostat, plumbing, etc. cooling capacity is independent of engine speed, power consumption is typically under 50 watts. the system is physically simple, all complexity is in the control software.
temperature regulation, measured at a temperature sensor in the cylinder head water jacket, is within 3 degrees of the setpoint.
temperature is measured at a 10 Hz rate, with a 500 ms filter.
the system is composed of:
i have no idea how much power (in horsepower or watts) was consumed by the
belt-driven coolant pump my engine once had; but typical power consumption of
this system is around 50 watts (under 0.07 HP) and the peak,
including radiator fan, is 210 watts (0.28 HP). most important, cooling
capacity is independent of engine speed. the engine is 200 ci/3.2L,
200 HP maximum, and given its vintage (1950's), not particularly thermally
efficient (a kind way to state it).
analogous to what it took for electronic fuel injection to replace carburetors, the solution for replacing the barbaric leaky-pump/thermostat system is more complex than you'd think it ought to be. a thermostat system works surprisingly well and is about as simple as it can be.
conceptually this system consists of two masses, two independently operating pumps, and two sensors (and occasional hints from the rest of the chassis).
[illustration: two pump heat]
one mass is a heat source. the rate at which it generates heat varies widely and unpredictably (your foot on the throttle; the load on the engine). since the mass is large, thermal change is slow and delayed in time. this mass is of course the engine, block and head and coolant (but see footnote). a temperature sensor is in the head and called "head temp". holding head temp to the set point temperature ("thermostat temperature") is, more or less, the entire goal of the system.
the other mass is a heat sink; the radiator full of coolant (and hoses). radiator output is unpredictable, and to the software, uncontrollable. the amount of heat removed (to the atmosphere) depends on vehicle speed, load, weather, and the velocity of coolant moving through it. to make things worse, the coolant whose heat content is measured (at the radiator outlet) arrives at the engine's water jacket after a delay, which depends on pump speed.
the main coolant pump pulls fluid out of the bottom of the radiator and pushes it into the water jacket. coolant flows up through the block into the head and out the top, to the radiator. (in a traditional system the thermostat is located right at the "top" before the radiator.) coolant flow rate is proportional to electrical power put into the pump. the pump is moving fluid within a system with no artificial restrictions and so the pump can be surprisingly small and efficient.
there is an additional pump, the circulator, which does many critical tasks at once.
the circulator pump ("circ pump") pulls hot coolant from the outlet at the top of the cylinder head and pushes it into the bottom of the block. this circulates coolant within the engine only, not through the radiator. recall that the speed of each pump is independent of the other. the circ pump is operating at all times, and is independent of cooling requirements.
in a running engine heat of combustion warms the coolant in the head and its temperature rises. first, circulation, independent of cooling, ensures that there are no hot-spots in the head. second, the head and block are now thermally coupled; cylinders and head remain at the same temperature (see footnote). third, constantly-moving coolant fluid means that the head temp sensor accurately reflects actual coolant temperature, independent of engine load and coolant flow rate. fourth, constant flow tends to push air bubbles up into the radiator tank where they are purged out the overflow during normal heat/cool cycles. the lack of air bubbles itself minimizes hot spots.
fifth, and of major importance, is that the circulator mixes the "cold" coolant that the main pump pulls out of the radiator and pushes into the block with the hot coolant already circulating in the engine. the circ pump's speed is inversely proportional to the main pump's speed; when little cooling is required (idle, low speeds) the circ pump flow rate is high, mixing in small amounts of "cold" (OK, cooler) (see footnote) coolant, producing very fine control over heat removal. mixing is a major consequence of the circ pump.
radiator outlet temperature varies wildly in any cooling system. the system has little control over the outlet temperature, and the control it does have (radiator cooling fan) can be delayed a minute or more. radiator outlet temperature is more dependent on the weather than it is on fan motor speed.
in a traditional thermostat system, only when the thermostat begins to close does the system "notice" the change, and by that time the block and head have been filled with the cooler coolant; the thermostat closes, the temperature rises again, opens, etc. this leads to a damped temperature oscillation.
mixing dilutes the incoming colder coolant into that within block and head. temperature change is "seen" in the head sensor more rapidly and the coolant pump speed can be adjusted so that combined coolant remains at the system set point temperature.
the end result of a closed-loop system is that rarely are there large temperature or heat-content differences; from idle to full load heat production the pumps' speed increases slowly and in response to small temperature changes. a major side effect is that the pump required to cool the engine during even the largest heat production is surprisingly small.
FOOTNOTE: i intentionally use casual (aka sloppy) language here for the sake of brevity.
the engine mass doesn't pass through pumps etc, coolant does, and itself is coupled
to the iron mass. there are lags within lags. there is a lot more trivia and math
involved than this discussion acknowledges; refer to the code for what it is i
really think i'm doing.
[illustration: closed loop all]
the illustration above includes all of the major components, but leaves out the radiator cooling fan for simplicity. it has a similar but simpler algorithm driving it. in addition to the metal and plastic above, the control system itself consists of custom electronics, an off the shelf controller (Arduino Mega 2560) and code.
my code is structured in a coding strategy i call fastCode. fastCode is a method for designing cooperative, single-threaded multitasking based on timers and inter-process communication (with ideas borrowed from John Day's PATTERNS IN NETWORK ARCHITECTURE and R W Watson's delta-T transport protocol).
the cooling code consists of two major task loops, and a number of smaller, simpler support tasks. process() is the "inner loop" that handles closed-loop temperature regulation (the point of the discussion here). the other major task loop is stateChange() that handles startup, engine running/cooldown, shutdown, etc.
stateChange() can be thought of as enclosing process(). stateChange() decides when or if process() is run. most of this discussion is about the core algorithms within process().
support task loops read, smooth and filter the two temperature sensors, control pump and fan motor slew rates, generate reports, handle communication, etc. these won't be discussed here; see the code.
the point of all this is to hold the cylinder head temperature as close as
possible to the set point temperature (currently 170F on my engine). in
practice the instantaneous temperature is + or - 3 degrees, and rarely exceeds
+ or - 5 degrees (and i'm working on those corner cases). note that these are
instantaneous temps, meaning measured at the task loop rate of 100
in fact there are two closed loops within process; one for the main pump ("pump") and one for the radiator cooling fan ("fan"). circulator pump control is a side effect of the pump process.
pump and fan began life as PID controllers (proportional integral derivative). PID is a standard control process that uses feedback to stabilize changing systems. however, classical PID makes two assumptions about the error correction term that make it unsuitable for engine thermal control: that process changes effected by the error term are not delayed (much) and/or have fixed delay, and that a given error-correction value has a constant effect. in this process the error term E, the number of degrees deviation from the set point, applied as feedback has an effect dependent on a number of factors, and worse, that effect is significantly and variably delayed in time. for these reasons the P (proportional) term that usually dominates PID feedback correction is of no use here. (in many systems, such as an electric furnace control, one "unit" of error term E always maps to some physical unit, volts, amperes, etc which will have predictable effect.)
a major distortion is that correction for temperature error, eg. too hot or too cold, is asymmetrical. we can only remove heat from the engine; we cannot (and would not want to) add heat. to "add heat", we wait. as long as the engine is running temperature will rise. the same is true of the radiator, with the difference that the cooling fan actually has some effect, though greatly delayed.
these asymmetries and nonlinearities make the closed loop complex.
the pump process is a fairly straightforward closed-loop ID controller. error correction is positive domain only ("too hot"); negative values ("too cold") are clipped to zero. P is useless due to the arbitrary phase error. D "seems to" stabilize short term correction, its traditional function. it may be voodoo at this point; the D gain is small and i haven't looked at it in a year.
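a minimal sketch of an ID-only controller with positive-domain clipping, as described above. this is not the code in cooling.h: the struct name, the gains, and the choice of a leaky integrator (so that "integrating zeros" runs the value back down) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>

// hypothetical sketch; names, gains and the leaky integrator are
// assumptions, not what cooling.h actually does.
struct PumpID {
    float ki;       // gain on the integrator output (assumed)
    float kd;       // small derivative gain (assumed)
    float tau;      // integrator time constant, seconds (assumed)
    float integ;    // integrator state
    float prevErr;  // last clipped error, used by the D term

    PumpID(float ki_, float kd_, float tau_)
        : ki(ki_), kd(kd_), tau(tau_), integ(0.0f), prevErr(0.0f) {}

    // headTemp and setPoint in degrees F, dt in seconds.
    // returns a pump demand that is never negative.
    float update(float headTemp, float setPoint, float dt) {
        assert(dt > 0.0f && tau > 0.0f);
        // positive domain only: "too cold" is clipped to zero
        float err = std::max(0.0f, headTemp - setPoint);
        // leaky integrator: feeding it zeros runs it back down
        integ += (err - integ) * (dt / tau);
        float deriv = (err - prevErr) / dt;
        prevErr = err;
        // note there is no P term at all
        return std::max(0.0f, ki * integ + kd * deriv);
    }
};
```

called once per task loop pass with the filtered head temp, the output climbs while the head is above the set point and decays once it is at or below it.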
if radiator outlet temperature were constant, then the ID output would directly control pump speed. however it is not, and requires enthalpy correction. hot radiator coolant removes less heat than cold radiator coolant, and so more of it (increased coolant pump speed) is required. all the nasty bits (volume dependent) fall out or can be ignored with the closed loop control. heat content is then linearly proportional to temperature under these conditions and so a simple multiplier on the pump speed suffices.
the main coolant pump speed is then the pump ID output, scaled
proportionately by enthalpy compensation. numerically, this is 0 (pump stopped)
to 255 (pump maximum speed). recently added to this term is deceleration
overcool, see below.
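one way the enthalpy multiplier and the 0..255 scaling could look, as a sketch only. the reference delta of 20 degrees, the floor on the delta, and both function names are assumptions; the actual compensation formula is in the code, not here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// assumed reference: the multiplier is 1.0 when radiator outlet coolant
// is this many degrees F below the set point.
const float NOMINAL_DELTA_F = 20.0f;
// assumed floor, so a hot radiator cannot cause a divide by (nearly) zero.
const float MIN_DELTA_F = 2.0f;

// warmer radiator coolant carries away less heat per unit volume,
// so the multiplier grows as the outlet temp approaches the set point.
float enthalpyGain(float setPoint, float radOutTemp) {
    float delta = std::max(MIN_DELTA_F, setPoint - radOutTemp);
    return NOMINAL_DELTA_F / delta;
}

// ID output scaled by the multiplier, clamped to 0 (stopped) .. 255 (max).
std::uint8_t pumpSpeed(float idOut, float setPoint, float radOutTemp) {
    float s = idOut * enthalpyGain(setPoint, radOutTemp);
    s = std::min(255.0f, std::max(0.0f, s));
    return static_cast<std::uint8_t>(s + 0.5f);  // round to nearest
}
```

with these assumed numbers, an ID output of 64 gives a pump speed of 64 when the radiator outlet is 20 degrees under the set point, and 128 when it is only 10 degrees under.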
the pump process also controls the circulator pump. the circ pump runs
constantly at fixed speed, except by ad hoc reasoning, when (if, i've never
seen it happen) the main coolant pump approaches 80% rate, the circ pump speed
slows proportionally, to 0 at full coolant pump speed.
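the circ pump rule in the paragraph above can be sketched as a small function. the function name and the integer knee value (204, which is 80% of 255) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

const int PUMP_MAX = 255;
const int KNEE = 204;  // 80% of 255 (assumed integer form of "80% rate")

// fixed circ speed until the main pump passes the knee, then a linear
// taper that reaches 0 at full main pump speed.
std::uint8_t circSpeed(std::uint8_t mainSpeed, std::uint8_t circNominal) {
    if (mainSpeed <= KNEE) return circNominal;
    int tapered = circNominal * (PUMP_MAX - mainSpeed) / (PUMP_MAX - KNEE);
    return static_cast<std::uint8_t>(tapered);
}
```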
a fairly new and experimental sub-process has been added to deal with a somewhat common corner case that i (incorrectly) call deceleration overcool. it might be better named de-load overcool. the condition occurs when a long duration moderate to high engine load (high heat production), with the system comfortably in closed loop control, is suddenly removed (eg. pulled off the freeway into slow surface street traffic). unlike heat production, where temperature rate-of-increase of the mass is fixed, over-cooling can happen much faster than the ID process runs down (integrating all those 0's); for a brief period the system over cools while the integrator runs down.
decel overcool operates in the negative-error domain ("too cold"). in most closed-loop operation head temp hovers around the set point, plus and minus, and decel overcool's integrator hovers around 0. when the process is caught up in overcool due to the conditions stated above, decel's integrator "pumps up" to a significant value, and this is subtracted from the ID output that determines pump speed.
what makes this work is the fact that the decel overcool time constant is much shorter, currently 20% of the pump process time. this means its output both rises and falls more rapidly, making it a short-term correction.
decel overcool also adds to stability; since the two (pump ID, decel overcool
integrator) operate in opposite domains, each damps the other.
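a sketch of the decel overcool idea: a second integrator working only on "too cold" error, with a time constant 20% of the pump process's, whose output is subtracted from the pump ID output. the names, the leaky-integrator form and the gain parameter are assumptions, not the real code.

```cpp
#include <algorithm>
#include <cassert>

// hypothetical sketch of the negative-domain integrator.
struct DecelOvercool {
    float tau;    // time constant, seconds
    float integ;  // integrator state, degrees F of accumulated "too cold"

    // time constant is 20% of the pump process time constant
    explicit DecelOvercool(float pumpTau)
        : tau(0.2f * pumpTau), integ(0.0f) {}

    float update(float headTemp, float setPoint, float dt) {
        assert(dt > 0.0f && tau > 0.0f);
        // negative domain only: "too hot" is clipped to zero
        float err = std::max(0.0f, setPoint - headTemp);
        integ += (err - integ) * (dt / tau);
        return integ;
    }
};

// decel output is subtracted from the ID output; demand never goes negative.
float pumpDemand(float idOut, float decelOut, float decelGain) {
    return std::max(0.0f, idOut - decelGain * decelOut);
}
```

because the time constant is short, the correction both appears and disappears quickly, which is what makes it a short-term trim rather than a second regulator.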
the fan process is also ID, has a much longer time constant, some 1200 seconds, and relaxed operating conditions, and overshoot ("too cold") is acceptable. the ideal goal of the fan process, when conditions are right, is to hold rad outlet temp to something like 20 degrees below set point. that difference is optimal in that it causes the main pump to run about 25%, nicely within its optimum servo range, and low power and (electronic) heat production. too-cold coolant actually worsens regulation, since such a small amount of it needs to be introduced into the engine (via the circ/mixer) to maintain closed loop.
the fan motor speed is slewed to its calculated target speed at a fairly
slow rate. the fan motor is the largest electrical load, another reason to
limit the outlet temperature differential to a practical maximum -- it is
quieter. the cooling fan is turned off when vehicle speed reaches 45 MPH.
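the two fixed numbers in the fan description (20 degrees below set point, off at 45 MPH) can be sketched like this. the ID stage between the two functions is left out, and the function names are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

const float FAN_TARGET_DELTA_F = 20.0f;  // rad outlet target, below set point
const float FAN_OFF_MPH = 45.0f;         // ram air does the work above this

// error fed to the fan's ID stage; "too cold" is acceptable, so clip at 0.
float fanError(float radOutTemp, float setPoint) {
    float target = setPoint - FAN_TARGET_DELTA_F;
    return std::max(0.0f, radOutTemp - target);
}

// fan goal speed from the (not shown) ID output: 0..255, forced to 0 at speed.
std::uint8_t fanGoal(float idOut, float vehicleMph) {
    if (vehicleMph >= FAN_OFF_MPH) return 0;
    float s = std::min(255.0f, std::max(0.0f, idOut));
    return static_cast<std::uint8_t>(s + 0.5f);
}
```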
there are two temperature sensors, one in the head and one in the radiator outlet circuit. these are typical NTC resistor type sensors (Stewart Warner, 33..240 ohm 100..280F) in a precision (1%) circuit. analog readings are despeckled (shot noise filter; hey this is an automobile) and averaged with a relatively fast low-pass filter (100 to 500 ms). this is an order of magnitude faster, at least, than typical temperature gauges.
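a sketch of a despeckle-then-smooth filter of the kind described above. the median-of-three despeckle and the single-pole low-pass are assumed stand-ins; the actual filters are in the code. with dt = 0.1 s (10 Hz) and tau = 0.5 s the smoothing factor works out to 1/6.

```cpp
#include <algorithm>
#include <cassert>

// median of three samples: a single wild reading never gets through.
float median3(float a, float b, float c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// hypothetical sensor filter: despeckle, then single-pole low-pass.
struct TempFilter {
    float alpha;   // smoothing factor, dt / (tau + dt)
    float h[3];    // last three raw samples
    float out;     // filtered temperature
    bool primed;   // false until the first sample arrives

    TempFilter(float dt, float tau)
        : alpha(dt / (tau + dt)), out(0.0f), primed(false) {
        assert(dt > 0.0f && tau >= 0.0f);
        h[0] = h[1] = h[2] = 0.0f;
    }

    float update(float raw) {
        if (!primed) {  // start from the first reading, not from zero
            h[0] = h[1] = h[2] = out = raw;
            primed = true;
            return out;
        }
        h[0] = h[1]; h[1] = h[2]; h[2] = raw;
        out += alpha * (median3(h[0], h[1], h[2]) - out);
        return out;
    }
};
```

a lone spike (say one reading of 900 among 170s) is rejected outright by the median; a genuine step change is followed with the chosen time constant.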
i spent considerable time in the last decade messing with cylinder head thermal issues, including head dissection to determine optimum sensor placement. this was before i even considered this system, with its circulator. with the circ pump, sensor placement is largely unimportant as long as circ'ed coolant passes by it. (a second circulator pump, independently moving water from radiator inlet to outlet, would both remove heat and make it measurable; i deemed it too much work for too little return and a reduction in reliability.)
radiator outlet temperature is also not critical, for more or less the
opposite reason: there is simply no good place to measure it. mine's simply
stuck in the flow, before the main pump, about 6" past the lower radiator tank.
the circ pump mixing effect partially moots the, again, phase error of
measurement vs. arrival at the head sensor. we want coolant temp regulated
everywhere, not just at the point of the sensor, and circulation helps approach that.
there is also an engine-oil cooling loop that runs a fan attached to the
remote oil cooler, via simple proportional control. starting at its own set
point (205F i think) the fan runs at a minimum speed; fan speed increases to
maximum by 220F. this is adequate because of the very long time constant
involved. temperature is nearly always hovering around the lower set point,
with higher temperature excursions during high-load (mountains).
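the oil cooler fan's proportional control can be sketched as a straight line between the two temperatures given above. the minimum fan speed value (64) and the function name are assumptions.

```cpp
#include <cassert>
#include <cstdint>

const float OIL_FAN_ON_F = 205.0f;   // fan starts at minimum speed
const float OIL_FAN_MAX_F = 220.0f;  // fan at maximum speed by here
const int OIL_FAN_MIN_SPEED = 64;    // assumed minimum, out of 255
const int OIL_FAN_MAX_SPEED = 255;

// simple proportional control: off below the set point, then a straight
// line from minimum speed at 205F to maximum speed at 220F.
std::uint8_t oilFanSpeed(float oilTempF) {
    if (oilTempF < OIL_FAN_ON_F) return 0;
    if (oilTempF >= OIL_FAN_MAX_F) return OIL_FAN_MAX_SPEED;
    float frac = (oilTempF - OIL_FAN_ON_F) / (OIL_FAN_MAX_F - OIL_FAN_ON_F);
    float s = OIL_FAN_MIN_SPEED + frac * (OIL_FAN_MAX_SPEED - OIL_FAN_MIN_SPEED);
    return static_cast<std::uint8_t>(s);
}
```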
aka motors and pumps. the processes above calculate pump and fan goal
speeds, but do not directly control the motors. the motorSlew() loop causes the motor
speed(s) to change to the goal speed at a low, slow slew rate. everything
here has mass, rapid change is not needed, and a slow slew rate means that there
are no sudden electrical surges or load dumps. power consumption is much lowered.
electronic components remain cool and unstressed, and it is overall quieter.
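a sketch of slew limiting: each pass moves the motor speed toward its goal by at most a fixed step. this is not the actual motorSlew() loop; the struct, the function name and the step size are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// hypothetical motor record: current PWM value and where it should end up.
struct Motor {
    std::uint8_t speed;  // what the motor is being driven at now
    std::uint8_t goal;   // what the control process asked for
};

// move speed toward goal by at most maxStep per call; never overshoots.
void slewToward(Motor& m, std::uint8_t maxStep) {
    int diff = static_cast<int>(m.goal) - static_cast<int>(m.speed);
    int limit = static_cast<int>(maxStep);
    int step = std::max(-limit, std::min(limit, diff));
    m.speed = static_cast<std::uint8_t>(m.speed + step);
}
```

called from a timed task, the slew rate is simply maxStep divided by the task interval; a goal that jumps from 0 to 255 turns into a ramp instead of a surge.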
managing the overall operating state is as important as regulating operating temperature. closed-loop operation occurs only during a very narrow range of operating conditions. the initial state of the engine, thermally, is unknown when the car is turned on. it could be stone cold from overnight parking, or at peak heat from a brief stop after a long hard run.
"wrapped around" the engine cooling algorithms described above is the state control machine. loosely speaking, the car is turned on, probably started and run, is driven/warms up, operates for some time and then the engine is turned off. at a minimum the car is turned on, then turned off. stateChange() must follow actual conditions at all times.
state changes are easy to describe in the most common operating sequence
outlined above; each state is described below in order of typical execution.
though none of this sequencing is rocket science, many of the details are
critical to operation.
the following are true for all operation below, and are mentioned here, once, for clarity.
if power is removed (key off) at any time, in any state, control always returns to the STOP state.
it follows then that for all states after IDLE, the controller and software assume the engine is running and producing heat. it is entirely possible for this to not be true; you can stall the engine or run out of gas. but as long as ignition power remains on, the cooling system continues to operate. there is no serious downside to this. since the cooling system is independent, it will simply run until the temperature of the cylinder head is brought (down) to the set point, at which time the cooling system would become quiescent anyway. the circulator will remain running however as long as the cooling system is operating (it draws about 2 amps).
any change to power off (ignition off, car off, etc) begins here, including initial power on of the cooling computer. motor goals are set to zero, the various closed-loop parameters are reset and errors are cleared. this is a transient state, control is immediately passed to IDLE.
idle state is simple: nothing happens. the cooling system is off, quiescent. if power ("ignition") is turned on control passes to the next state, DEFER.
defer imposes an initial delay between key-on/power on and cooling operation. this addresses real-world operation... the engine may be cranked and started, immediately or after a delay; the key may turn off, or may simply be left on without running the engine. defer delays later cooling system operation until it is better known what is likely to happen. defer minimizes electrical loads during the high-load-condition engine cranking/start sequence. there's no instant need for cooling in any case.
defer switches to the next, EXCH, state if either 10 seconds passes or if it sees the engine is running (rpm > minimum). [note: in the Roadster engine RPM arrives from the chassis computer via IPC.]
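the DEFER exit test is small enough to sketch in one function. the function name and parameter types are assumptions; only the 10 second figure and the "rpm > minimum" test come from the text.

```cpp
#include <cassert>

const unsigned long DEFER_MS = 10000UL;  // 10 seconds

// leave DEFER when the delay has passed or the engine is seen running.
// minRpm is whatever "running" threshold the chassis computer implies.
bool deferDone(unsigned long elapsedMs, unsigned int rpm, unsigned int minRpm) {
    return elapsedMs >= DEFER_MS || rpm > minRpm;
}
```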
EXCH, or exchange, is a critical state that determines the thermal content of the engine and cooling system. when power is turned on, the engine could have been run hot and turned off, or cold overnight, or in between. after intermediate time periods, the radiator may be cool from a breeze, with the engine hot; thermal siphoning means the head is hotter than the block...
exchange is simple: both pumps are run at a moderate fixed rate for a fixed period. this distributes coolant and heat amongst all components, allowing for a rational measurement of heat content.
an additional heuristic is applied: while the head temperature is above closed loop operating range (eg. engine hot) the radiator cooling fan is turned on. this need not be accurate.
exch terminates after a fixed interval and control passes to EVALUATE.
EVAL is a transient state; a decision is made on how to enter closed-loop operation here. the outcome of EVAL is to select one of two possible next states: WUP (warmup) or RUN (closed loop).
though the pump ID controller will 'close the loop' with any initial starting conditions, we want to enter closed loop smoothly without any spikes or oscillations in regulated temperature. if the ID controller integrator equals zero for an already-warm engine, then engine temperature is likely to overshoot (cooling system undershoot) until the integrator "pumps up" to accommodate the initially-unknown temperature error.
the purpose of EVAL, for the case where the engine already contains heat, is to pre-set the ID controller integrator with a value that is "close enough" to lock into closed loop operation without excessive over or undershoot.
EVAL is a simple decision: if the cylinder head temperature after EXCH is within the closed-loop operating range, then the "large" preset is made and control passes to RUN state. if the engine is deemed "cold", and needing warmup, control passes to the WUP state, next. (in other words, if the engine is already warmed up, skip the warmup state).
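a sketch of the EVAL decision. the type names, the single lower-limit test and the idea of passing the "large" preset in as a parameter are assumptions; the real thresholds and preset values are in the code.

```cpp
#include <cassert>

enum class State { Wup, Run };

// what EVAL hands back: the next state, and what to load into the
// pump ID integrator on the way there.
struct EvalResult {
    State next;
    float integPreset;
};

// closedLoopMinF is the lower limit of the closed-loop operating range.
// an already-warm engine goes straight to RUN with the "large" preset;
// a cold one goes to warmup with the integrator left at zero.
EvalResult evaluate(float headTempF, float closedLoopMinF, float largePreset) {
    if (headTempF >= closedLoopMinF) {
        return {State::Run, largePreset};
    }
    return {State::Wup, 0.0f};
}
```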
warm-up has two internal thresholds. it is assumed that the engine is running, producing heat, and that temperature will eventually rise. when the head temp reaches a minimum threshold, the circulator pump is turned on nominally, immediately improving temperature measurement accuracy (until this point conduction/convection is "good enough" to detect "warm enough"). the second temperature threshold, the lower limit of closed loop operation, then presets the pump ID controller to the "small" preset value, and passes control to RUN.
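the two warm-up thresholds can be sketched as a tiny object polled once per pass. the names and the example temperatures in the usage note are assumptions.

```cpp
#include <cassert>

// hypothetical warm-up tracker with the two thresholds described above.
struct Warmup {
    float circOnTempF;   // first threshold: start the circulator
    float loopMinTempF;  // second threshold: lower limit of closed loop
    bool circOn;         // latched once the first threshold is crossed

    Warmup(float circOnF, float loopMinF)
        : circOnTempF(circOnF), loopMinTempF(loopMinF), circOn(false) {
        assert(circOnF <= loopMinF);
    }

    // returns true when it is time to apply the "small" preset and
    // pass control to RUN.
    bool step(float headTempF) {
        if (headTempF >= circOnTempF) circOn = true;
        return headTempF >= loopMinTempF;
    }
};
```

for example, with assumed thresholds of 120F and 160F, a head temp of 130F starts the circulator but stays in warm-up; 161F ends warm-up.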
the cooling system run state controls engine operating temperature until the ignition is turned off. the code involved in stateChange() does nothing, in fact, except watch for power-off. the process() code itself watches operating state and when it is RUN, performs its closed-loop calculations.
when power turns off control passes to DONE.
this is a transient state. conditions are set to evaluate post-shut-off cooldown requirements; though in closed loop, the engine has a variable amount of heat that may need to be removed. the two pumps are set to nominal speed, and if the radiator fan was on, it is set to a nominal speed (it was removing heat at power-off, assume continue). control passes to DISTRIB.
analogous to the earlier EXCH state, this state simply distributes coolant between head, radiator and block to determine heat remaining. this state persists for a fixed time, but if temperature drops below a lower threshold the state is terminated. the latter is fairly common, and occurs on short runs where the engine just reaches operating temperature but the contents of the radiator are cool/cold, such as in cold weather. often distrib negates any need for a cooldown cycle. when complete (due to time or temperature) control passes to COOLDOWN.
the cooldown cycle extracts engine (underhood) heat and prevents heat soak of underhood components. plenty of heat is left if the engine is subsequently re-started. DISTRIB and cooldown prevent head hot spots from forming (from not-yet-extracted combustion heat still in the metal after shutdown) and eliminate vapor lock. there are plenty of hot exhaust side components dumping heat under the hood.
cooldown persists until a predetermined engine-cool temperature is reached, a preset time passes, or battery voltage drops below a threshold.
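the three cooldown exit conditions can be sketched as one predicate. the struct, the function name and the limit values used in any example are assumptions; only the three conditions themselves come from the text.

```cpp
#include <cassert>

// hypothetical limits; the real values are in the code.
struct CooldownLimits {
    float coolTempF;           // engine-cool temperature
    unsigned long maxSeconds;  // preset time limit
    float minBatteryV;         // stop before the battery is run down
};

// cooldown ends on whichever comes first: cool enough, out of time,
// or battery voltage below the threshold.
bool cooldownDone(const CooldownLimits& lim, float headTempF,
                  unsigned long elapsedS, float batteryV) {
    return headTempF <= lim.coolTempF
        || elapsedS >= lim.maxSeconds
        || batteryV < lim.minBatteryV;
}
```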
upon termination control passes back to the STOP state.
there exists a manual control state that runs pumps and fan at various fixed speeds, mainly for test purposes. this state is reached via IPC messages from instrument panel controllers.