
750× Faster: Making Detailed Chemical Kinetics Usable in Engine Simulation

By the time my PhD work on energetic materials was finished, one thing was clear: building detailed, validated reaction mechanisms is possible and scientifically valuable. The less comfortable truth was equally clear: you cannot run those mechanisms in an engine simulation.

A detailed mechanism for a practical fuel (gasoline, diesel, a natural gas blend) can contain hundreds to thousands of species and tens of thousands of reactions. Solving the coupled, stiff differential equations for all of these in every computational cell at every timestep of a 3D CFD engine model is not feasible. For a mechanism with 679 species, direct integration is roughly 750 times slower than the tabulated approach described below. This is not a gap that hardware improvements alone will close on a useful timescale.

The question, then: can we use the accuracy of detailed chemistry without paying the full cost of detailed chemistry at runtime?

The core idea: separate the expensive work from the simulation

The answer is yes, through a class of methods collectively called tabulated chemistry — or, in the machine learning variant, machine learning tabulation (MLT). The central insight is that the expensive chemistry computation does not need to happen during the simulation. It can happen before it.

Consider the thermochemical state of a combustion system: temperature, pressure, fuel-air equivalence ratio, and the concentrations of all species. The evolution of that state — how it changes as reactions proceed — is determined entirely by the rate equations of the mechanism. If we pre-compute that evolution across the space of conditions we expect the engine to encounter, we can store the results and look them up at runtime rather than recomputing them on the fly.

This is the tabulation concept. The pre-computation is expensive, but it happens once, offline. The online cost, incurred during the actual engine simulation, is reduced to a table lookup and interpolation, which is cheap regardless of mechanism size.
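
The offline/online split can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the work: `expensive_source_term` is a hypothetical toy function standing in for a detailed-chemistry evaluation, and the two table axes (a progress-like variable and an initial temperature) and their resolutions are assumptions chosen for brevity.

```python
import numpy as np

def expensive_source_term(c, T0):
    """Hypothetical stand-in for a costly detailed-chemistry evaluation."""
    return 1.0e3 * c * (1.0 - c) * np.exp(-2000.0 / T0)

# Offline: evaluate once on a grid and store the results.
c_grid = np.linspace(0.0, 1.0, 101)       # progress-like coordinate (assumed axis)
T_grid = np.linspace(700.0, 1100.0, 41)   # initial temperature in K (assumed axis)
table = expensive_source_term(c_grid[:, None], T_grid[None, :])

def lookup(c, T0):
    """Online: bilinear interpolation in the stored table."""
    # Index of the grid cell containing the query point, clipped to the table.
    i = int(np.clip(np.searchsorted(c_grid, c) - 1, 0, len(c_grid) - 2))
    j = int(np.clip(np.searchsorted(T_grid, T0) - 1, 0, len(T_grid) - 2))
    # Fractional position of the query inside that cell.
    wc = (c - c_grid[i]) / (c_grid[i + 1] - c_grid[i])
    wT = (T0 - T_grid[j]) / (T_grid[j + 1] - T_grid[j])
    return ((1 - wc) * (1 - wT) * table[i, j]
            + wc * (1 - wT) * table[i + 1, j]
            + (1 - wc) * wT * table[i, j + 1]
            + wc * wT * table[i + 1, j + 1])

exact = expensive_source_term(0.333, 905.0)
approx = lookup(0.333, 905.0)
print(f"exact={exact:.4f}  table={approx:.4f}")
```

The cost of `lookup` depends only on the number of table dimensions, not on how expensive the function behind the table was to evaluate, which is the property the tabulation concept relies on.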

Describing the state with a progress variable

The practical challenge is dimensional: even a modest set of operating conditions involves many degrees of freedom, and a table that covers all of them would be enormous. The key to making tabulation work is finding a low-dimensional description of the state that captures the essential physics.

The approach developed in this work uses a progress variable $c$, a scalar quantity that tracks the extent of reaction from fresh mixture to fully burned products. The thermochemical state of the mixture at any point in the combustion process is parameterized primarily by $c$ and its rate of change $\dot{c}$, along with the initial conditions (equivalence ratio, temperature, pressure).

A novel formulation of the progress variable was proposed that improves the quality of the parameterization across a wide range of conditions and fuels. Crucially, the method also includes an automated identification of the important intermediate species during table generation — the intermediates whose concentrations are needed to accurately track the state, rather than requiring the user to specify them manually.
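
For orientation only, a conventional normalized progress variable, as commonly used in the tabulated-chemistry literature, can be written as follows. This is not the novel formulation mentioned above, which is not reproduced here. The symbols are assumptions for illustration: $Y_c$ is a weighted sum of selected species mass fractions $Y_k$ with weights $a_k$, and the subscripts $u$ and $b$ denote the unburned and fully burned states.

```latex
Y_c = \sum_{k} a_k \, Y_k ,
\qquad
c = \frac{Y_c - Y_{c,u}}{Y_{c,b} - Y_{c,u}} ,
\qquad
\dot{c} = \frac{\mathrm{d}c}{\mathrm{d}t}
```

With this normalization $c = 0$ in the fresh mixture and $c = 1$ in the burned products. The choice of species and weights determines how well $c$ parameterizes the state, which is where an automated selection of intermediates becomes useful.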

The tables are populated by running homogeneous reactor simulations (zero-dimensional, constant-volume or constant-pressure) at grid points in the operating space. These are fast for any single point; the cost of covering the full operating space scales with the number of grid points, but this is a one-time cost.
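
The table-population loop can be sketched with a toy reactor. Everything in this example is an assumption made for illustration: a single-step Arrhenius model replaces the detailed mechanism, the parameters `A`, `T_ACT` and `DT_AD` are hypothetical, and only the initial temperature is varied. A real table would come from a stiff ODE solver running the detailed mechanism over equivalence ratio, temperature and pressure.

```python
import numpy as np

# Hypothetical one-step model: dc/dt = A (1 - c) exp(-T_ACT / T), T = T0 + c * DT_AD.
A, T_ACT, DT_AD = 1.0e9, 15000.0, 1500.0

def rate(c, T0):
    """Progress-variable source term dc/dt of the toy model, in 1/s."""
    T = T0 + c * DT_AD
    return A * (1.0 - c) * np.exp(-T_ACT / T)

def run_reactor(T0, dc=1.0e-3, c_end=0.999):
    """Explicit Euler in time, with each step sized so that c advances by dc."""
    c, t = 0.0, 0.0
    cs, ts, rates = [c], [t], [rate(c, T0)]
    while c < c_end:
        t += dc / rate(c, T0)   # time step dt = dc / (dc/dt)
        c += dc
        cs.append(c)
        ts.append(t)
        rates.append(rate(c, T0))
    return np.array(cs), np.array(ts), np.array(rates)

# One reactor run per grid point of the operating space (here: T0 only).
T0_grid = np.linspace(750.0, 1050.0, 7)
c_grid = np.linspace(0.0, 0.99, 100)
table = np.empty((len(T0_grid), len(c_grid)))   # stores dc/dt as a function of (T0, c)
t_half = np.empty(len(T0_grid))                 # time to reach c = 0.5, in s

for k, T0 in enumerate(T0_grid):
    cs, ts, rates = run_reactor(T0)
    table[k] = np.interp(c_grid, cs, rates)
    t_half[k] = np.interp(0.5, cs, ts)

for T0, th in zip(T0_grid, t_half):
    print(f"T0 = {T0:6.1f} K  time to c=0.5: {th:.3e} s")
```

Each reactor run is independent of the others, so in practice the loop over grid points parallelizes trivially.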

Validation: does the table actually work?

Accuracy has to be demonstrated against the full chemistry, not assumed. The tabulated chemistry method was applied to spark-ignition engine simulations — specifically to knock prediction and emissions — and validated against direct integration of the full mechanism.

The results:

  • Knock onset angle: predicted with a root-mean-squared error of 0.6 degrees crank angle across the tested operating conditions. For context, knock timing is one of the most sensitive outputs of an engine simulation — small errors in chemistry can shift the predicted onset by several degrees, which is physically significant.
  • Cylinder pressure and temperature: in excellent agreement with full detailed chemistry across the full engine cycle.
  • Emissions (CO, NOₓ): good agreement across a wide range of operating conditions — the challenging test, because emissions are sensitive to the precise temperature and species history throughout the cycle, not just the bulk thermodynamic state.
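
For reference, the root-mean-squared error quoted for knock onset is the standard definition below. The notation is an assumption for illustration: $\theta_i^{\mathrm{tab}}$ and $\theta_i^{\mathrm{DI}}$ are the knock onset crank angles from the tabulated method and from direct integration for operating condition $i$, and $N$ is the number of operating conditions tested.

```latex
\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N}
    \left( \theta_i^{\mathrm{tab}} - \theta_i^{\mathrm{DI}} \right)^{2} }
```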

And the computational cost: the tabulated method provided a speedup of approximately 750× compared to direct integration of the 679-species mechanism, with computational time that is independent of mechanism size. Once the table is built, a 2,000-species mechanism costs the same to run as a 100-species mechanism.

The machine learning evolution

The tabulation approach solves the speed problem but introduces a storage problem: lookup tables for complex mechanisms covering wide operating ranges can become large. For large parametric studies or multi-dimensional problems, table size and memory requirements become a practical constraint.

The machine learning tabulation (MLT) method addresses this directly. Instead of a lookup table, the thermochemical state evolution is represented by deep neural networks (DNNs) — one or more networks per cluster of the operating space, trained on the same pre-computed data that would have been stored in the table.

The practical advantages:

  • No table storage: the networks are compact representations of the same information. Memory requirements are reduced by three orders of magnitude relative to conventional tables.
  • Smooth interpolation: DNNs interpolate smoothly between training points, avoiding the artifacts that can appear at the boundaries of table bins.
  • Wider applicability: the MLT approach generalizes more naturally to diverse fuels and conditions, including those that may fall in sparsely tabulated regions.
  • Training efficiency: networks were trained using the stochastic Levenberg-Marquardt (SLM) optimization algorithm, which reduces both training time and memory requirements for large-scale problems.
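
The storage argument in the first bullet can be made concrete with a back-of-the-envelope comparison. All sizes below are hypothetical and were picked only to show how a ratio of order 10^3 can arise; they are not the table or network dimensions used in the work. The networks are untrained, since the point here is parameter count and evaluation cost, not accuracy.

```python
import numpy as np

# Hypothetical lookup table: axes (progress variable, equivalence ratio, T, p).
table_shape = (100, 20, 40, 30)
n_outputs = 50          # stored quantities per grid point (assumed)
bytes_per_value = 8     # float64
table_bytes = int(np.prod(table_shape)) * n_outputs * bytes_per_value

# Hypothetical replacement: one small fully connected network per cluster.
layer_sizes = [4, 64, 64, n_outputs]
n_clusters = 10

def make_network(sizes, rng):
    """List of (weights, bias) pairs with small random initial weights."""
    return [(0.1 * rng.standard_normal((m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(network, x):
    """Evaluate the network: tanh hidden layers, linear output layer."""
    for W, b in network[:-1]:
        x = np.tanh(x @ W + b)
    W, b = network[-1]
    return x @ W + b

rng = np.random.default_rng(0)
networks = [make_network(layer_sizes, rng) for _ in range(n_clusters)]
network_bytes = sum(W.nbytes + b.nbytes for net in networks for W, b in net)

# One online evaluation: normalized (c, phi, T, p) in, n_outputs quantities out.
state = np.array([0.5, 1.0, 0.3, 0.7])
outputs = forward(networks[0], state)

print(f"table:    {table_bytes / 1e6:.1f} MB")
print(f"networks: {network_bytes / 1e6:.3f} MB")
print(f"ratio:    {table_bytes / network_bytes:.0f}x")
```

The table grows multiplicatively with every added axis or refinement, while the network size grows only with layer widths and the number of clusters, which is why the gap widens for higher-dimensional problems.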

The MLT method achieved a speedup of approximately 300× for a 621-species mechanism. That is lower than the roughly 750× of the lookup table approach, where each evaluation is only a lookup and interpolation, but it comes with the storage and flexibility advantages that matter for large-scale studies.

Validation against full kinetics confirmed high interpolation accuracy for ignition delay, knock onset prediction, and species concentrations, and the MLT method remained accurate across a wider range of conditions than a fixed lookup table of comparable size would cover.

What this enables in practice

The combination of these methods makes it practical to include detailed chemistry in applications that were previously limited to global empirical correlations:

  • Engine design optimization: sweeping over a large parameter space (fuel blend, compression ratio, injection timing, operating condition) requires many simulation runs. At 750× speedup, a study that would take six months becomes tractable in days.

  • Knock prediction and engine calibration: current production-level knock models use simplified Arrhenius expressions or Livengood-Wu integrals that do not account correctly for residual gas species and cannot generalize easily to new fuels. MLT provides a physically grounded, fuel-agnostic alternative at comparable computational cost.

  • Emissions prediction: CO and NOₓ formation depend on the full chemical history of the gas, not just the peak temperature or the overall equivalence ratio. Fast detailed chemistry means emissions can be predicted accurately, not estimated from bulk correlations.

  • Alternative fuels: as engine manufacturers explore hydrogen blends, ammonia co-firing, and synthetic fuels, the detailed mechanisms change significantly. Global correlations require re-fitting for every new fuel. A tabulated or ML-based method that works from the underlying mechanism requires only that the mechanism be validated — the tabulation or training is a routine step.
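
The six-months-to-days estimate in the first bullet can be checked with Amdahl's law. The sketch below assumes the 750× figure applies to the chemistry solve only, and that chemistry takes a fraction `chem_fraction` of the baseline runtime; both the reading and the fractions tried are assumptions, not values from the work.

```python
def overall_speedup(chem_fraction, chem_speedup):
    """Amdahl's law: only the chemistry share of the runtime is accelerated."""
    return 1.0 / ((1.0 - chem_fraction) + chem_fraction / chem_speedup)

baseline_days = 182.5   # roughly six months of wall-clock time
chem_speedup = 750.0

for chem_fraction in (0.90, 0.99, 0.999):   # hypothetical chemistry shares
    s = overall_speedup(chem_fraction, chem_speedup)
    print(f"chemistry share {chem_fraction:.3f}: "
          f"overall speedup {s:6.1f}x, study time {baseline_days / s:5.2f} days")
```

Under these assumptions, a chemistry share of about 99% turns six months into roughly two days, which matches the "days" estimate. The remaining flow-solver cost, not the chemistry, then sets the floor on runtime.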

The thread connecting phases one through three

What connects the PhD work on energetic materials to the tabulation work to the current focus on Chemical Reactor Networks is not the application domain — propellants, gasoline engines, and gas turbines are different systems. The connecting thread is a single recurring question: how do we use detailed chemistry at the scale where engineering decisions are made?

Quantum mechanics establishes what the chemistry is. Tabulation and machine learning make it computationally affordable for system-level simulation. Chemical Reactor Networks apply it to the complex geometries of gas turbines and industrial combustion systems where 3D CFD with detailed kinetics remains impractical.

Each phase is a response to a limitation of the previous one, and each builds on the same foundation: start from the physics, keep the chemistry honest, and make the tools usable.


The full work is described in:
