A Ti–Yong–Shu Map of Probabilistic Modeling
This article was mainly generated by ChatGPT. I have tried many prompts and the result is still not perfect, but I already find it quite useful as a mind map of this field.
The Bayesian worldview has become so influential across many areas of modern science that it's almost impossible to ignore, whether one likes it or not. It often feels as if, unless you speak this language, people won't take you seriously, to the point that it sometimes seems like a religion dressed up as science, or a science that resembles a religion. I believe we should treat it as one of many useful tools, rather than let it confine our way of thinking. A thoughtful reflection on this is *Holes in Bayesian Statistics* by Gelman and Yao.
Anyway, let’s get back to the main topic.
The modern landscape of probabilistic modeling seems both rich and fragmented. From Bayes’ theorem to GANs, from variational inference to diffusion models, we find a vast collection of techniques—each powerful, yet sometimes disconnected. But is there a deeper unity?
To reveal this unity, we can borrow a classical conceptual framework from Chinese philosophy: 体 (Ti), 用 (Yong), and 术 (Shu), roughly Essence, Function, and Method. It captures three complementary layers:
- Ti (体, Essence): the worldview and first principles behind probabilistic modeling.
- Yong (用, Function): why we model (infer, represent, generate, decide, explain).
- Shu (术, Method): how we compute (maximize, sample, approximate, compete, propagate).
Each concrete model (ICA, VAE, GMM, etc.) is thus a point in this three-dimensional design space.
I. 体 (Ti): Foundational Principles
| Principle | Essence |
|---|---|
| 1. Uncertainty as intrinsic to knowledge | Model distributions, not fixed values. |
| 2. Latent generative process | Observed data arise from hidden variables. |
| 3. Learning = inference | To learn means to infer hidden causes or parameters. |
| 4. Rational belief updating | Bayes’ rule provides a consistent mechanism. |
| 5. Structure and parsimony | Simpler probabilistic structures generalize better. |
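Since principles 3 and 4 both rest on Bayes' rule, it is worth writing it out once in standard notation: with θ denoting parameters or hidden causes and x the observed data,

```latex
% Bayes' rule: posterior is proportional to likelihood times prior
p(\theta \mid x) \;=\; \frac{p(x \mid \theta)\, p(\theta)}{p(x)},
\qquad
p(x) \;=\; \int p(x \mid \theta)\, p(\theta)\, d\theta
```

Every method in the tables below can be read as a different way of computing, approximating, or sidestepping this posterior.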
II. 用 (Yong): Core Purposes and Functions
| Purpose | Core Question | Outcome |
|---|---|---|
| Inference | What are the hidden parameters or causes given data? | Posteriors, estimates. |
| Representation Learning | How to encode data into latent factors or components? | Compressed latent space. |
| Generation | How to synthesize new samples consistent with observed reality? | Generative models. |
| Decision-making | How to act optimally under uncertainty? | Policies, expected utility. |
| Causal Reasoning | How to understand interventions and structure? | Structural causal models. |
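The decision-making row admits a compact statement: under the probabilistic worldview, the optimal action maximizes expected utility under the posterior produced by the inference row. In standard notation (a is an action, U a utility function, and p(θ | x) the posterior):

```latex
a^{*}(x) \;=\; \arg\max_{a}\; \mathbb{E}_{p(\theta \mid x)}\!\left[ U(a, \theta) \right]
\;=\; \arg\max_{a} \int U(a, \theta)\, p(\theta \mid x)\, d\theta
```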
III. 术 (Shu): Core Methods and Algorithms
| Method Family | Description | Typical Algorithms |
|---|---|---|
| Likelihood-based estimation | Optimize data likelihood (possibly with priors). | MLE, MAP, EM. |
| Sampling-based inference | Approximate posterior by random samples. | MCMC, Gibbs. |
| Optimization-based inference | Approximate posterior by optimization. | Variational inference, ELBO. |
| Energy minimization / message passing | Solve inference in structured graphs. | Belief propagation, mean-field. |
| Adversarial or contrastive learning | Learn by distribution matching or contrastive objectives. | GAN, Noise-Contrastive Estimation, Contrastive Divergence. |
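To make the first three rows concrete, here are the objectives they optimize, written in standard notation for a model p_θ(x) with optional latent variable z; the last line is the evidence lower bound (ELBO) maximized in variational inference:

```latex
\hat{\theta}_{\mathrm{MLE}} \;=\; \arg\max_{\theta} \log p_{\theta}(x),
\qquad
\hat{\theta}_{\mathrm{MAP}} \;=\; \arg\max_{\theta} \bigl[\, \log p_{\theta}(x) + \log p(\theta) \,\bigr]

\log p_{\theta}(x) \;\ge\;
\mathbb{E}_{q_{\phi}(z \mid x)}\!\bigl[\, \log p_{\theta}(x, z) - \log q_{\phi}(z \mid x) \,\bigr]
\;=\; \mathrm{ELBO}(\theta, \phi)
```

Sampling-based methods (MCMC, Gibbs) target the same posterior as variational inference but approximate it with samples rather than an optimized family q_φ.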
IV. 合 (He): Integrative Table — Mapping Ti → Yong → Shu
This is the fourth integrative layer — a cross-mapping table that shows for each major method:
- what it is fundamentally for (Yong: purpose),
- what principle or worldview it inherits (Ti connection), and
- what technique or inference mechanism (Shu) it relies on.
Each method embodies one or more Yong (functions) and realizes them via certain Shu (techniques), under the common Ti (probabilistic worldview).
| Model / Method | Primary Yong (Function) | Core Shu (Technique) | Ti Connection (Foundational Principle) | Remarks |
|---|---|---|---|---|
| MLE | Inference / Parameter estimation | Likelihood maximization | Learning = inference | Pure frequentist baseline. |
| MAP | Inference with prior knowledge | Likelihood + prior regularization | Rational belief updating | Bayesian variant of MLE. |
| EM Algorithm | Inference (latent variables) | Iterative E/M steps under MLE | Latent generative process | GMM, HMM, Factor Analysis. |
| MCMC | Inference / Sampling | Markov chain sampling | Rational belief updating | Foundational for Bayesian computation. |
| Variational Inference (VI) | Inference / Approximation | Optimization of ELBO | Learning = inference | Core to VAEs, topic models. |
| PCA / Probabilistic PCA | Representation learning | Closed-form MLE under Gaussian | Latent generative process | Linear Gaussian latent model. |
| ICA | Representation learning + inference | MLE under independence constraint | Latent sources assumption | Links “Yong: representation” + “Shu: MLE”. |
| Sparse Coding | Representation learning | MAP (L1 prior on latent) | Prior regularization = parsimony | Bridge between inference & compression. |
| GMM | Generation + clustering | EM (MLE) | Mixture latent process | Canonical mixture model. |
| Bayesian Networks | Causal reasoning / inference | Exact or approximate inference | Structured dependency modeling | Directed graphical model. |
| MRF / CRF | Contextual inference / prediction | Energy minimization, message passing | Local dependency modeling | Undirected structured model. |
| RBM | Representation & generation | Contrastive Divergence (approx MLE) | Energy-based latent structure | Basis for deep belief nets. |
| Products of Experts (PoE) | Generation / density modeling | Joint energy minimization | Combining independent constraints | RBM is special case. |
| Field of Experts (FoE) | Generation / image priors | MRF + learned filters | Structured local dependencies | Natural image modeling. |
| VAE | Representation + generation | Variational inference (ELBO) | Latent generative process | Deep latent variable model. |
| GAN | Generation (implicit) | Adversarial training (min–max) | Distributional realism | Not explicit probability but shares generative Ti. |
| Normalizing Flows | Generation / inference | Invertible transformation, MLE | Probabilistic bijection | Exact likelihood models. |
| Diffusion Models | Generation | Reverse stochastic process training | Probabilistic time-evolution | Modern SOTA generative model. |
| HMM / Kalman Filter | Inference / sequence modeling | EM, filtering, smoothing | Temporal latent process | Sequential probabilistic structure. |
| Bayesian Decision Theory | Decision-making | Expected utility maximization | Rational belief updating | Foundation for Bayesian RL. |
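To see how a single row of this table becomes working code, here is a minimal NumPy sketch of the GMM row (Yong: clustering and generation, Shu: EM under MLE, Ti: mixture latent process). It fits a two-component 1-D Gaussian mixture to synthetic data; the data, constants, and variable names are illustrative choices, not anything prescribed by the table.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ti: a latent generative process. Pick a hidden component z, then draw x from it.
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(1.5, 1.0, 700)])

# Initial parameters for K = 2 components.
pi = np.array([0.5, 0.5])    # mixing weights
mu = np.array([-1.0, 1.0])   # component means
var = np.array([1.0, 1.0])   # component variances

def gaussian_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

for _ in range(100):
    # E-step (inference): posterior responsibility of each component for each point.
    dens = pi * gaussian_pdf(x[:, None], mu, var)      # shape (N, K)
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step (Shu: weighted maximum likelihood): re-estimate the parameters.
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print("weights:", pi, "means:", mu, "variances:", var)
```

The same skeleton, with the E-step replaced by sampling (MCMC) or by an optimized q_φ (variational inference), gives the other inference rows of the table.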
V. Summary Diagram (Conceptual)
[体] Foundations: Probability as logic of uncertainty
↓
[用] Functions: Inference | Representation | Generation | Decision | Causality
↓
[术] Techniques: MLE/MAP | EM | MCMC | VI | BP | Adversarial | Energy-based
↓
[合] Concrete Methods: PCA, ICA, GMM, RBM, VAE, GAN, CRF, etc.
Each level unfolds naturally from the one above:
- Understand “体”, and the functions (用) become inevitable.
- Grasp “用”, and the diversity of methods (术) becomes intuitive.
- The mapping (合) simply records how the abstract purposes manifest concretely.