Tesla's 3, 6, 9 — Pseudoscience or a Forgotten Riddle for AGI?
"If you only knew the magnificence of the 3, 6 and 9, then you would have the key to the universe."
— attributed to Nikola Tesla (almost certainly apocryphal)
Prologue: The Most Famous Quote Tesla Never Said
Here's the uncomfortable truth we need to get out of the way: Nikola Tesla never said this.
Historians have scoured Tesla's published writings, patents, correspondence, and biographies. The quote first appeared in a 1990s book by Dale Pond about John Keely (a different inventor with spiritualist leanings). Someone, somewhere, misattributed it to Tesla. The internet did the rest.
What Tesla actually had was an obsessive-compulsive fixation on the number 3 in his personal life:
- He circled a building three times before entering
- He only stayed in hotel rooms whose numbers were divisible by 3 (New Yorker Hotel: room 3327)
- He used exactly 18 napkins to clean his silverware
- He swam exactly 33 laps daily
Biographer John J. O'Neill (1944) documented these as what we'd today recognize as OCD rituals. Nothing more.
So — case closed? 369 is a cozy pseudoscience marketed to TikTok "manifestation" influencers and New Age retreats?
Yes. And also: no.
Because while the mystical framing is wrong, the mathematical structure that 369 points toward is surprisingly real — and it intersects with one of the most fascinating developments in modern AI research.
Let me show you what I mean.
Part I: What 369 Actually Is (The Math, Not the Myth)
Remove the mysticism. Strip away the "cosmic vibration frequency" marketing. What remains is this:
Digital Root = Modulo 9 Arithmetic
The digital root of a number is what you get by repeatedly summing its digits until a single digit remains.
369 → 3+6+9 = 18 → 1+8 = 9
Mathematically, for any positive integer n: digital_root(n) = 1 + ((n - 1) mod 9), which is the same as n mod 9 with a result of 0 mapped to 9.
This is elementary number theory. Nothing cosmic about it.
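A minimal Python sketch makes the equivalence concrete. The function names are illustrative, not from any library; the first function follows the "keep summing digits" definition, the second uses the mod 9 closed form, and the loop checks that they agree.

```python
def digital_root_by_summing(n: int) -> int:
    """Repeatedly sum the decimal digits of n until one digit remains."""
    if n < 0:
        raise ValueError("n must be non-negative")
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n


def digital_root_closed_form(n: int) -> int:
    """Closed form: 0 stays 0, any positive n maps to 1 + (n - 1) mod 9."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 0 if n == 0 else 1 + (n - 1) % 9


# The two definitions agree on every value we try.
for n in range(10_000):
    assert digital_root_by_summing(n) == digital_root_closed_form(n)

print(digital_root_by_summing(369))  # 369 -> 18 -> 9
```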
Why 3, 6, 9 Are "Special"
Look at the multiplication table for 3:
| Multiple | Value | Digital Root |
|---|---|---|
| 3×1 | 3 | 3 |
| 3×2 | 6 | 6 |
| 3×3 | 9 | 9 |
| 3×4 | 12 | 3 |
| 3×5 | 15 | 6 |
| 3×6 | 18 | 9 |
| ... | ... | ... |
Multiples of 3 only ever produce 3, 6, or 9 as digital roots. That's the entire "mystery."
Now try the doubling sequence that vortex mathematicians obsess over:
1 → 2 → 4 → 8 → 16 → 32 → 64 → 128 → 256 → ...
Digital root: 1 → 2 → 4 → 8 → 7 → 5 → 1 → 2 → 4 → ...
The numbers 3, 6, 9 never appear in this sequence! "Proof they operate in a higher dimension!" vortex enthusiasts declare.
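The real explanation? Much simpler. Doubling modulo 9 traces out the cyclic group generated by 2 inside the multiplicative group of residues coprime to 9. It cycles through {1, 2, 4, 8, 7, 5} because these are precisely the residues coprime to 9, and 2 happens to generate all six of them. Since gcd(3, 9) = 3, gcd(6, 9) = 3, and gcd(9, 9) = 9, the residues 3, 6, and 9 are not coprime to 9. A power of 2 shares no factor with 9, so the doubling operation can never land on them.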
This is not a cosmic revelation. It's a property of modulo arithmetic that any first-year math student can prove.
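Both observations can be checked in a few lines. This is a small sketch with made-up helper names: it walks the doubling orbit modulo 9, lists the residues coprime to 9, and collects the digital roots of multiples of 3.

```python
from math import gcd


def doubling_orbit(modulus: int = 9, start: int = 1) -> list[int]:
    """Residues visited by repeatedly doubling `start` modulo `modulus`."""
    orbit, x = [], start % modulus
    while x not in orbit:
        orbit.append(x)
        x = (2 * x) % modulus
    return orbit


orbit = doubling_orbit()
units_mod_9 = sorted(r for r in range(1, 9) if gcd(r, 9) == 1)
# Digital roots of 3, 6, ..., 27: the residue mod 9, with 0 written as 9.
roots_of_multiples_of_3 = sorted({(3 * k) % 9 or 9 for k in range(1, 10)})

print(orbit)                    # [1, 2, 4, 8, 7, 5]
print(units_mod_9)              # [1, 2, 4, 5, 7, 8]
print(roots_of_multiples_of_3)  # [3, 6, 9]
```

The orbit and the set of units contain the same six residues, and 3, 6, 9 are exactly what is left over.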
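As the YouTube channel Mathologer has pointed out, the pattern is also an artifact of base 10: digit sums in base b track remainders modulo b − 1, so a culture counting in octal would find its "special" numbers in arithmetic modulo 7, and 3, 6, 9 would lose their status.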
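The dependence on the number base is easy to see in code. The sketch below (function names are illustrative) generalizes the digital root to an arbitrary base, using the fact that a digit sum in base b preserves the remainder modulo b − 1, and then reruns the doubling sequence and the multiples of 3 in octal.

```python
def digital_root(n: int, base: int = 10) -> int:
    """Repeatedly sum the base-`base` digits of n until one digit remains."""
    if n < 0 or base < 2:
        raise ValueError("need n >= 0 and base >= 2")
    while n >= base:
        total = 0
        while n:
            n, digit = divmod(n, base)
            total += digit
        n = total
    return n


def doubling_roots(base: int, steps: int) -> list[int]:
    """Digital roots of 1, 2, 4, 8, ... in the given base."""
    return [digital_root(2 ** k, base) for k in range(steps)]


print(doubling_roots(10, 7))  # [1, 2, 4, 8, 7, 5, 1]
print(doubling_roots(8, 7))   # [1, 2, 4, 1, 2, 4, 1]
# Multiples of 3 in octal no longer collapse onto three values:
print([digital_root(3 * k, 8) for k in range(1, 8)])  # [3, 6, 2, 5, 1, 4, 7]
```

In octal the arithmetic runs modulo 7, so the doubling cycle shrinks to {1, 2, 4} and the multiples of 3 visit every nonzero digit.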
Part II: The Bridge Nobody Built — Modular Arithmetic and the Grokking Revolution
Here's where it gets interesting.
That digital root operation, n mod 9, belongs to a vast family: arithmetic modulo m. And modular arithmetic, it turns out, sits at the center of one of the most mysterious phenomena in modern deep learning.
The Grokking Phenomenon (Gromov 2023)
Train a small transformer to do modular addition, for example (a + b) mod 113. For thousands of epochs, nothing seems to happen: the network memorizes the training set and performs terribly on unseen pairs. Then, long after training accuracy has saturated, it "groks": test accuracy jumps abruptly to near-perfect generalization.
What did the network learn?
Follow-up analyses of these networks found that the learned weights converge to a discrete Fourier basis. Hidden neurons become tuned to specific frequencies, each one a cosine wave over the discrete group ℤ/113ℤ. In effect, the network re-discovers the discrete Fourier transform to solve the task.
This isn't a niche finding. It's been replicated across architectures:
- Li et al. (2024), arXiv:2402.09469: Transformers converge to single Fourier frequencies for maximum-margin solutions in modular arithmetic
- Mallinar et al. (2025), ICML 2025 Oral: Even non-neural models (Recursive Feature Machines) learn the same block-circulant matrix structure — confirming this is a fundamental property of learning algorithms, not just neural networks
- Africa et al. (2025), arXiv:2506.23679: Modular exponentiation tasks show the same grokking dynamics
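To see why a Fourier basis is enough to solve modular addition, here is a hand-written NumPy sketch of the idea. It is not a trained network and not code from any of the papers above; the function name and the choice of frequencies (1, 2, 5) are assumptions made for illustration. Each input is represented only by cos/sin features, the angle-addition identities combine them, and the answer is the candidate c that maximizes the summed cosine.

```python
import numpy as np


def fourier_mod_add(a: int, b: int, p: int = 113, freqs=(1, 2, 5)) -> int:
    """Compute (a + b) mod p using only cos/sin features and trig identities."""
    c = np.arange(p)          # every candidate answer 0..p-1
    logits = np.zeros(p)
    for k in freqs:
        w = 2 * np.pi * k / p
        # Angle-addition identities give cos/sin of w*(a+b) from per-input features.
        cos_ab = np.cos(w * a) * np.cos(w * b) - np.sin(w * a) * np.sin(w * b)
        sin_ab = np.sin(w * a) * np.cos(w * b) + np.cos(w * a) * np.sin(w * b)
        # This equals cos(w*(a+b-c)), which reaches 1 only when c == (a+b) mod p
        # (for frequencies k that share no factor with p).
        logits += cos_ab * np.cos(w * c) + sin_ab * np.sin(w * c)
    return int(np.argmax(logits))


print(fourier_mod_add(100, 50))    # (100 + 50) mod 113 = 37
print(fourier_mod_add(3, 6, p=9))  # (3 + 6) mod 9 = 0
```

The same routine works for p = 9, which is the digital-root case up to writing 0 as 9.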
Why This Matters for 369
The digital root is mod 9 arithmetic. The grokking phenomenon involves mod p arithmetic. They're the same mathematical family: cyclic group computation.
A neural network trained to compute digital roots (mod 9) faces the same kind of problem as in mod 113 grokking, so one would expect the same kind of frequency-basis solution: the network discovers Fourier frequencies and becomes, in effect, a frequency analyzer.
The 3-6-9 folklore about "vibration patterns" is wrong in the specifics (there's nothing special about mod 9) but accidentally prescient in the category: modular arithmetic, and its deep connection to frequency representations, is fundamental to how neural networks generalize on these tasks.
Part III: Frequency as Computation — The Neuromorphic Frontier
If the 369 myth were to be "translated" into an actual research program for AGI, it would look something like this:
1. Multi-Frequency Oscillation Neural Networks
Liu et al. (2025), arXiv:2508.02191 — The paper that comes closest to realizing the "frequency-based AGI" vision. Their architecture has three subsystems:
- Perceptual system — encodes inputs as spike trains with specific frequencies
- Auxiliary system — maintains multi-frequency oscillations as a computational substrate
- Executive system — reads out decisions from the oscillatory state
Results: 2.18% higher accuracy than SOTA, with 48.44% fewer iterations.
This is a spiking neural network that uses frequency as its primary information dimension. Not a metaphor — actual frequency-encoded computation.
2. Hyperdimensional Computing (HDC)
Also called Vector Symbolic Architectures (VSA), HDC represents information as high-dimensional vectors (1,000–10,000 dimensions). The core operations are:
- Binding (⊗) — combines two vectors into a new one
- Bundling (+) — aggregates information
- Permutation (ρ) — sequences information over time
In the Fourier Holographic Reduced Representation (FHRR) variant, each dimension is a phase angle. Information is literally encoded as phases of a frequency vector.
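The three operations are short enough to sketch in NumPy. This is a minimal illustration of phase-vector (FHRR-style) arithmetic under assumed choices: the dimensionality, the helper names, and the role/filler example are all made up for the demo and do not come from a specific library.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2048  # number of phase angles per hypervector (illustrative choice)


def random_hv() -> np.ndarray:
    """Random hypervector: D independent phase angles in [-pi, pi)."""
    return rng.uniform(-np.pi, np.pi, D)


def wrap(theta: np.ndarray) -> np.ndarray:
    """Wrap angles back into [-pi, pi)."""
    return (theta + np.pi) % (2 * np.pi) - np.pi


def bind(x, y):
    """Binding: add phases (elementwise product of unit phasors)."""
    return wrap(x + y)


def unbind(z, y):
    """Inverse of bind: subtract phases."""
    return wrap(z - y)


def bundle(*vectors):
    """Bundling: sum the unit phasors and keep only the resulting phase."""
    return np.angle(np.sum([np.exp(1j * v) for v in vectors], axis=0))


def permute(x, shift: int = 1):
    """Permutation: cyclic shift, used to mark sequence position."""
    return np.roll(x, shift)


def similarity(x, y) -> float:
    """Mean cosine of phase differences: 1 if identical, near 0 if unrelated."""
    return float(np.mean(np.cos(x - y)))


color, shape, red, circle = (random_hv() for _ in range(4))
record = bundle(bind(color, red), bind(shape, circle))  # {color: red, shape: circle}
recovered = unbind(record, color)                       # query: what is the color?

print(similarity(recovered, red))     # clearly above chance
print(similarity(recovered, circle))  # close to 0
```

Unbinding a bundled record returns a noisy copy of the stored filler, which is why the similarity is well above zero but below 1.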
Olin-Ammentorp (2023), arXiv:2312.11783 demonstrated that HDC provides a "programming paradigm for oscillatory systems" — the natural way to program analog oscillator-based computers.
This is computation by resonance.
3. Reservoir Computing
A fixed, untrained, nonlinear dynamical system (the "reservoir") maps inputs to a high-dimensional state space. Only a linear readout layer is trained.
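A minimal echo state network shows the division of labor. This is a sketch under assumed settings (200 units, spectral radius 0.9, a sine-wave prediction task, ridge regression for the readout); none of these values come from a particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_res = 200

# Fixed random reservoir, rescaled to spectral radius 0.9. It is never trained.
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, size=n_res)


def run_reservoir(u: np.ndarray) -> np.ndarray:
    """Drive the reservoir with a 1-D input sequence and collect its states."""
    x = np.zeros(n_res)
    states = np.empty((len(u), n_res))
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + W_in * u_t)
        states[t] = x
    return states


# Task: predict the next sample of a sine wave from the current reservoir state.
signal = np.sin(0.2 * np.arange(1200))
states = run_reservoir(signal[:-1])
targets = signal[1:]

washout, split = 100, 1000  # discard the initial transient, hold out the tail
X_train, y_train = states[washout:split], targets[washout:split]

# Only the linear readout is trained, here by ridge regression.
ridge = 1e-4
W_out = np.linalg.solve(X_train.T @ X_train + ridge * np.eye(n_res),
                        X_train.T @ y_train)

pred = states[split:] @ W_out
mse = float(np.mean((pred - targets[split:]) ** 2))
print(f"held-out MSE: {mse:.2e}")
```

All of the nonlinearity and memory lives in the fixed dynamics; training reduces to one linear solve.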
Conceptual resonance with Tesla: The reservoir loosely echoes the "energy, frequency, and vibration" language popularly attributed to Tesla. It is a system where information is processed through its inherent dynamical response patterns. The resonator is the computer.
4. Modulo Arithmetic in Analog AI Accelerators
Demirkiran et al. (2024), Nature Communications 15:5098 — Using the Residue Number System (RNS) for analog DNN accelerators:
- Decompose large numbers into multiple low-bit residues
- Each residue is computed on a separate analog core
- Achieve ≥99% FP32 accuracy using only 6-bit integer analog cores
- 6 orders of magnitude energy efficiency improvement
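The RNS decomposes numbers by their remainders modulo several small moduli, which must be pairwise coprime so that the Chinese Remainder Theorem can reconstruct the original value. The set {7, 8, 9} is a valid RNS basis that includes 9; 3 and 9 cannot both appear, because they share the factor 3. The residue modulo 9 in such a system is exactly the digital root (with 0 written as 9), so the 369 pattern is, at root, a residue computation.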
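Here is a small sketch of residue arithmetic in plain Python. The moduli (7, 8, 9) and the helper names are illustrative choices, not the configuration used by Demirkiran et al.; RNS moduli need to be pairwise coprime, which is why 3 and 9 cannot share a basis. Each residue channel multiplies independently, and the Chinese Remainder Theorem reassembles the result.

```python
from math import gcd, prod

MODULI = (7, 8, 9)  # pairwise coprime; dynamic range 7 * 8 * 9 = 504


def is_pairwise_coprime(moduli) -> bool:
    """True if no two moduli share a common factor."""
    return all(gcd(a, b) == 1
               for i, a in enumerate(moduli) for b in moduli[i + 1:])


def to_rns(x: int, moduli=MODULI) -> tuple:
    """Encode x as its remainders modulo each modulus."""
    return tuple(x % m for m in moduli)


def from_rns(residues, moduli=MODULI) -> int:
    """Decode with the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(x, -1, m): modular inverse (Python 3.8+)
    return total % M


def rns_mul(a, b, moduli=MODULI) -> tuple:
    """Multiply channel by channel; no carries cross between channels."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


product = rns_mul(to_rns(17), to_rns(23))
print(to_rns(17), to_rns(23))              # (3, 1, 8) (2, 7, 5)
print(product, from_rns(product))          # (6, 7, 4) 391
print(is_pairwise_coprime((3, 7, 8, 9)))   # False: 3 and 9 share a factor
```

The result is only correct while the true value stays below the dynamic range (504 here); larger ranges need more or bigger moduli.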
Part IV: Reinterpreting 3, 6, 9 for the Age of AGI
If we strip away the mysticism and rebuild 3, 6, 9 as a research framework, here's what emerges:
3 — Three Computational Paradigms for Frequency-Based AGI
| Paradigm | Carrier | Computation | Key Architecture |
|---|---|---|---|
| Symbolic / Modular | Discrete cyclic group (ℤ/pℤ) | Fourier frequency decomposition | Grokked Transformers, Fourier Circuits |
| Subsymbolic / Neural | Spike trains, rate codes | Oscillatory dynamics, phase synchronization | Multi-frequency SNNs, LIF neurons |
| Hyperdimensional / Phase | High-dimensional phase vectors | Binding, bundling, permutation (phase arithmetic) | FHRR-HDC, Oscillatory VSA |
AGI may require all three — a triadic architecture where symbolic reasoning (modular), pattern recognition (neural), and compositional binding (hyperdimensional) coexist.
6 — Six Design Principles (The "Resonance" Blueprint)
- Oscillation — Computation is a temporal process, not a static feedforward pass
- Resonance — Systems respond maximally to inputs matching their eigenfrequencies
- Phase — Information is encoded in relative timing/spatial relationships
- Modulation — Carrier frequencies can be modulated to carry information (like radio)
- Binding — Phase locking synchronizes distributed representations
- Bundling — Multiple frequency channels coexist without interference (orthogonal codes)
9 — Nine Research Frontiers That Converge
| # | Frontier | Why It Matters |
|---|---|---|
| 1 | Grokking dynamics | Understanding how networks "harmonize" to generalize |
| 2 | Fourier circuits | How internal frequency representations emerge |
| 3 | Neuromorphic oscillators | Hardware that computes with frequency natively |
| 4 | Reservoir computing | Fixed dynamics as a computational substrate |
| 5 | Hyperdimensional computing | Phase-encoded symbolic operations |
| 6 | RNS analog accelerators | Modular arithmetic as an energy-efficiency lever |
| 7 | Brain oscillations | Neuroscience of theta/gamma phase coding |
| 8 | SNN learning rules | STDP as a resonance alignment mechanism |
| 9 | Compositional generalization | Binding symbols across frequency channels |
Part V: China's Frequency Computing Revolution — The Missing Research Program
If you've followed the argument so far, you might wonder: is anyone actually building this? Is there a research community that takes the "frequency = computation" paradigm seriously, not as mysticism but as engineering?
The answer is yes — and a surprising amount of it is happening in China. Over the past seven years, Chinese labs have independently converged on many of the ideas that 3-6-9 mysticism blindly gestures at.
1. The F-Principle — Shanghai Jiao Tong University (2018–2025)
In 2018, Zhi-Qin John Xu (许志钦) and collaborators at Shanghai Jiao Tong University published a series of papers revealing what they called the Frequency Principle (F-Principle, 频率原理) : deep neural networks learn target functions from low to high frequencies during training.
This wasn't a speculation — it was a rigorous empirical finding with a mathematical framework:
- Xu et al. (arXiv 2018; published at ICONIP 2019): First demonstration, showing that DNNs trained on 1D synthetic data learn low frequencies first
- Xu et al. (2020), Communications in Computational Physics: Extended to high-dimensional benchmarks (MNIST, CIFAR10) and deep architectures (VGG16)
- Luo, Ma, Xu, Zhang (2021), CSIAM Trans. Appl. Math: Theory of the F-Principle for general deep neural networks
- Xu, Zhang, Luo (2024), Communications on Applied Mathematics and Computation: Comprehensive overview of F-Principle / spectral bias
Why this matters for the 3-6-9 framework:
| Aspect | 369 Mysticism Says | F-Principle Actually Shows |
|---|---|---|
| Core idea | "Vibration frequencies govern reality" | Networks learn from low to high frequencies |
| Mechanism | Magic / resonance | Activation function regularity → frequency-domain decay |
| 3, 6, 9 role | Special cosmic numbers | None; the 3-6-9 pattern is a mod 9 artifact, unrelated to spectral bias |
| Practical value | None | Explains generalization, inspires multi-scale DNNs (MscaleDNN) |
The F-Principle provides the mathematical justification for why frequency analysis is central to understanding neural network learning. It's not mysticism — it's Fourier analysis of the training dynamics. And it was discovered and systematized by a Chinese research group.
Key reference: Xu, Zhang, Luo (2024), "Overview Frequency Principle/Spectral Bias in Deep Learning," Communications on Applied Mathematics and Computation 7(3): 827–864.
2. SpikingBrain — Chinese Academy of Sciences (2025–2026)
In September 2025, the Institute of Automation, Chinese Academy of Sciences (CASIA) — led by Bo Xu (徐波) and Guoqi Li (李国齐) — released SpikingBrain 1.0, the world's first brain-inspired spiking large language model.
This is arguably the closest existing system to the "frequency-based AGI" vision:
- Architecture: Linear and hybrid-linear attention with adaptive spiking neurons
- Training data: Only ~150B tokens (~2% of what mainstream LLMs use)
- Hardware: Fully trained on domestic MetaX C550 GPUs (no NVIDIA dependency)
- Performance: Comparable to open-source Transformer baselines
The breakthrough metric: 100× speedup in Time to First Token (TTFT) for 4-million-token sequences. The smaller 7B model achieved 26.5× speedup over Transformers on first-token generation with a 1M-token context.
Why? Because spiking neurons are event-driven — they only fire when input crosses a threshold, achieving 69.15% sparsity at the micro level. Combined with MoE sparsity at the macro level, this creates a system that literally computes through discrete firing events — a physical instantiation of "frequency as computation."
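The event-driven behavior can be illustrated with a textbook leaky integrate-and-fire (LIF) neuron. This is a generic sketch, not SpikingBrain's adaptive spiking neuron; the threshold, leak factor, and input levels are assumed values chosen for the demo.

```python
import numpy as np


def lif_spikes(current, threshold: float = 1.0, leak: float = 0.9) -> np.ndarray:
    """Simulate one leaky integrate-and-fire neuron; returns a 0/1 spike train."""
    v = 0.0
    spikes = np.zeros(len(current), dtype=int)
    for t, i_t in enumerate(current):
        v = leak * v + i_t      # leaky integration of the input
        if v >= threshold:      # an event happens only on a threshold crossing
            spikes[t] = 1
            v = 0.0             # reset after firing
    return spikes


steps = 1000
weak = lif_spikes(np.full(steps, 0.05))    # membrane settles near 0.5, never fires
medium = lif_spikes(np.full(steps, 0.15))  # fires every 11 steps
strong = lif_spikes(np.full(steps, 0.40))  # fires every 3 steps

for name, s in [("weak", weak), ("medium", medium), ("strong", strong)]:
    print(f"{name}: {s.sum()} spikes, sparsity {1 - s.mean():.1%}")
```

Input strength is carried by firing frequency, and most time steps produce no event at all, which is where the sparsity comes from.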
In March 2026, SpikingBrain was accepted by TMLR 2026. A major upgrade, SpikingBrain 2.0, was released in April 2026 with comprehensive architecture improvements.
Key reference: Pan et al. (2025), "SpikingBrain: Spiking Brain-inspired Large Models," arXiv:2509.05276, accepted by TMLR 2026.
3. Darwin Monkey / 「悟空」 — Zhejiang University (2025)
In August 2025, Zhejiang University's brain-computer intelligence lab unveiled Darwin Monkey (「悟空」, Wukong) — the world's largest neuromorphic computer based on dedicated chips, with over 2 billion spiking neurons and 100 billion synapses.
Key specs:
- 960 Darwin-III chips, each supporting 2.35 million spiking neurons
- ~2,000W power consumption at typical operation — comparable to a space heater, not a data center
- 15 blade servers, each containing 64 Darwin-III chips
- Wafer-scale integration: DarwinWafer uses 2.5D CoWoS-S packaging to integrate 64 dies on a single 12-inch wafer
- Runs DeepSeek brain-inspired large models for reasoning, content generation, and math solving
This surpasses Intel's Hala Point (1.15 billion neurons, April 2024) as the largest dedicated neuromorphic system. It represents the culmination of a decade of Chinese neuromorphic research, from Darwin Mouse (about 120 million neurons, 2020) to Darwin Monkey (2 billion neurons, 2025).
The connection to 3-6-9: Darwin Monkey is a physical system where computation is oscillation. Its spiking neurons communicate through discrete pulse events. The "vibration" metaphor becomes literal hardware architecture.
4. Speck Chip — CAS / Swiss Collaboration (2024)
In 2024, CASIA researchers, collaborating with Swiss partners, published the Speck neuromorphic chip in Nature Communications:
- Static power consumption: 0.42 mW — nearly zero when idle
- Sensing-computing integration: Directly processes sensory data without separate memory reads
- Event-driven: Only activates when input is present
This is the hardware-level realization of the resonance principle: the system responds rather than processes. When there's nothing to compute, it draws well under a milliwatt.
5. What This Tells Us
The Chinese research ecosystem has independently built the key pieces of the "frequency-based AGI" puzzle:
| Piece | Where | Who | When |
|---|---|---|---|
| Theory (F-Principle) | Shanghai Jiao Tong Univ. | Xu, Zhang, Luo | 2018–2025 |
| Model (SpikingBrain) | CAS, Beijing | Xu, Li, Pan | 2025–2026 |
| Hardware (Darwin Monkey) | Zhejiang Univ. | Pan Gang lab | 2025 |
| Chip (Speck) | CAS / Switzerland | Li Guoqi et al. | 2024 |
None of this was inspired by Tesla's 3-6-9. It emerged organically from the convergence of neuromorphic engineering, deep learning theory, and the practical imperative of energy-efficient AI. But the fact that it maps so cleanly onto the "frequency as computation" thesis — which 3-6-9 mysticism dimly glimpses — suggests that this direction is not a fringe curiosity but a genuine research paradigm.
The question is no longer "does frequency-based computing work?" It's "how quickly can we scale it?"
Tesla's 3, 6, 9 isn't a cosmic key to AGI. It's not a key to anything. It's an OCD habit wrapped in a fabricated quote, elevated to pseudoscience by the internet's love for mystery.
But.
The mathematical family that 369 belongs to — modular arithmetic, cyclic groups, frequency representations — is genuinely fundamental to how neural networks learn. The grokking phenomenon shows that networks spontaneously become frequency analyzers. Neuromorphic computing shows that frequency-encoded computation is not just possible but energy-efficient. Hyperdimensional computing shows that phase-encoded symbols can do real reasoning.
The intuition credited to Tesla was wrong about the specific numbers. But if you squint hard enough, the myth gestures at something real: computation through vibration, frequency, and resonance.
The quote is fake. The insight may yet be real.
We just needed 80 years of actual science to work out what that something is.
This article is part of a series exploring the cross-section of esoteric ideas and cutting-edge AI research. For a rigorous treatment of the topics discussed, see the references below.
Key References
- Gromov (2023), "Grokking modular arithmetic," arXiv:2301.02679
- Li et al. (2024), "Fourier Circuits in Neural Networks and Transformers: A Case Study of Modular Arithmetic with Multiple Inputs," arXiv:2402.09469
- Mallinar et al. (2025), "Emergence in non-neural models: grokking modular arithmetic via average gradient outer product," ICML 2025 Oral
- Liu et al. (2025), "Neuromorphic Computing with Multi-Frequency Oscillations," arXiv:2508.02191
- Demirkiran et al. (2024), "A blueprint for precise and fault-tolerant analog neural networks," Nature Communications 15:5098
- Olin-Ammentorp (2023), "Hyperdimensional Computing Provides a Programming Paradigm for Oscillatory Systems," arXiv:2312.11783
- Tesla biography: John J. O'Neill (1944), "Prodigal Genius: The Life of Nikola Tesla"
New References (2025–2026 Supplement)
- F-Principle: Xu, Zhang, Luo (2024), "Overview Frequency Principle/Spectral Bias in Deep Learning," Communications on Applied Mathematics and Computation 7(3): 827–864. (SJTU, China)
- F-Principle (foundational): Xu et al. (2020), "Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks," Communications in Computational Physics 28(5): 1746–1767.
- SpikingBrain: Pan et al. (2025), "SpikingBrain Technical Report: Spiking Brain-inspired Large Models," arXiv:2509.05276, accepted by TMLR 2026. (CAS, China)
- Darwin Monkey / Darwin-III: Zhejiang University (2025), "World's first 2-billion-neuron brain-inspired computer," ZJU Newsroom, Aug 2025.
- Speck chip: CAS / Swiss collaboration (2024), "Spike-based dynamic computing with asynchronous sensing-computing neuromorphic chip," Nature Communications.
- Grokking is fast in transformers: springtail.ai (2026), minibatch SGD accelerates grokking in modular arithmetic tasks.
- Africa et al. (2025): "Learning Modular Exponentiation with Transformers," arXiv:2506.23679.