← Back to Posts

Building EMLVM: A Single-Instruction Machine for All Math

What if I told you that you could compute any elementary mathematical function using only a single instruction and the constant 1?

That's the premise behind a recent paper I read: "All elementary functions from a single operator" (arXiv 2603.21852). It introduces the EML operator (Exponential Minus Logarithm):

eml(x, y) = exp(x) - ln(y)

Just like the NAND gate is Turing-complete for boolean logic, this single operator can bootstrap the rest of mathematics.

EMLVM, a terminal-based stack machine to explore this concept interactively in WezTerm. It's essentially a cursed, math-centric version of Forth. emlvm.

How it works

The virtual machine is absurdly simple. It evaluates Reverse Polish Notation (RPN) strings like 11xE1EE where:

  • 1 pushes the constant 1.0
  • x pushes an input variable
  • E pops y, pops x, and computes eml(x, y)

It turns out 11xE1EE mathematically simplifies exactly to ln(x). By composing these sequences, we can build identity, addition, multiplication, and complex functions.

Proving it with SymPy

Playing around with a calculator to see if 11xE1EE looks like ln(x) is fun, but I wanted hard proof. I integrated a symbolic evaluator using Python's sympy library.

Now, we can execute EML programs symbolically:

$ emlvm algebra '11xE1EE'
# ... builds a symbolic trace proving it collapses to log(x)

We also built an equivalence checker. If we write a wrapper around exp(x) and ln(x) like 11x1EE1EE, we can run an algebraic proof to verify emlvm equiv 11x1EE1EE x to confirm that ln(exp(x)) == x.

Golfing for unknown functions

While the paper lays out the basics, there are plenty of operations—like negation (-x)—that have huge minimum program lengths (e.g., $K=15$).

To find these, I built a golf CLI command, which basically does an exhaustive search over all valid RPN trees up to a max length $K$, filtering them at C-speed using numpy and exact complex numbers, and then verifying candidates against algebraically independent transcendental numbers using mpmath to guarantee correctness.

It computes trig functions by accident

Because the entire VM uses complex math representations under the hood (via mpmath.mpc), we get trigonometric functions for free via Euler's formula.

Normally, computing sin(x) on purely real numbers with EML takes at least 75 tokens! But if we simply compute exp(i * x) on the VM with an imaginary input, it prints out the correct cos and sin components instantly.

The Exponential Blow-up (and how to fix it)

If you read the paper, you'll notice a catch: composing functions leads to an exponential blow-up in tree depth. Multiplication alone takes a depth-8 tree with over 40 leaves. Searching for programs like sqrt(x) requires exploring a trillion-parameter search space.

However, as some folks pointed out in discussions, this is an artifact of representing the mathematical functions as pure trees. If we represent the programs as a Directed Acyclic Graph (DAG) by caching and reusing common sub-expressions—similar to lemma definitions in MetaMath or standard registers in a CPU—the exponential blow-up collapses completely. Program scales become polynomial rather than exponential. Extending EMLVM to support a memory register/DAG compilation target is definitively the next big challenge.

Where to take it next?

The community discussions around this have been incredibly inspiring. Some ideas I'm hoping to explore (or hoping someone else PRs into the repo!):

  1. Symbolic Derivatives: Adding a command to find the algorithmic derivative of any EML tree. Tracing exactly how the calculus plays out on raw expressions could yield clear proofs on why certain functions lack indefinite integrals in symbolic form.
  2. Absolute Value & Signum: Squeezing out abs(x) (perhaps via sqrt(x*x)?) and eventually pushing to min/max and signum. Pushing these across a complex domain is going to generate incredibly strange, branch-cut-heavy behavior that I'm eager to chart.
  3. Alternate Basis Functions: Exploring an Elias omega coding-style basis like $2^x - \log_2(y)$ with constants 1 and $e$. Building exponential topologies with base-2 might prove to be far more computationally feasible at scale.
  4. Fuzzy Logics: Extending this complex number domain behavior to fuzzy logic. Unifying Lukasiewicz and product logics under complex truth values utilizing exponential properties—what does a "complex truth value logarithm" actually look like in an EML tree?

If any of this sounds interesting, PRs are open!


← Back to all posts