Any day traders who moved to China under the "six-year rule"?
Is there anybody?
Can it be done?
What was your experience with both the "six-year rule" and with moving to China?
I run an EA on a VPS and am not sure about the nuances.
Reality check up front: this is slower than your GPU today. What's interesting is why it works at all, and where the ceiling is.
The memory wall is the usual bottleneck for LLM inference: shuttling weights from VRAM to compute units. With BitNet's ternary weights ({-1, 0, +1}), there's a different option — instead of moving weights to a processor, do the math inside the DRAM chip.
The mechanism (this is what the explainer walks through visually):
If you send a DDR4 chip an ACT–PRE–ACT sequence with timing that violates JEDEC's tRAS/tRP rules in specific ways, the sense amps don't have time to fully resolve any single row. Multiple rows open at once and the analog charges mix on the bitlines. The sense amp then resolves to the majority value across the opened rows — every bitline in the subarray becomes one MAJ gate, computed in parallel across the row.
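To make the "every bitline is a MAJ gate" idea concrete, here is a minimal software model. It is an idealized sketch, not our hardware path: it assumes an odd number of opened rows and a clean majority vote per bitline, and it ignores the analog effects (charge sharing, per-chip calibration) that dominate on real DRAM. The name `multi_row_majority` is made up for illustration.

```python
import numpy as np

def multi_row_majority(rows: np.ndarray) -> np.ndarray:
    """Idealized model of multi-row activation.

    rows has shape (n_rows, n_bitlines) with values 0 or 1.
    Each column stands in for one bitline; the result is the
    majority bit across the opened rows for every column at once.
    """
    n_rows = rows.shape[0]
    if n_rows % 2 == 0:
        # With an even count a tie is possible; this toy model rejects it.
        raise ValueError("use an odd number of rows so the majority is well defined")
    ones_per_bitline = rows.sum(axis=0)
    return (ones_per_bitline * 2 > n_rows).astype(np.uint8)

# Three rows opened together, four bitlines wide.
rows = np.array(
    [[1, 0, 1, 1],
     [1, 1, 0, 0],
     [0, 0, 1, 0]],
    dtype=np.uint8,
)
result = multi_row_majority(rows)
print(result)
```

The point of the model is the shape of the operation: one command sequence, one result per bitline, with the width of the row setting the parallelism.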
MAJ(a, b, 0) = AND(a, b): with one of the three inputs tied to 0, the majority is 1 only when both a and b are 1. From AND you build ternary × int8 multiplies (the activations are int8 in BitNet, so the multiply decomposes into masked ANDs across 8 bitplanes + popcount). From those, you build a full transformer linear layer.
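The decomposition above can be sketched in plain NumPy. This is a reference model under stated assumptions, not the code from our repo: it assumes the ternary weights are split into a +1 mask and a -1 mask, that activations are signed int8 in two's complement (so bitplane 7 carries weight -128), and that popcount is just a sum over the ANDed bits. The function names `maj3`, `and_via_maj`, and `ternary_int8_dot` are hypothetical.

```python
import numpy as np

def maj3(a, b, c):
    """Bitwise majority of three 0/1 arrays."""
    return (a & b) | (a & c) | (b & c)

def and_via_maj(a, b):
    """AND built from majority by tying the third input to 0."""
    return maj3(a, b, np.zeros_like(a))

def ternary_int8_dot(weights, activations):
    """Dot product of ternary weights {-1, 0, +1} with int8 activations,
    using only masked ANDs over 8 bitplanes plus popcounts."""
    w = np.asarray(weights, dtype=np.int8)
    x = np.asarray(activations, dtype=np.int8)

    # Assumption: one mask row for +1 weights, one for -1 weights.
    pos_mask = (w == 1).astype(np.uint8)
    neg_mask = (w == -1).astype(np.uint8)

    # Reinterpret int8 as raw bytes so the bitplanes are two's complement.
    x_bits = x.view(np.uint8)

    total = 0
    for k in range(8):
        plane = (x_bits >> k) & 1
        # Bit 7 is the sign bit in two's complement, so it weighs -128.
        plane_weight = -(1 << 7) if k == 7 else (1 << k)
        pos_count = int(and_via_maj(pos_mask, plane).sum())  # popcount
        neg_count = int(and_via_maj(neg_mask, plane).sum())  # popcount
        total += plane_weight * (pos_count - neg_count)
    return total

example = ternary_int8_dot([1, -1, 0, 1], [5, -3, 100, -128])
print(example)
```

A full linear layer is this dot product repeated for every output row; the model here only checks that the AND-plus-popcount arithmetic agrees with an ordinary integer dot product.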
This isn't a hack — it's a line of published research (SiMRA, FracDRAM, POPCNT3) showing that the out-of-spec behavior of commodity DRAM is exploitable for compute. We built an end-to-end path from HuggingFace BitNet b1.58-2B-4T → PyTorch → DDR4-as-multiplier → next token.
What's in the explainer (10 scenes):
Code pane on each scene references specific lines in our project repo and the relevant paper section.
Honest caveats:
Happy to answer questions about the calibration nightmare, the row-decoder hypothesis for why max K = 32, or why ternary weights map onto this particular kind of compute so cleanly.