r/ScientificComputing

Audited 512³ split-step quantum-state simulation on an i7 laptop — evidence packet included

I’m an independent researcher in Cairo working on CPU-first numerical simulation and reproducible solver evidence.

I recently released a bounded solver-evidence paper and SHA-256 locked artifact packet:

Audited Laptop-Scale 512³ Quantum-State Simulation: A REPA-Governed Solver Stack Beyond the Cluster-Only Assumption

DOI: https://zenodo.org/records/20247942

The claim is narrow:

  • 512³ internal-state complex split-step simulation using a oneAPI CPU backend on an Intel i7 laptop-class machine
  • persisted outputs are 2D amplitude/phase slice planes, not full 512³ volume dumps
  • separate Crank–Nicolson Hermitian conservation validation
  • separate GMRES/multigrid comparison against a PARDISO direct-solve oracle at calibration scale
  • dimension-tagged evidence matrix to prevent merging solver lanes
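
To make the Crank–Nicolson conservation lane concrete, here is a minimal 1D sketch (not the paper's solver, and far below 512³). It assumes a free-particle Hamiltonian H = -(1/2) d²/dx² with Dirichlet boundaries and units where ħ = m = 1; the function names and grid parameters are illustrative only. Because H is Hermitian, the Crank–Nicolson update is a Cayley transform and should preserve the discrete norm up to round-off.

```python
import cmath


def thomas_solve(off, diag, rhs):
    """Solve a tridiagonal system with constant diagonal `diag` and constant
    off-diagonals `off` using the Thomas algorithm (complex arithmetic)."""
    n = len(rhs)
    c = [0j] * n
    d = [0j] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0j] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(psi, dt, dx):
    """One step of (I + i dt/2 H) psi_new = (I - i dt/2 H) psi_old."""
    n = len(psi)
    a = dt / (2.0 * dx * dx)  # (dt/2) * H has diagonal a and off-diagonal -a/2
    rhs = []
    for j in range(n):
        left = psi[j - 1] if j > 0 else 0j
        right = psi[j + 1] if j < n - 1 else 0j
        rhs.append((1 - 1j * a) * psi[j] + 0.5j * a * (left + right))
    return thomas_solve(-0.5j * a, 1 + 1j * a, rhs)


def norm(psi, dx):
    """Discrete L2 norm squared: sum |psi_j|^2 dx."""
    return sum(abs(z) ** 2 for z in psi) * dx


def gaussian_packet(n, dx, k0=2.0):
    """Hypothetical initial state: Gaussian centred on the grid, momentum k0."""
    return [cmath.exp(-(((j - n / 2) * dx) ** 2) + 1j * k0 * j * dx)
            for j in range(n)]


if __name__ == "__main__":
    n, dx, dt = 64, 0.25, 0.05
    psi = gaussian_packet(n, dx)
    n0 = norm(psi, dx)
    for _ in range(50):
        psi = crank_nicolson_step(psi, dt, dx)
    print("relative norm drift:", abs(norm(psi, dx) - n0) / n0)
```

A drift well above round-off in a check like this would usually point to a non-Hermitian discretisation or a boundary-handling bug rather than to the time stepper itself.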

What I am not claiming:

  • not 512³ Crank–Nicolson execution
  • not 512³ GMRES/PARDISO parity
  • not cluster obsolescence in general
  • not proof of any AI/identity theory attached to the broader research program

I’m looking for hostile technical review: numerical issues, memory-accounting mistakes, evidence-boundary problems, reproduction suggestions, or places where the public claim should be narrowed.
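
As a starting point for the memory-accounting check, the baseline arithmetic for one 512³ complex state array can be sketched as below. The precision (complex64 vs. complex128) and the number of working arrays a split-step scheme holds at once are assumptions here, not figures taken from the paper.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def volume_bytes(n, bytes_per_value):
    """Bytes needed for one n x n x n array."""
    return n ** 3 * bytes_per_value


def slice_bytes(n, bytes_per_value):
    """Bytes needed for one n x n slice plane."""
    return n ** 2 * bytes_per_value


if __name__ == "__main__":
    n = 512
    for label, width in (("complex64", 8), ("complex128", 16)):
        one = volume_bytes(n, width)
        print(f"{label}: one state array = {one / GIB:.1f} GiB")
    # A persisted float64 amplitude (or phase) plane is tiny by comparison.
    print(f"float64 slice plane = {slice_bytes(n, 8) / MIB:.1f} MiB")
```

Multiplying the per-array figure by the number of simultaneously live arrays (state, FFT workspace, potential, and so on) gives the number to compare against the laptop's RAM.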

Paper/evidence packet:
https://zenodo.org/records/20247942

GitHub:
https://github.com/ChasingBlu/RECP_evidence
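
For anyone reproducing the packet, a SHA-256 check can be sketched as follows. The manifest layout assumed here (one `<hex digest>  <relative path>` pair per line, as written by `sha256sum`) and the function names are assumptions; the actual packet may store its hashes differently.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large artifacts need not fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(manifest_path):
    """Return the list of files whose digest differs from the manifest.

    Assumed manifest format: '<hex digest>  <path relative to manifest>'.
    """
    manifest = Path(manifest_path)
    root = manifest.parent
    mismatches = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary-mode entries with '*'
        if sha256_of(root / name) != expected.lower():
            mismatches.append(name)
    return mismatches
```

An empty return value means every listed file matches its recorded digest; it says nothing about files that are present but not listed.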

u/BlusLoopedMirror — 2 days ago

MCP server for the TLA+ model checker tla-rs

Hi all,

Just shipped an MCP server some of you might find useful: **tla-mcp**.

TLA+ is a formal-spec language for designing concurrent and distributed
systems. You describe what your protocol should do, and a model checker
tries every reachable state to catch invariant violations, deadlocks,
and race conditions you didn't see coming. With tla-mcp registered,
Claude Code can call the checker as a first-class tool: validate a spec,
run a bounded check with a counterexample trace, or replay specific
scenarios, all from inside the chat.

Tool descriptions are deliberately opinionated about how the model
should use the checker (budget all limits upfront, treat `limit_reached`
as inconclusive, look at the last transition of a trace first) so the
guidance survives context truncation.
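
To illustrate why `limit_reached` has to be read as inconclusive, here is a toy bounded checker in Python. It is a generic breadth-first sketch, not tla-rs internals or the tla-mcp tool API; the function name, the `"ok"` and `"violation"` status strings, and the counter example are all hypothetical.

```python
from collections import deque


def bounded_check(init_states, next_states, invariant, max_states):
    """Breadth-first exploration with a cap on the number of stored states.

    Returns a dict with 'status' in {'ok', 'violation', 'limit_reached'} and,
    for violations, the shortest trace from an initial state.
    """
    parent = {}
    queue = deque()
    for s in init_states:
        if s not in parent:
            parent[s] = None
            queue.append(s)
    truncated = False
    while queue:
        s = queue.popleft()
        if not invariant(s):
            trace = []
            while s is not None:
                trace.append(s)
                s = parent[s]
            trace.reverse()
            return {"status": "violation", "trace": trace}
        for t in next_states(s):
            if t in parent:
                continue
            if len(parent) >= max_states:
                truncated = True  # unexplored states remain: cannot claim 'ok'
                continue
            parent[t] = s
            queue.append(t)
    return {"status": "limit_reached" if truncated else "ok", "trace": None}


def counter_next(x):
    """Toy system: a counter that steps from 0 up to 10."""
    return [x + 1] if x < 10 else []


if __name__ == "__main__":
    bug = lambda x: x != 7  # invariant that fails once the counter hits 7
    full = bounded_check([0], counter_next, bug, max_states=100)
    print(full["status"], full["trace"][-2:])  # last transition of the trace
    small = bounded_check([0], counter_next, bug, max_states=5)
    print(small["status"])  # budget too small to reach the bad state
```

With the smaller budget the checker never reaches the violating state, so a clean-looking result there only means the search stopped early.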

Install + client config snippet + tour of the four tools is on the
landing page: **https://fabracht.github.io/tla-rs/**

It's an experiment. Feedback and bug reports welcome.
u/Anxious_Tool — 4 days ago

Two identical MPI jobs slow down drastically on Intel Alder Lake but not on Threadripper. Is it normal?

Hi everyone,

I regularly run multiple parallel MPI jobs simultaneously on my workstations. I have two systems:

  • Intel i7-12700 (12 cores: 8 P-cores + 4 E-cores), OS: Ubuntu 20.04
  • AMD Threadripper 3960X (24 cores, 48 threads), OS: Ubuntu 18.04

I wrote a simple C++ MPI test program that runs with mpirun -np 2. On both machines, a single instance finishes in about 12 seconds.

The problem appears when I run two instances at the same time (both mpirun -np 2):

  • Threadripper: Both finish in ~12 seconds (no slowdown)
  • Intel: Both take ~30 seconds (significant slowdown)

I tried pinning processes to specific cores using taskset and --cpu-set in mpirun. The processes do land on the correct cores (I verified with ps), but the slowdown persists.
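
One thing worth ruling out when pinning by hand is whether the two jobs were given logical CPUs that are hyperthread siblings of the same physical core. Below is a minimal sketch of that check on Linux, assuming the standard sysfs topology files are readable. The helper names and the CPU sets in the example are hypothetical, and a clean result here does not by itself explain or exclude the slowdown.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_siblings(cpu):
    """Logical CPUs sharing a physical core with `cpu`, read from sysfs."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    return parse_cpu_list(path.read_text())


def shared_cores(job_a, job_b, siblings_of=read_siblings):
    """Pairs (a, b) of logical CPUs from different jobs on one physical core."""
    clashes = []
    for a in sorted(job_a):
        sibs = siblings_of(a)
        for b in sorted(job_b):
            if b in sibs:
                clashes.append((a, b))
    return clashes


if __name__ == "__main__":
    # Hypothetical pinning: job A on CPUs 0 and 2, job B on CPUs 1 and 3.
    job_a = parse_cpu_list("0,2")
    job_b = parse_cpu_list("1,3")
    print(shared_cores(job_a, job_b))
```

A non-empty result means the two jobs are competing for the execution units of the same core even though each process sits on its own logical CPU.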

Is this expected behavior for Alder Lake? Could the hybrid P-core/E-core architecture be causing memory bandwidth contention? Or am I missing something else?

I'm trying to figure out if my Intel system is performing normally or if I should be hunting for a configuration issue.

Additional notes:

  • My code shows reasonable, normal speed-up with increasing core counts on both systems
  • The Intel PC has only one memory stick
  • The AMD PC has multiple memory sticks
  • My test code is not memory intensive (mostly CPU math)

I can provide more details if needed. I'm not super knowledgeable about CPU architectures, so apologies in advance.

Thanks for any insights!

u/hconel — 13 days ago

PhysCC: A DSL Compiler for Physics Simulations (SYCL, MPI, AVX2)

I’ve been working on PhysCC, an open-source tool designed to bridge the gap between high-level physics equations and low-level hardware optimization.

The problem: Writing boilerplate for SYCL, MPI, or AVX2 stencils is tedious.

The solution: You write a simple equation like u = u + dt * lap(u) and PhysCC generates the optimized backend code.
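
For readers unfamiliar with the notation, the update u = u + dt * lap(u) can be written out by hand as a plain reference loop. This is an unoptimized Python sketch, not PhysCC output; the 5-point stencil, the periodic boundaries, and the function names are assumptions made for illustration.

```python
def laplacian(u, dx=1.0):
    """5-point Laplacian of a 2D grid (list of lists), periodic boundaries."""
    n, m = len(u), len(u[0])
    inv = 1.0 / (dx * dx)
    return [
        [
            (u[(i - 1) % n][j] + u[(i + 1) % n][j]
             + u[i][(j - 1) % m] + u[i][(j + 1) % m]
             - 4.0 * u[i][j]) * inv
            for j in range(m)
        ]
        for i in range(n)
    ]


def step(u, dt, dx=1.0):
    """One explicit Euler step of u = u + dt * lap(u)."""
    lap = laplacian(u, dx)
    return [
        [u[i][j] + dt * lap[i][j] for j in range(len(u[0]))]
        for i in range(len(u))
    ]


if __name__ == "__main__":
    # Unit spike in the middle of a 5 x 5 grid.
    u = [[0.0] * 5 for _ in range(5)]
    u[2][2] = 1.0
    u = step(u, dt=0.1)
    print(u[2][2], u[1][2], sum(map(sum, u)))
```

For this explicit scheme in 2D the usual stability bound is dt <= dx² / 4, which is the kind of constraint a generated backend also has to respect.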

Key Features:

  • Multi-backend support (Single-core, OpenMP, MPI, SYCL, CUDA).
  • AI-informed pass: It analyzes the PDE type (Hyperbolic, Parabolic, Elliptic) and suggests optimal work-group sizes for Intel Iris Xe.
  • Built-in visualization script for heatmaps.
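
As background for the PDE-type feature, the textbook classification of a second-order linear PDE A u_xx + B u_xy + C u_yy + (lower-order terms) = 0 uses the sign of B² - 4AC. The sketch below shows only that rule; how PhysCC actually extracts the type is not described here, and the function name is hypothetical.

```python
def classify_pde(a, b, c, tol=0.0):
    """Classify A u_xx + B u_xy + C u_yy + ... = 0 by its discriminant.

    B^2 - 4AC < 0 -> elliptic, = 0 -> parabolic, > 0 -> hyperbolic.
    `tol` lets near-zero floating-point discriminants count as parabolic.
    """
    disc = b * b - 4.0 * a * c
    if disc < -tol:
        return "elliptic"
    if disc > tol:
        return "hyperbolic"
    return "parabolic"


if __name__ == "__main__":
    print(classify_pde(1, 0, 1))    # Laplace: u_xx + u_yy = 0
    print(classify_pde(1, 0, 0))    # heat: u_t = u_xx (no second t-derivative)
    print(classify_pde(1, 0, -1))   # wave: u_xx - u_tt = 0
```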

It’s still a work in progress, but I’d love to hear your thoughts on the codegen or the feature extraction logic!
https://github.com/NikosPappas/PhysCC

u/Pure_Treat6246 — 12 days ago