r/cpp

▲ 25 r/cpp+2 crossposts

hi!
i made a memory allocation library/learning resource. i wanted to learn more about allocators and i couldn't find one comprehensive source of knowledge, so i decided to make one of my own :]
it currently has these basic allocator types: arena (linear), stack, pool, free list, free tree, tracking, buddy, slab.
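For readers who have not met the first type on that list, here is a minimal sketch of an arena (linear) allocator. It is not taken from oo-alloc; the class name `Arena` and its members are hypothetical and only illustrate the bump-pointer idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal arena (linear) allocator sketch. All names are hypothetical and
// are not taken from oo-alloc.
class Arena {
public:
    explicit Arena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns `size` bytes aligned to `align` (a power of two), or nullptr
    // when the arena does not have enough room left.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const auto base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t current = base + offset_;
        const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
        const std::uintptr_t aligned = (current + mask) & ~mask;
        const std::size_t padding = static_cast<std::size_t>(aligned - current);
        const std::size_t remaining = buffer_.size() - offset_;
        if (padding > remaining || size > remaining - padding) return nullptr;
        offset_ += padding + size;               // bump the offset
        return buffer_.data() + (offset_ - size);
    }

    // Individual frees are not supported; everything is released at once.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<std::byte> buffer_;
    std::size_t offset_;
};
```

Allocation is a pointer bump plus alignment padding, which is why arenas are cheap, and also why they cannot free a single object in the middle.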
i did my best to describe everything clearly in the readme, and also added svg diagrams (written in Typst, btw). i plan to implement a bucket/size-segregated free list allocator as well.
hoping anyone will find this resource useful!
https://github.com/nihiL7331/oo-alloc

u/Financial_Travel_543 — 14 hours ago
▲ 9 r/cpp

C++ profiles: a chance to fix some annoying defaults? Brainstorming and ideas.

Hello everyone,

Lately I have been thinking about the opportunity that profiles could give to C++ for "better defaults" and "cleanups".

Which profiles would you like to see as "standard" or "enabled by default" in an eventual profile-enforced version of the language, provided you think they could reasonably fit?

I will start:

- uninitialized variables: must use [[indeterminate]]
- [[nodiscard]] by default? Would that be possible? Maybe this changes the meaning.
- hardened std lib guarantee?
- type safety/bounds safety (in user code)
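
To make the [[nodiscard]] item concrete, here is a small sketch of what the attribute does today when applied by hand. `SmallStack` is a hypothetical type made up for this example; a "nodiscard by default" profile would presumably apply the same check without the explicit annotations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity stack, used only for illustration.
class SmallStack {
public:
    // The result says whether the push happened. Ignoring it hides overflow,
    // so [[nodiscard]] asks the compiler to warn on a discarded result.
    [[nodiscard]] bool try_push(int value) {
        if (size_ == kCapacity) return false;
        items_[size_++] = value;
        return true;
    }

    [[nodiscard]] int top() const {
        assert(size_ > 0 && "top() on an empty stack");
        return items_[size_ - 1];
    }

    [[nodiscard]] std::size_t size() const { return size_; }

private:
    static constexpr std::size_t kCapacity = 4;
    int items_[kCapacity] = {};   // value-initialized, never indeterminate
    std::size_t size_ = 0;
};
```

A bare `stack.try_push(1);` statement typically produces a warning with this annotation, while `bool ok = stack.try_push(1);` does not.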
reddit.com
u/germandiago — 1 day ago
▲ 50 r/cpp+1 crossposts

citor: a header-only C++20 thread pool tuned for sub-µs dispatch

I just released citor, a small header-only C++20 thread pool / parallel runtime aimed at CPU-bound workloads where per-dispatch latency actually shows up in the profile.

Repo: https://github.com/Lallapallooza/citor

The main idea is: keep the common CPU-parallel shapes in one pool, avoid per-call allocations on the hot path, let the producer participate as slot 0, and make short repeated phases cheaper than repeatedly waking a worker team.

The simplest thing looks like what you'd expect:

citor::ThreadPool pool(8);

pool.parallelFor<citor::HintsDefaults>(
    0, data.size(),
    [&](std::size_t lo, std::size_t hi) {
        for (std::size_t i = lo; i < hi; ++i)
            data[i] *= 2;
    });

Beyond parallelFor, it has deterministic parallelReduce, parallelScan, parallelChain, runPlex for repeated phases over the same partition, recursive forkJoin with per-worker Chase-Lev deques, bulkForQueries, and submitDetached. There is also a PoolGroup that creates one arena per shared-L3 group, mostly useful on multi-CCD Zen.

A few internals that ended up mattering more than I expected:

  • each worker owns a cache-line-aligned mailbox and the whole dispatch protocol is a per-slot mailbox stamp, no shared queue
  • the producer can short-circuit small jobs by CAS-ing the worker's mailbox to DONE itself and running the body inline, no wake at all (worker's own ack races the producer's self-stamp, loser short-circuits);
  • the join barrier is a per-slot done-epoch scan with cancellation riding the same epoch read, so no shared sense bit and no per-iteration cancel poll
  • the worker's spin-entry rdtscp doubles as a store-buffer drain, so the producer sees the DONE stamp before its next mailbox read - free side benefit of timing the spin
  • kCacheLine is 128 bytes rather than 64 because Zen prefetches in cache-line pairs and contended atomics get measurably worse if you size to 64.
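
A rough sketch of the padded per-slot mailbox and the "one side wins the CAS" idea from the bullets above. This is not citor's code: the state names, `post`, and `try_claim` are hypothetical, and the real protocol (epochs, self-stamping, the rdtscp spin) is more involved.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 128 mirrors the kCacheLine value described above.
inline constexpr std::size_t kCacheLine = 128;

// Hypothetical mailbox states for this sketch.
inline constexpr std::uint32_t kIdle = 0;
inline constexpr std::uint32_t kPosted = 1;
inline constexpr std::uint32_t kDone = 2;

// One mailbox per worker slot, padded so two slots never share a line pair.
struct alignas(kCacheLine) Mailbox {
    std::atomic<std::uint32_t> stamp{kIdle};
};
static_assert(sizeof(Mailbox) == kCacheLine);
static_assert(alignof(Mailbox) == kCacheLine);

// Producer publishes a job to one slot.
inline void post(Mailbox& box) {
    // A slot must not be re-posted while a job is still pending.
    assert(box.stamp.load(std::memory_order_relaxed) != kPosted);
    box.stamp.store(kPosted, std::memory_order_release);
}

// Producer or worker tries to claim the posted job by moving the stamp to
// kDone. Exactly one caller wins the compare-exchange; the loser gets false
// and skips the body.
inline bool try_claim(Mailbox& box) {
    std::uint32_t expected = kPosted;
    return box.stamp.compare_exchange_strong(expected, kDone,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire);
}
```

Because every slot has its own stamp on its own aligned block, the producer and one worker are the only parties that ever contend on a given line.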

For perf, I wrote a comparative harness against BS::thread_pool, dp::thread_pool, task-thread-pool, riften, oneTBB, Taskflow, Eigen, OpenMP, Leopard, dispenso, libfork, and TooManyCooks. Competitor revisions are pinned, host gates are printed at startup, OpenMP wait policy is normalized, and raw samples can be exported as JSON.

In my current benchmark sweep, citor wins roughly:

  • 92% of contested cells on a Ryzen 9950X3D
  • 75% on a 96-core Genoa box
  • 69% on a 48-core Sapphire Rapids box

Hot fan-out dispatch on the 9950X3D is usually in the 100-400 ns range depending on participant count and shape.

Please treat those as "my harness on my machines or aws," not universal truth. If the numbers matter to your use case, run the benchmark yourself. The README has the methodology and reproduction commands.

There is real work left:

  • topology detection is still shaped mostly around Zen CCDs
  • multi-socket EPYC, sub-NUMA clustering, hybrid P/E cores, and Intel mesh are not first-class yet
  • parallelReduce uses static contiguous chunks and does not steal after a worker finishes, so heavy-tail bodies can leave cores idle
  • the coroutine wrapper queues on a per-pool driver thread rather than doing continuation stealing
  • bulkForQueries only fans across queries today; a true 2D fan is probably the next useful shape.
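
To illustrate the static contiguous chunking mentioned for parallelReduce, here is a minimal sketch of how such a partition can be computed. `static_chunk` is a hypothetical helper, not a citor function, and the rounding rule is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Returns the half-open range [lo, hi) that worker `index` owns when
// `count` items are split into `parts` contiguous chunks up front.
// The first `count % parts` chunks receive one extra item.
inline std::pair<std::size_t, std::size_t>
static_chunk(std::size_t count, std::size_t parts, std::size_t index) {
    assert(parts > 0 && index < parts);
    const std::size_t base = count / parts;
    const std::size_t extra = count % parts;
    const std::size_t lo = index * base + (index < extra ? index : extra);
    const std::size_t hi = lo + base + (index < extra ? 1 : 0);
    return {lo, hi};
}
```

The ranges depend only on `count`, `parts`, and `index`, which keeps the reduction order deterministic. The cost is the one listed above: a worker that finishes its range early has nothing to steal.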

What citor is not:

  • not an I/O executor
  • not a general async/future abstraction
  • not a TBB or OpenMP replacement for arbitrary workloads
  • not tuned equally for every CPU topology

I'd especially like feedback on benchmark fairness, API shape before 1.0, missing competitors, and whether the affinity / pinning behavior is too surprising for a library like this. Perf improvement suggestions are welcome too, of course. If anything in the README reads like overclaiming, I'd rather fix it now.

Update: there is an external benchmark as well: https://github.com/tzcnt/runtime-benchmarks

github.com
u/ShabelonMagician — 1 day ago
▲ 27 r/cpp+1 crossposts

FluxUI — write your C++ UI once, run on Windows, Linux, and Android natively

Most C++ UI frameworks drop the ball on Android. FluxUI doesn't — same C++20 codebase, all three platforms. The framework handles all platform-specific details under the hood so you never have to think about them.

The API is Flutter-inspired (declarative widgets, reactive state), and there's a CLI to scaffold and run projects in two commands.

Just tagged v0.1.0. It's early but the core is solid.

GitHub: https://github.com/HeyItsBablu/flux

Feedback welcome — especially from anyone who's tried cross-platform C++ UI and given up.

u/dEvator8085 — 1 day ago
▲ 16 r/cpp

Should I continue learning C++?

I’m not sure if this is the right subreddit, but I just needed to put this somewhere. My ex (M20) and I (F20) broke up about 2 weeks ago. Before our breakup, he was teaching me C++, his favourite language. I also code, but only in Python, so C++ felt pretty different to me and I was still at a very beginner stage. The thing is, I still want to keep learning it on my own, as I find it pretty interesting. But now, every time I try to write C++, I immediately think of him and end up feeling weirdly emotionally loaded for no particular reason. It feels a bit ridiculous because it’s literally just C++. At this point I'm not sure if I should keep going with C++ or just pick up another language.

reddit.com
u/_unstableunicorn_ — 1 day ago
▲ 70 r/cpp

Virtual dispatch isn't always the slowest, and std::variant isn't always the fastest

I've been looking at how OpenJDK's GC barrier system picks its implementation at runtime using templates instead of virtual dispatch. The trick is lazy resolution: you pay once at first use instead of a vtable lookup on every call.

That got me curious enough to benchmark it against three other approaches: virtual functions, function pointers, and std::variant + std::visit. I was surprised to see std::variant being the slowest on libstdc++ while virtual dispatch beat it comfortably.
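
For anyone unfamiliar with lazy resolution, here is a stripped-down sketch of the idea using a plain function pointer. It is not OpenJDK's code and not the benchmark from the blog; the barrier functions and the `g_use_b` flag are hypothetical, and the sketch is not thread-safe.

```cpp
#include <cassert>

// Hypothetical implementations, one of which is chosen at runtime.
inline int barrier_a(int x) { return x + 1; }
inline int barrier_b(int x) { return x * 2; }

// Stand-in for a runtime configuration flag (an assumption of this sketch).
inline bool g_use_b = false;

using BarrierFn = int (*)(int);
inline int resolve_and_call(int x);

// Starts out pointing at the resolver. After the first call it points
// straight at the chosen implementation.
inline BarrierFn g_barrier = &resolve_and_call;

inline int resolve_and_call(int x) {
    g_barrier = g_use_b ? &barrier_b : &barrier_a;   // decide once
    return g_barrier(x);
}

// Every call is one indirect call; only the first also pays for the choice.
inline int barrier(int x) {
    assert(g_barrier != nullptr);
    return g_barrier(x);
}
```

After the first `barrier(...)` call, `g_barrier` no longer refers to the resolver, so later calls go directly to the selected function.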

Please refer to the blog for my full analysis. Would love to hear what you think!

Edit: Benchmarks are on GCC 11 (Ubuntu 22.04 default). GCC 12+ significantly improves std::visit. Full compiler version comparison in the next post.

shubhankar-gambhir.github.io
u/AdMotor4869 — 2 days ago
▲ 7 r/cpp+3 crossposts

BeeMesh++ — A distributed volunteer computing framework built with modern C++ & Asio

Hi,

We have been working on an open-source project called BeeMesh++, a C++ implementation of the original Python project, BeeMesh.

This is basically like SLURM but for multiple geographically independent devices.

It uses a nature-inspired architectural model:

  • The Hive (Orchestrator): Manages the state of the network, tracks available compute nodes (bees), handles job dispatching logic, and aggregates results.
  • The Bees (Workers): Volunteer compute nodes that connect to the Hive, announce their availability, listen for incoming serialized task payloads, execute them, and stream the results back.

NOTE: This is still in its early stages.

The plan ahead is to implement encryption for all network communication, communication between bees, parallelization of independent code blocks, etc.

Feedback, architectural critiques, or code reviews appreciated.

u/dheerajshenoy22 — 1 day ago
▲ 120 r/cpp+2 crossposts

Announcing iceoryx2 v0.9: Fast and Robust Inter-Process Communication (IPC) Library

ekxide.io
u/elfenpiff — 3 days ago
▲ 23 r/cpp+3 crossposts

7 New Projects Made in Unigine, Using C language

New projects and games made in Unigine. C is a primary form of coding in Unigine

youtu.be
u/Confident_Door9438 — 2 days ago
▲ 80 r/cpp

Boost 1.91.0 is now available in both Conan and vcpkg

For those of you waiting to upgrade through your package manager, Boost 1.91.0 has landed in both Conan and vcpkg.

What's in 1.91:

  • Boost.Decimal — new library implementing IEEE 754 decimal floating point arithmetic (from Matt Borland and Christopher Kormanyos)
  • Asio binary versioning — optional inline namespace lets multiple Asio versions coexist in the same process without symbol conflicts
  • 58 fewer internal dependencies across 55 libraries
  • StaticAssert merged into Config — no code changes needed, just update your dependency declarations when ready
  • CMake import std detection fix
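
The binary versioning item relies on a general C++ mechanism, inline namespaces. The sketch below shows that mechanism only, using a hypothetical library called `netlib`. It does not reproduce Asio's actual namespace names or configuration macros.

```cpp
#include <cassert>
#include <string>

// Hypothetical library `netlib`; it is not Asio.
namespace netlib {

// An older version kept around under an explicit name.
namespace v1 {
inline std::string version() { return "v1"; }
}  // namespace v1

// The current version. `inline` makes its members visible as netlib::...,
// while the version name still becomes part of each mangled symbol.
inline namespace v2 {
inline std::string version() { return "v2"; }
}  // namespace v2

}  // namespace netlib

// Unqualified-by-version calls pick the inline namespace.
inline void check_default_version() {
    assert(netlib::version() == "v2");
    assert(netlib::v1::version() == "v1");
}
```

Since `netlib::v1::version` and `netlib::v2::version` are distinct symbols, code built against either version can be linked into the same process without clashing.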

Install:

conan install --requires=boost/1.91.0

vcpkg install boost

Links:

boost.org
u/boostlibs — 3 days ago
▲ 37 r/cpp

Clang Lifetime Safety Doc Update

Intro:

Clang Lifetime Safety Analysis is a C++ language extension which warns about potential dangling pointer defects in code. The analysis aims to detect when a pointer, reference or view type (such as std::string_view) refers to an object that is no longer alive, a condition that leads to use-after-free bugs and security vulnerabilities. Common examples include pointers to stack variables that have gone out of scope, pointers to heap objects that have been freed, fields holding views to stack-allocated objects (dangling-field), returning pointers/references to stack variables (return stack address), or iterators into container elements invalidated by container operations (e.g., std::vector::push_back).

The analysis design is inspired by Polonius, the Rust borrow checker, but adapted to C++ idioms and constraints, such as the lack of exclusivity enforcement (alias-xor-mutability). Further details on the analysis method can be found in the RFC on Discourse.

This is compile-time analysis; there is no run-time overhead. It tracks pointer validity through intra-procedural data-flow analysis. While it does not require lifetime annotations to get started, in their absence, the analysis treats function calls optimistically, assuming no lifetime effects, thereby potentially missing dangling pointer issues. As more functions are annotated with attributes like clang::lifetimebound, gsl::Owner, and gsl::Pointer, the analysis can see through these lifetime contracts and enforce lifetime safety at call sites with higher accuracy. This approach supports gradual adoption in existing codebases.
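
As a small illustration of the annotation-based contracts mentioned above, here is a sketch using the `clang::lifetimebound` attribute on a hypothetical helper, `first_word`. The macro name is made up for this example, and the diagnostics you actually get depend on the Clang version and enabled warnings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// [[clang::lifetimebound]] is a Clang attribute; other compilers get an
// empty macro so the sketch still builds.
#if defined(__clang__)
#define SKETCH_LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define SKETCH_LIFETIMEBOUND
#endif

// Hypothetical helper: the returned view points into `text`, so it must not
// outlive the string passed in. The annotation states that contract.
inline std::string_view first_word(const std::string& text SKETCH_LIFETIMEBOUND) {
    const std::size_t end = text.find(' ');
    return std::string_view(text).substr(0, end);
}

// Safe use: the owner `text` outlives the view.
inline std::size_t first_word_length(const std::string& text) {
    const std::string_view word = first_word(text);
    assert(word.size() <= text.size());
    return word.size();
}

// Unsafe use, left as a comment on purpose:
//   std::string_view bad = first_word(std::string("hello world"));
// The temporary string dies at the end of that statement, so `bad` would
// dangle. With the annotation in place, Clang can flag this at the call site.
```

Without the annotation, the call is treated optimistically, as the text above describes, and the dangling view may go unreported.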

clang.llvm.org
u/ContDiArco — 3 days ago