u/mxroute

▲ 18 r/mxroute

Fixing stability on per-server Roundcube instances

As many of you have no doubt noticed, the servername.mxrouting.net/webmail instances go down every time someone adds a domain, every time someone requests an SSL certificate, every time the system renews someone's SSL certificate. This is an old problem that has reached critical mass with our growth this year. Basically, Apache config reloads cause brief outages. With DirectAdmin still running as our backend (for now), every user has their own httpd.conf include that has to be loaded on every config reload. The more users we have, the longer that takes. The longer it takes, the more likely you are to notice the webmail going down.

Ultimately, our plan isn't to scale these webmail instances. Our plan is to replace them. But right now they drive your custom webmail.yourdomain.tld URLs, and they serve the one-click webmail login from the control panel.

Tonight, all servers are being upgraded to LiteSpeed to resolve this.

status.mxroute.com
u/mxroute — 9 days ago

fable-mode, a slash command to enable on-demand Fable-like behavior in Claude Code

Warning: This will burn through tokens. This is nothing more than a rigor+style overlay for your own Claude Code sessions. It brings some of the rigor and voice you may have appreciated in Fable 5 to Opus 4.8.

I know there are at least a few of these around, but I much prefer mine, so I'm just sharing. Feel free to hate it. This enables "/fable-mode" in your Claude Code sessions to emulate some of the behavior you may have enjoyed with Fable 5. This was of course created by asking Opus 4.8 to review all session history with Fable 5 (the entire weekly limit for a Max 25 plan) and turn it into an instruction manual for Opus 4.8. That does not include the reasoning, only my prompt(s) and its responses. It does not add any capability that Opus does not already possess; it guides its capabilities just like any other prompt (just heavily in the direction of how Fable 5 acted, and I wouldn't dismiss that too quickly as cosmetic). It was tested against the same benchmark project that I had Fable 5 working on before it was disabled, and for my purposes the results were fairly impressive. I especially liked that it seemed to smooth over the process of handing a Fable project to Opus, where previously Opus just seemed to miss everything good about what Fable had been doing.

Context: My personal benchmark for every new model is to ask Claude to build the webmail app of my dreams all on its own from one prompt, and deploy it to a cloud server that was created exclusively for the project. I refuse to answer almost any question it asks, telling it instead to "Make the decision that you feel provides the best user experience within the definitions and constraints that were given in the initial prompt." This is not how I recommend production projects be completed. It is a benchmark chosen specifically because it is the kind of project where I am personally best prepared to quickly identify the point of failure. My requirements and constraints are ridiculous, and failure is built in (though it's never told where the inherent failure will be). The benchmark isn't whether or not it succeeds; it's about observing how far it gets before it creates something worthy of deletion. Fable went absurdly far in this benchmark compared to the others before it, and it was taken away before I could observe the point where it failed. Some of the references in the .md files do relate to that specific benchmark case and my messages to it, but were left in place because they were valid examples of behavior.

github.com
u/mxroute — 17 days ago
▲ 23 r/mxroute

MXroute's road to feature-complete

I've been a bit quiet here this past week. I've been switching between the items listed here as I work on marking everything here as deployed: https://docs.mxroute.com/docs/changelog/mxroute-4.1.html

As I get frustrated with one project, I simply move to another and come back to it later. One finger dipped into each project all of the time keeps things interesting, but it also makes progress appear slow, since no single project gets done quickly. So I wanted to update you on these.

  1. CalDAV/CardDAV Performance and Reliability

Of course, posted about that recently. Done. I can verify better experiences across the board from the logs.

  2. AI Spam Filtering

I'm still struggling to get what I want with this, but haven't given up. Just running emails through a local LLM is easy, and I like that. But I still want everyone to have the benefit of private AI spam filtering without any additional cost to any customer. I'm on a path, the concept is laid out, there are just several points in the plan where I have strong decisions to make and I want to explore the choices before I finalize them all.

  3. SMTP Logs in the Control Panel

This was almost done. It was about a day away from shipping to prod. The problem is that it timed out on accounts that receive a ton of email, and it was slow. In the past, I might have said "Well it's v1, ship it and come back to it." I really want to get this right. The whole process was canned and started over. The backend is being tested for internal automations, and any day now the frontend work will begin again.

  4. Self-Service Spam Filter Settings

This is low-hanging fruit, but I'm on the fence about when to finalize it. The AI spam filtering decisions could open up some dramatic changes to this, and I'd hate to introduce everyone to a new UI only to pull it back later and say "You know what, you don't need all of this anymore because of this other thing that I just deployed." It feels sloppy. Could be finished in one afternoon, holding for now.

  5. User-Configurable Spam Folder

This is also low-hanging fruit now. The logic is down, the pieces are in place. But they could be altered by AI spam filtering, self-service spam filter settings, and the in-house webmail client.

  6. Simplified Default Folders

Holding out for AI spam filtering, self-service spam filter settings, and the in-house webmail client.

  7. In-House Webmail Client

This is a really big one, and I don't want to release something basic and then iterate. I want you to fall in love with this on day one. Almost everything about MXroute hinges on this one release. It determines so many decisions. I'm being far too much of a perfectionist on it. But it is in progress.

reddit.com
u/mxroute — 18 days ago
▲ 24 r/mxroute

Boring but nice: CalDAV/CardDAV improved

This is one of those small things that just might improve someone's day. CalDAV/CardDAV will now perform faster and more reliably as of today. We have implemented a login caching mechanism that takes your submitted password + our private key to create a hash that gets stored for 60 seconds. If you submit additional login requests to dav[.]mxroute[.]com within those 60 seconds, and the password you submit creates the same hash, we simply skip the backend IMAP login verification that we otherwise perform to confirm your authentication.
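For anyone curious what a cache like this can look like, here is a minimal sketch, not our actual code. The post only says "password + private key makes a hash that lives for 60 seconds"; the use of HMAC-SHA256, the per-username in-memory dictionary, and every name below (`LoginCache`, `backend_check`, `authenticate`) are assumptions made for illustration.

```python
import hashlib
import hmac
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 60  # the 60-second window described in the post


class LoginCache:
    """Hypothetical short-lived login cache in front of a slow backend check."""

    def __init__(
        self,
        secret_key: bytes,
        backend_check: Callable[[str, str], bool],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._key = secret_key              # stands in for the "private key"
        self._backend_check = backend_check  # stands in for the IMAP login check
        self._clock = clock
        # username -> (keyed hash of the credentials, expiry time)
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def _digest(self, username: str, password: str) -> bytes:
        # Assumption: HMAC-SHA256 over username and password with the server key.
        message = username.encode() + b"\0" + password.encode()
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def authenticate(self, username: str, password: str) -> bool:
        now = self._clock()
        digest = self._digest(username, password)
        cached = self._entries.get(username)
        if cached is not None:
            cached_digest, expires_at = cached
            # Same hash inside the window: skip the backend entirely.
            if now < expires_at and hmac.compare_digest(cached_digest, digest):
                return True
        # Cache miss, expired entry, or different password: ask the backend.
        if not self._backend_check(username, password):
            return False
        self._entries[username] = (digest, now + CACHE_TTL_SECONDS)
        return True
```

The plaintext password is never stored, only the keyed hash, and a wrong password never matches the cached hash, so it always falls through to the real backend check.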

Simple, as secure as something like this can/should be, no less than a 50% increase in performance, and it should almost entirely resolve any reliability issues that you've had with it up until today.

reddit.com
u/mxroute — 21 days ago
▲ 16 r/mxroute

Random act of kindness: Dear scammers,

You should never, ever, teach scammers how to do their job better. But this one I just hit one too many times as of today. Please pass on this message to your local village idiot:

If the scale of your scam is easily measured by inbound mail, and the scale of your scam operation is large, don't forward it all to Gmail. Seriously. I have graphs. When you forward 1600 emails in an hour, all to the same Google-hosted address, it's my job to make sure nothing went wrong. That's when you get caught. Exim logs are more than sufficient to identify a scam operation.

Stop it.

reddit.com
u/mxroute — 23 days ago
▲ 15 r/mxroute

New scam album dropped

The beats suck and the lyrics are 100% AI. Endless free service through repeat refund requests - denied. You’ve been warned.

trustpilot.com
u/mxroute — 28 days ago
▲ 14 r/mxroute

Warmup Is How You Spend Everyone Else's Reputation

Let's get this out of the way once and for all. The next time someone says "I disagree that warmup is a big deal" this is my answer.

blog.mxroute.com
u/mxroute — 1 month ago

New exim vuln dropped: CVE-2026-48840

It means nothing to us. It might have if I'd carried on with a project that I scrapped in the past, but it doesn't.

reddit.com
u/mxroute — 1 month ago
▲ 57 r/mxroute

MXroute has been an unwilling participant in a gigantic scam for years

A good while back I added a forbidden use case to our policy that was directly targeting this scam. But back then, it had no name that I was aware of. The forbidden use case policy line simply reads:

"Chinese counterfeit ecommerce websites"

I really didn't know what else to call it. They were always from China, and the patterns were always the same.

Tonight I uncovered another leg of this operation that quietly existed on our service for 5 years. It's difficult to track because they use one central domain to operate as their email sender/support for thousands of websites at once, and the very moment that domain gets reported as a scam somewhere, they throw it out of commission and rotate it. So by the time I recognize what they're doing, they've been operating on our platform for years. It is incredibly frustrating.

However, at this point I've identified key indicators that will allow me to catch them. MXroute will no longer be a participant in these operations.

It feels like every day I'm writing a warning to someone new. The number of people who want to make us participants in spam and scam operations only grows larger every year. Yet here we are. Another day, another warning. I hope they see this, and I hope anyone following in their footsteps sees this. My kids are fed, clothed, and sheltered. I don't need dirty money.

Sometimes people wonder why policy lines are vague, and this is exactly why. Because every day is a brand new adventure, and at some point an overly specific policy starts to look like a book that names half of the world's population.

srlabs.de
u/mxroute — 1 month ago
▲ 17 r/mxroute

Growth = finding out why everything sucks (yesterday it was exim)

I always feel the need to say: This isn't "look how unreliable MXroute is" but instead it's "we don't experience problems without talking about them publicly." No one talks to their customers about this stuff. Just so happens my nickname in high school was "no one." Don't fact check me on that.

Yesterday the SMTP server on the glacier[.]mxrouting[.]net server went sideways for about 41 minutes. The cause looks like an Exim bug worth sharing in case any other mail server admins have been seeing weird intermittent panics they can't explain.

The error in the logs was "bad internal_store_malloc request (2147483632 bytes)" repeating about 14 million times. Same exact byte value every time, which is suspicious enough to dig into. That number is what you get when Exim's internal block-size counter ("store_block_order") hits 31. The counter ratchets up over the lifetime of a daemon every time a pool needs a new block. The only thing that brings it down is "store_reset," floored at 13. If you've got any pool where alloc beats free over time, the counter creeps up and eventually starts refusing every allocation in that pool.
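The logged byte count lines up with order 31 if each block request is a power of two minus a small fixed overhead. Here is a quick sketch of that arithmetic. The 16-byte overhead is an assumption inferred from the logged number itself, not something read out of Exim's source, and the function name is made up for illustration.

```python
# Assumption: 16 bytes of per-block overhead, inferred from the logged value
# (2**31 - 2147483632 == 16), not taken from Exim's source code.
ASSUMED_BLOCK_OVERHEAD_BYTES = 16

LOGGED_BYTES = 2147483632  # the value quoted from the log in the post


def request_size_for_order(order: int) -> int:
    """Bytes requested for a block at the given power-of-two order."""
    return (1 << order) - ASSUMED_BLOCK_OVERHEAD_BYTES


# Search upward from the floor of 13 mentioned in the post for any order
# whose request size equals the logged value.
matching_orders = [
    order for order in range(13, 40)
    if request_size_for_order(order) == LOGGED_BYTES
]
print(matching_orders)  # prints [31]
```

Only one order matches, which fits the log showing the same exact byte value every time.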

This is technically Exim Bug 3047, filed January 2024. The fix that shipped in 4.97.1-5 patched the one allocation path that was known to trigger it (regex matches in "check_dir_size"). Underlying counter is still uncapped though, and any other pattern that pumps it up produces the same panic. I checked the current master branch and it has the same uncapped increment.

We were on 4.99.1. Upgrading to upstream 4.99.3 wouldn't help (store.c is identical). Of all our servers on the same build, only 2 exploded, 1 was leaking slowly, and the rest were clean. So workload clearly matters and I don't yet know what specifically on the affected hosts is pumping the counter up. Still digging on that part.

We're looking at a local patch to cap the counter at order 24 (16MB max block, well below the danger zone). Until then, we'll monitor and force some quick restarts here and there (highly unlikely you'll notice, no one ever notices our intentional restarts).

If anyone else has seen intermittent failure on long-uptime Exim daemons with no obvious cause, while the service technically remains online, check your paniclog for "internal_store_malloc." A daemon restart fixes it instantly, which is probably why it keeps getting dismissed.
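If you'd rather not eyeball the file, a rough sketch like this one can tally the panics for you. The only thing taken from our logs is the message text itself; the `<timestamp>` prefix in the sample lines is a made-up placeholder, and the function name is hypothetical. Where your paniclog lives depends on your install, so no path is assumed here.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the message quoted in the post and captures the byte count.
PANIC_RE = re.compile(r"bad internal_store_malloc request \((\d+) bytes\)")


def summarize_store_panics(lines: Iterable[str]) -> Counter:
    """Count internal_store_malloc panics, grouped by requested byte size."""
    sizes: Counter = Counter()
    for line in lines:
        match = PANIC_RE.search(line)
        if match:
            sizes[int(match.group(1))] += 1
    return sizes


# Made-up sample lines; real paniclog lines carry their own prefixes.
sample_lines = [
    "<timestamp> bad internal_store_malloc request (2147483632 bytes)",
    "<timestamp> some unrelated panic message",
    "<timestamp> bad internal_store_malloc request (2147483632 bytes)",
]

summary = summarize_store_panics(sample_lines)
for size, count in summary.most_common():
    print(f"{count} panic(s) requesting {size} bytes")
```

To run it against a real log, pass an open file handle instead of `sample_lines`. One byte value dominating the tally is the same pattern we saw.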

TLDR version: Fuck you, Murphy.

reddit.com
u/mxroute — 1 month ago
▲ 16 r/mxroute

Another day, another lesson with Dovecot

Almost all remaining inconsistent behavior with IMAP on our servers seems to come down to one thing:

Certain user actions cause DirectAdmin to reload the Dovecot config, and it appears to have no rate limit (it's closed source, can't look) on how often it will do that.

This caused a mild but noteworthy set of complaints about the fusion[.]mxrouting[.]net server today. This is another case where our recent growth caused us to run into problems that we never had before. Some things you just can't predict, especially when relying on closed source software (can't wait to pull it from the fleet).

So as of today we're starting to roll out, first on the servers that have shown evidence of seeing this (fusion, blizzard, chocobo), a blocker for DA. We're taking over the reload process in systemd for Dovecot and enforcing no config reloads more than once per 60 seconds. This should eliminate all currently remaining issues with IMAP performance and reliability, though it may mean that your vanity hostnames for IMAP/POP take up to 60 seconds to go live.
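The once-per-60-seconds rule is a plain throttle with a pending flag. The real thing lives in systemd, so treat this Python version as an illustration of the logic only; the class and method names are hypothetical.

```python
import time
from typing import Callable, Optional

MIN_INTERVAL_SECONDS = 60  # the limit described in the post


class ReloadThrottle:
    """Hypothetical throttle: at most one reload per interval, none lost."""

    def __init__(
        self,
        do_reload: Callable[[], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._do_reload = do_reload  # stands in for the real reload action
        self._clock = clock
        self._last_reload: Optional[float] = None
        self._pending = False

    def request(self) -> bool:
        """Record a reload request; run it now only if the interval allows."""
        self._pending = True
        return self.poll()

    def poll(self) -> bool:
        """Run a pending reload if enough time has passed. True if it ran."""
        if not self._pending:
            return False
        now = self._clock()
        if (
            self._last_reload is not None
            and now - self._last_reload < MIN_INTERVAL_SECONDS
        ):
            return False  # too soon; the request stays pending
        self._do_reload()
        self._last_reload = now
        self._pending = False
        return True
```

Requests that arrive inside the window are collapsed into a single deferred reload, which is where the "up to 60 seconds" delay for new vanity hostnames comes from. Something still has to call `poll()` periodically to flush the pending reload.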

reddit.com
u/mxroute — 1 month ago
▲ 14 r/mxroute

Credential stuffing on the rise

In the last 3 days we’ve seen an increase in the use of credential stuffing scripts that are successfully hitting email accounts. There appears to be no correlation to recent growth, and the only pattern appears to be the use of these scripts (some are on GitHub, some are private, but we’ve seen all of them before). There is no server, local part, or industry correlation among them.

If I had to guess with what I have in front of me, there must have been a noteworthy database compromise somewhere recently. Something that would attract users from all parts of the world with no industry connection.

This is a friendly reminder not to reuse passwords. Every password should be unique; use a password manager to remember them. Unless of course the account doesn’t matter at all, in which case go ahead and use the same password you used in high school for it. The security should always match the value and risk associated with what’s behind it. Your email account should be on the higher end, because we’re not going to put up with you being compromised every 2 days for 6 months.

reddit.com
u/mxroute — 1 month ago

CVE-2026-45185, if it comes up.

If anyone sees any report of our servers being vulnerable to CVE-2026-45185, that is true going by Exim version, but we don't use GnuTLS, so it's not a problem we're addressing on its own. We'll let that get patched out by a vendor upgrade later, since it isn't relevant to us.

reddit.com
u/mxroute — 1 month ago
▲ 23 r/mxroute

The difference between a mistake and "literally the reason you signed up for MXroute"

Regardless of which one it is, "cold email" (spam) is not allowed on MXroute. I'll say it every day. But there is sometimes a difference between "signed up for MXroute to send cold email (spam)" and "established customer who has other qualities." It's a judgement call. There are no easily defined criteria for it; it involves the damages caused, the scope and severity of the event, etc.

Just another warning to the "cold email" crowd. Keep 'em coming.

u/mxroute — 2 months ago
▲ 26 r/mxroute

The era of SMTP DDOS has returned

Hey friends,

Several boxes across our fleet are seeing slightly decreased performance as we deal with a DDOS attack targeting our SMTP servers. Two of our older cPanel-based boxes got hit so hard they went down earlier, just because I hadn't adapted our mitigation techniques for them. The rest of the boxes are either seeing no issues or mild performance issues. The mild performance issues come from how careful I was in the mitigation to not block traffic from undeserving customer IPs. I've given them a bit too long of a leash before they get blocked, and in preparation for one of the new features (proper client-facing email log access), our new log handler is taking a bit of a beating and capping CPU cores.

Overall impact is none to mild, but mild is still worth reporting. Got it all under control, just an FYI thing.

reddit.com
u/mxroute — 2 months ago