u/Bigolbagocats

Disclosure: I work at Cloudmersive as a technical writer, and the code below uses our SDK.

I’m curious how folks in this community are handling form spam in practice these days. Are the standard solutions (e.g. reCAPTCHA, honeypots, Akismet) actually covering you, or are you still seeing a lot of spam get through?

While documenting this stuff, I’ve noticed that most of these approaches check *how* a form was submitted rather than *what* was submitted. For example, if a human types “Hi, I can offer you great SEO services for only $99/month” into a sales contact form, it goes straight through reCAPTCHA because a human submitted it. The API I’ve been documenting reads the field values and classifies them against configurable categories. For that example, the request would look like:

{
  "InputFormFields": [
    {
      "FieldTitle": "Message",
      "FieldValue": "Hi, I can offer you great SEO services for only $99/month"
    }
  ],
  "AllowUnsolicitedSales": false,
  "AllowPromotionalContent": false,
  "AllowPhishing": false
}
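
To make the mapping from form fields to that request concrete, here is a rough sketch that builds the same JSON shape from an array of submitted values in plain PHP. buildSpamCheckPayload is a hypothetical helper, not an SDK function, and defaulting every Allow* flag to false is my assumption.

```php
<?php
// Hypothetical helper (not part of any SDK): turns submitted form values
// into the request shape shown above.
function buildSpamCheckPayload(array $formFields, array $policy = []): array
{
    $inputFields = [];
    foreach ($formFields as $title => $value) {
        // Skip non-scalar values, e.g. nested arrays from name="x[]" inputs
        if (!is_scalar($value)) {
            continue;
        }
        $inputFields[] = [
            'FieldTitle' => (string) $title,
            'FieldValue' => trim((string) $value),
        ];
    }

    // Assumption: any category not explicitly allowed is disallowed
    return [
        'InputFormFields'         => $inputFields,
        'AllowUnsolicitedSales'   => $policy['AllowUnsolicitedSales'] ?? false,
        'AllowPromotionalContent' => $policy['AllowPromotionalContent'] ?? false,
        'AllowPhishing'           => $policy['AllowPhishing'] ?? false,
    ];
}

// In a real handler you would pass the relevant $_POST fields instead
$payload = buildSpamCheckPayload([
    'Message' => ' Hi, I can offer you great SEO services for only $99/month ',
]);

echo json_encode($payload, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Passing only the fields you care about (rather than all of $_POST) keeps things like CSRF tokens and hidden inputs out of the classification.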

And the response would come back like:

{
  "CleanResult": false,
  "SpamRiskLevel": 0.92,
  "ContainsSpam": true,
  "ContainsUnsolicitedSales": true,
  "ContainsPromotionalContent": true,
  "ContainsPhishingAttempt": false,
  "AnalysisRationale": "Message contains unsolicited sales pitch and promotional pricing"
}

And the PHP integration would look something like this:

composer require cloudmersive/cloudmersive_spam_api_client

<?php
require_once(__DIR__ . '/vendor/autoload.php');

// Configure API key authorization
$config = Swagger\Client\Configuration::getDefaultConfiguration()
    ->setApiKey('Apikey', 'YOUR_API_KEY');

$apiInstance = new Swagger\Client\Api\SpamDetectionApi(
    new GuzzleHttp\Client(),
    $config
);

// Build the request body with your form fields and spam policy settings
$body = new \Swagger\Client\Model\SpamDetectionAdvancedFormSubmissionRequest();
//e.g. $body->setInputFormFields([['field_title' => 'Message', 'field_value' => $_POST['message'] ?? '']]);
//e.g. $body->setAllowUnsolicitedSales(false);
//e.g. $body->setAllowPhishing(false);

try {
    $result = $apiInstance->spamDetectFormSubmissionAdvancedPost($body);

    // CleanResult is false if spam was detected
    if (!$result->getCleanResult()) {
        // Handle flagged submission: log it, reject it, queue for review, etc.
        error_log('Spam detected: ' . $result->getAnalysisRationale());
    }

    print_r($result);
} catch (Exception $e) {
    echo 'Exception when calling SpamDetectionApi->spamDetectFormSubmissionAdvancedPost: ' . $e->getMessage() . PHP_EOL;
}
?>

The body example (the commented-out e.g. lines) reads the message straight from $_POST, since that’s probably the most realistic use case. Basically, you drop this wherever you’re currently processing submissions.
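
For the “handle flagged submission” step, here is a minimal sketch of one way to choose between accepting, rejecting, and queueing for review. It is not part of the SDK: triageSubmission and the 0.9 threshold are a name and a value I made up for illustration, and it assumes SpamRiskLevel runs from 0 to 1 with higher meaning riskier, which is what the sample response suggests. It works on the decoded JSON rather than the SDK result object.

```php
<?php
// Hypothetical triage helper. The threshold is an illustrative assumption,
// not vendor guidance; tune it against your own traffic.
function triageSubmission(array $response, float $rejectAt = 0.9): string
{
    // Fail safe: a response without CleanResult is treated as flagged
    $clean = (bool) ($response['CleanResult'] ?? false);
    // A flagged response without a score falls through to manual review
    $risk = (float) ($response['SpamRiskLevel'] ?? 0.0);

    if ($clean) {
        return 'accept';
    }

    // Only the highest-risk submissions are dropped outright; everything
    // else that was flagged goes to a human instead of being rejected
    return $risk >= $rejectAt ? 'reject' : 'review';
}

// Sample response from earlier in the post, trimmed to the fields used here
$sample = json_decode(
    '{"CleanResult": false, "SpamRiskLevel": 0.92, "ContainsUnsolicitedSales": true}',
    true
);

echo triageSubmission($sample), PHP_EOL; // reject
```

The point of the middle “review” band is that a borderline sales inquiry lands in a queue rather than disappearing.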

On the flip side, if you’re doing any kind of content-based filtering like this API does, how do you handle false positives? For instance, I’ve seen plenty of legitimate sales inquiries come through a contact form that look a lot like spam.

{
  "Successful": true,
  "SubDocuments": [
    {
      "StartPage": 0,
      "EndPage": 2,
      "DocumentDescription": "Driver's License - Jane Doe",
      "FileBytes": "..."
    },
    {
      "StartPage": 3,
      "EndPage": 6,
      "DocumentDescription": "Proof of Insurance - Policy #449201",
      "FileBytes": "..."
    }
  ]
}

dotnet add package Cloudmersive.APIClient.NETCore.DocumentAI --version 1.0.0

using System;
using System.IO;
using Cloudmersive.APIClient.NETCore.DocumentAI.Api;
using Cloudmersive.APIClient.NETCore.DocumentAI.Client;
using Cloudmersive.APIClient.NETCore.DocumentAI.Model;

namespace Example
{
    public class ExtractSplitExample
    {
        public void main()
        {
            Configuration.Default.AddApiKey("Apikey", "YOUR_API_KEY");

            var apiInstance = new ExtractApi();
            var inputFile = new FileStream("C:\\temp\\batch_upload.pdf", FileMode.Open);

            try
            {
                SplitDocumentResponse result = apiInstance.ExtractSplit("Advanced", inputFile);

                foreach (var doc in result.SubDocuments)
                {
                    Console.WriteLine($"{doc.DocumentDescription}: pp. {doc.StartPage}–{doc.EndPage}");
                    File.WriteAllBytes($"output_{doc.StartPage}.pdf", Convert.FromBase64String(doc.FileBytes));
                }
            }
            catch (Exception e)
            {
                Console.WriteLine($"Error: {e.Message}");
            }
        }
    }
}

How do you handle form spam in PHP?

Curious what this community thinks about non-generative AI in document pipelines