r/OutSystems

OutSystems teams: when AI enters the project, where does the training data come from?

Here's a question I think will matter more and more across OutSystems shops as Mentor and Data Fabric move from announcement to adoption.

If your AI / ML / data science team asked tomorrow for production-shape data from your OutSystems applications (real volumes, real customer patterns, real edge cases) to train or fine-tune a model or process, how long would it take them to get it?

A few things I'm trying to understand from teams actually doing this:

1) Where does the training data come from in practice? A Service Studio extension? Direct SQL Server / Oracle queries against the runtime DB? A timer that exports to a file? Something on the Forge? Mentor / Data Fabric integration?

2) For OutSystems entities specifically (with all the platform metadata, BPT process instance data, system tables), do you train on the user-data-only subset, or is the platform context part of the value?

3) Anonymization before training: who owns it? The OutSystems team? The data team? Both, in a handoff?

4) For ODC vs O11, does the AI access pattern look different given the operational model differences?

5) Is your AI / ML initiative moving forward steadily, or is it stalled because the data plumbing isn't ready?

My newsletter edition this week is the series finale, on whether low-code data infrastructure is ready for AI. Asking r/OutSystems before I extrapolate from generic patterns. The OutSystems-specific shape of this is what I want to learn.

u/thisisBrunoCosta — 4 days ago

Mendix teams: how do you actually anonymize production data for dev/acceptance?

Your Mendix team needs production-shape data in dev or acceptance to test properly. Real volumes, real distributions, real edge cases that synthetic test data won't surface.

But that data has personal information in it. GDPR doesn't care which environment it's sitting in.

How do you actually handle this in practice?

A few specific things I'm trying to understand:

1) Do you anonymize before the data leaves production, during transfer, or after it lands in the lower environment?

2) If you anonymize after, where does the un-anonymized export sit in the meantime, and is that intermediate storage in scope for your compliance audits?

3) For consistency (a name appearing in 47 records becoming the same fake name across all 47), are you using a tool, a Java action, a microflow that runs at first startup, or something else?

4) For referential integrity across the domain model, what breaks first when teams improvise?

5) For Mendix Cloud restores specifically, how are you handling the gap between "full backup restore" and "restored-then-anonymized"?
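
To make the consistency requirement in (3) concrete, here is a minimal Python sketch of what I mean. It assumes a keyed hash (HMAC-SHA256) picks the replacement from a fixed pool of fake names, so the same real name always maps to the same fake one. The key, the name pool, and the `pseudonymize` function are all hypothetical, and this is just one possible approach, not how any Mendix tool or Java action actually does it.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored outside the lower environment.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical pool of replacement names.
FAKE_NAMES = ["Alex Morgan", "Sam Rivera", "Jamie Chen", "Robin Patel", "Casey Novak"]


def pseudonymize(real_name: str) -> str:
    """Map a real name to a fake one; the same input always gives the same output."""
    # Normalize so casing and stray whitespace do not produce different fake names.
    normalized = real_name.strip().lower()
    digest = hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).digest()
    # Use the first 8 bytes of the digest to pick an entry from the pool.
    index = int.from_bytes(digest[:8], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]


# 47 records that all carry the same (made-up) customer name.
records = [{"id": i, "customer": "Maria Silva"} for i in range(47)]
anonymized = [{**r, "customer": pseudonymize(r["customer"])} for r in records]

# Every record ends up with the same replacement name.
distinct = {r["customer"] for r in anonymized}
```

With a pool this small, different real names will often collide on the same fake name, so a real setup would need a much larger pool or generated values. The point here is only the determinism across records.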

I'm not pitching anything. My next newsletter edition is on this topic and I'd rather hear how Mendix teams actually solve it before I extrapolate from generic patterns.

Have you found a clean approach, or is everyone improvising and hoping the auditor doesn't ask too many questions?

Thanks!

u/thisisBrunoCosta — 11 days ago