u/DueKitchen3102

The video is available in another subreddit:

https://www.reddit.com/r/LocalLLaMA/comments/1te93s3/rag_on_snapdragon_x2_laptop_200k_documents/

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:

• 𝐌𝐚𝐬𝐬𝐢𝐯𝐞 𝐝𝐨𝐜𝐮𝐦𝐞𝐧𝐭 𝐜𝐨𝐥𝐥𝐞𝐜𝐭𝐢𝐨𝐧: ~200,000 files being indexed (~100,000 completed in this run)

• 𝐋𝐨𝐰-𝐭𝐨𝐤𝐞𝐧 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥: only ~1200 retrieval tokens used in this experiment

• 𝐋𝐨𝐰-𝐦𝐞𝐦𝐨𝐫𝐲 𝐑𝐀𝐆: most data offloaded to disk with only a 128-shard active buffer

• 𝐅𝐚𝐬𝐭 𝐚𝐧𝐝 𝐚𝐜𝐜𝐮𝐫𝐚𝐭𝐞 𝐑𝐀𝐆 𝐩𝐞𝐫𝐟𝐨𝐫𝐦𝐚𝐧𝐜𝐞 𝐨𝐧-𝐝𝐞𝐯𝐢𝐜𝐞
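To make the "128-shard active buffer" idea concrete, here is a minimal sketch of one way such a buffer could work: every shard lives on disk, and only a bounded number stay in memory, with the least recently used shard evicted first. This is an illustration under assumptions, not VecML's actual implementation. The `ShardBuffer` class, the pickle file layout, and the LRU eviction policy are all hypothetical; only the capacity of 128 comes from the post.

```python
from collections import OrderedDict
import os
import pickle
import tempfile


class ShardBuffer:
    """Hypothetical sketch: keep at most `capacity` shards in memory."""

    def __init__(self, directory, capacity=128):
        self.directory = directory
        self.capacity = capacity
        self._active = OrderedDict()  # shard_id -> data, least recent first
        self.disk_reads = 0  # counts loads from disk (cache misses)

    def _path(self, shard_id):
        return os.path.join(self.directory, f"shard_{shard_id}.pkl")

    def put(self, shard_id, data):
        """Write a shard to disk and drop any stale in-memory copy."""
        with open(self._path(shard_id), "wb") as f:
            pickle.dump(data, f)
        self._active.pop(shard_id, None)

    def get(self, shard_id):
        """Return a shard, loading it from disk if it is not active."""
        if shard_id in self._active:
            self._active.move_to_end(shard_id)  # mark as most recently used
            return self._active[shard_id]
        with open(self._path(shard_id), "rb") as f:
            data = pickle.load(f)
        self.disk_reads += 1
        self._active[shard_id] = data
        if len(self._active) > self.capacity:
            self._active.popitem(last=False)  # evict least recently used
        return data

    def active_count(self):
        return len(self._active)
```

With this layout, memory use is bounded by the capacity rather than by the size of the collection, at the cost of a disk read whenever a query touches a shard that is not active.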

𝐁𝐞𝐡𝐢𝐧𝐝 𝐭𝐡𝐞 𝐬𝐜𝐞𝐧𝐞𝐬, 𝐕𝐞𝐜𝐌𝐋’𝐬 𝐚𝐥𝐥-𝐢𝐧-𝐨𝐧𝐞 𝐀𝐈 𝐝𝐚𝐭𝐚𝐛𝐚𝐬𝐞 𝐩𝐥𝐚𝐲𝐬 𝐚 𝐤𝐞𝐲 𝐫𝐨𝐥𝐞.

Enterprise-scale AI systems typically require multiple databases working together:
• Vector database
• Graph database
• Relational database
• Key-value store
• Search database
• Document database

We developed an in-house AI database platform that integrates the core functionality of all six systems into a unified architecture for enterprise AI and agent systems.

This enables joint optimization across indexing, retrieval, graph traversal, storage, and memory management, helping achieve low-token, low-memory, fast, and accurate AI systems on both cloud and AI-PC deployments.
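As a rough illustration of the low-token goal (about 1200 retrieval tokens in this run), the sketch below greedily packs ranked chunks into a fixed token budget. It is not the method used in the demo. The `pack_context` function is hypothetical, the chunks are assumed to arrive sorted best-first, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
def pack_context(ranked_chunks, max_tokens=1200, count_tokens=None):
    """Hypothetical sketch: keep the best-ranked chunks that fit the budget."""
    if count_tokens is None:
        # Assumption: whitespace word count as a crude proxy for tokens.
        count_tokens = lambda text: len(text.split())

    selected, used = [], 0
    for chunk in ranked_chunks:  # assumed sorted from most to least relevant
        cost = count_tokens(chunk)
        if used + cost > max_tokens:
            continue  # would overflow the budget; a later, smaller chunk may fit
        selected.append(chunk)
        used += cost
    return selected, used
```

Passing a real tokenizer's counting function as `count_tokens` would make the budget match what the language model actually sees.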

Qualcomm recently released the new 𝐒𝐧𝐚𝐩𝐝𝐫𝐚𝐠𝐨𝐧 𝐗2 𝐥𝐚𝐩𝐭𝐨𝐩 𝐜𝐡𝐢𝐩𝐬𝐞𝐭. I immediately ordered one: the ASUS Zenbook A16, a 16" 3K OLED touchscreen laptop with the Snapdragon X2 Elite Extreme (2026).

A few things I really like about this machine:

• 𝐄𝐱𝐭𝐫𝐞𝐦𝐞𝐥𝐲 𝐥𝐢𝐠𝐡𝐭. Recently, I carried it in one hand across Hong Kong Airport, from customs all the way to Gate G46, while it was still running programs before boarding. It felt like holding a big cell phone.

• 𝐕𝐞𝐫𝐲 𝐩𝐨𝐫𝐭𝐚𝐛𝐥𝐞 𝐩𝐨𝐰𝐞𝐫 𝐚𝐝𝐚𝐩𝐭𝐨𝐫. The adaptor is dramatically lighter than the heavy power brick required by RTX laptops. Nevertheless, its power draw still exceeds the in-flight charging limit on United.

• 𝐒𝐭𝐫𝐨𝐧𝐠 𝐍𝐏𝐔 𝐩𝐞𝐫𝐟𝐨𝐫𝐦𝐚𝐧𝐜𝐞. When the NPU is properly utilized, performance is strong: embedding/indexing speed reaches roughly 50% of that of an RTX 5060 laptop, in a much lighter and quieter form factor.

The attached video demonstrates VecML’s AI-PC software running on this laptop.

The demo shown here runs on a Snapdragon X2 Windows laptop. 𝐎𝐮𝐫 𝐦𝐚𝐜𝐎𝐒 𝐀𝐈-𝐏𝐂 𝐬𝐨𝐟𝐭𝐰𝐚𝐫𝐞 𝐢𝐬 𝐧𝐨𝐰 𝐨𝐩𝐞𝐧 𝐟𝐨𝐫 𝐜𝐨𝐧𝐭𝐫𝐨𝐥𝐥𝐞𝐝 𝐭𝐞𝐬𝐭𝐢𝐧𝐠.

RAG on Qualcomm's newest Snapdragon X2 Laptop, 200k documents
