Would a mostly automatic book scanner be useful in a library setting?

I’m trying to understand whether book scanning is a real library workflow problem or just a niche hobbyist problem.

I’ve seen libraries use overhead scanners for local history, special collections, interlibrary loan, patrons scanning personal material, and staff digitization projects. But I don’t know what the day-to-day pain actually looks like.

I’m working on an early non-destructive scanner prototype that turns one page at a time and captures automatically. The goal would be to reduce staff/patron babysitting, not to replace review or preservation judgment.

For librarians or library staff:

  1. Do patrons ask for book/document scanning often?
  2. Is staff time the bottleneck, or is equipment quality/software/review the bigger issue?
  3. Would automatic page turning be helpful, or would it be too risky for public/patron use?
  4. Would privacy/offline processing be a requirement?
  5. What would make a scanner practical for a public library: durability, easy training, low maintenance, accessibility, export formats, price?

I’m not trying to sell anything here. I’m trying to understand whether this belongs in libraries at all, and what would make it useful rather than another device staff have to babysit.

reddit.com
u/adldotori — 20 hours ago

What would a sane archive workflow for digitizing bound books preserve?

I’m not building an AI notes app or another cloud wrapper. I’m trying to understand the capture/archive side of physical books.

For owned, public-domain, or permissioned bound material, what would you want a proper digitization workflow to preserve?

My rough guess:
- raw page images
- processed/cropped page images
- searchable PDF
- OCR text
- metadata
- checksums
- logs for missing/duplicate pages
- manual review notes
- maybe a folder structure that can survive outside any app
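
A minimal sketch of the checksum and missing/duplicate-page items above, not a definitive implementation. It assumes a hypothetical naming convention (`page_0001.tif`, `page_0002.tif`, ...) and a known expected page count; `audit_pages` and `write_manifest` are illustrative names, not part of any existing tool. It only flags byte-identical files as duplicates, so a page captured twice as two different exposures would still need manual review.

```python
import hashlib
import re
from pathlib import Path

# Assumed (hypothetical) file naming: page_0001.tif, page_0002.tif, ...
PAGE_RE = re.compile(r"page_(\d+)\.\w+$")


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large raw images are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def audit_pages(raw_dir: Path, expected_pages: int) -> dict:
    """Checksum every page image and report missing or duplicated pages."""
    checksums = {}  # file name -> sha256 hex digest
    seen_pages = set()  # page numbers found in file names
    for path in sorted(raw_dir.iterdir()):
        match = PAGE_RE.search(path.name)
        if not path.is_file() or match is None:
            continue  # skip folders and files outside the naming convention
        checksums[path.name] = sha256_of(path)
        seen_pages.add(int(match.group(1)))

    missing = [n for n in range(1, expected_pages + 1) if n not in seen_pages]

    # Group file names by digest; any group larger than one is a duplicate.
    by_hash = {}
    for name, digest in checksums.items():
        by_hash.setdefault(digest, []).append(name)
    duplicates = sorted(names for names in by_hash.values() if len(names) > 1)

    return {"checksums": checksums, "missing": missing, "duplicates": duplicates}


def write_manifest(checksums: dict, out_path: Path) -> None:
    """Write a plain-text manifest ("<digest>  <name>") readable without any app."""
    lines = [f"{digest}  {name}" for name, digest in sorted(checksums.items())]
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Something like `report = audit_pages(Path("raw"), expected_pages=320)` followed by `write_manifest(report["checksums"], Path("manifest.sha256"))` would leave a plain-text record next to the images, which fits the "survive outside any app" goal.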

I’m working on an early non-destructive scanner prototype that turns one page at a time and captures automatically. The question I’m trying to answer is whether automatic page turning actually matters to people who archive books, or whether the hard part is still QC, OCR, metadata, and long-term file organization.

For people who have digitized books/manuals/old documents: where does the workflow really break?

u/adldotori — 20 hours ago

Do physical books break your PKM workflow too?

I’m working on a non-destructive book scanning workflow and trying to understand a very specific problem:

Web articles, PDFs, and Kindle highlights are relatively easy to get into Obsidian/Notion/Readwise. Physical books are where my PKM workflow breaks.

I’m trying to talk to people who have actually tried turning physical books into searchable notes, Markdown, highlights, or a personal knowledge base.

A few questions:

  1. Have you ever wanted to bring a physical book into your vault or second brain?
  2. Would you care more about full OCR, highlights/quotes, summaries, or clean page images?
  3. What would make the output “good enough” for your real workflow?
u/adldotori — 22 hours ago
▲ 11 r/zotero

How do you handle physical books or scanned chapters in Zotero?

Zotero handles clean PDFs beautifully. Physical books, scanned chapters, course packets, and older printed material are harder.

For researchers using Zotero:

  1. Do you keep scans in Zotero at all?

  2. What matters most: OCR text, page image quality, ISBN/DOI metadata, annotations, or citation fields?

  3. If a bound book becomes a searchable PDF, what metadata would make it useful rather than clutter?

I’m looking at non-destructive scanning for owned/public-domain/permissioned material and trying to understand the real research workflow.

u/adldotori — 1 day ago

Is scanning the reason physical books never make it into your vault?

I use Obsidian heavily, but physical books are still awkward.

I can manually type notes. I can photograph a few pages. I can run OCR. But for an actual book or even a full chapter, the capture process is enough friction that I usually just don’t do it.

Curious if others feel the same.

For physical books:

  1. Do you only capture your own notes/highlights, or do you ever want the source text too?

  2. Have you abandoned a book-to-Obsidian workflow because scanning was too tedious?

  3. Would mostly automated scanning change anything for you, or is manual note-making the point?

  4. How many pages/books would make automation worth caring about?

Only asking about owned/public-domain/permissioned material.

u/adldotori — 1 day ago

Would you use a way to extract highlights from physical books?

I’m researching a problem around physical books and reading workflows.

Kindle, Reader, and web highlights flow nicely into Readwise and Obsidian. But physical books still require manual typing, photos, or messy OCR. I’m exploring a non-destructive scanning workflow that could turn physical books into OCR text, quotes, highlights, or Markdown.

I’m curious:

  1. Do you read physical books but wish the highlights could end up in Readwise/Obsidian?

  2. Is full-book OCR valuable, or are highlights and quotes enough?

  3. What export format would you actually use?

u/adldotori — 1 day ago