|  PixLab  |  Developer Community, Solved Support Threads & API Knowledge Base


DOCSCAN API technical questions: supported document types, country coverage, and best upload method for production use?


Hello,

We are evaluating the PixLab DOCSCAN API for document scanning and structured ID extraction, and I had a few technical questions before integrating it into production.

From the documentation, it looks like DOCSCAN can scan and extract data from officially issued ID documents including passports, ID cards, driving licenses, visas, and even birth and death certificates. It also appears to support more than 11,094 document variants from over 200 countries and territories, including documents with and without MRZ. Extracted fields include full name, issuing country, document number, address, and expiry date, and it can crop the holder's face when configured properly. (PixLab Docs)

A few things I would like to confirm from people already using it:

  1. In real-world deployments, what document categories are working best today:

    • passports
    • national ID cards
    • residence permits
    • driving licenses
    • visas
    • birth or death certificates
  2. How reliable is the country coverage in practice for mixed international traffic? The docs mention support for more than 200 countries and territories and nearly all UN-recognized countries, which is impressive for onboarding flows.

  3. For automated ingestion, is POST multipart/form-data the recommended path over URL-based GET requests? The docs seem to prefer POST uploads generally, and for PDF workflows they recommend first converting PDFs to images via PDFTOIMG, then passing the returned image link to DOCSCAN.

  4. For teams processing user uploads at scale, are most people linking S3 so cropped faces and extracted data are stored directly in their own bucket instead of relying only on returned payloads? The docs explicitly recommend S3 integration for storage control.

Would appreciate any implementation notes from people using it for onboarding, age verification, travel-tech, or compliance workflows.


Accepted Solution


Yes, the DOCSCAN API documented at https://pixlab.io/id-scan-api/docscan looks especially strong for production ID workflows because it is not limited to one or two document classes. The documentation explicitly says it supports passports, ID cards, driving licenses, visas, and birth/death certificates, with over 11,094 supported document variants from more than 200 countries and territories, including both MRZ and non-MRZ documents.

That matters for users because real onboarding flows are messy. One user might upload a passport, another a citizen ID, another a residence permit, and another an older local document format. A broad document library reduces fallback handling and manual review pressure. It is especially helpful for:

  • international onboarding platforms
  • age verification products
  • travel workflows
  • regulated fintech and marketplace signups

From an integration standpoint, POST multipart upload is the safer default for new builds. PixLab’s own production guidance says to prefer POST and use multipart/form-data for uploads. GET by image URL is still useful when the image is already hosted or comes from a preprocessing stage, but direct multipart upload is usually the cleaner path for API clients, mobile backends, and cron-driven ingestion jobs.
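
As a rough illustration of the multipart path, here is a minimal standard-library sketch that builds the POST request without sending it. The endpoint URL and the form field names (`file`, `type`, `key`) are assumptions on my part, so check them against the DOCSCAN documentation before relying on them.

```python
import io
import urllib.request
import uuid

# Assumed endpoint; confirm against the DOCSCAN documentation.
DOCSCAN_URL = "https://api.pixlab.io/docscan"


def build_docscan_upload(image_bytes, filename, fields, url=DOCSCAN_URL):
    """Build (but do not send) a multipart/form-data POST request.

    `fields` holds plain text form fields, for example a document type
    and an API key. The field names used by callers are assumptions.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # One part per text field.
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        body.write(part.encode("utf-8"))

    # The file part; "file" is an assumed field name.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    body.write(file_header.encode("utf-8"))
    body.write(image_bytes)

    # Closing boundary terminates the multipart body.
    body.write(f"\r\n--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url,
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(request, timeout=30)` and decoding the JSON body. Keeping the builder separate from the network call makes the request easy to inspect in tests.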

For PDFs, the documented pattern is to convert the PDF to raw images first with PDFTOIMG, then feed the resulting image URL into DOCSCAN. That keeps the scanner focused on image-based extraction instead of mixing file conversion and OCR in one step.
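
The two-step PDF flow could be sketched roughly like this. The base URL, the endpoint paths, the query parameter names (`src`, `img`, `type`, `key`), and the `link` response field are all assumptions for illustration, not confirmed API details.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.pixlab.io"  # assumed base URL


def http_get_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def scan_pdf(pdf_url, api_key, doc_type, get_json=http_get_json):
    """Convert a hosted PDF to an image, then scan that image.

    `get_json` is injectable so the flow can be tested without network
    access. Parameter and field names below are assumptions.
    """
    # Step 1: convert the PDF to an image.
    convert_query = urllib.parse.urlencode({"src": pdf_url, "key": api_key})
    converted = get_json(f"{API_BASE}/pdftoimg?{convert_query}")

    image_url = converted.get("link")  # assumed response field
    if not image_url:
        raise RuntimeError(f"PDF conversion returned no image link: {converted}")

    # Step 2: pass the resulting image URL to the scanner.
    scan_query = urllib.parse.urlencode(
        {"img": image_url, "type": doc_type, "key": api_key}
    )
    return get_json(f"{API_BASE}/docscan?{scan_query}")
```

Failing fast when the conversion step returns no link avoids sending an empty image URL to the scanner and makes the failing stage obvious in logs.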

Another practical advantage is structured output. The docs say the response can include JSON fields like full name, issuing country, document number, address, and expiry date, plus automatic face extraction. That directly helps users because they can move from raw scans to machine-readable identity data without building custom parsers for every country and document type.
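
To show what consuming that structured output might look like, here is a small sketch that maps a response payload onto a typed record. The JSON key names (`fields`, `fullName`, `issuingCountry`, `documentNumber`, `address`, `dateOfExpiry`, `face_url`) and the ISO date format are hypothetical; adjust them to whatever the API actually returns.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class ScannedDocument:
    full_name: Optional[str]
    issuing_country: Optional[str]
    document_number: Optional[str]
    address: Optional[str]
    expiry_date: Optional[date]
    face_url: Optional[str]

    def is_expired(self, today):
        """True only when an expiry date is known and lies before `today`."""
        return self.expiry_date is not None and self.expiry_date < today


def parse_expiry(value):
    """Parse an assumed YYYY-MM-DD date string; return None if absent or malformed."""
    if not value:
        return None
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return None


def parse_scan(payload):
    """Map a response payload (hypothetical key names) onto ScannedDocument."""
    fields = payload.get("fields") or {}
    return ScannedDocument(
        full_name=fields.get("fullName"),
        issuing_country=fields.get("issuingCountry"),
        document_number=fields.get("documentNumber"),
        address=fields.get("address"),
        expiry_date=parse_expiry(fields.get("dateOfExpiry")),
        face_url=payload.get("face_url"),
    )
```

Treating every field as optional reflects that not every document type carries every field, for example an address on a passport, so downstream checks should handle missing values explicitly.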

If the deployment needs strict storage control, the S3 recommendation is worth following. PixLab explicitly recommends linking your own AWS S3 bucket so cropped images and extracted data are stored directly under your control.