Creating a SwiftUI Document Scanner with Smart Auto-Capture and PDF Export
This guide walks you through building a SwiftUI document scanner for iPhone and iPad using Dynamsoft Capture Vision. You'll learn how to implement live camera detection with quad stabilization, support manual capture as a fallback, import images from the gallery, and export your scans as PDF or JPEG files. The following questions and answers cover the entire development process from setup to export.
What is the core functionality of this scanner?
The app uses the device camera to automatically detect document boundaries and stabilize the detected quad for reliable capture. It includes a cooldown mechanism to prevent duplicate captures. Users can also manually capture, import existing images from the photo library, review and adjust individual pages, and finally export the collection as a multi-page PDF or as separate JPEGs. All processing is powered by Dynamsoft Capture Vision, which runs entirely on the device to ensure privacy and speed.
What are the required prerequisites?
To build this project you need a Mac with Xcode 16 or newer, a deployment target of iOS 16 or higher, and a valid Dynamsoft Capture Vision license key (a 30-day free trial is available). The app relies on the capture-vision-spm Swift package hosted on GitHub, so make sure Xcode is set to resolve package dependencies automatically.
How do you install and configure the SDK?
Add the Swift package Dynamsoft/capture-vision-spm to your Xcode project, requiring version 3.4.1200 or later. In your app's entry point (e.g., the App struct), import DynamsoftCaptureVisionBundle and call LicenseManager.initLicense("YOUR_KEY", verificationDelegate: nil). This demo uses a DocumentScannerStore as an environment object to share state across views. License initialization must run before any scanner view appears.
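As a rough illustration of the setup above, the entry point might look like the following sketch. The `initLicense` call and the `DocumentScannerStore` name come from the text; the store's `pages` property, `ScannerApp`, and `ContentView` are hypothetical placeholders, and the project must already link the capture-vision-spm package for this to build.

```swift
import SwiftUI
import DynamsoftCaptureVisionBundle

// Shared scan state. The `pages` property is an assumption for illustration;
// the demo's real store may hold different fields.
final class DocumentScannerStore: ObservableObject {
    @Published var pages: [UIImage] = []
}

// Hypothetical placeholder for the app's root view.
struct ContentView: View {
    @EnvironmentObject private var store: DocumentScannerStore

    var body: some View {
        Text("Pages scanned: \(store.pages.count)")
    }
}

@main
struct ScannerApp: App {
    @StateObject private var store = DocumentScannerStore()

    init() {
        // Runs once at launch, before any scanner view is created.
        // Replace "YOUR_KEY" with your own license key.
        LicenseManager.initLicense("YOUR_KEY", verificationDelegate: nil)
    }

    var body: some Scene {
        WindowGroup {
            ContentView()
                .environmentObject(store)
        }
    }
}
```

Passing nil as the delegate mirrors the call described above; supplying a delegate instead would let the app react to a failed license check.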
How do you stabilize document detection from the camera feed?
CameraScannerView wraps a UIViewControllerRepresentable that creates a CameraView, attaches a CameraEnhancer instance, and starts capturing with the DetectAndNormalizeDocument_Default template. The app records a new page only when the detected quad has remained stable for a short duration (typically a few hundred milliseconds). After each capture, a cooldown period prevents the same document rectangle from being captured again immediately. Together, these two checks reduce duplicate pages and improve the user experience.
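The stability window and cooldown can be sketched independently of the SDK. The example below is a minimal, hypothetical gate and not the demo's actual implementation: `Point`, `AutoCaptureGate`, and the default thresholds (12 units of corner drift, 0.4 s of stability, 2 s of cooldown) are all assumptions chosen for illustration. It expects the caller to pass the four corners of each detected quad in a consistent order along with a timestamp in seconds.

```swift
import Foundation

/// One corner of a detected quad, in preview or image coordinates.
struct Point {
    var x: Double
    var y: Double
}

/// Decides when a detected quad is steady enough to auto-capture.
struct AutoCaptureGate {
    let maxCornerDrift: Double            // how far a corner may move and still count as stable
    let requiredStableTime: TimeInterval  // how long the quad must stay put
    let cooldown: TimeInterval            // minimum gap between two captures

    private var referenceQuad: [Point]?
    private var stableSince: TimeInterval = 0
    private var lastCapture: TimeInterval = -Double.infinity

    init(maxCornerDrift: Double = 12,
         requiredStableTime: TimeInterval = 0.4,
         cooldown: TimeInterval = 2.0) {
        self.maxCornerDrift = maxCornerDrift
        self.requiredStableTime = requiredStableTime
        self.cooldown = cooldown
    }

    /// Feed one detection result per frame. Returns true when the caller should capture.
    mutating func shouldCapture(quad: [Point]?, at now: TimeInterval) -> Bool {
        // No usable detection: forget the reference so the window restarts.
        guard let quad = quad, quad.count == 4 else {
            referenceQuad = nil
            return false
        }
        // First sighting, or the quad moved too far: start a new stability window.
        guard let reference = referenceQuad, isClose(reference, quad) else {
            referenceQuad = quad
            stableSince = now
            return false
        }
        guard now - stableSince >= requiredStableTime else { return false }
        guard now - lastCapture >= cooldown else { return false }

        lastCapture = now
        referenceQuad = nil  // require a fresh stability window before the next capture
        return true
    }

    /// True when every corner stayed within `maxCornerDrift` of the reference quad.
    private func isClose(_ a: [Point], _ b: [Point]) -> Bool {
        zip(a, b).allSatisfy { pair in
            hypot(pair.0.x - pair.1.x, pair.0.y - pair.1.y) <= maxCornerDrift
        }
    }
}
```

Comparing each frame against the first quad of the window, rather than against the previous frame, means a slow continuous drift eventually resets the window instead of slipping through.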

Can users fall back to manual capture or import from gallery?
Yes. The interface provides a manual capture button that triggers an image capture immediately, regardless of automatic detection. Users can also tap an import button to open the photo picker (PHPickerViewController) and select existing images from their photo library. Both manually captured and imported images are added to the same scan set. This flexibility lets the user handle documents that auto-detection might miss or that are already stored on the device.
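One way to expose the photo picker to SwiftUI is a small UIViewControllerRepresentable wrapper. The sketch below is an assumption about how such a wrapper could look, not the demo's code: `GalleryPicker`, `ImageSlots`, and `onImagesPicked` are hypothetical names, and the example assumes the Swift 5 language mode.

```swift
import SwiftUI
import PhotosUI

/// Presents PHPickerViewController and hands back the selected images.
struct GalleryPicker: UIViewControllerRepresentable {
    /// Called on the main queue once loading finishes. The array is empty if the
    /// user cancelled. The caller is expected to dismiss the sheet here.
    var onImagesPicked: ([UIImage]) -> Void

    func makeUIViewController(context: Context) -> PHPickerViewController {
        var configuration = PHPickerConfiguration()
        configuration.filter = .images
        configuration.selectionLimit = 0  // 0 means no limit on the selection
        let picker = PHPickerViewController(configuration: configuration)
        picker.delegate = context.coordinator
        return picker
    }

    func updateUIViewController(_ uiViewController: PHPickerViewController, context: Context) {}

    func makeCoordinator() -> Coordinator {
        Coordinator(onImagesPicked: onImagesPicked)
    }

    /// Holds loaded images by index so the selection order is preserved.
    private final class ImageSlots {
        var images: [UIImage?]
        init(count: Int) { images = Array(repeating: nil, count: count) }
    }

    final class Coordinator: NSObject, PHPickerViewControllerDelegate {
        private let onImagesPicked: ([UIImage]) -> Void

        init(onImagesPicked: @escaping ([UIImage]) -> Void) {
            self.onImagesPicked = onImagesPicked
        }

        func picker(_ picker: PHPickerViewController, didFinishPicking results: [PHPickerResult]) {
            let providers = results
                .map(\.itemProvider)
                .filter { $0.canLoadObject(ofClass: UIImage.self) }
            let slots = ImageSlots(count: providers.count)
            let group = DispatchGroup()

            for (index, provider) in providers.enumerated() {
                group.enter()
                provider.loadObject(ofClass: UIImage.self) { object, _ in
                    // Hop to the main queue so all writes to `slots` are serialized.
                    DispatchQueue.main.async {
                        slots.images[index] = object as? UIImage
                        group.leave()
                    }
                }
            }

            // Fires after every load has completed (immediately if nothing was selected).
            group.notify(queue: .main) {
                self.onImagesPicked(slots.images.compactMap { $0 })
            }
        }
    }
}
```

Presented from a `.sheet`, the callback can append the images to the shared scan set and then clear the binding that drives the sheet.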
How are scanned pages reviewed, edited, and exported?
After capture, pages appear in a scrollable review grid. Tapping a page opens a detailed edit view where the user can adjust the crop region by dragging corner handles. The app applies deskewing and perspective correction automatically. From the review screen, users can delete unwanted pages or reorder them. Finally, they can export the entire set as a multi‑page PDF (using UIGraphicsPDFRenderer) or as individual JPEG images. Export uses the Dynamsoft‑normalized images for the best quality.
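The export step can be sketched with UIGraphicsPDFRenderer, as named above. This is a minimal example under the assumption that the normalized pages are already available as UIImage values; `makePDF(from:)` and `makeJPEGs(from:quality:)` are hypothetical helper names, and sizing each PDF page to its image is a design choice of this sketch rather than something the demo is known to do.

```swift
import UIKit

/// Renders each image onto its own PDF page, sized to match the image.
func makePDF(from pages: [UIImage]) -> Data {
    // The renderer needs default bounds; each page overrides them below.
    let defaultBounds = CGRect(x: 0, y: 0, width: 612, height: 792)
    let renderer = UIGraphicsPDFRenderer(bounds: defaultBounds)

    return renderer.pdfData { context in
        for page in pages {
            let pageBounds = CGRect(origin: .zero, size: page.size)
            context.beginPage(withBounds: pageBounds, pageInfo: [:])
            page.draw(in: pageBounds)
        }
    }
}

/// Encodes each page as a separate JPEG, skipping any page that fails to encode.
func makeJPEGs(from pages: [UIImage], quality: CGFloat = 0.9) -> [Data] {
    pages.compactMap { $0.jpegData(compressionQuality: quality) }
}
```

The resulting Data can be written to a temporary file and offered through the system share sheet or a file exporter.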
Does the app support both iPhone and iPad?
Yes, the SwiftUI interface adapts to different screen sizes. On iPad, the camera view and page review can be displayed side by side using SwiftUI's adaptive layout modifiers or a split view controller when needed. The Dynamsoft Capture Vision SDK works identically on both devices, and the export options (PDF/JPEG) are equally functional. The only difference is that on iPad the gallery picker and file save dialogs appear as popovers, while on iPhone they present modally.