PDF processing

The PDF samples demonstrate how to extract images from PDF files, apply corrections, and save the modified PDF. This is useful for improving scanned documents, photo albums, or any PDF containing images that would benefit from enhancement. Processing requires that the PDF content is "layered": the images must be stored as separate objects rather than flattened into a single layer. The samples use the PFCPDFImageIterator class to access and modify images within the PDF structure.

What these samples demonstrate

The PDFSample projects show how to use the PFCPDFImageIterator class to iterate through images in a PDF, correct each one, and write the results back to a new PDF file.

Key functions

A summary of key functions and purposes:

Function	Purpose
`PFCPDFImageIterator::load`	Open a PDF file for processing
`PFCPDFImageIterator::nextImage`	Get the next image from the PDF
`PFCPDFImageIterator::save`	Write the modified PDF to disk
`PFCPDFImage::imageFile`	Member that gives access to the image data for correction

C/C++ implementation

The following excerpts are from PDF_Sample.cpp.

Load the PDF file

Open the PDF file using the iterator. The load function returns a status code indicating success or failure.

#include <cstdio>   // printf

#include "PFCPDFImageIterator.h"
#include "PFCPDFImage.h"

PFCPDFImageIterator file;

// Open the PDF file
if (file.load(inputPath) != PFCPDFImageIterator::LoadStatus::Ok) {
    printf("Cannot load PDF: %s\n", inputPath);
    return 1;
}

Iterate and correct images

Loop through all images in the PDF using nextImage(). Each call returns a pointer to a PFCPDFImage object containing the unpacked image data, which you can correct like any other image. The loop ends when nextImage() returns a null pointer.

// Create correction engine
PFCENGINE *pEngine = PFC_CreateEngine();
PFC_LoadAIEngine(pEngine, 
    AI_SCENE_DETECTION | AI_CORRECTIONS | AI_COLOR | AI_FACEMESH, 
    binPath.c_str());

// Process each image in the PDF
while (PFCPDFImage* pdfImage = file.nextImage()) {
    // Access the image data
    PFCImageFile* imgFile = pdfImage->imageFile;
    
    // Set up PFCIMAGE struct
    PFCIMAGE im;
    im.width = imgFile->width;
    im.height = imgFile->height;
    im.stride = imgFile->stride;
    im.format = (PFCPIXELFORMAT)imgFile->pfcImageFormat();
    im.data = imgFile->raw_image;
    
    // Calculate correction profile
    PFCPARAM param;
    PFC_SetParam(param);
    PFCPROFILE pProfile = PFC_Calc(&im, NULL, pEngine, CALC_ALL, -1, NULL, NULL, NULL);
    
    // Apply corrections
    PFCAPPLYSTATUS status = PFC_Apply(&im, pEngine, pProfile, param, NULL);
    printf("Image corrected with status: %d\n", status);
    
    // Release profile (image data is modified in place)
    PFC_ReleaseProfile(pProfile);
}

PFC_DestroyEngine(pEngine);
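The width, height, stride, and data fields set in the loop above describe a raw pixel buffer. As a generic illustration of how such fields usually relate, the sketch below walks a buffer row by row using the stride. RawView, pixelAt, and invertInPlace are hypothetical names that are not part of the SDK, and the 3-bytes-per-pixel layout is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a raw image description (not an SDK type).
struct RawView {
    int width;           // pixels per row
    int height;          // number of rows
    int stride;          // bytes from the start of one row to the next
    int bytesPerPixel;   // assumed layout, e.g. 3 for 8-bit RGB
    std::uint8_t* data;  // first byte of the first row
};

// Returns a pointer to the first byte of pixel (x, y).
inline std::uint8_t* pixelAt(const RawView& v, int x, int y) {
    assert(x >= 0 && x < v.width && y >= 0 && y < v.height);
    return v.data + static_cast<std::size_t>(y) * v.stride
                  + static_cast<std::size_t>(x) * v.bytesPerPixel;
}

// Inverts every pixel byte in place and leaves row padding untouched.
inline void invertInPlace(const RawView& v) {
    for (int y = 0; y < v.height; ++y) {
        std::uint8_t* row = v.data + static_cast<std::size_t>(y) * v.stride;
        for (int i = 0; i < v.width * v.bytesPerPixel; ++i) {
            row[i] = static_cast<std::uint8_t>(255 - row[i]);
        }
    }
}
```

The point of the sketch is that rows are addressed by stride rather than by width times bytes per pixel, because a row may end with padding bytes.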

Save the modified PDF

After processing all images, save the PDF with the corrected images embedded.

if (file.save(outputPath) != PFCPDFImageIterator::SaveStatus::Ok) {
    printf("Cannot save PDF: %s\n", outputPath);
    return 1;
}

Complete workflow

The full workflow loads, processes, and saves in sequence.

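// 1. Load PDF and stop if it cannot be opened
PFCPDFImageIterator file;
if (file.load("input.pdf") != PFCPDFImageIterator::LoadStatus::Ok) {
    return 1;
}

// 2. Set up correction engine (binPath is a std::string, as in the excerpt above)
PFCENGINE *pEngine = PFC_CreateEngine();
PFC_LoadAIEngine(pEngine, AI_SCENE_DETECTION | AI_CORRECTIONS | AI_COLOR, binPath.c_str());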

// 3. Process all images
while (PFCPDFImage* pdfImage = file.nextImage()) {
    PFCImageFile* imgFile = pdfImage->imageFile;
    
    PFCIMAGE im;
    im.width = imgFile->width;
    im.height = imgFile->height;
    im.stride = imgFile->stride;
    im.format = (PFCPIXELFORMAT)imgFile->pfcImageFormat();
    im.data = imgFile->raw_image;
    
    PFCPARAM param;
    PFC_SetParam(param);
    PFCPROFILE pProfile = PFC_Calc(&im, NULL, pEngine, CALC_ALL, -1, NULL, NULL, NULL);
    PFC_Apply(&im, pEngine, pProfile, param, NULL);
    PFC_ReleaseProfile(pProfile);
}

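// 4. Clean up and save, reporting failure through the return code
PFC_DestroyEngine(pEngine);
if (file.save("output.pdf") != PFCPDFImageIterator::SaveStatus::Ok) {
    return 1;
}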
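If the workflow gains error paths that return between creating and destroying the engine, one option is to tie the cleanup to scope. The sketch below shows that pattern with std::unique_ptr and a custom deleter. FakeEngine, createEngine, destroyEngine, and runWorkflow are hypothetical stand-ins so that the example compiles on its own. With the SDK, the deleter would presumably call PFC_DestroyEngine instead, which is an assumption and not taken from the sample.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for an engine handle and its create/destroy pair.
struct FakeEngine { int id; };

static int g_liveEngines = 0;  // counts engines that have not been destroyed

inline FakeEngine* createEngine() {
    ++g_liveEngines;
    return new FakeEngine{1};
}

inline void destroyEngine(FakeEngine* e) {
    --g_liveEngines;
    delete e;
}

// Deleter that routes unique_ptr cleanup to the destroy function.
struct EngineDeleter {
    void operator()(FakeEngine* e) const { destroyEngine(e); }
};
using EnginePtr = std::unique_ptr<FakeEngine, EngineDeleter>;

// Simulates a workflow that may return early on an error.
inline int runWorkflow(bool failEarly) {
    EnginePtr engine(createEngine());
    if (failEarly) {
        return 1;  // the engine is still destroyed when 'engine' leaves scope
    }
    return 0;      // same cleanup on the normal path
}
```

Both return paths release the engine without an explicit destroy call, so adding another early return later does not introduce a leak.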

Important notes

The nextImage() function returns images one at a time. When you call nextImage() again, the previous image's memory may be reused. Process each image completely before moving to the next.
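To see why this matters, the sketch below mimics an iterator that overwrites a single internal image on every call, and shows that data needed later has to be copied before the next call. FakeImage, FakeIterator, and snapshotAll are hypothetical stand-ins written for this illustration. They model the reuse described above and are not the SDK classes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an image handed out by an iterator.
struct FakeImage {
    std::vector<std::uint8_t> pixels;
};

// Hypothetical iterator that reuses one internal image for every call.
class FakeIterator {
public:
    explicit FakeIterator(int count) : remaining_(count) {}

    // Returns a pointer to the internal image, or nullptr when exhausted.
    // The previous contents are overwritten on each call.
    FakeImage* nextImage() {
        if (remaining_ == 0) return nullptr;
        current_.pixels.assign(4, static_cast<std::uint8_t>(remaining_));
        --remaining_;
        return &current_;
    }

private:
    int remaining_;
    FakeImage current_;
};

// Takes a deep copy of each image so the data outlives later nextImage() calls.
inline std::vector<std::vector<std::uint8_t>> snapshotAll(FakeIterator& it) {
    std::vector<std::vector<std::uint8_t>> copies;
    while (FakeImage* img = it.nextImage()) {
        copies.push_back(img->pixels);  // copies the bytes, not the pointer
    }
    return copies;
}
```

Storing the returned pointers instead of copies would leave every entry pointing at the same object, holding only the last image's data.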

The corrections are applied in place to the image data. When you call save(), the PDF is written with all the corrected images.

Source files

You can find the code snippets in the following files:

Platform	Sample	Path
Linux/macOS	`PDFSample`	`Linux/PDFSample/PDF_Sample.cpp`
Windows	`PDFSample`	`Win/PDFSample/PDFSample.cpp`

PDF processing requires the PFCPDF library (libPFCPDF.so on Linux, PFCPDF.dll on Windows). Make sure this library is available in your build environment.

PFC-SDK Version 10.7.2.1269 built from 4fa849d8101945eea725a08dd0dae5101f090fa0 on 11-10-2025.
