Zikko's Resources


Counting the pages of a PDF document with PHP

There are several PDF Page Counter components out there, but none that works under PHP in an easy and free way. So I wrote this function, inspired by a number of articles presenting solutions (non-PHP ones) based on regular expressions.

It's a bit obsolete

Since writing this article, I've studied the PDF specification in more depth and written an article about extracting text from a PDF. The code there also counts the pages, possibly more accurately but also possibly more slowly, which is why I'm keeping this article here.

How does it work?

The assumption here is that each page object contains the string "/Type" followed by "/Page". The code allows at most one character (typically whitespace) between them, and "/Page" must not be followed by an "s", since "/Pages" marks a page tree node rather than a single page. Note that this assumption might not be completely bulletproof, but it's good enough for me. I wrote the function in such a way that the entire file is never loaded into memory at once, since PDF documents can be massive.
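To make the assumption concrete, here is a minimal sketch that applies the same idea to a small string already held in memory, using a regular expression. The helper name count_page_markers and the sample data are made up for illustration. Unlike the function in this article, it allows any amount of whitespace between the two names and needs the whole input in memory, so it only illustrates the pattern and is not meant as a replacement.

```php
<?php
// Hypothetical helper (not part of the original article): counts page
// markers in a string that is already fully loaded into memory.
function count_page_markers($data)
{
    // "/Type", optional whitespace, "/Page", and no "s" right after it
    // (an "s" would make it "/Pages", the page tree node).
    return preg_match_all('/\/Type\s*\/Page(?!s)/', $data);
}

// Made-up sample: one page tree node and two page objects.
$sample = "<< /Type /Pages /Count 2 >>\n"
        . "<< /Type /Page /Parent 1 0 R >>\n"
        . "<</Type/Page/Parent 1 0 R>>\n";

echo count_page_markers($sample), "\n"; // prints 2
```

The "/Type /Pages" entry is skipped by the negative look-ahead, and the two remaining markers are counted whether or not a space separates the names.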

How do I use it?

$pages = pdf_count_pages('my_pdf_document.pdf');

The code

function pdf_count_pages( $fn )
{
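    $buffer='';
    $chunk_size=1024*5;
    // Keep a few bytes of the previous chunk so that a marker split across
    // two reads is still found. "/Type /Page" is 11 bytes long, so an
    // overlap of only 10 bytes could miss a marker at the chunk boundary.
    $buffer_size = $chunk_size + 16;
    $count = 0;
    
    $f=fopen($fn,'rb');
    
    // Stop here if the file cannot be opened; feof() needs a valid handle.
    if( $f === false )
        return false;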
    
    while(!feof($f))
    {
        $buffer .= fread($f,$chunk_size);
        
        $buffer = substr($buffer,-$buffer_size);
        
        $check_pos = 0;
        
        while(true)
        {
            $type_pos = strpos( $buffer, '/Type', $check_pos );
            
            if( $type_pos === false )
                break;
            
            $page_pos = strpos( $buffer,'/Page', $type_pos );
            
            if( $page_pos === false )
                break;
                
            if( $page_pos+5 >= strlen($buffer) )
                break;
                
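            // Count only when at most one character separates "/Type" from
            // "/Page" and the name is not "/Pages" (the page tree node).
            if( $page_pos - $type_pos <= 6 && $buffer[$page_pos+5] != 's')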
            {
                $count++;
                $buffer = substr($buffer,$page_pos+5);
                $check_pos = 0;
            }
            else
                $check_pos = $type_pos + 5;
        }
    }
    
    fclose($f);
    
    return $count;
}
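
One way to try the function without a real document is to generate a file that only mimics the markers it looks for. The sketch below is an assumption-laden test aid, not part of the original article: make_fake_pdf is a hypothetical helper, and the file it writes is not a valid PDF. The padding pushes the markers to different offsets relative to the read chunks.

```php
<?php
// Hypothetical helper: writes a temporary file containing one "/Type /Pages"
// entry and $page_count "/Type /Page" entries. It is NOT a valid PDF.
function make_fake_pdf($page_count, $padding)
{
    $path = tempnam(sys_get_temp_dir(), 'pdf');
    $data = "<< /Type /Pages /Count $page_count >>\n";

    for ($i = 0; $i < $page_count; $i++) {
        // Filler bytes move each marker to a different file offset.
        $data .= str_repeat('x', $padding) . "\n<< /Type /Page >>\n";
    }

    file_put_contents($path, $data);
    return $path;
}

$path = make_fake_pdf(3, 5000);

// Only call the counter if it has been defined in the same script.
if (function_exists('pdf_count_pages')) {
    echo pdf_count_pages($path), "\n";
}

unlink($path);
```

With pdf_count_pages defined in the same script, the expected count for this file is 3. Varying the padding is a cheap way to check how markers near a chunk boundary are handled.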


All content on these pages may be used, copied and modified freely. Questions or comments may be sent to . Also visit zikko.se.