Thursday, April 23, 2009

Making PDFs using PHP

Having spent several days using an HTML->PDF conversion program (DOMPDF), I've become disappointed with it. Part of the reason was that the code was abstracted beyond belief. Every line of code seemed to make a reference to a class whose name was calculated then dynamically loaded; every function referred to another function in the same class which contained only one line setting a protected object in the class.
 
The code was also unnecessarily padded out with comments like the one below, making a thousand lines into five thousand:
 

  /**
   * a record of the current font
   */
  public  $currentFont = '';
 
and here's the bit which really upset me:
 
function DOMPDF_autoload($class) {
  $filename = mb_strtolower($class) . ".cls.php";
  require_once(DOMPDF_INC_DIR . "/$filename");
}
 
if ( !function_exists("__autoload") ) {
  /**
   * Default __autoload() function
   *
   * @param string $class
   */
  function __autoload($class) {
    DOMPDF_autoload($class);
  }
}
 
What the above code does is to hunt through an (abstracted) directory for a file of a particular description whenever a class is used which isn't in the already loaded code. Not only that, but the whole thing has been abstracted to a further external function. Try debugging that!
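To see the mechanism in isolation, here is a minimal sketch of the same name-to-file lookup. It is not DOMPDF's code: the temporary directory, the Demo_Widget class and the closure are all made up for the demo, and it uses spl_autoload_register() because __autoload() was removed in PHP 8.

```php
<?php
// Sketch only: the directory and class name below are invented for this demo.
$incDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($incDir);

// Write a throwaway class file so the example is self-contained.
file_put_contents(
    $incDir . '/demo_widget.cls.php',
    '<?php class Demo_Widget { public function hello() { return "loaded on demand"; } }'
);

// Map a class name to "<lowercased name>.cls.php", the same convention as the quoted loader.
spl_autoload_register(function ($class) use ($incDir) {
    $file = $incDir . '/' . strtolower($class) . '.cls.php';
    if (is_file($file)) {
        require_once $file;
    }
});

// Nothing above requires the Demo_Widget file directly; PHP calls the loader at this line.
$widget = new Demo_Widget();
$message = $widget->hello();
echo $message, "\n";
```

The point is that nothing at the call site tells you which file the class came from, which is what makes this style awkward to trace.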
 
So in summary, I couldn't really tell what was going on. More importantly, the code had some serious security vulnerabilities, didn't convert images accurately, and was unnecessarily slow. In fact, the DOMPDF code represents everything I hate about obsessively written abstracted software.
 
Predictably, the set of classes didn't actually create PDFs. It was itself an abstraction of a beautiful set of code written right here in New Zealand, which also has a concisely written manual: http://www.ros.co.nz/pdf/readme.pdf
Like all good code, it isn't loosely coupled, everything works the way you expect first time, and you can quickly find out what is going on.
 
To run it, you load a single file containing a single class, instantiate the class and call the methods laid out in the manual. Like this:
 
require_once("class.ezpdf.php");
$pdf = new Cezpdf();
 
Add an image like this:
 
$pdf->addJpegFromFile('COLDHOT2.JPG',150,650,100);
 
Add a table like this:
 
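$dblink = getconn();   // getconn() isn't shown here; it returns a mysqli connection
$results = array();    // must start as an array, not a string, for $results[] to work
$stmt = "select itemid,name from gst";
$result = mysqli_query($dblink,$stmt);
if ($result) {
 // collect each row as an associative array (column name => value)
 while($row = mysqli_fetch_array($result,MYSQLI_ASSOC)){
  $results[]=$row;
 }
}
$data = $results;
mysqli_close($dblink);
// $title and $opts need to be set before this call
$pdf->ezTable($data,'',$title,$opts);
 
Add a block of text on the page like this: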
 
$opts = array('shaded'=>0);$opts['showHeadings'] = 0;
$opts['yPos'] = 400; $opts['xPos'] = 200;
$opts['fontSize'] = 11; $opts['showLines'] = 0;
$data = array(array('aaa'=>"Is this a\nmulti line problem"));
$pdf->ezTable($data,'',$title,$opts);
 
File it to disk like this:
 
$pdfcode = $pdf->ezOutput();
$fp=fopen('findthis.pdf','wb');
fwrite($fp,$pdfcode);
fclose($fp);
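 
The fopen/fwrite version above carries on silently if the file can't be written. A minimal sketch with a check added is below. savePdf() is a hypothetical helper of my own naming, not part of the ezpdf class, and the placeholder string stands in for $pdf->ezOutput() so the snippet runs without the library.

```php
<?php
// Hypothetical helper, not part of the ezpdf class.
function savePdf($path, $pdfBytes)
{
    // file_put_contents() returns the number of bytes written, or false on failure.
    $written = file_put_contents($path, $pdfBytes, LOCK_EX);
    return $written !== false && $written === strlen($pdfBytes);
}

// Placeholder bytes stand in for $pdf->ezOutput(); this is not a real PDF document.
$pdfcode = "%PDF-1.3\n% placeholder, not a real document\n";
$target = sys_get_temp_dir() . '/findthis.pdf';

if (!savePdf($target, $pdfcode)) {
    error_log("Could not write $target");
}
```

With the real class this would be called as savePdf('findthis.pdf', $pdf->ezOutput());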

Or push it to the browser like this:
 
$pdf->ezStream();
 
The big problem is that I now have two output types: firstly an HTML representation of a document, and secondly a PDF representation. It is certainly possible that the two will diverge over time, so that what is printed on the user's printer is different from what is PDF'ed up and sent to a trading partner. In my opinion, this is a small price to pay for a much faster, more reliable, safer and better looking result. DOMPDF is now officially scrapped!
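
One way to limit the drift, at least for the data if not the layout, might be to build both outputs from the same array of rows. The sketch below is only an illustration: rowsToHtmlTable() is a hypothetical helper, and the rows are made-up values in the shape the mysqli loop produces.

```php
<?php
// Hypothetical helper: renders the same row arrays that ezTable() receives as an HTML table.
function rowsToHtmlTable(array $rows)
{
    if (count($rows) === 0) {
        return '<table></table>';
    }
    // Headings come from the keys of the first row.
    $html = "<table>\n<tr>";
    foreach (array_keys($rows[0]) as $heading) {
        $html .= '<th>' . htmlspecialchars((string) $heading) . '</th>';
    }
    $html .= "</tr>\n";
    // One table row per data row, with cell values escaped for HTML.
    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            $html .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . '</table>';
}

// Made-up rows, shaped like the result of the mysqli loop.
$data = array(
    array('itemid' => 1, 'name' => 'Standard <15%>'),
    array('itemid' => 2, 'name' => 'Zero rated'),
);
$html = rowsToHtmlTable($data);
echo $html, "\n";
// The PDF side would pass the same $data to $pdf->ezTable().
```

The two layouts can still differ, but at least the figures on the printed page and in the PDF come from one place.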
