Ever heard of Positional Ascii before now? I would place a bet that you’ve found this article because you’re a PHP software developer who’s been asked to parse a Positional Ascii file and do something with the data. Ok, let’s get started…
What is Positional Ascii
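Positional Ascii (also commonly called a fixed-width format) is a little bit like a CSV file. However, the fields are not delimited by a separator; instead, each field is defined by the position at which it starts and the number of characters it occupies.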
For example, see this basic ascii specification:
Column Name   Offset   Length
header_1      1        3
header_2      4        5
header_3      8        10
Here’s the positional ascii data the above specification is describing:
AB2AS 1 7891
AB2AS11 7891
AB2BS   7891
And here’s the data once parsed:
    $asciiData = array(
        0 => array('header_1' => 'AB2', 'header_2' => 'AS 1', 'header_3' => '7891'),
        1 => array('header_1' => 'AB2', 'header_2' => 'AS11', 'header_3' => '7891'),
        2 => array('header_1' => 'AB2', 'header_2' => 'BS', 'header_3' => '7891'),
    );
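To make the mapping from specification to parsed rows concrete, here is a minimal sketch using nothing but substr. It is only an illustration, not code from any library. It assumes that offsets are 1-based, that surrounding whitespace is trimmed from each value, and that short values are padded with spaces so every column lines up. The function name parsePositionalLine is hypothetical.

```php
<?php
// Hypothetical helper: cut one fixed-width line into named fields.
// Assumptions: offsets are 1-based and field values are trimmed.
function parsePositionalLine(string $line, array $spec): array
{
    $row = array();
    foreach ($spec as $name => $field) {
        // substr() is 0-based, so shift the 1-based offset down by one.
        $value = substr($line, $field['offset'] - 1, $field['length']);
        $row[$name] = trim((string) $value);
    }
    return $row;
}

// Offset and length of each column (assumed to be 1-based offsets).
$spec = array(
    'header_1' => array('offset' => 1, 'length' => 3),
    'header_2' => array('offset' => 4, 'length' => 5),
    'header_3' => array('offset' => 8, 'length' => 10),
);

// Sample records, padded with spaces so the columns line up.
$lines = array('AB2AS 1 7891', 'AB2AS11 7891', 'AB2BS   7891');

$asciiData = array();
foreach ($lines as $line) {
    $asciiData[] = parsePositionalLine($line, $spec);
}

print_r($asciiData);
```

Note that the space inside 'AS 1' survives, because only the outer whitespace of each field is trimmed.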
How To Parse Positional Ascii
My first thought was to use PHP’s native fscanf function, the file-reading counterpart to sscanf. However, after testing it for a while, I realised that sscanf tokenises its input on whitespace. This is no good to us, as there may be whitespace in our file that we may (or may not) need to take into account.
With fscanf ruled out, the solution I settled on was PHP’s unpack function (see below).
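The whitespace problem is easy to reproduce with sscanf on a single record. The snippet below is a small demonstration rather than anything from the library, and the format string is my own guess at what a first attempt might look like.

```php
<?php
// One record from the sample data; header_2 ('AS 1') contains a space.
$line = 'AB2AS 1 7891';

// Width-limited %s conversions still stop at the first whitespace character.
$fields = sscanf($line, '%3s%5s%10s');

print_r($fields);
// header_2 comes back as 'AS' and the stray '1' lands in the third slot,
// so the embedded space has shifted the remaining fields.
```

Because %s treats any whitespace as the end of a token, a field that legitimately contains a space can never be read back in one piece this way.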
Solution 1: Write Your Own Parsing Code
If you would like to see how the unpack was implemented in the PHP Datafeed Library, see this example.
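If you would rather roll your own, the unpack approach might look something like the sketch below. This is not the library’s actual code. It assumes 1-based offsets and that values should be trimmed, and the helper names buildUnpackFormat and parseWithUnpack are hypothetical.

```php
<?php
// Hypothetical helper: turn a positional spec into an unpack() format string.
// Assumption: offsets in the spec are 1-based.
function buildUnpackFormat(array $spec): string
{
    $parts = array();
    foreach ($spec as $name => $field) {
        // '@n' jumps to absolute byte position n (0-based);
        // 'A<len><name>' reads <len> bytes as a space-padded string.
        $parts[] = '@' . ($field['offset'] - 1);
        $parts[] = 'A' . $field['length'] . $name;
    }
    return implode('/', $parts);
}

// Hypothetical helper: parse one line using the generated format.
function parseWithUnpack(string $line, array $spec): array
{
    // Pad short lines so that unpack() never runs out of input.
    $needed = 0;
    foreach ($spec as $field) {
        $needed = max($needed, $field['offset'] - 1 + $field['length']);
    }
    $line = str_pad($line, $needed);

    $row = unpack(buildUnpackFormat($spec), $line);

    // 'A' only strips trailing whitespace, so trim both sides here.
    return array_map('trim', $row);
}

$spec = array(
    'header_1' => array('offset' => 1, 'length' => 3),
    'header_2' => array('offset' => 4, 'length' => 5),
    'header_3' => array('offset' => 8, 'length' => 10),
);

$row = parseWithUnpack('AB2AS 1 7891', $spec);
print_r($row);
```

Using '@' to jump to each field’s absolute position means the fields do not have to be contiguous or listed in order. Padding the line first avoids the "not enough input" warning that unpack raises when a field runs past the end of a short line.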
Solution 2: Use the PHP Datafeed Library
- You can download the PHP Data Feed Library from here.
The PHP Data Feed Library was created to solve a multitude of problems whilst parsing data feed files and works with multiple formats including CSV and, amazingly, Positional Ascii.
This example is taken from this unit test: tests/Dfp/Datafeed/File/Reader/Format/AsciiTest.php
This example assumes a file format as follows:
Column Name   Offset   Length
header_1      1        3
header_2      4        5
header_3      8        10
File Content (“file.asc”):
AB2AS 1 7891
AB2AS11 7891
AB2BS   7891
Code to parse this file:
    // Zend Autoloader is needed for the Datafeed Library
    require_once 'Zend/Loader/Autoloader.php';
    $autoloader = Zend_Loader_Autoloader::getInstance();
    $autoloader->setFallbackAutoloader(true);

    // Offset and length of each column in the file
    $positionalInfo = array(
        'header_1' => array('offset' => 1, 'length' => 3),
        'header_2' => array('offset' => 4, 'length' => 5),
        'header_3' => array('offset' => 8, 'length' => 10),
    );

    // Configure the Ascii format with the positional information
    $format = new Dfp_Datafeed_File_Reader_Format_Ascii();
    $format->getDialect()->setPositionalInfo($positionalInfo);
    $format->getDialect()->setSkipLines(0);

    // Point the reader at the file
    $fileReader = new Dfp_Datafeed_File_Reader();
    $fileReader->setFormat($format);
    $fileReader->setLocation("file.asc");

    // Print each parsed row
    foreach ($fileReader as $row) {
        echo print_r($row, true) . PHP_EOL;
    }

    // Print any errors collected while reading
    print_r($fileReader->getErrors());
Result:
    $asciiData = array(
        0 => array('header_1' => 'AB2', 'header_2' => 'AS 1', 'header_3' => '7891'),
        1 => array('header_1' => 'AB2', 'header_2' => 'AS11', 'header_3' => '7891'),
        2 => array('header_1' => 'AB2', 'header_2' => 'BS', 'header_3' => '7891'),
    );
Conclusion
Positional Ascii is an old format that is rarely used today. That said, some very old computer systems are still in service, and a number of them still produce positional ascii files.
Positional Ascii Writer?
There are currently no plans to implement a positional ascii file writer in the PHP Data Feed Library, because the format is so little used these days. However, you are welcome to contribute a positional ascii writer adapter to the PHP Data Feed Library project.
Permalink: http://www.websitefactors.co.uk/php/2012/07/positional-ascii-file-parser-in-php/
You’ve won the bet in the introduction of this article.
Also, thanks.
@ruben
Glad I could help. Feel free to fork the project on GitHub. If you find any bugs or have any suggestions for improvement, please submit them.