Wednesday, 12 October 2016

Read CSV Data into a PHP Array

The CSV ("Comma Separated Values") file format is one of the oldest in existence that is still used today to exchange data between dissimilar applications. In fact, it's been around for far longer personal computers have! CSV files are often employed to import data from an application such as MS Excel into a database. The glue that holds the import process together can be any programming language, but, as we'll see here today, there are some very good reasons why PHP is one of the best for this purpose. In today's tutorial, we'll write a PHP script that reads in a CSV file and converts its contents into a multi-dimensional array for insertion into a WordPress database.

More on the CSV Format

The name "CSV" would seem to indicate the use of the comma to separate data fields. In reality, the CSV format can vary quite a bit. For that reason, never assume anything; open the file in a text editor and make sure that the format matches your expectations. The field delimiter can be any character from tabs to the "|" symbol. Moreover, fields may be enclosed within single or double quotation marks. Sometimes, the first record is a header containing a list of field names.

Here is a sample file created by Excel's Save As command. It includes header fields and uses the standard comma delimiter:
- Advertisement -

Store,Opening Hour Mon-Wed,OH Thurs,OH Fri,OH Sat,Sun
Aberdeen Beach,11:30-22:00,11:30-22:00,11:30-22:30,11:30-22:30,11:30-22:00
Basildon,12:00-22:30,12:00-22:30,12:00-23:00,12:00-23:00,12:00-22:30
Bath,12:00-22:30,12:00-22:30,12:00-23:00,12:00-23:00,11:30-22:30
Birmingham,11:30-23:00,11:30-23:00,11:30-23:00,11:30-23:00,11:30-23:00
Bluewater,12:00-22:30,12:00-22:30,12:00-23:00,12:00-23:00,12:00-22:30
Braehead,11:30-22:30,11:30-22:30,11:30-23:00,11:30-23:00,11:30-22:00
Braintree,12:00-22:30,12:00-22:30,12:00-23:00,12:00-23:00,12:00-22:00
...

Converting File Entries into Associative Arrays

PHP 5 is very well suited for reading in CSV files. Starting with the file_get_contents() function, we can load the file contents into a string variable. Explode() then splits each line of the file into an array. As an extra step, I like to remove the headers from the file using the array_shift() function, which is then fed into str_getcsv(). That's a specialized function for parsing a CSV string into a numerically indexed array. From there I use a foreach loop to create a key for each field from the headers using array_combine():

$csv = file_get_contents('./input_data.csv', FILE_USE_INCLUDE_PATH);
$lines = explode("\n", $csv);
//remove the first element from the array
$head = str_getcsv(array_shift($lines));

$data = array();
foreach ($lines as $line) {
  $data[] = array_combine($head, str_getcsv($line));
}

The above code produces the following array construct:

array(72) {
  [0]=>
  array(6) {
    ["Store"]=>
    string(14) "Aberdeen Beach"
    ["Opening Hour Mon-Wed"]=>
    string(11) "11:30-22:00"
    ["OH Thurs"]=>
    string(11) "11:30-22:00"
    ["OH Fri"]=>
    string(11) "11:30-22:30"
    ["OH Sat"]=>
    string(11) "11:30-22:30"
    ["Sun"]=>
    string(11) "11:30-22:00"
  }
  [1]=>
  array(6) {
    ["Store"]=>
    string(8) "Basildon"
    ["Opening Hour Mon-Wed"]=>
    string(11) "12:00-22:30"
    ["OH Thurs"]=>
    string(11) "12:00-22:30"
    ["OH Fri"]=>
    string(11) "12:00-23:00"
    ["OH Sat"]=>
    string(11) "12:00-23:00"
    ["Sun"]=>
    string(11) "12:00-22:30"
  }
  [2]=>
  array(6) {
    ["Store"]=>
    string(4) "Bath"
    ["Opening Hour Mon-Wed"]=>
    string(11) "12:00-22:30"
    ["OH Thurs"]=>
    string(11) "12:00-22:30"
    ["OH Fri"]=>
    string(11) "12:00-23:00"
    ["OH Sat"]=>
    string(11) "12:00-23:00"
    ["Sun"]=>
    string(11) "11:30-22:30"
  }
  [3]=>
  array(6) {
    ["Store"]=>
    string(10) "Birmingham"
    ["Opening Hour Mon-Wed"]=>
    string(11) "11:30-23:00"
    ["OH Thurs"]=>
    string(11) "11:30-23:00"
    ["OH Fri"]=>
    string(11) "11:30-23:00"
    ["OH Sat"]=>
    string(11) "11:30-23:00"
    ["Sun"]=>
    string(11) "11:30-23:00"
  }
  [4]=>
  array(6) {
    ["Store"]=>
    string(9) "Bluewater"
    ["Opening Hour Mon-Wed"]=>
    string(11) "12:00-22:30"
    ["OH Thurs"]=>
    string(11) "12:00-22:30"
    ["OH Fri"]=>
    string(11) "12:00-23:00"
    ["OH Sat"]=>
    string(11) "12:00-23:00"
    ["Sun"]=>
    string(11) "12:00-22:30"
  }
  [5]=>
  array(6) {
    ["Store"]=>
    string(8) "Braehead"
    ["Opening Hour Mon-Wed"]=>
    string(11) "11:30-22:30"
    ["OH Thurs"]=>
    string(11) "11:30-22:30"
    ["OH Fri"]=>
    string(11) "11:30-23:00"
    ["OH Sat"]=>
    string(11) "11:30-23:00"
    ["Sun"]=>
    string(11) "11:30-22:00"
  }
  //...
}

Working with the Imported Data

With the CSV data loaded into an array, we are ready to load the data into our system. In this particular case, that involves fetching the post ID so that we can save each location's opening times as WordPress meta data. Luckily, the store names are unique. The location name should be sanitized before inclusion in a query because you can never fully trust external data. WordPress provides the $wpdb->prepare() function for that purpose.

Each location array is passed to the array_slice() function in order to separate the opening times from the store name. Inside the foreach loop, the data is formatted into HTML so as to display the label (array key), colon (:), and value on each line. The formatted string is then saved to the database via the update_post_meta() function, using the post ID as the unique identifier:

foreach ($data as $location) {
  //Get the ID for the store name.  Should only be one.
  $ids = $wpdb->get_col( $wpdb->prepare( "select ID from $wpdb->posts where post_title = %s", $location["Store"] ) );
  
  if (count($ids) >= 1) {
    $opening_times = '';
    foreach (array_slice($location, 1) as $days => $hours) {
       $opening_times .= $days . ': ' . $hours . '<br>';  
    }
    update_post_meta($ids[0], 'opening_times', $opening_times);
  }
}

Here is an example of the formatted markup and its presentation in the browser:

Mon - Wed: 11:30-22:00<br>
Thurs: 11:30-22:00<br>
Fri: 11:30-22:30<br>
Sat: 11:30-22:30<br>
Sun: 11:30-22:00
Mon - Wed: 11:30-22:00
Thurs: 11:30-22:00
Fri: 11:30-22:30
Sat: 11:30-22:30
Sun: 11:30-22:00

Conclusion

Besides having many specialized functions such as str_getcsv(), another distinct advantage to using a PHP script to import CSV data is that it's easy to set it up as a cron job that runs at a given interval, such as every day at 6 Am. The file_get_contents() function can even accept a file URL so that it can retrieve a file directly from an external partner's server.

0 comments:

Post a Comment