by AJ Miller

If you are tasked with modifying some data in the Magento database, there are a few strategies you will find to tackle this. They can be summarized in roughly three categories:
A) Create a script that will loop through the Magento objects in question and make modifications

B) Perform a SQL query on the database directly

C) Don’t modify the data at all, and just make modifications on the fly in the view templates or controllers
**there are other ways to tackle this option, but the principle remains the same

Generally I try to avoid Option B at all costs with Magento since its database design is based on the EAV pattern. There’s been a lot of discussion about the pros and cons of this database design (mainly the cons), so suffice it to say if you screw up your queries, which you have a high likelihood of doing, you will be in for a world of hurt on your production database that suddenly doesn’t work anymore. Option C is enticing if you have a developer on staff to make modifications and tweaks as time goes on, because every change will require modifying source code, but it is often a very quick way to make sweeping changes in a large SKU database (at the expense of a little additional rendering time per page). Since our client did not have a developer on staff and generally wants changes we make to be modifiable in the Magento admin panel (as much as possible) we settled on option A. The advantages to creating a script are that you end up modifying the database data itself, so the changes are often present in the Magento admin panel (ex. a change to product titles). Also, as a general principle, we try to get any scripts we write to be as idempotent as possible.

Let’s walkthrough setting up a bulk upload script to make a change for us. As an example, this script will add a new text field to any product that matches a ‘Name Required’ option that’s present or not. This could be useful for, say, a t-shirt company that needs the name to be printed on the product.

First, we have to fire up the Magento app. Here we just initialize it with default values:

<?php

require_once 'app/Mage.php';
Mage::app('default');

Next we grab a collection of all products and apply a filter so we ensure we’re only dealing with those fields that meet our criteria (in this case, that name_req attribute is true or ‘1’). You can add as many or as few filters as you need here to get the dataset you are looking for. You simply tell Magento to getModel(‘whatever’)->getCollection() for whatever model you are attempting to load, and then apply a filter to that collection if needed:

$collection = Mage::getModel('catalog/product')->getCollection();
$collection->addFieldToFilter(array(array('attribute'=>'name_req', 'eq'=>'1')));

Next, since we’re adding a Product Option to these products, we create a generic option in a PHP array that we’ll use later:

$name_option = array(
    'title' => 'Name',
    'type' => 'field',
    'is_require' => 1,
    'max_characters' => '20');

You can read more about Magenta Options and what keys to use in this excellent blog post by Oleg Ishenko.

Finally, we loop through each product in the collection above and add our option in if it isn’t already present (remember – idempotent! so don’t make the change if it already exists). I’ve added a number of comments to help the reader work through this code:

foreach ($collection as $product) {
    // Grab the current options present on this product
    $options = $product->getOptionInstance()->getProductOptionCollection($product);

    // Assume there will be a change that needs to be saved.
    $needs_save = 1;

    // Keep track of how many of these fields we find.
    // We only want to end up with one,
    // so if more are found they need to be deleted.
    // There is no 'is_unique' validation for product options in Magento,
    // so we have to do our own here.
    $num_of_name_fields = 0;

    if ($options->count() > 0) {
        foreach ($options as $oneOption) {
            // I found that whether Magento returns
            // an object or an array is unpredictable.
            // I was forced to add this check, although
            // it might have been the dataset I was 
            // working with so YMMV.
            // We're basically just checking if there is
            // an option that already has a title called
            // 'Name' and if so, increasing our 
            // $num_of_name_fields counter so we can keep
            // track of how many Name fields are present.
            // The $needs_save logic will be explained a
            // little later.
            if ((is_object($oneOption) && 
                 $oneOption->getTitle() == "Name") ||
                (is_array($oneOption) && 
                $oneOption['title'] == "Name")) {
                $num_of_name_fields++;
                $needs_save--;

                // We're using a trick here where the first
                // time a Name field is found, the 
                // $num_of_name_fields count will be 1, so
                // it won't be deleted. Any future Name fields
                // found in the options array will mean the
                // $num_of_name_fields will be greater than 1,
                // so they'll be marked for deletion.
                if ($num_of_name_fields > 1) {
                    $needs_save++;
                    // Again, whether the $option is returned as
                    // an array or an object was unpredictable
                    // so I needed this check to call 'delete'
                    // the right way. You may or may not need this.
                    if (is_object( $oneOption )) {
                        $oneOption->delete();
                    } else if (is_array( $oneOption )) {
                        $oneOption['is_delete'] = 1;
                    }
                }
            }
        }
    }
    if ($needs_save > 0) {
        // Here is the part that stumped me for a while.
        // OptionInstance() is a weird animal in Magento.
        // It's explained in detail below for those who are interested.
        $optionInstance = 
          $product->getOptionInstance()->unsetOptions();
        $optionInstance->setProduct($product);
        $product->setHasOptions(1);
        $product->setRequiredOptions(1);
        if ($num_of_name_fields < 1) {
            $optionInstance->addOption($name_option);
        }
        $product->save();
    }
}

By putting these pieces together, you should have a good foundation bulk update script for making any sweeping changes you want to an object in the Magento database. This works with the Magento object models instead of querying the database directly, so all of the database queries are handled safely by Magento itself.

For those who are interested in the $needs_save variable above, and how it works, it’s actually fairly simple. It first assumes a save is needed for every product object ($needs_save = 1). Then if it finds an existing match to what it was going to do (i.e. if a product option titled ‘Name’ is already present) it assumes a save isn’t needed for that operation and subtracts 1 ($needs_save--). This results in a variable that is keeping track of whether or not there are any actual changes that need to be saved back to the database. If $needs_save is greater than 0 then there are some changes to save, so proceed to execute a save() operation. If not, then don’t bother saving. This trick takes a little extra time to implement but can tremendously speed up scripts like this when there is a large subset of data that won’t need changes done at all (i.e. you have 10,000 entries and only 50 will actually need something done).

Lastly, to understand the OptionInstance() code above you have to understand something about singletons (which Magento uses a lot). OptionInstance is actually a singleton that represents a set of arbitrary options. We retrieve that singleton via the product model $optionInstance = $product->getOptionInstance()->unsetOptions();. Then, since it’s a singleton, we have to tell it to reset itself otherwise it will contain any options that have been added before (in other words, if you remove this line, the script above will add one Name field to the first product, the Name field twice to the second product, three times to the third product, etc etc). Please note this will not delete all the options on your product despite what the documentation sounds like. It will only reset the ‘group’ of options you are trying to add to your product. This means, each time we run through our loop, we have to fetch the OptionInstance() singleton, reset it to a blank state, set it to the current product, tell it which option(s) to add to this product, and then save the product. As an additional aside, we don’t have to save the OptionInstance() directly. Magento, when saving the product, will check the OptionInstance() to see if any changes have occurred there and will save them for us. Just calling ->save() on the product is enough to trigger that check and save the options to the database.

That wraps it up. I hope this helps you to create your own scripts to quickly modify your Magento dataset. Automating these types of changes can save a tremendous amount of time and will result in a much cleaner end result. Besides, who honestly wants to click through that admin panel to make the same edit to 500 products?

If you need further assistance with your Magento project, contact KnockMedia today.