As part of our R&D department's ongoing technology watch, whose purpose is to find new ways to optimize e-commerce websites, we are releasing an optimization extension for one of Magento’s most resource-hungry indexes: the URL index!
But first of all, what’s an index?
To reduce the time spent collecting specific data (such as stock levels, URLs, prices…) from the database, Magento gathers it in “index” tables.
You can think of these tables as baskets and the data in them as fruit. It’s much faster to hand users fruit from the baskets than to make round trips to your fruit stall!
However, before it can serve this data efficiently, Magento needs to fill or update its indexes. This is called the re-index process.
These processes take more or less time to run depending on the volume of data in your store and your server’s capacity.
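To make the basket analogy concrete, here is a minimal sketch in Python. It is not Magento code: the sample rows, the `.html` suffix and the function names are assumptions chosen for illustration. It only shows the general idea of paying the cost once during a re-index so that each later lookup is cheap.

```python
# Hypothetical raw data: (product_id, url_key) rows, as if read from the database.
RAW_ROWS = [(1, "red-shirt"), (2, "blue-jeans"), (3, "green-hat")]


def find_product_slow(request_path):
    """Scan every row on each request (a round trip to the fruit stall)."""
    for product_id, url_key in RAW_ROWS:
        if f"{url_key}.html" == request_path:
            return product_id
    return None


def build_index(rows):
    """The re-index step: precompute request_path -> product_id once."""
    return {f"{url_key}.html": product_id for product_id, url_key in rows}


# Filling the basket: done once, and again whenever the data changes.
URL_INDEX = build_index(RAW_ROWS)


def find_product_fast(request_path):
    """Serve the answer straight from the prepared index."""
    return URL_INDEX.get(request_path)
```

Both functions return the same answer for a given path. The difference is that the slow version repeats the work on every request, while the fast one depends on the index having been rebuilt after each change, which is why a slow re-index becomes a problem.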
Observation and problems
Some of our clients have a very large number of distinct URLs, so re-indexing “Catalog Url Rewrites” could take up to several hours, whether it was started manually in the back end or by cron.
With large catalogs, the following problems appeared:
- Changes made in the back end that required a URL re-index could not be applied frequently because the process took so long;
- The process sometimes did not finish properly;
- If the process doesn’t finish properly, some URLs may be rewritten incorrectly, which hurts your SEO;
- When the process is run by cron, it is sometimes necessary to increase the memory (RAM) available to PHP-CLI;
- As long as the URL re-index isn’t completed, other re-index processes can’t start.
So our R&D team asked itself:
Why does Magento’s URL re-index take so long?
We noticed that Magento ignores some of the store and product settings:
- If you use short URLs for your products, Magento still creates both possible URLs (long and short) in its index table.
- If your product is disabled, and therefore not visible in your store, Magento still indexes its URL.
- If your product belongs to several categories and sub-categories, there will be as many URLs as categories, even if they aren’t used on your front end.
- The same goes for products that are “not visible individually”, which are generally used as associated products of a configurable product: their URLs are still created and stored in the database.
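The effect of these four points on the size of the index can be sketched with a small Python model. This is an illustration only, not the extension's actual code: the `Product` class, the setting names (`use_category_paths`, `skip_disabled`, `skip_not_visible`) and the sample catalog are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    url_key: str
    enabled: bool = True
    visible_individually: bool = True
    category_paths: list = field(default_factory=list)


def urls_for(product, use_category_paths, skip_disabled, skip_not_visible):
    """Return the URLs that would be written to the index for one product."""
    if skip_disabled and not product.enabled:
        return []
    if skip_not_visible and not product.visible_individually:
        return []
    urls = [f"{product.url_key}.html"]  # short URL
    if use_category_paths:
        # One extra long URL per category the product belongs to.
        urls += [f"{path}/{product.url_key}.html" for path in product.category_paths]
    return urls


def count_urls(catalog, **settings):
    return sum(len(urls_for(product, **settings)) for product in catalog)


catalog = [
    Product("red-shirt", category_paths=["men", "men/shirts", "sale"]),
    Product("old-hat", enabled=False, category_paths=["accessories"]),
    Product("red-shirt-xl", visible_individually=False, category_paths=["men/shirts"]),
]

# Everything is indexed, as in the behaviour described above.
unfiltered = count_urls(
    catalog, use_category_paths=True, skip_disabled=False, skip_not_visible=False
)
# Only the URLs a short-URL store actually needs.
filtered = count_urls(
    catalog, use_category_paths=False, skip_disabled=True, skip_not_visible=True
)
print(unfiltered, filtered)  # prints "8 1"
```

Even on this three-product toy catalog, the number of URLs drops from 8 to 1. On a real catalog the ratio depends on how many categories each product belongs to and how many products are disabled or not visible individually.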
Solution and benefits
The index optimization extension we developed gives you better control over which URLs are generated (or not) and therefore shortens the process.
Benefits for your store:
- The URL re-index takes only a few minutes instead of several hours (in certain cases);
- Because it runs faster, the process no longer ends with an error or a pending message;
- You no longer get 404 errors, so there is no more impact on your SEO;
- Your server uses less RAM during the process.
To give you a taste of what to expect, here are the first results obtained on different types of catalogs and configurations.
With 9,500 references (SKUs), shared (or not) between 900 categories and sub-categories, plus CMS pages, the number of URLs generated in the “Core_URL_Rewrite” table is about 80,000.
The re-index (launched manually or by cron) lasts about 3 h 30 min (12,600 seconds).
After installing our patch, the re-index takes 1 minute!
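As a quick check of the figures above, the speed-up for this one catalog can be worked out in a few lines of Python. It uses only the two durations quoted here; the variable names are ours.

```python
# Durations quoted above for the 9,500-SKU catalog.
before_seconds = 3 * 3600 + 30 * 60  # 3 h 30 min
after_seconds = 60                   # 1 minute

speedup = before_seconds / after_seconds
print(f"{before_seconds} s -> {after_seconds} s: about {speedup:.0f}x faster")
# prints "12600 s -> 60 s: about 210x faster"
```

This ratio applies to that particular catalog and configuration only; other stores will see different gains.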
Other examples are shown in the following chart:
We can see that the re-index is clearly faster.
We also noticed a slight improvement in front-end display performance.
Installation & settings
Our extension “Patch_index_URL” is compatible with all versions of Magento CE and EE.
First, back up your database, then install the extension in your development environment, not in production!
- In your back end, go to: System > Configuration > Developer
- If you can’t see the extension, purge the Magento cache and sessions, then log out and log back in to your back end.
- Select YES in the “Enable Optimisation” field.
- If you don’t need to generate URLs for disabled products or for products that are not visible individually, select YES for the two other fields.
- Go to Index Management and re-index only your URLs.
Depending on the settings you chose, the re-index should be much quicker.
Any feedback and benchmarks are welcome.