Today I noticed a small indexing problem on a customer's website. The site runs on a shop system that supports "speaking URLs" and is therefore reasonably SEO-friendly. Unfortunately, it also lets visitors sort product lists and appends a pile of parameters to the otherwise clean URLs, and these parameter variants get indexed far too often. The result is [duplicate content->duplicate-content] galore.
To keep the webmaster's workload as small as possible, I looked for a solution in the [robots.txt file->robots-txt] that prevents these sorting pages from being indexed. In my case this is easy to solve, since the regular pages all end in ".html" and only the sorting views carry a ".html?something=somehow". These two lines in the robots.txt are now doing their job on the customer's site:
Allow: /*.html$
Disallow: /*.html?*
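To sanity-check which URLs these two rules affect, here is a minimal sketch (my own illustration, not part of the original setup) that mimics Google's wildcard matching: `*` stands for any character sequence, a trailing `$` anchors the end of the URL, and when several rules match, the most specific (longest) one wins, with Allow winning ties:

```python
import re

# The two rules from the robots.txt above.
RULES = [
    ("allow", "/*.html$"),
    ("disallow", "/*.html?*"),
]

def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex:
    '*' matches any character sequence, a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile(pattern + ("$" if anchored else ""))

def is_blocked(path: str) -> bool:
    """Apply Google's precedence: the longest matching rule wins,
    and on a length tie the Allow rule takes priority."""
    best = None
    for kind, rule in RULES:
        if rule_to_regex(rule).match(path):
            candidate = (len(rule), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return best is not None and not best[1]

print(is_blocked("/product.html"))             # plain page -> False
print(is_blocked("/product.html?sort=price"))  # sorting URL -> True
```

As expected, the plain .html pages stay crawlable while every variant with a query string is kept out.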
If your files end in .php, adjust the entries accordingly. And give Google a few weeks to drop the now-blocked pages from its index. Auntie G is not the fastest ...
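For a PHP-based shop, the adjusted rules inside a complete User-agent group might look like this (a sketch; note that it blocks every .php URL with a query string, so it only fits shops where query parameters are used solely for sorting and filtering, not for loading products):

```
User-agent: *
Allow: /*.php$
Disallow: /*.php?*
```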
Jens has been running the blog since 2012. He acts as Sir Apfelot for his readers and helps them with technical problems. In his spare time he rides electric unicycles, takes photos (preferably with the iPhone, of course), climbs around in the Hessian mountains, or hikes with the family. His articles cover Apple products, news from the world of drones, and solutions to current bugs.