-
I’ve specified certain pages in robots.txt to be blocked as those were generated by Wordfence, yet Google Search Console flags them as indexing issues: blocked by robots.txt. Will this have any impact on my SEO? And can I just ignore this warning? The pages are not indexed or served; they are just being shown as blocked by robots.txt.
-
Hello, I’m sorry to hear about the issue you’re facing with the “Blocked by robots.txt” warning in Google Search Console. On its own, this status means Google has discovered the URL but a rule in your robots.txt prevents it from crawling the page. If Google also indexes such a page — for example, because other sites link to it — the status becomes “Indexed, though blocked by robots.txt,” and that can affect how the page appears in search results.
To address this, review the pages being blocked by robots.txt and confirm that the rules are set up as intended. Additionally, you may want to reach out to external sites that might be linking to these blocked pages and request that they change the link to a more relevant URL on your website.
Once you’ve made the necessary changes, you should validate the fix in Google Search Console. This will prompt Google to recrawl the URLs and close the issue if the error is resolved.
If you have any further questions or need assistance with this, please feel free to reach out to our support team directly from here. We’re here to help!
Sources:
How to Fix the “Indexed, though blocked by robots.txt” Error » Rank Math
How to Fix ‘Blocked by robots.txt’ Error in Google Search Console » Rank Math
Common robots.txt Issues & How to Fix Them » Rank Math
No Information is Available for This Page » Rank Math
Hello,
Thank you for contacting Rank Math support. We apologize for any inconvenience this might have caused.
If you have specified certain pages to be blocked in your robots.txt file and Google Search Console is reporting them as “Blocked by robots,” it typically means that Googlebot is not able to crawl those pages. This, in itself, is not a direct SEO issue, especially if those pages are not meant to be indexed.
The fact that Googlebot can’t crawl them won’t directly harm your SEO since the pages are not set to be indexed.
This warning in Google Search Console is more of an informational message. It’s letting you know that Googlebot can’t access those pages, and it’s not necessarily a problem if those pages aren’t supposed to be indexed.
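For reference, a disallow rule of this kind in robots.txt looks like the following (the path here is a placeholder, not taken from your site). Note that a Disallow rule only stops crawling; it does not control indexing by itself.

```text
# Block all crawlers from a specific path
User-agent: *
Disallow: /example-blocked-path/
```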
Hope this helps and please do not hesitate to let us know if you need assistance with anything else.
Thank you.
Thanks for the clarification. Another question: three of those blocked pages are being indexed; what can I do about them? These pages are generated by Wordfence.
Hello,
Are you able to access the Rank Math metabox on those pages to control the robots meta settings and set them to noindex following this guide? https://rankmath.com/kb/how-to-noindex-urls/
If you are not able to do that, you might need to apply the noindex dynamically by editing the following filter: https://rankmath.com/kb/filters-hooks-api-developer/#change-robots-meta
Don’t hesitate to get in touch if you have any other questions.
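For context, setting a page to noindex makes Rank Math output a robots meta tag in the page head along these lines (the exact content attribute depends on your settings; this is an illustrative sketch, not output copied from your site):

```html
<meta name="robots" content="noindex, follow" />
```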
Thank you very much for the response. The new pages generated by Wordfence are being blocked by the rules in robots.txt. As for the 3 pages that got indexed, I am leaving them the way they are, as I don’t want any more unnecessary complications that could potentially block my whole site from getting indexed. I also don’t have access to those pages in Rank Math, since they are dynamic URLs generated by Wordfence rather than actual pages.
And once again, thanks for the help. As long as my overall SEO is not affected, I can safely ignore the 3 pages.
Hello,
Sure that’s totally fine.
But we should note that the robots.txt Disallow rule only prevents Google from crawling the page. If Google has already indexed the page — which it has — this rule will not remove the page from Google’s index; you will have to use the Google Removal tool to remove the URL. Then you can apply the noindex tag dynamically to make sure the page is no longer indexed in the future. Here’s an example of using the filter for multiple URLs at a time:
add_filter( 'rank_math/frontend/robots', function( $robots ) {
	// Get the current page URL
	$current_url = home_url( $_SERVER['REQUEST_URI'] );

	// Define an array of pages to change the robots tag to noindex
	$pages = array(
		'https://example.com/page1',
		'https://example.com/page2',
	);

	// Check if the current page is in the array
	if ( in_array( $current_url, $pages ) ) {
		// Change the robots tag to noindex
		$robots['index'] = 'noindex';
	}

	// Return the modified robots array
	return $robots;
} );
Remember to change the example.com URLs to the URLs you want to dynamically set to noindex. You can also add more URLs. You can add the code to your site using any of the methods here.
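One caveat worth noting (a general PHP detail, not something from the reply above): in_array() compares the URLs as exact strings, so a trailing-slash difference will prevent a match. A minimal sketch with hypothetical URLs:

```php
<?php
// Hypothetical list of URLs to noindex, mirroring the $pages array above.
$pages = array(
	'https://example.com/page1',
	'https://example.com/page2',
);

// Exact string comparison: the trailing slash makes this miss.
$matches_without_slash = in_array( 'https://example.com/page1', $pages, true );
$matches_with_slash    = in_array( 'https://example.com/page1/', $pages, true );

// Trimming the trailing slash before comparing avoids the mismatch.
$normalized         = rtrim( 'https://example.com/page1/', '/' );
$matches_normalized = in_array( $normalized, $pages, true );

var_dump( $matches_without_slash ); // bool(true)
var_dump( $matches_with_slash );    // bool(false)
var_dump( $matches_normalized );    // bool(true)
```

So if your site serves URLs with trailing slashes, list them that way in the array, or normalize both sides before comparing.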
We hope this helps you resolve the issue. If you have any other questions or concerns regarding Rank Math, don’t hesitate to get in touch with us again. We are always happy to help.
Thank you for choosing Rank Math!
I have requested removal of the URLs in the removal tool; they are showing as temporarily removed, but when I checked the URLs in Google’s index, they were still showing as indexed. I don’t know what’s happening, but I guess I will wait and see if anything changes or not.
Hello,
Sure, let us know how this goes.
Meanwhile, if you have any other concerns, please don’t hesitate to contact us anytime to assist you further.
Looking forward to helping you.
Thank you.
Sure, I will update here if I see any changes in the “Blocked by robots.txt” section of my GSC.
Thank you very much for all the help.
Hello,
This ticket will be open, so you can update us here.
However, if it has been closed by the bot, please don’t hesitate to create a new ticket and reference this ticket here.
Looking forward to helping you.
Thank you
The indexed pages were eventually deindexed by requesting the manual removal of the pages. Thank you very much for all the help.
By the way, I had just blocked the pages in robots.txt and requested the manual removal of the pages; this solved the major issue. The crawled pages that are currently not indexed are still there; hopefully they will also be removed from the Search Console report.
And a small tip: if this problem happens to anyone else, they can do what I did, but even better, before turning on the Wordfence live traffic records, just add this line to robots.txt:
Disallow: *?wordfence*
This should be sufficient to avoid any unnecessary problems with the Search Console.
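In context, that tip would make the robots.txt look something like this (the User-agent line is whatever the file already uses; this is a sketch, not the poster’s actual file). Worth noting: the `*` wildcard in Disallow patterns is supported by Googlebot, though it is not part of the original robots.txt standard, so other crawlers may interpret it differently.

```text
User-agent: *
# Block the query-string URLs generated by Wordfence live traffic
Disallow: *?wordfence*
```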
Hello,
We’re delighted to hear that this issue has been resolved. We appreciate your kind words and feedback.
I’m closing this ticket now but if you ever have another question or need any help in the future, please don’t hesitate to create a new forum topic. We’ll be more than happy to assist you again.
Thank you for choosing Rank Math and have a wonderful day!
Cheers
The ticket ‘Blocked By Robots’ is closed to new replies.