How can I find all broken links pointing to my site?
-
I help manage a large website with over 20M backlinks and I want to find all of the broken ones. What would be the most efficient way to go about this besides exporting and checking each backlink's response code?
Thank you in advance!
-
You can find broken links pointing to your website by using website crawl tools like Screaming Frog or Ahrefs, checking crawl errors in Google Search Console, and monitoring your backlinks with tools like Ahrefs or SEMrush. Regularly checking your referral traffic and using online broken link checkers can also help you identify broken links.
-
We often use Moz Pro; it's a fantastic SEO tool. We also use Screaming Frog to find any broken internal links.
This has helped improve the on-page SEO for our garden office company.
-
Ha, I feel silly. I do use Ahrefs, but somehow the broken backlinks tool escaped me. This is perfect, thank you!
-
Hi Steven,
I assume many of these backlinks will be broken because pages were removed from your site without being properly redirected. If that is the case, Open Site Explorer's Link Opportunities (Link Reclamation) tool should be a big help. This will show all 404 URLs with inbound links that you can recapture by 301 redirecting. Additionally, you can look up the backlinks to each of these 404 pages and reach out to each webmaster, requesting that they update the URL of their link.
I've also had success exporting Top Pages reports (Moz or Majestic are my preferred tools for this), running any URL with a backlink to it through Screaming Frog and pulling 404 pages/broken links (or even 302 redirects) that way. I usually find additional opportunities that do not show up in the Link Reclamation report.
Hope this helps!
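
If you would rather script the status check than run the URLs through Screaming Frog, here is a minimal Python sketch of the same idea. It assumes you have already reduced your Top Pages export to pairs of (target URL on your site, number of linking domains); the function names `fetch_status` and `broken_targets` are made up for illustration and are not part of any tool mentioned here. Each unique target is requested once, so the work scales with the number of linked-to pages rather than the 20M backlinks.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=10):
    """Return the final HTTP status for url, or None if the request fails.

    Assumption: a HEAD request is enough. urlopen follows redirects, so a
    301/302 that ends on a live page is reported as 200, not as a redirect.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; the code is the status.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, timeout, refused connection, or malformed URL.
        return None


def broken_targets(rows, fetch=fetch_status):
    """Check each unique target once; return broken ones, most-linked first.

    rows: iterable of (target_url, linking_domains) pairs (hypothetical
    export layout). fetch is injectable so the logic can be tested offline.
    """
    links = Counter()
    for target_url, linking_domains in rows:
        links[target_url] += int(linking_domains)

    broken = []
    for url, count in links.most_common():
        status = fetch(url)
        if status in (404, 410):
            broken.append((url, status, count))
    return broken
```

Calling `broken_targets(rows)` with rows read from your export returns `(url, status, linking_domains)` tuples for every target answering 404 or 410, sorted so the pages with the most links come first. Those are the ones to 301 redirect first. Some servers reject HEAD requests, so a few targets may need to be rechecked with GET.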
-
Use Ahrefs and split the crawls by the main folders of the website. Also consider your priorities, because then you don't have to do all 20M. Start with the most important folders and go step by step until you have crawled the majority.
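
To decide which folders to start with, something like the following Python sketch could help. It assumes you have an export of (target URL, backlink count) pairs; `top_folder` and `prioritize_folders` are hypothetical helper names, and treating the first path segment as "the folder" is a simplification that may not match how your site is organized.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def top_folder(url):
    """Return the first path segment as '/segment/', or '/' for root-level pages."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    # '/blog/post' and '/blog/' belong to '/blog/'; '/about' is a root-level page.
    if len(segments) >= 2 or (segments and path.endswith("/")):
        return f"/{segments[0]}/"
    return "/"


def prioritize_folders(rows):
    """Sum backlinks per top-level folder and return folders, most-linked first.

    rows: iterable of (target_url, backlinks) pairs (hypothetical export layout).
    """
    totals = defaultdict(int)
    for url, backlinks in rows:
        totals[top_folder(url)] += int(backlinks)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

The folders at the top of the returned list hold most of the backlinks, so crawling those first covers the majority of the 20M without processing everything in one go.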
-
I agree with Kevin. Ahrefs has that capability, assuming you don't run into size constraints. Here's a quick post that explains where to find it: https://ahrefs.com/blog/turning-broken-links-site-powerful-links-ahrefs-broken-link-checker/
-
Have you looked into Ahrefs? I know there's a ton of horsepower behind it, but I don't know if it can handle checking 20M. Good luck!