Website SEO/Error Scanning Tools

  • Thread starter Alfuzzy

Alfuzzy

Member
Site Supporter
Joined
Dec 30, 2021
Messages
68
Reaction Score
2
Points
8
  • #1
I've run into a number of products that folks can use to scan websites for various parameters related to SEO (4xx errors, dead links, website speed, backlinks, meta information, etc.).

* Some of these products are downloadable apps (free limited trial/paid).
* Some scanner websites will scan some of your site for free (but need a subscription for full scans).
* Some scanner websites won't scan anything until you sign up & provide credit card info.

In many cases these products are darn expensive (prices only bigger businesses can afford). I was wondering what other folks are using to scan their websites for 4xx errors, SEO parameters, etc...and whether anyone knows of some good free or lower-cost products?

Thanks
 
JGaulard

Administrator
Staff member
Site Supporter
Sr. Site Supporter
Power User
Joined
May 5, 2021
Messages
319
Reaction Score
2
Points
18
  • #2
Very good question. I'm trying to think back to the last time I used one of these tools. I remember back in something like 2006, I used a program called WebPosition Gold. I'm not sure if the one I just linked to is the same one I used so long ago, so proceed with caution. That wasn't really a website scanner though - it would scan search results to see where your valued keywords resided in the search engines. Google actually later on referenced this program specifically and said not to use it. I guess they were getting inundated with queries.

The second tool I used, and this is probably what you're referring to, was Moz. Back then in 2008, it was called SEOMoz. If memory serves, I believe I paid for the service for a few months at $29.99 per month. I see they've raised their prices.

What I use today is Xenu Link Sleuth. It's free and is pretty good. It'll tell you which pages are good (200), which redirect (301, I believe), and which return 403 or 404. It doesn't handle backlinks and other SEO-related items though. But it's very good at telling you about the internals of your site. I'm not sure if you have come across this one yet. If not, give it a shot. Just remember to block the directories you don't want crawled so as to avoid wasting time.
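
For anyone who would rather script a quick check than install a crawler, the status buckets mentioned above can be sketched in a few lines of Python. This is only a rough illustration of how a link checker might group its results, not a description of how Xenu works internally. The function names and the sample URLs are made up for the example.

```python
from collections import defaultdict


def classify(status: int) -> str:
    """Map an HTTP status code to the bucket a link checker might report."""
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    if status == 403:
        return "forbidden"
    if status == 404:
        return "not found"
    if 400 <= status < 500:
        return "other client error"
    if 500 <= status < 600:
        return "server error"
    return "unknown"


def group_results(results):
    """Group (url, status) pairs by bucket name."""
    report = defaultdict(list)
    for url, status in results:
        report[classify(status)].append(url)
    return dict(report)


# Hypothetical crawl results, for illustration only.
sample = [
    ("https://example.com/", 200),
    ("https://example.com/old-page", 301),
    ("https://example.com/admin/", 403),
    ("https://example.com/missing", 404),
]

report = group_results(sample)
for bucket, urls in report.items():
    print(f"{bucket}: {len(urls)} URL(s)")
```

The 403 and 404 buckets are kept separate because the two usually call for different fixes: a 404 points at a dead link, while a 403 points at a page the crawler isn't allowed to see.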

I'll also mention that I do get bored with tools like this. After the first crawl or scan and after I look at the results, I'm like, "Okay, done with that." There's nothing else to do. For instance, after I block a bunch of pages, the crawler will tell me that they're blocked. And as for the services that require payment, after the first crawl, you're basically paying to keep up with your link profile and rankings. I'm not sure there's value in that and it's the reason I stopped using Moz. Once I knew the link profile, I was finished with that part and once I knew the rankings, I was done there too. I definitely know when my rankings are good or bad. There are free tools to tell me that, such as Google Analytics and the Search Console. So, you may want to give Moz, Botify (may be crazy expensive - they want you to call them), or Ahrefs a try. Both Moz and Ahrefs seem pretty reasonable for the beginner with their free or almost free trials.
 
Alfuzzy
  • #3
JGaulard said:
I'll also mention that I do get bored with tools like this. After the first crawl or scan and after I look at the results, I'm like, "Okay, done with that." There's nothing else to do.
I hear ya...I would be the same way.:) Once you've taken care of most of the issues...rescans would only serve as a spot-check to make sure nothing unexpected has cropped up.

Another SEO scanner for websites is neilpatel.com:

https://neilpatel.com/seo-analyzer/

Not sure if it is super popular or not...but seems to rank high in Google search results. It's $29.99/month...and looks like monthly subscriptions are possible. I guess if you know what you're doing....with a 1 month subscription might be able to get it all done!:)

Here's another one I've used from time to time...seobility.net. Here's a link to their SEO Checker:

https://www.seobility.net/en/seocheck/

These guys are $50/month.

Here's another internet based tool I've used to find dead/broken links:

https://www.deadlinkchecker.com/

I believe this one has a limit of 2000 pages scanned (for the free check). I think to do more you need to sign up for some sort of subscription. $9.95/month is the least expensive:

https://www.deadlinkchecker.com/automatic-broken-link-check.asp

Pretty useful info...but dead links are pretty much the only thing it scans for (compared to the others).

I should mention I have not subscribed to any of these (just used anything free they offered). Thus cannot speak for the quality of the subscription products.
 
JGaulard
  • #4
Very nice resources. Thank you for sharing them. I haven't looked at my backlinks for years now. Perhaps I'll take advantage of one of these to do so. I'll also look into the others to see what they have to offer. I get so buried in code and SEO "theory" (and building content) that it would be nice to let someone else do some work for a change. Thanks!
 
Alfuzzy
  • #5
Wanted to get your opinion on something.

Using a scanner tool...I scanned gaulard.com...and it comes back with only three 404 errors (awesome)!

The interesting part is what shows up in the "200 no error" list (for your site, my site, and probably other XenForo sites). Since we've been discussing robots.txt & Google crawl budget in another thread...I was wondering if some of the stuff that's showing up as "200 no error"...could be using up some crawl budget...and if there's some way to block these items with robots.txt?

I should also mention the scanner tool can be set to follow robots.txt rules...thus I'm assuming this "stuff" can be crawled by Google.

Here's a screenshot...also circled some of the pages I'm referring to:

[Attached image: Screen Shot 2022-01-07 at 9.31.39 AM.png]

The common URLs for each of these circled items are:

https://gaulard.com/forum/
https://gaulard.com/forum/members/

As mentioned...this scanner tool can be set to follow robots.txt rules...and technically these circled items come back as "200 no error" (not really a problem).

I know you have https://gaulard.com/forum/members/ "disallowed" in your robots.txt. My concern is...if Google is crawling these...is some crawl budget being used on this stuff...that could be used on other more valuable pages?

Thanks
 
JGaulard
  • #6
Hi there - I don't think this is a problem. The items you circled in your screenshot are member avatar images. Those, as well as uploaded post images, are held in the /data/ directory. We actually want all of those crawled. If you do a "site:yoursite.com" in Google and then click the "Images" option, you'll see all your indexed images. The more, the better because image search is huge.

I am guessing that images that aren't really linked to will show as "200 no error." In my log files, I see Googlebot crawling these things, but not overwhelmingly so. And as long as all these images are valuable to the search engine (and not errors), it wouldn't reduce its crawl rate because of them. It's those 403 pages that are the real killers. Search engines don't like them at all. I'm still on the fence about the 301 redirects, but I'm hesitant to test that because the trend has been so positive. I'm trying not to rock the boat.
 
JGaulard
  • #7
By the way, which were the URLs that returned 404s? I wouldn't mind fixing those. Thanks.
 
Alfuzzy
  • #8
JGaulard said:
I'm still on the fence about the 301 redirects, but I'm hesitant to test that because the trend has been so positive. I'm trying not to rock the boat.
I hear ya...risk vs. return may not be worth it. Definitely continue with the experiment as is...at least until things stabilize.:) Then start experiment #2!;)

Just to be sure (on my end). All of those circled links above are pointing to this page (link below). Are you saying this shouldn't be an issue as far as crawl budget?

https://gaulard.com/forum/members/

Seems to be at least one of these for each member on a site. For a site with lots of members (let's say 100,000 members)...that would be 100,000 pages possibly crawled unnecessarily (or links to the same page). Unless, as I think you may be saying...crawling these pages is a good thing.

Thanks
 
JGaulard
  • #9
Alfuzzy said:
I hear ya...risk vs. return may not be worth it. Definitely continue with the experiment as is...at least until things stabilize.:) Then start experiment #2!;)

Just to be sure (on my end). All of those circled links above are pointing to this page (link below). Are you saying this shouldn't be an issue as far as crawl budget?

https://gaulard.com/forum/members/

Seems to be at least one of these for each member on a site. For a site with lots of members (let's say 100,000 members)...that would be 100,000 pages possibly crawled unnecessarily (or links to the same page). Unless, as I think you may be saying...crawling these pages is a good thing.

Thanks
Just so I understand, the avatars you circled above are linking to their member accounts? So, my avatar is linking to: https://gaulard.com/forum/members/2/?

If that's the case, it's not a problem because I have the /members/ directory blocked in the robots.txt. While those member pages are still linked to (and gaining PageRank), they're not actually being crawled because they're blocked. So yes, Googlebot knows about the member pages, but they don't consume crawl budget. It's fine for pages to be blocked, but still linked to.

I hope I'm understanding this correctly. If you were to crawl again with the robots.txt being followed, those member pages shouldn't show up in the results. Maybe the avatar images would, but the actual pages shouldn't. But I don't really know how the utility you're using works. I hope this helps.
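
A quick way to sanity-check this kind of rule is Python's built-in `urllib.robotparser`. The sketch below is an illustration under assumptions: the `Disallow: /forum/members/` line is a guess at what the rule looks like (the site's real robots.txt may differ), and the avatar path under `/data/` is hypothetical. It only shows that a member page can be blocked while an image elsewhere stays crawlable.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule for illustration; the site's real robots.txt may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/members/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(agent, url)


member_page = "https://gaulard.com/forum/members/2/"
# Hypothetical avatar location somewhere under the /data/ directory.
avatar_image = "https://gaulard.com/forum/data/avatars/m/0/2.jpg"

print("member page allowed:", allowed(member_page))
print("avatar image allowed:", allowed(avatar_image))
```

A scanner that honors robots.txt would be expected to skip the first URL and still report the second, which would fit with the avatar images showing up as "200 no error."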
 
Alfuzzy
  • #10
JGaulard said:
By the way, which were the URLs that returned 404s? I wouldn't mind fixing those. Thanks.
I thought you might be curious...especially since there are only three.:)

Here they are:

404 error #1: https://gaulard.com/forum/members/10/
404 error #2: https://gaulard.com/forum/members/10/
404 error #3: https://gaulard.com/forum/threads/40/

Looks like two of them are for the same URL.

I've been fixing many of these myself on my site. If you discover something with each of them (especially the first 2)...would love to hear what it was in case I run into something similar.

Thanks:)
 
Alfuzzy
  • #11
JGaulard said:
If you were to crawl again with the robots.txt being followed, those member pages shouldn't show up in the results. Maybe the avatar images would, but the actual pages shouldn't.
This is where there may be a question regarding the scanner tool. All the scans I've done have a setting for "Limit crawl based on robots.txt"...and I have that setting checked.

The circled items above are showing up as "200 no error" with the scanner. But...I'm assuming if the scanner tool is "seeing" these (even though they are blocked with your robots.txt)...then Google crawler may be seeing them too. Assuming the scanner tool is accurate.

Thanks
 
JGaulard
  • #12
Alfuzzy said:
I thought you might be curious...especially since there are only three.:)

Here they are:

404 error #1: https://gaulard.com/forum/members/10/
404 error #2: https://gaulard.com/forum/members/10/
404 error #3: https://gaulard.com/forum/threads/40/

Looks like two of them are for the same URL.

I've been fixing many of these myself on my site. If you discover something with each of them (especially the first 2)...would love to hear what it was in case I run into something similar.

Thanks:)
That's interesting. Those URLs are showing a 200 OK on the header checker site I use here: https://www.webconfs.com/http-header-check.php

I don't know why they would be appearing as 404 errors. I checked them while not logged into this site too and they seem fine. Very strange. I'll chalk it up to scanner malfunction. Especially since there's nothing I can do about it. Thanks for the URLs.

Jay
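
For comparing what a header checker sees with what a scanner reports, a small script can request the same URL under different user agents and print the status codes. This is a minimal sketch using only the standard library. The user-agent strings are placeholders, and the idea that the server answers differently per user agent is just one possible explanation for the mismatch; nothing in this thread confirms it. The `opener` parameter is there only so the function can be exercised without touching the network.

```python
import urllib.error
import urllib.request


def head_status(url, user_agent="Mozilla/5.0", opener=urllib.request.urlopen):
    """Send a HEAD request and return the HTTP status code.

    4xx/5xx responses raise HTTPError in urllib, so they are caught
    and returned as plain status codes instead.
    """
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def compare_agents(url, agents, opener=urllib.request.urlopen):
    """Return {user_agent: status} for the same URL under several user agents."""
    return {agent: head_status(url, agent, opener) for agent in agents}
```

Calling `compare_agents(url, ["Mozilla/5.0", "ExampleScanner/1.0"])` on one of the URLs in question (the second user agent is a made-up name) would show whether the two requests get the same answer. If they do, the difference is more likely on the scanner's side.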
 
JGaulard
  • #13
Alfuzzy said:
This is where there may be a question regarding the scanner tool. All the scans I've done have a setting for "Limit crawl based on robots.txt"...and I have that setting checked.

The circled items above are showing up as "200 no error" with the scanner. But...I'm assuming if the scanner tool is "seeing" these (even though they are blocked with your robots.txt)...then Google crawler may be seeing them too. Assuming the scanner tool is accurate.

Thanks
Well, those avatar images link to the member pages, and the member pages are blocked, but the avatar images themselves aren't blocked and should be crawled. They're like any other images. I'm guessing other images on the site showed up the same way the avatar images did. Ultimately, I don't think they're a problem because, like I said above, the more images crawled, the better.
 
Alfuzzy
  • #14
JGaulard said:
That's interesting. Those URLs are showing a 200 OK on the header checker site I use here: https://www.webconfs.com/http-header-check.php

I don't know why they would be appearing as 404 errors. I checked them while not logged into this site too and they seem fine. Very strange. I'll chalk it up to scanner malfunction. Especially since there's nothing I can do about it. Thanks for the URLs.

Jay
Good deal...thanks. I was leaving open the possibility this could be a scanner malfunction. I may contact the developer...and see what they have to say.

I wonder if internet forums are "special animals" (compared to WordPress sites for example)...and some scanner tools get "confused" when they scan forums.

Thanks
 