
EmeraldHike
I've been reading through a few threads in the XenForo community and I'm slightly confused about something. In one thread in particular, the original poster asks why someone might disallow the /posts/ directory in their robots.txt file. The thread began back in 2013, so I'm wondering whether something has changed in the forum software's code since then. I'm a fairly new user and, from what I can tell, there aren't any URLs with /posts/ in them, except for links to the reactions someone may have applied to a certain post.
In the thread, the user questions why the developers have disallowed /posts/ in their own robots.txt file. A developer named Mike replied that there wasn't any SEO reason, per se; they simply didn't want Google, or any other search engine for that matter, to follow useless links. Since the URLs in question are merely 301 redirects that send whoever clicks them to the most recent post in a thread, it's a waste of time for search engines to crawl them. The person who created the thread kept talking about how these URLs exist on the homepage, which confused me. I don't see any of them on any of my installs.
So my question is: what exactly is being discussed here? Have the URLs that had /posts/ in them changed to something else over the years? Did they once look like this?
https://mysite.com/posts/1234/
From what I can see, the only 301 redirects that exist on the homepage look like this:
https://mysite.com/threads/thread-name.1234/post-1234
Has XenForo updated the old URLs to the new ones?
I know people who use this forum software are sometimes slow to upgrade to the latest version because of all their template and code modifications, and that they can be even slower to update their robots.txt files. As I cruise around the web peeking at other sites' robots.txt files, almost everyone has the /posts/ directory blocked. About 90% of the largest forums that run XenForo have that directory, among others, blocked. I'm curious whether they disallowed it because of this 301 redirect and duplicate content issue or whether they're merely blocking the reactions link. It couldn't be just the reactions link.
In my own robots.txt file, I have these two lines included:
Disallow: /threads/*/post
Disallow: /threads/*/latest
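For anyone who wants to sanity-check which URLs those two rules cover, here is a minimal Python sketch. It assumes the wildcard behavior described in Google's robots.txt documentation and RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end, and otherwise a rule is a prefix match. The names `rule_to_regex` and `is_blocked` are hypothetical, and the sketch ignores Allow rules and user-agent groups entirely, so treat it as a rough check rather than a real robots.txt parser.

```python
import re
from urllib.parse import urlsplit

# The two rules from the robots.txt in this post.
DISALLOW_RULES = ["/threads/*/post", "/threads/*/latest"]


def rule_to_regex(rule: str) -> re.Pattern:
    """Turn a robots.txt path rule into a regex anchored at the start of the path."""
    anchored_end = rule.endswith("$")
    if anchored_end:
        rule = rule[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored_end else ""))


def is_blocked(url: str, rules=DISALLOW_RULES) -> bool:
    """True if the URL's path (plus query string, if any) matches a Disallow rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return any(rule_to_regex(rule).match(target) for rule in rules)


if __name__ == "__main__":
    for url in (
        "https://mysite.com/threads/thread-name.1234/post-1234",
        "https://mysite.com/threads/thread-name.1234/latest",
        "https://mysite.com/threads/thread-name.1234/",
        "https://mysite.com/posts/1234/",
    ):
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Since these are prefix matches, the first rule would also catch any path that merely starts with "post" after the thread segment, so it may be worth running a sample of real URLs through a check like this before relying on it.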
Once I blocked these two URL patterns, I saw marked changes in crawling and indexing. As the original thread creator on the XenForo site stated, Google definitely doesn't handle 301 redirects the way it's supposed to. When you link to a redirected URL on your own website, that URL might stay indexed and the target URL might never get indexed. It's a weird situation. If you have any input, please let me know. Thanks!