I think I got it. I finally think I got it. I've settled on what I think is the best approach for handling all of the extra URLs that XenForo offers its visitors. If you're new to this, or if you haven't noticed it in your own install, XenForo creates tons of 301 redirects and other links that need to be dealt with one way or another. If left unchecked, search engine crawlers will spend all day and night crawling unnecessary links, which, as you may already know, wastes crawl budget. There are other issues too, such as good URLs getting watered down and URLs not canonicalizing properly. I have no idea if Google hands out penalties for having too many links to pages blocked in the robots.txt file, but I really don't want to find out. I'd rather be rid of the useless links.
I'll tell you where the trouble spots are and then I'll tell you how to fix each one. I'll make a list down below and offer my solution.
/posts/ - Notification emails link to threads through this directory. These URLs aren't linked to from anywhere but the emails, and they 301 redirect to the full thread URL. This directory is also used to link to the reactions list for each post. Those reaction pages carry a noindex meta tag, but they're very thin, and there's no reason for Google to crawl them. I block this directory in the robots.txt file. I'm not concerned about doing so because there isn't an overwhelming number of these URLs. I also remove the link from the site altogether for guests. Logged-in members still see the links perfectly fine. Here's how to remove the links in the template system:
Code:
TEMPLATE: "REACTION_LIST_ROW"
(remove reactions link row for guests - /posts/ links on thread pages)
line 5
<!-- REMOVE REACTIONS FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a class="reactionsBar-link" href="{{ link($link, $content, $linkParams) }}" data-xf-click="overlay" data-cache="false">{$reactions}</a>
<xf:else />
{$reactions}
</xf:if>
<!-- REMOVE REACTIONS FOR GUESTS -->
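If you want to sanity-check a robots.txt rule like the /posts/ block before relying on it, here's a minimal Python sketch that uses the standard library's urllib.robotparser. The robots.txt body, the example.com URLs, and the helper names (build_parser, is_crawlable) are assumptions made up for illustration, not taken from a real install. Python's parser also doesn't necessarily match Googlebot's behavior in every edge case, so treat this as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: only the /posts/ directory is disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /posts/
"""


def build_parser(robots_body: str) -> RobotFileParser:
    """Parse a robots.txt body held in a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser


def is_crawlable(parser: RobotFileParser, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)

# Illustrative URLs only; swap in paths from your own forum.
for url in (
    "https://example.com/posts/123/",
    "https://example.com/threads/example-thread.1/",
    "https://example.com/goto/post?id=123",
):
    print(url, "->", "crawlable" if is_crawlable(parser, url) else "blocked")
```

The idea is simply to confirm that the rule blocks /posts/ without accidentally catching thread URLs or anything else you want to stay crawlable.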
/goto/ - When someone quotes a post in a reply, a link is created that, when clicked, jumps to the post that was quoted. This link is actually a redirect that leads to the original thread URL with a hash (#) fragment appended to its end. These are needless links that should be removed. Don't block this directory in the robots.txt file, though. A blocked URL can't be crawled, so Google would never see the redirect, and you want any links that have already been crawled to consolidate with the thread URL. Just remove the links so no new ones are discovered. Again, allowing search engines such as Google to crawl these links is a waste of crawl budget, and these links don't always canonicalize, which creates a duplicate content issue. Here's how to remove these links in the template system:
Code:
TEMPLATE: "BB_CODE_TAG_QUOTE"
(remove quote username link (goto) for guests - /goto/ links on thread pages)
line 9
<!-- REMOVE QUOTE USERNAME LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a href="{{ link('goto/' . {$source.type}, null, {'id': $source.id}) }}"
class="bbCodeBlock-sourceJump"
data-xf-click="attribution"
data-content-selector="#{$source.type}-{$source.id}">{{ phrase('x_said:', {'name': $name}) }}</a>
<xf:else />
{{ phrase('x_said:', {'name': $name}) }}
</xf:if>
<!-- REMOVE QUOTE USERNAME LINK FOR GUESTS -->
/attachments/ - This is the directory that's used for images that are uploaded to posts. I block these attachments for guests in the permission system contained within the software. By doing this, I'm forcing the URLs to return a 403 status code when clicked on or crawled. I'm not in love with returning error codes on my site, so once these URLs are completely out of the Google index, I'll remove the links to these as well. There's no reason for guests to see these links and, as it stands right now, I've got over 40,000 attachment URLs on one of my websites. I'd rather keep that link juice directed somewhere else. There's a nice add-on written by customizeFX that handles the removal of this link, if you're interested.
/members/ - This is a very similar issue to the previous one. We've got three choices when it comes to members and attachments. We can allow them to be crawled freely, which invites thin-content quality problems of the kind Google's Panda update targeted; we can block them in the robots.txt file, which would create tons of blocked-page warnings in Google Search Console; or we can block access to these pages via the permissions system contained within the software itself. That last option causes a different type of error in the console though: a 403 status code. I don't want any of these errors long term, so I keep these pages blocked via the permission system and, once all of the URLs are cleaned out of Google's index, I'll remove the links to them, just like with the attachments above. For an add-on that does this quickly and easily, you can visit customizeFX again.
/threads/xx-thread-name-xx.123/latest - These URLs and the next are insidious little beasts. On every single forum page, there are links to the latest post in each thread. Although these links are marked with the nofollow attribute, Google still follows them. Each one uses a 301 redirect to eventually land on a specific post on the thread page itself, and the destination URL ends with a hash (#) fragment. Allowing Google to follow all these links is senseless and can consume most of a website's crawl budget. Also, 301 redirects aren't the best thing to have all over a website, as each one adds an extra request before the real page loads. I can't imagine that search engines like them. In addition to that, I've seen many cases where Google doesn't honor the redirect at all, or chooses the redirecting URL as the canonical one. This can create duplicate content and it's just bad form. I remove these links completely from the site for those who aren't logged in as members. Here's how I do it:
Code:
TEMPLATE: "THREAD_LIST_MACROS"
(remove date link for guests)
line 180
<!-- REMOVE DATE LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a href="{{ link('threads/latest', $thread) }}" rel="nofollow"><xf:date time="{$thread.last_post_date}" class="structItem-latestDate" /></a>
<xf:else />
<xf:date time="{$thread.last_post_date}" class="structItem-latestDate" />
</xf:if>
<!-- REMOVE DATE LINK FOR GUESTS -->
/threads/xx-thread-name-xx.123/post-123 - These URLs cause the same exact issue as above, but they're even worse. These "post" links are scattered throughout the posts themselves, and for each and every post that's created, another 301 redirect URL is created as well. Not only that, but when these URLs appear in a node list, they don't have the nofollow attribute applied to them. In the post they do, but not in the node list. Currently, I've got over 600 of these URLs that haven't canonicalized on one of my websites and I don't think they ever will. I don't block these URLs. I keep them crawlable, but I remove the links. Here's how I do it:
Code:
TEMPLATE: "NODE_LIST_FORUM"
(remove thread link for guests)
line 121
<!-- REMOVE THREAD LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<xf:if is="$extras.LastThread.isUnread()">
<a href="{{ link('threads/unread', $extras.LastThread) }}" class="node-extra-title" title="{$extras.LastThread.title}">{{ prefix('thread', $extras.LastThread) }}{$extras.LastThread.title}</a>
<xf:else />
<a href="{{ link('threads/post', $extras.LastThread, {'post_id': $extras.last_post_id}) }}" class="node-extra-title" title="{$extras.LastThread.title}">{{ prefix('thread', $extras.LastThread) }}{$extras.LastThread.title}</a>
</xf:if>
<xf:else />
{{ prefix('thread', $extras.LastThread) }}{$extras.LastThread.title}
</xf:if>
<!-- REMOVE THREAD LINK FOR GUESTS -->
Code:
TEMPLATE: "POST_MACROS"
(remove post number link for guests)
line 51
<!-- REMOVE POST NUMBER LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a href="{{ link('threads/post', $thread, {'post_id': $post.post_id}) }}" rel="nofollow">
#{{ number($post.position + 1) }}
</a>
<xf:else />
#{{ number($post.position + 1) }}
</xf:if>
<!-- REMOVE POST NUMBER LINK FOR GUESTS -->
Code:
TEMPLATE: "POST_MACROS"
(remove share tooltip and link for guests)
line 32
<!-- REMOVE SHARE TOOLTIP LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a href="{{ link('threads/post', $thread, {'post_id': $post.post_id}) }}"
data-xf-init="share-tooltip" data-href="{{ link('posts/share', $post) }}"
rel="nofollow">
<xf:fa icon="fa-share-alt"/>
</a>
<xf:else />
<xf:fa icon="fa-share-alt"/>
</xf:if>
<!-- REMOVE SHARE TOOLTIP LINK FOR GUESTS -->
Code:
TEMPLATE: "POST_MACROS"
(remove post date link for guests)
line 21
<!-- REMOVE POST DATE LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a href="{{ link('threads/post', $thread, {'post_id': $post.post_id}) }}" class="u-concealed"
rel="nofollow">
<xf:date time="{$post.post_date}"/>
</a>
<xf:else />
<xf:date time="{$post.post_date}"/>
</xf:if>
<!-- REMOVE POST DATE LINK FOR GUESTS -->
Miscellaneous Unnecessary Links - There are a few other links that I like to remove because they are completely unnecessary to those who aren't logged in as members. These are mostly date links, but again, they consume resources that can be better used elsewhere. Here's the template code to remove them:
Code:
TEMPLATE: "THREAD_LIST_MACROS"
(remove date link for guests)
line 150
<!-- REMOVE DATE LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<li class="structItem-startDate"><a href="{{ link('threads', $thread) }}" rel="nofollow"><xf:date time="{$thread.post_date}" /></a></li>
<xf:else />
<li class="structItem-startDate"><xf:date time="{$thread.post_date}" /></li>
</xf:if>
<!-- REMOVE DATE LINK FOR GUESTS -->
Code:
TEMPLATE: "THREAD_LIST_MACROS"
(remove prefix filter link for guests)
line 91
<!-- REMOVE PREFIX FILTER LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a href="{{ link('forums', $forum, {'prefix_id': $thread.prefix_id}) }}" class="labelLink" rel="nofollow">{{ prefix('thread', $thread, 'html', '') }}</a>
<xf:else />
{{ prefix('thread', $thread, 'html', '') }}
</xf:if>
<!-- REMOVE PREFIX FILTER LINK FOR GUESTS -->
Code:
TEMPLATE: "THREAD_VIEW"
(remove thread date link for guests)
line 16
<!-- REMOVE THREAD DATE LINK FOR GUESTS -->
<xf:if is="$xf.visitor.user_id">
<a href="{{ link('threads', $thread) }}" class="u-concealed"><xf:date time="{$thread.post_date}" /></a>
<xf:else />
<xf:date time="{$thread.post_date}" />
</xf:if>
<!-- REMOVE THREAD DATE LINK FOR GUESTS -->
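One way to confirm that template edits like these took effect is to save the HTML a guest sees and scan it for the link types covered in this post. Here's a rough Python sketch of that idea using only the standard library. The patterns assume the forum sits at the site root and uses friendly URLs in the shapes shown above; the sample HTML, the labels, and the function name audit_guest_html are all made up for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Path patterns for the link types discussed in this post (assumed URL shapes).
PATTERNS = {
    "posts": re.compile(r"^/posts/"),
    "goto": re.compile(r"^/goto/"),
    "attachments": re.compile(r"^/attachments/"),
    "members": re.compile(r"^/members/"),
    "latest": re.compile(r"^/threads/[^/]+/latest$"),
    "post": re.compile(r"^/threads/[^/]+/post-\d+$"),
}


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def audit_guest_html(html: str) -> dict:
    """Return {label: [hrefs]} for links that match one of the patterns."""
    collector = LinkCollector()
    collector.feed(html)
    found = {}
    for href in collector.hrefs:
        path = urlsplit(href).path  # ignore scheme, host, query, and fragment
        for label, pattern in PATTERNS.items():
            if pattern.search(path):
                found.setdefault(label, []).append(href)
    return found


# Made-up guest HTML: one plain thread link plus three links that should be gone.
SAMPLE_HTML = (
    '<a href="/threads/example-thread.1/">Example thread</a>'
    '<a href="/threads/example-thread.1/latest">Today</a>'
    '<a href="/threads/example-thread.1/post-5">#5</a>'
    '<a href="/goto/post?id=5">Someone said:</a>'
)

print(audit_guest_html(SAMPLE_HTML))
```

An empty result for a page fetched while logged out would suggest the guest-only conditionals are doing their job. Anything that shows up points to a template that still needs the same treatment.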
I hope I didn't miss anything. I don't think I did, but if I did, I'll update this thread. If you have anything to add or if you have any other ideas, I'd love to see what you've got to say. Please feel free to chime in below. Thanks!