Technical SEO
- Technical SEO is the practice of optimizing the technical aspects of a website to make it easier for search engines to find, crawl, and index its web pages.
- It helps increase visibility and rankings in search engines.
- It refers to behind-the-scenes technical elements that power your organic growth in search engines, such as site architecture, mobile optimization, and page speed.
- It also covers user experience factors, such as making your website faster and easier to use on mobile devices.
- The first step in improving technical SEO is knowing where you stand by performing a site audit.
- The second step is to create a plan to address the areas where you fall short.
- Technical SEO, content strategy, and link-building strategies all work in tandem to help your pages rank highly in search.
- Technical SEO is broadly divided into five categories.
1. Why Is Technical SEO Important?
- Technical SEO can make or break your SEO performance.
- If pages on your site aren’t accessible to search engines, they won’t appear in search results—no matter how valuable your content is.
- This results in a loss of traffic to your website and potential revenue to your business.
- A website’s speed and mobile-friendliness are also confirmed ranking factors.
- If your pages load slowly, users may get annoyed and leave your site. This signals that your site doesn’t create a positive user experience. As a result, search engines may not rank your site as highly.
2. How Websites Work
- The user requests a domain in their browser, for example by clicking a link to the website.
- The browser makes requests to the server for the code upon which your web page is constructed, such as HTML, CSS, and JavaScript.
- The server sends the resources (website files) to be assembled in the searcher’s browser.
- The browser receives those resources but still needs to put them all together and render the web page so that the user can see it in their browser. How quickly this assembly happens largely defines page speed.
- As the browser parses and organizes all the web page’s resources, it creates a Document Object Model (DOM). The website will then appear in the browser.
Google renders certain resources, like JavaScript, on a “second pass.” Google will look at the page without JavaScript first, then a few days to a few weeks later, it will render JavaScript, meaning SEO-critical elements that are added to the page using JavaScript might not get indexed.
A website is built from three core languages used to construct its web pages:
- HTML – Structure – What a website says (titles, body content, etc.)
- CSS – Appearance – How a website looks (color, fonts, etc.)
- JavaScript – Action – How it behaves (interactive, dynamic, etc.)
2.1. HTML: What a website says (Structure)
- HTML stands for hypertext markup language.
- It serves as the backbone of a website.
- Elements like headings, paragraphs, lists, and other content are all defined in the HTML.
- Google crawls these HTML elements to determine how relevant your content is to a query.
- What’s in your HTML plays a huge role in how your web page ranks in Google.
- HTML is used to create the content (Title, H1, H2-H6, Meta Description, etc.), but not to style it.
- Instead, CSS is used to define the fonts, colors, and layouts of that content.
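As a minimal, hypothetical illustration, the HTML below shows some of the elements mentioned above (title, meta description, headings, body content); the values are placeholders, not recommendations.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Title and meta description are read by search engines and often shown in results -->
  <title>Technical SEO Basics | Example Site</title>
  <meta name="description" content="A short summary of what this page covers.">
</head>
<body>
  <!-- Headings and paragraphs define the structure of the content -->
  <h1>Technical SEO Basics</h1>
  <h2>Why crawling matters</h2>
  <p>Body content lives in paragraphs, lists, and other HTML elements.</p>
</body>
</html>
```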
2.2. CSS: How a website looks (Appearance)
- CSS stands for “cascading style sheets”.
- CSS allows you to control the fonts, colors, and layouts of your content.
- With CSS, web pages can be “beautified” without requiring styles to be coded manually into the HTML of every page.
- Keeping styles in CSS files rather than in the HTML makes your pages less code-heavy, reducing file transfer size and making load times faster.
- Browsers still have to download resources like your CSS file, so compressing them can make your web pages load faster.
- Having your pages more content-heavy than code-heavy can lead to better indexing of your site’s content.
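As a sketch of this idea (the file name and selectors are made up), styling can be moved out of the HTML into one external stylesheet that the browser downloads and caches:

```html
<!-- In the HTML: reference one shared stylesheet instead of repeating inline styles -->
<link rel="stylesheet" href="/styles.css">
```

```css
/* styles.css: downloaded once, cached, and reused across pages */
body  { font-family: Georgia, serif; color: #222; }
h1    { font-size: 2rem; }
.post { max-width: 700px; margin: 0 auto; }
```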
In the earlier days of the Internet, webpages were built with HTML. When CSS came along, webpage content gained the ability to take on styling. When the programming language JavaScript entered the scene, websites could have not only structure and style, but they could also be dynamic.
2.3. JavaScript: How a website behaves (Action)
- JavaScript has opened up a lot of opportunities for non-static web page creation.
- If a page is enhanced with JavaScript, the user’s browser executes that JavaScript against the static HTML returned by the server, resulting in some sort of interactivity.
- JavaScript can do almost anything to a page. It could create a pop-up, for example, or it could request third-party resources like ads to display on your page.
(a) Client-side rendering versus server-side rendering
- Search engines don’t view JavaScript the same way human visitors do. That’s because of client-side versus server-side rendering.
- Most JavaScript is executed in a client’s browser.
- On the other hand, with server-side rendering, the files are executed on the server and the server sends them to the browser in their fully rendered state.
- SEO-critical page elements such as text, links, and tags that are loaded on the client’s side with JavaScript, rather than represented in your HTML, are invisible from your page’s code until they are rendered. This means that search engine crawlers won’t see what’s in your JavaScript — at least not initially.
- Google says that, as long as you’re not blocking Googlebot from crawling your JavaScript files, they’re able to render and understand your web pages just like a browser can.
- This means that Googlebot should see the same thing as a user viewing the site.
- However, due to this “second wave of indexing” for client-side JavaScript, Google can miss certain elements that are only available once JavaScript is executed.
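A minimal sketch of the problem: in the hypothetical page below, the product links only exist after the browser runs the JavaScript, so they are not in the initial HTML that crawlers see on the first pass.

```html
<!-- Initial HTML sent by the server: the container is empty -->
<div id="products"></div>

<script>
  // Client-side rendering: content is fetched and injected after the page loads.
  // Until this runs, the links below are invisible in the page source.
  fetch('/api/products') // hypothetical endpoint
    .then(response => response.json())
    .then(products => {
      document.getElementById('products').innerHTML = products
        .map(p => `<a href="/products/${p.slug}">${p.name}</a>`)
        .join('');
    });
</script>
```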
3. Technical SEO Factors
Technical SEO depends on two important processes: Crawling & Indexing
3.1. Crawling
- Crawlability is the foundation of your technical SEO strategy.
- Crawling allows search engines to fetch content from your website’s pages and follow the links on those pages to find even more pages.
- Search bots will crawl your pages to gather information about your site.
- For example, when a search engine like Google crawls a blog page, it sees the recently added links to new blog posts. That’s one of the ways Google discovers your new blog posts.
- If these bots are somehow blocked from crawling, they can’t index or rank your pages.
3.2. Indexing
- Once search engines crawl your pages, they then try to analyze and understand the content on those pages.
- The search engine then stores that content in its search index—a huge database containing billions of web pages.
- Your webpages must be indexed by search engines to appear in search results.
- As search bots crawl your website, they begin indexing pages based on their topic and relevance to that topic. Once indexed, your page is eligible to rank on the SERPs. The factors covered in the sections below can help your pages get crawled and indexed.
How to check whether your page is indexed or not
- Type “site:” followed by the URL of the page or website you want to check into Google’s search box.
- For example, search “site:www.semrush.com” in Google. This tells you (roughly) how many pages from that site Google has indexed.
- You can also check whether an individual page is indexed by searching its full URL with the “site:” operator.
4. How to Optimize Crawling
The first step to implementing technical SEO is to ensure that all of your important pages are accessible and easy to navigate. There are a few ways you can control what gets crawled on your website.
- Create an SEO-friendly site architecture.
- Create and submit an XML sitemap to search engines.
- Maximize your crawl budget.
- Set a URL structure.
- Utilize robots.txt.
- Add breadcrumb menus.
- Use pagination.
4.1. Create an SEO-Friendly Site Architecture
- Your website has multiple pages. Those pages need to be organized in a way that allows search engines to easily find and crawl them.
- Site architecture is how you organize the pages (linked together) on your site.
- An effective site structure organizes pages in a way that helps crawlers find your website content quickly and easily.
- Ensure all the pages are just a few clicks away from your homepage when structuring your site.
- The closer a page is to your homepage, the higher priority it gets for crawling by search engines.
- The homepage links to category pages, and the category pages link to individual subpages.
- The structure should also minimize the number of orphan pages.
Orphan pages are pages with no internal links pointing to them, making it difficult (or sometimes impossible) for crawlers and users to find them.
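As an illustration only (the URLs are made up), a flat structure like the one sketched below keeps every page within a few clicks of the homepage and gives each page at least one internal link, so nothing is orphaned:

```
example.com/                          (homepage)
├── example.com/blog/                 (category page, linked from the homepage)
│   ├── example.com/blog/technical-seo/
│   └── example.com/blog/link-building/
└── example.com/services/             (category page, linked from the homepage)
    ├── example.com/services/audits/
    └── example.com/services/consulting/
```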
4.2. Create an XML Sitemap
- An XML sitemap helps search bots find and crawl your web pages.
- It is essentially a map of your website’s structure.
- An XML sitemap is a file containing a list of important pages on your site.
- It lets search engines know which pages you have and where to find them.
- It is especially important if your site contains a lot of pages or if they’re not well linked together.
- Remember to keep your sitemap up-to-date as you add and remove web pages.
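For reference, a minimal XML sitemap looks like the sketch below; the URLs and dates are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/technical-seo/</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```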
(a) Submit Your Sitemap to Google
- You only need to submit your sitemap to each search engine once.
- Submit it to Google via Google Search Console (GSC) and to Bing via Bing Webmaster Tools.
Your sitemap is usually located at one of these two URLs:
- yoursite.com/sitemap.xml
- yoursite.com/sitemap_index.xml
4.3. Maximize your crawl budget (Crawl Adjustments)
- Crawl budget is the number of pages that search engines will crawl within a certain timeframe.
- Each website has a different crawl budget, which is a combination of how often Google wants to crawl a site and how much crawling your site allows.
- Search engines calculate crawl budget based on:
- crawl limit (how often they can crawl without causing issues) &
- crawl demand (how often they’d like to crawl a site).
Here are a few tips to ensure that you’re maximizing your crawl budget:
- Make sure you’re prioritizing your most important pages for crawling.
- Remove or canonicalize duplicate pages.
- Fix or redirect any broken links.
- Make sure your CSS and JavaScript files are crawlable.
- Check your crawl stats regularly and watch for sudden dips or increases.
- Make sure any bot or page you’ve disallowed from crawling is meant to be blocked.
- Keep your sitemap updated and submit it to the appropriate webmaster tools.
- Prune your site of unnecessary or outdated content.
4.4. Set a URL structure
- URL structure refers to how your URLs are organized, for example into subdomains and subfolders (subdirectories).
- URL structure is often determined by your site architecture, since the URL path reflects where the page sits within that structure.
- Submit a list of URLs of your important pages to search engines in the form of an XML sitemap.
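A hypothetical illustration of a URL structure that mirrors a simple site architecture (the domains and paths are examples only):

```
https://www.example.com/blog/                      (subfolder for the blog category)
https://www.example.com/blog/technical-seo-guide/  (post inside that category)
https://shop.example.com/                          (separate section on a subdomain)
```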
4.5. Utilize robots.txt
- When a web robot crawls your site, it will first check the /robots.txt file.
- A robots.txt file tells search engines where they can and can’t go on your site.
- It is also known as the Robots Exclusion Protocol.
- This protocol can allow or disallow specific web pages or sections of your site for crawling.
- Note that robots.txt controls crawling, not indexing; if you’d like to prevent bots from indexing pages, use a noindex robots meta tag instead.
Your robots.txt file is available at your homepage URL with “/robots.txt” at the end.
- Check it to ensure you’re not accidentally blocking access to important pages that Google should crawl via the disallow directive.
- For example, you wouldn’t want to block your blog posts and regular website pages, because then they’d be hidden from Google.
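A minimal, hypothetical robots.txt showing the basic directives (the blocked paths are just examples, not recommendations):

```
# robots.txt file at https://www.example.com/robots.txt
User-agent: *
Disallow: /login/
Disallow: /thank-you/

Sitemap: https://www.example.com/sitemap.xml
```

And, as noted above, keeping a page out of the index is done with a robots meta tag in the page’s HTML rather than in robots.txt:

```html
<!-- In the <head> of the page you want kept out of search results -->
<meta name="robots" content="noindex">
```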
(a) Reasons to have a robots.txt file
- You may want to block certain bots from crawling your site altogether.
- Unfortunately, there are some bots with malicious intent that spam your content.
- If you notice this bad behavior, you can use robots.txt to tell them to stay out of your website (though truly malicious bots often ignore robots.txt, so it isn’t a hard security measure).
- In this scenario, robots.txt works as a first line of defense against bad bots on the internet.
- You have a crawl budget that you don’t want to spend on unnecessary data. So, you may want to exclude pages that don’t help search bots understand what your website is about, for example, a Thank You page from an offer or a login page.
(b) Access restrictions
- If you want a page to be accessible to some users but not to search engines, then what you probably want is one of these three options:
- Some kind of login system
- HTTP authentication (where a password is required for access)
- IP whitelisting (which only allows specific IP addresses to access the pages)
- This type of setup is best for things like internal networks, member-only content, or staging, test, or development sites. It allows for a group of users to access the page, but search engines will not be able to access the page and will not index it.
How to See Crawl Activity
- From Google Search Console (GSC) – the easiest way
GSC → Settings → Crawling → Crawl Stats
- From your website’s server access logs – the advanced way
- If your hosting has a control panel like cPanel, you should have access to raw logs and log aggregators like AWStats and Webalizer.
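If you go the log route, crawl activity shows up as individual requests in the server’s access log. A made-up example line (in the common combined log format) showing a Googlebot hit might look like this:

```
66.249.66.1 - - [15/Jan/2024:10:12:45 +0000] "GET /blog/technical-seo/ HTTP/1.1" 200 15320 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
```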
4.6. Add Breadcrumb menus
- A breadcrumb menu is a trail of text links that guides users back to the start of their journey on your website.
- It shows users where they are on the website and how they reached that point.
- It’s a menu of pages that tells users how their current page relates to the rest of the site.
- They aren’t just for website visitors; search bots use them, too.
- Breadcrumbs should do two things:
1) Be visible to users so they can easily navigate your web pages without using the Back button.
2) Use structured markup to give accurate context to the search bots that are crawling your site (see the example below).
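For the structured-markup part, breadcrumbs are commonly described with schema.org BreadcrumbList markup in JSON-LD; the names and URLs below are placeholders.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://www.example.com/" },
    { "@type": "ListItem", "position": 2, "name": "Blog", "item": "https://www.example.com/blog/" },
    { "@type": "ListItem", "position": 3, "name": "Technical SEO" }
  ]
}
</script>
```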
4.7. Use pagination
- It is like numbering the pages on your website.
- Pagination is a navigation technique used to divide a long list of content into multiple pages. For example, if a website has 100 blog articles and shows 10 per page, pagination creates 10 pages.
- This approach is favored over infinite scrolling.
- In infinite scrolling, content loads dynamically as users scroll down the page.
- This creates an issue for Google, because it may not be able to access all the content that loads dynamically.
- And if Google can’t access your content, it won’t appear in search results.
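A minimal sketch of crawlable pagination (the URLs are illustrative): each paginated page is a real URL, and the page links are plain anchor tags that crawlers can follow.

```html
<!-- On example.com/blog/ : numbered links to each paginated page -->
<nav aria-label="Blog pagination">
  <a href="/blog/">1</a>
  <a href="/blog/page/2/">2</a>
  <a href="/blog/page/3/">3</a>
  <a href="/blog/page/2/">Next</a>
</nav>
```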