How It Started

I recently met the co-founder of a two-man start-up to discuss implementing SEO for their social platform. It is built on a MySQL + PHP Zend Framework + Bootstrap CSS + Angular.js stack. They are missing out on a lot of potential organic traffic, which would help them grow the community they need to attract investors.

I knew angular.js had a bad reputation regarding SEO. After spending time digging into the issue, I concluded it is much worse than I thought. I have now advised them to ditch angular.js if they want to capture that traffic.

Here Is Why

If you are unfamiliar with angular.js, it is a framework that facilitates the development of single-page applications (SPAs), which rely on Ajax and JavaScript to load and display content. In other words, although it may feel like you are navigating from page to page when using an SPA, the browser does not load and reload pages. It fetches content from the server and updates the current page.

The URLs & JavaScript

An SPA is therefore loaded from a single URL. When you click on a link, the URL is updated in a way that does not trigger a page reload. Typically, such URLs contain a hashbang (#!): http://myapp.com/#!path/to/content

Now, why is that an issue for SEO? First, web crawlers need URLs to index content and second, they need to have access to that content to understand what it is all about.

If JavaScript is not enabled in a browser, the angular.js content is not loaded, so the links on the page are not displayed either. For web crawlers, this blocks both the discovery of URLs and the access to content.

To solve that issue, a standard practice has been implemented. When web crawlers fetch pages from servers, every URL containing a hashbang is converted into an _escaped_fragment_ URL. The server answers these with HTML equivalent pages, that is, the same content as the SPA would display, without JavaScript. All the URLs of the SPA can also be listed in a sitemap.xml to let the crawlers know about them.
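To make the conversion concrete, here is a minimal sketch of the mapping a crawler applies to a hashbang URL. The function name is made up for illustration, and the percent-encoding is simplified compared with what a real crawler does; only the _escaped_fragment_ query parameter itself comes from the crawling scheme.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment_url(url):
    """Map a hashbang URL to the URL a crawler would request instead."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        # No hashbang in the URL: nothing to convert.
        return url
    # Everything after "#!" becomes the value of _escaped_fragment_.
    # Simplified escaping: keep "/" and "=", percent-encode "&", "#", "%", etc.
    value = quote(parts.fragment[1:], safe="/=")
    extra = "_escaped_fragment_=" + value
    query = parts.query + "&" + extra if parts.query else extra
    # The fragment is dropped because it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment_url("http://myapp.com/#!path/to/content"))
# prints http://myapp.com/?_escaped_fragment_=path/to/content
```

The point is that the part after the hashbang, which a browser never sends to the server, ends up in the query string, where the server can finally see it.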

Yes, this means a lot of extra work just to let search engines index your SPA content, when the same result is easily achieved with a multipage application, where each page has its own URL and content does not have to be loaded with Ajax.

Hashbang URLs are ugly and people don't like them. There are solutions to make them look nice: http://myapp.com/path/to/content. However, if the hashbang is not there anymore, how can web crawlers know they should use an _escaped_fragment_ version to access content without relying on JavaScript? They can't tell from the URL itself. There is a workaround: the server can detect from the HTTP headers that a request is coming from a web crawler and return HTML equivalent content.
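A rough sketch of that workaround, assuming the server inspects the User-Agent header: the token list below is illustrative and far from complete, and both function names are hypothetical.

```python
# Illustrative, incomplete list of substrings found in crawler User-Agent headers.
CRAWLER_TOKENS = ("googlebot", "bingbot", "yandex", "baiduspider", "duckduckbot")


def is_crawler(user_agent):
    """Guess from the User-Agent header whether a request comes from a crawler."""
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def choose_response(user_agent, spa_shell, snapshot):
    """Serve the HTML equivalent page to crawlers and the SPA to everyone else."""
    return snapshot if is_crawler(user_agent) else spa_shell


ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(is_crawler(ua))  # prints True
```

Such a list has to be maintained by hand, and any crawler missing from it gets the empty JavaScript shell, which is one more reason this approach feels fragile.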

Prerendering

Today, Google is the only search engine capable of executing some JavaScript when crawling and indexing web pages. If you don't care about other search engines, you might think that you don't need to worry about serving HTML equivalent content for SEO.

Wrong. The way Google indexes content depends on whether it is visible to users when they display the page. If they need to click here and there to access it, it is not considered a good user experience. Users want to be satisfied fast. Such content is demoted when computing relevancy, and that can hurt your rankings.

Google is also wary of dynamically loaded content. It wants to be sure it can trust it. It does not want to take the risk of referencing a cooking website which might display porn pictures instead of recipes to its users. Therefore, using Ajax raises the trust hurdle for SEO.

If you care about all search engines and you already know angular.js, then you already know that prerendering web pages is the most commonly recommended SEO solution. However, it is not always easy to achieve. Some companies offer paid services to do this for you. You submit a URL and they return the HTML equivalent content of your page.
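However the snapshots are produced, the server still has to hand them out. Here is a minimal sketch, assuming the prerendered pages are kept in a simple dictionary keyed by SPA route; the SNAPSHOTS store and the snapshot_for function are hypothetical names, and a real setup would more likely read files or call a prerendering service.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical store of prerendered pages, keyed by SPA route.
SNAPSHOTS = {
    "path/to/content": (
        "<html><head><title>Content</title></head>"
        "<body><h1>Content</h1></body></html>"
    ),
}


def snapshot_for(request_url):
    """Return prerendered HTML for a crawler request, or None if there is none."""
    query = parse_qs(urlsplit(request_url).query, keep_blank_values=True)
    if "_escaped_fragment_" not in query:
        # Regular visitor: the caller should serve the JavaScript application.
        return None
    # parse_qs has already percent-decoded the route.
    route = query["_escaped_fragment_"][0]
    return SNAPSHOTS.get(route)


html = snapshot_for("http://myapp.com/?_escaped_fragment_=path/to/content")
print(html is not None)  # prints True
```

Every route of the SPA needs an entry in that store, and every entry has to be regenerated when the content changes.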

The Truly Ugly Part

Proper SEO requires much more than prerendering solutions. You need to be able to control at least the title, the description and the robots META tag of each page. Setting canonical links or previous and next link tags can be necessary too. Angular.js does not offer any out-of-the-box solution. Some solutions for titles have been made available on the web.

Angular.js lets you create directives, which are more or less customized HTML tags. If you create a type E (element) directive, it can be replaced dynamically with whatever content you are interested in. I attempted to create one in the <head> section of my HTML document. Unfortunately, it is automatically moved to the <body> section of the document at runtime. Directives are not a solution.

Then, I wondered whether I could implement an angular.js service, retrieve the <head> DOM element, and inject SEO child tags according to the displayed route/page. This requires loading some kind of JSON containing the SEO configuration for each page. I managed to achieve something, but I gave up. This is far too ugly, risky, hard to test and crazy to maintain.

An alternative solution is to deal with SEO-related <head> tags on the server side when prerendering pages. It is a lot of work to work around the limitations of angular.js.
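As an illustration of the server-side approach, here is a sketch that builds the SEO-related tags for a route from a per-page configuration. The SEO_CONFIG dictionary, its keys and the render_head_tags function are all assumptions made for this example; in practice the configuration might be loaded from JSON or a database.

```python
from html import escape

# Hypothetical per-route SEO configuration.
SEO_CONFIG = {
    "/path/to/content": {
        "title": "Content & More",
        "description": "What this page is about.",
        "robots": "index,follow",
        "canonical": "http://myapp.com/path/to/content",
    },
}


def render_head_tags(route):
    """Build the SEO-related <head> tags to inject into a prerendered page."""
    page = SEO_CONFIG.get(route, {})
    tags = []
    if "title" in page:
        tags.append("<title>%s</title>" % escape(page["title"]))
    for name in ("description", "robots"):
        if name in page:
            tags.append('<meta name="%s" content="%s">' % (name, escape(page[name])))
    if "canonical" in page:
        tags.append('<link rel="canonical" href="%s">' % escape(page["canonical"]))
    return "\n".join(tags)


print(render_head_tags("/path/to/content"))
```

The values are HTML-escaped before being written into the page, so a title containing an ampersand or a quote does not break the markup.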

The Bottom Line

All in all, if you want to implement proper SEO for an angular.js application, you can't escape implementing prerendering, which is equivalent to implementing multiple pages for a single page application. You also need to twist the way you set up your SEO-related tags in your HTML document.

Why not go for a multipage application in the first place? It's nuts! If SEO is critical to your business, ditch angular.js! It will save you a lot of time, money and complexity.