Some search related considerations for coding websites

(@lynkfs)
Topic starter

It is perfectly feasible to code a website using QTX. Some considerations do apply though.

For web apps with a known audience who already know where to find them, the process is simple: create the web app, upload it to a server, and Bob's your uncle.
If the app's content or functions need to be found, though, then search considerations come into play.

Search these days is everywhere: most platforms in the blogosphere have their own search capabilities (GitHub, Stack Overflow, Reddit, Instagram etc.). However, most of these are indexed by Google as well, and many users use Google as their primary entry point anyway, so for the time being indexing by Google is still of paramount importance.

Google indexing used to be based on HTML only. Its bots crawled pages and extracted indexing info by looking at page structure. This was a problem for sites where content and structure were hidden in JavaScript files, as is the case with QTX projects.
Fortunately, a few years ago Google changed its process: it now renders all pages in the background (in a headless browser) and uses the rendered result to get at content and structure.

One consideration is that these crawlers generally do not click on buttons or menu items on rendered pages to see what is behind those links. This means that for projects with multiple forms, generally only the first form will be seen, and in the case of, for instance, login-enabled apps nothing will ever be seen.
(The exception is links of the <a href> type: the URL in the href may get extracted and added to the indexing queue.)
There is a workaround by mapping form changes to the History API (see the sketch below), but although that works it complicates matters more than necessary.
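Purely as an illustration, here is a minimal sketch of that workaround in plain TypeScript. The showForm helper and the form names are hypothetical, standing in for whatever mechanism the framework uses to switch forms; the point is only that each form change gets its own URL via pushState, and popstate restores the matching form.

```typescript
// Minimal sketch: map SPA form changes to the History API.
// `showForm` is a hypothetical helper that swaps the visible form.

function showForm(name: string): void {
  // ... render the requested form into the page ...
  document.title = `My app - ${name}`;
}

function navigateTo(name: string): void {
  showForm(name);
  // Give the form its own URL so it can, in principle, be linked to.
  history.pushState({ form: name }, "", `/${name}`);
}

// Restore the right form when the user navigates back/forward.
window.addEventListener("popstate", (event: PopStateEvent) => {
  const state = event.state as { form?: string } | null;
  showForm(state?.form ?? "home");
});
```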

I find that for websites it is generally best to have a separate page for every form (a separate project for every form), and then link the pages using <a> elements, or in code by setting the correct URL in window.location.href (see the snippet below).
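A trivial sketch of the in-code variant; the target URL is just an example of a one-page-per-form layout:

```typescript
// Navigate to a separate, crawlable page instead of swapping forms in place.
// Crawlers can pick up plain <a href="/contact.html"> links from the rendered
// HTML; this is the programmatic equivalent.
function gotoContactPage(): void {
  window.location.href = "/contact.html"; // example URL, one page per form
}
```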

The other consideration is that indexing is a resource-intensive exercise for Google. Rendering pages in a headless browser is much more intensive than looking at HTML only, and given the sheer number of new websites coming online, even Google's resources are not limitless. Apparently there is therefore a time limit on JavaScript execution during crawling, and rendering will stop if it takes too long, which may lead to partial rendering and thus incomplete indexing.
How long is too long is unknown, but JavaScript file size and things like loading external libraries and resources can become limiting factors.
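One way to keep the initial render light, offered only as a hedged example (the chart library and its URL are made up), is to defer non-essential scripts until after the main content is on screen:

```typescript
// Render the primary (indexable) content first, then pull in heavy,
// non-essential extras so they don't eat into any rendering budget.
function loadScript(src: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const s = document.createElement("script");
    s.src = src;
    s.onload = () => resolve();
    s.onerror = () => reject(new Error(`failed to load ${src}`));
    document.head.appendChild(s);
  });
}

window.addEventListener("load", () => {
  // Hypothetical heavy library, only needed for a secondary widget.
  loadScript("/libs/chart-widget.js").catch(() => {
    // The page's main content still renders (and indexes) without it.
  });
});
```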

Another strategy Google employs is that its crawlers try to get to the most important part of a page while ignoring the rest. So left- and right-hand sides of pages, ads, footers, carousels, images and the like may simply be omitted from indexing.

Note also that crawlers use HTTP exclusively, so pages which depend on other protocols (WebSocket, WebRTC) will not be rendered correctly.
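A hedged sketch of one way to cope with that (the endpoint is an example): render the static, indexable content unconditionally, and only layer the WebSocket-driven parts on top when the connection actually works, so a crawler that never opens the socket still sees a sensible page.

```typescript
// Render the core page first; treat live (WebSocket) data as an enhancement.
function renderStaticContent(): void {
  // ... build the indexable HTML content ...
}

function enableLiveUpdates(): void {
  const ws = new WebSocket("wss://example.com/updates"); // example endpoint
  ws.onmessage = (msg: MessageEvent) => {
    console.log("live update:", msg.data); // ... patch the rendered page ...
  };
  ws.onerror = () => {
    // No live updates (e.g. a crawler); the static content is still there.
  };
}

renderStaticContent();
enableLiveUpdates();
```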

I suppose the recommendation is to keep web pages as simple as possible and avoid elements not contributing to the primary message.

This post is not about SEO as a whole (a vast subject), but just about some of the effects of encoding single-page apps in JavaScript files and how that may affect indexing.

 
Posted : 13/03/2023 6:36 am