Specify a Character Set Early
Character set defined
A character set (charset) defines the characters that a browser can use to display a web page. Each character corresponds to a specific number, and the charset declared by the website determines how the browser interprets the bytes it receives. Many different character sets exist for various purposes, and some simply expand upon pre-existing character sets.
Currently, the most popular charset and the default used in HTML5 is UTF-8.
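To make the character-to-number mapping concrete, here is a minimal Python sketch. It is illustrative only: the sample character and the variable names are our own choices, not part of any browser's behavior. It shows that the same character is stored as different bytes in different charsets, and that reading bytes with the wrong charset produces garbled text.

```python
# Illustrative only: one character, two charsets, two different byte sequences.
text = "\u00e9"  # the character "é"

code_point = ord(text)                  # the number assigned to this character
utf8_bytes = text.encode("utf-8")       # how UTF-8 stores it (two bytes)
latin1_bytes = text.encode("latin-1")   # how ISO-8859-1 stores it (one byte)

# Decoding UTF-8 bytes with the wrong charset yields the wrong characters.
misread = utf8_bytes.decode("latin-1")

print(code_point, utf8_bytes, latin1_bytes, misread)
```

This is why a browser needs to know the charset before it can reliably turn the bytes of a page into text.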
Why is it important to specify a character set early?
Specifying a charset early is important because it tells the browser how to decode the web page. The browser can then begin parsing and executing assets immediately, which reduces latency. Many browsers look for the charset parameter in the first 1024 bytes of a page before executing any JavaScript or rendering the page.
Therefore, if you specify a character set within the HTML document, it is important to do so at the very beginning of the <head> tag. Without a specified charset, the browser is left to figure out which charset to use on its own, which takes more time and is less efficient. Additionally, if the browser does not find a specified character set early on, it will use the default encoding. If the character set is then specified further down the page, the browser may need to reparse information or even re-request resources.
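As a rough way to check this on your own pages, the following Python sketch looks for a charset declaration within the first 1024 bytes of an HTML document. The function name, the regular expression, and the sample documents are hypothetical and simplified; real browsers use a more involved detection algorithm, so treat this as an approximation rather than a faithful reimplementation.

```python
import re

# Simplified pattern: finds charset=... inside a <meta> tag. It covers both
# <meta charset="UTF-8"> and the http-equiv="content-type" form.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_\-:.]+)",
    re.IGNORECASE,
)

def find_early_charset(html: bytes, limit: int = 1024):
    """Return the charset declared within the first `limit` bytes, or None."""
    match = META_CHARSET.search(html[:limit])
    return match.group(1).decode("ascii").lower() if match else None

# A declaration right after <head> is found.
early_page = b'<!DOCTYPE html><html><head><meta charset="UTF-8"><title>x</title></head>'

# A declaration pushed past the first 1024 bytes is missed.
late_page = b"<html><head>" + b"<!-- padding -->" * 100 + b'<meta charset="UTF-8">'

print(find_early_charset(early_page))
print(find_early_charset(late_page))
```

The first call reports the declared charset, while the second returns None because the declaration sits too far down the document.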
Thankfully, if your website does not currently specify a character set, there are two easy ways to fix that.
How to specify a character set early?
There are two ways to specify a character set early: one on the server side and one on the client side. It is recommended to specify the character set on the server side whenever possible. The following sections show how to do both.
Server side
The server-side method is the recommended way to specify a charset early. This method helps decrease load time, which is explained in further detail in our Avoid a Character Set in the Meta Tag article. It also removes the need to define a charset in each HTML file. The configuration varies depending on your server. The following are examples for both Nginx and Apache.
Nginx
http {
    include /etc/nginx/mime.types;
    charset UTF-8;
    ...
}
Apache
AddType 'text/html; charset=UTF-8' html
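Once the server is configured, the charset should appear as a parameter of the Content-Type response header. The Python sketch below shows one way to read that parameter from a header value, using the standard library's email.message.Message class for parsing. The helper name is hypothetical, and the header strings are sample values; in practice you would take the value from your own page's response headers.

```python
from email.message import Message

def charset_from_content_type(header_value: str):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset()

# Sample header values: one with the charset parameter, one without.
print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

If the second case matches what your server sends, the charset is not being set on the server side and the browser has to rely on the HTML document or its own detection.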
Client side
Alternatively, if you do not have access to your server, you can specify a charset early in your HTML file. Include this as close to the opening <head> tag as possible so that the browser finds the declaration within the first bytes it scans.
<meta http-equiv="content-type" content="text/html;charset=UTF-8">
As mentioned in the previous section, this method is not optimal. If you test your site with a speed test tool, you may receive a recommendation to avoid a character set in the meta tag. However, if you do not have access to the origin server, using this method is still better than not specifying the charset at all.