UTF-8 is the default character encoding for HTML5, and declaring it lets the browser display the text of an HTML page correctly. Web developers are encouraged to use UTF-8 because it can represent every Unicode character and symbol, encodes ASCII characters in a single byte (other characters take two to four bytes), and works in all browsers. UTF-8 (Unicode Transformation Format, 8-bit) is a method of converting characters into machine-readable bytes. The charset attribute declares the character encoding of the HTML document.
Syntax of UTF-8 in HTML
The UTF-8 character encoding is specified in the meta tag as:
<meta charset="UTF-8">
Here the meta element provides data about the HTML document that is machine-readable rather than displayed. Other meta elements specify keywords, the last-modified date, and so on. This meta tag carries the charset attribute, which tells the web browser which encoding to use when it reads the page.
Encoding is how characters, each identified by a number (its code point), are converted to the binary form that a machine understands. In UTF-8 each character is made up of one or more bytes.
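To illustrate why the declared encoding matters, here is a minimal JavaScript sketch of what goes wrong when UTF-8 bytes are read with the wrong encoding. The helper name misreadAsLatin1 and the sample word are assumptions made for illustration. The sketch treats each byte as one ISO-8859-1 character, which only approximates how a real browser falls back when no charset is declared.

```javascript
// Simulate reading UTF-8 bytes as ISO-8859-1 (Latin-1), where every byte
// is interpreted as the character with the same code point.
// misreadAsLatin1 is a hypothetical helper, not part of any library.
function misreadAsLatin1(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

const original = "caf\u00e9"; // "café": the é is stored as two bytes, C3 A9
const garbled = misreadAsLatin1(original);
console.log(garbled); // prints "cafÃ©"

// Decoding the same bytes as UTF-8 recovers the original text.
const roundTrip = new TextDecoder("utf-8").decode(new TextEncoder().encode(original));
console.log(roundTrip === original); // prints true
```

The two-byte sequence for é turns into two unrelated characters when misread, which is the kind of garbled text a correct charset declaration prevents.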
As an example, let’s take the text Hi, EDUCBA!
Its UTF-8 encoding, one byte per character including the space, is given below:
01001000 01101001 00101100 00100000 01000101 01000100 01010101 01000011 01000010 01000001 00100001
This is the machine-readable binary form of the text.
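The conversion can be reproduced with a short JavaScript sketch. It assumes an environment that provides the standard TextEncoder (modern browsers and Node.js); the helper name toBinaryUtf8 is made up for this example.

```javascript
// Encode a string as UTF-8 and render each byte as eight binary digits.
// toBinaryUtf8 is a hypothetical helper name chosen for this example.
function toBinaryUtf8(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, (b) => b.toString(2).padStart(8, "0")).join(" ");
}

const encoded = toBinaryUtf8("Hi, EDUCBA!");
console.log(encoded);
// The text is ASCII-only, so every character (including the space)
// becomes exactly one byte: 11 groups of 8 bits.
```

Running it in a browser console or with Node.js prints one 8-bit group per byte, starting with 01001000 for H.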
Next, we shall see why Unicode support matters when the content includes foreign languages.
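Non-ASCII characters are where the "one or more bytes" rule becomes visible. The following JavaScript sketch counts the UTF-8 bytes of a few sample characters; the helper name utf8ByteLength and the choice of samples are assumptions for illustration only.

```javascript
// Count how many bytes a string occupies when encoded as UTF-8.
// utf8ByteLength is a hypothetical helper name chosen for this example.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Sample characters: Latin, Arabic, Chinese, Devanagari, and an emoji.
const samples = ["A", "م", "你", "अ", "😀"];
const byteLengths = samples.map((ch) => utf8ByteLength(ch));

samples.forEach((ch, i) => {
  console.log(ch + " -> " + byteLengths[i] + " byte(s)");
});
// byteLengths is [1, 2, 3, 3, 4]
```

ASCII letters stay at one byte, which is why English-only pages look identical in many encodings, while other scripts depend on the browser knowing the page is UTF-8.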
Given below are the examples of UTF-8 in HTML:
Simple example with the paragraph content.
Code:
new.html
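<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Page Title</title>
</head>
<body>
  <p>!مرحبا بالعالم</p>
  <p>你叫什么名字?</p>
  <p>This is Chinese Language.</p>
  <p>This is the code demonstrating encoding Process</p>
</body>
</html>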
Using form inputs and buttons.
Code:
lang.html
<!DOCTYPE HTML >
<html>
<head>
  <title>HTML sample -buttons</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
  <form action="addressing" method="post">
    <fieldset>
      <legend>Selection list</legend>
      Checkbox: <input type="checkbox" name="King" value="one"><br>
      RadioButton1: <input type="radio" name="Queen" value="two"><br>
      RadioButton2: <input type="radio" name="Jack" value="three" checked="checked"><br>
    </fieldset>
    <fieldset>
      <legend>Give Input</legend>
      Login Id: <input type="text" name="Login name"><br>
      Password: <input type="password" name="Strong Password"><br>
    </fieldset>
    <fieldset>
      <legend>Designation</legend>
      <p><input type="checkbox" name=" Software Engineer"> Software Engineer</p>
      <p><input type="checkbox" name="Data Analyst"> Data Analyst</p>
      <p><input type="checkbox" name="Web Developer"> Web Developer</p>
      <p><input type="checkbox" name=" Senior Analyst"> Senior Analyst</p>
    </fieldset>
    <p><input type="submit" value="press"> <input type="reset"></p>
  </form>
</body>
</html>
Code using foreign-language content.
Code:
mett.html
<!DOCTYPE html>
<html>
<head>
  <!-- charset needs its own meta element; it cannot be combined with name/content -->
  <meta charset="UTF-8">
  <title>HTML UTF-8 Charset</title>
  <meta name="keywords" content="Meta Tags, Metadata">
</head>
<body style="text-align:left">
  <h1>Hi Instructor!</h1>
  <h2>This is my formal e-mail for the joining.</h2>
  <h3>Hola, me llamo Juan</h3>
  <b>Mucho gusto</b>
</body>
</html>
Using JavaScript.
Code:
name.html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>UTF-8 Charset</title>
  <style>
    span { color: blue; }
    span.name { color: red; font-weight: bolder; }
  </style>
  <script src="https://code.jquery.com/jquery-3.5.0.js"></script>
</head>
<body>
  <div>
    <span>Thomas,</span>
    <span>John Betson,</span>
    <span>Valli Tromson</span>
  </div>
  <div>
    <span>आभरणा,</span>
    <span>आचुथान,</span>
    <span>अभिनंध</span>
  </div>
  <script>
    // Underline the first span in each div and mark it with the "name" class on hover.
    $( "div span:first-child" )
      .css( "text-decoration", "underline" )
      .hover(function() {
        $( this ).addClass( "name" );
      });
  </script>
</body>
</html>
So that’s all about UTF-8 encoding in HTML. We have briefly gone through Unicode and encoding in HTML, with examples in HTML and JavaScript. No single legacy character set covers every language, which is why character encoding schemes such as UTF-8 are needed in HTML and in other programming languages. It is therefore best to use UTF-8 everywhere, so that no conversion between encodings is needed.