What is the utf8mb4_0900_ai_ci collation?
What is the meaning of the MySQL collation utf8mb4_0900_ai_ci?
utf8mb4 means that each character is stored using at most 4 bytes in the UTF-8 encoding scheme.
0900 refers to the Unicode Collation Algorithm version, here 9.0.0. (The Unicode Collation Algorithm is the method for comparing two Unicode strings in a way that conforms to the requirements of the Unicode Standard.)
ai refers to accent insensitivity. That is, there is no difference between e, è, é, ê and ë when sorting.
ci refers to case insensitivity. That is, there is no difference between p and P when sorting.
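The parts of the name can be illustrated outside the database with a small Python sketch. It shows that UTF-8 needs between 1 and 4 bytes per character, and it approximates an accent- and case-insensitive comparison. The helper names (utf8_byte_length, fold_ai_ci, equal_ai_ci) are hypothetical, and the folding is only a rough stand-in: MySQL applies the Unicode Collation Algorithm, which this sketch does not implement.

```python
import unicodedata


def utf8_byte_length(text: str) -> int:
    """Number of bytes UTF-8 needs to encode the given text."""
    return len(text.encode("utf-8"))


def fold_ai_ci(text: str) -> str:
    """Rough comparison key that ignores accents and case.

    Assumption: this is NOT the Unicode Collation Algorithm used by MySQL.
    It only decomposes characters, drops combining accent marks
    (category Mn), and case-folds what is left.
    """
    decomposed = unicodedata.normalize("NFD", text)
    without_accents = "".join(
        c for c in decomposed if unicodedata.category(c) != "Mn"
    )
    return without_accents.casefold()


def equal_ai_ci(a: str, b: str) -> bool:
    """True when two strings match once accents and case are ignored."""
    return fold_ai_ci(a) == fold_ai_ci(b)


if __name__ == "__main__":
    # e, e-acute, euro sign, grinning-face emoji: 1, 2, 3 and 4 bytes.
    for ch in ["e", "\u00e9", "\u20ac", "\U0001F600"]:
        print(repr(ch), utf8_byte_length(ch), "byte(s)")

    # Code points above U+FFFF lie outside the Basic Multilingual Plane.
    print(ord("\U0001F600") > 0xFFFF)

    # e compared with e-grave, e-acute, e-circumflex, e-diaeresis.
    print(all(equal_ai_ci("e", v) for v in "\u00e8\u00e9\u00ea\u00eb"))
    # p compared with P.
    print(equal_ai_ci("p", "P"))
```

The emoji is the 4-byte case, which is why a character set limited to 3 bytes per character cannot store it. Inside MySQL the comparison is performed by the collation itself, so none of this folding is needed in application code.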
utf8mb4 has become the default character set, with utf8mb4_0900_ai_ci as its default collation, in MySQL 8.0.1 and later. Previously, utf8mb4_general_ci was the default collation for utf8mb4. Because utf8mb4 is now the default character set, new tables can store characters outside the Basic Multilingual Plane, such as emojis, by default. If accent sensitivity and case sensitivity are required, you may use
If you are interested in the details, the MySQL developers have explained the motivation behind the switch to utf8mb4_0900_ai_ci as the default collation in this article: New collations in MySQL 8.0.0.