Lin Hong's TECH Blog! 刀不磨要生锈,人不学习要落后 - Thinking ahead

MySQL 8.0 Study 005 Tips

2023-01-28

MySQL 8.0 Study 005 Tips

学习系列

  • MySQL 存储引擎
  • MySQL 启动选项
  • MySQL 字符集查看
  • MySQL 锁相关(脏读/不可重复度/幻读)
  • MySQL 删除重复行

MySQL 存储引擎

MySQL支持的存储引擎和对应是否支持事务等信息

MySQL5.5.5开始InnoDB为默认存储引擎

创建表时候可以指定存储引擎类型

也可以通过alter table修改表的存储引擎类型

通过show create table xxx来查看表的存储引擎

mysql [localhost:8032] {msandbox} (test) > show engines;
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                        | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| ndbcluster         | NO      | Clustered, fault-tolerant tables                               | NULL         | NULL | NULL       |
| FEDERATED          | NO      | Federated MySQL storage engine                                 | NULL         | NULL | NULL       |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables      | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, and foreign keys     | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                             | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                          | NO           | NO   | NO         |
| ndbinfo            | NO      | MySQL Cluster system information storage engine                | NULL         | NULL | NULL       |
| MRG_MYISAM         | YES     | Collection of identical MyISAM tables                          | NO           | NO   | NO         |
| BLACKHOLE          | YES     | /dev/null storage engine (anything you write to it disappears) | NO           | NO   | NO         |
| CSV                | YES     | CSV storage engine                                             | NO           | NO   | NO         |
| ARCHIVE            | YES     | Archive storage engine                                         | NO           | NO   | NO         |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
11 rows in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > create table test01(id int) engine=memory;
Query OK, 0 rows affected (0.08 sec)

mysql [localhost:8032] {msandbox} (test) > alter table test01 engine=innodb;
Query OK, 0 rows affected (0.12 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql [localhost:8032] {msandbox} (test) > show create table test01\G
*************************** 1. row ***************************
       Table: test01
Create Table: CREATE TABLE `test01` (
  `id` int DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > 

MySQL 启动选项

使用下面命令查看启动项内容

mysqld --verbose --help

如:

  --bind-address=name IP address(es) to bind to. Syntax: address[,address]...,
                      where address can be an IPv4 address, IPv6 address, host
                      name or one of the wildcard values *, ::, 0.0.0.0. In
                      case more than one address is specified in a
                      comma-separated list, wildcard values are not allowed.
                      Every address can have optional network namespace
                      separated by the delimiter / from the address value.
                      E.g., the following value
                      192.168.1.1/red,172.16.1.1/green,193.168.1.1 specifies
                      three IP addresses to listen for incoming TCP connections
                      two of that have to be placed in corresponding
                      namespaces: the address 192.168.1.1 must be placed into
                      the namespace red and the address 172.16.1.1 must be
                      placed into the namespace green. Using of network
                      namespace requires its support from underlying Operating
                      System. Attempt to specify a network namespace for a
                      platform that doesn't support it results in error during
                      socket creation.

  --default-storage-engine=name 
                      The default storage engine for new tables

  --skip-networking   Don't allow connection with TCP/IP

MySQL 字符集查看

查看MySQL字符集:

  • ci 后缀说明 字符集的名称使用小写形式
  • maxlen 表示最多多少个字节表示一个字符(utf8mb4 表示 4个字节)

不同的字符集,CHAR(N)、VARCHAR(N) 对应最长的字节也不相同。 比如 GBK 字符集,1 个字符最大存储 2 个字节,UTF8MB4 字符集 1 个字符最大存储 4 个字节。

mysql [localhost:8032] {msandbox} ((none)) > show charset;
+----------+---------------------------------+---------------------+--------+
| Charset  | Description                     | Default collation   | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian              | armscii8_general_ci |      1 |
| ascii    | US ASCII                        | ascii_general_ci    |      1 |
| big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
| binary   | Binary pseudo charset           | binary              |      1 |
| cp1250   | Windows Central European        | cp1250_general_ci   |      1 |
| cp1251   | Windows Cyrillic                | cp1251_general_ci   |      1 |
| cp1256   | Windows Arabic                  | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic                  | cp1257_general_ci   |      1 |
| cp850    | DOS West European               | cp850_general_ci    |      1 |
| cp852    | DOS Central European            | cp852_general_ci    |      1 |
| cp866    | DOS Russian                     | cp866_general_ci    |      1 |
| cp932    | SJIS for Windows Japanese       | cp932_japanese_ci   |      2 |
| dec8     | DEC West European               | dec8_swedish_ci     |      1 |
| eucjpms  | UJIS for Windows Japanese       | eucjpms_japanese_ci |      3 |
| euckr    | EUC-KR Korean                   | euckr_korean_ci     |      2 |
| gb18030  | China National Standard GB18030 | gb18030_chinese_ci  |      4 |
| gb2312   | GB2312 Simplified Chinese       | gb2312_chinese_ci   |      2 |
| gbk      | GBK Simplified Chinese          | gbk_chinese_ci      |      2 |
| geostd8  | GEOSTD8 Georgian                | geostd8_general_ci  |      1 |
| greek    | ISO 8859-7 Greek                | greek_general_ci    |      1 |
| hebrew   | ISO 8859-8 Hebrew               | hebrew_general_ci   |      1 |
| hp8      | HP West European                | hp8_english_ci      |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak      | keybcs2_general_ci  |      1 |
| koi8r    | KOI8-R Relcom Russian           | koi8r_general_ci    |      1 |
| koi8u    | KOI8-U Ukrainian                | koi8u_general_ci    |      1 |
| latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
| latin5   | ISO 8859-9 Turkish              | latin5_turkish_ci   |      1 |
| latin7   | ISO 8859-13 Baltic              | latin7_general_ci   |      1 |
| macce    | Mac Central European            | macce_general_ci    |      1 |
| macroman | Mac West European               | macroman_general_ci |      1 |
| sjis     | Shift-JIS Japanese              | sjis_japanese_ci    |      2 |
| swe7     | 7bit Swedish                    | swe7_swedish_ci     |      1 |
| tis620   | TIS620 Thai                     | tis620_thai_ci      |      1 |
| ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
| ujis     | EUC-JP Japanese                 | ujis_japanese_ci    |      3 |
| utf16    | UTF-16 Unicode                  | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode                | utf16le_general_ci  |      4 |
| utf32    | UTF-32 Unicode                  | utf32_general_ci    |      4 |
| utf8mb3  | UTF-8 Unicode                   | utf8mb3_general_ci  |      3 |
| utf8mb4  | UTF-8 Unicode                   | utf8mb4_0900_ai_ci  |      4 |
+----------+---------------------------------+---------------------+--------+
41 rows in set (0.02 sec)

mysql [localhost:8032] {msandbox} ((none)) > 

字符比较规则查看 排序规则(Collation)是比较和排序字符串的一种规则,每个字符集都会有默认的排序规则

  • ai 不区分重音
  • as 区分重音
  • ci 不区分大小写
  • cs 区分大小写
  • bin 二进制比较
mysql [localhost:8032] {msandbox} ((none)) > show collation;
mysql [localhost:8032] {msandbox} ((none)) > show collation like 'utf8mb4\_%';
+----------------------------+---------+-----+---------+----------+---------+---------------+
| Collation                  | Charset | Id  | Default | Compiled | Sortlen | Pad_attribute |
+----------------------------+---------+-----+---------+----------+---------+---------------+
| utf8mb4_0900_ai_ci         | utf8mb4 | 255 | Yes     | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_ci         | utf8mb4 | 305 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_cs         | utf8mb4 | 278 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_bin           | utf8mb4 | 309 |         | Yes      |       1 | NO PAD        |

服务器级别字符集和比较规则

mysql [localhost:8032] {msandbox} ((none)) > show variables like 'character%';
+--------------------------+-----------------------------------+
| Variable_name            | Value                             |
+--------------------------+-----------------------------------+
| character_set_client     | utf8mb4                           |
| character_set_connection | utf8mb4                           |
| character_set_database   | utf8mb4                           |
| character_set_filesystem | binary                            |
| character_set_results    | utf8mb4                           |
| character_set_server     | utf8mb4                           |
| character_set_system     | utf8mb3                           |
| character_sets_dir       | /opt/mysql/8.0.32/share/charsets/ |
+--------------------------+-----------------------------------+
8 rows in set (0.00 sec)

mysql [localhost:8032] {msandbox} ((none)) > show variables like 'collation%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_0900_ai_ci |
| collation_database   | utf8mb4_0900_ai_ci |
| collation_server     | utf8mb4_0900_ai_ci |
+----------------------+--------------------+
3 rows in set (0.00 sec)

mysql [localhost:8032] {msandbox} ((none)) > 

而且随着移动互联网的飞速发展,推荐把 MySQL 的默认字符集设置为 UTF8MB4,否则,某些 emoji 表情字符无法在 UTF8 字符集下存储,比如 emoji 笑脸表情,对应的字符编码为 0xF09F988E

mysql [localhost:8032] {msandbox} ((none)) > select cast(0xF09F988E as char charset utf8) as emoji;
+-------+
| emoji |
+-------+
| NULL  |
+-------+
1 row in set, 2 warnings (0.01 sec)

mysql [localhost:8032] {msandbox} ((none)) > select cast(0xF09F988E as char charset utf8mb4) as emoji;
+-------+
| emoji |
+-------+
| 😎      |
+-------+
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} ((none)) > 

后期需要进行字符集转换

ALTER TABLE test CHARSET utf8mb4;

对应字段依旧没有转化

mysql [localhost:8032] {msandbox} (test) > create table t1(col1 varchar(100) character set utf8,primary key(col1)) engine=innodb default charset=utf8;
Query OK, 0 rows affected, 2 warnings (0.07 sec)

mysql [localhost:8032] {msandbox} (test) > show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `col1` varchar(100) CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci NOT NULL,
  PRIMARY KEY (`col1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > 
mysql [localhost:8032] {msandbox} (test) > show full columns from t1;
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type         | Collation          | Null | Key | Default | Extra | Privileges                      | Comment |
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| col1  | varchar(100) | utf8mb3_general_ci | NO   | PRI | NULL    |       | select,insert,update,references |         |
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > 

-- alter table 转化

mysql [localhost:8032] {msandbox} (test) > alter table t1 charset utf8mb4;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql [localhost:8032] {msandbox} (test) > show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `col1` varchar(100) CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci NOT NULL,
  PRIMARY KEY (`col1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > show full columns from t1;
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type         | Collation          | Null | Key | Default | Extra | Privileges                      | Comment |
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| col1  | varchar(100) | utf8mb3_general_ci | NO   | PRI | NULL    |       | select,insert,update,references |         |
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > 

col1字符集没有改变,需要下面方法改变

mysql [localhost:8032] {msandbox} (test) > alter table t1 convert to charset utf8mb4;
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql [localhost:8032] {msandbox} (test) > show full columns from t1;
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type         | Collation          | Null | Key | Default | Extra | Privileges                      | Comment |
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| col1  | varchar(100) | utf8mb4_0900_ai_ci | NO   | PRI | NULL    |       | select,insert,update,references |         |
+-------+--------------+--------------------+------+-----+---------+-------+---------------------------------+---------+
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `col1` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
  PRIMARY KEY (`col1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql [localhost:8032] {msandbox} (test) > 

MySQL 乱码

查看数据库的编码
show variables like ‘character_set%’;

character_set_client:客户端请求数据的字符集
character_set_connection:客户机与服务器连接的字符集
character_set_database:默认数据库的字符集;如果没有默认数据库,就会使用 character_set_server指定的字符集(建议不要随意更改)
character_set_filesystem:把 character_set_client转换character_set_filesystem (默认为binary, 不做任何转换)
character_set_results:返回给客户端的字符集
character_set_server:数据库服务器的默认字符集
character_set_system:系统字符集,默认utf8。(用于数据库的表、列和存储在目录表中函数的名字)

要处理中文,则可以将character-set-server和character-set-client均设置为GB2312,如果要同时处理多国语言,则设置为UTF8。

参考:

彻底解决mysql中文乱码

MySQL-不能忽略的 COLLATION

MySQL 锁相关(脏读/不可重复度/幻读)

脏读

当前事务读取到另外一个事务未提交的一条记录
如果另外一个事务在写的时候给记录加上锁,当前事务就无法读取该记录时再获取到锁,就不会出现脏读

不可重复读

当前事务先读取一条记录,另外一个事务对该记录就行修改。当前事务再次读取到不同的记录。
如果当前事务读取该记录时候加锁,另外一个事务就无法修改改记录。

幻读

当前事务读取了某些条件的记录,之后别的事务又插入符合相同条件的新记录,导致当前事务再次读取相同条件记录时候,可以读到别的事务插入的新记录

读读情况不会引起什么问题,对于写写,读写,写读情况,可能会引起一些问题,这时候需要MVCC(多版本控制)或者加锁方式来解决。

MySQL 删除重复行

How To Delete Duplicate Rows in MySQL

如何在MySQL表中删除重复行

DROP TABLE IF EXISTS contacts;

CREATE TABLE contacts (
    id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL, 
    email VARCHAR(255) NOT NULL
);

INSERT INTO contacts (first_name,last_name,email) 
VALUES ('Carine ','Schmitt','carine.schmitt@verizon.net'),
       ('Jean','King','jean.king@me.com'),
       ('Peter','Ferguson','peter.ferguson@google.com'),
       ('Janine ','Labrune','janine.labrune@aol.com'),
       ('Jonas ','Bergulfsen','jonas.bergulfsen@mac.com'),
       ('Janine ','Labrune','janine.labrune@aol.com'),
       ('Susan','Nelson','susan.nelson@comcast.net'),
       ('Zbyszek ','Piestrzeniewicz','zbyszek.piestrzeniewicz@att.net'),
       ('Roland','Keitel','roland.keitel@yahoo.com'),
       ('Julie','Murphy','julie.murphy@yahoo.com'),
       ('Kwai','Lee','kwai.lee@google.com'),
       ('Jean','King','jean.king@me.com'),
       ('Susan','Nelson','susan.nelson@comcast.net'),
       ('Roland','Keitel','roland.keitel@yahoo.com');

select * from contacts order by email;

SELECT 
    email, COUNT(email)
FROM
    contacts
GROUP BY 
    email
;

SELECT 
    email, COUNT(email)
FROM
    contacts
GROUP BY 
    email
HAVING 
    COUNT(email) > 1;

-- A) Delete duplicate rows using DELETE JOIN statement

DELETE t1 FROM contacts t1
INNER JOIN contacts t2 
WHERE 
    t1.id < t2.id AND 
    t1.email = t2.email;

-- In case you want to delete duplicate rows and keep the lowest id, you can use the following statement:

DELETE c1 FROM contacts c1
INNER JOIN contacts c2 
WHERE
    c1.id > c2.id AND 
    c1.email = c2.email;

-- B) Delete duplicate rows using an intermediate table
-- The following shows the steps for removing duplicate rows using an intermediate table:

-- 1. Create a new table with the structure the same as the original table that you want to delete duplicate rows.
-- 2. Insert distinct rows from the original table to the immediate table.
-- 3. Drop the original table and rename the immediate table to the original table.

-- step 1
CREATE TABLE contacts_temp 
LIKE contacts;

-- step 2
INSERT INTO contacts_temp
SELECT * 
FROM contacts 
GROUP BY email;


-- step 3
DROP TABLE contacts;

ALTER TABLE contacts_temp 
RENAME TO contacts;

-- C) Delete duplicate rows using the ROW_NUMBER() function

-- The following statement uses the ROW_NUMBER() function to assign a sequential integer number to each row. If the email is duplicate, the row number will be greater than one.

SELECT 
	id, 
    email, 
    ROW_NUMBER() OVER ( 
		PARTITION BY email 
        ORDER BY email
	) AS row_num 
FROM contacts;

SELECT 
	id 
FROM (
	SELECT 
		id,
		ROW_NUMBER() OVER (
			PARTITION BY email
			ORDER BY email) AS row_num
	FROM 
		contacts
) t
WHERE 
	row_num > 1;

DELETE FROM contacts 
WHERE 
	id IN (
	SELECT 
		id 
	FROM (
		SELECT 
			id,
			ROW_NUMBER() OVER (
				PARTITION BY email
				ORDER BY email) AS row_num
		FROM 
			contacts
		
	) t
    WHERE row_num > 1
);

Referece

参考:

MySQL-不能忽略的 COLLATION

Have a good work&life! 2023/01 via LinHong


Similar Posts

上一篇 OCI SSH Key Tips

下一篇 Exadata CoD Tips

Comments