RegExp正则表达式

2022.03.02 Wed

1-规则

特殊字符		表达式
\w	匹配字母数字下划线，相当于[A-Za-z0-9_]	[xyz]	字符集，匹配 x 或 y 或 z
\r	匹配回车符	[^xyz]	否定字符集，匹配非 x 或 y 或 z
\n	匹配换行符	x{n,m}	匹配 x 出现 n-m 次的条件
.	匹配除换行符外的所有字符	(x)	匹配 x 并且记住匹配项(捕获括号)
		(?:x)	匹配 x 但是不记住匹配项(非捕获括号)
		x(?=y)	匹配后面是 y 的 x
		x(?!y)	匹配后面不是 y 的 x
		(?<=y)x	匹配前面是 y 的 x
		(?<!y)x	匹配前面不是 y 的 x

2-JavaScript 方法

1
2
3

const regExp=/\d+(?=hello)/g
const str1='123hello456'
const str2='123world456'

RegExp 对象 exec test

//test 判断字符串是否匹配正则 返回boolean
//exec 查找字符串符合正则的部分，存在返回包含条件的数组，否则返回null
regExp.test(str1); // true
regExp.test(str2); // false
regExp.exec(str1); // [ '123', index: 0, input: '123hello456', groups: undefined ]
regExp.exec(str2); // null

当正则表达式是全局时，exec 与 test 都会记录当前检索位置，下次从当前位置继续执行

String 对象 match matchAll replace search split

//match 方法与exec作用相同
//search 查找字符串符合正则的部分，存在返回匹配的索引位置，否则返回-1
str1.match(regExp); // [ '123' ]
str1.replace(regExp,'haha'); // 'hahahello456'
str1.search(/\d+$/); // 8
str2.search(regExp); // -1
str1.split(/\d+/); // [ '', 'hello', '' ]

exec 和 match 方法区别

const reg1=/\d+/
const reg2=/\d+/g
const str='123hello456world999'
reg1.exec(str); // [ '123', index: 0, input: '123hello456world999', groups: undefined ]
reg2.exec(str); // [ '123', index: 0, input: '123hello456world999', groups: undefined ]
reg2.exec(str); // [ '456', index: 8, input: '123hello456world999', groups: undefined ]
str.match(reg1); // [ '123', index: 0, input: '123hello456world999', groups: undefined ]
str.match(reg2); // [ '123', '456', '999' ]

当正则表达式是全局时，exec 方法返回匹配的第一个匹配结果数组，match 方法返回所有匹配的结果组成的数组，但是 exec 会记录当前检索位置，下次从当前位置继续执行。
当正则表达式非全局时，exec 和 match 都返回匹配的第一个结果数组

3-捕获与非捕获

const str='123hello456world999'

str.match(/(\d+)/); //  ['123', '123', index: 0, input: '123hello456world999', groups: undefined]
str.match(/(?:\d+)/); // ['123', index: 0, input: '123hello456world999', groups: undefined]

如上，捕获组匹配成功后返回的数组包括匹配到的字符串和所有被记住的子字符串，非捕获组匹配成功返回的数组只包括匹配到的字符串

捕获组命名
捕获组其实是分为编号捕获组 Numbered Capturing Groups 和命名捕获组 Named Capturing Groups 的，我们上面说的捕获组，默认指的是编号捕获组。命名捕获组，也是捕获组，只是语法不一样。命名捕获组的语法如下：(?group) 或 (?'name'group)，其中 name 表示捕获组的名称，group 表示捕获组里面的正则。

//编号捕获组
 'abcabcabcabcabc'.match(/(abc)\1/g); // ['abcabc', 'abcabc']

//命名捕获组
'2022-03-07'.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
// ['2022-03-07', '2022', '03', '07', index: 0, input: '2022-03-07', groups: {year: '2022', month: '03', day: '07'}]

通过捕获组拆分匹配到的数组

/* 捕获组会返回匹配到的字符串和所有记住的子字符串 */
const date='2020-05-23'

date.match(/\d{4}-\d{2}-\d{2}/); // ['2020-05-23', index: 0, input: '2020-05-23', groups: undefined]
date.match(/(\d{4})-(\d{2})-(\d{2})/); //  ['2020-05-23', '2020', '05', '23', index: 0, input: '2020-05-23', groups: undefined]

对捕获组反向引用

const str='hellohellohello'

str.match(/(hello)/); //  ['hello', 'hello', index: 0, input: 'hellohellohello', groups: undefined]
str.match(/(hello)\1/); // ['hellohello', 'hello', index: 0, input: 'hellohellohello', groups: undefined]

非捕获组(?😃

const str='ha-ha,haa-haa'

// \1 引用的第一个组实际上是指向第二个组，因为第一个是未被捕获的分组。
str.match(/(?:ha)-ha,(haa)-\1/g); // ['ha-ha,haa-haa']

4-正则表达式使用变量

正则表达式使用变量时，任何写在字面量模版里的变量都会解析成字符串，可以通过拼接或者构造函数。

1
2
3

//两种写法是等价的
var re = new RegExp('\\w+','g');
var re = /\w+/g;

1
2
3

const string='hello';
const regExp=new RegExp(`\\d+(?=${string})`,'g');
str1.match(regExp); // [ '123' ]

* 和 + 限定符都是贪婪的，因为它们会尽可能多的匹配文字，只有在它们的后面加上一个 ? 就可以实现非贪婪或最小匹配。

5-示例


//将字符根据空格或者-分成数组
const test1='hello world'
const test2='hello-world'
test1.split(/[\s\-]+/) //['hello','world']
test2.split(/[\s\-]+/) //['hello','world']

//替换掉尖括号的内容
const test3='<abcdefg>>'
test3.replace(/(?<=<).*(?=>)/g, 'hello') //'<hello>'

//删除数字前面的0
const test4=00003232323
test.replace(/^0+/g, '')

//1-9开头的数字
const numReg = /^[1-9]+\d/g;

//以第一个非连字符隔开
'hello-world-this-is-test'.split(/(?<=^[^-]+)-/)
// [ 'hello', 'world-this-is-test' ]

//限制250单词数量
const reg=/^(?:\b\w+\b[\s\r\n]*){1,10}$/g

参考链接：
正则表达式练习

JavaScript