# Lesson 8: Regular Expressions

> Common patterns

```
([\s\S]*?)                          any number of characters, newlines included (lazy)
(\s+)                               one or more whitespace characters
([\s,]+)                            one or more whitespace characters or commas
([,]+)                              one or more commas
/php/i                              case-insensitive match
^ $                                 start and end of the subject string
.                                   any single character except a newline
?                                   0 or 1 time, same as {0,1}
*                                   0 or more times, same as {0,}
+                                   1 or more times, same as {1,}
-                                   range inside a character class
[]                                  start and end of a character class
\d                                  any decimal digit, same as [0-9]
\s                                  any single whitespace character
\S                                  any single non-whitespace character
\w                                  any word character, same as [a-zA-Z0-9_]
^(?:中国|美国)(.*)                   strings that start with 中国 or 美国
(\d+\.\d+\.\d+\.\d+)                IPv4-style address (octet ranges are not checked)
([a-zA-Z][a-zA-Z0-9_]*)             name that starts with a letter
(\d+-\d+)                           phone number written as area code, hyphen, number
[1-9][0-9]{4,}                      QQ number (at least 5 digits, no leading zero)
^[\w\.\-]+@\w+([\.\-]\w+)*\.\w+$    email address
href="(.*?)"                        hyperlink target
/^\d{1,6}$/                         1 to 6 digits (0 to 999999)
/\d{4}年\d{1,2}月\d{1,2}/            year, month, and day
```

> preg\_match matches once; it returns 1 on a match, 0 on no match, and false on error

```
preg_match('/<center>([\s\S]*?)<\/center>/', $str, $rs);
```

> preg\_match\_all finds every match; it returns the number of matches, or false on error

```
preg_match_all('/<center>([\s\S]*?)<\/center>/', $str, $rs);
```

> preg\_replace replaces every match with $re

```
$rs = preg_replace('/<center>([\s\S]*?)<\/center>/', $re, $str);
```

> preg\_split splits a string into an array

```
$arr = preg_split('/\s+/', "a b c d ef");
```

Replacement:

```
$str = "选项[http://127.0.0.1/weixin/addons/yoby_diyform/weui/fm.jpg]你好";
// Turn the bracketed URL into an <img> tag; ${1} is the first capture group.
$str1 = preg_replace("/(?:\[)(.*?)(?:\])/i", "<img src=\"\${1}\" />", $str);

// Keep only the text between the two | characters.
preg_replace("/.*\|(.*?)\|.*/i", "\${1}", $v);
// input:  字符|120000|来了
// output: 120000
```

`\s+` matches a run of whitespace; `[^>]` matches any single character other than `>`, so `[^>]*` consumes everything up to the next `>`; `.*?` matches any number of characters (lazy); `\d+` matches one or more digits.

```
/* Fetch an HTML page and clean it up with regular expressions */
function get_content($url){
    $html = file_get_contents($url);
    $code = mb_detect_encoding($html, array("GB2312", "GBK", 'UTF-8', 'BIG5')); // detect the encoding
    if ($code != "UTF-8") {
        $htmls = mb_convert_encoding($html, "UTF-8", $code); // convert the content to UTF-8
    } else {
        $htmls = $html;
    }
    $htmls = preg_replace("/<script[\s\S]*?<\/script>/i", "", $htmls, -1); // remove script blocks
    $htmls = preg_replace("/<noscript[\s\S]*?<\/noscript>/i", "", $htmls, -1); // remove noscript blocks
    $htmls = preg_replace("/<(\/?link.*?)>/si", "", $htmls); // remove link tags
    $htmls = preg_replace("/<(style.*?)>(.*?)<(\/style.*?)>/si", "", $htmls); // remove style blocks
    $htmls = preg_replace("/style=.+?['\"]/i", '', $htmls, -1); // remove inline style attributes
    $htmls = preg_replace('#<!--[^\!\[]*?(?<!\/\/)-->#', '', $htmls); // remove HTML comments
    $htmls = preg_replace("/<a[^>]*>(.*?)<\/a>/is", '$1', $htmls); // unwrap links, keeping the link text
    $htmls = preg_replace('/^[ \t]*\r?\n/m', '', $htmls); // remove blank lines
    return $htmls;
}

// Grab the content between two elements identified by their class attributes.
preg_match('/<div class="infoBox-list".*?>.*?<div class="news-page clearfix">/ism', $htmls, $rs);
$htmls = $rs[0];

// Prepend http:// when the URL has no http/https scheme.
$url = preg_match('/^https?:\/\/.+/', $url) ? $url : 'http://' . $url;

// Capture the first image source.
preg_match("/src=\"\/?(.*?)\"/", $content, $match);
```

```
[\x{4e00}-\x{9fa5}]{0,}   Chinese characters (PCRE has no \uXXXX escape; use \x{...} with the /u modifier)
\d+                       non-negative integer (one or more digits)
[a-zA-Z]+                 the 26 letters, upper or lower case
[A-Za-z0-9]+              letters and digits
\s+                       a run of whitespace
[0-9]*                    a run of digits (possibly empty)
\d{4}                     exactly four digits
\d{5,}                    at least five digits
\d{4,10}                  four to ten digits
```
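The four `preg_*` functions covered in this lesson are easier to compare when run against the same input. The following is a minimal sketch, not part of the original lesson: the sample string and the variable names (`$one`, `$all`, `$bold`, `$parts`) are made up for illustration.

```php
<?php
// Hypothetical sample input, invented for this sketch.
$str = "<center>first</center>\n<center>second\nline</center>";
$pattern = '/<center>([\s\S]*?)<\/center>/';

// preg_match stops after the first match and returns 1, 0, or false.
$found = preg_match($pattern, $str, $one);
// $one[0] holds the whole match, $one[1] the first capture group.

// preg_match_all keeps going and returns the number of matches.
$count = preg_match_all($pattern, $str, $all);
// $all[1] lists the first capture group of every match.

// preg_replace: ${1} in the replacement refers to capture group 1.
$bold = preg_replace($pattern, '<b>${1}</b>', $str);

// preg_split: split on runs of whitespace.
$parts = preg_split('/\s+/', "a b  c d ef");

print_r([$found, $one[1], $count, $all[1], $bold, $parts]);
```

Because `[\s\S]` also matches newlines, the second `<center>` element is captured even though it spans two lines; with `.` in its place, the `s` modifier would be needed.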
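Three of the smaller patterns in this lesson (the scheme check, the first-image capture, and the Chinese character class) can be wrapped in small helpers. This is a sketch under stated assumptions: the function names `ensure_scheme`, `first_image_src`, and `chinese_runs` are hypothetical, the image pattern is a slightly stricter variant that anchors on `<img`, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper names, for illustration only.

// Prefix a URL with http:// when it carries no http/https scheme.
function ensure_scheme(string $url): string
{
    return preg_match('/^https?:\/\//i', $url) ? $url : 'http://' . $url;
}

// Return the src of the first <img> in an HTML fragment, or null if none.
function first_image_src(string $content): ?string
{
    return preg_match('/<img[^>]*\ssrc="([^"]*)"/i', $content, $m) ? $m[1] : null;
}

// Collect runs of Chinese characters. PCRE needs \x{...} plus the /u
// modifier here; the \uXXXX escape is not supported.
function chinese_runs(string $text): array
{
    preg_match_all('/[\x{4e00}-\x{9fa5}]+/u', $text, $m);
    return $m[0];
}

print_r(ensure_scheme('example.com'));
print_r(first_image_src('<p><img class="a" src="/img/fm.jpg" alt=""></p>'));
print_r(chinese_runs('选项abc你好123'));
```

Regular expressions are adequate for quick extraction like this; for arbitrary real-world HTML, a parser such as PHP's `DOMDocument` is usually more robust.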