Regexp2 is a feature-rich RegExp engine for Go. It doesn't have the linear-time guarantees of the built-in regexp package, but it allows backtracking and is compatible with Perl5 and .NET. You'll likely be better off with the RE2 engine from the regexp package and should only use this if you need to write very complex patterns or require compatibility with .NET.
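One quick way to apply that advice is to check whether the built-in regexp package accepts a pattern at all; if it does, there is usually no reason to reach for a backtracking engine. The sketch below uses only the standard library, and the helper name stdlibCanCompile is hypothetical (it is not part of regexp or regexp2).

```go
package main

import (
	"fmt"
	"regexp"
)

// stdlibCanCompile reports whether Go's built-in regexp package accepts
// the pattern. Hypothetical helper, used here only for illustration.
func stdlibCanCompile(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`\d{4}-\d{2}-\d{2}`, // plain pattern
		`(\w+) \1`,          // back reference
		`foo(?=bar)`,        // positive lookahead
		`(?<!\$)\d+`,        // negative lookbehind
	}
	for _, p := range patterns {
		if stdlibCanCompile(p) {
			fmt.Printf("%-20s regexp is sufficient\n", p)
		} else {
			fmt.Printf("%-20s needs a backtracking engine such as regexp2\n", p)
		}
	}
}
```

Patterns that fail this check are the ones where regexp2 is worth considering.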
The engine is ported from the .NET framework's System.Text.RegularExpressions.Regex engine. That engine was open sourced in 2015 under the MIT license. There are some fundamental differences between .NET strings and Go strings that required a bit of borrowing from the Go framework regex engine as well. I cleaned up a couple of the dirtier bits during the port (regexcharclass.cs was terrible), but the parse tree, code emitted, and therefore patterns matched should be identical.
This is a go-gettable library, so install is easy:
go get github.com/dlclark/regexp2/...
Usage is similar to the Go regexp package. Just like in regexp, you start by converting a regex into a state machine via the Compile or MustCompile functions. They ultimately do the same thing, but Compile returns an error and MustCompile panics if the regex is invalid. You can then use the returned Regexp struct to find matches repeatedly. A Regexp struct is safe to use across goroutines.
re := regexp2.MustCompile(`Your pattern`, 0)
if isMatch, _ := re.MatchString(`Something to match`); isMatch {
//do something
}
The only error that the *Match* methods should return is a Timeout if you set the re.MatchTimeout field. Any other error is a bug in the regexp2 package. If you need more details about capture groups in a match then use the FindStringMatch method, like so:
if m, _ := re.FindStringMatch(`Something to match`); m != nil {
	// the whole match is always group 0
	fmt.Printf("Group 0: %v\n", m.String())
	// you can get all the groups too
	gps := m.Groups()
	// a group can be captured multiple times, so each cap is separately addressable
	// (this assumes the pattern has a group 1 that was captured at least twice)
	fmt.Printf("Group 1, first capture: %v\n", gps[1].Captures[0].String())
	fmt.Printf("Group 1, second capture: %v\n", gps[1].Captures[1].String())
}
Group 0 is embedded in the Match. Group 0 is an automatically-assigned group that encompasses the whole pattern. This means that m.String() is the same as m.Group.String() and m.Groups()[0].String().
The last capture is embedded in each group, so g.String() will return the same thing as g.Capture.String() and g.Captures[len(g.Captures)-1].String().
Compare regexp and regexp2
Category | regexp | regexp2 |
---|---|---|
Catastrophic backtracking possible | no, constant execution time guarantees | yes, if your pattern is at risk you can use the re.MatchTimeout field |
Python-style capture groups (?P<name>re) | yes | no (yes in RE2 compat mode) |
.NET-style capture groups (?<name>re) or (?'name're) | no | yes |
comments (?#comment) | no | yes |
branch numbering reset (?\|a\|b) | no | no |
possessive match (?>re) | no | yes |
positive lookahead (?=re) | no | yes |
negative lookahead (?!re) | no | yes |
positive lookbehind (?<=re) | no | yes |
negative lookbehind (?<!re) | no | yes |
back reference \1 | no | yes |
named back reference \k'name' | no | yes |
named ascii character class [[:foo:]] | yes | no (yes in RE2 compat mode) |
conditionals (?(expr)yes\|no) | no | yes |
The default behavior of regexp2 is to match the .NET regexp engine; however, the RE2 option is provided to change the parsing to increase compatibility with RE2. Using the RE2 option when compiling a regexp will not take away any features, but will change the following behaviors:
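- add support for named ascii character classes (e.g. [[:foo:]])
- add support for Python-style capture groups (e.g. (?P<name>re))
- change $ to only match end of string (like RE2) (see #24)

re := regexp2.MustCompile(`Your RE2-compatible pattern`, regexp2.RE2)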
if isMatch, _ := re.MatchString(`Something to match`); isMatch {
//do something
}
This feature is a work in progress and I'm open to ideas for more things to put here (maybe more relaxed character escaping rules?).
I've run a battery of tests against regexp2 from various sources and found the debug output matches the .NET engine, but .NET and Go handle strings very differently. I've attempted to handle these differences, but most of my testing deals with basic ASCII with a little bit of multi-byte Unicode. There's a chance that there are bugs in the string handling related to character sets with supplementary Unicode chars. Right-to-Left support is coded, but not well tested either.
I'm open to new issues and pull requests with tests if you find something odd!