The Pattern class supports the use of the following (Perl-like)
special escape sequences:
\b - indicating a word-boundary
\d - indicating a digit ([[:digit:]]) character
\s - indicating a white-space ([[:space:]]) character
\w - indicating a word ([[:alnum:]]) character
The corresponding capitals (e.g., \W) define the complementary character sets. The capitalized shorthands are not expanded inside explicit character classes (i.e., [ ... ] constructions), so [\W] represents a set of two characters: \ and W.
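The two-character reading of [\W] can be checked against regcomp(3) directly, without going through Pattern. The following is a minimal sketch, assuming a POSIX <regex.h> is available; the helper name matchesPosix is hypothetical and not part of Bobcat.

```cpp
#include <regex.h>
#include <string>

// Hypothetical helper (not part of Bobcat): returns true if the POSIX
// extended regular expression `re` matches somewhere in `text`.
bool matchesPosix(std::string const &re, std::string const &text)
{
    regex_t compiled;

    // REG_NOSUB: only a yes/no answer is needed, no substring offsets.
    if (regcomp(&compiled, re.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
        return false;                       // the expression did not compile

    bool found = regexec(&compiled, text.c_str(), 0, nullptr, 0) == 0;
    regfree(&compiled);
    return found;
}
```

Inside a POSIX bracket expression the backslash has no special meaning, so matchesPosix(R"([\W])", "W") and matchesPosix(R"([\W])", "\\") should both succeed, while matchesPosix(R"([\W])", "a") should fail.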
Because the backslash (\) is a special character, it must be handled carefully. Pattern converts the escape sequences \d, \s and \w (and, outside of explicit character classes, the sequences \D, \S and \W) to their respective character classes. All other escape sequences are kept as-is, and the resulting regular expression is passed to the pattern compilation function regcomp(3), which in turn interprets the remaining escape sequences. Here are the rules:
---------------------------------------------------------
Specify:    Converts to:    regcomp uses:    Matches:
---------------------------------------------------------
\d          [[:digit:]]     [[:digit:]]      3
\x          \x              x                x
---------------------------------------------------------
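In C++ source such patterns are most easily written as raw string literals, so that the backslashes need not be doubled:
R"((\w+)\s*:\s*\d+)"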
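To make the conversion concrete, the sketch below writes the raw string literal pattern both with and without doubled backslashes, and then matches a hand-expanded POSIX form using regcomp(3) directly. The expanded form is an assumption based on the character classes listed above; the expression Pattern actually produces may differ in detail. The names rawForm, escapedForm, posixForm and firstGroup are hypothetical.

```cpp
#include <regex.h>
#include <string>

// The same pattern written as a raw string literal and as an ordinary
// literal in which every backslash is doubled.
std::string const rawForm     = R"((\w+)\s*:\s*\d+)";
std::string const escapedForm = "(\\w+)\\s*:\\s*\\d+";

// Hand-expanded equivalent (an assumption, not Pattern's verified output).
std::string const posixForm =
    "([[:alnum:]]+)[[:space:]]*:[[:space:]]*[[:digit:]]+";

// Hypothetical helper: returns the text matched by the first parenthesized
// subexpression of posixForm, or an empty string if there is no match.
std::string firstGroup(std::string const &text)
{
    regex_t compiled;
    if (regcomp(&compiled, posixForm.c_str(), REG_EXTENDED) != 0)
        return "";

    regmatch_t groups[2];                   // [0]: whole match, [1]: (...)
    std::string result;
    if (regexec(&compiled, text.c_str(), 2, groups, 0) == 0)
        result = text.substr(groups[1].rm_so,
                             groups[1].rm_eo - groups[1].rm_so);

    regfree(&compiled);
    return result;
}
```

With this sketch, firstGroup("width : 42") yields "width", and rawForm compares equal to escapedForm.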
REG_EXTENDED:
Use POSIX Extended Regular Expression syntax when
interpreting regex. If not set, POSIX Basic Regular
Expression syntax is used.
REG_NOSUB:
Support for substring addressing of matches is not
required. The nmatch and pmatch parameters to
regexec are ignored if the pattern buffer supplied
was compiled with this flag set.
REG_NEWLINE:
Match-any-character operators don't match a newline.
A non-matching list ([^...]) not containing a newline does not match a newline.
Match-beginning-of-line operator (^) matches the empty string immediately after a newline, regardless of whether eflags, the execution flags of regexec, contains REG_NOTBOL.
Match-end-of-line operator ($) matches the empty string immediately before a newline, regardless of whether eflags contains REG_NOTEOL.
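The effect of REG_NEWLINE is easiest to see on a text containing a newline. The sketch below calls regcomp(3) and regexec(3) directly instead of going through Pattern; the helper name matchesWith is hypothetical.

```cpp
#include <regex.h>
#include <string>

// Hypothetical helper: compiles `re` using REG_EXTENDED | REG_NOSUB plus
// `extraFlags`, and reports whether it matches somewhere in `text`.
bool matchesWith(int extraFlags, std::string const &re,
                 std::string const &text)
{
    regex_t compiled;
    if (regcomp(&compiled, re.c_str(),
                REG_EXTENDED | REG_NOSUB | extraFlags) != 0)
        return false;                       // the expression did not compile

    bool found = regexec(&compiled, text.c_str(), 0, nullptr, 0) == 0;
    regfree(&compiled);
    return found;
}
```

For the text "a\nb", matchesWith(0, "^b", ...) fails because ^ only matches at the start of the text, whereas matchesWith(REG_NEWLINE, "^b", ...) succeeds. Conversely, "a.b" matches without REG_NEWLINE (the dot matches the newline) and fails with it.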
Copy and move constructors (and assignment operators) are available.
Options may be:
REG_NOTBOL:
The match-beginning-of-line operator always fails to match
(but see the compilation flag REG_NEWLINE above). This flag
may be used when different portions of a string are passed
to regexec and the beginning of the string should not be
interpreted as the beginning of the line.
REG_NOTEOL:
The match-end-of-line operator always fails to
match (but see the compilation flag REG_NEWLINE above).
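The execution flags can be tried out in isolation with regexec(3). The following is a minimal sketch, assuming a POSIX <regex.h>; the helper name matchesExec is hypothetical and the sketch does not use Pattern itself.

```cpp
#include <regex.h>
#include <string>

// Hypothetical helper: compiles `re` as a POSIX extended regular expression
// and runs regexec on `text` using the execution flags `eflags`.
bool matchesExec(std::string const &re, std::string const &text, int eflags)
{
    regex_t compiled;
    if (regcomp(&compiled, re.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
        return false;                       // the expression did not compile

    bool found = regexec(&compiled, text.c_str(), 0, nullptr, eflags) == 0;
    regfree(&compiled);
    return found;
}
```

Here matchesExec("^b", "b", 0) succeeds, but matchesExec("^b", "b", REG_NOTBOL) fails: the text "b" is then treated as a later portion of a longer string, so its first character is not the beginning of a line. REG_NOTEOL has the same effect on "b$".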
#include <exception>
#include <iostream>
#include <string>

#include <bobcat/pattern>

using namespace std;
using namespace FBB;

void showSubstr(string const &str)
{
    static int count = 0;
    cout << "String " << ++count << " is '" << str << "'\n";
}

void match(Pattern const &patt, string const &text)
try
{
    Pattern pattern{ patt };
    pattern.match(text);

    Pattern p3(pattern);

    cout << "before: " << p3.before() << "\n"
            "matched: " << p3.matched() << "\n"
            "beyond: " << pattern.beyond() << "\n"
            "end() = " << pattern.end() << '\n';

    for (size_t idx = 0; idx != pattern.end(); ++idx)
    {
        string str = pattern[idx];

        if (str.empty())
            cout << "part " << idx << " not present\n";
        else
        {
            Pattern::Position pos = pattern.position(idx);

            cout << "part " << idx << ": '" << str << "' (" <<
                    pos.first << "-" << pos.second << ")\n";
        }
    }
}
catch (exception const &exc)
{
    cout << exc.what() << '\n';
}

int main(int argc, char **argv)
{
    string patStr = R"(\d+)";

    do
    {
        cout << "Pattern: '" << patStr << "'\n";
        try
        {
            // by default: case sensitive
            // use any args. for case insensitive
            Pattern patt(patStr, argc == 1);

            cout << "Compiled pattern: " << patt.pattern() << '\n';

            while (true)
            {
                cout << "string to match : ";

                string text;
                getline(cin, text);
                if (text.empty())
                    break;

                cout << "String: '" << text << "'\n";
                match(patt, text);
            }
        }
        catch (exception const &exc)
        {
            cout << exc.what() << ": compilation failed\n";
        }

        cout << "New pattern: ";
    }
    while (getline(cin, patStr) and not patStr.empty());
}