regexp-compiler.cc - OpenGrok cross reference for /third_party/node/deps/v8/src/regexp/regexp-compiler.cc

Lines Matching defs:character
59 //        Examples of elements include character classes, plain strings
124 //   * Choice nodes have 1-character lookahead.
125 //     A choice node looks at the following character and eliminates some of
126 //     the choices immediately based on that character.  This is not yet
131 //     implementation of this would push each character position onto the
597   // of the negative submatch and restore the character position.
753 int GetCaseIndependentLetters(Isolate* isolate, base::uc16 character,
757   if (RegExpCaseFolding::IgnoreSet().contains(character)) {
758     letters[0] = character;
763       RegExpCaseFolding::SpecialAddSet().contains(character);
766   set.add(character);
771     canon = RegExpCaseFolding::Canonicalize(character);
792       isolate->jsregexp_uncanonicalize()->get(character, '\0', letters);
796     letters[0] = character;
839     // character.  We do not need to do anything since the one-byte pass
877     // subtract the difference from the found character, then do the or
899   // if this character lies before a character that matched.
1060   // heuristics are complicated a little by the fact that any 128-character
1063   // 128-character space can take up a lot of space in the ranges array if,
1064   // for example, we only want to match every second character (eg. the lower
1069   // character range (even non-Latin1 charset-based text has spaces and
1100 // Gets a series of segment boundaries representing a character class.  If the
1101 // character is in the range between an even and an odd boundary (counting from
1103 // know that the character is in the range of min_char to max_char inclusive.
1119   // Just need to test if the character is before or on-or-after
1120   // a particular character.
1135   // character class.
1157   // determine whether the character is inside or outside the character class.
1547     // For 2-character preloads in one-byte mode or 1-character preloads in
1579 // We iterate along the text object, building up for each character a
1582 // machine word for the current character width in order to be used in
1617             // a match at this character position.
1631             // whether we have a match at this character position.  Otherwise
1642           // determine definitely whether we have a match at this character
1665         // A quick check uses multi-character mask and compare.  There is no
1705           // so the chances of a false positive rise.  A character class
2114 // Emit the code to check for a ^ in multiline mode (1-character lookbehind
2119   // We will load the previous character into the current character register.
2126   // bounds-checks when loading the one-character lookbehind.
2138   // to load the previous character without bounds checks.
2188     // Next character is not a word character.
2221   // bounds-checks when loading the one-character lookbehind.
2226     // The start of input counts as a non-word character, so the question is
2232   // to load the previous character without bounds checks.
2309 // second pass and the character class in the last pass.
2317 // A slight complication involves the fact that the first character may already
2319 // do the test for that character first.  We do this in separate passes.  The
2322 // first_element_checked to indicate that that character does not need to be
2332 // or obviate the need for further checks at some character positions.
2354           // Everywhere else we assume that a non-Latin-1 character cannot match
2355           // a Latin-1 character. Avoid the cases where this assumption is
2459 // straight character sequences (possibly to be matched in a case-independent
2460 // way) and character classes.  For efficiency we do not do this in a single
2462 // emitting code for some character positions every time.  See the comment on
2483   // If a character is preloaded into the current character register then
2511   // We don't have an instruction for shifting the current character register
2539       // None of the standard character classes is different in the case
2698 void BoyerMoorePositionInfo::Set(int character) {
2699   SetInterval(Interval(character, character));
2808 // Find the highest-points range between 0 and length_ where the character
2835       // Add 1 to the frequency to give a small per-character boost for
2868 // character at max_lookahead offset is not one of these characters, then we
2904   // contains precisely one character.
2978  * character register.  R nodes do this preloading.  Vertices are marked
3204   // any character one at a time.  Any non-anchored regexp has such a
3207   // and step forwards 3 if the character is not one of abc.  Abc need
3208   // not be atoms, they can be any reasonably limited character class or
3332     // Reload the current character, since the next quick check expects that.
3803   // hard, so we just say that any character can match.
3845         base::uc16 character = atom->data()[j];
3849               isolate, character, bm->max_char() == String::kMaxOneByteCharCode,
3855           if (character <= max_char) bm->Set(offset, character);
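
The comment matched at lines 1100-1103 describes a character class as a series of segment boundaries, where membership depends on whether the character lies between an even and an odd boundary. The following is a minimal sketch of that idea only, not V8's code: the helper name `InEvenSegment`, the half-open segment convention, and the choice to count boundaries from index 0 are all assumptions made for illustration, since the listing does not show the full comment or the function it documents.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of V8. Returns true when `c` falls in a
// segment that starts at an even-indexed boundary, i.e. in
// [boundaries[0], boundaries[1]), [boundaries[2], boundaries[3]), ...
// Assumption: `boundaries` is sorted in strictly increasing order.
bool InEvenSegment(const std::vector<uint16_t>& boundaries, uint16_t c) {
  // Binary search for the first boundary greater than c; its index equals
  // the number of boundaries that are <= c.
  auto it = std::upper_bound(boundaries.begin(), boundaries.end(), c);
  auto passed = it - boundaries.begin();
  // An odd count means the last boundary passed had an even index, so c
  // lies between an even and an odd boundary.
  return (passed % 2) == 1;
}
```

With the boundaries {48, 58, 97, 123}, which encode the class [0-9a-z] under these assumptions, the code units for '5' and 'a' land in even segments while ':' and '{' do not.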