Lines Matching defs:character

59 //        Examples of elements include character classes, plain strings
124 // * Choice nodes have 1-character lookahead.
125 // A choice node looks at the following character and eliminates some of
126 // the choices immediately based on that character. This is not yet
131 // implementation of this would push each character position onto the
597 // of the negative submatch and restore the character position.
753 int GetCaseIndependentLetters(Isolate* isolate, base::uc16 character,
757 if (RegExpCaseFolding::IgnoreSet().contains(character)) {
758 letters[0] = character;
763 RegExpCaseFolding::SpecialAddSet().contains(character);
766 set.add(character);
771 canon = RegExpCaseFolding::Canonicalize(character);
792 isolate->jsregexp_uncanonicalize()->get(character, '\0', letters);
796 letters[0] = character;
839 // character. We do not need to do anything since the one-byte pass
877 // subtract the difference from the found character, then do the or
899 // if this character lies before a character that matched.
1060 // heuristics are complicated a little by the fact that any 128-character
1063 // 128-character space can take up a lot of space in the ranges array if,
1064 // for example, we only want to match every second character (e.g. the lower
1069 // character range (even non-Latin1 charset-based text has spaces and
1100 // Gets a series of segment boundaries representing a character class. If the
1101 // character is in the range between an even and an odd boundary (counting from
1103 // know that the character is in the range of min_char to max_char inclusive.
1119 // Just need to test if the character is before or on-or-after
1120 // a particular character.
1135 // character class.
1157 // determine whether the character is inside or outside the character class.
1547 // For 2-character preloads in one-byte mode or 1-character preloads in
1579 // We iterate along the text object, building up for each character a
1582 // machine word for the current character width in order to be used in
1617 // a match at this character position.
1631 // whether we have a match at this character position. Otherwise
1642 // determine definitely whether we have a match at this character
1665 // A quick check uses multi-character mask and compare. There is no
1705 // so the chances of a false positive rise. A character class
2114 // Emit the code to check for a ^ in multiline mode (1-character lookbehind
2119 // We will load the previous character into the current character register.
2126 // bounds-checks when loading the one-character lookbehind.
2138 // to load the previous character without bounds checks.
2188 // Next character is not a word character.
2221 // bounds-checks when loading the one-character lookbehind.
2226 // The start of input counts as a non-word character, so the question is
2232 // to load the previous character without bounds checks.
2309 // second pass and the character class in the last pass.
2317 // A slight complication involves the fact that the first character may already
2319 // do the test for that character first. We do this in separate passes. The
2322 // first_element_checked to indicate that that character does not need to be
2332 // or obviate the need for further checks at some character positions.
2354 // Everywhere else we assume that a non-Latin-1 character cannot match
2355 // a Latin-1 character. Avoid the cases where this assumption is
2459 // straight character sequences (possibly to be matched in a case-independent
2460 // way) and character classes. For efficiency we do not do this in a single
2462 // emitting code for some character positions every time. See the comment on
2483 // If a character is preloaded into the current character register then
2511 // We don't have an instruction for shifting the current character register
2539 // None of the standard character classes is different in the case
2698 void BoyerMoorePositionInfo::Set(int character) {
2699 SetInterval(Interval(character, character));
2808 // Find the highest-points range between 0 and length_ where the character
2835 // Add 1 to the frequency to give a small per-character boost for
2868 // character at max_lookahead offset is not one of these characters, then we
2904 // contains precisely one character.
2978 * character register. R nodes do this preloading. Vertices are marked
3204 // any character one at a time. Any non-anchored regexp has such a
3207 // and step forwards 3 if the character is not one of abc. Abc need
3208 // not be atoms, they can be any reasonably limited character class or
3332 // Reload the current character, since the next quick check expects that.
3803 // hard, so we just say that any character can match.
3845 base::uc16 character = atom->data()[j];
3849 isolate, character, bm->max_char() == String::kMaxOneByteCharCode,
3855 if (character <= max_char) bm->Set(offset, character);
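
Note on the hits at 1100-1103: those comments describe a character class as a series of segment boundaries, with the test depending on whether the character falls between an even and an odd boundary. The sketch below is one way to read that idea, not the code from the file. The function name InClassByBoundaries, the container type, and the convention that segments starting at an even index are inside the class are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the listed source file.
// `boundaries` must be sorted ascending. Segment i covers the half-open
// range [boundaries[i], boundaries[i + 1]).
// Assumption: segments that start at an even index are inside the class,
// segments that start at an odd index are outside it.
bool InClassByBoundaries(const std::vector<std::uint32_t>& boundaries,
                         std::uint32_t c) {
  // Count how many boundaries are less than or equal to c.
  auto it = std::upper_bound(boundaries.begin(), boundaries.end(), c);
  std::size_t passed = static_cast<std::size_t>(it - boundaries.begin());
  // An odd count means the last boundary passed has an even index,
  // so c lies between an even and an odd boundary.
  return passed % 2 == 1;
}
```

For the class [0-9a-z] the boundary list would be {'0', '9' + 1, 'a', 'z' + 1}. A binary search over the boundaries keeps the test logarithmic in the number of ranges, which is the kind of saving the surrounding comments about range tables appear to be after.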