<?php
include_once $_SERVER['DOCUMENT_ROOT'] . '/include/shared-manual.inc';
$TOC = array();
$TOC_DEPRECATED = array();
$PARENTS = array();
include_once dirname(__FILE__) ."/toc/reference.pcre.pattern.syntax.inc";
$setup = array (
  'home' => 
  array (
    0 => 'index.php',
    1 => 'PHP Manual',
  ),
  'head' => 
  array (
    0 => 'UTF-8',
    1 => 'de',
  ),
  'this' => 
  array (
    0 => 'regexp.reference.repetition.php',
    1 => 'Repetition',
    2 => 'Repetition',
  ),
  'up' => 
  array (
    0 => 'reference.pcre.pattern.syntax.php',
    1 => 'PCRE regex syntax',
  ),
  'prev' => 
  array (
    0 => 'regexp.reference.subpatterns.php',
    1 => 'Subpatterns',
  ),
  'next' => 
  array (
    0 => 'regexp.reference.back-references.php',
    1 => 'Back references',
  ),
  'alternatives' => 
  array (
  ),
  'source' => 
  array (
    'lang' => 'en',
    'path' => 'reference/pcre/pattern.syntax.xml',
  ),
  'history' => 
  array (
  ),
);
$setup["toc"] = $TOC;
$setup["toc_deprecated"] = $TOC_DEPRECATED;
$setup["parents"] = $PARENTS;
manual_setup($setup);

contributors($setup);

?>
<div id="regexp.reference.repetition" class="section">
  <h2 class="title">Repetition</h2>
  <p class="para">
   Repetition is specified by quantifiers, which can follow any
   of the following items:

   <ul class="itemizedlist">
    <li class="listitem"><span class="simpara">a single character, possibly escaped</span></li>
    <li class="listitem"><span class="simpara">the . metacharacter</span></li>
    <li class="listitem"><span class="simpara">a character class</span></li>
    <li class="listitem"><span class="simpara">a back reference (see next section)</span></li>
    <li class="listitem"><span class="simpara">a parenthesized subpattern (unless it is an assertion -
     see below)</span></li>
   </ul>
  </p>
  <p class="para">
   The general repetition quantifier specifies a minimum and
   maximum number of permitted matches, by giving the two
   numbers in curly brackets (braces), separated by a comma.
   The numbers must be less than 65536, and the first must be
   less than or equal to the second. For example:

   <code class="literal">z{2,4}</code>

   matches &quot;zz&quot;, &quot;zzz&quot;, or &quot;zzzz&quot;. A closing brace on its own
   is not a special character. If the second number is omitted,
   but the comma is present, there is no upper limit; if the
   second number and the comma are both omitted, the quantifier
   specifies an exact number of required matches. Thus

   <code class="literal">[aeiou]{3,}</code>

   matches at least 3 successive vowels, but may match many
   more, while

   <code class="literal">\d{8}</code>

   matches exactly 8 digits.

  </p>
  <p class="simpara">
   Prior to PHP 8.4.0, an opening curly bracket that
   appears in a position where a quantifier is not allowed, or
   one that does not match the syntax of a quantifier, is taken
   as a literal character. For example, <code class="literal">{,6}</code>
   is not a quantifier, but a literal string of four characters.

   As of PHP 8.4.0, the PCRE extension is bundled with PCRE2 version 10.44,
   which allows patterns such as <code class="literal">\d{,8}</code> and they are
   interpreted as <code class="literal">\d{0,8}</code>.

   Further, as of PHP 8.4.0, space characters around quantifiers such as
   <code class="literal">\d{0 , 8}</code> and <code class="literal">\d{ 0 , 8 }</code> are allowed.
  </p>
  <p class="para">
   The quantifier {0} is permitted, causing the expression to
   behave as if the previous item and the quantifier were not
   present.
  </p>
  <p class="para">
   For convenience (and historical compatibility) the three
   most common quantifiers have single-character abbreviations:

   <table class="doctable table">
    <caption><strong>Single-character quantifiers</strong></caption>
    
     <tbody class="tbody">
      <tr>
       <td><code class="literal">*</code></td>
       <td>equivalent to <code class="literal">{0,}</code></td>
      </tr>

      <tr>
       <td><code class="literal">+</code></td>
       <td>equivalent to <code class="literal">{1,}</code></td>
      </tr>

      <tr>
       <td><code class="literal">?</code></td>
       <td>equivalent to <code class="literal">{0,1}</code></td>
      </tr>

     </tbody>
    
   </table>

  </p>
  <p class="para">
   It is possible to construct infinite loops by following a
   subpattern that can match no characters with a quantifier
   that has no upper limit, for example:

   <code class="literal">(a?)*</code>
  </p>
  <p class="para">
   Earlier versions of Perl and PCRE used to give an error at
   compile time for such patterns. However, because there are
   cases where this can be useful, such patterns are now
   accepted, but if any repetition of the subpattern does in
   fact match no characters, the loop is forcibly broken.
  </p>
  <p class="para">
   By default, the quantifiers are &quot;greedy&quot;, that is, they
   match as much as possible (up to the maximum number of permitted
   times), without causing the rest of the pattern to
   fail. The classic example of where this gives problems is in
   trying to match comments in C programs. These appear between
   the sequences /* and */ and within the sequence, individual
   * and / characters may appear. An attempt to match C comments
   by applying the pattern

   <code class="literal">/\*.*\*/</code>

   to the string

   <code class="literal">/* first comment */ not comment /* second comment */</code>

   fails, because it matches the entire string due to the
   greediness of the .* item.
  </p>
  <p class="para">
   However, if a quantifier is followed by a question mark,
   then it becomes lazy, and instead matches the minimum
   number of times possible, so the pattern

   <code class="literal">/\*.*?\*/</code>

   does the right thing with the C comments. The meaning of the
   various quantifiers is not otherwise changed, just the preferred
   number of matches. Do not confuse this use of
   question mark with its use as a quantifier in its own right.
   Because it has two uses, it can sometimes appear doubled, as
   in

   <code class="literal">\d??\d</code>

   which matches one digit by preference, but can match two if
   that is the only way the rest of the pattern matches.
  </p>
  <p class="para">
   If the <a href="reference.pcre.pattern.modifiers.php" class="link">PCRE_UNGREEDY</a>
   option is set (an option which is not
   available in Perl) then the quantifiers are not greedy by
   default, but individual ones can be made greedy by following
   them with a question mark. In other words, it inverts the
   default behaviour.
  </p>
  <p class="para">
   Quantifiers followed by <code class="literal">+</code> are &quot;possessive&quot;. They eat
   as many characters as possible and don&#039;t return to match the rest of the
   pattern. Thus <code class="literal">.*abc</code> matches &quot;aabc&quot; but
   <code class="literal">.*+abc</code> doesn&#039;t because <code class="literal">.*+</code> eats the
   whole string. Possessive quantifiers can be used to speed up processing.
  </p>
  <p class="para">
   When a parenthesized subpattern is quantified with a minimum
   repeat count that is greater than 1 or with a limited maximum,
   more store is required for the compiled pattern, in
   proportion to the size of the minimum or maximum.
  </p>
  <p class="para">
   If a pattern starts with .* or .{0,} and the <a href="reference.pcre.pattern.modifiers.php" class="link">PCRE_DOTALL</a>
   option (equivalent to Perl&#039;s /s) is set, thus allowing the .
   to match newlines, then the pattern is implicitly anchored,
   because whatever follows will be tried against every character
   position in the subject string, so there is no point in
   retrying the overall match at any position after the first.
   PCRE treats such a pattern as though it were preceded by \A.
   In cases where it is known that the subject string contains
   no newlines, it is worth setting <a href="reference.pcre.pattern.modifiers.php" class="link">PCRE_DOTALL</a> when the
   pattern begins with .* in order to
   obtain this optimization, or
   alternatively using ^ to indicate anchoring explicitly.
  </p>
  <p class="para">
   When a capturing subpattern is repeated, the value captured
   is the substring that matched the final iteration. For example, after

   <code class="literal">(tweedle[dume]{3}\s*)+</code>

   has matched &quot;tweedledum tweedledee&quot; the value of the captured
   substring is &quot;tweedledee&quot;. However, if there are
   nested capturing subpatterns, the corresponding captured
   values may have been set in previous iterations. For example,
   after

   <code class="literal">/(a|(b))+/</code>

   matches &quot;aba&quot; the value of the second captured substring is
   &quot;b&quot;.
  </p>
 </div><?php manual_footer($setup); ?>