Commit c41c5b2

String and Regex update
1 parent 0fab178 commit c41c5b2

2 files changed: +39 -41 lines changed

README.md

Lines changed: 18 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -300,9 +300,11 @@ True
300300

301301
String
302302
------
303+
**Immutable sequence of characters.**
304+
303305
```python
304306
<str> = <str>.strip() # Strips all whitespace characters from both ends.
305-
<str> = <str>.strip('<chars>') # Strips all passed characters from both ends.
307+
<str> = <str>.strip('<chars>') # Strips passed characters. Also lstrip/rstrip().
306308
```
307309

308310
```python
@@ -321,6 +323,7 @@ String
321323
```
322324

323325
```python
326+
<str> = <str>.lower() # Changes the case. Also upper/capitalize/title().
324327
<str> = <str>.replace(old, new [, count]) # Replaces 'old' with 'new' at most 'count' times.
325328
<str> = <str>.translate(<table>) # Use `str.maketrans(<dict>)` to generate table.
326329
```
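
The case, replace and translate entries in this hunk can be checked with a short sketch. The sample string and variable names below are illustrative assumptions, not part of the commit.

```python
# Illustrative sample; any string would do.
text = 'Hello World'

lowered = text.lower()                 # Every cased character becomes lowercase.
capitalized = lowered.capitalize()     # Only the first character is uppercased.
titled = lowered.title()               # The first character of each word is uppercased.

# 'count' limits how many occurrences are replaced, counting from the left.
replaced = text.replace('l', 'L', 2)

# str.maketrans() accepts a dict; mapping a character to None deletes it.
table = str.maketrans({'o': '0', 'l': None})
translated = text.translate(table)

print(lowered, capitalized, titled, replaced, translated, sep='\n')
```

Mapping to `None` in the translation table is one way to delete characters in the same pass that substitutes others.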
@@ -329,38 +332,37 @@ String
329332
<str> = chr(<int>) # Converts int to Unicode character.
330333
<int> = ord(<str>) # Converts Unicode character to int.
331334
```
332-
* **Also: `'lstrip()'`, `'rstrip()'` and `'rsplit()'`.**
333-
* **Also: `'lower()'`, `'upper()'`, `'capitalize()'` and `'title()'`.**
335+
* **Use `'unicodedata.normalize("NFC", <str>)'` on strings that may contain characters like `'Ö'` before comparing them, because they can be stored as one or two characters.**
334336

335337
### Property Methods
336-
```text
337-
+---------------+----------+----------+----------+----------+----------+
338-
| | [ !#$%…] | [a-zA-Z] | [¼½¾] | [²³¹] | [0-9] |
339-
+---------------+----------+----------+----------+----------+----------+
340-
| isprintable() | yes | yes | yes | yes | yes |
341-
| isalnum() | | yes | yes | yes | yes |
342-
| isnumeric() | | | yes | yes | yes |
343-
| isdigit() | | | | yes | yes |
344-
| isdecimal() | | | | | yes |
345-
+---------------+----------+----------+----------+----------+----------+
338+
```python
339+
<bool> = <str>.isdecimal() # Checks for [0-9].
340+
<bool> = <str>.isdigit() # Checks for [²³¹] and isdecimal().
341+
<bool> = <str>.isnumeric() # Checks for [¼½¾] and isdigit().
342+
<bool> = <str>.isalnum() # Checks for [a-zA-Z] and isnumeric().
343+
<bool> = <str>.isprintable() # Checks for [ !#$%…] and isalnum().
344+
<bool> = <str>.isspace() # Checks for [ \t\n\r\f\v\x1c-\x1f\x85\xa0…].
346345
```
347-
* **`'isspace()'` checks for whitespaces: `'[ \t\n\r\f\v\x1c-\x1f\x85\xa0\u1680…]'`.**
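
The normalization note and the nested property checks in this hunk can be verified with a small sketch. The sample characters are chosen here for illustration and are not taken from the commit.

```python
import unicodedata

# 'Ö' can be stored as one code point or as 'O' plus a combining diaeresis.
composed = '\u00d6'
decomposed = 'O\u0308'

equal_raw = composed == decomposed
equal_nfc = (unicodedata.normalize('NFC', composed)
             == unicodedata.normalize('NFC', decomposed))

# For each sample character, collect the property methods that return True.
checks = ['isdecimal', 'isdigit', 'isnumeric', 'isalnum', 'isprintable', 'isspace']
samples = ['5', '²', '½', 'a', '!', ' ']
results = {s: [name for name in checks if getattr(s, name)()] for s in samples}

print(equal_raw, equal_nfc)
for sample, passed in results.items():
    print(repr(sample), passed)
```

Each character class passes its own check plus every broader one listed after it, which is the nesting that the rewritten comments describe. The space character is the exception: it passes only `isprintable()` and `isspace()`.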
348346

349347

350348
Regex
351349
-----
350+
**Functions for regular expression matching.**
351+
352352
```python
353353
import re
354+
```
355+
356+
```python
354357
<str> = re.sub(<regex>, new, text, count=0) # Substitutes all occurrences with 'new'.
355358
<list> = re.findall(<regex>, text) # Returns all occurrences as strings.
356359
<list> = re.split(<regex>, text, maxsplit=0) # Add brackets around regex to include matches.
357-
<Match> = re.search(<regex>, text) # Searches for first occurrence of the pattern.
360+
<Match> = re.search(<regex>, text) # First occurrence of the pattern or None.
358361
<Match> = re.match(<regex>, text) # Searches only at the beginning of the text.
359362
<iter> = re.finditer(<regex>, text) # Returns all occurrences as Match objects.
360363
```
361364

362365
* **Argument 'new' can be a function that accepts a Match object and returns a string.**
363-
* **Search() and match() return None if they can't find a match.**
364366
* **Argument `'flags=re.IGNORECASE'` can be used with all functions.**
365367
* **Argument `'flags=re.MULTILINE'` makes `'^'` and `'$'` match the start/end of each line.**
366368
* **Argument `'flags=re.DOTALL'` makes `'.'` also accept the `'\n'`.**
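
A minimal sketch of the behaviour described in the Regex hunk, including the `None` result that the commit now mentions in the `re.search()` comment. The sample text and variable names are assumptions made for illustration.

```python
import re

text = 'Line one\nline TWO'

# search() returns None when the pattern does not occur anywhere in the text.
missing = re.search(r'\d+', text)

# IGNORECASE lets 'line' match 'Line'; search() stops at the first occurrence.
first = re.search(r'line', text, flags=re.IGNORECASE)

# MULTILINE makes '^' match at the start of every line, not just the text.
starts = re.findall(r'^line', text, flags=re.IGNORECASE | re.MULTILINE)

# Without DOTALL, '.' does not match '\n'; with it, the match spans both lines.
plain = re.search(r'one.line', text)
across = re.search(r'one.line', text, flags=re.DOTALL)

# 'new' can be a function that receives a Match object and returns a string.
shouted = re.sub(r'[a-z]+', lambda m: m.group().upper(), 'one two')

# A capturing group around the pattern keeps the separators in the result.
parts = re.split(r'(,)', 'a,b')

print(missing, first.group(), starts, plain, across.group(), shouted, parts)
```

Because `re.search()` and `re.match()` can return `None`, callers usually test the result before calling `group()` on it.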

index.html

Lines changed: 21 additions & 25 deletions
Original file line number | Diff line number | Diff line change
@@ -54,7 +54,7 @@
5454

5555
<body>
5656
<header>
57-
<aside>October 4, 2023</aside>
57+
<aside>October 11, 2023</aside>
5858
<a href="https://gto76.github.io" rel="author">Jure Šorn</a>
5959
</header>
6060

@@ -290,10 +290,11 @@
290290
┃ decimal.Decimal │ ✓ │ │ │ │ ┃
291291
┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┛
292292
</code></pre>
293-
<div><h2 id="string"><a href="#string" name="string">#</a>String</h2><pre><code class="python language-python hljs">&lt;str&gt; = &lt;str&gt;.strip() <span class="hljs-comment"># Strips all whitespace characters from both ends.</span>
294-
&lt;str&gt; = &lt;str&gt;.strip(<span class="hljs-string">'&lt;chars&gt;'</span>) <span class="hljs-comment"># Strips all passed characters from both ends.</span>
293+
<div><h2 id="string"><a href="#string" name="string">#</a>String</h2><p><strong>Immutable sequence of characters.</strong></p><pre><code class="python language-python hljs">&lt;str&gt; = &lt;str&gt;.strip() <span class="hljs-comment"># Strips all whitespace characters from both ends.</span>
294+
&lt;str&gt; = &lt;str&gt;.strip(<span class="hljs-string">'&lt;chars&gt;'</span>) <span class="hljs-comment"># Strips passed characters. Also lstrip/rstrip().</span>
295295
</code></pre></div>
296296

297+
297298
<pre><code class="python language-python hljs">&lt;list&gt; = &lt;str&gt;.split() <span class="hljs-comment"># Splits on one or more whitespace characters.</span>
298299
&lt;list&gt; = &lt;str&gt;.split(sep=<span class="hljs-keyword">None</span>, maxsplit=<span class="hljs-number">-1</span>) <span class="hljs-comment"># Splits on 'sep' str at most 'maxsplit' times.</span>
299300
&lt;list&gt; = &lt;str&gt;.splitlines(keepends=<span class="hljs-keyword">False</span>) <span class="hljs-comment"># On [\n\r\f\v\x1c-\x1e\x85\u2028\u2029] and \r\n.</span>
@@ -305,42 +306,37 @@
305306
&lt;int&gt; = &lt;str&gt;.find(&lt;sub_str&gt;) <span class="hljs-comment"># Returns start index of the first match or -1.</span>
306307
&lt;int&gt; = &lt;str&gt;.index(&lt;sub_str&gt;) <span class="hljs-comment"># Same, but raises ValueError if missing.</span>
307308
</code></pre>
308-
<pre><code class="python language-python hljs">&lt;str&gt; = &lt;str&gt;.replace(old, new [, count]) <span class="hljs-comment"># Replaces 'old' with 'new' at most 'count' times.</span>
309+
<pre><code class="python language-python hljs">&lt;str&gt; = &lt;str&gt;.lower() <span class="hljs-comment"># Changes the case. Also upper/capitalize/title().</span>
310+
&lt;str&gt; = &lt;str&gt;.replace(old, new [, count]) <span class="hljs-comment"># Replaces 'old' with 'new' at most 'count' times.</span>
309311
&lt;str&gt; = &lt;str&gt;.translate(&lt;table&gt;) <span class="hljs-comment"># Use `str.maketrans(&lt;dict&gt;)` to generate table.</span>
310312
</code></pre>
311313
<pre><code class="python language-python hljs">&lt;str&gt; = chr(&lt;int&gt;) <span class="hljs-comment"># Converts int to Unicode character.</span>
312314
&lt;int&gt; = ord(&lt;str&gt;) <span class="hljs-comment"># Converts Unicode character to int.</span>
313315
</code></pre>
314316
<ul>
315-
<li><strong>Also: <code class="python hljs"><span class="hljs-string">'lstrip()'</span></code>, <code class="python hljs"><span class="hljs-string">'rstrip()'</span></code> and <code class="python hljs"><span class="hljs-string">'rsplit()'</span></code>.</strong></li>
316-
<li><strong>Also: <code class="python hljs"><span class="hljs-string">'lower()'</span></code>, <code class="python hljs"><span class="hljs-string">'upper()'</span></code>, <code class="python hljs"><span class="hljs-string">'capitalize()'</span></code> and <code class="python hljs"><span class="hljs-string">'title()'</span></code>.</strong></li>
317+
<li><strong>Use <code class="python hljs"><span class="hljs-string">'unicodedata.normalize("NFC", &lt;str&gt;)'</span></code> on strings that may contain characters like <code class="python hljs"><span class="hljs-string">'Ö'</span></code> before comparing them, because they can be stored as one or two characters.</strong></li>
317318
</ul>
318-
<div><h3 id="propertymethods">Property Methods</h3><pre><code class="text language-text">┏━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━┓
319-
┃ │ [ !#$%…] │ [a-zA-Z] │ [¼½¾] │ [²³¹] │ [0-9] ┃
320-
┠───────────────┼──────────┼──────────┼──────────┼──────────┼──────────┨
321-
┃ isprintable() │ ✓ │ ✓ │ ✓ │ ✓ │ ✓ ┃
322-
┃ isalnum() │ │ ✓ │ ✓ │ ✓ │ ✓ ┃
323-
┃ isnumeric() │ │ │ ✓ │ ✓ │ ✓ ┃
324-
┃ isdigit() │ │ │ │ ✓ │ ✓ ┃
325-
┃ isdecimal() │ │ │ │ │ ✓ ┃
326-
┗━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┛
319+
<div><h3 id="propertymethods">Property Methods</h3><pre><code class="python language-python hljs">&lt;bool&gt; = &lt;str&gt;.isdecimal() <span class="hljs-comment"># Checks for [0-9].</span>
320+
&lt;bool&gt; = &lt;str&gt;.isdigit() <span class="hljs-comment"># Checks for [²³¹] and isdecimal().</span>
321+
&lt;bool&gt; = &lt;str&gt;.isnumeric() <span class="hljs-comment"># Checks for [¼½¾] and isdigit().</span>
322+
&lt;bool&gt; = &lt;str&gt;.isalnum() <span class="hljs-comment"># Checks for [a-zA-Z] and isnumeric().</span>
323+
&lt;bool&gt; = &lt;str&gt;.isprintable() <span class="hljs-comment"># Checks for [ !#$%…] and isalnum().</span>
324+
&lt;bool&gt; = &lt;str&gt;.isspace() <span class="hljs-comment"># Checks for [ \t\n\r\f\v\x1c-\x1f\x85\xa0…].</span>
327325
</code></pre></div>
328326

329-
<ul>
330-
<li><strong><code class="python hljs"><span class="hljs-string">'isspace()'</span></code> checks for whitespaces: <code class="python hljs"><span class="hljs-string">'[ \t\n\r\f\v\x1c-\x1f\x85\xa0\u1680…]'</span></code>.</strong></li>
331-
</ul>
332-
<div><h2 id="regex"><a href="#regex" name="regex">#</a>Regex</h2><pre><code class="python language-python hljs"><span class="hljs-keyword">import</span> re
333-
&lt;str&gt; = re.sub(&lt;regex&gt;, new, text, count=<span class="hljs-number">0</span>) <span class="hljs-comment"># Substitutes all occurrences with 'new'.</span>
327+
<div><h2 id="regex"><a href="#regex" name="regex">#</a>Regex</h2><p><strong>Functions for regular expression matching.</strong></p><pre><code class="python language-python hljs"><span class="hljs-keyword">import</span> re
328+
</code></pre></div>
329+
330+
331+
<pre><code class="python language-python hljs">&lt;str&gt; = re.sub(&lt;regex&gt;, new, text, count=<span class="hljs-number">0</span>) <span class="hljs-comment"># Substitutes all occurrences with 'new'.</span>
334332
&lt;list&gt; = re.findall(&lt;regex&gt;, text) <span class="hljs-comment"># Returns all occurrences as strings.</span>
335333
&lt;list&gt; = re.split(&lt;regex&gt;, text, maxsplit=<span class="hljs-number">0</span>) <span class="hljs-comment"># Add brackets around regex to include matches.</span>
336-
&lt;Match&gt; = re.search(&lt;regex&gt;, text) <span class="hljs-comment"># Searches for first occurrence of the pattern.</span>
334+
&lt;Match&gt; = re.search(&lt;regex&gt;, text) <span class="hljs-comment"># First occurrence of the pattern or None.</span>
337335
&lt;Match&gt; = re.match(&lt;regex&gt;, text) <span class="hljs-comment"># Searches only at the beginning of the text.</span>
338336
&lt;iter&gt; = re.finditer(&lt;regex&gt;, text) <span class="hljs-comment"># Returns all occurrences as Match objects.</span>
339-
</code></pre></div>
340-
337+
</code></pre>
341338
<ul>
342339
<li><strong>Argument 'new' can be a function that accepts a Match object and returns a string.</strong></li>
343-
<li><strong>Search() and match() return None if they can't find a match.</strong></li>
344340
<li><strong>Argument <code class="python hljs"><span class="hljs-string">'flags=re.IGNORECASE'</span></code> can be used with all functions.</strong></li>
345341
<li><strong>Argument <code class="python hljs"><span class="hljs-string">'flags=re.MULTILINE'</span></code> makes <code class="python hljs"><span class="hljs-string">'^'</span></code> and <code class="python hljs"><span class="hljs-string">'$'</span></code> match the start/end of each line.</strong></li>
346342
<li><strong>Argument <code class="python hljs"><span class="hljs-string">'flags=re.DOTALL'</span></code> makes <code class="python hljs"><span class="hljs-string">'.'</span></code> also accept the <code class="python hljs"><span class="hljs-string">'\n'</span></code>.</strong></li>
@@ -2929,7 +2925,7 @@ <h3 id="format-2">Format</h3><div><h4 id="forstandardtypesizesandmanualalignment
29292925

29302926

29312927
<footer>
2932-
<aside>October 4, 2023</aside>
2928+
<aside>October 11, 2023</aside>
29332929
<a href="https://gto76.github.io" rel="author">Jure Šorn</a>
29342930
</footer>
29352931
