
Commit 581284d

Merge pull request realpython#257 from rgbkrk/editing_on_the_plane

Editing on the plane

Author: Kenneth Reitz (committed)
2 parents 5cf74be + 191ee66, commit 581284d

11 files changed: +145 additions, -132 deletions

docs/scenarios/admin.rst

Lines changed: 1 addition & 1 deletion
@@ -104,7 +104,7 @@ The following command lists all available minions running CentOS using the grain
 
 Salt also provides a state system. States can be used to configure the minion hosts.
 
-For example, when a minion host is ordered to read the following state file, will install
+For example, when a minion host is ordered to read the following state file, it will install
 and start the Apache server:
 
 .. code-block:: yaml
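The state file itself falls outside the hunk shown, so for orientation only, here is a rough sketch of a Salt state that installs and starts Apache (an assumption about its shape, not necessarily the guide's actual example):

.. code-block:: yaml

    # Hypothetical /srv/salt/apache.sls; package and service names vary by distro
    apache:
      pkg:
        - installed
      service:
        - running
        - require:
          - pkg: apache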

docs/scenarios/cli.rst

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@ Command Line Applications
 
 Clint
 -----
 
-.. todo:: Write about Clint 
+.. todo:: Write about Clint

docs/scenarios/client.rst

Lines changed: 6 additions & 0 deletions
@@ -41,3 +41,9 @@ messaging library aimed at use in scalable distributed or concurrent
 applications. It provides a message queue, but unlike message-oriented
 middleware, a ØMQ system can run without a dedicated message broker. The
 library is designed to have a familiar socket-style API.
+
+RabbitMQ
+--------
+
+.. todo:: Write about RabbitMQ
+
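To make "a familiar socket-style API" and "no dedicated message broker" concrete, here is a minimal request-reply sketch using the pyzmq binding (the endpoint, port, and messages are invented for illustration):

.. code-block:: python

    import zmq

    context = zmq.Context()

    # Reply side binds directly: there is no broker process in between.
    server = context.socket(zmq.REP)
    server.bind('tcp://127.0.0.1:5555')

    # Request side uses the familiar connect/send/recv socket calls.
    client = context.socket(zmq.REQ)
    client.connect('tcp://127.0.0.1:5555')

    client.send('ping')    # queued asynchronously
    print server.recv()    # 'ping'
    server.send('pong')
    print client.recv()    # 'pong'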

docs/scenarios/db.rst

Lines changed: 1 addition & 2 deletions
@@ -30,7 +30,6 @@ Django ORM
 The Django ORM is the interface used by `Django <http://www.djangoproject.com>`_
 to provide database access.
 
-It's based on the idea of models, an abstraction that makes it easier to
+It's based on the idea of `models <https://docs.djangoproject.com/en/1.3/#the-model-layer>`_, an abstraction that makes it easier to
 manipulate data in Python.
 
-Documentation can be found `here <https://docs.djangoproject.com/en/1.3/#the-model-layer>`_
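To ground the "models" idea for anyone reading the diff, a minimal sketch of a model (the ``Musician`` class is invented for illustration, not taken from this guide):

.. code-block:: python

    from django.db import models

    class Musician(models.Model):
        # Each attribute maps to a database column; instances map to rows.
        first_name = models.CharField(max_length=50)
        last_name = models.CharField(max_length=50)
        instrument = models.CharField(max_length=100)

    # Queries then read as plain Python:
    # Musician.objects.filter(instrument='trumpet')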

docs/scenarios/gui.rst

Lines changed: 3 additions & 3 deletions
@@ -41,7 +41,7 @@ Gtk
 PyGTK provides Python bindings for the GTK+ toolkit. Like the GTK+ library
 itself, it is currently licensed under the GNU LGPL. It is worth noting that
 PyGTK only currently supports the Gtk-2.X API (NOT Gtk-3.0). It is currently
-recommended that PyGTK is not used for new projects and existing applications
+recommended that PyGTK not be used for new projects and existing applications
 be ported from PyGTK to PyGObject.
 
 Tk
@@ -60,10 +60,10 @@ available on the `Python Wiki <http://wiki.python.org/moin/TkInter>`_.
 
 Kivy
 ----
-Kivy is a Python library for development of multi-touch enabled media rich applications. The aim is to allow for quick and easy interaction design and rapid prototyping, while making your code reusable and deployable.
+`Kivy <http://kivy.org>`_ is a Python library for development of multi-touch enabled media rich applications. The aim is to allow for quick and easy interaction design and rapid prototyping, while making your code reusable and deployable.
 
 Kivy is written in Python, based on OpenGL and supports different input devices such as: Mouse, Dual Mouse, TUIO, WiiMote, WM_TOUCH, HIDtouch, Apple's products and so on.
 
 Kivy is actively being developed by a community and free to use. It operates on all major platforms (Linux, OSX, Windows, Android).
 
-The main resource for information is the website: http://kivy.org 
+The main resource for information is the website: http://kivy.org
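For a taste of the API described above, a minimal "hello world" Kivy application might look like this sketch (invented for illustration, not an example from kivy.org):

.. code-block:: python

    from kivy.app import App
    from kivy.uix.button import Button

    class HelloApp(App):
        # build() returns the root widget of the application.
        def build(self):
            return Button(text='Hello, Kivy')

    if __name__ == '__main__':
        HelloApp().run()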

docs/scenarios/imaging.rst

Lines changed: 2 additions & 2 deletions
@@ -12,15 +12,15 @@ The `Python Imaging Library <http://www.pythonware.com/products/pil/>`_, or PIL
 for short, is *the* library for image manipulation in Python.
 
 It works with Python 1.5.2 and above, including 2.5, 2.6 and 2.7. Unfortunately,
-it doesn't work with 3.0+ yet. 
+it doesn't work with 3.0+ yet.
 
 Installation
 ~~~~~~~~~~~~
 
 PIL has a reputation of not being very straightforward to install. Listed below
 are installation notes on various systems.
 
-Also, there's a fork named `Pillow <http://pypi.python.org/pypi/Pillow>`_ which is easier 
+Also, there's a fork named `Pillow <http://pypi.python.org/pypi/Pillow>`_ which is easier
 to install. It has good setup instructions for all platforms.
 
 Installing on Linux
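Once installed, basic use is short. A sketch (the filenames are hypothetical):

.. code-block:: python

    from PIL import Image   # with classic PIL, "import Image" also works

    im = Image.open('photo.jpg')     # hypothetical input file
    im.thumbnail((128, 128))         # resizes in place, preserving aspect ratio
    im.save('photo_thumb.jpg')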

docs/scenarios/network.rst

Lines changed: 3 additions & 3 deletions
@@ -6,19 +6,19 @@ Twisted
 
 `Twisted <http://twistedmatrix.com/trac/>`_ is an event-driven networking engine. It can be
 used to build applications around many different networking protocols, including http servers
-and clients, applications using SMTP, POP3, IMAP or SSH protocols, instant messaging and 
+and clients, applications using SMTP, POP3, IMAP or SSH protocols, instant messaging and
 `many more <http://twistedmatrix.com/trac/wiki/Documentation>`_.
 
 PyZMQ
 -----
 
 `PyZMQ <http://zeromq.github.com/pyzmq/>`_ is the Python binding for `ZeroMQ <http://www.zeromq.org/>`_,
 which is a high-performance asynchronous messaging library. One great advantage is that ZeroMQ
-can be used for message queuing without message broker. The basic patterns for this are:
+can be used for message queuing without a message broker. The basic patterns for this are:
 
 - request-reply: connects a set of clients to a set of services. This is a remote procedure call
   and task distribution pattern.
-- publish-subscribe: connects a set of publishers to a set of subscribers. This is a data 
+- publish-subscribe: connects a set of publishers to a set of subscribers. This is a data
   distribution pattern.
 - push-pull (or pipeline): connects nodes in a fan-out / fan-in pattern that can have multiple
   steps, and loops. This is a parallel task distribution and collection pattern.
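Two sketches to anchor the sections above. First, Twisted's event-driven style in its classic minimal form, an echo server (the port number is an arbitrary choice):

.. code-block:: python

    from twisted.internet import protocol, reactor

    class Echo(protocol.Protocol):
        def dataReceived(self, data):
            # Called by the reactor whenever bytes arrive; echo them back.
            self.transport.write(data)

    class EchoFactory(protocol.Factory):
        def buildProtocol(self, addr):
            return Echo()

    reactor.listenTCP(8000, EchoFactory())   # arbitrary port
    reactor.run()

Second, the publish-subscribe pattern from the list above, in PyZMQ (the endpoint and payload are invented; in practice the subscriber usually runs in a separate process):

.. code-block:: python

    import time
    import zmq

    context = zmq.Context()

    pub = context.socket(zmq.PUB)       # publisher fans out to all subscribers
    pub.bind('tcp://127.0.0.1:5556')

    sub = context.socket(zmq.SUB)
    sub.connect('tcp://127.0.0.1:5556')
    sub.setsockopt(zmq.SUBSCRIBE, '')   # '' subscribes to every topic

    time.sleep(0.1)                     # let the subscription propagate
    pub.send('weather: blue sky')
    print sub.recv()                    # 'weather: blue sky'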

docs/scenarios/scientific.rst

Lines changed: 9 additions & 4 deletions
@@ -35,6 +35,10 @@ people who only need the basic requirements can just use NumPy.
 
 NumPy is compatible with Python versions 2.4 through to 2.7.2 and 3.1+.
 
+Numba
+-----
+.. todo:: Write about Numba
+
 SciPy
 -----
 
@@ -60,8 +64,9 @@ Resources
 
 Installation of scientific Python packages can be troublesome. Many of these
 packages are implemented as Python C extensions which need to be compiled.
-This section lists various so-called Python distributions which provide precompiled and
-easy-to-install collections of scientific Python packages.
+This section lists various so-called scientific Python distributions which
+provide precompiled and easy-to-install collections of scientific Python
+packages.
 
 Unofficial Windows Binaries for Python Extension Packages
 ---------------------------------------------------------
@@ -91,6 +96,6 @@ Anaconda
 Python Distribution <https://store.continuum.io/cshop/anaconda>`_ which
 includes all the common scientific python packages and additionally many
 packages related to data analytics and big data. Anaconda comes in two
-flavours, a paid for version and a completely free and open source community
+flavors, a paid for version and a completely free and open source community
 edition, Anaconda CE, which contains a slightly reduced feature set. Free
-licences for the paid-for version are available for academics and researchers.
+licenses for the paid-for version are available for academics and researchers.

docs/scenarios/scrape.rst

Lines changed: 101 additions & 99 deletions
@@ -1,99 +1,101 @@
-HTML Scraping
-=============
-
-Web Scraping
-------------
-
-Web sites are written using HTML, which means that each web page is a
-structured document. Sometimes it would be great to obtain some data from
-them and preserve the structure while we're at it. Web sites provide
-don't always provide their data in comfortable formats such as ``.csv``.
-
-This is where web scraping comes in. Web scraping is the practice of using a
-computer program to sift through a web page and gather the data that you need
-in a format most useful to you while at the same time preserving the structure
-of the data.
-
-lxml and Requests
------------------
-
-`lxml <http://lxml.de/>`_ is a pretty extensive library written for parsing
-XML and HTML documents really fast. It even handles messed up tags. We will
-also be using the `Requests <http://docs.python-requests.org/en/latest/>`_ module instead of the already built-in urlib2
-due to improvements in speed and readability. You can easily install both
-using ``pip install lxml`` and ``pip install requests``.
-
-Lets start with the imports:
-
-.. code-block:: python
-
-    from lxml import html
-    import requests
-
-Next we will use ``requests.get`` to retrieve the web page with our data
-and parse it using the ``html`` module and save the results in ``tree``:
-
-.. code-block:: python
-
-    page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
-    tree = html.fromstring(page.text)
-
-``tree`` now contains the whole HTML file in a nice tree structure which
-we can go over two different ways: XPath and CSSSelect. In this example, I
-will focus on the former.
-
-XPath is a way of locating information in structured documents such as
-HTML or XML documents. A good introduction to XPath is on `W3Schools <http://www.w3schools.com/xpath/default.asp>`_ .
-
-There are also various tools for obtaining the XPath of elements such as
-FireBug for Firefox or if you're using Chrome you can right click an
-element, choose 'Inspect element', highlight the code and then right
-click again and choose 'Copy XPath'.
-
-After a quick analysis, we see that in our page the data is contained in
-two elements - one is a div with title 'buyer-name' and the other is a
-span with class 'item-price':
-
-::
-
-    <div title="buyer-name">Carson Busses</div>
-    <span class="item-price">$29.95</span>
-
-Knowing this we can create the correct XPath query and use the lxml
-``xpath`` function like this:
-
-.. code-block:: python
-
-    #This will create a list of buyers:
-    buyers = tree.xpath('//div[@title="buyer-name"]/text()')
-    #This will create a list of prices
-    prices = tree.xpath('//span[@class="item-price"]/text()')
-
-Lets see what we got exactly:
-
-.. code-block:: python
-
-    print 'Buyers: ', buyers
-    print 'Prices: ', prices
-
-::
-
-    Buyers: ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes',
-    'Derri Anne Connecticut', 'Moe Dess', 'Leda Doggslife', 'Dan Druff',
-    'Al Fresco', 'Ido Hoe', 'Howie Kisses', 'Len Lease', 'Phil Meup',
-    'Ira Pent', 'Ben D. Rules', 'Ave Sectomy', 'Gary Shattire',
-    'Bobbi Soks', 'Sheila Takya', 'Rose Tattoo', 'Moe Tell']
-
-    Prices: ['$29.95', '$8.37', '$15.26', '$19.25', '$19.25',
-    '$13.99', '$31.57', '$8.49', '$14.47', '$15.86', '$11.11',
-    '$15.98', '$16.27', '$7.50', '$50.85', '$14.26', '$5.68',
-    '$15.00', '$114.07', '$10.09']
-
-Congratulations! We have successfully scraped all the data we wanted from
-a web page using lxml and Requests. We have it stored in memory as two
-lists. Now we can do all sorts of cool stuff with it: we can analyze it
-using Python or we can save it a file and share it with the world.
-
-A cool idea to think about is modifying this script to iterate through
-the rest of the pages of this example dataset or rewriting this
-application to use threads for improved speed.
+HTML Scraping
+=============
+
+Web Scraping
+------------
+
+Web sites are written using HTML, which means that each web page is a
+structured document. Sometimes it would be great to obtain some data from
+them and preserve the structure while we're at it. Web sites don't always
+provide their data in comfortable formats such as ``csv`` or ``json``.
+
+This is where web scraping comes in. Web scraping is the practice of using a
+computer program to sift through a web page and gather the data that you need
+in a format most useful to you while at the same time preserving the structure
+of the data.
+
+lxml and Requests
+-----------------
+
+`lxml <http://lxml.de/>`_ is a pretty extensive library written for parsing
+XML and HTML documents really fast. It even handles messed up tags. We will
+also be using the `Requests <http://docs.python-requests.org/en/latest/>`_
+module instead of the already built-in urlib2 due to improvements in speed and
+readability. You can easily install both using ``pip install lxml`` and
+``pip install requests``.
+
+Lets start with the imports:
+
+.. code-block:: python
+
+    from lxml import html
+    import requests
+
+Next we will use ``requests.get`` to retrieve the web page with our data
+and parse it using the ``html`` module and save the results in ``tree``:
+
+.. code-block:: python
+
+    page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
+    tree = html.fromstring(page.text)
+
+``tree`` now contains the whole HTML file in a nice tree structure which
+we can go over two different ways: XPath and CSSSelect. In this example, I
+will focus on the former.
+
+XPath is a way of locating information in structured documents such as
+HTML or XML documents. A good introduction to XPath is on
+`W3Schools <http://www.w3schools.com/xpath/default.asp>`_ .
+
+There are also various tools for obtaining the XPath of elements such as
+FireBug for Firefox or the Chrome Inspector. If you're using Chrome, you
+can right click an element, choose 'Inspect element', highlight the code,
+right click again and choose 'Copy XPath'.
+
+After a quick analysis, we see that in our page the data is contained in
+two elements - one is a div with title 'buyer-name' and the other is a
+span with class 'item-price':
+
+::
+
+    <div title="buyer-name">Carson Busses</div>
+    <span class="item-price">$29.95</span>
+
+Knowing this we can create the correct XPath query and use the lxml
+``xpath`` function like this:
+
+.. code-block:: python
+
+    #This will create a list of buyers:
+    buyers = tree.xpath('//div[@title="buyer-name"]/text()')
+    #This will create a list of prices
+    prices = tree.xpath('//span[@class="item-price"]/text()')
+
+Lets see what we got exactly:
+
+.. code-block:: python
+
+    print 'Buyers: ', buyers
+    print 'Prices: ', prices
+
+::
+
+    Buyers: ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes',
+    'Derri Anne Connecticut', 'Moe Dess', 'Leda Doggslife', 'Dan Druff',
+    'Al Fresco', 'Ido Hoe', 'Howie Kisses', 'Len Lease', 'Phil Meup',
+    'Ira Pent', 'Ben D. Rules', 'Ave Sectomy', 'Gary Shattire',
+    'Bobbi Soks', 'Sheila Takya', 'Rose Tattoo', 'Moe Tell']
+
+    Prices: ['$29.95', '$8.37', '$15.26', '$19.25', '$19.25',
+    '$13.99', '$31.57', '$8.49', '$14.47', '$15.86', '$11.11',
+    '$15.98', '$16.27', '$7.50', '$50.85', '$14.26', '$5.68',
+    '$15.00', '$114.07', '$10.09']
+
+Congratulations! We have successfully scraped all the data we wanted from
+a web page using lxml and Requests. We have it stored in memory as two
+lists. Now we can do all sorts of cool stuff with it: we can analyze it
+using Python or we can save it to a file and share it with the world.
+
+A cool idea to think about is modifying this script to iterate through
+the rest of the pages of this example dataset or rewriting this
+application to use threads for improved speed.
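The text mentions CSSSelect as the second way to traverse ``tree`` but focuses on XPath. For completeness, a sketch of the CSS route (assuming lxml's CSS selector support is available):

.. code-block:: python

    # Same extraction as the XPath queries above, via CSS selectors:
    buyers = [div.text for div in tree.cssselect('div[title="buyer-name"]')]
    prices = [span.text for span in tree.cssselect('span.item-price')]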

docs/scenarios/speed.rst

Lines changed: 7 additions & 5 deletions
@@ -42,7 +42,7 @@ The GIL
 
 `The GIL`_ (Global Interpreter Lock) is how Python allows multiple threads to
 operate at the same time. Python's memory management isn't entirely thread-safe,
-so the GIL is required to prevents multiple threads from running the same
+so the GIL is required to prevent multiple threads from running the same
 Python code at once.
 
 David Beazley has a great `guide`_ on how the GIL operates. He also covers the
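In the spirit of that guide (a sketch, not code from it): two CPU-bound threads take about as long as, and often longer than, doing the same work twice serially, because only one thread can execute Python bytecode at a time.

.. code-block:: python

    import threading
    import time

    def countdown(n):
        # Pure-Python, CPU-bound work: the GIL serializes its execution.
        while n > 0:
            n -= 1

    N = 10000000

    start = time.time()
    countdown(N)
    countdown(N)
    print 'serial:   %.2fs' % (time.time() - start)

    start = time.time()
    t1 = threading.Thread(target=countdown, args=(N,))
    t2 = threading.Thread(target=countdown, args=(N,))
    t1.start(); t2.start()
    t1.join(); t2.join()
    print 'threaded: %.2fs' % (time.time() - start)   # typically no faster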
@@ -58,8 +58,8 @@ C Extensions
 The GIL
 -------
 
-`Special care`_ must be taken when writing C extensions to make sure you r
-egister your threads with the interpreter.
+`Special care`_ must be taken when writing C extensions to make sure you
+register your threads with the interpreter.
 
 C Extensions
 ::::::::::::
@@ -76,7 +76,9 @@ Pyrex
 Shedskin?
 ---------
 
-
+Numba
+-----
+.. todo:: Write about Numba and the autojit compiler for NumPy
 
 Threading
 :::::::::
@@ -86,7 +88,7 @@ Threading
 ---------
 
 
-Spanwing Processes
+Spawning Processes
 ------------------
 
 