|
5 | 5 | ----------
|
6 | 6 | Problem
|
7 | 7 | ----------
|
8 |
| -You need to write a script that involves finding files, like a file renaming script or a log |
9 |
| -archiver utility, but you’d rather not have to call shell utilities from within your Python |
10 |
| -script, or you want to provide specialized behavior not easily available by “shelling out.” |
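| 8 | +You need to write a script that involves finding files, such as a file renaming script or a log archiver utility, |
| 9 | +but you would rather not call shell utilities from within your Python script, or you need specialized behavior that is not easily available by shelling out. |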
11 | 10 |
|
12 | 11 | |
|
13 | 12 |
|
14 | 13 | ----------
|
15 | 14 | Solution
|
16 | 15 | ----------
|
17 |
| -To search for files, use the os.walk() function, supplying it with the top-level directory. |
18 |
| -Here is an example of a function that finds a specific filename and prints out the full |
19 |
| -path of all matches: |
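| 16 | +To search for files, use the ``os.walk()`` function, passing it the top-level directory. |
| 17 | +Here is an example that finds a specific filename and prints the full path of every match: |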
20 | 18 |
|
21 |
| -#!/usr/bin/env python3.3 |
22 |
| -import os |
| 19 | +.. code-block:: python |
23 | 20 |
|
24 |
| -def findfile(start, name): |
25 |
| - for relpath, dirs, files in os.walk(start): |
26 |
| - if name in files: |
27 |
| - full_path = os.path.join(start, relpath, name) |
28 |
| - print(os.path.normpath(os.path.abspath(full_path))) |
| 21 | +    #!/usr/bin/env python3.3 |
| 22 | +    import os |
| | +    import sys  # needed for sys.argv in the __main__ block |
29 | 23 |
|
30 |
| -if __name__ == '__main__': |
31 |
| - findfile(sys.argv[1], sys.argv[2]) |
| 24 | +    def findfile(start, name): |
| 25 | +        for relpath, dirs, files in os.walk(start): |
| 26 | +            if name in files: |
| 27 | +                full_path = os.path.join(relpath, name)  # relpath already begins with start |
| 28 | +                print(os.path.normpath(os.path.abspath(full_path))) |
32 | 29 |
|
33 |
| -Save this script as findfile.py and run it from the command line, feeding in the starting |
34 |
| -point and the name as positional arguments, like this: |
| 30 | +    if __name__ == '__main__': |
| 31 | +        findfile(sys.argv[1], sys.argv[2]) |
35 | 32 |
|
36 |
| -bash % ./findfile.py . myfile.txt |
| 33 | +Save this script as findfile.py and run it from the command line, |
| 34 | +passing the starting directory and the filename as positional arguments, like this: |
| 35 | + |
| 36 | +.. code-block:: bash |
| | + |
| 37 | +    bash % ./findfile.py . myfile.txt |
37 | 38 |
|
38 | 39 | |
|
39 | 40 |
|
40 | 41 | ----------
|
41 | 42 | Discussion
|
42 | 43 | ----------
|
43 |
| -The os.walk() method traverses the directory hierarchy for us, and for each directory |
44 |
| -it enters, it returns a 3-tuple, containing the relative path to the directory it’s inspecting, |
45 |
| -a list containing all of the directory names in that directory, and a list of filenames in |
46 |
| -that directory. |
47 |
| -For each tuple, you simply check if the target filename is in the files list. If it is, |
48 |
| -os.path.join() is used to put together a path. To avoid the possibility of weird looking |
49 |
| -paths like ././foo//bar, two additional functions are used to fix the result. The first is |
50 |
| -os.path.abspath(), which takes a path that might be relative and forms the absolute |
51 |
| -path, and the second is os.path.normpath(), which will normalize the path, thereby |
52 |
| -resolving issues with double slashes, multiple references to the current directory, and |
53 |
| -so on. |
54 |
| -Although this script is pretty simple compared to the features of the find utility found |
55 |
| -on UNIX platforms, it has the benefit of being cross-platform. Furthermore, a lot of |
56 |
| -additional functionality can be added in a portable manner without much more work. |
57 |
| -To illustrate, here is a function that prints out all of the files that have a recent modifi‐ |
58 |
| -cation time: |
59 |
| - |
60 |
| -#!/usr/bin/env python3.3 |
61 |
| - |
62 |
| -import os |
63 |
| -import time |
64 |
| - |
65 |
| -def modified_within(top, seconds): |
66 |
| - now = time.time() |
67 |
| - for path, dirs, files in os.walk(top): |
68 |
| - for name in files: |
69 |
| - fullpath = os.path.join(path, name) |
70 |
| - if os.path.exists(fullpath): |
71 |
| - mtime = os.path.getmtime(fullpath) |
72 |
| - if mtime > (now - seconds): |
73 |
| - print(fullpath) |
74 |
| - |
75 |
| -if __name__ == '__main__': |
76 |
| - import sys |
77 |
| - if len(sys.argv) != 3: |
78 |
| - print('Usage: {} dir seconds'.format(sys.argv[0])) |
79 |
| - raise SystemExit(1) |
80 |
| - |
81 |
| - modified_within(sys.argv[1], float(sys.argv[2])) |
82 |
| - |
83 |
| -It wouldn’t take long for you to build far more complex operations on top of this little |
84 |
| -function using various features of the os, os.path, glob, and similar modules. See Rec‐ |
85 |
| -ipes 5.11 and 5.13 for related recipes. |
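| 44 | +The ``os.walk()`` function traverses the directory tree for us. |
| 45 | +For each directory it enters, it yields a 3-tuple containing the path to that directory (which begins with the top-level directory you supplied, so it is relative only if that argument was relative), a list of the subdirectory names in that directory, |
| 46 | +and a list of the filenames in that directory. |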
| 47 | + |
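To make the shape of those 3-tuples concrete, here is a minimal sketch. The helper name ``walk_summary`` and the temporary directory layout are illustrative assumptions, not part of the recipe.

```python
import os
import tempfile

def walk_summary(top):
    """Collect the (dirpath, dirnames, filenames) tuples that os.walk() yields."""
    summary = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Sorting dirnames in place also fixes the order in which
        # os.walk() descends into subdirectories (top-down mode).
        dirnames.sort()
        summary.append((dirpath, list(dirnames), sorted(filenames)))
    return summary

if __name__ == '__main__':
    # Build a small throwaway tree so the example is self-contained.
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, 'logs', 'old'))
        open(os.path.join(top, 'a.txt'), 'w').close()
        open(os.path.join(top, 'logs', 'b.log'), 'w').close()
        for entry in walk_summary(top):
            print(entry)
```

Note that the first element of every tuple starts with the ``top`` argument, so joining it with a filename already gives a usable path.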
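| 48 | +For each tuple, simply check whether the target filename is in the list of files. If it is, ``os.path.join()`` is used to assemble a path. |
| 49 | +To avoid odd-looking paths such as ``././foo//bar``, two additional functions are used to clean up the result. |
| 50 | +The first is ``os.path.abspath()``, which takes a path that might be relative and returns the absolute path. |
| 51 | +The second is ``os.path.normpath()``, which normalizes the path, resolving issues such as double slashes and multiple references to the current directory. |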
| 52 | + |
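The cleanup step described above can be tried in isolation. This is a small sketch; the helper name ``clean_path`` is hypothetical, and the exact output depends on the current working directory and the platform's path separator, so no expected output is shown.

```python
import os

def clean_path(path):
    """Return an absolute, normalized version of path (same calls as the recipe)."""
    return os.path.normpath(os.path.abspath(path))

if __name__ == '__main__':
    messy = '././foo//bar'
    # normpath alone collapses the redundant '.' and '//' parts
    # but leaves the path relative.
    print(os.path.normpath(messy))
    # abspath anchors the path at the current working directory.
    print(clean_path(messy))
```

On most platforms ``os.path.abspath()`` already normalizes its result, so the outer ``os.path.normpath()`` call is mainly a safeguard.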
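| 53 | +Although this script is quite simple compared with the ``find`` utility on UNIX platforms, it has the benefit of being cross-platform. |
| 54 | +Moreover, additional functionality can be added in a portable manner without much more work. |
| 55 | +To illustrate, here is a function that prints all of the files that were modified recently: |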
| 56 | + |
| 57 | +.. code-block:: python |
| 58 | + |
| 59 | +    #!/usr/bin/env python3.3 |
| 60 | + |
| 61 | +    import os |
| 62 | +    import time |
| 63 | + |
| 64 | +    def modified_within(top, seconds): |
| 65 | +        now = time.time() |
| 66 | +        for path, dirs, files in os.walk(top): |
| 67 | +            for name in files: |
| 68 | +                fullpath = os.path.join(path, name) |
| 69 | +                if os.path.exists(fullpath): |
| 70 | +                    mtime = os.path.getmtime(fullpath) |
| 71 | +                    if mtime > (now - seconds): |
| 72 | +                        print(fullpath) |
| 73 | + |
| 74 | +    if __name__ == '__main__': |
| 75 | +        import sys |
| 76 | +        if len(sys.argv) != 3: |
| 77 | +            print('Usage: {} dir seconds'.format(sys.argv[0])) |
| 78 | +            raise SystemExit(1) |
| 79 | + |
| 80 | +        modified_within(sys.argv[1], float(sys.argv[2])) |
| 81 | + |
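| 82 | +It would not take long to build far more complex operations on top of this function using features of the ``os``, ``os.path``, ``glob``, and similar modules. |
| 83 | +See Recipes 5.11 and 5.13 for related recipes. |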
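As one possible extension along those lines, the sketch below matches filenames against a shell-style wildcard instead of an exact name. The function name ``find_matching`` is a hypothetical helper, and the use of the standard ``fnmatch`` module is an assumption of this example rather than something the recipe prescribes.

```python
import fnmatch
import os

def find_matching(top, pattern):
    """Yield absolute paths of files under top whose names match pattern."""
    for dirpath, dirnames, filenames in os.walk(top):
        # fnmatch.filter keeps only the names matching the wildcard pattern.
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.abspath(os.path.join(dirpath, name))

if __name__ == '__main__':
    import sys
    if len(sys.argv) != 3:
        print('Usage: {} dir pattern'.format(sys.argv[0]))
        raise SystemExit(1)
    for match in find_matching(sys.argv[1], sys.argv[2]):
        print(match)
```

Writing the function as a generator lets the caller decide what to do with each match (print it, rename it, archive it) without changing the traversal code.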