Commit abab573

add pdf splitter tutorial
1 parent a4667ad commit abab573

5 files changed: +45 -0 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -167,6 +167,7 @@ This is a repository of all the tutorials of [The Python Code](https://www.thepy
 - [How to Merge PDF Files in Python](https://www.thepythoncode.com/article/merge-pdf-files-in-python). ([code](handling-pdf-files/pdf-merger))
 - [How to Sign PDF Files in Python](https://www.thepythoncode.com/article/sign-pdf-files-in-python). ([code](handling-pdf-files/pdf-signer))
 - [How to Extract PDF Metadata in Python](https://www.thepythoncode.com/article/extract-pdf-metadata-in-python). ([code](handling-pdf-files/extract-pdf-metadata))
+- [How to Split PDF Files in Python](https://www.thepythoncode.com/article/split-pdf-files-in-python). ([code](handling-pdf-files/split-pdf))
 
 - ### [Python for Multimedia](https://www.thepythoncode.com/topic/python-for-multimedia)
 - [How to Make a Screen Recorder in Python](https://www.thepythoncode.com/article/make-screen-recorder-python). ([code](general/screen-recorder))
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# [How to Split PDF Files in Python](https://www.thepythoncode.com/article/split-pdf-files-in-python)
757 KB
Binary file not shown.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+pikepdf
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+import os
+from pikepdf import Pdf
+
+# a dictionary mapping each output PDF index to a page range of the original PDF
+file2pages = {
+    0: [0, 9], # 1st split PDF file will contain pages 0 to 8 (9 is not included)
+    1: [9, 11], # 2nd split PDF file will contain pages 9 and 10 (11 is not included)
+    2: [11, 100], # 3rd split PDF file will contain pages 11 to 99, or up to the last page if the document is shorter
+}
+
+# the target PDF document to split
+filename = "bert-paper.pdf"
+# load the PDF file
+pdf = Pdf.open(filename)
+# make the new split PDF files
+new_pdf_files = [ Pdf.new() for i in file2pages ]
+# the current pdf file index
+new_pdf_index = 0
+# iterate over all PDF pages
+for n, page in enumerate(pdf.pages):
+    if n in range(*file2pages[new_pdf_index]):
+        # add page `n` to the `new_pdf_index` file
+        new_pdf_files[new_pdf_index].pages.append(page)
+        print(f"[*] Assigning Page {n} to the file {new_pdf_index}")
+    else:
+        # page `n` is outside the current range, so the current file is complete;
+        # make a unique filename based on original file name plus the index
+        name, ext = os.path.splitext(filename)
+        output_filename = f"{name}-{new_pdf_index}.pdf"
+        # save the PDF file
+        new_pdf_files[new_pdf_index].save(output_filename)
+        print(f"[+] File: {output_filename} saved.")
+        # go to the next file
+        new_pdf_index += 1
+        # add page `n` to the `new_pdf_index` file
+        new_pdf_files[new_pdf_index].pages.append(page)
+        print(f"[*] Assigning Page {n} to the file {new_pdf_index}")
+
+# save the last PDF file
+name, ext = os.path.splitext(filename)
+output_filename = f"{name}-{new_pdf_index}.pdf"
+new_pdf_files[new_pdf_index].save(output_filename)
+print(f"[+] File: {output_filename} saved.")
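
Note on the loop above: it assumes the ranges in `file2pages` are contiguous and cover every page. With the ranges as committed, a document longer than 100 pages would reach page 100, advance `new_pdf_index` to 3, and fail with an `IndexError` because only three output files exist. The following is a minimal sketch of one possible alternative, not part of this commit, that works out the page assignment first and clamps each range to the document length. The names `plan_split` and `split_pdf` are hypothetical; the only pikepdf calls used are `Pdf.open`, `Pdf.new`, `pdf.pages`, and `Pdf.save`, as in the script above.

```python
import os


def plan_split(num_pages, file2pages):
    """Map each output file index to the list of source page numbers it receives.

    Every [start, end) range is half-open and clamped to the document length,
    so a range that runs past the last page simply stops early. Pages that fall
    outside all ranges are not assigned to any file.
    """
    plan = {}
    for index, (start, end) in file2pages.items():
        plan[index] = list(range(min(start, num_pages), min(end, num_pages)))
    return plan


def split_pdf(filename, file2pages):
    """Write one PDF per non-empty range and return the output filenames."""
    # imported here so that plan_split can be used without pikepdf installed
    from pikepdf import Pdf

    name, _ext = os.path.splitext(filename)
    outputs = []
    with Pdf.open(filename) as pdf:
        plan = plan_split(len(pdf.pages), file2pages)
        for index, page_numbers in plan.items():
            if not page_numbers:
                # the range lies entirely past the end of the document
                continue
            new_pdf = Pdf.new()
            for n in page_numbers:
                new_pdf.pages.append(pdf.pages[n])
            output_filename = f"{name}-{index}.pdf"
            # save while the source PDF is still open
            new_pdf.save(output_filename)
            outputs.append(output_filename)
    return outputs


if __name__ == "__main__":
    ranges = {0: [0, 9], 1: [9, 11], 2: [11, 100]}
    # show the planned assignment for a hypothetical 16-page document
    print(plan_split(16, ranges))
```

Calling `split_pdf("bert-paper.pdf", ranges)` would write the same `bert-paper-0.pdf`, `bert-paper-1.pdf`, ... naming scheme as the committed script. Separating the planning step from the writing step also lets the range logic be checked without a PDF on disk.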
