Check if two PDF documents are identical with Python

Last Updated : 23 Jul, 2025
Author: chetanjha888

Python is an interpreted, general-purpose programming language that supports both object-oriented and procedural paradigms. Its standard library includes modules such as difflib and hashlib. This article uses hashlib to check whether two PDF files are identical by comparing their SHA-1 hashes. Matching hashes mean the files are byte-for-byte identical; two PDFs that look the same on screen can still differ in their bytes (for example, in metadata) and are then reported as not identical.

Modules used:
- hashlib : A module that provides hash functions such as SHA-1.
- difflib : A module that contains functions and classes for comparing sequences of data.
- SequenceMatcher : A difflib class used to compare a pair of input sequences. The program imports it but does not call it.

Functions used:
- hash_file(fileName1, fileName2) : A user-defined function that returns the SHA-1 hashes of the two files.
- object.hexdigest() : Returns the digest as a string of hexadecimal digits.
- fileObject.read(size) : Reads and returns at most size bytes from the file.

Approach
- Import the modules.
- Declare a function that takes two arguments, one for each file.
- Create two hashlib.sha1() objects.
- Open the files in binary mode.
- Read each file in small chunks and update its hash object with every chunk.
- Return both digests with h1.hexdigest() and h2.hexdigest(). Each SHA-1 digest is 160 bits, which is 40 hexadecimal characters.
- Call hash_file() to get the hash of each file.
- Compare the hashes and print the appropriate message.

Files in Use
File 1
File 2

Program:

```python
import hashlib
# SequenceMatcher is imported here but not used by this program
from difflib import SequenceMatcher


def hash_file(fileName1, fileName2):
    # Create one SHA-1 hash object per file
    h1 = hashlib.sha1()
    h2 = hashlib.sha1()

    with open(fileName1, "rb") as file:
        # Read the file in 1024-byte chunks so that a large file
        # is never loaded into memory all at once
        chunk = file.read(1024)
        while chunk != b'':
            h1.update(chunk)
            chunk = file.read(1024)

    with open(fileName2, "rb") as file:
        # Same chunked read for the second file
        chunk = file.read(1024)
        while chunk != b'':
            h2.update(chunk)
            chunk = file.read(1024)

    # Each SHA-1 digest is 160 bits, returned as 40 hex characters
    return h1.hexdigest(), h2.hexdigest()


# File 1 and File 2; the second file name (pd2.pdf) is assumed
msg1, msg2 = hash_file("pd1.pdf", "pd2.pdf")

if msg1 != msg2:
    print("These files are not identical")
else:
    print("These files are identical")
```

Output

These files are not identical
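The chunked-hashing approach described in this article can also be sketched with a helper that hashes one file at a time, which avoids repeating the read loop for each file. This is only a minimal sketch: `file_sha1`, `files_identical` and the 8192-byte chunk size are hypothetical choices that do not appear in the original program.

```python
import hashlib


def file_sha1(path, chunk_size=8192):
    """Return the SHA-1 hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b"" at the end of the file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """Return True when both files have the same SHA-1 digest (hypothetical helper)."""
    return file_sha1(path_a) == file_sha1(path_b)
```

With two PDF files on disk, a call such as `files_identical("first.pdf", "second.pdf")` (placeholder file names) would return `True` only when their contents match byte for byte.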
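The article lists difflib's SequenceMatcher as a way to compare a pair of input sequences. As a rough illustration of what that looks like, the sketch below computes a similarity ratio between two short byte strings. The `similarity` helper and the sample byte strings are hypothetical and are not real PDF data; SequenceMatcher also becomes slow on large inputs, so this suits short sequences rather than whole PDF files.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 (hypothetical helper)."""
    # The first argument is the "junk" filter; None disables it.
    # autojunk=False turns off the popularity heuristic for long sequences.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


same = similarity(b"%PDF-1.4 sample", b"%PDF-1.4 sample")
close = similarity(b"%PDF-1.4 sample", b"%PDF-1.5 sample")

print(same)         # 1.0 for identical sequences
print(close < 1.0)  # True: the sequences differ in one byte
```

Unlike the hash comparison, which only answers whether two inputs are identical, the ratio gives a measure of how similar they are.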