Proj4

Uploaded by

yuzeidutta
Copyright
© All Rights Reserved
AISSCE 2024-25

INFORMATION PRACTICES (065) –XII

PROJECT REPORT

"Birdwatching Analytics: Understanding Avian Patterns"

Submitted by : Submitted to :

Name :
Class :
Roll Number :
(AISSCE)
DECLARATION

This is to certify that the Information Practices project "Birdwatching
Analytics: Understanding Avian Patterns" has been successfully
completed by of class XII- , Army Public School
Jorhat, for consideration in partial fulfilment of the curriculum of the Central Board of
Secondary Education (CBSE) in Information Practices (065) for the award of the
AISSCE Practical Examination 2024-25.

I certify that this project is up to my expectation and as per the guidelines issued by
the CBSE.

(External Examiner)

(Internal Examiner) (Principal)


ACKNOWLEDGEMENT

I take this opportunity to express my deep sense of gratitude to all those who have
been instrumental in the preparation of this project.

I feel great pleasure in expressing my gratitude to Mr. Diganta Handique, Principal of
Army Public School Jorhat.

I am also sincerely grateful to Mr. Tapan Kumar Saikia, PGT (IP), Army Public
School Jorhat, for his encouragement and valuable guidance during the entire period
of work.

I would also like to thank my parents and friends for their wholehearted support and
encouragement, without which this project would not have been successful.

I also cannot forget the Internet and my textbooks, which provided me with sufficient
reference material.
TABLE OF CONTENTS

SL NO.  TOPIC

1       Introduction
2       Problem Statement
3       Objective
4       Project Scope
5       System Requirement & Specification
6       Overview of Python
7       Data Collection
8       Source Code
9       Output
10      Data Visualization Graphs
11      Conclusion
12      Bibliography
INTRODUCTION

Birdwatching is a popular activity that not only connects enthusiasts
with nature but also contributes to scientific research through the
collection of observational data. This project aims to analyse bird
observation data to understand trends in species prevalence,
seasonal variations, and correlations among various factors
influencing bird sightings. By leveraging data visualization
techniques, we aim to present insights that can aid in conservation
efforts and enhance the experience of birdwatchers.
PROBLEM STATEMENT
Despite the growing interest in birdwatching and the importance of
avian biodiversity, there is often a lack of accessible data analysis
that can inform enthusiasts and conservationists about trends and
patterns in bird observations. This project addresses the need for a
comprehensive analysis of bird observation data to identify key
species, understand seasonal trends, and explore relationships
between different factors affecting bird populations.
OBJECTIVE

The primary objectives of this project are:
- To identify the most frequently observed bird species from the
dataset.
- To analyse trends in bird observations over time.
- To examine seasonal variations in bird sightings.
- To explore correlations between numerical features related to bird
observations.
- To present findings through informative visualizations that can assist
in conservation efforts.
PROJECT SCOPE
This project focuses on analysing a dataset containing bird
observation records. The scope includes:
- Data preprocessing to clean and prepare the dataset for analysis.
- Visualization of key insights through various graphs such as bar
plots, line plots, histograms, box plots, and heatmaps.
- Interpretation of results to provide actionable insights for
conservationists and birdwatchers.
The project does not cover field studies or real-time data collection;
it relies solely on existing datasets.
SYSTEM REQUIREMENT & SPECIFICATION
To analyze and visualize bird observation data using
Python, the following hardware and software specifications are
used:
Hardware Requirements
1. Processor:
- Minimum: Intel Core i5 (4 cores, 2.0 GHz or higher) or equivalent.
2. Memory (RAM):
- Minimum: 8 GB RAM for basic tasks; suitable for handling
moderate datasets.
3. Storage:
- Minimum: 512 GB SSD for efficient data access and processing.
4. Graphics Processing Unit (GPU):
- Integrated graphics are sufficient for basic tasks; a dedicated GPU
is recommended for advanced visualization.
Software Requirements
1. Operating System:
- Windows 10 or newer.
2. Python Environment:
- Python version 3.13.x with essential libraries including:
- Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn (for machine
learning applications)
3. Development Tools:
- Integrated Development Environment (IDE) such as PyCharm or
Jupyter Notebook for coding and visualization.
OVERVIEW OF PYTHON
Python is a high-level, interpreted, interactive and object-oriented
scripting language. Python is designed to be highly readable. It uses
English keywords frequently, whereas other languages use
punctuation, and it has fewer syntactical constructions than other
languages.

Python is an open-source and cross-platform programming language.
It is available for use under the Python Software Foundation
License (compatible with the GNU General Public License) on all the major
operating system platforms: Linux, Windows and macOS.

To facilitate new features and to maintain that readability, the
Python Enhancement Proposal (PEP) process was developed. This
process allows anyone to submit a PEP for a new feature, library, or
other addition.

The design philosophy of Python emphasizes simplicity,
readability and unambiguity. Python is known for its "batteries
included" approach, as it is distributed with a
comprehensive standard library of functions and modules.

Python supports imperative, structured and object-oriented
programming. It provides features of functional
programming as well.
Libraries Used in the Project
Pandas:
Purpose: A powerful library for data manipulation and analysis.
Key Features:
- Provides two primary data structures, DataFrame and Series, for handling
tabular data.
- Supports reading and writing data in various formats (CSV, Excel, SQL).
- Offers robust functions for data cleaning, filtering, grouping, and merging
datasets.
- Integrates seamlessly with NumPy for efficient numerical operations.
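As a minimal sketch of the DataFrame and Series structures described above, the snippet below builds a small table by hand, then filters and groups it. The five rows are copied from the first rows printed in the OUTPUT section; the `group` column is a hypothetical helper derived from the last word of each name and is not part of the project's dataset.

```python
import pandas as pd

# Five rows copied from the report's printed output.
df = pd.DataFrame({
    "name": ["Brown-cheeked Fulvetta", "Nepal Fulvetta", "Brown Fulvetta",
             "Giant Laughingthrush", "Spotted Laughingthrush"],
    "totalobservations": [443, 141, 112, 61, 112],
})

# Selecting one column of a DataFrame gives a Series.
counts = df["totalobservations"]

# Hypothetical helper column: the last word of each common name.
df["group"] = df["name"].str.split().str[-1]

# Filtering: keep rows with more than 100 observations.
frequent = df[df["totalobservations"] > 100]

# Grouping: total observations per group.
per_group = df.groupby("group")["totalobservations"].sum()

print(type(counts).__name__)
print(frequent["name"].tolist())
print(per_group.to_dict())
```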
NumPy:
Purpose: A fundamental package for numerical computing in Python.
Key Features:
- Provides support for large, multi-dimensional arrays and matrices.
- Offers a collection of mathematical functions to perform operations on
arrays.
- Enables efficient array manipulation and broadcasting,
enhancing performance for numerical computations.
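Broadcasting is easier to see in a short example. The sketch below is illustrative only (the project script does not call NumPy directly); it reuses the five observation counts from the printed output.

```python
import numpy as np

# Counts taken from the first five rows of the report's printed output.
counts = np.array([443, 141, 112, 61, 112])

# Broadcasting a scalar: the total is applied to every element,
# so each species' share is computed without an explicit loop.
share = counts / counts.sum()

# Broadcasting two shapes: a (5, 1) column minus a (5,) row
# yields a 5 x 5 table of pairwise differences.
diff = counts[:, None] - counts

print(share.round(3))
print(diff.shape)
```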
Matplotlib:
Purpose: A versatile plotting library for creating static, animated, and
interactive visualizations in Python.
Key Features:
- Supports a wide range of plots (line plots, bar charts, histograms, scatter
plots).
- Allows customization of plots with labels, titles, legends, and colors.
- Integrates well with Pandas and NumPy for easy visualization of data stored
in DataFrames and arrays.

Seaborn:
Purpose: A statistical data visualization library based on Matplotlib that
provides a high-level interface for drawing attractive graphics.
Key Features:
- Simplifies the creation of complex visualizations such as heatmaps, violin
plots, and pair plots.
- Enhances Matplotlib's functionality with improved aesthetics and color
palettes.
- Facilitates easy exploration of relationships between multiple variables
through built-in statistical functions.

Scikit-learn:
Purpose: A comprehensive machine learning library for Python that provides
simple and efficient tools for data mining and analysis.
Key Features:
- Offers a wide range of algorithms for classification, regression, clustering,
and dimensionality reduction.
- Provides utilities for model evaluation and
selection through cross-validation and metrics.
- Integrates seamlessly with
NumPy and Pandas to facilitate preprocessing and feature extraction.
DATA COLLECTION

Technique: Secondary Data Collection


The data used in this project was collected through secondary data
collection techniques. This involves gathering existing datasets from reliable
sources such as governmental wildlife agencies, environmental
organizations, or online databases dedicated to avian research. The dataset
used contains records of bird sightings along with relevant attributes like
species names, observation dates, and total observation counts.
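Because the data comes from an external source, it can help to confirm that a downloaded file has the expected columns before analysis. The sketch below assumes the four column names printed in the OUTPUT section; `check_columns` is a hypothetical helper name, and the empty `sample` frame stands in for a real file loaded with `pd.read_csv`.

```python
import pandas as pd

# Column names as printed in the report's OUTPUT section.
EXPECTED_COLUMNS = ["name", "scientificname", "lastobservation", "totalobservations"]

def check_columns(df: pd.DataFrame) -> list[str]:
    """Return the expected columns that are missing from df (hypothetical helper)."""
    # Strip stray spaces, as the project script does for column names.
    present = {str(c).strip() for c in df.columns}
    return [c for c in EXPECTED_COLUMNS if c not in present]

# Stand-in frame with one column missing and one padded with a space.
sample = pd.DataFrame(columns=["name ", "scientificname", "totalobservations"])
print(check_columns(sample))
```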
PROGRAMMING CODE (IDLE, Python 3.13, 64-bit)
# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set the style for seaborn
sns.set_theme(style="whitegrid")

# Load the dataset (update with your actual file path)
data = pd.read_csv(r'C:\Users\Dell\Desktop\24ipp\P4\brd.csv')

# Display the first few rows of the dataset
print(data.head())

# Check for missing values
print(data.isnull().sum())

# Strip any leading or trailing spaces from column names
data.columns = data.columns.str.strip()

# Verify column names
print("Columns in dataset:", data.columns.tolist())

# Data preprocessing: drop rows with missing values
data.dropna(inplace=True)

# Convert 'totalobservations' to numeric, coercing errors to NaN
data['totalobservations'] = pd.to_numeric(data['totalobservations'], errors='coerce')
data.dropna(subset=['totalobservations'], inplace=True)

# Convert 'lastobservation' to datetime format. Stop early if the column
# is missing, because every later step depends on it.
if 'lastobservation' not in data.columns:
    raise KeyError("Column 'lastobservation' does not exist.")
data['lastobservation'] = pd.to_datetime(data['lastobservation'], errors='coerce')

# Drop rows where the date conversion failed (if any)
data.dropna(subset=['lastobservation'], inplace=True)

# 1. Most Frequently Observed Bird Species
plt.figure(figsize=(14, 7))
top_species = data.nlargest(10, 'totalobservations')
sns.barplot(x='totalobservations', y='name', data=top_species, palette='viridis')
plt.title('Top 10 Most Frequently Observed Bird Species in India')
plt.xlabel('Total Observations')
plt.ylabel('Common Name')
plt.tight_layout()
plt.show()

# 2. Observation Trends Over Time
# Sum the observation counts by the year of each record's last observation
observations_per_year = (
    data.groupby(data['lastobservation'].dt.year)['totalobservations']
    .sum()
    .reset_index()
)

plt.figure(figsize=(14, 7))
sns.lineplot(x='lastobservation', y='totalobservations', data=observations_per_year)
plt.title('Trend of Bird Observations Over Time')
plt.xlabel('Year')
plt.ylabel('Total Observations')
plt.xticks(observations_per_year['lastobservation'], rotation=45)
plt.tight_layout()
plt.show()

# 3. Distribution of Total Observations
plt.figure(figsize=(14, 7))
sns.histplot(data['totalobservations'], bins=30, kde=True)
plt.title('Distribution of Total Observations')
plt.xlabel('Total Observations')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()

# 4. Box Plot for Total Observations by Month
data['Month'] = data['lastobservation'].dt.month
plt.figure(figsize=(14, 7))
# Fix the category order to months 1-12 so that tick position i always
# corresponds to month i + 1, even if some months have no records
sns.boxplot(x='Month', y='totalobservations', data=data, order=list(range(1, 13)))
plt.title('Box Plot of Total Observations by Month')
plt.xlabel('Month')
plt.ylabel('Total Observations')

month_labels = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
plt.xticks(range(12), month_labels, rotation=45)

plt.tight_layout()
plt.show()

# 5. Correlation Heatmap of Numerical Features
# numeric_only=True skips text and date columns, avoiding a ValueError
correlation_matrix = data.corr(numeric_only=True)
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, fmt=".2f", cmap='coolwarm')
plt.title('Correlation Heatmap of Numerical Features')
plt.tight_layout()
plt.show()

# Conclusion Summary
print("Analysis complete. Visualizations have been generated.")
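The preprocessing steps in the script can also be wrapped in a function so that they can be tried on a small in-memory table without the CSV file. This is only a sketch: `clean_observations` is a hypothetical name, and the three rows are invented to exercise the failure cases (one valid row, one bad count, one bad date).

```python
import pandas as pd

def clean_observations(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the script's cleaning steps to a copy of df (hypothetical helper)."""
    out = df.copy()
    out.columns = out.columns.str.strip()
    out = out.dropna()
    out["totalobservations"] = pd.to_numeric(out["totalobservations"], errors="coerce")
    out["lastobservation"] = pd.to_datetime(out["lastobservation"], errors="coerce")
    # Rows where either conversion failed now hold NaN or NaT and are dropped.
    out = out.dropna(subset=["totalobservations", "lastobservation"])
    return out.reset_index(drop=True)

# Invented rows: note the stray spaces in two column names.
raw = pd.DataFrame({
    " name": ["Species A", "Species B", "Species C"],
    "lastobservation": ["2023-01-15", "2023-02-20", "not a date"],
    "totalobservations ": ["12", "many", "7"],
})

clean = clean_observations(raw)
print(clean)
```

Only the first invented row survives, since "many" is not a number and "not a date" cannot be parsed.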
OUTPUT

Python 3.13.0 (tags/v3.13.0:60403a5, Oct 7 2024, 09:38:07) [MSC v.1941 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.

================= RESTART: C:\Users\Dell\Desktop\24ipp\P4\p4.py ================
name ... totalobservations
0 Brown-cheeked Fulvetta ... 443
1 Nepal Fulvetta ... 141
2 Brown Fulvetta ... 112
3 Giant Laughingthrush ... 61
4 Spotted Laughingthrush ... 112

[5 rows x 4 columns]
name 0
scientificname 0
lastobservation 0
totalobservations 0
dtype: int64
Columns in dataset: ['name', 'scientificname', 'lastobservation', 'totalobservations']

Analysis complete. Visualizations have been generated.


DATA VISUALIZATION GRAPHS
Graph 1: Top 10 Most Frequently Observed Bird Species in India
Conclusion: The bar plot shows the ten bird species with the
highest observation counts in the dataset, indicating which
species are most prevalent in the observed area. Conservation
efforts can be prioritized for less frequently observed species,
while the most common species may indicate a stable or
thriving population. This data can also inform birdwatching
activities and ecological studies by highlighting the species
most often recorded by birdwatchers.
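The ranking behind this plot comes from `DataFrame.nlargest`. A small sketch of how it behaves when counts are tied, using the five rows shown in the OUTPUT section (a top 3 instead of a top 10, purely for illustration):

```python
import pandas as pd

# Five rows copied from the report's printed output; two species tie at 112.
df = pd.DataFrame({
    "name": ["Brown-cheeked Fulvetta", "Nepal Fulvetta", "Brown Fulvetta",
             "Giant Laughingthrush", "Spotted Laughingthrush"],
    "totalobservations": [443, 141, 112, 61, 112],
})

# Default keep="first": exactly 3 rows, the earlier tied row wins.
top3 = df.nlargest(3, "totalobservations")

# keep="all": rows tied with the last place are all kept, so 4 rows here.
top3_with_ties = df.nlargest(3, "totalobservations", keep="all")

print(top3["name"].tolist())
print(len(top3_with_ties))
```

With the default setting, a species tied with tenth place can be left out of the chart, which is worth remembering when reading the ranking.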
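Graph 2: Trend of Bird Observations Over Time
Conclusion: The line plot shows total observations grouped by
the year of each species' last recorded observation. An
increasing trend may indicate a growing interest in
birdwatching or successful conservation efforts, while a
decreasing trend could signal environmental issues or
declining populations. Because the dataset stores only the
last observation date per species, the totals should be read as
a rough indication rather than an exact yearly count. Analyzing
these trends helps stakeholders understand changes in
biodiversity and can guide future conservation strategies.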
Graph 3: Distribution of Total Observations
Conclusion: The histogram displays the distribution of total
observations across all entries, showing how frequently
different observation counts occur. A peak at certain
observation levels suggests common observation patterns,
while a wide spread indicates variability in observation
frequency. This information can help researchers understand
observer behavior and identify potential biases in data
collection.
Graph 4: Box Plot of Total Observations by Month
Conclusion: The box plot provides insights into seasonal
variations in bird observations by month. It highlights median
observation counts, interquartile ranges, and potential outliers
for each month. Where observations vary markedly across
months, this may correlate with migratory patterns or
breeding seasons. Understanding these seasonal trends is
essential for planning conservation initiatives and optimizing
birdwatching opportunities.
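The quantities a box plot draws (median, quartiles, interquartile range, outliers) can also be computed directly. The sketch below uses invented counts for two months and the conventional 1.5 x IQR rule for outliers, which matches the default whisker length in seaborn box plots.

```python
import pandas as pd

# Invented counts for two months, purely to illustrate the calculation.
df = pd.DataFrame({
    "Month": [1, 1, 1, 1, 2, 2, 2, 2],
    "totalobservations": [10, 20, 30, 40, 5, 5, 15, 95],
})

grouped = df.groupby("Month")["totalobservations"]
summary = pd.DataFrame({
    "median": grouped.median(),
    "q1": grouped.quantile(0.25),
    "q3": grouped.quantile(0.75),
})
summary["iqr"] = summary["q3"] - summary["q1"]

# Points above q3 + 1.5 * IQR for their month are flagged as high outliers.
upper_fence = summary["q3"] + 1.5 * summary["iqr"]
outliers = df[df["totalobservations"] > df["Month"].map(upper_fence)]

print(summary)
print(outliers)
```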
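Graph 5: Correlation Heatmap of Numerical Features
Conclusion: The correlation heatmap visualizes relationships
between numerical features within the dataset. Here the only
numerical features are the total observation count and the
month derived from the last observation date, so the heatmap
shows how strongly these two are related. With a richer
dataset, strong positive or negative correlations could indicate
how other factors (e.g., habitat type, weather conditions)
influence bird observations. Identifying such correlations helps
researchers understand ecological dynamics and can inform
management practices aimed at enhancing avian diversity and
health.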
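As a small illustration of what the heatmap is built from, the sketch below calls `DataFrame.corr(numeric_only=True)` on an invented table with one text column and two numeric columns. The values are made up and chosen to be perfectly linear, so the off-diagonal correlation is 1.

```python
import pandas as pd

# Invented values; the text column is ignored because numeric_only=True.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "Month": [1, 2, 3, 4],
    "totalobservations": [10, 20, 30, 40],
})

corr = df.corr(numeric_only=True)
print(corr)
```

The result is a square table with one row and one column per numeric feature, which is exactly what `sns.heatmap` colors in.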
CONCLUSION

The analysis of bird observation data provides valuable insights into avian
biodiversity and trends in birdwatching activities. By identifying the most
frequently observed species and examining seasonal variations, this
project contributes to the broader knowledge base needed for effective
conservation strategies. The visualizations generated offer an engaging way
to present these findings to enthusiasts and researchers alike.
BIBLIOGRAPHY
1. Kedarsai. (2023). Bird Species Classification 200 Categories. Retrieved
from Kaggle.
2. Basandrai, A. (2023). eBird Indian Birds Observations. Retrieved from
Kaggle.
3. Veeralakrishna. (2023). 200 Bird Species with 11,788 Images. Retrieved
from Kaggle.
4. Ichhadhari. (2023). Indian Birds Dataset. Retrieved from Kaggle.
5. Akash2907. (2023). Bird Species Classification. Retrieved from Kaggle.
