0% found this document useful (0 votes)
7 views5 pages

Managing Data and Assignments

The document outlines the end-to-end data lifecycle, emphasizing the distinction between data and information, and covers data types, structures, and management practices. It highlights the importance of data quality, security, and ethical considerations, while providing practical examples and exercises for database design, spreadsheet usage, and SQL commands. Effective data management is presented as crucial for enhancing visibility, reliability, security, and compliance, with a focus on preventing data loss and its associated consequences.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views5 pages

Managing Data and Assignments

The document outlines the end-to-end data lifecycle, emphasizing the distinction between data and information, and covers data types, structures, and management practices. It highlights the importance of data quality, security, and ethical considerations, while providing practical examples and exercises for database design, spreadsheet usage, and SQL commands. Effective data management is presented as crucial for enhancing visibility, reliability, security, and compliance, with a focus on preventing data loss and its associated consequences.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 5

Managing Data:

Compiled for Celestino


Date: 01 August 2025

Learning Objectives
- Understand the end-to-end data lifecycle, from collection to archiving.
- Differentiate between raw data and meaningful information.
- Identify and apply appropriate data types and structures.
- Explore database models and basic SQL operations.
- Utilize spreadsheet tools for validation, analysis, and reporting.
- Recognize the importance of data quality, security, and ethics.

Data vs Information
Data are unprocessed facts or measurements, such as a list of temperatures,
student names, or survey responses.
Information is the result of processing, organizing, or interpreting data to provide
context and meaning—for example, a weekly temperature trend chart or a summary
report of survey findings.
Data Lifecycle
1. Collection
- Manual entry via forms, sensors, or automated imports.
2. Storage
- Files (CSV, JSON) or databases (relational, NoSQL).
3. Processing
-Cleaning, transforming, aggregating.
4. Analysis
- Statistical functions, charts, SQL queries.
5. Archiving & Deletion
- Backup, retention policies, secure disposal.
Data Types
| Categories Type | Description | Example |

|-------------|-----------|------------
| Primitive | Text | Alphanumeric characters | “Kenya”, “A123” |
| | Number | Integers or decimals | 42, 3.14 |
| | Boolean | True/False | TRUE, FALSE |

| Composite | Date/Time | Dates, timestamps | 2024-02-15, 14:30 |


| | Binary | Raw bytes (files, images, audio) | JPEG, MP3 |
| Structured | List | Ordered collection of items | [“apple”, “banana”] |
| | Record | Grouping of related fields | {Name: “Asha”, Age: 12} |
Data Structures
- Field: a single attribute (column) in a dataset, e.g., “Price”.
- Record: a complete row combining fields, e.g., one product entry.
- Table: a collection of records sharing the same fields.
- Index: a lookup table to speed up searches by one or more fields.
Relational Databases
- Tables store structured data; relationships link tables via keys.
- Primary Key uniquely identifies records (e.g., StudentID).
- Foreign Key references primary keys in other tables.
- Normalization reduces redundancy (1NF, 2NF, 3NF).

SQL Commands
| Command | Purpose | Example |

|---------|-----------------------------
| SELECT | Retrieve specific fields from a table eg. SELECT Name, Age FROM
Students;

| WHERE | Filter records by condition eg. SELECT * FROM Sales WHERE Amount >
100; |
 JOIN | Combine rows from two tables based on a key eg. SELECT * FROM
Orders JOIN Customers;
 INSERT | Add a new record | INSERT INTO Products VALUES (...); |
 UPDATE Modify existing records eg UPDATE Inventory SET Qty = Qty - 1; |

Spreadsheets
- Cells arranged in rows (1, 2, 3…) and columns (A, B, C…).
- Absolute vs relative references ($A$1 vs A1).
- Named ranges simplify formulas.
Key Functions
- SUM(range) – totals values.
- AVERAGE(range) – computes mean.
- VLOOKUP(key, table, col, FALSE) – finds matching data.
- COUNTIF(range, criteria) – counts entries meeting criteria.
- IF(condition, true-value, false-value) – conditional logic.
Advanced Features
- PivotTables for dynamic summarization.
- Conditional formatting to highlight patterns.
- Data import from external sources (CSV, web queries).
Data Validation Techniques
- Restrict cell values by type (whole number, date, text length).
- Custom formulas to enforce complex rules, e.g.:

- =AND(A2>=1, A2<=100) for score ranges.

- =NOT(COUNTIF($B$2:$B$100, B2)>1) to prevent duplicates.


- Dropdown lists for controlled vocabulary.
- Error alerts and input messages guide users.
Data Quality and Ethics
- Accuracy: ensure data is correct and up-to-date.
- Completeness: no missing critical fields.
- Consistency: uniform formats across records.
- Timeliness: data available when needed.
Ethical Considerations

- Obtain consent before collecting personal data.


- Anonymize sensitive information (e.g., names IDs). - Secure storage and controlled
access (passwords, encryption).
- Comply with local regulations (e.g., GDPR principles).
Importance of Data Management
Effective data management delivers:
- Enhanced Visibility: users can quickly find and trust data.
- Increased Reliability: standard processes minimize errors.
- Stronger Security: encryption and access controls protect assets.
- Improved Efficiency: streamlined workflows reduce duplication.
- Better Compliance: policies ensure regulatory adherence.
- Scalability: repeatable procedures support data growth.
Effects of Data Loss
Data loss can lead to:
- Financial Loss: recovery costs and lost revenue.
- Reputational Damage: erosion of customer trust.
- Operational Disruption: downtime and productivity setbacks. - Legal Consequences:
fines and litigation for non-compliance.
- Business Closure: small firms may not survive severe breaches.
Data Security and Backup
- User authentication and role-based permissions.
- Encryption at rest (database files) and in transit (HTTPS).
- Regular backups: full, incremental, and archival copies.
- Audit logs to track who accessed or modified data.
Data Analysis and Reporting
- Use charts (bar, line, scatter) to visualize trends.
- Calculate descriptive statistics: mean, median, mode, standard deviation.
- Perform “what-if” analysis with data tables and scenario manager.
- Generate automated reports using scripts or macros.
Real-World Examples
- School attendance tracker: collect daily marks, validate dates, summarize by
month.
- Inventory management: relational tables for products, suppliers, sales; reorder
alerts via SQL queries.
- Health survey: CSV data cleaned in spreadsheet, pivoted by age group, visualized
with pie charts.
Activities and Exercises

1. Design a small relational database on paper for a library:


- Tables: Books, Members, Loans.
- Identify fields, primary keys, and relationships.
2. Build a spreadsheet to track fitness data:

- Collect daily steps, validate with whole-number rule.

- Create a line chart showing weekly progress.


3. Write SQL queries against sample data:

- List all orders above a threshold.


- Join customers with their purchase history.
4. Implement advanced validation in a spreadsheet:

You might also like