
Structured, Semi-Structured and Unstructured Data (M-2)

Structured data follows a defined structure and schema, making it easily accessible. It is typically stored in rows and columns of databases and accessed using SQL. Semi-structured data has some structure but not a rigid schema, lacking fixed fields. It includes XML and is harder to automatically access. Unstructured data completely lacks structure, such as images and text documents, and is difficult for computers to interpret without preprocessing.

Structured data

Structured data is data that conforms to a data model, has a well-defined
structure, follows a consistent order, and can be easily accessed and used by a person
or a computer program.
Structured data is usually stored under a well-defined schema, as in a database. It is
generally tabular, with rows and columns that clearly define its attributes.
SQL (Structured Query Language) is often used to manage structured data stored in
databases.
Characteristics of Structured Data:
• Data conforms to a data model and has an easily identifiable structure
• Data is stored in the form of rows and columns (example: a database table)
• Data is well organised, so the definition, format and meaning of the data are
explicitly known
• Data resides in fixed fields within a record or file
• Similar entities are grouped together to form relations or classes
• Entities in the same group have the same attributes
• Data is easy to access and query, so it can be easily used by other programs
• Data elements are addressable, so the data is efficient to analyse and process
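
The characteristics above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the employee table, its columns and the sample rows are hypothetical and serve only to show fixed fields being queried with SQL.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " department TEXT NOT NULL,"
    " salary REAL NOT NULL)"
)

# Every record has the same fixed fields, in the same order.
rows = [
    (1, "Asha", "Sales", 52000.0),
    (2, "Ben", "Sales", 48000.0),
    (3, "Chen", "Engineering", 75000.0),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)

# Each field is addressable by name, so SQL can filter and aggregate directly.
avg_by_department = dict(
    conn.execute(
        "SELECT department, AVG(salary) FROM employee GROUP BY department"
    ).fetchall()
)
print(avg_by_department)
```

Because the schema is known in advance, the query needs no parsing or preprocessing step before the data can be analysed.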

Sources of Structured Data:

• SQL Databases
• Spreadsheets such as Excel
• OLTP Systems
• Online forms
• Sensors such as GPS or RFID tags
• Network and Web server logs
• Medical devices

Storage:
Structured data is stored in tabular formats (e.g., Excel sheets or SQL databases), which
generally require less storage space than unstructured formats. It can be stored in data
warehouses, which makes it highly scalable.

Semi-structured data
Semi-structured data is data that does not conform to a rigid data model but has
some structure. It lacks a fixed or rigid schema. It does not reside in
a relational database, but it has some organisational properties that make it easier
to analyse. With some processing, it can be stored in a relational database.
Characteristics of semi-structured Data:
• Data does not conform to a data model but has some structure
• Data cannot be stored directly in the form of rows and columns, as it is in databases
• Semi-structured data contains tags and elements (metadata) that are used to
group the data and describe how it is stored
• Similar entities are grouped together and organised in a hierarchy
• Entities in the same group may or may not have the same attributes or
properties
• Data does not contain sufficient metadata, which makes automation and
management of the data difficult
• Size and type of the same attribute may differ within a group
• Due to the lack of a well-defined structure, it cannot be used easily by
computer programs
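
As a rough illustration of these characteristics, the sketch below parses a small XML fragment with Python's xml.etree.ElementTree. The people/person document and its tags are hypothetical sample data. Both person elements belong to the same group, yet they carry different attributes, and one attribute repeats.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the tags describe the data (metadata),
# but no schema forces every <person> to have the same children.
xml_text = """
<people>
  <person id="1">
    <name>Asha</name>
    <email>asha@example.com</email>
  </person>
  <person id="2">
    <name>Ben</name>
    <phone>555-0100</phone>
    <phone>555-0101</phone>
  </person>
</people>
""".strip()

root = ET.fromstring(xml_text)

records = []
for person in root.findall("person"):
    record = {"id": person.get("id")}
    for child in person:
        # Repeated tags are collected into a list, so sizes can differ per record.
        record.setdefault(child.tag, []).append(child.text)
    records.append(record)

# The set of attributes differs from one entity to the next.
attribute_sets = [sorted(k for k in r if k != "id") for r in records]
print(attribute_sets)
```

A program reading this data has to discover the available fields per record instead of relying on fixed columns, which is why such data is harder to process automatically.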

Sources of semi-structured Data:

• E-mails
• XML and other markup languages
• Binary executables
• TCP/IP packets
• Zipped files
• Integration of data from different sources
• Web pages

Possible solutions for storing semi-structured data:

• Data can be stored in a DBMS specially designed to store semi-structured data
• XML is widely used to store and exchange semi-structured data. It allows its
users to define tags and attributes to store the data in hierarchical form.
Schema and data are not tightly coupled in XML.
• The Object Exchange Model (OEM) can be used to store and exchange
semi-structured data. OEM structures data in the form of a graph.
• An RDBMS can be used to store the data by mapping the data to a relational
schema and then loading it into tables
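
The last option, mapping semi-structured data to a relational schema, can be sketched as follows. This is only one possible mapping, assumed for illustration: the attribute every entity shares (name) becomes a column, and the optional or repeated attributes go into a separate key/value table. The XML sample, table names and column names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical semi-structured input.
xml_text = """
<people>
  <person id="1"><name>Asha</name><email>asha@example.com</email></person>
  <person id="2"><name>Ben</name><phone>555-0100</phone><phone>555-0101</phone></person>
</people>
""".strip()

conn = sqlite3.connect(":memory:")
# Fixed part of each entity.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Variable part: one row per optional or repeated attribute.
conn.execute(
    "CREATE TABLE person_attribute ("
    " person_id INTEGER NOT NULL REFERENCES person(id),"
    " attr TEXT NOT NULL,"
    " value TEXT)"
)

for person in ET.fromstring(xml_text).findall("person"):
    person_id = int(person.get("id"))
    conn.execute(
        "INSERT INTO person VALUES (?, ?)", (person_id, person.findtext("name"))
    )
    for child in person:
        if child.tag != "name":
            conn.execute(
                "INSERT INTO person_attribute VALUES (?, ?, ?)",
                (person_id, child.tag, child.text),
            )

# Once mapped, the data can be queried with ordinary SQL.
phones = conn.execute(
    "SELECT p.name, a.value FROM person p"
    " JOIN person_attribute a ON a.person_id = p.id"
    " WHERE a.attr = 'phone' ORDER BY a.value"
).fetchall()
print(phones)
```

The key/value table trades some query convenience for flexibility: new attributes can appear in the input without changing the schema.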

Unstructured data
Unstructured data is data that does not conform to a data model and has no
easily identifiable structure, so it cannot be used easily by a computer
program. It is not organised in a pre-defined manner and does not have
a pre-defined data model, so it is not a good fit for a mainstream relational
database.
Characteristics of Unstructured Data:
• Data neither conforms to a data model nor has any structure
• Data cannot be stored in the form of rows and columns, as it is in databases
• Data does not follow any semantics or rules
• Data lacks any particular format or sequence
• Data has no easily identifiable structure
• Due to the lack of an identifiable structure, it cannot be used easily by
computer programs

Sources of Unstructured Data:

• Web pages
• Images (JPEG, GIF, PNG, etc.)
• Videos
• Memos
• Reports
• Word documents and PowerPoint presentations
• Surveys

Possible solutions for storing Unstructured data:

• Unstructured data can be converted to more easily manageable formats
• A content-addressable storage (CAS) system can be used to store unstructured data.
It stores data based on its metadata, and a unique name is assigned to every
object stored in it. An object is retrieved based on its content, not its location.
• Unstructured data can be stored in XML format
• Unstructured data can be stored in an RDBMS that supports BLOBs (binary large objects)
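
Two of the options above, content-addressable storage and BLOB columns, can be combined in a minimal sketch. It assumes that the unique name of each object is the SHA-256 digest of its bytes, which is a common choice but not something the text prescribes. The objects table and the put and get helpers are hypothetical names.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
# The BLOB column holds raw bytes; the digest serves as the object's unique name.
conn.execute(
    "CREATE TABLE objects (digest TEXT PRIMARY KEY, content BLOB NOT NULL)"
)


def put(data: bytes) -> str:
    """Store the bytes under a name derived from their content."""
    digest = hashlib.sha256(data).hexdigest()
    # Identical content maps to the same digest, so duplicates are ignored.
    conn.execute("INSERT OR IGNORE INTO objects VALUES (?, ?)", (digest, data))
    return digest


def get(digest: str) -> bytes:
    """Retrieve an object by its content-derived name."""
    row = conn.execute(
        "SELECT content FROM objects WHERE digest = ?", (digest,)
    ).fetchone()
    if row is None:
        raise KeyError(digest)
    return bytes(row[0])


# Sample bytes standing in for an image file; not a valid image.
image_bytes = b"\x89PNG\r\n\x1a\n sample bytes only"
key = put(image_bytes)
duplicate_key = put(image_bytes)
object_count = conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
```

Storing the same bytes twice yields the same key and a single stored object, which shows retrieval by content instead of by location.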
