


IoT data collection, storage and streaming analytics


One of the primary goals of IoT is the generation of business intelligence that can make business
processes more efficient. IoT has a dramatic effect on the volume, variety and rate of data
processing as billions of devices sense and transmit information at an ever-increasing rate. As video
sensing takes off, these numbers only get bigger. Real-time streaming analytics, coupled with a highly
distributed architecture for data retention and access, will be essential to mining the data in real time.

Krish Prabhu, GIAN Lecture 3

The IoT Reference Model

Layers, top to bottom, each followed by the question or challenge noted beside it:

Application Layer: IoT applications (asset management, smart cities, healthcare, ...)
  Why build end-to-end solutions? Which solution areas?
Service and Application Support Layer: generic support and specific support
  How do you define a platform? How do you attract developers?
Network Layer: networking and transport
  Managing connections across multiple networks is not an easy job.
Device Layer: device and gateway
  Devices are custom.

End-to-end security and management capabilities span all four layers.

Ref: ITU-T Y.2060

Krish Prabhu, GIAN Lecture 4

IoT generates a lot of data


Data is generated continuously
The amount of data increases with device population
Data sources are highly distributed
Data sources are on the move
Data can become stale fairly soon
Data location will depend on application

Data needs to be accurate, secure and useful



IoT introduces a new concept: Fog Computing

To handle large amounts of data in real time, Fog Computing takes Cloud Computing to the edge of the WAN.
This reduces latency and accommodates mobility.
It is also an answer to the large volume of data that is generated: it moves the processing closer to the end-points.
It facilitates real-time processing and adapts to changing data currency (the problem of data getting stale).
It co-exists with Cloud Computing.
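
As a rough illustration of these points, the sketch below shows a hypothetical fog node that discards stale readings and forwards only a small per-device summary upstream. The names `Reading` and `summarize_at_edge`, and the 30-second freshness window, are assumptions made for this example and do not come from any fog platform.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    device_id: str
    timestamp: float  # seconds on an arbitrary clock
    value: float


def summarize_at_edge(readings, now, max_age_s=30.0):
    """Drop stale readings, then reduce the rest to one summary per device."""
    fresh = {}
    for r in readings:
        if now - r.timestamp <= max_age_s:  # data currency check
            fresh.setdefault(r.device_id, []).append(r.value)
    # Only these small summaries would need to cross the WAN to the cloud.
    return {
        dev: {"count": len(vals), "mean": mean(vals), "max": max(vals)}
        for dev, vals in fresh.items()
    }


readings = [
    Reading("pump-1", 100.0, 20.0),
    Reading("pump-1", 110.0, 22.0),
    Reading("pump-2", 60.0, 99.0),   # 55 s old at now=115, so it is dropped
    Reading("pump-2", 112.0, 18.0),
]
summary = summarize_at_edge(readings, now=115.0)
print(summary)
```

Four raw readings become two summaries, and the stale value never leaves the edge. A real deployment would choose the freshness window per application.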

IoT Fog/Cloud Solution

[Figure: fog/cloud solution showing cloud computing, cloud data centers and enterprises. Ref: Cisco]


Ref: Ahmed Banafa


Why Fog Computing?

Ref: ITU-T Y.2060


What is Cloud Computing?

Cloud computing is not a technology but a model for processing and presenting information.

Cloud is all about information services, not products:

The infrastructure is shared: multiple clients share a common technology platform.
The services are accessed on demand, in units that vary by service: units can be users, capacity, transactions or any combination thereof.
Services are scalable: they are flexible, with no fixed limits to growth.
The pricing model is by consumption: instead of paying the fixed costs of a service sized to handle peak usage, you pay a variable cost per unit of consumption (users, transactions, capacity, etc.), measured over time periods that can vary, such as an hour or a month.
Services can be accessed from anywhere in the world by multiple devices.
Public clouds offer services to any customer over the Internet.
Private clouds offer services to a predefined group of customers, with access through the Internet or private networks.
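
The consumption pricing point can be made concrete with a little arithmetic. The sketch below compares capacity sized for peak usage with pay-per-use billing over one day. The usage profile and the price are invented for illustration, and the same unit price is assumed for both models, which real providers do not guarantee.

```python
def fixed_cost(peak_units, unit_price_per_hour, hours):
    """Cost of capacity sized for peak usage and paid for every hour."""
    return peak_units * unit_price_per_hour * hours


def consumption_cost(hourly_usage, unit_price_per_hour):
    """Cost when each hour is billed for the units actually used."""
    return sum(units * unit_price_per_hour for units in hourly_usage)


# Hypothetical day: 20 quiet hours at 2 units, 4 busy hours at 10 units.
usage = [2] * 20 + [10] * 4
price = 0.5  # assumed price per unit-hour, in an arbitrary currency

peak_sized = fixed_cost(max(usage), price, len(usage))
pay_per_use = consumption_cost(usage, price)
print(peak_sized, pay_per_use)  # 120.0 40.0
```

Under these assumptions the peak-sized service costs three times as much, because it pays for ten units during the twenty hours that need only two.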

Cloud Computing Models


Shared Cloud Resources

[Figure: applications App1, App2, ..., AppN run on a hypervisor over shared storage, databases and middleware, compute and network resources, coordinated by an orchestrator, network controllers and management functions.]

Fog Computing extends Cloud Computing

Ref: Cisco


Role of Fog


The IoT Data Collection System


Fog or Cloud Node


Ref: Aeris

REST: Representational State Transfer

REST is the software architectural style of the World Wide Web.
Roy Fielding defined it in his 2000 doctoral dissertation; it was developed alongside HTTP/1.1 (1996-1999).
REST is a coordinated set of constraints (more on the next slide).
These are applied to the design of components in a distributed hypermedia system.
REST is intended to lead to a higher-performing and more maintainable software architecture.
When a system conforms to the constraints of REST, it is called RESTful.
RESTful systems typically communicate over Hypertext Transfer Protocol (HTTP).
REST systems interface with external systems as web resources.

REST - Constraints

Client-server - A uniform interface separates clients from servers.
Clients are not concerned with data storage, so client code is portable.
Servers are not concerned with the user interface or user state.
Servers and clients may also be replaced and developed independently.

Stateless - Client context is not stored on the server.
Each request from any client contains all the information necessary to service the request.
The session state can be transferred by the server to another service (e.g., a database).
There are means to facilitate a state transition for the client.

Cacheable - Clients and intermediaries can cache responses.
Responses define themselves as cacheable, or not, to prevent clients from reusing stale data.
Well-managed caching eliminates some client-server interactions, further improving scalability and performance.

Layered system - A client cannot ordinarily tell whether it is connected directly to the end server or to an intermediary along the way.
Intermediary servers can improve system scalability.
They may also enforce security policies.
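
The stateless and cacheable constraints can be sketched together in a few lines. In this example every request carries its own credential, and the response declares how long it may be reused. The handler, the token value and the dictionary-shaped request and response are all hypothetical stand-ins for real HTTP messages; only the `Cache-Control: max-age` header is taken from HTTP itself.

```python
import time


def handle_request(request):
    """Stateless handler: everything needed is in the request itself."""
    if request.get("token") != "demo-token":  # hypothetical credential
        return {"status": 401, "headers": {}, "body": None}
    body = {"sensor": request["sensor"], "value": 21.5}  # canned reading
    # The response labels itself as cacheable for 10 seconds.
    return {"status": 200, "headers": {"Cache-Control": "max-age=10"}, "body": body}


class ClientCache:
    """Client-side cache that reuses a response until its max-age expires."""

    def __init__(self, server, clock=time.monotonic):
        self.server = server
        self.clock = clock
        self.entries = {}
        self.server_calls = 0

    def get(self, request):
        key = (request["sensor"], request.get("token"))
        hit = self.entries.get(key)
        if hit and self.clock() < hit[0]:
            return hit[1]  # still fresh, so no server interaction
        self.server_calls += 1
        response = self.server(request)
        cache_control = response["headers"].get("Cache-Control", "")
        if cache_control.startswith("max-age="):
            lifetime = int(cache_control.split("=", 1)[1])
            self.entries[key] = (self.clock() + lifetime, response)
        return response


now = [0.0]  # fake clock so the demo does not need to sleep
cache = ClientCache(handle_request, clock=lambda: now[0])
req = {"sensor": "temp-1", "token": "demo-token"}
cache.get(req)
cache.get(req)   # served from the cache
now[0] = 11.0    # past max-age, so the next call goes to the server
cache.get(req)
print(cache.server_calls)  # 2
```

Three client requests cost two server calls. Because the handler keeps no session, any replica could have answered either of them.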

REST - Constraints, continued

Code on demand (optional) - Servers can temporarily extend or customize the functionality of a client by transferring executable code.
Examples include compiled components such as Java applets and client-side scripts such as JavaScript.

Uniform interface - This constraint is fundamental to the design of any REST service.
The uniform interface simplifies and decouples the architecture, which enables each part to evolve independently.
The four constraints for this uniform interface are:
Identification of resources
Manipulation of resources through representations
Self-descriptive messages
Hypermedia as the engine of application state (HATEOAS)
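
A minimal sketch of the uniform interface follows, assuming a hypothetical in-memory sensor store. Each resource is identified by a URI, manipulated through JSON representations with a small fixed set of methods, and returned with links the client can follow next. The `_links` layout is borrowed loosely from the HAL convention and is an assumption here, not something REST prescribes; the function works on plain values rather than a real HTTP server.

```python
import json

SENSORS = {"temp-1": {"value": 21.5, "unit": "C"}}  # hypothetical in-memory store


def with_links(sensor_id):
    """Build a representation that carries its own follow-up links."""
    doc = dict(SENSORS[sensor_id])
    doc["_links"] = {
        "self": {"href": f"/sensors/{sensor_id}"},
        "collection": {"href": "/sensors"},
    }
    return doc


def handle(method, path, body=None):
    """Apply one HTTP-style method to one resource URI; return (status, JSON text)."""
    parts = [p for p in path.split("/") if p]
    if parts == ["sensors"] and method == "GET":
        items = [{"href": f"/sensors/{sid}"} for sid in sorted(SENSORS)]
        return 200, json.dumps({"_links": {"item": items}})
    if len(parts) != 2 or parts[0] != "sensors":
        return 404, json.dumps({"error": "not found"})
    sensor_id = parts[1]
    if method == "GET":
        if sensor_id not in SENSORS:
            return 404, json.dumps({"error": "not found"})
        return 200, json.dumps(with_links(sensor_id))
    if method == "PUT":
        SENSORS[sensor_id] = json.loads(body)
        return 200, json.dumps(with_links(sensor_id))
    if method == "DELETE":
        SENSORS.pop(sensor_id, None)
        return 204, ""
    return 405, json.dumps({"error": "method not allowed"})


put_status, _ = handle("PUT", "/sensors/hum-1", json.dumps({"value": 40, "unit": "%"}))
get_status, text = handle("GET", "/sensors/hum-1")
doc = json.loads(text)
print(put_status, get_status, doc["_links"]["self"]["href"])
```

A client that understands only the methods and the link structure can discover and update sensors without knowing how the server stores them.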

REST - benefits

Complying with the architectural constraints of REST provides desirable non-functional properties, such as:
Performance - Minimal component interactions (a dominant factor in user-perceived performance)
Scalability - Supports large numbers of components and interactions among components
Implementation:
REST's client-server separation simplifies component implementation and keeps interfaces simple.
Layered system constraints allow intermediaries (proxies, gateways, and firewalls) to be introduced at various points.
REST enables intermediate processing by constraining messages to be self-descriptive.
It facilitates modifiability of components to meet changing needs.
Visibility of communication between components by service agents
Portability of components by moving program code with the data

CoAP: Constrained Application Protocol

A software protocol intended for use in very simple electronic devices, allowing them to communicate interactively over the Internet.

Targeted at small, low-power sensors, switches and valves that are controlled or supervised remotely over the Internet
Application layer protocol intended for use in resource-constrained Internet devices
Designed to translate easily to HTTP for simplified integration with the web
Offers multicast support, very low overhead, and simplicity (important for IoT, where devices have much less memory and power)

The Internet Engineering Task Force (IETF) Constrained RESTful Environments (CoRE) Working Group has done the major standardization work (specified in RFC 7252).
The CoRE group designed CoAP with the following features in mind:
Low overhead and parsing complexity.
URI and content-type support.
Support for the discovery of resources provided by known CoAP services.
Simple subscription to a resource, with resulting push notifications.
Simple caching based on max-age.

With the introduction of CoAP, a complete networking stack of open standard protocols suitable for constrained devices and environments becomes available.
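
To give a feel for the low overhead, the sketch below hand-encodes a confirmable CoAP GET following the RFC 7252 message layout: a 4-byte header (version, type, token length, code, message ID), an optional token, then one Uri-Path option per path segment. It is a teaching sketch only. The function name `encode_get` is made up, extended option deltas and lengths are deliberately left out, and a real device would use an existing CoAP library.

```python
import struct

COAP_VERSION = 1
TYPE_CON = 0       # Confirmable message
CODE_GET = 0x01    # request code 0.01 (GET)
OPT_URI_PATH = 11  # Uri-Path option number


def encode_get(path, message_id, token=b""):
    """Encode a confirmable CoAP GET with Uri-Path options (short segments only)."""
    if len(token) > 8:
        raise ValueError("token length must be 0-8 bytes")
    first = (COAP_VERSION << 6) | (TYPE_CON << 4) | len(token)
    packet = struct.pack("!BBH", first, CODE_GET, message_id) + token
    previous = 0
    for segment in path.strip("/").split("/"):
        data = segment.encode("utf-8")
        if not 0 < len(data) < 13:
            raise ValueError("this sketch omits extended option lengths")
        delta = OPT_URI_PATH - previous  # options are delta-encoded
        packet += bytes([(delta << 4) | len(data)]) + data
        previous = OPT_URI_PATH
    return packet


msg = encode_get("/sensors/temp", message_id=0x1234)
print(len(msg), msg.hex())
```

The whole request for `/sensors/temp` fits in 17 bytes, 4 of them header, which is the kind of footprint that suits devices with little memory and power.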

The distributed Fog/Cloud architecture for storing and analyzing IoT data must:

Be secure and support reliable collection, storage, analysis and publishing of IoT data, scaling to millions of users and billions of devices as needed
Have a rules engine that processes data in near real time and powers sophisticated alerting functionality in applications
Support industry-standard protocols for collecting data from devices, including CoAP and MQTT, and provide support for custom protocols as needed
Offer a REST API that supports both pushing data to applications and applications pulling data on demand
Offer a secure data-sharing interface that enables sensor data to be consumed by applications across the enterprise or by third parties
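
The rules engine requirement can be pictured as a list of condition and action pairs evaluated against each incoming reading. The sketch below is one minimal way to express that idea. The `Rule` and `RulesEngine` names, the field names in the readings and the thresholds are all hypothetical, and the `alerts` list stands in for whatever push channel an application would actually use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # predicate over one reading
    action: Callable[[dict], None]     # side effect, e.g. push an alert


class RulesEngine:
    def __init__(self, rules):
        self.rules = list(rules)

    def process(self, reading):
        """Evaluate every rule against one reading; return the names that fired."""
        fired = []
        for rule in self.rules:
            if rule.condition(reading):
                rule.action(reading)
                fired.append(rule.name)
        return fired


alerts = []  # stands in for a push channel to applications

engine = RulesEngine([
    Rule("overheat", lambda r: r.get("temp_c", 0) > 80,
         lambda r: alerts.append(f"{r['device']}: overheating")),
    Rule("low-battery", lambda r: r.get("battery_pct", 100) < 15,
         lambda r: alerts.append(f"{r['device']}: battery low")),
])

stream = [
    {"device": "valve-7", "temp_c": 45, "battery_pct": 90},
    {"device": "valve-9", "temp_c": 92, "battery_pct": 10},
]
fired = [engine.process(reading) for reading in stream]
print(fired)
print(alerts)
```

Each reading is handled as it arrives, so alerts are raised without waiting for a batch job. Rules that need history, such as rate-of-change checks, would require state that this sketch does not keep.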

Challenges

The Data Complexity Problem
With billions of sensors collecting trillions of data points, collecting, managing, and analyzing data on this scale is a big challenge. Gaining visibility and insight into devices, connectivity, and billing information is important.

Proactively Managing IoT Connectivity
Beyond identifying devices with connectivity issues, operators need to focus on a specific device, time, or usage activity, including data, SMS, voice, sessions, and registrations. This makes it possible to find specific devices that are outliers or are behaving abnormally. Automating this helps in rapid diagnosis and in containing downtime. For mobile end-points, costs also need to be managed by revealing coverage, data, SMS, and roaming charges.

Setting Smart Alerts
Smart alerts are triggered by specific events and let the operator build predefined actions that help fix or mitigate a problem.

Detecting Anomalies
A machine learning engine can identify anomalies, such as certain devices using higher-than-normal amounts of data (or security breaches), which can then feed into the smart alerts.
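
As a simple stand-in for the machine learning engine mentioned above, the sketch below flags devices whose data usage is far above the fleet median, using a modified z-score based on the median absolute deviation. This is a basic statistical technique chosen for illustration, not the method of any particular product. The function name, the device names and the 3.5 threshold are assumptions.

```python
from statistics import median


def find_usage_outliers(usage_mb, threshold=3.5):
    """Return device ids whose usage is far above the fleet median.

    Uses the modified z-score built on the median absolute deviation (MAD),
    which is less distorted by the outliers themselves than a mean-based score.
    """
    values = list(usage_mb.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread in the data, nothing to rank against
    outliers = []
    for device, value in usage_mb.items():
        score = 0.6745 * (value - med) / mad
        if score > threshold:  # only higher-than-normal usage is flagged
            outliers.append(device)
    return outliers


usage = {"dev-a": 10, "dev-b": 12, "dev-c": 11, "dev-d": 9, "dev-e": 250}
suspects = find_usage_outliers(usage)
print(suspects)  # ['dev-e']
```

The returned list is the kind of signal that could feed a smart alert, for example suspending a device or notifying an operator.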

The IoT Solution Stack


Problem:
Can't monitor devices in real time for more efficient deployment.
Planning ineffectively, without understanding how the deployment progresses over time.
Surprised by unexpected bills and by significant usage changes that lead to high bills.
Have little visibility into billing costs and little ability to manage them.
Solution:
View deployed devices and track their usage and billing well before you receive your monthly bill.
See the number of provisioned, billed, suspended, and canceled devices over time.
Gain visibility into billing history, current usage, and projections based on the latest usage data.
Set billing or usage alerts that enable the user to take appropriate action to manage costs.

Problem:
Troubleshooting erratic devices, e.g., frequent registration, multiple sessions, or excessive data usage by some devices.
Solution:
Automated anomaly detection and specific alerts that notify the user of changes in device behavior, with the ability to automate actions.

Problem:
Don't know where the devices are deployed or whether they are connected and behaving normally.
Solution:
Map view of devices and a summary status showing which devices are working properly.

Problem:
Difficult to use raw data reports to extract the key insights.
Solution:
Simple interface that shows the relevant data in a graphical format.
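
The projection and usage-alert items in the first solution group can be sketched with a simple linear estimate. The example below assumes that the average daily usage so far continues for the rest of the month, which is only one of many possible projection methods. The function names, device name and plan size are hypothetical.

```python
def project_month_end(usage_so_far_mb, day_of_month, days_in_month):
    """Linear projection: assume the average daily usage so far continues."""
    if day_of_month <= 0:
        raise ValueError("day_of_month must be positive")
    return usage_so_far_mb / day_of_month * days_in_month


def usage_alert(device, usage_so_far_mb, day_of_month, days_in_month, plan_mb):
    """Return an alert message if projected usage exceeds the plan, else None."""
    projected = project_month_end(usage_so_far_mb, day_of_month, days_in_month)
    if projected > plan_mb:
        return f"{device}: projected {projected:.0f} MB exceeds plan of {plan_mb} MB"
    return None


# 60 MB used by day 10 of a 30-day month, against a 100 MB plan.
alert = usage_alert("tracker-3", 60, 10, 30, plan_mb=100)
print(alert)  # tracker-3: projected 180 MB exceeds plan of 100 MB
```

An alert raised on day 10 leaves twenty days to act, which is the point of seeing usage well before the monthly bill arrives.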


Ref: Aeris

End of Lecture
