Client–server technologies
Most of the services that you use on the internet are based on client–server architecture. In
this model, there are client applications that make requests for services and servers that send
a response to each request.
A web server process usually runs 24 hours a day, 7 days a week. If the server is
unavailable, the site or sites it is hosting cannot be accessed. The server process is
associated with an open port on the computer; for a web server, port 80 needs to be open
for standard HTTP requests. Servers are passive and listen (wait) for incoming requests.
When a request is received, the server responds by sending the resource that was
requested.
A client is the software that makes the request. If you want to look at a web page, you use
a browser; a browser is a client application. If you want to collect email, you use an
email client. The client process makes a request in the appropriate format and waits for
the response from the server.
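As a minimal sketch of this request–response cycle from the client side, a script could request a resource and handle the server's response as below. The URL is hypothetical; any web server hosting the page would respond in the same way.

// The client sends an HTTP GET request; the server responds with the resource.
fetch("https://example.com/index.html")
    .then(response => response.text())          // read the body of the response
    .then(html => console.log(html.length))     // the client can then process the response
    .catch(error => console.error("Request failed:", error));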
Client-side processing refers to any operations that are carried out on the 'client' side of a
client–server relationship. There is almost always some processing that is carried out by
the client.
Many operations have traditionally been carried out client-side because doing so offloads
processing to the client, which reduces the load on the server and the communication
bandwidth required. Sometimes, the script needs to access data stored on the
local computer, e.g. in cookies, which can be used to personalise a web page.
Client-side processing can pose a security risk. Malicious content can be distributed as
easily as code that creates a stunning slide show. Browsers are designed to mitigate
these risks. Scripts run in a sandbox (an environment that isolates the code so that it
cannot have access to, or affect, other parts of the system). Scripts have no access to the
operating system and little access to the file system (other than for limited purposes such
as cookies) and they can only access information that relates to their own site.
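As an illustration of the limited local data a client-side script can use, the sketch below stores and reads a cookie to personalise a page. The cookie name "username" and the element id "greeting" are invented for this example.

// Store a value in a cookie: one of the few pieces of local data a script may use.
document.cookie = "username=Sam; max-age=31536000; path=/";

// Read the cookies back and personalise the page.
const cookies = Object.fromEntries(
    document.cookie.split("; ").map(pair => pair.split("="))
);
document.getElementById("greeting").textContent =
    "Welcome back, " + (cookies.username || "guest");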
Server-side processing refers to operations that are carried out on the server.
When a request is made for a resource that must be processed on the server, the
server-side script is executed before the response is sent to the client. Server-side
scripts are stored and processed on the server, so they are completely hidden from the
client.
The steps below outline what might happen when a customer places an order on an online
shop (a simple code sketch of the server side of this process follows the list).
1. The user navigates to the appropriate page and completes an order form. The order
form may make use of a client-side script to carry out basic validation of the data
that has been entered. For example, it could use a regular expression to make sure
that the email address is in the correct format.
2. The user submits the order form. The web browser creates an HTTP POST request
to the web server, submitting the data from the form. POST is explained in more
detail in the next section.
3. The web server detects that the request is 'dynamic' and forwards it to the
appropriate server-side extension (e.g. a PHP interpreter) for processing.
4. The server-side script executes an SQL query to insert the data for the order into the
database. It then dynamically creates an HTML page that confirms the order has
been placed, including the customer order number that was generated as part of the
transaction.
5. The web server then returns the generated HTML to the web browser along with an
HTTP status code of 200 ('success'). If anything goes wrong, an error code is
returned.
6. The web browser receives the HTML and renders it to display the confirmation to
the user.
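A minimal sketch of steps 3 to 5 is shown below, using Node.js's built-in http module rather than PHP. The port, form fields, and order-number logic are invented for illustration, and the database insert is only indicated by a comment.

const http = require("http");

http.createServer((request, response) => {
    if (request.method === "POST" && request.url === "/order") {
        let body = "";
        request.on("data", chunk => { body += chunk; });     // collect the submitted form data
        request.on("end", () => {
            const form = new URLSearchParams(body);          // parse the POST body
            // ... execute an SQL INSERT here to store form.get("email") and the rest of the order ...
            const orderNumber = Date.now();                  // stand-in for a generated order number
            response.writeHead(200, { "Content-Type": "text/html" });   // status code 200 ('success')
            response.end("<h1>Order confirmed</h1><p>Your order number is " + orderNumber + "</p>");
        });
    } else {
        response.writeHead(404);                             // an error code if anything goes wrong
        response.end("Not found");
    }
}).listen(80);                                               // the standard HTTP port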
The process described above has been simplified. There would probably be a lot of
additional processing to be performed server-side. For example, stock levels may be
updated, an email may be sent to the user to confirm the order, and the transaction may
generate additional activity on the warehousing or dispatch systems.
Server-side processing has several advantages:
There is more control over the environment in which the scripts are executed.
The developer knows exactly which version of the interpreter will be used and can
be confident that any other software extensions will be present.
Scripts are hidden on the server; this helps protect intellectual property and helps
prevent anyone from tampering with the code.
Servers can be optimised to cope with heavy processing demands.
Thin-client computing
A thin-client is a device that has limited main memory, limited secondary storage, and
only basic processing capability.
The client–server paradigm has started to change the design of computers. Although
high-performance (thick-client) computers are favoured by gamers and users with specialist
requirements (e.g. video editors), most users now realise that they do not need powerful
computers. As more and more processing is done server-side, the need for high-end
processors in user devices is diminishing. Cloud storage is having a dramatic impact on
where data is stored, and the demand for huge amounts of local storage has declined.
Today, the display quality and battery life of a computer are often considered more
important than its storage capacity.
The advantage for the user is that a thin-client computer is cheaper to buy. Software is
always kept up-to-date by the provider, and is often available at no cost. Thin clients are
arguably more secure, as data is not held locally and the responsibility for backup is with
the company storing the data.
The main disadvantage of a thin-client is the need to be connected to the internet. There
is also a requirement for a decent amount of bandwidth as more data is transferred
between client and server.
Standards
The use of standards (as well as protocols) is essential for the internet. Standards exist
for almost every aspect of the internet and, without these standards, the internet would
grind to a halt.
There are two broad categories of standards. De jure standards are standards that are
regulated by official bodies. You have already seen how ICANN provides effective
regulation of the names and numbers that are used across the internet. The term JPEG is
an acronym for the Joint Photographic Experts Group, which created the image
compression standard in 1992 and continues to update and maintain it.
There are also de facto standards. These are standards that arise through popular use
but are not managed or regulated. For example, it may be standard to use sans serif fonts
for web pages but no one will stop you if you go wild with Garamond.
JSON and XML
JSON and XML are both standards for the interchange of data.
JSON
JSON is an acronym for JavaScript Object Notation. Its development is regulated by
ECMA, a standards organisation for information and communication systems. The fact
that it includes the term JavaScript in its name can be misleading; although its provenance
is JavaScript, JSON is an open, language-independent format.
JSON is text based (i.e. readable by humans) and is built on two structures: collections of
name/value pairs (objects, written with braces) and ordered lists of values (arrays, written
with square brackets). The example below uses both to describe a palette of colours:
{
    "palette": [
        {
            "name": "red",
            "code": "#ff0000",
            "rgb": {"red": 255, "green": 0, "blue": 0}
        },
        {
            "name": "orange",
            "code": "#ffa500",
            "rgb": {"red": 255, "green": 165, "blue": 0}
        },
        {
            "name": "yellow",
            "code": "#ffff00",
            "rgb": {"red": 255, "green": 255, "blue": 0}
        },
        {
            "name": "green",
            "code": "#008000",
            "rgb": {"red": 0, "green": 128, "blue": 0}
        },
        {
            "name": "blue",
            "code": "#0000ff",
            "rgb": {"red": 0, "green": 0, "blue": 255}
        },
        {
            "name": "indigo",
            "code": "#4b0082",
            "rgb": {"red": 75, "green": 0, "blue": 130}
        },
        {
            "name": "violet",
            "code": "#ee82ee",
            "rgb": {"red": 238, "green": 130, "blue": 238}
        }
    ]
}
JSON is a compact representation that is very easy for humans to understand. It is easy
for computers to parse and therefore quick to process. In JSON, it is easy to distinguish
between the number 1 and the string value "1", as numbers and strings are represented
differently. One disadvantage of JSON is that it supports only a limited set of data types.
The json.org website has an excellent example of the use of syntax diagrams to show the
structure of a JSON file.
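As a short sketch, the palette above could be parsed with JavaScript's built-in JSON support (jsonText is assumed to hold the JSON document as a string):

const data = JSON.parse(jsonText);            // convert the text into objects and arrays
console.log(data.palette[0].name);            // "red"
console.log(data.palette[0].rgb.green);       // 0  (a number, not the string "0")
console.log(JSON.stringify(data.palette[0])); // convert back to text for transmission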
XML
Extensible Markup Language (XML) is an alternative format to JSON for the interchange
of data.
In XML format, the data for the example palette would be defined like this:
<palette>
    <code>#ff0000</code>
    <name>red</name>
    <rgb>
        <blue>0</blue>
        <green>0</green>
        <red>255</red>
    </rgb>
    <code>#ffa500</code>
    <name>orange</name>
    <rgb>
        <blue>0</blue>
        <green>165</green>
        <red>255</red>
    </rgb>
    <code>#ffff00</code>
    <name>yellow</name>
    <rgb>
        <blue>0</blue>
        <green>255</green>
        <red>255</red>
    </rgb>
    <code>#008000</code>
    <name>green</name>
    <rgb>
        <blue>0</blue>
        <green>128</green>
        <red>0</red>
    </rgb>
    <code>#0000ff</code>
    <name>blue</name>
    <rgb>
        <blue>255</blue>
        <green>0</green>
        <red>0</red>
    </rgb>
    <code>#4b0082</code>
    <name>indigo</name>
    <rgb>
        <blue>130</blue>
        <green>0</green>
        <red>75</red>
    </rgb>
    <code>#ee82ee</code>
    <name>violet</name>
    <rgb>
        <blue>238</blue>
        <green>130</green>
        <red>238</red>
    </rgb>
</palette>
You can see that the XML file is longer; every field name tag needs a corresponding end
tag.
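As a rough comparison with the JSON example, a browser script could read the same data from the XML version using the built-in DOMParser (xmlText is assumed to hold the XML document as a string):

const doc = new DOMParser().parseFromString(xmlText, "application/xml");
const names = [...doc.querySelectorAll("name")].map(el => el.textContent);
console.log(names);                                        // ["red", "orange", ..., "violet"]
console.log(doc.querySelector("rgb > red").textContent);   // "255" (a string; plain XML does not distinguish types)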
XML allows a schema to be written. The schema is a type of metadata that specifies, and
therefore constrains, the structure of the XML file. A typical schema defines the elements
and attributes that must be included in the file, the data types for these elements and
attributes (including default and fixed values), and the child elements of each element,
as well as the order in which they appear.
For the example, a simple schema might look like this:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="palette">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="name" type="xs:string"/>
                <xs:element name="code" type="xs:string"/>
                <xs:element name="rgb">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="red" type="xs:integer"/>
                            <xs:element name="green" type="xs:integer"/>
                            <xs:element name="blue" type="xs:integer"/>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
An XML schema can contain many different rules and ensures that the data contained in
the file is in the correct format and uses the right data types.
Web CRUD applications and REST
CRUD is an acronym for Create, Retrieve, Update, and Delete: the four basic operations
relating to persistent data (i.e. data stored in files or databases). When you develop an
application, you must always consider whether you have implemented the functionality to
allow all four operations. For example, if you are making an online revision tool: will the
teacher be able to CREATE new questions? Will the teacher be able to RETRIEVE a
specific set of questions? Will the teacher be able to UPDATE questions if they need to be
changed? Will the teacher be able to DELETE questions that are no longer needed?
The four operations map directly onto SQL statements:

Operation   SQL statement
Create      INSERT
Retrieve    SELECT
Update      UPDATE
Delete      DELETE

The table can be extended to include the HTTP request methods that map to the four
basic operations:

Operation   SQL statement   HTTP request method
Create      INSERT          POST
Retrieve    SELECT          GET
Update      UPDATE          PUT
Delete      DELETE          DELETE
If you have previously done some web development, you may have used the GET and
POST methods. Many students use these two methods interchangeably with little regard
for their documented purpose (and this is possible as you will have coded both the client
and server end of the process). It is likely that many of you have not used the PUT or
DELETE methods at all.
REST (Representational State Transfer) is a set of architectural conventions for web
applications that builds on these standard HTTP methods. The important thing about
RESTful applications is that they allow the client and server to be programmed
independently, yet interact successfully without either knowing the implementation details
of the other side. In a RESTful system, the client sends a request that the server performs
a particular action (such as updating a resource). The request must contain all of the
information necessary for the server to be able to process it. Once the server has
performed the action, it sends a response (often in XML or JSON format).
This RESTful approach has been widely adopted for many APIs (Application Programming
Interfaces), where the use of standards has resulted in APIs that are easier to provide,
use, document, and support.
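The sketch below shows the four CRUD operations being requested from a hypothetical RESTful API using the corresponding HTTP methods. The URL and resource structure are invented for illustration, and the code assumes it runs somewhere that top-level await is available.

const base = "https://api.example.com/questions";

// Create: POST a new question
await fetch(base, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: "What does CRUD stand for?" })
});

// Retrieve: GET an existing question
const question = await (await fetch(base + "/42")).json();

// Update: PUT a changed version of the question
await fetch(base + "/42", {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: "What do the letters in CRUD stand for?" })
});

// Delete: DELETE the question
await fetch(base + "/42", { method: "DELETE" });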
WebSockets
The benefit of the WebSocket protocol is that data can be pushed from the server without
having to be initiated by an HTTP request from the client. Every time an HTTP request is
made, significant header data has to be transferred to the server; this increases latency
(the amount of time taken before anything is seen to happen at the client end). Once a
WebSocket connection has been established, messages are received without the need for
additional security checks, which also reduces latency.
If the web page contains content where response times are important, like a browser-
based game, reducing latency is essential for smooth performance.
The WebSocket protocol makes use of the transport layer of the TCP/IP stack. The
communication uses port 80 (or 443 if encrypted), which means that firewalls will
treat the data as standard web traffic.
Data is transferred as messages, each of which consists of one or more frames that will
be reconstructed when received. These frames can be:
text frames
binary frames, e.g. for image data
"ping/pong" frames, used to check the connection
control frames, e.g. to close the connection
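A minimal sketch of a browser client using the WebSocket API is shown below; the URL and message format are hypothetical.

const socket = new WebSocket("wss://game.example.com/play");   // wss:// is the encrypted form, over port 443

socket.addEventListener("open", () => {
    socket.send(JSON.stringify({ action: "join", player: "Sam" }));   // the client can send at any time
});

socket.addEventListener("message", event => {
    // the server can push messages without the client making a new HTTP request
    const update = JSON.parse(event.data);
    console.log("Game state update:", update);
});

socket.addEventListener("close", () => console.log("Connection closed"));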
Search engine indexing (OCR only)
The World Wide Web is a vast repository of data. Most people make extensive use of it
every day to find information about all sorts of things, ranging from train timetables, to
where to find the best food, to getting updates from our friends.
Search engines can be a useful tool if you don’t know where to find the information you
want. When using a search engine you can expect to see the results immediately and for
them to be organised in the order of the most useful sites.
No one really knows how much data there is on the web, but it is estimated that at the time
of writing (2019) it is around 40 zettabytes (that is, 40 × 10²¹ bytes), with around 2 billion
active websites. Every day, there are around 5.5 billion searches for information through
Google alone.
Search engine providers, such as Google, build indexes and deploy very smart algorithms
that allow them to provide more or less instant responses to requests for information.
Twenty-four hours a day, seven days a week, search engine providers use software
known as web crawlers to discover and scan pages by following links from one webpage
to another. They also deploy algorithms to evaluate the importance of those pages so that
they can organise search results in a way that will be helpful to the user.
A search for the Raspberry Pi Foundation shows that there are 17,700,000 results and
the search took 0.47 seconds. The first page of results shows around twelve likely pages.
Fortunately, the official Raspberry Pi website appears right at the top! Most users do not
look beyond the first page of results. If you run a website, it is important that you make
sure that the site can be found. A whole industry has been built around the concept of
'search engine optimisation', to help organisations achieve a strong online presence.
The indexing algorithms make use of all of the information on a page to index it. This
includes visible information, such as page content and links, and hidden data, such as the
metadata that is included in the header of an HTML page. Metadata is provided by the
page author and can help a page to be indexed correctly; there are many tags that
can be used. The most useful tags/metadata provide instructions to robots to control the
behaviour of search engine crawling and indexing. The "keywords" metatag no longer has
much weighting in indexing algorithms.
<head>
    <meta charset="UTF-8" />
    <meta name="description" content="A level Computer Science Revision Resources" />
    <meta name="keywords" content="AQA, OCR, EDUQAS" />
    <meta name="author" content="C.S.Teacher" />
    <meta name="robots" content="noindex" />
</head>
PageRank Algorithm (OCR only)
The algorithms used by the organisations that provide search engines are kept secret, as
they are fundamental to commercial success. However, it is possible to examine an
algorithm that was devised by, and named after, Larry Page, one of the founders of
Google. This algorithm is called PageRank.
PageRank calculates the importance of a page by counting the number and quality of the
links that point to it. According to Google, "PageRank works by counting the number and
quality of links to a page to determine a rough estimate of how important the website is.
The underlying assumption is that more important websites are likely to receive more links
from other websites".
In the algorithm, a page with a high number of incoming links is judged more important
than a page with a high number of outgoing links. This makes sense because if other
sites link to a particular website, they have already judged that site to be useful.
The directed graph below shows that, if the quality of links is equal, C would have the
greatest page rank; it has three incoming links. The site represented by A has the most
links, but only one is incoming.
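A sketch of the simplified PageRank calculation is given below. It uses the well-known formula PR(p) = (1 - d) + d × the sum of PR(q)/C(q) over every page q that links to p, where C(q) is the number of outgoing links on page q and d is a damping factor (typically 0.85). The four-page link graph is invented to match the description above: C has three incoming links, while A has the most links in total but only one incoming.

const links = {                                   // each page lists the pages it links to
    A: ["B", "C", "D"],
    B: ["C"],
    C: ["A"],
    D: ["C"]
};
const d = 0.85;                                   // damping factor
const pages = Object.keys(links);
let rank = Object.fromEntries(pages.map(p => [p, 1]));

for (let i = 0; i < 20; i++) {                    // iterate until the values settle
    const next = {};
    for (const p of pages) {
        const incoming = pages.filter(q => links[q].includes(p));
        next[p] = (1 - d) + d * incoming.reduce((sum, q) => sum + rank[q] / links[q].length, 0);
    }
    rank = next;
}
console.log(rank);                                // C ends up with the highest PageRank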