String Data Type

A string is a data type that represents a sequence of characters, often implemented as an array of bytes. Strings can be fixed-length or variable-length, with modern programming languages typically using variable-length strings. Character encoding and representation methods, such as null-terminated strings, are crucial for string implementation across different programming languages.

Uploaded by

christopherhodoh
Copyright
© All Rights Reserved

STRING DATA TYPE

A string is generally considered a data type and is often implemented as an array data
structure of bytes (or words) that stores a sequence of elements, typically characters,
using some character encoding. The term string may also denote more general arrays or other
sequence (or list) data types and structures.
A string datatype is a datatype modeled on the idea of a formal string. Strings are such an
important and useful datatype that they are implemented in nearly every programming
language. In some languages they are available as primitive types and in others as composite
types. The syntax of most high-level programming languages allows for a string, usually quoted
in some way, to represent an instance of a string datatype; such a meta-string is called
a literal or string literal.

String length
Although formal strings can have an arbitrary finite length, the length of strings in real languages is
often constrained to an artificial maximum. In general, there are two types of string datatypes:
fixed-length strings, which have a fixed maximum length determined at compile time and which use
the same amount of memory whether this maximum is needed or not, and variable-length strings,
whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the
actual requirements at run time. Most strings in modern programming languages are variable-length
strings. Even variable-length strings are limited in length by the size of available computer
memory. The string length can be stored as a separate integer (which may put another artificial
limit on the length) or implicitly through a termination character, usually a character value with
all bits zero, as in the C programming language.
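
The point about a separate length integer imposing its own limit can be sketched in a few lines. This is only an illustration, assuming a one-byte length field; the names pack_length_prefixed, unpack_length_prefixed and MAX_LEN are made up for this example and do not come from any particular language's string implementation.

```python
import struct

MAX_LEN = 255  # largest value a one-byte length field can hold (assumed field size)

def pack_length_prefixed(payload: bytes) -> bytes:
    """Return a one-byte length field followed by the payload bytes."""
    if len(payload) > MAX_LEN:
        # The width of the length field is the "artificial limit" on string length.
        raise ValueError("payload does not fit a one-byte length field")
    return struct.pack("B", len(payload)) + payload

def unpack_length_prefixed(buf: bytes) -> bytes:
    """Read the length field, then return exactly that many bytes."""
    (length,) = struct.unpack_from("B", buf, 0)
    return buf[1:1 + length]

packed = pack_length_prefixed(b"FRANK")
print(packed)                          # b'\x05FRANK'
print(unpack_length_prefixed(packed))  # b'FRANK'
```

A wider length field raises the limit at the cost of more bytes per string, which is the trade-off the paragraph above alludes to.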

Character encoding
String datatypes have historically allocated one byte per character, and, although the exact
character set varied by region, character encodings were similar enough that programmers could
often get away with ignoring this, since characters a program treated specially (such as period and
space and comma) were in the same place in all the encodings a program would encounter. These
character sets were typically based on ASCII or EBCDIC. If text in one encoding was displayed on a
system using a different encoding, the text was often mangled, though it usually remained somewhat
readable, and some computer users learned to read the mangled text.
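
The mangling described above can be reproduced with two one-byte encodings. The sketch below picks Latin-1 and code page 437 purely as an example pair (the text does not name specific encodings), and the sample sentence is invented for the illustration.

```python
# Text written on a system that uses Latin-1 (one byte per character).
text = "caf\u00e9, d\u00e9j\u00e0 vu."
raw = text.encode("latin-1")

# The same bytes displayed on a system that assumes code page 437.
mangled = raw.decode("cp437")

# Characters in the ASCII range (letters, space, comma, period) survive,
# because both encodings place them at the same byte values.
survivors = [m == t for m, t in zip(mangled, text) if ord(t) < 128]

print(text)
print(mangled)
print(all(survivors))  # True
```

Only the accented letters change, which is why such text stayed partly readable.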

Implementations
Some languages, such as C++ and Ruby, normally allow the contents of a string to be changed after
it has been created; these are termed mutable strings. In other languages, such as Java and Python,
the value is fixed and a new string must be created if any alteration is to be made; these are
termed immutable strings. Some of these languages also provide another type that is mutable, such
as StringBuilder in Java and .NET.
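
Python, one of the immutable-string languages named above, makes the distinction easy to see. The following is a small sketch; bytearray is used here as an example of a separate mutable type, and the variable names are arbitrary.

```python
s = "string"

# Assigning to a position of an immutable string is rejected.
try:
    s[0] = "S"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# An "alteration" therefore builds a new string; the original is untouched.
capitalized = "S" + s[1:]

# bytearray is a mutable sequence of bytes and can be modified in place.
buf = bytearray(b"string")
buf[0] = ord("S")

print(changed_in_place)  # False
print(s, capitalized)    # string String
print(buf)               # bytearray(b'String')
```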
Strings are typically implemented as arrays of bytes, characters, or code units, in order to allow fast
access to individual units or substrings—including characters when they have a fixed length. A few
languages such as Haskell implement them as linked lists instead.
Some languages, such as Prolog and Erlang, avoid implementing a dedicated string datatype at all,
instead adopting the convention of representing strings as lists of character codes.
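
The list-of-character-codes convention can be imitated in Python for illustration. The helper names to_codes and from_codes are hypothetical; the sketch only shows what such a representation contains, not how Prolog or Erlang implement it.

```python
def to_codes(s: str) -> list[int]:
    """Represent a string as a plain list of character codes."""
    return [ord(ch) for ch in s]

def from_codes(codes: list[int]) -> str:
    """Rebuild the string from a list of character codes."""
    return "".join(chr(c) for c in codes)

codes = to_codes("hello")
print(codes)              # [104, 101, 108, 108, 111]
print(from_codes(codes))  # hello
```

Because the result is an ordinary list, general list operations such as reversal or concatenation apply to it without any string-specific functions.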
Representations
Representations of strings depend heavily on the choice of character repertoire and the method of
character encoding. Older string implementations were designed to work with repertoire and
encoding defined by ASCII. Modern implementations often use the extensive repertoire defined by
Unicode along with a variety of complex encodings such as UTF-8 and UTF-16.
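
To see why these encodings are called complex, it helps to compare how many bytes one character needs in each. The sketch below uses four sample characters chosen for this illustration and Python's built-in codecs; the little-endian form of UTF-16 is used so that no byte-order mark is added.

```python
# Sample characters: Latin letter, accented letter, euro sign, emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

sizes = {}
for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")
    sizes[ch] = (len(utf8), len(utf16))
    print(f"U+{ord(ch):04X}  utf-8: {utf8.hex()}  utf-16: {utf16.hex()}")

# Bytes per character as (UTF-8, UTF-16):
# U+0041 -> (1, 2), U+00E9 -> (2, 2), U+20AC -> (3, 2), U+1F600 -> (4, 4)
```

Neither encoding uses a fixed number of bytes per character, so indexing by character is no longer a simple array lookup.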
The term byte string usually indicates a general-purpose string of bytes, rather than strings of only
(readable) characters, strings of bits, or such. Byte strings often imply that bytes can take any value
and any data can be stored as-is, meaning that there should be no value interpreted as a termination
value.
Null-terminated
The length of a string can be stored implicitly by using a special terminating character; often this is
the null character (NUL), which has all bits zero, a convention used and perpetuated by the
popular C programming language. Hence, this representation is commonly referred to as a C string.
This representation of an n-character string takes n + 1 space (1 for the terminator), and is thus
an implicit data structure.
In terminated strings, the terminating code is not an allowable character in any string. Strings
with a length field do not have this limitation and can also store arbitrary binary data.
An example of a null-terminated string stored in a 10-byte buffer, along with its ASCII (or more
modern UTF-8) representation as 8-bit hexadecimal numbers, is:

F    R    A    N    K    NUL  k    e    f    w
46   52   41   4E   4B   00   6B   65   66   77
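
Reading the 10-byte buffer from the example can be sketched as follows. The function c_string_length is a hypothetical stand-in written for this illustration; it scans for the first zero byte in the way a C-style length routine would, and it assumes the buffer does contain a terminator.

```python
# The 10-byte buffer from the example: "FRANK", NUL, then leftover bytes.
buffer = bytes([0x46, 0x52, 0x41, 0x4E, 0x4B, 0x00, 0x6B, 0x65, 0x66, 0x77])

def c_string_length(buf: bytes) -> int:
    """Count the bytes before the first NUL (assumes a NUL is present)."""
    n = 0
    while buf[n] != 0:
        n += 1
    return n

length = c_string_length(buffer)
text = buffer[:length].decode("ascii")
leftover = buffer[length + 1:]

print(length)    # 5
print(text)      # FRANK
print(leftover)  # b'kefw'

# A zero byte inside the data ends the string early, which is why
# terminated strings cannot hold arbitrary binary data.
print(c_string_length(b"AB\x00CD\x00"))  # 2
```

The string occupies 6 bytes of the buffer (5 characters plus the terminator); the bytes after the NUL are not part of it.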


URL

https://en.wikipedia.org/wiki/String_(computer_science)
