0% found this document useful (0 votes)
23 views18 pages

Cpp_Document

The document provides a comprehensive overview of memory unit conversions, particularly focusing on binary and decimal systems, and the internal workings of the C++ std::string class. It explains the differences between std::string and C-style strings (char[]), highlighting memory management, resizing capabilities, and performance metrics. Additionally, it includes practical use cases, code examples, and performance comparisons to guide developers in choosing the appropriate string type for their applications.

Uploaded by

ss240102222
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
23 views18 pages

Cpp_Document

The document provides a comprehensive overview of memory unit conversions, particularly focusing on binary and decimal systems, and the internal workings of the C++ std::string class. It explains the differences between std::string and C-style strings (char[]), highlighting memory management, resizing capabilities, and performance metrics. Additionally, it includes practical use cases, code examples, and performance comparisons to guide developers in choosing the appropriate string type for their applications.

Uploaded by

ss240102222
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 18

Memory Unit Conversions

1 Standard Conversions (Base 2 - Used in Com-


puting)
The following table shows conversions based on the binary system:

Unit Equivalent in Bytes


1 KB (Kilobyte) 1024 bytes
1 MB (Megabyte) 1024 × 1024 = 1, 048, 576 bytes
1 GB (Gigabyte) 1024 × 1024 × 1024 = 1, 073, 741, 824 bytes
1 TB (Terabyte) 10243 = 1, 099, 511, 627, 776 bytes

Example 1: Convert 5 MB to bytes:

5 × 1024 × 1024 = 5, 242, 880 bytes

Example 2: Convert 256 MB to bytes:

256 × 10242 = 268, 435, 456 bytes

2 Approximate (Base 10 - Used in Storage)


- 1 KB ≈ 103 bytes (1, 000 bytes) - 1 MB ≈ 106 bytes (1, 000, 000 bytes) - 1 GB
≈ 109 bytes (1, 000, 000, 000 bytes) - 1 TB ≈ 1012 bytes (1, 000, 000, 000, 000
bytes)
Example: Convert 2 GB to bytes (Base 10):

2 × 109 = 2, 000, 000, 000 bytes

Shortcut: - Multiply by 1024 for binary memory (RAM, memory limits).


- Multiply by 1000 for decimal storage (HDD, SSD sizes).

3 Converting 2x to 10y (Power of Two to Power


of Ten)
A useful approximation:
2x ≈ 10x×0.301

1
Example 1: Convert 210 to 10y

10 × 0.301 = 3.01

210 ≈ 103
(Actual: 1024 ≈ 103 )
Example 2: Convert 232 to 10y

32 × 0.301 = 9.63

232 ≈ 109.63 ≈ 4.3 × 109


(Actual: 4, 294, 967, 296)
Shortcut: - Multiply exponent by ≈ 0.3 for a rough approximation. -
210 ≈ 103 , so every 10-bit increase ≈ 3 decimal places.

4 Quick Reference Table: 2x to 10y Approxima-


tion
2x Approx 10y
210 103 (1K)
220 106 (1M)
230 109 (1B)
240 1012 (1T)

5 Practical Use Cases


- Memory calculations in CP: Checking constraints like n ≤ 106 , converting
memory usage properly. - Binary vs. Decimal conversions: Knowing 2x ≈
10y helps in bitwise calculations.

6 Key Properties of std::string Storage


• Uses a dynamic character array internally.
• Allows resizing (expands when needed).
• Stores characters contiguously in memory.

• Supports random access (O(1) for indexing).


• Uses Small String Optimization (SSO) to avoid heap allocation for
small strings.

2
7 How std::string Works Internally
7.1 Basic Structure
Internally, std::string maintains:

1. A pointer to a character array (heap-allocated if the string is large).


2. Size (number of characters).
3. Capacity (allocated space for future growth).

Example:
# include < iostream >
# include < string >

using namespace std ;

int main () {
string s = " Hello " ;
cout << " Address of s : " << ( void *) s . data () << endl ;
}

Output Example:
Address of s: 0x5642d9f2a6f0
This address points to a contiguous array of characters.

7.2 Why Not a Linked List?


If std::string used a linked list:

• Random access (s[i]) would be O(n) (Bad for indexing).


• More memory overhead due to storing next pointers.
• Poor cache locality (slower performance).
Using a dynamic array allows:

• O(1) indexing (s[i]).


• ✓ Efficient cache usage (sequential memory access).
• ✓ Lower memory overhead.

3
8 Small String Optimization (SSO)
For short strings ( 15 characters or less), std::string stores characters
inside itself (without heap allocation).
Example:
# include < iostream >
# include < string >

int main () {
string small = " Hello " ; // Stored in stack ( SSO )
string large = " This is a very long string " ; // Stored in heap

cout << " Address of small : " << ( void *) small . data () << endl ;
cout << " Address of large : " << ( void *) large . data () << endl ;
}

Output Example:
Address of small: 0x7ffe5f2c34a8 (Stack memory - SSO)
Address of large: 0x5642d9f2a6f0 (Heap memory)

• **Small strings ( 15 chars)** stay inside std::string object (Stack).


• **Larger strings** use heap memory.

Why SSO? SSO avoids heap allocation for short strings, making oper-
ations faster.

9 Memory Layout of std::string


9.1 Case 1: Small String ("Hello")
Stack (Small String Optimization)
------------------------------------
| ’H’ | ’e’ | ’l’ | ’l’ | ’o’ | ’\0’ | Empty | ...
------------------------------------
(No heap allocation needed)

9.2 Case 2: Large String ("This is a long string")


Stack:
--------------------------------
| Pointer -> Heap Address |
| Size = 23 |
| Capacity = 32 |
--------------------------------

Heap (Dynamic Array):


--------------------------------

4
| ’T’ | ’h’ | ’i’ | ’s’ | ’ ’ | ... | ’g’ | ’\0’ |
--------------------------------

10 Difference Between std::string and char[]

Feature std::string char[] (C-Style String)


Storage Dynamic array (heap) Fixed array (stack)
Resizing ✓ Yes (+= works) ✗ No (fixed size)
Null-Termination Optional ✓ Required (\0)
Memory Safety ✓ Safe ✗ Prone to buffer overflow
Performance Optimized (SSO, caching) Less optimized

Example of char[] vs std::string


char cstr [] = " Hello " ; // Stored in stack
std :: string s = " Hello " ; // Uses dynamic a l l o c a t i o n ( or SSO )

• char[] is faster for fixed-size strings.

• std::string is more flexible & safer.

11 Summary
• ✓ std::string uses a dynamic array, not a linked list.

• ✓ Small strings ( 15 chars) stay in stack (SSO).


• ✓ Large strings use heap memory.
• ✓ Provides O(1) access, dynamic resizing, and memory safety.

12 Difference Between char[] and std::string


Both store characters in an array, but they differ in memory management,
flexibility, and safety.

12.1 C-Style String (char[])


• A fixed-size character array.
• Manually manages memory.
• Must be null-terminated (\0) to indicate the end of the string.

• Cannot be resized dynamically.

5
Example:
char cstr [] = " Hello " ; // Stored in stack
cout << cstr ; // Output : Hello

Memory Layout (Stack):


Stack:
--------------------------------
| ’H’ | ’e’ | ’l’ | ’l’ | ’o’ | ’\0’ |
--------------------------------
• Total size = 6 bytes (5 characters + 1 null terminator).
• If the null terminator is missing, it may cause undefined behavior.

12.2 C++ std::string


• A dynamic string (can grow/shrink).
• No need for null termination (\0).
• Automatic memory management (no manual malloc/free).
• Provides string manipulation functions (e.g., append(), substr(),
find()).
• Uses SSO (Small String Optimization) to store short strings inside
the object itself.
Example:
# include < iostream >
# include < string >

int main () {
std :: string s = " Hello " ; // Uses dynamic memory ( or SSO )
s += " World " ; // A u t o m a t i c a l l y resizes
std :: cout << s ; // Output : Hello World
}

Memory Layout (Heap for Large Strings, SSO for Small)


Stack (std::string Object):
--------------------------------
| Pointer to Heap (if large) |
| Size = 11 |
| Capacity = 15 |
--------------------------------

Heap (if large):


--------------------------------
| ’H’ | ’e’ | ’l’ | ’l’ | ’o’ | ’ ’ | ’W’ | ’o’ | ’r’ | ’l’ | ’d’ | ’\0’ |
--------------------------------

6
• Dynamic resizing happens automatically.
• No null-termination is required because std::string keeps track of
size.

13 Memory Usage of Strings


13.1 How Many Bytes Does Each Character Take?
• A normal character (char) takes 1 byte.
• A Unicode character (wchar t or UTF-8/UTF-16 encoding) may
take 2–4 bytes.

Example:
char ch = ’A ’; // 1 byte
std :: string s = " Hello " ; // Each c h a r a c t e r = 1 byte
std :: wstring ws = L " Hello " ; // Each c h a r a c t e r = 2+ bytes ( UTF -16)

13.2 Memory Usage of Strings

String Type Size Per Character Null-Termination? Resizable?


char[] (C-style) 1 byte ✓ Yes ✗ No
std::string 1 byte ✗ No ✓ Yes
std::wstring (Wide) 2–4 bytes ✗ No ✓ Yes

14 Key Differences Between char[] and std::string

Feature char[] (C-style) std::string (C++)


Storage Fixed-size array Dynamic array (heap)
Resizing ✗ No ✓ Yes (+= works)
Null-Termination ✓ Required (\0) ✗ Not required
Memory Safety ✗ Prone to buffer overflow ✓ Safe
Performance Faster for fixed-size Optimized (SSO, caching)

15 Which One Should You Use?


• Use char[] ONLY IF you:
– ✓ Know the exact size of the string.
– ✓ Need maximum speed with manual memory control.
– ✗ Don’t need dynamic resizing.
• Use std::string in most cases because:

7
– ✓ It’s safer (avoids buffer overflow).
– ✓ It’s easier (automatic resizing, no need to manage memory).
– ✓ It’s optimized (SSO makes small strings very fast).

16 Final Summary
• ✓ Both char[] and std::string store characters in an array, NOT
linked lists.
• ✓ Each character usually takes 1 byte (char) unless using Unicode.

• ✓ C-style strings require \0, but std::string does not.


• ✓ C-style strings are faster for fixed sizes, but std::string is
better for dynamic strings.

17 Performance Metrics to Compare


We will compare:
• String creation
• Concatenation speed

• Iteration over characters


• Memory usage

18 Code for Performance Comparison


C++ Benchmark Code:
# include < iostream >
# include < chrono >
# include < string >
# include < cstring > // For C - style strings

using namespace std ;


using namespace chrono ;

const int N = 1000000; // 1 million c h a r a c t e r s

void testCString () {
auto start = h i g h _ r e s o l u t i o n _ c l o c k :: now () ;

char * str = new char [ N + 1]; // A l l o c a t e memory for N


c h a r a c t e r s + ’\0 ’
for ( int i = 0; i < N ; i ++) {
str [ i ] = ’a ’; // Fill with ’a ’
}

8
str [ N ] = ’ \0 ’; // Null t e r m i n a t i o n

// I t e r a t i n g over the string


long long sum = 0;
for ( int i = 0; i < N ; i ++) {
sum += str [ i ]; // Dummy o p e r a t i o n to prevent o p t i m i z a t i o n
}

delete [] str ; // Free memory

auto end = h i g h _ r e s o l u t i o n _ c l o c k :: now () ;


cout << "C - style string execution time : "
<< duration_cast < milliseconds >( end - start ) . count ()
<< " ms " << endl ;
}

void testCppString () {
auto start = h i g h _ r e s o l u t i o n _ c l o c k :: now () ;

string str (N , ’a ’) ; // Create a string with N ’a ’ c h a r a c t e r s

// I t e r a t i n g over the string


long long sum = 0;
for ( int i = 0; i < N ; i ++) {
sum += str [ i ]; // Dummy o p e r a t i o n to prevent o p t i m i z a t i o n
}

auto end = h i g h _ r e s o l u t i o n _ c l o c k :: now () ;


cout << " std :: string execution time : "
<< duration_cast < milliseconds >( end - start ) . count ()
<< " ms " << endl ;
}

int main () {
testCString () ;
testCppString () ;
return 0;
}

19 Expected Results
Operation char[] (C-Style) std::string
String Creation ✓ Faster (Direct memory allocation) ✗ Slightly slower (Dynamic management)
Concatenation ✗ Slower (strcpy & reallocation) ✓ Faster (+= optimization)
Iteration ✓ Fast (Direct memory access) ✓ Almost the same speed
Memory Efficiency ✓ Less overhead (no extra capacity) ✗ Extra capacity allocated (for resizing)
Ease of Use ✗ Harder (manual memory management) ✓ Easier (automatic memory management)

20 Why std::string is More Efficient for Large


Operations
• std::string preallocates extra memory to prevent frequent realloca-
tion during appends.

9
• std::string avoids strcpy overhead in concatenation.
• std::string uses Small String Optimization (SSO), making small
strings faster than char[].

21 When to Use What?


Use char[] if... Use std::string if...
✓ Performance is critical (e.g., embedded systems). ✓ You need dynamic resizing.
✓ You know the exact string size. ✓ You want memory safety.
✓ You want zero overhead (no extra memory). ✓ You want built-in functions (find(), substr(), etc.).

22 Conclusion
• For fixed-size, performance-critical applications → Use char[].
• For general-purpose, safer, and flexible string handling → Use
std::string.

23 1. Comparison Table

Feature Array (int arr[N]) Vector (vector<int> v)


Size Fixed at compile-time Dynamic (can grow/shrink)
Memory Allocation Stack (default) Heap (dynamic)
Resizing No Yes
Bound Checking No Yes (v.at(i))
Insertion/Deletion Manual shifting required Easy (push back(), pop back())
Ease of Use Manual size management STL functions available

24 2. Why Competitive Programmers Prefer


Vectors
24.1 1. Dynamic Resizing (No Need to Predefine Size)
vector allows automatic resizing, while arrays require a fixed size at compile-
time.
vector < int > v ;
for ( int i = 0; i < 5; i ++) {
v . push_back ( i ) ; // No need to p r e d e f i n e size
}

10
24.2 2. Heap Allocation (Avoids Stack Overflow)
If constraints are large (N ≤ 106 or 107 ), arrays risk stack overflow, while vectors
use heap memory.
// Large array ( Risky !)
int arr [1000000];

// Large vector ( Safe !)


vector < int > v (1000000) ;

24.3 3. Built-in STL Functions for Faster Code


Vectors provide built-in functions that simplify operations.
vector < int > v = {5 , 2 , 8 , 1};
sort ( v . begin () , v . end () ) ; // Sorting in O ( N log N )
cout << " Size : " << v . size () ;

24.4 4. Safe Memory Management (Avoids Segmentation


Faults)
Accessing out-of-bounds indices in arrays causes undefined behavior, whereas
vectors handle it safely.
vector < int > v (5) ;
cout << v . at (10) ; // Throws an error ( safe )

25 3. When to Use Arrays vs. Vectors

Use Arrays If... Use Vectors If...


Fixed-size storage is required Size needs to be dynamic
Memory efficiency is critical Memory management should be automatic
You need low-level memory access STL functions like sort(), find() are useful

26 4. Code Comparison
26.1 Array Version (More Manual Work)
# include < iostream >
# include < algorithm >
using namespace std ;

int main () {
int arr [] = {3 , 1 , 4 , 1 , 5};
int N = sizeof ( arr ) / sizeof ( arr [0]) ;

11
sort ( arr , arr + N ) ;

cout << arr [0]; // Output : 1


}

26.2 Vector Version (Cleaner & Safer)


# include < iostream >
# include < vector >
# include < algorithm >
using namespace std ;

int main () {
vector < int > v = {3 , 1 , 4 , 1 , 5};
sort ( v . begin () , v . end () ) ;

cout << v [0]; // Output : 1


}

27 5. Conclusion
• Arrays (int arr[N]) → Faster but fixed-size, stack-allocated, error-
prone.
• Vectors (vector<int> v) → Dynamic, safer, STL-supported, preferred
in CP.

For large constraints in competitive programming, vectors are the best


choice due to flexibility, safety, and built-in STL support.

28 Initialization Functions
28.1 Creating a Vector
vector < int > v1 ; // Empty vector
vector < int > v2 (5) ; // Vector of size 5 , i n i t i a l i z e d with
default values (0)
vector < int > v3 (5 , 100) ; // Vector of size 5 , i n i t i a l i z e d with 100
vector < int > v4 = {1 , 2 , 3}; // Vector i n i t i a l i z e d with given
values

28.2 Assigning Values (assign)


vector < int > v ;
v . assign (5 , 10) ; // {10 , 10 , 10 , 10 , 10}

12
29 Size and Capacity Functions
29.1 Checking Size (size)
vector < int > v = {1 , 2 , 3 , 4 , 5};
cout << v . size () ; // Output : 5

29.2 Checking Capacity (capacity)


vector < int > v ;
cout << v . capacity () ; // Output : varies

29.3 Checking If Empty (empty)


vector < int > v ;
cout << v . empty () ; // Output : 1 ( true )

29.4 Resizing (resize)


vector < int > v = {1 , 2 , 3};
v . resize (5) ; // {1 , 2 , 3 , 0 , 0}
v . resize (7 , 100) ; // {1 , 2 , 3 , 0 , 0 , 100 , 100}

30 Element Access Functions


30.1 Accessing Elements ([] and .at())
vector < int > v = {10 , 20 , 30};
cout << v [1]; // Output : 20
cout << v . at (2) ; // Output : 30 ( Safe , checks bounds )

30.2 First and Last Elements (front, back)


vector < int > v = {5 , 10 , 15};
cout << v . front () ; // Output : 5
cout << v . back () ; // Output : 15

13
31 Modifying Functions
31.1 Adding Elements
vector < int > v ;
v . push_back (10) ; // {10}
v . push_back (20) ; // {10 , 20}

31.2 Removing Elements


vector < int > v = {10 , 20 , 30};
v . pop_back () ; // {10 , 20}

31.3 Inserting at Specific Position


vector < int > v = {1 , 2 , 4};
v . insert ( v . begin () + 2 , 3) ; // {1 , 2 , 3 , 4}

31.4 Removing a Specific Element


vector < int > v = {1 , 2 , 3 , 4 , 5};
v . erase ( v . begin () + 2) ; // Removes 3 , result : {1 , 2 , 4 , 5}

31.5 Clearing a Vector


vector < int > v = {1 , 2 , 3};
v . clear () ; // Becomes empty

32 Searching and Sorting


32.1 Sorting a Vector (sort)
vector < int > v = {5 , 1 , 4 , 3 , 2};
sort ( v . begin () , v . end () ) ; // {1 , 2 , 3 , 4 , 5}

32.2 Reversing a Vector (reverse)


reverse ( v . begin () , v . end () ) ;

14
32.3 Finding an Element (find)
vector < int > v = {10 , 20 , 30};
if ( find ( v . begin () , v . end () , 20) != v . end () )
cout << " Found ! " ;

33 Advanced Functions
33.1 Swapping Two Vectors (swap)
vector < int > a = {1 , 2 , 3};
vector < int > b = {10 , 20 , 30};
a . swap ( b ) ; // Now a = {10 , 20 , 30} , b = {1 , 2 , 3}

33.2 Shrinking Capacity (shrink to fit)


vector < int > v (100) ;
v . resize (10) ;
v . shrink_to_fit () ; // Reduces c a p a c i t y to fit the new size

34 Iterators in Vectors
34.1 Iterating Over a Vector
vector < int > v = {1 , 2 , 3};
for ( auto it = v . begin () ; it != v . end () ; it ++)
cout << * it << " " ;

34.2 Using Range-Based Loops


for ( int x : v )
cout << x << " " ;

35 Summary of Vector Functions


Category Function Description
Initialization vector<int> v(n, val); Creates vector of size n with val
Size & Capacity size(), capacity(), resize(), empty() Check or modify vector size
Element Access [], at(), front(), back() Access elements
Modifications push back(), pop back(), insert(), erase(), clear() Modify elements
Sorting & Searching sort(), reverse(), find() Sort and search elements
Iterators begin(), end() Iterate over vector

When passing arrays or vectors to functions, we use & (reference) or *


(pointer) to avoid unnecessary memory copies and modify the original data
efficiently.

15
36 Passing an Array to a Function
In C++, arrays cannot be passed by value because they decay into pointers.
Instead, they are passed as pointers (*).

36.1 Pass by Pointer (*)


# include < iostream >
using namespace std ;

void modifyArray ( int * arr , int size ) { // Pointer to first element


for ( int i = 0; i < size ; i ++) {
arr [ i ] *= 2; // Modify o r i g i n a l array
}
}

int main () {
int arr [] = {1 , 2 , 3 , 4 , 5};
int size = sizeof ( arr ) / sizeof ( arr [0]) ;

modifyArray ( arr , size ) ; // Passing pointer


for ( int i : arr ) cout << i << " " ; // Output : 2 4 6 8 10
}

Why use * (pointer)?


• Arrays decay into pointers when passed to functions.
• Allows modifying the original array.
• Saves memory (only pointer is passed, not a copy of the array).

36.2 Pass by Reference (&)


C++ does not allow direct array references, but we can use a reference to an
array:
void modifyArray ( int (& arr ) [5]) { // R e f e r e n c e to a fixed - size
array
arr [0] = 100; // M o d i f i e s o r i g i n a l array
}

Why use &?


• Guarantees that the function gets the exact array size ([5] must match).
• No need to pass size separately.

37 Passing a vector to a Function


Vectors are objects (not pointers), so they can be passed by value, reference
(&), or pointer (*).

16
37.1 Pass by Value (Makes a Copy)
# include < iostream >
# include < vector >
using namespace std ;

void modifyVector ( vector < int > v ) { // Copy of vector


v [0] = 100; // M o d i f i e s copy , NOT the o r i g i n a l
}

int main () {
vector < int > v = {1 , 2 , 3 , 4 , 5};
modifyVector ( v ) ;
cout << v [0]; // Output : 1 ( u n c h a n g e d )
}

Bad for large vectors → Creates a copy (slow for large data).

37.2 Pass by Reference (&)


void modifyVector ( vector < int > & v ) { // R e f e r e n c e to o r i g i n a l vector
v [0] = 100; // M o d i fi e s o r i g i n a l vector
}

int main () {
vector < int > v = {1 , 2 , 3 , 4 , 5};
modifyVector ( v ) ;
cout << v [0]; // Output : 100 ( m o d i f i e d )
}

Why use & (reference)?


• No extra memory copy.

• Modifies the original vector.

37.3 Pass by Pointer (*)


void modifyVector ( vector < int > * v ) { // Pointer to vector
(* v ) [0] = 100; // Must use * v to access e l e m e n t s
}

int main () {
vector < int > v = {1 , 2 , 3 , 4 , 5};
modifyVector (& v ) ; // Pass address
cout << v [0]; // Output : 100
}

Why use * (pointer)?


• Explicitly shows modification of original data.
• Can handle nullptr (null safety checks).

17
38 Summary Table
Passing Type Syntax (Array) Syntax (Vector) Modifies Original? Memory Efficient?
By Value ✗ Not possible void func(vector<int> v) ✗ No ✗ No (Copy created)
By Pointer (*) void func(int *arr, int size) void func(vector<int> *v) ✓ Yes ✓ Yes
By Reference (&) void func(int (&arr)[N]) void func(vector<int> &v) ✓ Yes ✓ Yes

39 Which One Should You Use?


39.1 For Arrays
• Use int *arr for flexibility (pass arrays of any size).
• Use int (&arr)[N] when size is fixed and known.

39.2 For Vectors


• Use vector<int> &v (Most Common, Best for CP).

• Use vector<int> *v only if NULL checking is needed.

Would you like performance benchmarks comparing these methods?

18

You might also like