MD Lab 2

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 18

TECHNICAL UNIVERSITY OF MOLDOVA

FACULTY OF COMPUTERS, INFORMATICS AND MICROELECTRONICS


SPECIALITY SOFTWARE ENGINEERING

Report
on
Discrete Mathematics

Laboratory Work nr.2

Performed: st. gr.


FAF-212
Maia
Zaica

Verified: Cristofor Fiștic

Chişinău 2021
LABORATORY WORK Nr.2
Problem condition:
1. Random vulnerability

You've stumbled onto a significant vulnerability in a commonly used cryptographic library. It


turns out that the random number generator it uses frequently produces the same primes
when it is generating keys.
Exploit this knowledge to factor the (hexadecimal) keys below, and enter your answer as the
last six digits of the largest factor you find (in decimal).

Key 1:
1c7bb1ae67670f7e6769b515c174414278e16c27e95b43a789099a1c7d55c717b2f0a0442a7d4
9503ee09552588ed9bb6eda4af738a02fb31576d78ff72b2499b347e49fef1028182f158182a0b
a504902996ea161311fe62b86e6ccb02a9307d932f7fa94cde410619927677f94c571ea39c7f41
05fae00415dd7d
Key 2:
2710e45014ed7d2550aac9887cc18b6858b978c2409e86f80bad4b59ebcbd90ed18790fc56f53f
fabc0e4a021da2e906072404a8b3c5555f64f279a21ebb60655e4d61f4a18be9ad389d8ff05b99
4bb4c194d8803537ac6cd9f708e0dd12d1857554e41c9cbef98f61c5751b796e5b37d338f5d9b
3ec3202b37a32f

Using Euclidean Algorithm we created function gcd(x, y) it will return the greatest common factor of two numbers.
With it we will get the gcd of key1 and key2, we convert the key into decimal numbers with the help of int() function.
The int() function can help in converting different values to decimal integers. It typecasts the hexadecimal string to its
corresponding integer value. To achieve this, we have to pass the number and its base to convert it into an integer.
print(str(hcf)[-6:], "\n")
To print the last 6 digit we transform it into a string and with square brackets and given index -6: we get the last 6
characters.
def gcd(x, y) -> int:
while(y):
x, y = y, x % y
return x

key1 =
"1c7bb1ae67670f7e6769b515c174414278e16c27e95b43a789099a1c7d55c717b2f0a0442a7d49503ee095
52588ed9bb6eda4af738a02fb31576d78ff72b2499b347e49fef1028182f158182a0ba504902996ea161311
fe62b86e6ccb02a9307d932f7fa94cde410619927677f94c571ea39c7f4105fae00415dd7d"
key2 =
"2710e45014ed7d2550aac9887cc18b6858b978c2409e86f80bad4b59ebcbd90ed18790fc56f53ffabc0e4a
021da2e906072404a8b3c5555f64f279a21ebb60655e4d61f4a18be9ad389d8ff05b994bb4c194d8803537a
c6cd9f708e0dd12d1857554e41c9cbef98f61c5751b796e5b37d338f5d9b3ec3202b37a32f"

dec1 = int(key1, 16)


dec2 = int(key2, 16)
hcf = gcd(dec1, dec2)
print(str(hcf)[-6:], "\n")

print(dec1//hcf)
print(dec2//hcf)
print("Highest common factor: ", hcf)

Problem condition:
2. RSA
Make a program that will implement the RSA algorithm made by you and with help of this
program you can encrypt any string and then decrypt it.
Rule:
You can’t use libraries, just remember the algorithm that you study and implement it.

Solution:
RSA algorithm is asymmetric cryptography algorithm. Asymmetric actually means that it works on two different keys
i.e. Public Key and Private Key. As the name describes that the Public Key is given to everyone and Private key is
kept private.
In order to encrypt a message with RSA, one must choose two large prime numbers p and q,
Let n = p · q.
The encryption key is a pair of integers (e, n) and the decryption key is a pair (d, n). Given a message m, in order to
encrypt it, we would represent it as a number between 0 and
n − 1.

e ¿ ¿ e
E( m)=m ≡ m (mod n) ⇔ m =rem(m , n)
¿ ¿ d d
D(m )=(m ) ≡m(mod n) ⇔ m=rem ( ( m¿ ) ,n)

Integers e and d are closely related to numbers p and q. Choose e to be any large random integer that is relatively
prime to ( p−1)(q−1). Then d will be the multiplicative inverse of e modulo( p−1) ¿):
e · d ≡1(mod ( p−1)(q−1)) .

The keys for the RSA algorithm are generated in the following way:
1. Select two large prime numbers, x and y. The prime numbers need to be large so that they will be difficult for
someone to figure out.
 For security purposes, the integers p and q should be chosen at random and should be similar in
magnitude but differ in length by a few digits to make factoring harder.[2] Prime integers can be
efficiently found using a primality test.
 p and q are kept secret.
2. Calculate n=q ∙ p .
 n is used as the modulus for both the public and private keys. Its length, usually expressed in bits, is
the key length.
 n is released as part of the public key.
3. Calculate the Carmichael's totient function: φ ( n )=( p−1) ∙(q−1).
4. Select an integer e , such that e  is co-prime to φ ( n )and 1<e <ϕ (n). The pair of numbers (n , e)  makes up the
public key.
 e is released as part of the public key.
5. Determine d asd ≡e−1( mod φ(n)); that is, d is the modular multiplicative inverse of e modulo φ (n).
 This means: solve for d the equation e ∙ d ≡1 mod φ( n); d can be computed efficiently by using the
extended Euclidean algorithm, since, thanks to e and φ (n)being coprime, said equation is a form of
Bézout's identity, where d is one of the coefficients.
 d is kept secret as the private key exponent.

d can be found using the Extended Euclidean algorithm. The pair (n , d ) makes up the private key.
We input p and q, the prime check functions checks if the input are prime numbers. Then we get the public and private
key. User enters the message, then is asked if he want to encrypt or decrypt.
import math

# Input Prime Numbers


print("PLEASE ENTER THE 'p' AND 'q' VALUES BELOW:")
p = int(input("Enter a prime number for p: "))
q = int(input("Enter a prime number for q: "))
print("*****************************************************")

# Check if Input's are Prime


'''THIS FUNCTION AND THE CODE IMMEDIATELY BELOW THE FUNCTION CHECKS WHETHER THE INPUTS
ARE PRIME OR NOT.'''

def prime_check(a):
if (a == 2):
return True
elif ((a < 2) or ((a % 2) == 0)):
return False
elif (a > 2):
for i in range(2, a):
if not (a % i):
return False
return True

check_p = prime_check(p)
check_q = prime_check(q)
while (((check_p == False) or (check_q == False))):
p = int(input("Enter a prime number for p: "))
q = int(input("Enter a prime number for q: "))
check_p = prime_check(p)
check_q = prime_check(q)

# RSA Modulus
'''CALCULATION OF RSA MODULUS 'n'.'''
n = p * q
print("RSA Modulus(n) is:", n)

# Eulers Toitent
'''CALCULATION OF EULERS TOTIENT 'r'.'''
r = (p - 1) * (q - 1)
print("Eulers Totient(r) is:", r)
print("*****************************************************")

# GCD
'''CALCULATION OF GCD FOR 'e' CALCULATION.'''

def egcd(e, r):


while (r != 0):
e, r = r, e % r
return e

# Euclid's Algorithm
def eugcd(e, r):
for i in range(1, r):
while (e != 0):
a, b = r // e, r % e
if (b != 0):
print("%d = %d*(%d) + %d" % (r, a, e, b))
r = e
e = b

# Extended Euclidean Algorithm


def eea(a, b):
if (a % b == 0):
return (b, 0, 1)
else:
gcd, s, t = eea(b, a % b)
s = s - ((a // b) * t)
print("%d = %d*(%d) + (%d)*(%d)" % (gcd, a, t, s, b))
return (gcd, t, s)

# Multiplicative Inverse
def mult_inv(e, r):
gcd, s, _ = eea(e, r)
if (gcd != 1):
return None
else:
if (s < 0):
print("s=%d. Since %d is less than 0, s = s(modr), i.e., s=%d." % (s, s, s
% r))
elif (s > 0):
print("s=%d." % (s))
return s % r
# e Value Calculation
'''FINDS THE HIGHEST POSSIBLE VALUE OF 'e' BETWEEN 1 and 1000 THAT MAKES (e,r)
COPRIME.'''
for i in range(1, 1000):
if (egcd(i, r) == 1):
e = i
print("The value of e is:", e)
print("*****************************************************")

# d, Private and Public Keys


'''CALCULATION OF 'd', PRIVATE KEY, AND PUBLIC KEY.'''
print("EUCLID'S ALGORITHM:")
eugcd(e, r)
print("END OF THE STEPS USED TO ACHIEVE EUCLID'S ALGORITHM.")
print("*****************************************************")
print("EUCLID'S EXTENDED ALGORITHM:")
d = mult_inv(e, r)
print("END OF THE STEPS USED TO ACHIEVE THE VALUE OF 'd'.")
print("The value of d is:", d)
print("*****************************************************")
public = (e, n)
private = (d, n)
print("Private Key is:", private)
print("Public Key is:", public)
print("*****************************************************")

# Encryption
'''ENCRYPTION ALGORITHM.'''

def encrypt(pub_key, n_text):


e, n = pub_key
x = []
m = 0
for i in n_text:
if (i.isupper()):
m = ord(i) - 65
c = (m ** e) % n
x.append(c)
elif (i.islower()):
m = ord(i) - 97
c = (m ** e) % n
x.append(c)
elif (i.isspace()):
spc = 400
x.append(400)
return x

# Decryption
'''DECRYPTION ALGORITHM'''

def decrypt(priv_key, c_text):


d, n = priv_key
txt = c_text.split(',')
x = ''
m = 0
for i in txt:
if (i == '400'):
x += ' '
else:
m = (int(i) ** d) % n
m += 65
c = chr(m)
x += c
return x
# Message
message = input("What would you like encrypted or decrypted?(Separate numbers with ','
for decryption):")
print("Your message is:", message)

# Choose Encrypt or Decrypt and Print


choose = input("Type '1' for encryption and '2' for decrytion.")
if (choose == '1'):
enc_msg = encrypt(public, message)
print("Your encrypted message is:", enc_msg)
print("Thank you for using the RSA Encryptor. Goodbye!")
elif (choose == '2'):
print("Your decrypted message is:", decrypt(private, message))
print("Thank you for using the RSA Encryptor. Goodbye!")
else:
print("You entered the wrong option.")
print("Thank you for using the RSA Encryptor. Goodbye!")

the code bellow, generates random large prime numbers.

import random
from random import randrange, getrandbits

def is_prime(n, k=64):


if n == 2 or n == 3:
return True
if n <= 1 or n % 2 == 0:
return False
s = 0
r = n - 1
while r & 1 == 0:
s += 1
r //= 2
for _ in range(k):
a = randrange(2, n - 1)
x = pow(a, r, n)
if x != 1 and x != n - 1:
j = 1
while j < s and x != n - 1:
x = pow(x, 2, n)
if x == 1:
return False
j += 1
if x != n - 1:
return False
return True

def generate_prime_candidate(length):
p = getrandbits(length)
p |= (1 << length - 1) | 1
return p

def generate_prime_number(length=512):
p = 4
while not is_prime(p, 64):
p = generate_prime_candidate(length)
return p

p, q = generate_prime_number(), generate_prime_number()

Problem condition:

3. Matrix
You have a set of 20 people connected via a friendship matrix. The whole list is given in matrix.txt..

3.1 Friends
Find the person with the most friends.

Created an array from the file matrix.txt given in condition. Calculating the sum of each row or each column will give
us the number of friends for each person.
list = []
for i in range(len(matrix)):
x = sum(matrix[i])
list.append(x)
So calculated the sum of each line and put it into a list.
def max_friends(list):
max_value = max(list)
if list.count(max_value) > 1:
return [i for i, x in enumerate(list) if x == max(list)]
else:
return list.index(max(list))
Created a function which takes the previous list as parameter to return the max number of friends.
output = ""
for i in max_friends(list):
output += names.get(i) + "\n"
print(output)
Created a dictionary to store the names, with for loop and calling max_friends() function the people with most friends
are displayed.
matrix = [[0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1],
[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1],
[0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1],
[0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

# sum of each row


list = []
for i in range(len(matrix)):
x = sum(matrix[i])
list.append(x)

# maximum sum
def max_friends(list):
max_value = max(list)
if list.count(max_value) > 1:
return [i for i, x in enumerate(list) if x == max(list)]
else:
return list.index(max(list))

names = {
0 : "Caleb Hobby",
1 : "Alta Kennan",
2 : "Corrin Tally",
3 : "Leandro Eagan",
4 : "Otilia Laxson",
5 : "Ellie Francese",
6 : "Augustine Golub",
7 : "Elinore Orsborn",
8 : "Clarence Stalker",
9 : "Lili Houghton",
10 : "Monet Mccoy",
11 : "Angila Ellinger",
12 : "Sammie Womac",
13 : "Tiny Parkhurst",
14 : "Pearlie Moffet",
15 : "Cruz Perna",
16 : "Rebbecca Charlton",
17 : "Marita Tegeler",
18 : "Jarred Marrow",
19 : "Lorean Simcox"
}

output = ""
for i in max_friends(list):
output += names.get(i) + "\n"
print("1. Most friends have:")
print(output)

3.2 Sort
Sort all the people by the number of friends.
We sort the people that have the least to the most friends, and get the indices of the sorted array.
ascending = []

for i in range(len(list)):
ascending.append([list[i], i])
ascending.sort()
sort_index = []

for x in ascending:
sort_index.append(x[1])

new_list = []
while list:
min = list[0]
for j in list:
if j < min:
min = j
new_list.append(min)
list.remove(min)

print("2. Sorted by the number of friends:")


sort = [names.get(i) + " " + str(j) for i, j in zip(sort_index, new_list)]

for person in sort:


print(person)

3.3 Let's do ratings


How to do that? Well, each person in the graph is connected to everyone else at some level.
Therefore, each person will have a list of connections which is as long as the total list of
people in the graph (in our case, 20). You then have to compute the shortest path from each
of the nodes to each of the other nodes.
For example, let’s say that you found that from node 0 you can reach node 3 in 5 steps (that
is, the shortest path connecting nodes 0 and 3 has 5 steps). That means that node 3 will be
a connection of level 5 to node 0 and will therefore contribute to 0 with 4 points.
As a procedure, you can take each item n and then compute the distances between n and all
the other vertices of the graph. You can use these distances to compute the value that is
added by each of the other n − 1 vertices to n. Sum it and you’ll have the value of vertex n.
In order to find the shortest path between two vertices, you’ll have to use Dijkstra's algorithm.
You can find plenty of implementations of that algorithm online.
Compute the points for each person in our network. Let’s call it ”Rating”

class Graph():
Given a graph and a source vertex in the graph, find the shortest paths from the source to all vertices in the given
graph.
Algorithm
1) Create a set sptSet (shortest path tree set) that keeps track of vertices included in the shortest-path tree, i.e., whose
minimum distance from the source is calculated and finalized. Initially, this set is empty.
2) Assign a distance value to all vertices in the input graph. Initialize all distance values as INFINITE. Assign distance
value as 0 for the source vertex so that it is picked first.
3) While sptSet doesn’t include all vertices
….a) Pick a vertex u which is not there in sptSet and has a minimum distance value.
….b) Include u to sptSet.
….c) Update distance value of all adjacent vertices of u. To update the distance values, iterate through all adjacent
vertices. For every adjacent vertex v, if the sum of distance value of u (from source) and weight of edge u-v, is less
than the distance value of v, then update the distance value of v.
With it we get the distances from each vertex to the rest of it. We make an array called distance. To calculate the
rating we need to sum of distances.
import sys

class Graph():

def __init__(self, vertices):


self.V = vertices
self.graph = [[0 for column in range(vertices)]
for row in range(vertices)]

def printSolution(self, dist):


dis = []
for node in range(self.V):
dis.append(dist[node])
print(dis)

# A utility function to find the vertex with


# minimum distance value, from the set of vertices
# not yet included in shortest path tree
def minDistance(self, dist, sptSet):

# Initialize minimum distance for next node


min = sys.maxsize

# Search not nearest vertex not in the


# shortest path tree
for v in range(self.V):
if dist[v] < min and sptSet[v] == False:
min = dist[v]
min_index = v

return min_index

# Function that implements Dijkstra's single source


# shortest path algorithm for a graph represented
# using adjacency matrix representation
def dijkstra(self, src):

dist = [sys.maxsize] * self.V


dist[src] = 0
sptSet = [False] * self.V

for cout in range(self.V):

# Pick the minimum distance vertex from


# the set of vertices not yet processed.
# u is always equal to src in first iteration
u = self.minDistance(dist, sptSet)

# Put the minimum distance vertex in the


# shortest path tree
sptSet[u] = True

# Update dist value of the adjacent vertices


# of the picked vertex only if the current
# distance is greater than new distance and
# the vertex in not in the shortest path tree
for v in range(self.V):
if self.graph[u][v] > 0 and sptSet[v] == False and dist[v] > dist[u] +
self.graph[u][v]:
dist[v] = dist[u] + self.graph[u][v]

self.printSolution(dist)

g = Graph(20)
g.graph =[[0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1],
[0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1],
[0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1],
[0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

for i in range(20):
g.dijkstra(i)

distance=[[0, 3, 2, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 1, 1],
[3, 0, 1, 2, 4, 2, 2, 4, 2, 2, 3, 2, 1, 3, 2, 2, 3, 2, 4, 2],
[2, 1, 0, 1, 3, 1, 1, 3, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 3, 1],
[2, 2, 1, 0, 3, 1, 1, 3, 2, 2, 1, 2, 2, 2, 2, 1, 2, 1, 3, 1],
[1, 4, 3, 3, 0, 2, 3, 1, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 2],
[1, 2, 1, 1, 2, 0, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 1],
[2, 2, 1, 1, 3, 1, 0, 3, 2, 2, 2, 1, 1, 1, 1, 1, 2, 1, 3, 1],
[1, 4, 3, 3, 1, 2, 3, 0, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 2],
[2, 2, 1, 2, 3, 2, 2, 3, 0, 2, 3, 1, 1, 3, 2, 1, 2, 1, 3, 1],
[2, 2, 1, 2, 3, 1, 2, 3, 2, 0, 1, 2, 1, 1, 1, 2, 3, 2, 3, 2],
[2, 3, 2, 1, 3, 1, 2, 3, 3, 1, 0, 2, 2, 1, 1, 2, 3, 2, 3, 2],
[2, 2, 1, 2, 3, 1, 1, 3, 1, 2, 2, 0, 2, 2, 2, 1, 2, 2, 3, 2],
[2, 1, 2, 2, 3, 1, 1, 3, 1, 1, 2, 2, 0, 2, 1, 2, 3, 2, 3, 2],
[2, 3, 2, 2, 3, 1, 1, 3, 3, 1, 1, 2, 2, 0, 2, 2, 3, 2, 3, 2],
[1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 1, 2, 1, 2, 0, 2, 2, 2, 2, 2],
[2, 2, 1, 1, 3, 2, 1, 3, 1, 2, 2, 1, 2, 2, 2, 0, 1, 2, 3, 2],
[1, 3, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 3, 3, 2, 1, 0, 3, 2, 2],
[2, 2, 1, 1, 3, 1, 1, 3, 1, 2, 2, 2, 2, 2, 2, 2, 3, 0, 3, 2],
[1, 4, 3, 3, 2, 2, 3, 2, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 0, 2],
[1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0]]

list = []
for i in range(len(distance)):
x = sum(distance[i])
list.append(x)

names = {
0 : "Caleb Hobby",
1 : "Alta Kennan",
2 : "Corrin Tally",
3 : "Leandro Eagan",
4 : "Otilia Laxson",
5 : "Ellie Francese",
6 : "Augustine Golub",
7 : "Elinore Orsborn",
8 : "Clarence Stalker",
9 : "Lili Houghton",
10 : "Monet Mccoy",
11 : "Angila Ellinger",
12 : "Sammie Womac",
13 : "Tiny Parkhurst",
14 : "Pearlie Moffet",
15 : "Cruz Perna",
16 : "Rebbecca Charlton",
17 : "Marita Tegeler",
18 : "Jarred Marrow",
19 : "Lorean Simcox"
}

for (i, item) in enumerate(list, start = 0):


print(names.get(i) + ": ", item)

3.4 Influential people


Let’s say that each of these people has a certain rate of posting content. Obviously, people who
communicate more are much more influential. Suppose that you need to promote a new brand using social
media. We found out how often each of these 20 people writes something on their walls. You can find it in
influence.txt Whom of these people will you contact? Why? Be advised that not only the frequency of
posting matters, but also the number of friends! Use the data from the previous exercise and find the new
”Rating” for each person by multiplying it with 0.5 of the posting rate. Please sort the people by the newly
computed rating.

Solution:
Using the data from previous exercise.
rate = []
with open("influence.txt",'r') as data_file:
for line in data_file:
data = line.split()
rate.append(data[3])
we put in a list the “influence” of each person
products = []
for x, y in zip(list, rate):
products.append(0.5*float(x)*float(y))
calculate their new rating and put it into a list
ascending = []
for i in range(len(products)):
ascending.append([products[i], i])
ascending.sort()
sort_index = []
for x in ascending:
sort_index.append(x[1])

new_list = []
while products:
min = products[0]
for j in products:
if j < min:
min = j
new_list.append(min)
products.remove(min)

print("Rating in ascending order:\n")


sort = [names.get(i) + " " + str(j) for i, j in zip(sort_index, new_list)]
for person in sort:
print(person)
We are putting in an ascending order.
distance=[[0, 3, 2, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 1, 1],
[3, 0, 1, 2, 4, 2, 2, 4, 2, 2, 3, 2, 1, 3, 2, 2, 3, 2, 4, 2],
[2, 1, 0, 1, 3, 1, 1, 3, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 3, 1],
[2, 2, 1, 0, 3, 1, 1, 3, 2, 2, 1, 2, 2, 2, 2, 1, 2, 1, 3, 1],
[1, 4, 3, 3, 0, 2, 3, 1, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 2],
[1, 2, 1, 1, 2, 0, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 1],
[2, 2, 1, 1, 3, 1, 0, 3, 2, 2, 2, 1, 1, 1, 1, 1, 2, 1, 3, 1],
[1, 4, 3, 3, 1, 2, 3, 0, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 2],
[2, 2, 1, 2, 3, 2, 2, 3, 0, 2, 3, 1, 1, 3, 2, 1, 2, 1, 3, 1],
[2, 2, 1, 2, 3, 1, 2, 3, 2, 0, 1, 2, 1, 1, 1, 2, 3, 2, 3, 2],
[2, 3, 2, 1, 3, 1, 2, 3, 3, 1, 0, 2, 2, 1, 1, 2, 3, 2, 3, 2],
[2, 2, 1, 2, 3, 1, 1, 3, 1, 2, 2, 0, 2, 2, 2, 1, 2, 2, 3, 2],
[2, 1, 2, 2, 3, 1, 1, 3, 1, 1, 2, 2, 0, 2, 1, 2, 3, 2, 3, 2],
[2, 3, 2, 2, 3, 1, 1, 3, 3, 1, 1, 2, 2, 0, 2, 2, 3, 2, 3, 2],
[1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 1, 2, 1, 2, 0, 2, 2, 2, 2, 2],
[2, 2, 1, 1, 3, 2, 1, 3, 1, 2, 2, 1, 2, 2, 2, 0, 1, 2, 3, 2],
[1, 3, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 3, 3, 2, 1, 0, 3, 2, 2],
[2, 2, 1, 1, 3, 1, 1, 3, 1, 2, 2, 2, 2, 2, 2, 2, 3, 0, 3, 2],
[1, 4, 3, 3, 2, 2, 3, 2, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 0, 2],
[1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0]]

list = []
for i in range(len(distance)):
x = sum(distance[i])
list.append(x)

names = {
0 : "Caleb Hobby",
1 : "Alta Kennan",
2 : "Corrin Tally",
3 : "Leandro Eagan",
4 : "Otilia Laxson",
5 : "Ellie Francese",
6 : "Augustine Golub",
7 : "Elinore Orsborn",
8 : "Clarence Stalker",
9 : "Lili Houghton",
10 : "Monet Mccoy",
11 : "Angila Ellinger",
12 : "Sammie Womac",
13 : "Tiny Parkhurst",
14 : "Pearlie Moffet",
15 : "Cruz Perna",
16 : "Rebbecca Charlton",
17 : "Marita Tegeler",
18 : "Jarred Marrow",
19 : "Lorean Simcox"
}

rate = []
with open("influence.txt",'r') as data_file:
for line in data_file:
data = line.split()
rate.append(data[3])

products = []
for x, y in zip(list, rate):
products.append(0.5*float(x)*float(y))

# for (i, item) in enumerate(products, start = 0):


# print(names.get(i)+": ", item)
ascending = []
for i in range(len(products)):
ascending.append([products[i], i])
ascending.sort()
sort_index = []
for x in ascending:
sort_index.append(x[1])

new_list = []
while products:
min = products[0]
for j in products:
if j < min:
min = j
new_list.append(min)
products.remove(min)

print("Rating in ascending order:\n")


sort = [names.get(i) + " " + str(j) for i, j in zip(sort_index, new_list)]

for person in sort:


print(person)

3.5 Analyze your content


You are publishing a book and would like to promote it through the use of social media. The book’s title is
”From T-Rex to Multi Universes: How the Internet has Changed Politics, Art and Cute Cats.” You have
done some research in the world’s most popular social network and have found that the range of interests is
stored in interests.txt Analyze your title and see what specter of interests is your book marketable to.

Solution:
We should find similar words from the book’s title and from the given file, also included books since it’s a
title of a book.
import string

f = open("interests.txt", "r")
title = "From T-Rex to Multi Universes: How the Internet has Changed Politics, Art and
Cute Cats."
book_title = title.translate(str.maketrans('', '', string.punctuation))

a = set(f.read().splitlines())
b = set(book_title.split())
words = book_title.split()
x = b.intersection(a)

print("Specter of interests the book is marketable to:")


for interests in x:
print(interests)
print("Books")

f.close()
the output would be
Specter of interests the book is marketable to:
Art
Cats
Politics
Internet
Books

3.6 Promote it
We have provided you with a list of interests of each of these people. You can find it in interests.txt.
Considering the set of interests you have chosen, who of them would you market the book to? Let’s say that
a person has 5 of her interests coinciding with your books and she has a Rating of 346. Multiply her rating
with the 0.2 * coinciding interests to see a final score. Sort the people by this final score. Provide us with a
list of 5 people we should contact to make your book a bestseller! Please use the names found in
people_interests.txt.

Solution:
Using the data from previous exercises we get to know who would be the best choice to contact to help
promote the book.
Here are the Top 5
Corrin Tally 25.72500000000001
Monet Mccoy 43.290000000000006
Angila Ellinger 59.400000000000006
Marita Tegeler 83.52749999999999
Alta Kennan 91.08000000000001
book_title = "From T-Rex to Multi Universes: How the Internet has Changed Politics, Art
and Cute Cats ."
f = open("interests.txt", "r")

a = set(f.read().splitlines())
b = set(book_title.split())
words = book_title.split()
x = b.intersection(a)

print("Specter of interests the book is marketable to:")


marketable = ["Books"]
for interests in x:
marketable.append(interests)

print(marketable)
f.close()

listOfLines = list()
with open ("people_interests.txt", "r") as myfile:
for line in myfile:
listOfLines.append(line.strip())
print(listOfLines)

myList = [i.split(':')[-1] for i in listOfLines]


print(myList)

print("-----------------------------------------------------------")

distance=[[0, 3, 2, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 1, 1],
[3, 0, 1, 2, 4, 2, 2, 4, 2, 2, 3, 2, 1, 3, 2, 2, 3, 2, 4, 2],
[2, 1, 0, 1, 3, 1, 1, 3, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 3, 1],
[2, 2, 1, 0, 3, 1, 1, 3, 2, 2, 1, 2, 2, 2, 2, 1, 2, 1, 3, 1],
[1, 4, 3, 3, 0, 2, 3, 1, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 2],
[1, 2, 1, 1, 2, 0, 1, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 1],
[2, 2, 1, 1, 3, 1, 0, 3, 2, 2, 2, 1, 1, 1, 1, 1, 2, 1, 3, 1],
[1, 4, 3, 3, 1, 2, 3, 0, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 2],
[2, 2, 1, 2, 3, 2, 2, 3, 0, 2, 3, 1, 1, 3, 2, 1, 2, 1, 3, 1],
[2, 2, 1, 2, 3, 1, 2, 3, 2, 0, 1, 2, 1, 1, 1, 2, 3, 2, 3, 2],
[2, 3, 2, 1, 3, 1, 2, 3, 3, 1, 0, 2, 2, 1, 1, 2, 3, 2, 3, 2],
[2, 2, 1, 2, 3, 1, 1, 3, 1, 2, 2, 0, 2, 2, 2, 1, 2, 2, 3, 2],
[2, 1, 2, 2, 3, 1, 1, 3, 1, 1, 2, 2, 0, 2, 1, 2, 3, 2, 3, 2],
[2, 3, 2, 2, 3, 1, 1, 3, 3, 1, 1, 2, 2, 0, 2, 2, 3, 2, 3, 2],
[1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 1, 2, 1, 2, 0, 2, 2, 2, 2, 2],
[2, 2, 1, 1, 3, 2, 1, 3, 1, 2, 2, 1, 2, 2, 2, 0, 1, 2, 3, 2],
[1, 3, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 3, 3, 2, 1, 0, 3, 2, 2],
[2, 2, 1, 1, 3, 1, 1, 3, 1, 2, 2, 2, 2, 2, 2, 2, 3, 0, 3, 2],
[1, 4, 3, 3, 2, 2, 3, 2, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 0, 2],
[1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0]]

list = []
for i in range(len(distance)):
x = sum(distance[i])
list.append(x)

names = {
0 : "Caleb Hobby",
1 : "Alta Kennan",
2 : "Corrin Tally",
3 : "Leandro Eagan",
4 : "Otilia Laxson",
5 : "Ellie Francese",
6 : "Augustine Golub",
7 : "Elinore Orsborn",
8 : "Clarence Stalker",
9 : "Lili Houghton",
10 : "Monet Mccoy",
11 : "Angila Ellinger",
12 : "Sammie Womac",
13 : "Tiny Parkhurst",
14 : "Pearlie Moffet",
15 : "Cruz Perna",
16 : "Rebbecca Charlton",
17 : "Marita Tegeler",
18 : "Jarred Marrow",
19 : "Lorean Simcox"
}

rate = []
with open("influence.txt",'r') as data_file:
for line in data_file:
data = line.split()
rate.append(data[3])

c = [1,2,1,1,1,1,1,1,0,0,2,2,0,0,1,1,0,3,0,0]
products = []
for x, y in zip(list, rate):
products.append(0.5*float(x)*float(y))

promote = []
for x, y in zip(products, c):
promote.append(0.2*float(x)*float(y))

# for (i, item) in enumerate(products, start = 0):


# print(names.get(i)+": ", item)

ascending = []
for i in range(len(promote)):
ascending.append([promote[i], i])
ascending.sort()
sort_index = []
for x in ascending:
sort_index.append(x[1])

new_list = []
while promote:
min = promote[0]
for j in promote:
if j < min:
min = j
new_list.append(min)
promote.remove(min)

print("Rating in ascending order:\n")


sort = [names.get(i) + " " + str(j) for i, j in zip(sort_index, new_list)]
for person in sort:
print(person)

Problem condition:
4. Network
The dataset
The dataset is a text file where every line represents a JSON object that describes a tweet (tweet.txt). It was
fetched using twitter stream API, hence we're dealing with real life data (yay).

4.1 Popular Hashtag


Write a program that prints on the screen 10 most popular #hashtags followed by the number of occurrences
of the #hashtag
from collections import Counter

freq = Counter()
with open("tweets.txt") as data:
for line in data:
for part in line.split():
if "#" in part:
freq[part] += 1

print(freq.most_common(10))
this should help count actual hashtags
Counter(re.findall(r'#([a-z0-9]+)', ' '.join(lst), re.I))

4.3 Top
Write a program that prints on the screen 10 most positive tweets and 10 most negative tweets.

import json

tweets = []
with open('tweets.txt') as f:
for l in f:
tweets.append(json.loads(l))

import re
import tweepy
from tweepy import OAuthHandler
from textblob import TextBlob

class TwitterClient(object):
'''
Generic Twitter Class for sentiment analysis.
'''

def __init__(self):
'''
Class constructor or initialization method.
'''
# keys and tokens from the Twitter Dev Console
consumer_key = 'XXXXXXXXXXXXXXXXXXXXXXXX'
consumer_secret = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXX'
access_token = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXX'
access_token_secret = 'XXXXXXXXXXXXXXXXXXXXXXXXX'

# attempt authentication
try:
# create OAuthHandler object
self.auth = OAuthHandler(consumer_key, consumer_secret)
# set access token and secret
self.auth.set_access_token(access_token, access_token_secret)
# create tweepy API object to fetch tweets
self.api = tweepy.API(self.auth)
except:
print("Error: Authentication Failed")

def clean_tweet(self, tweet):


'''
Utility function to clean tweet text by removing links, special characters
using simple regex statements.
'''
return ' '.join(re.sub("(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])
| (\w+:\ / \ / \S+)
", " ", tweet).split())

def get_tweet_sentiment(self, tweet):


'''
Utility function to classify sentiment of passed tweet
using textblob's sentiment method
'''
# create TextBlob object of passed tweet text
analysis = TextBlob(self.clean_tweet(tweet))
# set sentiment
if analysis.sentiment.polarity > 0:
return 'positive'
elif analysis.sentiment.polarity == 0:
return 'neutral'
else:
return 'negative'

def get_tweets(self, query, count=10):


'''
Main function to fetch tweets and parse them.
'''
# empty list to store parsed tweets
tweets = []

try:
# call twitter api to fetch tweets
fetched_tweets = self.api.search(q=query, count=count)

# parsing tweets one by one


for tweet in fetched_tweets:
# empty dictionary to store required params of a tweet
parsed_tweet = {}

# saving text of tweet


parsed_tweet['text'] = tweet.text
# saving sentiment of tweet
parsed_tweet['sentiment'] = self.get_tweet_sentiment(tweet.text)

# appending parsed tweet to tweets list


if tweet.retweet_count > 0:
# if tweet has retweets, ensure that it is appended only once
if parsed_tweet not in tweets:
tweets.append(parsed_tweet)
else:
tweets.append(parsed_tweet)

# return parsed tweets


return tweets

except tweepy.TweepError as e:
# print error (if any)
print("Error : " + str(e))

def main():
# creating object of TwitterClient Class
api = TwitterClient()
# calling function to get tweets
tweets = api.get_tweets(query='Donald Trump', count=200)
# picking positive tweets from tweets
ptweets = [tweet for tweet in tweets if tweet['sentiment'] == 'positive']
# percentage of positive tweets
print("Positive tweets percentage: {} %".format(100 * len(ptweets) / len(tweets)))
# picking negative tweets from tweets
ntweets = [tweet for tweet in tweets if tweet['sentiment'] == 'negative']
# percentage of negative tweets
print("Negative tweets percentage: {} %".format(100 * len(ntweets) / len(tweets)))
# percentage of neutral tweets
print("Neutral tweets percentage: {} % \
".format(100 * (len(tweets) - (len(ntweets) + len(ptweets))) / len(tweets)))

# printing first 5 positive tweets


print("\n\nPositive tweets:")
for tweet in ptweets[:10]:
print(tweet['text'])

# printing first 5 negative tweets


print("\n\nNegative tweets:")
for tweet in ntweets[:10]:
print(tweet['text'])

if __name__ == "__main__":
# calling main function
main()

You might also like