
"USING SOCIAL MEDIA POST IN PREDICTING DEPRESSION LEVEL"

2019


ABSTRACT

Depression is a common psychological condition that bedevils public health globally. Mental illness is generally the primary cause of depression. Many people suffer from depression, and only a fraction of them receive adequate treatment. It is a serious medical illness that affects a person's ability to work, study, take part in social activities and have fun. Depression poses a challenge to personal and public health. One of the major approaches to this problem is the detailed study of an individual's behavioral attributes.

These attributes are available on various social media platforms such as Facebook, Twitter and Instagram. Social networking platforms are among the best ways to learn about a person's behavior, thinking style, mood, ego networks and opinions. The use of social media has increased, especially among young people. Users on social media express how they are feeling, their daily activities and their opinions about various topics. Social media can therefore be used to predict depression levels.

The aim of this study is to extract information from social media posts in order to gain a clear understanding of a person's behavioral attributes, so that users' depression levels can be predicted with machine learning algorithms such as Support Vector Machine and Naïve Bayes.

INTRODUCTION

Depression is a common psychological condition that bedevils public health globally. According to Subha et al. (2018), more than 300 million people suffered from depression in 2015. Subha et al. (2018) describe depression as a psychological state of constant unhappiness, desperation, low energy, low self-esteem, vacant mood, anxiety, sleeping problems, feelings of guilt, self-harm and unfair ideas. It affects everyday tasks, as it disrupts memory and the ability to concentrate. It does not only impair a person's ability to think normally; it also hinders social functioning. According to Subha et al. (2018), depression additionally raises patients' risk of cardiovascular disease by sixty-seven percent (67%) and their risk of cancer by fifty percent (50%). In addition, this condition is a burden in the form of stress and marital breakdown for family, friends, carers and other relations. For this reason, it makes sense to make an effort and invest in reducing depression and the use of drugs.

Depression is a treatable illness. Subha et al. (2018) are of the opinion that early detection and intervention would shorten the course of treatment. Unfortunately, the rate of access to treatment for depression is shockingly low. It was reported that only 50% of those with mental illness had access to mental health care (Subha et al., 2018). The obstacles include a lack of knowledge and awareness about depression, negative perceptions of mental health services and a limited number of qualified mental health personnel (Subha et al., 2018). Reports on lifetime prevalence show high variance, from 3% in Japan to 17% in the USA. For North America, the incidence of a depressive episode within a one-year period has also been reported (Andrade and Caraveo, 2003).

Still, global provisions and services for detecting, supporting and treating mental illness of this nature are considered insufficient (Detels, 2009). Although Detels (2009) reported that 87% of the world's governments offer some primary care health services to tackle mental illness, 30% do not have programs and 28% have no budget specifically identified for mental health. In reality, there is no reliable laboratory test for diagnosing most forms of mental illness; the diagnosis is usually based on the patient's self-reported experiences, behavior reported by relatives or colleagues, and a mental status examination (Choudhury et al., 2013). In view of these problems, the growing use of social networks as an instrument for detecting and predicting depression is being explored. The main focus is on a common mental illness: major depressive disorder (MDD). Choudhury et al. (2013) describe major depressive disorder as characterized by episodes of all-encompassing low mood accompanied by low self-esteem and loss of interest or pleasure in normally enjoyable activities. Kessler et al. (2003) and Rude et al. (2004) established that individuals suffering from major depressive disorder tend to focus on sad and unflattering information, to interpret ambiguous information negatively, and to harbor persistently hopeless views.

The rate at which individuals use social media such as Facebook and Twitter to share their ideas and views with their contacts is increasing by the day. Postings on these sites are made in a naturalistic setting and in the course of everyday activities and events. According to Choudhury et al. (2013), social media offers a means of capturing behavioral characteristics relevant to an individual's mood, thinking, interaction, socialization and activities. In social media postings, the emotion and language used can indicate feelings of detachment, guilt, vulnerability and self-disappointment. Typically, people suffering from depression also withdraw from social situations and activities. Such changes in behavior may show up as changes in activity on social media. Social media may also reflect changing social ties. We pursue the hypothesis that changes in language, activity and social ties can be used to build statistical models for detecting and even predicting major depressive disorder in a fine-grained manner, in ways that complement and extend traditional diagnostic approaches. By mining the social networking feeds of a particular user, it may be possible to obtain a complete picture of the user's regular behavior. Aldarwish and Ahmad (2017) argued that from a user's social media profile we can gather information related to the person's mood, events, sleeping hours, thinking style, interactions, feelings of guilt, worthlessness, loneliness and helplessness. Extracting such characteristics, which reveal signs of depression in social media users, could help predict whether or not a person is depressed. Such a prediction tool, which can save time before a depressed user reaches the stage of major depression, can be used by doctors, relatives and colleagues.

BACKGROUND

Depression is a mood disorder that causes a persistent feeling of sadness and loss of interest. There are many different kinds of depressive disorder, and each has its own distinctive signs. The most common type is major depressive disorder, which hinders the ability to work, study, eat, sleep and have fun.

To diagnose a major depressive episode, the patient must experience five or more of the following nine signs during the same period and nearly every day (Frances et al., 1994). The first sign is low mood for most of the day. The second is loss of interest in the majority of daily activities. The third is weight gain or weight loss, together with an excessive amount of sleep. The fourth is agitation or retardation. The fifth is tiredness and loss of energy. The sixth is a sense of guilt and worthlessness. The seventh is inability to concentrate and indecisiveness. The eighth, like the third, concerns sleep. The ninth and last sign is the only one that does not need to occur nearly every day: thoughts of death, suicide attempts, or plans to kill oneself. Hussain et al. (2015) are of the view that asking patients questions about these signs does not detect depression accurately, which confirms that medical science is not 100% certain about the methods used to diagnose depression.

The rate at which social media is used these days, particularly by the younger generation, is increasing. Users can access their social media platforms from their phones and personal computers at any time and place. The accessibility of social networking sites allows users to specify their interests and feelings and to share their daily routine.

User-generated content (UGC) gathered from social networking sites can be used to study health-related human behaviors. According to Aldarwish and Ahmad (2017), mining posts on social media gives a picture of the user's behavior that may help predict depression. Hussain et al. (2015) developed a technique that classifies a social media user's depression level from their personal written communication, giving an indication of the user's mental state. The method begins by collecting user-generated content from social networking sites. In the training stage, Hussain et al. (2015) classified phrases that indicate whether or not the user is depressed. A Support Vector Machine was used in this training stage to test for each of these signs of depression.

CHAPTER TWO

LITERATURE REVIEW

Several studies have reported on the potential use of online media for predicting various kinds of depression among social media users. These studies have been conducted on data collected from Twitter (Nadeem et al., 2016), Facebook (Schwartz et al., 2014), Reddit (Shen & Rudzicz, 2017), Instagram (Reece & Danforth, 2017) and internet forums. Shen & Rudzicz (2017) classified Reddit posts related to anxiety by applying N-gram language modeling, vector embeddings, topic analysis and emotional norms to generate features. According to Choudhury et al. (2013), depression constitutes a real challenge to personal and public well-being. A sizeable number of people suffer the ill effects of the condition, and only a fraction receive adequate treatment each year (Islam et al., 2018). Choudhury et al. (2013) also examined the prospect of using online media to detect and assess signs of major depression in users. They measured behavioral attributes relating to social engagement, emotion, language and linguistic styles, ego network, and mentions of antidepressant medication in users' social networking posts. Nadeem et al. (2016) used crowdsourcing to gather data from Twitter users diagnosed with depression. A bag-of-words method was used to quantify each tweet so that several statistical classifiers could be applied to estimate the likelihood of depression.

Reece & Danforth (2017) applied machine learning tools to detect markers of depression in photos from Instagram. They used color analysis, metadata components and algorithmic face detection to extract statistical features from Instagram photos. More recently, deep learning techniques have also been used to detect depression on social media.

For example, Aldarwish and Ahmad (2017) studied the recent increase in the use of social media by young people. This increase is attributed to the convenience of social networking sites, which allow users to specify their interests, feelings and schedule on a daily basis. Troussas et al. (2013) analyzed information extracted from Twitter using various algorithms such as Naive Bayes, Support Vector Machine and Maximum Entropy. The extracted information includes textual data, emotions and so on. This information is preprocessed and then analyzed. It is classified with the Naive Bayes algorithm into various depression levels (Sonawane et al., 2018).

PREDICTION THROUGH TRADITIONAL METHODS

Most researchers, such as Choi et al. (2014), predicted users' mental states through Facebook. Questionnaires capture users' responses about their experiences over the preceding fortnight. Guntuku et al. (2017) note that the Centers for Disease Control and Prevention (CDC) reports that depression is a common and serious illness, affecting one in ten women aged 18-44. The question, therefore, is to what extent the various machine learning algorithms are useful for predicting and preventing suicide and depression.

Alternative to accomplish this Braithwaite et al., (2016)

PREDICTING THROUGH SOCIAL MEDIA

Aldarwish and Ahmad (2017): "I am fed up with all this, why people cannot leave me alone, why they always interfere."

CHAPTER THREE

METHODOLOGY

This study focuses on three kinds of factors, namely emotional process, temporal process and linguistic style, for identifying and detecting depression in social media data. Choudhury et al. (2013) used crowdsourcing to collect labels on the presence of MDD as ground-truth data. Crowdsourcing is an effective mechanism for reaching a diverse population for behavioral data, and it is less time-intensive and cheaper than the alternatives. They also designed Human Intelligence Tasks (HITs) using Amazon's Mechanical Turk interface, in which respondents were asked to take a standardized depression survey (Choudhury et al., 2013).

DATA COLLECTION

In addition, respondents who had a Twitter account could choose to disclose their Twitter usernames, agreeing that their data could be mined and analyzed anonymously by a computer program (Choudhury et al., 2013). Approximately 40% opted in to share their Twitter profile. Collecting data from social media can be cumbersome, and preparing these data is a primary challenge, since it must be determined whether they contain material relevant to depression. Islam et al. (2018) state that NCapture is a powerful tool for organizing, analyzing and finding insights in unstructured data such as social media, interviews, articles and web pages.

DATA SET PREPARATION

Raw data collected from social media or social networking sites (SNS) were analyzed using the Linguistic Inquiry and Word Count (LIWC) software package (Islam et al., 2018). LIWC is a text analysis tool that analyzes text on a line-by-line basis. It works with particular linguistic categories such as articles, auxiliary verbs, verbs, conjunctions, adverbs, pronouns, prepositions, phrases, assent, negation, certainty and quantifiers.
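The category counting that a LIWC-style analysis performs can be illustrated with a minimal sketch. The mini-lexicon `CATEGORIES` and the function `category_profile` below are hypothetical and written only for illustration; they are not the LIWC dictionaries or the LIWC software, which are far larger and are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; not the LIWC dictionaries.
CATEGORIES = {
    "pronoun": {"i", "me", "my", "you", "we", "they", "them"},
    "article": {"a", "an", "the"},
    "negation": {"no", "not", "never", "cannot"},
    "conjunction": {"and", "but", "or", "because"},
}


def category_profile(text):
    """Return the word count and the percentage of words in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for name, vocabulary in CATEGORIES.items():
            if word in vocabulary:
                counts[name] += 1
    total = len(words)
    profile = {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in CATEGORIES
    }
    profile["word_count"] = total
    return profile


# Example post, used here only as sample input.
post = "I am fed up with all this, why people cannot leave me alone"
profile = category_profile(post)
print(profile)
```

Each post is reduced to a small vector of category percentages, which is the kind of input a downstream classifier can consume.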

Automated predictions made from these social media data are cross-validated and evaluated using the standard precision, recall and F1 scores (Nadeem et al., 2016).

F1 SCORE (F-MEASURE)

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

Nadeem et al. (2016) define the F1 score as the harmonic mean of precision and recall; it is commonly used as a classification metric because it weighs the two metrics equally.

Linguistic method: word count, pronouns, articles, prepositions, auxiliary verbs, adverbs, conjunctions, negations.

Other linguistic features: verbs, adjectives, comparisons, question marks, numbers, quantifiers.
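The F1 score defined in this section can be checked numerically. The following is a minimal sketch in plain Python; the function name `precision_recall_f1` and the label vectors are assumptions made up for illustration and are not data from any of the cited studies.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels: 1 = depression-related post, 0 = other post.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

With three true positives, one false positive and one false negative, precision and recall are both 0.75, so their harmonic mean is also 0.75.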

CLASSIFIERS

At this phase, a predictive model for recognizing depression-related posts and comments is built using the psycholinguistic features as inputs. Each post or comment is labeled as either a depression or a non-depression comment, or the patient is classified into one of four levels (Minimal, Mild, Moderate or Severe depression) based on the model proposed by Aldarwish and Ahmad (2017). In this study, we consider four common classifiers: Naïve Bayes, K-Nearest Neighbor (KNN), Decision Tree and Support Vector Machine (SVM).

SUPPORT VECTOR MACHINES (SVM)

Support Vector Machines are also called support vector networks. An SVM is a non-probabilistic binary linear classifier that analyzes data for classification or anomaly detection. It constructs a hyperplane in a high-dimensional feature space that separates the data into two classes, with the widest possible margin to the nearest training points of either class (Islam et al., 2018).
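The idea of a separating hyperplane with a wide margin can be illustrated with a small linear SVM trained by sub-gradient descent on the hinge loss. This is a minimal sketch under stated assumptions, not the implementation used in the cited studies: the two features, the toy data and the hyperparameters `epochs`, `learning_rate` and `reg` are all hypothetical.

```python
def train_linear_svm(X, y, epochs=1000, learning_rate=0.01, reg=0.01):
    """Fit w, b for labels in {-1, +1} by sub-gradient steps on
    reg/2 * ||w||^2 + hinge loss, visiting one sample at a time."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:
                # Sample is inside the margin: move the hyperplane toward it.
                w = [wj - learning_rate * (reg * wj - yi * xj)
                     for wj, xj in zip(w, xi)]
                b += learning_rate * yi
            else:
                # Margin satisfied: only the regularizer shrinks w.
                w = [wj - learning_rate * reg * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical features per user, e.g. share of negative-emotion words and
# share of first-person pronouns; +1 = depressed, -1 = not depressed.
X = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9], [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```

The classifier is binary and gives no probability, only the side of the hyperplane on which a point falls, which matches the description of the SVM as non-probabilistic.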

DECISION TREE (DT)

Decision Trees are simple in nature and are widely used in machine learning, as they pose a series of carefully crafted questions in order to classify an instance. Because of the many thousands of possible combinations, Hunt's algorithm is used to build these trees (Nadeem et al., 2016).

K-NEAREST NEIGHBOR (KNN)

Islam et al. (2018) describe K-Nearest Neighbor (KNN) as a non-parametric method that works on the distances between a point of interest and the training points.
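A nearest-neighbor classifier of the kind Islam et al. (2018) describe can be sketched in a few lines. The function `knn_predict`, the two-feature toy data and the choice of `k=3` with Euclidean distance are assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature representation of users.
train_points = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9),
                (0.1, 0.2), (0.2, 0.1), (0.15, 0.3)]
train_labels = ["depressed"] * 3 + ["not_depressed"] * 3

print(knn_predict(train_points, train_labels, (0.75, 0.8)))
print(knn_predict(train_points, train_labels, (0.2, 0.2)))
```

No model is fitted in advance: all the work happens at prediction time, when distances to the stored training points are computed.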

NAÏVE BAYES

A Naïve Bayes classifier is one of the simplest available in machine learning. It is based on Bayes' theorem and relies on the fundamental assumption that every feature is independent of the others, which greatly simplifies the computation. For instance, according to Nadeem et al. (2016), if a fruit is green, round and about 15 centimeters wide, the Naïve Bayes algorithm treats each of these features as contributing independently to how the fruit is categorized, regardless of any correlation between size, shape and color. In the approach of Islam et al. (2018), the Naïve Bayes algorithm produced the best accuracy, although it trailed other classifiers on precision, recall and F-measure (81%, 82% and 81% respectively). The Naïve Bayes classifier also trailed Logistic Regression in the classification task: whereas Naïve Bayes achieved a precision and F-measure of 81%, the Logistic Regression model recorded a precision of 86% and outperformed the other classifiers with an F1 score of 84%. The Support Vector Machine achieved the best recall (83%) of all the models but fell short in precision (83%) (Nadeem et al., 2016).
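The independence assumption can be made concrete with a small bag-of-words Naïve Bayes classifier. This is a minimal sketch, not the model of Nadeem et al. (2016) or Islam et al. (2018): the four training posts are invented, add-one (Laplace) smoothing is an assumption of this sketch, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(documents, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict_naive_bayes(model, text):
    """Pick the class with the highest log prior plus summed word log-likelihoods."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, doc_count in class_counts.items():
        log_prob = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Each word contributes independently (the "naive" assumption);
            # +1 is add-one smoothing so unseen class/word pairs are not zero.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            log_prob += math.log(likelihood)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Invented toy posts for illustration only.
posts = [
    "i feel hopeless and alone",
    "so tired of everything nothing helps",
    "great day with friends",
    "excited about the new job",
]
labels = ["depressed", "depressed", "not_depressed", "not_depressed"]
model = train_naive_bayes(posts, labels)
print(predict_naive_bayes(model, "i feel so alone"))
```

Because the per-word terms are simply added in log space, any correlation between words is ignored, exactly as in the fruit example.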

DATA ANALYSIS

The performance of each classifier was then evaluated quantitatively from its Receiver Operating Characteristic (ROC) curve using the Area Under the Curve (AUC) measure. Nadeem et al. (2016) reported that, of all the classifiers, the Naïve Bayes approach performed best with a score of 0.94, followed by the Linear Support Vector Machine (0.80), the Ridge Classifier (0.74) and the Decision Tree (0.64). A score of 0.50 is considered a guess, while a score of one (1) is considered perfect (Nadeem et al., 2016).
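The meaning of the AUC scale, from 0.50 for guessing to 1 for a perfect ranking, can be illustrated by computing the AUC as the fraction of positive-negative pairs that the classifier ranks correctly. This is a minimal sketch; the function `roc_auc` and the scores are made up for illustration and are not results from Nadeem et al. (2016).

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as half a win."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


y_true = [1, 1, 0, 0]
print(roc_auc(y_true, [0.9, 0.4, 0.5, 0.1]))  # one of four pairs is misordered
print(roc_auc(y_true, [0.9, 0.8, 0.2, 0.1]))  # every positive outranks every negative
print(roc_auc(y_true, [0.5, 0.5, 0.5, 0.5]))  # constant scores carry no information
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a sketch; library implementations typically use a sort-based method instead.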

STUDY SIGNIFICANCE

With regard to public health approaches, social media provides an unprecedented stream of information that connects individuals to a community, and this research shows that we can attempt to use it to do something helpful. As we sharpen our capacity to detect depression in real time, we can reduce the risk of suicide. This could augment current suicide prevention programs. Once an individual's social media stream indicates major depression, simple interventions such as directing them to a toll-free helpline, however small, might have a considerable impact. We believe that broadening our range of solutions to include social media monitoring, in order to detect and forecast depression across the entire population of social media users, can reduce the suicide rate considerably.

CHAPTER FIVE

CONCLUSION

This research demonstrates the potential of social media as a measurement tool for predicting major depression in individuals. First, it showed how crowdsourcing is used to collect labels and how NCapture is used to break down unstructured data. The data are then analyzed using the LIWC software, and a variety of psycholinguistic measures such as language, emotion and style are used to characterize users' depressive behavior.

Finally, this study examined these distinctive attributes with four major classifiers: Naïve Bayes, K-Nearest Neighbor (KNN), Decision Tree and Support Vector Machine (SVM). The Naïve Bayes classifier can predict depression ahead of its reported onset in a person, and it yielded an ROC AUC score of 0.94 (94%).
