
DETECTING STRESS BASED ON SOCIAL INTERACTIONS IN SOCIAL NETWORKS

ABSTRACT

Psychological stress is threatening people's health. Detecting stress in a timely manner is important for proactive care. With the popularity of social media, people are used to sharing their daily activities and interacting with friends on social media platforms, which makes it feasible to leverage online social network data for stress detection. In this paper, we find that a user's stress state is closely related to that of his/her friends in social media, and we employ a large-scale dataset from real-world social platforms to systematically study the correlation between users' stress states and social interactions. We first define a set of stress-related textual and social attributes from various aspects, and then propose a data collection and data comparison approach that leverages tweet content and social interaction information for stress detection. Experimental results show that the proposed model can improve the detection performance by 6-9% in F1-score. By further analyzing the social interaction data, we also discover several intriguing phenomena, e.g., the number of social structures with sparse connections (i.e., with no delta connections) of stressed users is around 14% higher than that of non-stressed users, indicating that the social structure of stressed users' friends tends to be less connected and less complicated than that of non-stressed users. The system also sends notifications by mail so that users can try to reduce their stress.

EXISTING SYSTEM

Currently, there is no algorithm or concept that detects stress in social media with good accuracy. Such stress may affect a user's personal life. If a person is stressed, it is expressed in the comments or shares on his/her social media page. From these, stress can be detected manually, but the detection is not automated and yields no range or percentage showing how stressed the user is.
PROPOSED SYSTEM

In the proposed system, we build a tweet page where users can share their tweets and comment on others' tweets, as on a social media page. The words in the shared content are analyzed using data mining, and each comment is classified as positive or negative. From these classifications, the tool automatically calculates a stress percentage/range. In addition, the admin can view each user's stress percentage and selectively send motivational messages, together with the stress percentage, by mail so that users are aware of their stress level.

1. INTRODUCTION

1.1 Motivation

Psychological stress is becoming a threat to people's health nowadays. With the rapid pace of life, more and more people are feeling stressed. According to a worldwide survey reported by New Jersey in 2010, over half of the population has experienced an appreciable rise in stress over the last three years. Though stress itself is non-clinical and common in our life, excessive and chronic stress can be rather harmful to people's physical and mental health. According to existing research, long-term stress has been found to be related to many diseases, e.g., depression, anxiety, and insomnia. Moreover, according to the Chinese Center for Disease Control and Prevention, suicide has become the top cause of death among Chinese youth, and excessive stress is considered to be a major factor in suicide. All this reveals that the rapid increase of stress has become a substantial challenge to human health and life quality.

Thus, it is important to detect stress before it turns into severe problems. Traditional psychological stress detection is mainly based on face-to-face interviews, self-report questionnaires, or wearable sensors.
However, these traditional methods are reactive, and they are usually labor-intensive, time-consuming, and delayed.

Fig. 1.1: The sampling test results of the diversity of users' social structures from Sina Weibo, using the top 3 interacted friends of the users.

2. ANALYSIS

Inspired by psychological theories, we first define a set of attributes for stress detection from tweet-level and user-level aspects respectively: 1) tweet-level attributes from the content of a user's single tweet, and 2) user-level attributes from a user's weekly tweets. The tweet-level attributes are mainly composed of linguistic, visual, and social attention (i.e., being liked, retweeted, or commented) attributes extracted from a single tweet's text, image, and attention list. The user-level attributes are composed of: (a) posting behavior attributes as summarized from a user's weekly tweet postings; and (b) social interaction attributes extracted from a user's social interactions with friends. In particular, the social interaction attributes can be further broken into: (i) social interaction content attributes extracted from the content of users' social interactions with friends; and (ii) social interaction structure attributes extracted from the structures of users' social interactions with friends.

To maximally leverage the user-level information as well as the tweet-level content information, we propose a novel hybrid model that combines a factor graph model with a convolutional neural network (CNN). This is because a CNN is capable of learning unified latent features from multiple modalities, and a factor graph model is good at modeling correlations.
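As a concrete illustration of the user-level posting behavior attributes in (a), the following minimal sketch aggregates one user's weekly tweets into the three social engagement counts and the 24-dimensional tweeting-time vector listed in Table 1. The `Tweet` record, its field names, and the function `posting_behavior` are hypothetical and chosen only for illustration; the source does not specify a data format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Tweet:
    """Hypothetical record for one tweet; field names are illustrative."""
    posted_at: datetime
    mentions: int = 0   # number of @-mentions in the tweet
    retweets: int = 0   # number of @-retweets
    replies: int = 0    # number of @-replies


def posting_behavior(tweets: List[Tweet]) -> List[int]:
    """Return 3 social engagement counts followed by 24 hourly tweet counts."""
    engagement = [0, 0, 0]
    hourly = [0] * 24
    for tw in tweets:
        engagement[0] += tw.mentions
        engagement[1] += tw.retweets
        engagement[2] += tw.replies
        hourly[tw.posted_at.hour] += 1  # bucket the tweet by hour of day
    return engagement + hourly


# Toy week of tweets (made-up values, for illustration only).
week = [
    Tweet(datetime(2024, 1, 1, 23, 15), mentions=2),
    Tweet(datetime(2024, 1, 2, 23, 40), replies=1),
    Tweet(datetime(2024, 1, 3, 9, 5), retweets=1),
]
features = posting_behavior(week)
```

The resulting vector has 27 entries; the remaining posting behavior attributes in Table 1 (tweeting type and linguistic style) would be appended in the same way.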
The overall steps of the hybrid model are as follows: 1) we first design a convolutional neural network (CNN) with cross autoencoders (CAE) to generate user-level content attributes from tweet-level attributes; and 2) we define a partially-labeled factor graph (PFG) to combine user-level social interaction attributes, user-level posting behavior attributes, and the learnt user-level content attributes for stress detection.

We evaluate the proposed model, as well as the contribution of different attributes, on a real-world dataset from Sina Weibo. Experimental results show that by exploiting the users' social interaction attributes, the proposed model can improve the detection performance (F1-score) by 6-9% over that of the state-of-the-art methods. This indicates that the proposed attributes can serve as good cues in tackling the data sparsity and ambiguity problem. Moreover, the proposed model can also efficiently combine tweet content and social interaction to enhance the stress detection performance.

3. PROBLEM FORMULATION

Before presenting our problem scenario, let us first define some necessary notation. Let V be a set of users on a social network, and let |V| denote the total number of users. Each user v_i ∈ V posts a series of tweets, with each tweet containing text, image, or video content; the series of tweets contributes to the user's social interactions on the social network.

3.1 Stress state

The stress state y of user v_i ∈ V at time t is represented as a triple (y, v_i, t), or briefly y_i^t. In this study, a binary stress state y_i^t ∈ {0, 1} is considered, where y_i^t = 1 indicates that user v_i is stressed at time t, and y_i^t = 0 indicates that the user is non-stressed at time t. The state can be identified from specific expressions in the user's tweets or explicitly stated by the user himself/herself, as explained in the experiments. Let Y^t be the set of stress states of all users at time t.

3.2 Time-varying user-level attribute matrix

Each user in V is associated with a set of attributes A.
Let X^t be a |V| × |A| attribute matrix at time t, in which every row x_i^t corresponds to a user, each column corresponds to an attribute, and an element x_{i,j}^t is the j-th attribute value of user v_i at time t. A user-level attribute matrix describes user-specific features and can be defined in different ways. This study considers user-level content attributes, statistical attributes, and social interaction attributes.

3.3 Tweet-level Attributes

Tweet-level attributes describe the linguistic and visual content, as well as the social attention factors (being liked, commented, and retweeted), of a single tweet.

For linguistic attributes, we take the most commonly used linguistic features in sentiment analysis research. Specifically, we adopt LTP, a Chinese Language Technology Platform, to perform lexical analysis (e.g., tokenization and lemmatization), and then use a Chinese LIWC dictionary to map the words into positive/negative emotions. LIWC2007 is a dictionary that categorizes words based on their linguistic or psychological meanings, so we can classify words into different categories, e.g., positive/negative emotion words and degree adverbs. We have also tested other linguistic resources, including NRC, and found that the performance was roughly the same, so we adopted the commonly used LIWC2007 dictionary for the experiments. Furthermore, we extract linguistic attributes from emoticons and punctuation marks ('!', '?', '…', '.'). Weibo encodes every emoticon as a keyword in square brackets (e.g., a bracketed keyword for "laugh"), so we can look up the keyword in square brackets to find the emoticons. Twitter adopts Unicode as the representation for all emojis, which can be extracted directly.

4. Contribution Analysis

The definition of factors is important to the performance of the factor graph model. We have three types of factors in our model, i.e., attribute factor, social factor, and dynamic factor.
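To make the role of these factors concrete, here is a minimal sketch of unnormalized attribute and dynamic factors, assuming the exponential-linear form described in Sections 5.2 and 5.3, with a flag that switches the dynamic factor off in the spirit of the factor comparison. All names (`attribute_factor`, `dynamic_factor`, `chain_score`), the parameter layout, and the toy numbers are assumptions made for illustration; the social factor and the normalization terms are omitted, and this is not the authors' implementation.

```python
import math
from typing import List, Sequence


def attribute_factor(alpha: Sequence[Sequence[float]],
                     x: Sequence[float], y: int) -> float:
    """Unnormalized exp(alpha_y . x): ties attributes x to stress state y."""
    return math.exp(sum(a * v for a, v in zip(alpha[y], x)))


def dynamic_factor(gamma: Sequence[Sequence[float]],
                   y_now: int, y_next: int) -> float:
    """Unnormalized exp(gamma[y_now][y_next]).

    A vector of indicator functions over (y_now, y_next) selects exactly
    one parameter, which is written here as a table lookup.
    """
    return math.exp(gamma[y_now][y_next])


def chain_score(alpha, gamma, xs: List[Sequence[float]], ys: List[int],
                use_dynamic: bool = True) -> float:
    """Product of the factors for one user over consecutive time steps."""
    score = 1.0
    for t, (x, y) in enumerate(zip(xs, ys)):
        score *= attribute_factor(alpha, x, y)
        if use_dynamic and t + 1 < len(ys):
            score *= dynamic_factor(gamma, y, ys[t + 1])
    return score


# Toy parameters (assumed): 2 stress states, 2 attributes.
alpha = [[0.5, -1.0], [-0.5, 1.0]]   # alpha[y] is the weight vector for state y
gamma = [[0.8, -0.8], [-0.8, 0.8]]   # favors keeping the same state over time
xs = [[0.2, 0.9], [0.1, 0.8]]        # attributes at time t and t + 1

full = chain_score(alpha, gamma, xs, [1, 1])
switch = chain_score(alpha, gamma, xs, [1, 0])
no_dyn = chain_score(alpha, gamma, xs, [1, 1], use_dynamic=False)
```

With these toy values the stable assignment `[1, 1]` scores higher than `[1, 0]`, and dropping the dynamic factor removes exactly the `exp(0.8)` contribution of the transition term.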
To analyze the different factors in our model, we compare the detection performance with different combinations of factors. Specifically, we first use all three factors, denoted as FGM, and then remove individual factors in turn, such as the social factor.

4.1 Convergence Analysis

We further investigate the convergence of the learning algorithm by reporting the F1-score against the number of iterations. The algorithm converges within around 3000 iterations, which is fast enough for efficient model training on large-scale datasets in practice.

The manual extraction of patterns from data has occurred for centuries. Early methods of identifying patterns in data include Bayes' theorem (1700s) and regression analysis. The proliferation, ubiquity, and increasing power of computer technology have dramatically increased data collection, storage, and manipulation ability. As data sets have grown in size and complexity, direct "hands-on" data analysis has increasingly been augmented with indirect, automated data processing, aided by other discoveries in computer science, such as neural networks, cluster analysis, genetic algorithms, decision trees and decision rules, and support vector machines (1990s). Data mining is the process of applying these methods with the intention of uncovering hidden patterns in large data sets. It bridges the gap from applied statistics and artificial intelligence (which usually provide the mathematical background) to database management by exploiting the way data is stored and indexed in databases to execute the actual learning and discovery algorithms more efficiently, allowing such methods to be applied to ever larger data sets.

Category | Short name | # | Description
Posting Behavior | Social Engagement | 3 | The numbers of @-mentions, @-retweets, and @-replies in weekly tweet postings, indicating one's social interaction activeness with friends.
Posting Behavior | Tweeting time | 24 | The numbers of tweets posted in each hour, as a 24-dimensional vector.
Posting Behavior | Tweeting type | 4 | Users' tweets are categorized into four main types based on general categories of social media platforms: (1) image tweets (tweets containing images); (2) original tweets (tweets originally authored and posted by the user); (3) information query tweets (tweets that ask questions or ask for help); (4) information sharing tweets (tweets that contain outside hyperlinks). A 4-dimensional vector of the numbers of tweets in these 4 types quantifies the tweeting type attributes.
Posting Behavior | Tweeting linguistic style | 10 | 10 categories from LIWC that are related to daily life and social events, e.g., personal pronouns, home, work, money, religion, death, health, ingestion, friends, and family. Words are extracted from users' weekly tweet postings, and a 10-dimensional vector holds the numbers of words in the 10 categories.
Social Interaction: Content Style | Words | 10 | A 10-dimensional integer vector, with each value representing the number of words from the social interaction content of users' weekly tweet postings in each word category from LIWC.
Social Interaction: Content Style | Emoticons | 2 | A 2-dimensional integer vector, with each value representing the number of positive and negative emoticons in tweets.
Social Interaction: Social Influence | Stressed Neighbor Count | 1 | The number of the user's stressed neighbors.
Social Interaction: Social Influence | Strong-tie Count | 1 | The number of stressed neighbors with a strong tie.
Social Interaction: Social Influence | Weak-tie Count | 1 | The number of stressed neighbors with a weak tie.
Social Interaction: Social Influence | Follower Count | 1 | The number of the user's followers.
Social Interaction: Social Influence | Fans Count | 1 | The number of the user's fans.
Social Interaction: Social Structure | Social Structure | 8 | The structure distribution of the user's interacted friends, where each element refers to the existence of the corresponding structure in Fig. 6.

Table 1: Summary of user-level attributes

5. ARCHITECTURE

This section describes the architecture of the model stated above. The model consists of two parts: the first part is a CNN, and the second part is an FGM. The CNN generates user-level content attributes by convolution with CAE filters, and these attributes serve as input to the FGM.
Take the user labeled with a red star as an example. The tweet-level attributes of the user are processed through a convolution with CAE to form the user-level content attributes. The user-level attributes are denoted by x_i^t in the left box. Every x_i^t contains three aspects: user-level content attributes, user-level posting behavior attributes, and user-level social interaction attributes. Data of other users follow the same route. In the FGM, attribute factors connect user-level attributes to the corresponding stress states, social factors connect the stress states of different users, and dynamic factors connect the stress states of a user over time.

5.1 Correlations between Tweet's Content and Social Interactions

As the social correlation between users and the time-dependent correlation are hard to model using classic classifiers such as SVM, we use a previously proposed partially-labeled factor graph model (PFG) to incorporate social interactions and tweets' content for learning and detecting user-level stress states.

We define an objective function by maximizing the conditional probability of users' stress states Y given a series of attribute-augmented networks G = {G^t = (V^t, E^t, X^t, Y^t)}, t ∈ {1, …, T}, with V = V^1 = … = V^T and |V| = N, i.e., P(Y | G). The factor graph provides a way to factorize this "global" probability into a product of "local" factor functions.

5.2 Attribute factor

We use this factor f(x_i^t, y_i^t) to represent the correlation between user v_i's stress state at time t and her/his attributes x_i^t. More specifically, we instantiate the factor by an exponential-linear function:

f(x_i^t, y_i^t) = (1 / Z_α) exp{α_{y_i^t} · x_i^t}

where α_{y_i^t} is a parameter of the proposed model, and Z_α is a normalization term.

5.3 Dynamic factor

We use this factor f(y_i^t, y_i^{t+1}) to represent the time correlation between user v_i's stress state at time t and at time t + 1. More specifically, we instantiate the factor by an exponential-linear function:

f(y_i^t, y_i^{t+1}) = (1 / Z_γ) exp{γ · h'(y_i^t, y_i^{t+1})}

where γ denotes the model parameters for this type of factor, h' is defined as a vector of indicator functions, and Z_γ is the normalization term.

6. MODULES

6.1 USER MODULE

1. User registration: In this module, a user can register with a mail ID, username, and password.
2. User login: In this module, a user can log in with their mail ID and password to share and comment on tweets.
3. Share tweet: Users can share their tweets.
4. Comment tweet: Users can express their views by posting comments.

6.2 ADMIN MODULE

5. View tweets and comments: Here, the admin can view the tweets and the posted comments.
6. Analyze positives and negatives: In this module, the admin can see the positive and negative comments, as well as the average of negative comments for a particular user.
7. Send mail: In this module, the admin can send the average/percentage of negative tweets of a particular person by mail to show the stress level expressed in social media. The admin can also send motivational messages individually to a particular person to help reduce the stress level.

8. Experimental Setup

In the following experiments, we first train and test our model on the large-scale Sina Weibo dataset. We then test our model on the other 3 datasets to show the effectiveness of the proposed model on different data sources and different ground-truth labeling methods. For all of our analyses, we use 5-fold cross validation, with over 10 randomized experimental runs.

9. Comparison Methods

We compare the following classification methods for user-level psychological stress detection with our FGM+CNN model (denoted as FGM here).

9.1 Logistic Regression (LRC)

It trains a logistic regression classification model and then predicts users' labels in the test set.

9.2 Support Vector Machine (SVM)

It is a popular binary classifier that has proved effective on a wide range of classification problems.
In our problem, we use an SVM with an RBF kernel.

9.3 Random Forest (RF)

It is an ensemble learning method that builds a set of decision trees on random subsets of attributes and bags them to obtain the classification result.

9.4 Gradient Boosted Decision Tree (GBDT)

It trains a gradient boosted decision tree model with the features associated with each user.

10. Deep Neural Network (DNN) for user-level stress detection

It addresses user-level stress detection with a convolutional neural network (CNN) with cross autoencoders.

10.1 Generate decision tree

1. Check whether the algorithm satisfies the termination criteria.
2. Compute the information-theoretic criteria for all attributes.
3. Choose the best attribute according to the information-theoretic criteria.
4. Create a decision node based on the best attribute from the previous step.
5. Induce (i.e., split) the dataset based on the decision node created in step 4.
6. For every sub-dataset from step 5, call the algorithm to get a sub-tree (recursive call).
7. Attach the tree obtained in step 6 to the decision node from step 4.
8. Return the tree.

10.2 Input: an attribute-valued dataset D

Tree = {}
If D is "pure" OR other stopping criteria are met then
    Terminate
End if
For all attributes a ∈ D do
    Compute the information-theoretic criteria if we split on a
End for
a_best = best attribute according to the criteria computed above
Tree = create a decision node that tests a_best in the root
D_v = induced sub-datasets from D based on a_best
For all D_v do
    Tree_v = sub-tree obtained by a recursive call on D_v
    Attach Tree_v to the corresponding branch of Tree
End for
Return Tree

11. SOFTWARE REQUIREMENTS

• Operating system: Windows
• Application server: Tomcat 7.0
• Front end: Java
• Back end: PostgreSQL

12. HARDWARE REQUIREMENTS

• Processor: Pentium III
• Speed: 1.1 GHz
• RAM: 256 MB (min)
• Hard disk: 20 GB
• Drive: 1.44 MB
• Keyboard: Standard Windows keyboard
• Mouse: Two- or three-button mouse
• Monitor: SVGA

12.1 SOFTWARE DESCRIPTION

PostgreSQL manages concurrency through a system known as multiversion concurrency control (MVCC), which gives each transaction a "snapshot" of the database, allowing changes to be made without being visible to other transactions until the changes are committed. This largely eliminates the need for read locks and ensures that the database maintains the ACID (atomicity, consistency, isolation, durability) principles in an efficient manner. PostgreSQL offers three levels of transaction isolation: Read Committed, Repeatable Read, and Serializable. Because PostgreSQL is immune to dirty reads, requesting a Read Uncommitted transaction isolation level provides Read Committed instead. PostgreSQL supports full serializability via the serializable snapshot isolation (SSI) technique.

PostgreSQL includes built-in support for regular B-tree and hash indexes, and four index access methods: generalized search trees (GiST), generalized inverted indexes (GIN), Space-Partitioned GiST (SP-GiST), and Block Range Indexes (BRIN). Before version 10, hash indexes were implemented but discouraged because they could not be recovered after a crash or power loss; from version 10 onward this is no longer the case. In addition, user-defined index methods can be created, although this is quite an involved process. Indexes in PostgreSQL also support the following features:
• Expression indexes can be created on the result of an expression or function, instead of simply the value of a column.
• Partial indexes, which index only part of a table, can be created by adding a WHERE clause to the end of the CREATE INDEX statement.
This allows a smaller index to be created.
• The planner is capable of using multiple indexes together to satisfy complex queries, using temporary in-memory bitmap index operations.
• k-nearest neighbors (k-NN) indexing provides efficient searching for the "closest values" to a specified value, which is useful for finding similar words, or nearby objects or locations in geospatial data. This is achieved without exhaustive matching of values.
• In PostgreSQL 9.2 and later versions, index-only scans often allow the system to fetch data from indexes without having to access the main table.
• PostgreSQL 9.5 introduced Block Range Indexes (BRIN).

13. SCREEN SHOTS

13.1 LOGIN PAGE
13.2 SIGN UP PAGE
13.3 USER HOME PAGE
13.4 USER HOME PAGE-SHARED TWEETS

14. APPLICATIONS

The patterns that emerge through collective human mobility behavior are now understood to be wide-ranging and important. The default configuration of PostgreSQL uses only a small amount of dedicated memory for performance-critical purposes such as caching database blocks and sorting. This limitation is primarily because older operating systems required kernel changes to allow allocating large blocks of shared memory.

CONCLUSION

In this work, we presented a system for detecting users' psychological stress states from their weekly social media data, leveraging tweets' content as well as users' social interactions. Using real-world social media data as the basis, we studied the correlation between users' psychological stress states and their social interaction behaviors. To fully leverage both the content and the social interaction information of users' tweets, we proposed a hybrid model which combines the factor graph model (FGM) with a convolutional neural network (CNN).
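As a supplementary illustration, the decision-tree induction procedure listed in Section 10.1 can be sketched as follows. This is a minimal sketch that assumes categorical attributes and uses information gain as the information-theoretic criterion (the text does not say which criterion is used); the function names and the toy comment data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Row = Tuple[Dict[str, str], str]  # (attribute values, class label)


def entropy(rows: List[Row]) -> float:
    """Shannon entropy of the class labels in rows."""
    counts = Counter(label for _, label in rows)
    total = len(rows)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def split(rows: List[Row], attr: str) -> Dict[str, List[Row]]:
    """Induce sub-datasets, one per observed value of attr (step 5)."""
    parts: Dict[str, List[Row]] = {}
    for row in rows:
        parts.setdefault(row[0][attr], []).append(row)
    return parts


def info_gain(rows: List[Row], attr: str) -> float:
    """Entropy reduction obtained by splitting rows on attr (step 2)."""
    parts = split(rows, attr).values()
    remainder = sum(len(p) / len(rows) * entropy(p) for p in parts)
    return entropy(rows) - remainder


def build_tree(rows: List[Row], attrs: List[str]):
    labels = [label for _, label in rows]
    # Step 1: stop when the data is pure or no attributes remain.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Steps 2-3: pick the attribute with the highest information gain.
    best = max(attrs, key=lambda a: info_gain(rows, a))
    rest = [a for a in attrs if a != best]
    # Steps 4-7: create the node, split, recurse, and attach sub-trees.
    return {best: {value: build_tree(subset, rest)
                   for value, subset in split(rows, best).items()}}


def predict(tree, sample: Dict[str, str]) -> str:
    """Walk the tree; raises KeyError for attribute values unseen in training."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][sample[attr]]
    return tree


# Toy data (made up): two categorical attributes per user.
data = [
    ({"negative_words": "many", "late_night": "yes"}, "stressed"),
    ({"negative_words": "many", "late_night": "no"}, "stressed"),
    ({"negative_words": "few", "late_night": "yes"}, "calm"),
    ({"negative_words": "few", "late_night": "no"}, "calm"),
]
tree = build_tree(data, ["negative_words", "late_night"])
label = predict(tree, {"negative_words": "many", "late_night": "no"})
```

On this toy data only `negative_words` carries information, so the tree has a single decision node that tests it.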