Feb 5, 2013

Using Social Networking Analysis To Measure Influence

Social Networking Analysis (SNA) employs analytical methods, techniques and models to analyze, understand and predict the patterns, structure and dynamics of online social networks by mining the wealth of data constantly being shared across the web. SNA also draws on concepts from sociology, psychology and statistics to build specific network models for various types of social networks. Even though human relationships are governed by seemingly random interactions and motivations, most human networks, both online and offline, follow very similar patterns.


Network Models
One of the most important approaches to understanding social networks is to determine which mathematical models best describe typical social network structures. It turns out that the Normal or Gaussian distribution model (Figure 1), which is good for modeling many real-world quantities, is not an adequate model for degree distributions. In the study of networks, the degree of a node (vertex) is the number of connections it has to other nodes, and the degree distribution is the probability distribution of these degrees over the entire network.



Instead, the Power Law distribution is a much better model for the degree distributions of networks, and many social networks exhibit it: a few nodes have a very large number of connections while most nodes have only a few. Understanding this phenomenon is important because the Power Law has a much heavier tail than the Normal distribution, so extremely well-connected nodes (hubs) are far more common than a Normal model would predict.
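The difference between the two tails can be made concrete with a small sketch. The parameters below (a Normal with mean 10 and standard deviation 3, and a continuous Power Law with exponent 2.5 and minimum value 1) are illustrative assumptions, not values measured on any real network.

```python
import math

def normal_tail(x, mean, std):
    """P(X > x) for a Normal(mean, std) variable."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2)))

def power_law_tail(x, alpha, x_min=1.0):
    """P(X > x) for a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    if x <= x_min:
        return 1.0
    return (x / x_min) ** (-(alpha - 1))

# Assumed, purely illustrative parameters for both models.
for degree in (10, 100, 1000):
    print(
        degree,
        normal_tail(degree, mean=10, std=3),
        power_law_tail(degree, alpha=2.5),
    )
```

Under these assumptions the Normal model treats a node with more than 100 connections as practically impossible, while the Power Law model still gives it a probability of about one in a thousand.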

Centrality
Regardless of the industry or organization performing social networking analysis, it is important to understand which models govern the specific target network. It is also critical to understand the smaller, local relationships between the actors (nodes). For example, for intelligence analysis purposes, it is critical to identify how information flows through the network and which nodes are the most active in collecting or sharing information. Centrality captures this: the centrality of a node describes how important or influential that node is within the network.


Highly centralized networks operate much like highly centralized governments such as theocracies or monarchies, while the least centralized networks mimic democratic systems of government. The centralization of a network measures how far the centrality of the most central node exceeds that of all the other nodes, relative to the largest such gap possible in a network of the same size, and can be calculated with Freeman’s general formula.
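Freeman’s general centralization formula is usually written as below. The notation is an assumption chosen here for illustration: C_A(p_i) is the centrality of node p_i under some centrality measure A, p* is the most central node, and n is the number of nodes.

```latex
C_A = \frac{\sum_{i=1}^{n} \left[ C_A(p^{*}) - C_A(p_i) \right]}
           {\max \sum_{i=1}^{n} \left[ C_A(p^{*}) - C_A(p_i) \right]}
```

The denominator is the largest value the numerator can take for any network with n nodes. For degree centrality that maximum is reached by a star network, where one hub is connected to everyone else, and equals (n - 1)(n - 2). The result ranges from 0 (every node equally central) to 1 (a perfect star).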

For practical purposes, it is not always necessary to calculate this number to gauge how centralized a network is. For example, comparing today’s terrorist groups to traditional ones, it can be observed without going through the calculations that today’s groups are much less centralized and hence harder to target. Less centralized networks, though sometimes not as effective in terms of governance and implementation of an overall strategy, are much more resilient (‘anti-fragile’) to shocks. For example, it is much easier to contain a virus in a highly centralized network than in a decentralized one.

The other concepts of centrality, ‘Closeness’ and ‘Betweenness’, are based on the shortest paths that information or a meme would have to travel to get from one node to another. Closeness measures how few steps separate a node from all the others, while betweenness measures how often a node sits on the shortest paths between other nodes. A very close network with many well-connected, high-betweenness nodes would be much better and faster at spreading information, a virus, knowledge, a tradition or a meme across the entire network. A network with very low closeness would hence be less effective and efficient at doing the same.
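As a rough illustration of both measures, the sketch below computes closeness and betweenness on a toy five-person chain. The graph and node names are hypothetical, and the betweenness routine follows Brandes’ algorithm for unweighted, undirected networks; it is one common way to compute the measure, not necessarily the only one.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness(graph, node):
    """Reachable nodes divided by the total distance to them."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness(graph):
    """Unnormalized betweenness via Brandes' algorithm."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: score[v] / 2 for v in score}

# Hypothetical chain of five people: A - B - C - D - E
chain = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
print({v: round(closeness(chain, v), 3) for v in chain})
print(betweenness(chain))
```

In this chain the middle node C scores highest on both measures, since it is the fewest steps from everyone and sits on the most shortest paths.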

Influence
One of the most important outcomes of SNA is determining who the influencers across a network are, as well as their level of influence. There are various ways to locate influencers, such as counting followers, friends or connections, or measuring the level of activity on social media. However, more refined models are needed to locate influencers reliably.

PageRank
One such model is the PageRank algorithm developed by Google’s founders. It assigns a ‘PageRank’ value to each web page based on how many times it is linked to by other web pages. The ‘PageRank’ value of a site is also a function of the ‘PageRank’ values of the sites that link to it. For example, an article on CNN.com linking to a small news organization’s website increases the ‘PageRank’ value of that site much more than a link from a local blogger’s site with few followers. Applying the ‘PageRank’ concept, and assuming the right information is available, it can be a simple process to identify influential bloggers or social media actors across the web. For example, looking at the number of followers a Twitter account has and measuring how many times its tweets have been retweeted, and by whom, can give an estimate of how influential the person is on Twitter. The same can be done on LinkedIn by measuring a person’s number of contacts and endorsements, as well as the seniority and employers of those contacts.
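The idea can be sketched with a simple power-iteration version of PageRank. This is a minimal illustration, not Google’s production algorithm: the damping factor of 0.85, the number of iterations, the treatment of pages without outgoing links, and all site names are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict of {page: [pages it links to]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Pages with no outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical web: a big news site with many inbound links, and a small blog.
web = {
    "reader1": ["big_news"],
    "reader2": ["big_news"],
    "reader3": ["big_news"],
    "big_news": ["site_a"],
    "blog": ["site_b"],
}
ranks = pagerank(web)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

Here site_a and site_b each receive exactly one link, but site_a ends up with the higher value because its link comes from the heavily linked big_news page rather than from the little-known blog.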

In vs. Out Degree
This method is simpler and more traditional than ‘PageRank’. It holds that the people whom the rest of the network engages with the most have the most influence. For example, in online forums, experts are those who reply to the most questions and whose replies are in turn replied to and commented on by others. This is also observed in online reviews, where others can vote on the accuracy and reliability of a review. Another example is that thousands of people may reply to a celebrity’s tweet, but celebrities often only acknowledge or engage with replies from other celebrities or popular figures. Hence, for the purposes of this paper, determining which users an already known influencer or media outlet replies to or converses with, regardless of platform, could point to a new influential node or user.
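A minimal sketch of this approach is shown below. It assumes interactions are available as (who replied, whom they replied to) pairs; the user names and the helper `engaged_by` are hypothetical and exist only for this illustration.

```python
from collections import Counter

def degree_counts(interactions):
    """Count incoming and outgoing interactions per user."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in interactions:
        out_degree[source] += 1
        in_degree[target] += 1
    return in_degree, out_degree

def engaged_by(interactions, influencer):
    """Users that a known influencer replies to or engages with."""
    return {target for source, target in interactions if source == influencer}

# Hypothetical reply log: (replier, replied_to)
replies = [
    ("fan1", "celebrity"),
    ("fan2", "celebrity"),
    ("fan3", "celebrity"),
    ("celebrity", "rising_star"),
    ("fan1", "rising_star"),
]
in_degree, out_degree = degree_counts(replies)
print(in_degree.most_common())           # who receives the most engagement
print(engaged_by(replies, "celebrity"))  # candidates for new influencers
```

A high in-degree marks users the network engages with, while the targets of a known influencer’s own replies are candidates for newly influential users.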

Community
In the crowded world of online social networking, people are often influenced not by a single friend or influential figure but by their communities. Communities facilitate the spread of some information and memes while minimizing outside influence and information flow. In the social media age, however, it is hard not to be exposed to diverse thoughts or ideas. Social media also allows people to diversify their communities, or sometimes to become fanatics by adopting extreme versions of their offline communities. For example, a religious person belonging to a small offline religious community might become radicalized only after being exposed online to extreme communities and views of his or her faith.

To track how influential a community is, a few factors need to be considered. First, it is important to consider how the community is structured and where within the community network the target people are located. People closer to the community hubs are much more likely to be active and to follow community traditions than those on the edges. It can also be argued that the more friends and family members a person has in the same community, the more closely his or her identity will resemble that of the community. As such, communities can be very influential as long as they have little competition from other communities, and that competition is strongest when the target person’s social network is made up of diverse communities. To illustrate, someone born and raised in Saudi Arabia will have a very different community and social network diversity, and hence identity, than someone born and raised in Lebanon or Turkey.

Lastly, social networks are ‘Assortative’, meaning people gravitate toward others with characteristics similar to their own. However, the Internet is disassortative and allows for a more diverse flow of information.
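One crude way to see this tendency in data is to measure the share of ties that connect people from the same group. The sketch below is a simplification for illustration, not a formal assortativity coefficient, and the people, groups and ties are invented toy data.

```python
def same_group_share(edges, group_of):
    """Share of ties that connect two people in the same group."""
    if not edges:
        return 0.0
    same = sum(1 for a, b in edges if group_of[a] == group_of[b])
    return same / len(edges)

# Hypothetical people and the community each belongs to.
group_of = {"ana": "x", "ben": "x", "cy": "x", "dee": "y", "eli": "y"}

# Hypothetical ties: offline ties stay within groups, online ties cross them.
offline_ties = [("ana", "ben"), ("ben", "cy"), ("ana", "cy"), ("dee", "eli")]
online_ties = [("ana", "dee"), ("ben", "eli"), ("cy", "dee"), ("ana", "ben")]

print(same_group_share(offline_ties, group_of))
print(same_group_share(online_ties, group_of))
```

A share close to 1 suggests an assortative network in which like connects with like, while a low share suggests ties that mostly cross group boundaries.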

Implications
The methods and concepts above can be leveraged effectively to collect and analyze online social network data as well as to predict future contagions and viral memes. Specifically, identifying network ‘hubs’ or influencers and understanding the contagion threshold across various networks can be a very effective way of mitigating risks and seizing opportunities. We have to look no further than the YouTube video of the Florida pastor attempting to burn Islam’s Holy book, and the protests and deaths it caused across the Muslim world, to comprehend how important social networking analysis can be. There are thousands of offensive videos uploaded to YouTube every day; the question SNA tries to answer is why one particular video goes viral. In the case of the Florida pastor video, an Egyptian TV channel found the video and broadcast it across Egypt and eventually the Muslim world, causing it to go viral, and there is no guarantee that it won’t happen again. The question then becomes how to use social media effectively to mitigate and undermine such ideological and destructive contagions. Would the escalation of violence have been as bad if some influential clerics had dismissed the video as foolish and not worth the death and destruction it eventually caused? Could the Department of State have done a better job on social media to calm tensions across the Muslim world? Or, by monitoring Twitter activity and hashtags, could the news media have detected the extreme anger that was building up, further fueled by various Middle Eastern governments?

Social networking data and analysis methodologies are great for collecting and analyzing trends and contagions, but SNA will have to further incorporate the social, cultural and behavioral sciences to be most effective. Different cultures or ethnicities behave differently online, especially on social media. For example, Americans, being used to advertising and commercials, are much less likely to click on Facebook ads than their eastern counterparts, for whom the Internet, Facebook and advertising are all new concepts. Further, some cultures have a very indirect way of signaling or communicating, as opposed to the direct communication style of westerners and the English language. For example, in Persian and Arabic it is common to use unusual or exaggerated analogies and metaphors to communicate an idea or a thought. A smart SNA approach would use customized profiles that account for these cultural nuances and differences.
