- Using pandas to read the CSV file
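
As a rough illustration of this step, the sketch below loads a small in-memory CSV with `pd.read_csv`. The column names (`tweet_id`, `author`, `original_author`, `text`) and the sample rows are assumptions made for the example, not the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file; column names are assumptions.
csv_text = """tweet_id,author,original_author,text
1,alice,,Original post
2,bob,alice,RT @alice: Original post
3,carol,alice,RT @alice: Original post
"""

# pd.read_csv accepts a file path or any file-like object.
tweets = pd.read_csv(io.StringIO(csv_text))
print(tweets.shape)
```

With a real file, the `io.StringIO(...)` argument would simply be replaced by the path to the CSV.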

Since the data wasn't collected in one go, there are duplicates that we need to remove.
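
A minimal sketch of the deduplication, assuming each tweet carries a unique identifier in a hypothetical `tweet_id` column. If no such column exists, calling `drop_duplicates()` without `subset` compares whole rows instead.

```python
import pandas as pd

# Toy data: the tweet with id 2 was collected twice (hypothetical columns).
tweets = pd.DataFrame({
    "tweet_id": [1, 2, 2, 3],
    "author": ["alice", "bob", "bob", "carol"],
})

# Keep the first occurrence of each tweet id and renumber the index.
deduped = tweets.drop_duplicates(subset="tweet_id", keep="first").reset_index(drop=True)
print(len(tweets), "->", len(deduped))
```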

First, let's make sure we are using NetworkX (`nx`) version 2.1.
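
One way to check the installed version is through `nx.__version__`. The helper `major_minor` below is a hypothetical convenience for this sketch, not part of NetworkX.

```python
import re

import networkx as nx


def major_minor(version):
    """Return the (major, minor) part of a version string such as '2.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


installed = major_minor(nx.__version__)
print("NetworkX version:", nx.__version__)
if installed != (2, 1):
    print("Note: this notebook was written against NetworkX 2.1.")
```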

The graph will be a directed multigraph where the source of each edge is the author of the tweet and the target is the author of the original tweet.

We will need to remove the rows where we don't have an original author, i.e. the tweets that aren't retweets or quotes.
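
The filtering and graph construction could look roughly like the sketch below. The column names `author` and `original_author` are assumptions, and missing original authors are assumed to be stored as missing values (`None`/`NaN`).

```python
import networkx as nx
import pandas as pd

# Toy data: dave's tweet is neither a retweet nor a quote (hypothetical columns).
tweets = pd.DataFrame({
    "author": ["bob", "carol", "bob", "dave"],
    "original_author": ["alice", "alice", "alice", None],
})

# Drop rows with no original author.
retweets = tweets.dropna(subset=["original_author"])

# One edge per retweet/quote: author -> original author.
# A MultiDiGraph keeps parallel edges, so repeated retweets are preserved.
graph = nx.from_pandas_edgelist(
    retweets,
    source="author",
    target="original_author",
    create_using=nx.MultiDiGraph(),
)
print(graph.number_of_nodes(), graph.number_of_edges())
```

Here bob retweeted alice twice, so the graph holds two parallel edges from `bob` to `alice`.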

We can see that the out-degrees follow a power law.
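
A quick way to inspect the distribution is to count how many nodes have each out-degree. On a log-log plot, a power law shows up as a roughly straight line. The toy graph below is an assumption and far too small to show that shape; it only illustrates the counting.

```python
from collections import Counter

import networkx as nx

# Toy retweet graph (hypothetical users).
graph = nx.MultiDiGraph()
graph.add_edges_from([
    ("bob", "alice"), ("carol", "alice"), ("bob", "alice"),
    ("bob", "erin"), ("dave", "alice"),
])

# out_degree() yields (node, degree) pairs; parallel edges count separately.
out_degrees = [degree for _, degree in graph.out_degree()]

# Map each out-degree value to the number of nodes that have it.
degree_counts = Counter(out_degrees)
for degree, count in sorted(degree_counts.items()):
    print(degree, count)
```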

Next, we display the top 30 nodes.
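
The text does not say which measure the ranking uses. The sketch below assumes nodes are ranked by in-degree, i.e. by how often a user was retweeted or quoted.

```python
import networkx as nx

# Toy retweet graph (hypothetical users).
graph = nx.MultiDiGraph()
graph.add_edges_from([
    ("bob", "alice"), ("bob", "alice"), ("carol", "alice"),
    ("dave", "alice"), ("bob", "erin"),
])

# Assumption: "top" means highest in-degree (most retweeted).
top_nodes = sorted(graph.in_degree(), key=lambda pair: pair[1], reverse=True)[:30]
for node, degree in top_nodes:
    print(node, degree)
```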

The idea is that there will be two large communities: one made up of the people actually boycotting, and the other made up of the people criticizing the boycotters. This is why we take only the first tuple of communities detected by the Girvan-Newman (GN) algorithm.
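
In NetworkX, `girvan_newman` returns an iterator of tuples of node sets, one tuple per level of the hierarchy, so `next(...)` gives the first split. The sketch below uses a small undirected toy graph (two cliques joined by one edge) rather than the retweet graph.

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Two 5-node cliques joined by a single bridge edge.
graph = nx.barbell_graph(5, 0)

# The first tuple yielded is the coarsest split found by the algorithm.
first_split = next(girvan_newman(graph))
for community in first_split:
    print(sorted(community))
```

On a connected graph, this first tuple contains two communities, which matches the two-sided structure we expect here.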

We'll compare the 10 most retweeted people in each community, along with their top tweets.
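
A possible way to rank people within each community, assuming "most retweeted" means highest in-degree in the retweet graph. The helper `most_retweeted`, the users, and the community sets are hypothetical; looking up the top tweets would additionally need the tweet table and is not shown.

```python
import networkx as nx

# Toy retweet graph with two groups of hypothetical users.
graph = nx.MultiDiGraph()
graph.add_edges_from([
    ("bob", "alice"), ("carol", "alice"), ("bob", "carol"),
    ("frank", "erin"), ("grace", "erin"), ("grace", "frank"), ("heidi", "erin"),
])

# Stand-in for the two communities found by community detection.
communities = (
    {"alice", "bob", "carol"},
    {"erin", "frank", "grace", "heidi"},
)


def most_retweeted(graph, members, limit=10):
    """Rank the given members by their in-degree in the whole graph."""
    ranked = sorted(graph.in_degree(members), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]


for members in communities:
    print(most_retweeted(graph, members))
```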

The weight is the number of edges connecting the two nodes in the multigraph.
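
One way to derive such weights is to collapse the parallel edges of the multigraph into a simple graph with a `weight` attribute. The sketch below keeps edge direction, which is an assumption; the toy edges are hypothetical.

```python
import networkx as nx

# Toy multigraph: bob retweeted alice twice.
multi = nx.MultiDiGraph()
multi.add_edges_from([
    ("bob", "alice"), ("bob", "alice"), ("carol", "alice"),
])

# Collapse parallel edges into one weighted edge per (source, target) pair.
weighted = nx.DiGraph()
weighted.add_nodes_from(multi.nodes())
for source, target in multi.edges():
    if weighted.has_edge(source, target):
        weighted[source][target]["weight"] += 1
    else:
        weighted.add_edge(source, target, weight=1)

print(list(weighted.edges(data="weight")))
```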

