Table 2 presents the relationship between gender and whether a user produced a geotagged tweet during the study period.

Although there is some work that questions whether the 1% API is random in relation to tweet content such as hashtags and LDA analysis, Twitter maintains that the sampling algorithm is “entirely agnostic to any substantive metadata” and is therefore “a fair and proportional representation across all cross-sections”. Because we would not expect any systematic bias to be present in the data due to the nature of the 1% API stream, we consider this data to be a random sample of the Twitter population. We have no a priori reason for thinking that users tweeting during the study period are not representative of the population, and we can therefore apply inferential statistics and significance tests to test hypotheses about whether users with geoservices and geotagging enabled differ from those without. There may well be users who have produced geotagged tweets but who are not picked up in the 1% API stream. This will always be a limitation of any research that does not use 100% of the data, and it is an important qualification in any research with this data source.

Twitter terms and conditions prevent us from publicly sharing the metadata provided by the API, therefore ‘Dataset1’ and ‘Dataset2’ contain only the user ID (which is acceptable) and the demographics we have derived: tweet language, gender, age and NS-SEC. Replication of this work can be conducted by individual researchers using the user IDs to collect the Twitter-delivered metadata that we cannot share.

Location Services vs. Geotagging Individual Tweets

Looking at all users (‘Dataset1’), overall 58.4% (n = 17,539,891) of users do not have location services enabled whilst 41.6% do (n = 12,480,555), thus showing that most users do not choose this setting. Conversely, the proportion of those with the setting enabled is high given that users must opt in. When excluding retweets (‘Dataset2’) we see that 96.9% (n = 23,058,166) have no geotagged tweets in the dataset whilst 3.1% (n = 731,098) do. This is much higher than previous estimates of geotagged content of around 0.85%, as the focus of this study is on the proportion of users with this characteristic rather than the proportion of tweets. However, it is notable that although a substantial proportion of users enabled the global setting, very few then go on to actually geotag their tweets, thus showing clearly that enabling location services is a necessary but not sufficient condition of geotagging.
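As a quick arithmetic check, the percentages quoted above can be recomputed from the reported counts. The following is a minimal Python sketch; the variable and function names are illustrative and are not taken from the study.

```python
# Counts as reported in the text for 'Dataset1' and 'Dataset2'.
location_not_enabled = 17_539_891
location_enabled = 12_480_555
no_geotagged_tweets = 23_058_166
geotagged_tweets = 731_098


def share(part, *others):
    """Return `part` as a percentage of `part` plus all other groups."""
    total = part + sum(others)
    return 100 * part / total


pct_enabled = share(location_enabled, location_not_enabled)
pct_geotagged = share(geotagged_tweets, no_geotagged_tweets)

print(f"Location services enabled: {pct_enabled:.1f}%")
print(f"Users with a geotagged tweet: {pct_geotagged:.1f}%")
```

Both figures are proportions of users, not of tweets, which is the distinction the paragraph draws against the earlier estimate of around 0.85%.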


Table 1 is a crosstabulation of whether location services are enabled and gender (identified using the method proposed by Sloan et al. 2013). Gender could be identified for 11,537,140 individuals (38.4%), and males are slightly less likely to enable the setting than females or users with names classified as unisex. There is a clear discrepancy in the unknown group, with a disproportionate number of users opting for ‘not enabled’. As the gender detection algorithm looks for an identifiable first name using a database of over 40,000 names, this suggests an association between users who do not give their first name and users who do not opt in to location services (such as organisational and business accounts or those conscious of maintaining a level of privacy). When removing the unknowns, the relationship between gender and enabling location services is statistically significant (χ² = 11, 3 df, p<0.001), as is the effect size despite being very small (Cramer's V = 0.008, p<0.001).
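To illustrate how a chi-square statistic and Cramer's V are obtained from a crosstabulation of this kind, here is a minimal Python sketch. The cell counts of Table 1 are not reproduced in this text, so the table below uses hypothetical counts chosen purely for illustration. The functions compute the standard Pearson statistic without continuity correction and are not necessarily the procedure the authors used.

```python
import math


def chi_square(table):
    """Pearson chi-square for a table given as a list of rows.

    Returns (statistic, degrees_of_freedom, total_count).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, n


def cramers_v(table):
    """Cramer's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    stat, _, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


# HYPOTHETICAL counts (not from the study): rows are gender categories,
# columns are [location services not enabled, location services enabled].
hypothetical_table = [
    [580, 420],  # male
    [570, 430],  # female
    [575, 425],  # unisex
]

stat, df, n = chi_square(hypothetical_table)
print(f"chi2 = {stat:.3f}, df = {df}, n = {n}")
print(f"Cramer's V = {cramers_v(hypothetical_table):.4f}")
```

Because Cramer's V divides the chi-square statistic by the sample size, a very large sample can yield a highly significant test alongside a very small effect size, which is the pattern reported above.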

Male users are more likely to geotag their tweets than female users, but only by 0.1%. Users for whom the gender is unknown show a lower geotagging rate, but most interesting is the gap between unisex geotaggers and male/female users, which is notably larger for geotagging than for enabling location services. This means that although similar proportions of users with unisex names enabled location services as those with male or female names, they are notably less likely to geotag their tweets than male or female users. When removing unknowns the difference is statistically significant (χ² = , 2 df, p<0.001) with a small effect size (Cramer's V = 0.011, p<0.001).