Over the past decade, online social networks (OSNs) have experienced unprecedented growth, attracting billions of users across the globe. These platforms enable individuals to connect and share content, removing the barriers of time and location that limit offline social interactions. Within these platforms, Facebook public pages are a prominent type of OSN community, offering spaces for user discussions, business promotions, and public relations activities. These communities interact with one another, forming an online community network.
Even in online spaces, people's behavior remains closely linked to location. Geolocation information enables online social communities to make recommendations and promote local businesses and services. This dissertation explores how to classify the geolocation of online communities and examines how geolocation contributes to neighborhood formation within online community networks.
The dissertation introduces neighborhood state distribution vectors as novel features for graph neural networks (GNNs) that classify the state in which a Facebook public page is located. It also defines intrastate and interstate Facebook public pages based on the high-probability state labels output by the classification model. Furthermore, it profiles states with varying influence over online communities through an analysis of the classification confusion matrix, the percentage of interstate pages, and the presence of interstate pages across state borders. This approach achieves an accuracy of 0.88 and an F1 score of 0.88, improving on previous studies.
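As an illustration of the feature described above, the following is a minimal sketch of one plausible reading of a neighborhood state distribution vector: for each page, the normalized histogram of known state labels among its neighbors. The function name, the undirected treatment of edges, and the use of `None` for unlabeled pages are assumptions made for this example, not details taken from the dissertation.

```python
from collections import Counter


def neighborhood_state_distribution(edges, labels, num_states):
    """Return one list per node: the normalized histogram of known state
    labels among that node's neighbors (all zeros if none are labeled)."""
    n = len(labels)
    counts = [Counter() for _ in range(n)]
    for u, v in edges:  # assumption: the page graph is treated as undirected
        if labels[v] is not None:
            counts[u][labels[v]] += 1
        if labels[u] is not None:
            counts[v][labels[u]] += 1
    vectors = []
    for c in counts:
        total = sum(c.values())
        # Pages with no labeled neighbors get an all-zero vector.
        vectors.append([c[s] / total if total else 0.0 for s in range(num_states)])
    return vectors


# Toy graph: pages 0..4, two states (0 and 1), page 3 unlabeled, page 4 isolated.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
toy_labels = [0, 1, 1, None, 0]
toy_vectors = neighborhood_state_distribution(toy_edges, toy_labels, num_states=2)
```

Under this reading, each vector could be concatenated with a page's other attributes and supplied to the classifier as node features.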
Additionally, the dissertation identifies key features that influence link formation and neighborhood structure within the page graph, using a methodology that combines node similarity with a topology-based graph neural network (GNN) for link prediction. The study shows that a page's state location is the most significant single feature for link formation. Adding page node degree and page city population to the state location feature further improves link prediction accuracy.
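To make the node-similarity side of this methodology concrete, the sketch below builds a small feature vector for a candidate link from the three page attributes named above. The function `pair_features`, the log scaling, and the use of absolute gaps are hypothetical choices for illustration; the abstract does not specify how the dissertation encodes these features.

```python
import math


def pair_features(u, v, state, degree, city_population):
    """Similarity features for a candidate link between pages u and v."""
    same_state = 1.0 if state[u] == state[v] else 0.0
    # Log scale keeps very high-degree pages and very large cities from dominating.
    degree_gap = abs(math.log1p(degree[u]) - math.log1p(degree[v]))
    population_gap = abs(
        math.log1p(city_population[u]) - math.log1p(city_population[v])
    )
    return [same_state, degree_gap, population_gap]


# Toy attributes for three hypothetical pages.
toy_state = {"a": "TX", "b": "TX", "c": "NY"}
toy_degree = {"a": 3, "b": 3, "c": 10}
toy_population = {"a": 1000, "b": 1000, "c": 5000}
features_ab = pair_features("a", "b", toy_state, toy_degree, toy_population)
features_ac = pair_features("a", "c", toy_state, toy_degree, toy_population)
```

A vector of this kind could be scored by any binary classifier, or combined with node embeddings learned by a GNN.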
Lastly, the dissertation explores city, county, and cluster neighborhood distribution vectors as unique features for page classification. Distinguishing among 630 cities is challenging: the initial city classification accuracy is 0.6928. To address this, a clustering algorithm is developed that uses the confusion matrix from city classification to construct a hierarchical city structure. The resulting cluster-city hierarchical classification strategy improves city classification accuracy to 0.8014.
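The idea of deriving clusters from a confusion matrix can be sketched as follows. This is a simple stand-in, not the dissertation's algorithm: it merges any two classes whose symmetrized confusion rate reaches a threshold, using a union-find structure. The function name, the threshold parameter, and the merging rule are all assumptions made for the example.

```python
def cluster_from_confusion(confusion, threshold):
    """Group class indices whose symmetrized confusion rate is >= threshold."""
    n = len(confusion)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Guard against empty rows so the rate below never divides by zero.
    row_totals = [sum(row) or 1 for row in confusion]
    for i in range(n):
        for j in range(i + 1, n):
            # Fraction of i's samples predicted as j, plus the reverse.
            rate = confusion[i][j] / row_totals[i] + confusion[j][i] / row_totals[j]
            if rate >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy confusion matrix for four cities: cities 0 and 1 are often confused.
toy_confusion = [
    [8, 2, 0, 0],
    [3, 7, 0, 0],
    [0, 0, 9, 1],
    [0, 0, 0, 10],
]
toy_clusters = cluster_from_confusion(toy_confusion, threshold=0.3)
loose_clusters = cluster_from_confusion(toy_confusion, threshold=0.05)
```

With clusters of this kind, a hierarchical classifier would first predict the cluster and then choose among only the cities inside it, so that frequently confused cities are separated by a dedicated second-stage model.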