eScholarship
Open Access Publications from the University of California

School of Information

Recent Work, UC Berkeley

Automatically Assessing the Quality of Wikipedia Articles

Abstract

Since its inception in 2001, Wikipedia has fast become one of the Internet's most dominant sources of information. Dubbed "the free encyclopedia", Wikipedia contains millions of articles that are written, edited, and maintained by volunteers. Due in part to the open, collaborative process by which content is generated, many have questioned the reliability of these articles. The high variance in quality between articles is a potential source of confusion that likely leaves many visitors unable to distinguish between good articles and bad. In this work, we describe how a very simple metric, word count, can be used as a proxy for article quality, and discuss the implications of this result for Wikipedia in particular and quality assessment in general.
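As a rough illustration of the idea, the following Python sketch treats word count as a quality proxy by comparing it against a cutoff. The function names and the threshold value are assumptions made for this example only; the abstract does not specify a cutoff or a classification procedure, so this should not be read as the authors' method.

```python
# Hypothetical cutoff chosen only for illustration; the abstract does not state one.
WORD_COUNT_THRESHOLD = 2000


def word_count(text: str) -> int:
    """Count whitespace-delimited tokens in the article text."""
    return len(text.split())


def predicted_quality(text: str, threshold: int = WORD_COUNT_THRESHOLD) -> str:
    """Label an article 'high' or 'low' quality using word count alone.

    This is a sketch of a length-based proxy, not a validated classifier.
    """
    return "high" if word_count(text) >= threshold else "low"


if __name__ == "__main__":
    stub = "A short stub with very little content."
    print(word_count(stub), predicted_quality(stub))
```

In practice a cutoff like this would need to be fitted against articles whose quality has already been rated; the value used here is a placeholder.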
