Corpora are excellent resources for language learning, particularly in light of research showing the importance of frequently used word clusters (called lexical bundles or collocations) in promoting learner fluency. However, for Chinese, most of the available corpus resources seem unrepresentative of how native speakers use the language in everyday life, possibly because writers’ awareness of censorship on the Chinese internet influences what they write.
The goal of this project was to create a corpus of reliably natural text from China’s national newspaper, The People’s Daily (人民日报, 人民网), in order to identify lexical bundles that create structure in Chinese sentences in the news register. The corpus was extracted from The People’s Daily website using a web crawler and comprises more than five hundred articles and about one million Chinese characters.
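The abstract does not say how the lexical bundles were identified, so the following is only a minimal sketch of one common approach, under stated assumptions: character n-grams are counted within runs of Chinese characters, and a sequence is kept only if it is both frequent and found in more than one article. The function names, the thresholds, and the two sample sentences are hypothetical and are not taken from the actual corpus.

```python
import re
from collections import Counter

# Runs of CJK ideographs; punctuation, digits, and Latin text act as boundaries.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def char_ngrams(text, n):
    """Yield every n-character sequence that does not cross a boundary."""
    for run in HAN_RUN.findall(text):
        for i in range(len(run) - n + 1):
            yield run[i:i + n]


def frequent_bundles(articles, n=4, min_count=2, min_articles=2):
    """Return (ngram, count) pairs that pass both thresholds (assumed values)."""
    counts = Counter()  # total occurrences of each n-gram
    spread = Counter()  # number of distinct articles containing each n-gram
    for article in articles:
        grams = list(char_ngrams(article, n))
        counts.update(grams)
        spread.update(set(grams))
    kept = [(g, c) for g, c in counts.items()
            if c >= min_count and spread[g] >= min_articles]
    # Most frequent first; ties broken by the n-gram itself for stable output.
    return sorted(kept, key=lambda gc: (-gc[1], gc[0]))


# Toy input invented for illustration only.
sample = [
    "在此基础上，我们要进一步推动发展。",
    "在此基础上，进一步推动改革。",
]
bundles = frequent_bundles(sample, n=5)
```

On this toy input, the only five-character sequences shared by both sentences are 在此基础上 and 进一步推动. On a real corpus, the thresholds would need tuning, and a word segmenter could replace the character-level counting used here.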
Accordingly, the presentation will reveal several trends in collocation, which can serve as a resource for developing Chinese learning materials, especially materials that build reading skills in the domain of news articles.
Josleyn, Randy, "Frequently Used Phrases in China's National Newspaper." (2016). 2016 Undergraduate Research and Scholarship Conference. Paper 93.