Web Content Analysis: Expanding the Paradigm

Abstract

Are established methods of content analysis (CA) adequate to analyze web content, or should new methods be devised to address new technological developments? This article addresses this question by contrasting narrow and broad interpretations of the concept of web content analysis. The utility of a broad interpretation that subsumes the narrow one is then illustrated with reference to research on weblogs (blogs), a popular web format in which features of HTML documents and interactive computer-mediated communication converge. The article concludes by proposing an expanded Web Content Analysis (WebCA) paradigm in which insights from paradigms such as discourse analysis and social network analysis are operationalized and implemented within a general content analytic framework.


Notes

  1. While McMillan (2000) acknowledges that “the size of the sample depends on factors such as the goals of the study” (p. 2, emphasis added), she does not mention that different research goals/questions might call for different types of samples. Rather, she asserts that random samples are required for “rigor” in all CA studies, a claim that many researchers would dispute (see, e.g., note 5).

  2. For descriptions of these and other classic interrater reliability measures, see Scott (1955), Holsti (1969), and Krippendorff (1980, 2008).

  3. In a review of 25 years of content analyses, Riffe and Freitag (1997; cited in Weare & Lin, 2000) found that most studies were based on convenience or purposive samples; only 22.2% of the studies attempted to be representative of the population of interest.

  4. On grounded theory, see Glaser and Strauss (1967).

  5. Herring (2004, p. 350) notes that “in CMDA, [sampling] is rarely done randomly, since random sampling sacrifices context, and context is important in interpreting discourse analysis results.”

  6. This estimate is based on a report that the number of blogs created at major hosts was 134-144 million in October 2005 (http://www.blogherald.com/2005/10/10/the-blog-herald-blog-count-october-2005/, accessed December 7, 2007). Blog creation, especially in countries outside the U.S., has increased since then, although many blogs have also been abandoned (Wikipedia, June 28, 2008).

  7. The (We)blog Research on Genre (BROG) project. See http://en.wikipedia.org/wiki/BROG, accessed August 26, 2009.

  8. For example, Herring, Scheidt, et al. (2004, 2005) found that, contrary to popular claims that blog entries typically contain links and often link to other blogs, the average number of links per entry in randomly selected blogs was .65, and most entries contained 0 links. Moreover, the majority of links were to websites created by others, with links to other blogs coming in a distant third.

  9. See, e.g., Herring, Scheidt, et al. (2004, 2005); Mishne and Glance (2006).

  10. This study is an exception to the generalization that most computational web studies do not orient toward content analysis. The stated goal of Nakajima et al. (2005, p. 1) is to capture and analyze “conversational web content” in blogs.
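
As an illustration of the classic interrater reliability measures cited in note 2, the following is a minimal Python sketch of two of them for the case of two coders assigning nominal categories to the same items: simple agreement in the form usually attributed to Holsti (1969), and Scott's (1955) pi, which corrects observed agreement for agreement expected by chance. The function names and the category labels in the usage note are hypothetical and chosen only for illustration; this is not code from the chapter.

```python
from collections import Counter


def holsti_agreement(coder_a, coder_b):
    """Simple agreement, 2M / (N1 + N2), for two coders.

    With both coders rating the same items this equals the
    proportion of items on which they assigned the same category.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty item set")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 2 * matches / (len(coder_a) + len(coder_b))


def scotts_pi(coder_a, coder_b):
    """Scott's pi: (Po - Pe) / (1 - Pe) for two coders, nominal data.

    Pe is computed from the pooled category proportions of both
    coders, which is what distinguishes pi from Cohen's kappa.
    """
    observed = holsti_agreement(coder_a, coder_b)
    n_items = len(coder_a)
    pooled = Counter(coder_a) + Counter(coder_b)
    expected = sum((count / (2 * n_items)) ** 2 for count in pooled.values())
    if expected == 1.0:
        # Only one category was ever used, so pi is undefined.
        raise ValueError("pi is undefined when only one category is used")
    return (observed - expected) / (1 - expected)
```

For example, if one coder labels four entries `["filter", "filter", "journal", "journal"]` and a second coder labels them `["filter", "filter", "journal", "filter"]`, simple agreement is 0.75, while Scott's pi is lower (7/15, about 0.47) because some of that agreement would be expected by chance.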

References

  • Ali-Hasan, N., & Adamic, L. (2007). Expressing social relationships on the blog through links and comments. Paper presented at the international conference for weblogs and social media, Boulder, CO.

  • Balog, K., Mishne, G., & de Rijke, M. (2006). Why are they excited? Identifying and explaining spikes in blog mood levels. Paper presented at the 11th meeting of the European Chapter of the Association for Computational Linguistics, Trento, Italy.

  • Baran, S. J. (2002). Introduction to mass communication (2nd ed.). New York: McGraw-Hill.

  • Bates, M. J., & Lu, S. (1997). An exploratory profile of personal home pages: Content, design, metaphors. Online and CDROM Review, 21(6), 331–340.

  • Bauer, M. (2000). Classical content analysis: A review. In M. W. Bauer & G. Gaskell (Eds.), Qualitative researching with text, image, and sound: A practical handbook (pp. 131–151). London: Sage.

  • Berelson, B. (1952). Content analysis in communication research. New York: Free Press.

  • Berelson, B., & Lazarsfeld, P. F. (1948). The analysis of communication content. Chicago/New York: University of Chicago and Columbia University.

  • Blood, R. (2002). Introduction. In J. Rodzvilla (Ed.), We’ve got blog: How weblogs are changing our culture (pp. ix–xiii). Cambridge, MA: Perseus.

  • Bush, C. R. (1951). The analysis of political campaign news. Journalism Quarterly, 28(2), 250–252.

  • Dimitrova, D. V., & Neznanski, M. (2006). Online journalism and the war in cyberspace: A comparison between U.S. and international newspapers. Journal of Computer-Mediated Communication, 12(1), Article 13. Retrieved from http://jcmc.indiana.edu/vol12/issue1/dimitrova.html

  • Efimova, L., & de Moor, A. (2005). Beyond personal web publishing: An exploratory study of conversational blogging practices. Proceedings of the Thirty-Eighth Hawai’i International Conference on System Sciences. Los Alamitos, CA: IEEE.

  • Fogg, B. J., Kameda, T., Boyd, J., Marshall, J., Sethi, R., Sockol, M., et al. (2002). Stanford-Makovsky web credibility study 2002: Investigating what makes web sites credible today. Retrieved from http://captology.stanford.edu/pdf/Stanford-MakovskyWebCredStudy2002-prelim.pdf

  • Foot, K. A., Schneider, S. M., Dougherty, M., Xenos, M., & Larsen, E. (2003). Analyzing linking practices: Candidate sites in the 2002 U.S. electoral Web sphere. Journal of Computer-Mediated Communication, 8(4). Retrieved from http://jcmc.indiana.edu/vol8/issue4/foot.html

  • Gibson, D., Kleinberg, J., & Raghavan, P. (1998). Inferring web communities from link topology. Proceedings of the 9th ACM Conference on Hypertext and Hypermedia. Pittsburgh, PA: ACM.

  • Glaser, B., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research. Chicago: Aldine.

  • Herring, S. C. (2004). Computer-mediated discourse analysis: An approach to researching online behavior. In S. A. Barab, R. Kling, & J. H. Gray (Eds.), Designing for virtual communities in the service of learning (pp. 338–376). New York: Cambridge University Press.

  • Herring, S. C., & Paolillo, J. C. (2006). Gender and genre variation in weblogs. Journal of Sociolinguistics, 10(4), 439–459.

  • Herring, S. C., Kouper, I., Paolillo, J., Scheidt, L. A., Tyworth, M., Welsch, P., et al. (2005). Conversations in the blogosphere: An analysis “from the bottom up.” Proceedings of the Thirty-Eighth Hawai’i International Conference on System Sciences. Los Alamitos, CA: IEEE.

  • Herring, S. C., Scheidt, L. A., Bonus, S., & Wright, E. (2004). Bridging the gap: A genre analysis of weblogs. Proceedings of the Thirty-Seventh Hawai’i International Conference on System Sciences. Los Alamitos, CA: IEEE.

  • Herring, S. C., Scheidt, L. A., Bonus, S., & Wright, E. (2005). Weblogs as a bridging genre. Information Technology & People, 18(2), 142–171.

  • Herring, S. C., Scheidt, L. A., Kouper, I., & Wright, E. (2006). Longitudinal content analysis of weblogs: 2003–2004. In M. Tremayne (Ed.), Blogging, citizenship, and the future of media (pp. 3–20). London: Routledge.

  • Holsti, O. R. (1969). Content analysis for the social sciences and humanities. Reading, MA: Addison Wesley.

  • Huffaker, D. A., & Calvert, S. L. (2005). Gender, identity and language use in teenage blogs. Journal of Computer-Mediated Communication, 10(2). Retrieved from http://jcmc.indiana.edu/vol10/issue2/huffaker.html

  • Jackson, M. (1997). Assessing the structure of communication on the world wide web. Journal of Computer-Mediated Communication, 3(1). Retrieved from http://www.ascusc.org/jcmc/vol3/issue1/jackson.html

  • Krippendorff, K. (1980). Content analysis: An introduction to its methodology. Newbury Park: Sage.

  • Krippendorff, K. (2008). Testing the reliability of content analysis data: What is involved and why. In K. Krippendorff & M. A. Bock (Eds.), The content analysis reader (pp. 350–357). Thousand Oaks, CA: Sage. Retrieved from http://www.asc.upenn.edu/usr/krippendorff/dogs.html

  • Kutz, D. O., & Herring, S. C. (2005). Micro-longitudinal analysis of web news updates. Proceedings of the Thirty-Eighth Hawai’i International Conference on System Sciences. Los Alamitos, CA: IEEE.

  • McMillan, S. J. (2000). The microscope and the moving target: The challenge of applying content analysis to the world wide web. Journalism and Mass Communication Quarterly, 77(1), 80–98.

  • Mishne, G., & Glance, N. (2006). Leave a reply: An analysis of weblog comments. Proceedings of the 3rd Annual Workshop on the Weblogging Ecosystem, 15th World Wide Web Conference, Edinburgh.

  • Mitra, A. (1999). Characteristics of the WWW text: Tracing discursive strategies. Journal of Computer-Mediated Communication, 5(1). Retrieved from http://www.ascusc.org/jcmc/vol5/issue1/mitra.html

  • Mitra, A., & Cohen, E. (1999). Analyzing the web: Directions and challenges. In S. Jones (Ed.), Doing internet research: Critical issues and methods for examining the net (pp. 179–202). Thousand Oaks, CA: Sage.

  • Nakajima, S., Tatemura, J., Hino, Y., Hara, Y., & Tanaka, K. (2005). Discovering important bloggers based on analyzing blog threads. Paper presented at WWW2005, Chiba, Japan.

  • Park, H. W. (2003). What is hyperlink network analysis? New method for the study of social structure on the web. Connections, 25(1), 49–61.

  • Pfeil, U., Zaphiris, P., & Ang, C. S. (2006). Cultural differences in collaborative authoring of Wikipedia. Journal of Computer-Mediated Communication, 12(1), Article 5. Retrieved from http://jcmc.indiana.edu/vol12/issue1/pfeil.html

  • Scheidt, L. A., & Wright, E. (2004). Common visual design elements of weblogs. In L. Gurak, S. Antonijevic, L. Johnson, C. Ratliff, & J. Reyman (Eds.), Into the blogosphere: Rhetoric, community, and culture of weblogs. Retrieved from http://blog.lib.umn.edu/blogosphere/

  • Schneider, S. M., & Foot, K. A. (2004). The web as an object of study. New Media & Society, 6(1), 114–122.

  • Scott, W. (1955). Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly, 19, 321–325.

  • Singh, N., & Baack, D. W. (2004). Web site adaptation: A cross-cultural comparison of U.S. and Mexican web sites. Journal of Computer-Mediated Communication, 9(4). Retrieved from http://jcmc.indiana.edu/vol9/issue4/singh_baack.html

  • Thelwall, M. (2002). The top 100 linked pages on UK university web sites: High inlink counts are not usually directly associated with quality scholarly content. Journal of Information Science, 28(6), 485–493.

  • Trammell, K. D. (2006). Blog offensive: An exploratory analysis of attacks published on campaign blog posts from a political public relations perspective. Public Relations Review, 32(4), 402–406.

  • Trammell, K. D., Tarkowski, A., Hofmokl, J., & Sapp, A. M. (2006). Rzeczpospolita blogów [Republic of Blog]: Examining Polish bloggers through content analysis. Journal of Computer-Mediated Communication, 11(3), Article 2. Retrieved from http://jcmc.indiana.edu/vol11/issue3/trammell.html

  • Tremayne, M., Zheng, N., Lee, J. K., & Jeong, J. (2006). Issue publics on the web: Applying network theory to the war blogosphere. Journal of Computer-Mediated Communication, 12(1), Article 15. Retrieved from http://jcmc.indiana.edu/vol12/issue1/tremayne.html

  • Wakeford, N. (2000). New media, new methodologies: Studying the web. In D. Gauntlett (Ed.), Web.studies: Rewiring media studies for the digital age (pp. 31–42). London: Arnold.

  • Waseleski, C. (2006). Gender and the use of exclamation points in computer-mediated communication: An analysis of exclamations posted to two electronic discussion lists. Journal of Computer-Mediated Communication, 11(4), Article 6. Retrieved from http://jcmc.indiana.edu/vol11/issue4/waseleski.html

  • Weare, C., & Lin, W. Y. (2000). Content analysis of the world wide web – Opportunities and challenges. Social Science Computer Review, 18(3), 272–292.

  • Wikipedia. (2008). Blog. Retrieved on June 28, 2008, from http://en.wikipedia.org/wiki/Blog

  • Williams, P., Trammell, K., Postelnicu, M., Landreville, K., & Martin, J. (2005). Blogging and hyperlinking: Use of the web to enhance visibility during the 2004 U.S. campaign. Journalism Studies, 6(2), 177–186.

  • Young, J., & Foot, K. (2005). Corporate e-cruiting: The construction of work in Fortune 500 recruiting web sites. Journal of Computer-Mediated Communication, 11(1), Article 3. Retrieved from http://jcmc.indiana.edu/vol11/issue1/young.html

Author information

Corresponding author

Correspondence to Susan C. Herring.

Copyright information

© 2009 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Herring, S.C. (2009). Web Content Analysis: Expanding the Paradigm. In: Hunsinger, J., Klastrup, L., Allen, M. (eds) International Handbook of Internet Research. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-9789-8_14

  • DOI: https://doi.org/10.1007/978-1-4020-9789-8_14

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-9788-1

  • Online ISBN: 978-1-4020-9789-8

  • eBook Packages: Computer Science, Computer Science (R0)
