Recycling a genre for news automation
The production of Valtteri the Election Bot
The amount of available digital data is increasing at a tremendous rate. These data, however, are of limited use
unless converted into a user-friendly form. We took on this task and built a system driven by natural language generation (NLG) that
generates journalistic news stories about elections without human intervention. In this paper, after presenting an overview of
state-of-the-art technologies in NLG, we explain systematically how we identified and then recontextualized the determinant
aspects of the genre of an online news story in the algorithm of our NLG software. In the discussion, we introduce the key results
of a user test we carried out and some improvements that these results suggest. Then, after relating the news items that our NLG
system generates to general aspects of genres and their evolution, we conclude by questioning the idea that journalistic NLG
systems should mimic journalism written by humans. Instead, we suggest that developmental work in the field of news automation
should aim to create a new genre based on the inherent strengths of NLG. Finally, we present a few suggestions as to what this
genre could include.
Article outline
- 1. Introduction: Converting large data sets into user-friendly form
- 2. State-of-the-art in NLG: A spectrum from rule-based to training-driven methods
- 3. Case study: The production of Valtteri the Election Bot
- 3.1 Domain selection: Topicality and relevance
- 3.2 Document planning: Readers’ preferences, news values, and rhetorical structures
- 3.3 Microplanning: Lexicalization and aggregation
- 3.4 Realization: Orthography and typography
- 4. Discussion: Toward a distinctive genre of online news story
- 4.1 Challenge: Trade-offs to improve Valtteri’s performance
- 4.2 Solution: From non-typicality to a new genre
- 5. Conclusion
- Notes
- References