Recycling a genre for news automation
The production of Valtteri the Election Bot
The amount of available digital data is increasing at a tremendous rate. These data, however, are of limited use
unless converted into a user-friendly form. We took on this task and built a system, driven by natural language generation (NLG), that
generates journalistic news stories about elections without human intervention. In this paper, after presenting an overview of
state-of-the-art technologies in NLG, we explain systematically how we identified and then recontextualized the determinant
aspects of the genre of an online news story in the algorithm of our NLG software. In the discussion, we introduce the key results
of a user test we carried out and some improvements that these results suggest. Then, after relating the news items that our NLG
system generates to general aspects of genres and their evolution, we conclude by questioning the idea that journalistic NLG
systems should mimic journalism written by humans. Instead, we suggest that developmental work in the field of news automation
should aim to create a new genre based on the inherent strengths of NLG. Finally, we present a few suggestions as to what this
genre could include.
Article outline
- 1. Introduction: Converting large data sets into user-friendly form
- 2. State-of-the-art in NLG: A spectrum from rule-based to training-driven methods
- 3. Case study: The production of Valtteri the Election Bot
- 3.1 Domain selection: Topicality and relevance
- 3.2 Document planning: Readers’ preferences, news values, and rhetorical structures
- 3.3 Microplanning: Lexicalization and aggregation
- 3.4 Realization: Orthography and typography
- 4. Discussion: Toward a distinctive genre of online news story
- 4.1 Challenge: Trade-offs to improve Valtteri’s performance
- 4.2 Solution: From non-typicality to a new genre
- 5. Conclusion
- Notes
- References