doxa.comunicación | 29, pp. 235-254 | 239

July-December of 2019

José Luis Rojas Torrijos and Carlos Toural Bran

ISSN: 1696-019X / e-ISSN: 2386-3978

Thus, the first sub-hypothesis holds that these types of reports, although they work from a kind of template that predetermines the structure, order and length of the headlines and body paragraphs, are improved through programming by introducing synonyms and second references, as well as contextual statistical data, so that the reports are more informative and attractive.

As a second sub-hypothesis, these automated reports are based on data and supposedly unquestionable facts, which makes it difficult to find subjective elements in them. When such elements do appear, they reflect the interpretation that the journalist-editor who entered the data into the program may have made of situations that usually occur in a football match. Moreover, owing to linguistic habits acquired among sports journalists, these particular ‘readings’ of the game may at times not correspond exactly to what happened on the field.

Taking into account the starting point of the hypothesis and the sub-hypotheses of this case study, the objectives of this research are as follows:

- Analyse the structure and content of the sports reports generated automatically by the virtual editor AnaFut in El Confidencial.

- Obtain quantitative and qualitative measurements of the type of natural language generated, paying special attention to the degree of repetition in the formulas used for situations of victory, tie or defeat in the matches reported.

- Examine the use of synonyms and second references in texts, as well as the input of statistical data that may lead to an improvement in the quality of the reports.

- Evaluate the extent to which the use of this technology can benefit this type of journalism, based on content analysis, interviews with journalists and questionnaires given to experts. This technology offers greater topic diversification, a wider scope of news coverage, and relief from routine, mechanical tasks for editors, who can then devote more time to reporting and research.

3.2. Sample and methodology

For this purpose, a sample of 80 automatically generated reports published on the El Confidencial website during the 2018/2019 season was analysed in detail. Variables were established that include the length of the articles, the length of paragraphs, the types of headlines, the lexicon used, etc. The overall aim is to evaluate the degree of repetition in these texts and to consider the extent to which this technology can be improved through programming and human editorial intervention or supervision.
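As an illustration only, variables of this kind could be operationalised along the following lines. This is a minimal sketch and not the procedure used in this study: the function names, the use of word trigrams as the unit of repetition, and the definition of the repetition rate (the share of distinct trigrams that occur in more than one report) are all assumptions introduced here for clarity.

```python
import re
from collections import Counter


def word_count(text):
    """Length of a text in words (Unicode-aware, so accented letters count)."""
    return len(re.findall(r"\w+", text))


def paragraph_lengths(text):
    """Word count of each paragraph; paragraphs are assumed to be
    separated by a blank line."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [word_count(p) for p in paragraphs]


def ngrams(text, n=3):
    """Lower-cased word n-grams of a text, in order of appearance."""
    words = re.findall(r"\w+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def repetition_rate(reports, n=3):
    """Hypothetical repetition measure: the share of distinct n-grams
    that appear in more than one report of the sample."""
    doc_freq = Counter()
    for report in reports:
        # Count each n-gram once per report (document frequency).
        doc_freq.update(set(ngrams(report, n)))
    if not doc_freq:
        return 0.0
    shared = sum(1 for count in doc_freq.values() if count > 1)
    return shared / len(doc_freq)
```

Under these assumptions, a rate close to 1 would indicate that the reports reuse nearly the same formulas, while a rate close to 0 would indicate little shared wording. The choice of trigrams is arbitrary; longer n-grams would capture only fully repeated phrases.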

Interviews and questionnaires

In order to complete this analysis and give more context to the study, two other methodological techniques were used: on the one hand, semi-structured interviews with those directly responsible for the El Confidencial Lab, the department that developed the technology, as well as with those in charge of the outlet’s sports newsroom, where its application began; on the other hand, questionnaires given to a panel of five experts in journalistic innovation.