Tagesschau

Mining tagesschau.de, 26 Nov. 2022 (posts)
I like to read tagesschau.de, so I wrote a script to scrape it at regular intervals. My original goal was to determine which articles stay on the front page the longest, which ones allow commenting (a feature that seems to have been disabled almost entirely since March 2020), and whether articles are modified after their initial release without any notice, because I sometimes feel that headlines change. Dataset Creation § Tagesschau provides a JSON API, so fetching all of the articles is …
Categories: Data Mining
1046 Words, Tagged with: Tagesschau · Generative Models · Data Mining