Mining tagesschau.de
26 Nov. 2022
I like to read tagesschau.de, so I wrote a script to scrape it at regular intervals.
My original goal was to determine which articles stay on the front page the longest, which ones allow commenting (a feature that seems to have been disabled almost entirely since March 2020), and whether articles are silently modified after their initial release, because I sometimes feel that headlines change.
Dataset Creation
Tagesschau provides a JSON API, so fetching all of the articles is …
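To make the idea concrete, here is a minimal sketch of what one scraping step and a headline comparison could look like. It is not the script from this post: the endpoint URL and the field names (`news`, `sophoraId`, `title`) are assumptions about the API's shape, and the helper names are made up for illustration.

```python
import json
import urllib.request
from typing import Dict, Tuple

# Assumption: illustrative endpoint, not confirmed by the post.
API_URL = "https://www.tagesschau.de/api2/homepage/"


def fetch_snapshot(url: str = API_URL, timeout: float = 10.0) -> dict:
    """Download one JSON snapshot of the front page."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def index_titles(
    snapshot: dict,
    list_key: str = "news",      # assumed name of the article list
    id_key: str = "sophoraId",   # assumed name of a stable article id
    title_key: str = "title",    # assumed name of the headline field
) -> Dict[str, str]:
    """Map article id -> headline, skipping entries that lack either field."""
    return {
        item[id_key]: item[title_key]
        for item in snapshot.get(list_key, [])
        if id_key in item and title_key in item
    }


def changed_headlines(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {id: (old_title, new_title)} for articles present in both
    snapshots whose headline differs."""
    return {
        article_id: (old[article_id], new[article_id])
        for article_id in old.keys() & new.keys()
        if old[article_id] != new[article_id]
    }
```

Running `fetch_snapshot()` on a schedule (for example from cron), storing each result, and passing consecutive snapshots through `index_titles` and `changed_headlines` would surface headlines that were edited after publication. Counting how many snapshots an id appears in gives a rough measure of how long an article stayed on the front page.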
1040 Words, Tagged with: Tagesschau · Generative Models · Data Mining