Mining the Bundestag
22 Jan. 2023
Did you know that the German parliament publishes protocols of all of its proceedings in PDF format? They are relatively straightforward to download and parse, so we can easily collect a dataset of transcripts of what seems to be every speech in the Bundestag since the Second World War.
My original idea was to mine the speeches for word associations: a word tends to appear alongside particular other words, depending on the connotation the speaker intends, and these associations might shift over time as the connotations change.
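To make the idea concrete, here is a minimal sketch of one way to measure such associations: count which words appear within a few tokens of a target word, grouped by year. This is only an illustration, not necessarily the method used here. The function name `cooccurrences`, the `(year, text)` input shape, the window size, and the toy sentences are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def cooccurrences(speeches, target, window=5):
    """Count words near `target`, grouped by year.

    `speeches` is an iterable of (year, text) pairs (a hypothetical
    input shape; the real dataset may look different).
    """
    by_year = defaultdict(Counter)
    for year, text in speeches:
        # Naive tokenisation: lowercase, split on non-word characters.
        tokens = re.findall(r"\w+", text.lower())
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            # Words before and after the target, excluding the target itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            by_year[year].update(neighbours)
    return by_year


# Toy data, invented for illustration only.
toy = [
    (1950, "Die Freiheit und die Einheit des Landes"),
    (1990, "Freiheit bedeutet Verantwortung"),
]
result = cooccurrences(toy, "freiheit", window=2)
```

Comparing `result[1950]` with `result[1990]` then gives a rough picture of how the neighbourhood of a word changes between years. Raw counts are dominated by function words like "die", so a real analysis would likely need stop-word filtering or a normalised association score.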
1032 Words, Tagged with: Bundestag