Noise in big data

A sensible move by ING last week: the bank has let the air out of its trial balloon early and will not, for now, try to monetize the ‘big data’ that its customer records constitute. That is very sensible of the bank, and it offers an opportunity to pause and consider the not insignificant consequences of the big data thinking that is taking hold all around us. And not only in the loyalty card systems of the business world. Big data also includes the millions of telephone calls that the Dutch intelligence services MIVD and AIVD intercept every month, legally or otherwise.

What is big data? Suppose someone with the flu goes to their GP. If that person had influenza, that fact has ended up, together with all other flu cases in Europe, at the World Health Organization, which has tracked the spread of the influenza virus in Europe this way for decades. Turning that data into graphs and maps reveals patterns that give valuable insight into the recurring risk periods and risk areas for flu. The difference between the perspective of the flu patient and that of the WHO is the difference between ‘small data’ and ‘big data’.

Read more

Digital Humanities like The Secret of Monkey Island™


In their excellent chapter on the use of digital data in historical research, Frederick W. Gibbs and Trevor J. Owens distinguish between two DH approaches to data. ‘Data’, they argue, ‘does not always have to be used as evidence. It can also help with discovering and framing research questions’. On the one hand, there are ‘complex statistical methods’ and ‘rigorous mathematics’ (or ‘mathematical rigor’) to ‘support epistemological claims’. Gibbs and Owens equate this type of DH research with the wave of quantitative history in the 1960s and 1970s, which used data ‘for quantifying, computing and creating knowledge’.

On the other hand, there is a ‘fundamentally different’ form of using data: one that is exploratory rather than analytic, and that deliberately avoids the mathematical complexity needed to derive evidence from quantitative analyses. Above all, it is a form of data manipulation that can be playful (although the authors removed the adjective at one of the places it appeared in their text). Gibbs and Owens state that ‘playing with data – in all its formats and forms – is more important than ever’.
Read more

Before you do digital history…

This blog post is the adapted conclusion of the paper ‘A Digital Humanities Approach to the History of Science. Eugenics revisited in hidden debates by means of semantic text mining’, which I wrote in collaboration with Fons Laan, Maarten de Rijke and Toine Pieters. The article was based on the research I did within the historical text mining project BILAND, as well as its predecessor WAHSP. The article is in press as part of the Proceedings of the 1st International Workshop on Histoinformatics.

In a recent blog post called ‘The Deceptions of Data’, Andrew Prescott criticized the jubilation over the ‘digital revolution’. He states that “One of the problems confronting data enthusiasts in the humanities is that we feel a need to convince our more old fashioned colleagues about what can be done. But our role as advocates of digitized data shouldn’t mean that we lose our critical sense as scholars. [...] [T]here is a risk that we look more carefully at the technical components of the datasets than the historical context of the information that they represent.”
Read more