The Internet is first of all a world of words. Every day an immense amount of data and information travels across the Web, and the advent of Web 2.0 has intensified this traffic. Our management and computing capabilities are not yet fully able to exploit the information that can be extracted from the exponential growth of digital texts. However, the speed at which the relationship between storage and the management of large amounts of data has improved in recent years gives good reason for optimism about the near future. The encounter between computer science and linguistics is strongly intertwined with statistics and mathematics. The synthesis takes place at the level of both basic research and technological application, in particular in the fields of automatic translation, digital recognition, the summarization of spoken language, and the management of large information systems. Today, more than ever, the quantitative analysis of language is an extraordinary challenge for social science research methodology. The treatment of digital texts offers a fertile meeting ground between disciplines that study the uniqueness and particularity of their subject and disciplines that try to generalize observations by selecting properties and creating classes of objects. In the past, this distinction led to the separation of the human and natural sciences, of the sciences of interpretation and the sciences of explanation. A synthesis is now possible, and it is up to the human and social sciences to accept the challenge and to work toward eliminating the presumed contrast between quality and quantity.
Automatic text analysis, text mining, data visualization
Publication type: