SOSI - Open Monitoring of Societies and their Interactions

CNRS Humanities & Social Sciences (co-)steers over 200 research units and nearly 300 research structures. The Institute has also promoted and encouraged the emergence of research networks for several years both nationally and internationally. These foresight or thematic networks are made up of both joint research units (UMRs) and research units (URs).



CNRS Humanities & Social Sciences also has a policy of providing priority support for long-term studies or investigations focusing on a number of scientific and social issues that are in line with the scientific priorities of the Institute. Indeed, these kinds of long-term formats are necessary to effectively study certain subjects by developing in-depth and/or diachronic analyses of social, economic, political and cultural dynamics, etc. This includes issues linked to contemporary events.

This form of long-term support is intended to ensure the continuity of the research carried out. It spares research communities from having to move from one call for projects (CFP) to the next and adapt their chosen research questions to the framework of successive CFPs, so that their research can develop without interruption. It is also intended to help structure sustainable and stable research and research engineering communities.

To achieve this, since 2021 CNRS Humanities & Social Sciences has been developing SOSI (Open Monitoring of Societies and their Interactions) projects. These are forms of accompaniment and support for long-term research studies or surveys in the humanities and social sciences. In some cases, a SOSI can be located on a dedicated geographical or institutional site to meet the requirements of the research project, systematically relying on configurations of stakeholders (inhabitants, territorial/local authorities, associations, professionals, artistic community, etc.). Surveys carried out in this way produce scientific results that are made available to a non-academic audience through an open access database, online encyclopaedia, etc.

SOSIs can involve several research teams and several disciplinary skill sets from the humanities and social sciences (HSS), and even from beyond these fields if the research question justifies it. In particular, this makes it possible to forge cross-disciplinary links between disciplines and research centres working on a common subject. Surveys can be based on different types of sources such as archives, administrative data, ethnographic observations or interviews. The scenario envisaged for a SOSI is that 3 to 5 teams work together on a common research question with dedicated scientific coordination. The format for managing this association of teams can vary: it can, for example, be led by a flagship UMR, a research federation or a platform.

CNRS Humanities & Social Sciences can provide financial support or human resources, depending on the requirements identified and discussed with the SOSI's coordinators. Applicants for this funding can also seek other sources of financial support and are encouraged to do so. The SOSI could, for example, help finance just one aspect of the project and thereby ensure the continuity of the project as a whole. This support and its renewal are subject to annual arbitration by CNRS Humanities & Social Sciences in the first quarter of each year. The SOSIs selected for support work on subjects that come under the disciplinary, methodological or thematic scientific priorities of CNRS Humanities & Social Sciences and on areas of research that make up the major foundations of the research carried out in the units co-steered by the Institute.

The research supported is presented on the CNRS Humanities & Social Sciences website. CNRS Humanities & Social Sciences may contact SOSI coordinators to request a contribution to the Institute's newsletter.

Monitoring and coordination at CNRS Humanities & Social Sciences: Fabrice Boudjaaba

Macroscopes for the analysis of social, opinion and value dynamics in digital worlds

Evolution of the importance of different climate activist communities over time and in reaction to events (Oct. 2019 - June 2020). Each tube corresponds to a community of activists. The thickness of a tube is proportional to the number of activists belonging to that community. The main figure corresponds to the moment when the COVID-19 pandemic became the focal point of world news, which mechanically caused a drop in climate-related exchanges and thus illustrates the phenomenon of limited collective attention at the global level. It is an enlargement of the part of the upper diagram surrounded by dotted lines and labelled 'COVID-19'. The upper diagram shows several flurries of activity that correspond to climate summits or important political moments like the 2020 US presidential election.
Data source: Multivac/climatoscope.
Source of visualisation: Chavalarias, D., Bouchaud, P., Chomel, V., Panahi, M., 2023. Les nouveaux fronts du dénialisme et du climato-scepticisme.

How can data from digital worlds be used to understand the long-term evolution of national opinion and value dynamics and what lessons can be drawn from this as regards radicalisation, polarisation and disinformation phenomena? How do the major digital platforms influence forms of socialisation and what impact does this have on social groups?

The objective of this SOSI is to study and thus understand the phenomena of cultural and social change over very long periods (fifteen to twenty years) using data with individual- and minute-level granularity and observations of digital environments.

The SOSI aims to enable and support in-depth diachronic analysis of social, political and cultural dynamics. This will include shedding light on contemporary events through global and longitudinal studies using data (formalisation, quantification, modelling and simulation). Long-term data from digital worlds (e.g. Twitter, Mastodon, YouTube) will be collected and shared while collaborative analysis and visualisation environments and macroscopes will be produced.

Macroscopes, introduced by Joël de Rosnay in 1975, are to the study of the social sphere what microscopes are to biology. They are tools for observing collectives formed by human beings in an overall and reflective manner. These devices can be used to navigate interactively through large structured sets of 'social traces', provide a synthetic view of a collective phenomenon and explore the details of that phenomenon at the micro level (entities and interactions). They are essential intermediary tools for linking social science models with field data.

This SOSI's work relies on the 'Multivac' infrastructure run by the Institut des Systèmes Complexes de Paris Île-de-France (ISC-PIF, UAR3611, CNRS) and also on national computing centres like the Laboratory of the Physics of the Two Infinities - Irène Joliot-Curie (IJCLab, UMR9012, CNRS / Paris-Saclay University) to roll out infrastructures in the cloud.

Its components are:

  • non-stop streaming data collection (several tens of millions of records per day);
  • data storage on distributed environments (Hadoop cluster, HDFS);
  • large-scale data indexing;
  • the development of analysis environments and collaborative code on data;
  • the retrieval and querying of data via REST APIs for the academic community and the development of mobile applications for the general public.

This SOSI will enable researchers from a broad and diverse range of disciplines to work together on joint projects. Researchers from the following laboratories have used Multivac for their work: Langues, Textes, Traitements informatiques, Cognition (LATTICE, UMR8094, CNRS / ENS-PSL / Université Sorbonne Nouvelle), the Institut de Science et d'Ingénierie Supramoléculaires (ISIS, UMR7006, CNRS / University of Strasbourg) ISIS (STS), the Centre for Research on Medicine, Science, Health, Mental Health and Society (CERMES3, UMR8211, CNRS / Inserm / Université Paris Cité), the LIP6 (UMR7606, CNRS / Sorbonne University), the Institut de Recherche en Informatique de Toulouse (IRIT, UMR5505, CNRS / Toulouse INP / Université Toulouse III - Paul Sabatier) and the Laboratory of Theoretical Physics and Modelling (LPTM, UMR8089, CNRS / CY Cergy Paris University).

Project lead(s): Institut des Systèmes Complexes de Paris ÎdF (ISC-PIF), Paris 13.
Scientific lead: David Chavalarias, CNRS (CAMS & ISC-PIF).
Infrastructure lead: Maziyar Panahi (CNRS/ISC-PIF).

Micro-simulation of public policies (TAXIPP)

The impact of many public policies depends on how these affect populations in all their diversity (socio-economic level, location, demographic composition, type of housing, etc.). The micro-simulation approach enables researchers to use a representative sample and information on each individual or household within a given population to construct that population statistically in its full diversity. Public policies can then be simulated using models representing their current state or potential changes so that their impact on the population as a whole can be tested.

The micro-simulation approach to public policy analysis has brought about the development of three areas of research:

  • The first is statistical analysis used here to construct a complete and detailed representation of the population analysed. This can include data matching, data fusion techniques or incorporating information from household surveys into administratively collected data.
  • The second is modelling public policies which here involves developing algorithms to simulate different types of public intervention. Most often the subject of such simulations is the social and fiscal system but public spending, regulations and standards can also be modelled.
  • Finally, the third area is modelling behavioural responses to policy changes and more specifically modelling the extent of people's reactions to incentives or obligations included in public policies. The state of the art in this area involves measuring the sensitivity of responses (i.e. elasticities) to policy changes on the basis of past experience and then including a behavioural module in the micro-simulation model.
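The three stages above can be illustrated with a deliberately small sketch. It is not TAXIPP or OpenFisca code: the `Household` class, the bracket schedules, the sample and the elasticity value are all hypothetical and chosen only to show how a weighted sample, a simulated tax rule and a simple behavioural module fit together.

```python
from dataclasses import dataclass


@dataclass
class Household:
    income: float  # annual taxable income (hypothetical units)
    weight: float  # number of real households this sampled one represents


def income_tax(income, brackets):
    """Progressive tax; brackets is a list of (threshold, rate) in rising order."""
    tax = 0.0
    for i, (threshold, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if income > threshold:
            tax += (min(income, upper) - threshold) * rate
    return tax


def marginal_rate(income, brackets):
    """Rate of the highest bracket whose threshold lies below the income."""
    rate = 0.0
    for threshold, bracket_rate in brackets:
        if income > threshold:
            rate = bracket_rate
    return rate


def simulate(sample, baseline, reform, elasticity=0.0):
    """Weighted revenue under `reform`, with an optional behavioural response.

    With a non-zero elasticity, income reacts to the change in the
    net-of-tax marginal rate between the baseline and the reform.
    """
    revenue = 0.0
    for h in sample:
        income = h.income
        if elasticity:
            old_net = 1.0 - marginal_rate(h.income, baseline)
            new_net = 1.0 - marginal_rate(h.income, reform)
            income = h.income * (new_net / old_net) ** elasticity
        revenue += h.weight * income_tax(income, reform)
    return revenue


# Invented schedules and sample, for illustration only.
BASELINE = [(0, 0.0), (10_000, 0.10), (30_000, 0.30)]
REFORM = [(0, 0.0), (10_000, 0.10), (30_000, 0.40)]
SAMPLE = [Household(8_000, 1000), Household(20_000, 500), Household(50_000, 100)]

current = simulate(SAMPLE, BASELINE, BASELINE)
static = simulate(SAMPLE, BASELINE, REFORM)
behavioural = simulate(SAMPLE, BASELINE, REFORM, elasticity=0.5)
```

In this toy case the behavioural estimate falls between current revenue and the static estimate, because the households facing the higher top rate report less income once the response is included.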

This project's objective is to support the development of the TAXIPP micro-simulation model currently being developed at the Institut des Politiques Publiques (IPP), a research centre dedicated to the analysis and evaluation of public policies. The IPP is the fruit of a partnership between the Paris School of Economics (PSE) and the Group of National Economics and Statistics Schools (GENES). This collaborative project involves researchers from the Paris Jourdan Economics Unit (CNRS, Inrae, EHESS, ENPC, ENS-PSL, University Paris 1 Panthéon-Sorbonne) and the Center for Research in Economics and Statistics (CREST - CNRS, ENSAE Paris, ENSAI, École Polytechnique).

This open source model can be used by all researchers requiring a detailed model of the socio-fiscal system. It currently covers social and fiscal policies aimed at households, such as compulsory deductions (income tax, social security contributions) and social benefits (family benefits, housing benefits or minimum social benefits). It is based on a range of exhaustive administrative sources that are statistically matched to each other and linked to an open-access, collaborative socio-fiscal simulator (OpenFisca). This simulator calculates the compulsory deductions and social benefits that apply to each individual in the data, taking into account their characteristics and the legislation being studied, whether the current system or a reformed one.

Researchers can use this model to deduce the effects of a reform on public revenue and expenditure, its redistributive effects and a socio-fiscal system's redistributive characteristics at a given point in time. This kind of model can also be used to calculate the degree of potential exposure to a reform of all individuals and compare this exposure with changes in these individuals' behaviour.
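As a rough illustration of these two outputs, the sketch below computes the budgetary effect of a reform and the average gain or loss by income group from a weighted sample. The flat tax rule, the parameter values and the sample are invented for the example and do not come from TAXIPP.

```python
def flat_tax(income, rate, allowance):
    """Hypothetical flat tax on income above a tax-free allowance."""
    return max(0.0, income - allowance) * rate


def reform_effects(sample, baseline, reform, n_groups=2):
    """Return (revenue change, average gain in disposable income per group).

    sample: list of (income, weight) pairs.
    baseline, reform: (rate, allowance) pairs.
    Groups are equal shares of the weighted population, ranked by income.
    """
    total_weight = sum(weight for _, weight in sample)
    revenue_change = 0.0
    gains = [0.0] * n_groups
    weights = [0.0] * n_groups
    cumulative = 0.0
    for income, weight in sorted(sample):
        extra_tax = flat_tax(income, *reform) - flat_tax(income, *baseline)
        revenue_change += weight * extra_tax
        # Assign the household to a group by its position in the ranking.
        midpoint = cumulative + weight / 2
        group = min(n_groups - 1, int(midpoint / total_weight * n_groups))
        gains[group] -= weight * extra_tax
        weights[group] += weight
        cumulative += weight
    averages = [g / w if w else 0.0 for g, w in zip(gains, weights)]
    return revenue_change, averages


# Invented sample: (income, weight) pairs.
sample = [(10_000, 300), (20_000, 300), (40_000, 200), (100_000, 200)]
revenue_change, average_gain = reform_effects(
    sample, baseline=(0.25, 10_000), reform=(0.375, 20_000)
)
```

Here the reform raises revenue overall while the lower half of the population gains on average and the upper half loses, which is the kind of redistributive reading the text describes.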

This tool constitutes an infrastructure that can be used in the framework of a wide range of research projects studying the French redistribution system. More broadly, it can be used for developments in micro-simulation in other areas of public policy like environmental policy, health, transport, education and so forth. The objective of this project is to make this model accessible to as many people as possible while extending its use and development prospects. Detailed documentation and the source code for the latest version of the model are available online.

Project lead: Antoine Bozio, École des Hautes Études en Sciences Sociales (EHESS) / PSE-École d’Économie de Paris / Institut des politiques publiques (IPP)

Observatory of the history of the French population: large-scale databases and artificial intelligence

Large-scale databases are one of quantitative history's essential tools, particularly for the study of historical demography. However, they involve managing a mass of data that research teams often find difficult to handle using traditional data collection methods. Very recent advances in artificial intelligence and deep learning have enabled historical demographers to collect information automatically. Apart from the obvious savings in time, the major advantage of these techniques for the creation of historical demography databases is that researchers can study larger populations, particularly those of large cities, which have long been neglected, especially for the 19th and 20th centuries.

The aim of the 'Observatoire de l'histoire de la population française: grandes bases de données et intelligence artificielle' (Observatory of the history of the French population: large-scale databases and artificial intelligence) SOSI is to help construct, finalise, conserve, disseminate and use historical surveys on the French population in a sustainable manner. The project has four stated objectives:

  1. The first of these objectives is to support projects with current funding to construct databases of historical demographics using artificial intelligence so that databases can survive over the long term. Often, funded projects are able to create databases but the research teams rarely have the time required to exploit them fully or maintain them longer than the duration of the funded project.
  2. The second objective is to enable technology to be transferred to two types of survey. The first type is surveys that aim to use artificial intelligence to construct a large database of historical demographics. The second type is surveys launched in the 2000s and 2010s or even earlier that used the traditional manual data entry method. The aim is to avoid losing the considerable amount of existing work.
  3. The third objective is to document, preserve and disseminate the Observatory's surveys in conjunction with IR* Progedo (a research infrastructure).
  4. The final objective is to create and foster a scientific dynamic around historical demography databases by providing the SOSI teams with the right tools and knowledge to construct new databases. Eventually this positive dynamic should also enable some of these databases to be linked and thus become interoperable.

The SOSI ObHisPop is supporting seven studies:

  • POPP: Project for the OCR Processing of the Paris Population Census (1926-1946) (LARHRA; LITIS/NORMASTIC)
  • EXO-POPP: Optical Extraction of Handwritten Named Entities for the Marriage Certificates of the Population of Paris (1880-1940) (LARHRA; LITIS/NORMASTIC)
  • The Population of Charleville From the Late-17th Century to the Late-19th Century (CRM)
  • POPSTRAS: Océrisation du Fichier Domiciliaire pour une histoire démographique de Strasbourg (1871-1939), i.e. OCR processing of the household records for a demographic history of Strasbourg (SAGE; IDEAS)
  • Survey of the population of Belfort (FEMTO-ST; CIAD)
  • GENPAR survey (CRM)
  • Teams from the 'Ineqkill' and 'EPIBEL' projects (Catholic University of Leuven and Ghent University).

Project lead: Sandra Brée, CNRS researcher, Laboratoire de Recherche Historique Rhône-Alpes (LARHRA).

Heritage Observatory of the Pacific Experiment Centre (CEP)

The declassification of government archives on nuclear testing in Polynesia has paved the way for a new phase in socio-historical research into this testing. CNRS Humanities & Social Sciences has set up an investigative project under the scientific direction of the historian Renaud Meltz. Mr Meltz was already involved in a research programme on the Pacific Experiment Centre's history ('Histoire et mémoires du CEP', 2018-2021) which led to the publication of a multidisciplinary book - 'Des bombes en Polynésie. Les essais nucléaires français dans le Pacifique', Vendémiaire, 2022. This new initiative is part of a broader overall project coordinated with CNRS Earth & Space, CNRS Ecology & Environment and CNRS Nuclei & Particles that focuses on examples of upheaval in societies and ecosystems caused by global change or certain events. This study project focuses on the health and socio-cultural legacies of the nuclear tests carried out between 1966 and 1996 on Moruroa and Fangataufa. The project is hosted by the Maison des Sciences de l'Homme du Pacifique (MSH-P, UAR2503, CNRS/University of French Polynesia).

A socio-history of the health of Polynesians and CEP veterans

This project's work is based on declassified government archives, the living archives of the nuclear test sites' monitoring department and the accounts of test veterans and local residents. Its aim is to reconstruct the system set up by the military and civilian authorities to monitor the health aspects of these nuclear tests. This ranges from the health inventory drawn up before the first tests to the implementation of the tests and the monitoring of the tests' consequences. How did this system deal with health and environmental effects - from the prevention and risk information policy through to treatment and including measuring exposure to fallout? The project also looks into ordinary military medicine for people who lived near the test sites run from the base at Hao and the legacy it has left in terms of access to healthcare in Polynesia today.

The study will examine the way in which health policies were rolled out in three locations and confronted with the social factors of exposure to radiation. The locations are a navy building that housed European and Polynesian operators during the tests, an inhabited atoll near Moruroa and a district of the Society Islands. The survey will be both prosopographical and biographical, involving the life stories of veterans chosen on the basis of the diversity of their profiles and experiences linked to the CEP.

This socio-history of health broadens the spectrum of the legacies of these tests to move beyond the radioactive fallout alone to study changes in people's lifestyles, diets and relationships with their environment. It also deals with the Polynesian health system after the closure of the CEP. This event transformed the system from a state and military regime into a civilian and territorial organisation within the framework defined by the territory's autonomous status and faced with the other health challenges the CEP induced indirectly.

The socio-cultural impact of nuclear Polynesia

The aim of this project is to document the socio-cultural impact of the CEP along with the linguistic, geographical and socio-economic issues involved. PhD students and/or postdoctoral fellows working on the project will document the legacy of the CEP on two scales. A socio-history of the migrations the CEP led to on the scale of French Polynesia (FP) will be developed with the aim of creating a database of inter-island flows since the beginning of the CEP project in 1963. This will make it possible to develop a typology of labour migrations in three dimensions:

  • historical: first entry into the workforce, or earlier employment in phosphate mining or another field;
  • economic: the type and duration of employment contract, skills, etc.;
  • geographical: commuting, permanent migration, changing islands within French Polynesia, moving elsewhere in the Pacific (New Caledonia), to France or even to Europe (Portuguese workers).

We will target relevant areas in the Papeete urban area and one or more areas of origin (Austral Islands, Marquesas Islands, etc.), with the aim of carrying out a second, qualitative study at the level of local neighbourhoods.

Finally, we put forward the idea that the CEP's socio-cultural impacts will be better understood within a broader framework that enables movements of things, people and ideas to be taken into account. This expanded framework means we can compare the different test sites - from the choice of the Polynesian site in the context of the already nuclearised Pacific zone to the processes of recognition and compensation for the test victims. Ultimately, the SOSI's objective is to carry out a comparative study of the CEP's legacy with reference to the tests carried out in the Pacific by the US and the UK and covering issues ranging from the choice of site to compensation for victims.

Project lead: Renaud Meltz, Université de Haute-Alsace, Centre de recherches sur les économies, les sociétés, les arts et les techniques (CRESAT), Maison des Sciences de l'Homme du Pacifique

Observatory of environmental and occupational diseases in the 'Lyon chemical valley'

The project follows on from participatory surveys on environmental and occupational health. Its objective is to set up an observatory of environmental and occupational diseases in the Rhône-Alpes region and more specifically in the industrial area of Lyon known as the 'Lyon Chemical Valley'. This zone runs along the banks of the Rhône and includes around fifteen municipalities stretching from Lyon to the border with the Isère department.

This is one of the largest industrial areas in France and yet the impact of this industrial activity on the health of local residents and site employees is still very much understudied. In fact, there is only one departmental cancer register (for Isère) for the whole Rhône-Alpes region. In addition, the studies of health risks in French industrial areas listed by Santé Publique France contain little data on air pollution levels in the southern Lyon area. This is despite the fact that the 'chemical valley' is home to hundreds of polluted former sites as well as dangerous polluting facilities that are still in operation, several of which are classified as Seveso sites. The area has suffered pollution and several industrial catastrophes including the Feyzin disaster in January 1966 (an explosion at the refinery that killed 18 people), pollution of the Rhône (acrolein in 1976, hydroquinone at the start of the 1980s and recurring PCB incidents from the 1970s to 2010) and the 1987 fire at the Édouard Herriot port.

The Observatory will have three functions:

  • To take stock of existing scientific knowledge, health standards and collective actions linked to this knowledge. This area was the subject of several health and environmental campaigns in the 1970s but these became significantly less frequent from the 1980s onwards;
  • To analyse the reasons for the relative failure to produce and disseminate knowledge and to take public responsibility for many of the health problems associated with industrial pollution;
  • To jointly produce participatory health data by building a multidisciplinary research team involving sociology, political science, toxicology, epidemiology, geography and history specialists.

Project leads: Gwenola Le Naour, Sciences Po Lyon; Valentin Thomas, Sciences Po Lyon / Institut universitaire européen de Florence; Triangle (CNRS / ENS Lyon / Sciences Po Lyon / Université Lumière Lyon 2)

Transdisciplinary observatory of environmental and social changes in the Sebikotane - Diamniadio zone of Senegal

Urban expansion zones in West Africa, like the towns of Sebikotane and Diamniadio in Senegal's Dakar region, have seen far-reaching changes in the last decade or so that have transformed peaceful villages into major urban and industrial centres in a short timeframe. These upheavals are tangible and easy for local people to describe, as they have lived through a huge increase in construction following the completion of the motorway. All the tensions inherent to a Senegalese society being pushed towards modernisation can be seen playing out in these few square kilometres formerly made up of villages, fields and orchards.

The challenge this observatory has set itself is to contribute to the area's social and ecological transformation through scientific investigations of pollution caused by human activities. The aim is to produce knowledge that is relevant to the area because it responds to the vital concerns of local residents, and also to enhance understanding of this type of anthropogenic pollution situation. This input will fill gaps in scientific knowledge, particularly as regards the effects of the accumulation of a very large amount of industrial emissions and effluents. The research also has a political objective through its focus on the area's specific characteristics, because the jointly constructed knowledge it produces can be used by and benefit local residents and the local authorities that represent them. Such knowledge will thus strengthen these stakeholders' capacity to take action to bring about a viable transformation of practices in the area.

This art-science-society pollution observatory has two objectives - to record and analyse ongoing changes and to contribute to public debate in an area where residents are concerned about environmental health issues.

  • The observatory aims to learn collectively from a situation considered experimental to document and represent the effects of an extremely rapid industrial and urban development involving intertwined global, regional and local dynamics.
  • We will also attempt to find and define ways of coping with this kind of development through rehabilitation, repair, redevelopment or regeneration.

To achieve this, the observatory aims to hybridise the research group to broaden its scope. For this reason its work will be based on a residents' council and a network of regional academics as well as the more conventional scientific council. This is intended to integrate a plurality of viewpoints and thus identify shared research questions while benefiting from residents' experience, knowledge and know-how of the area being studied. The knowledge produced will be useful in three ways:

  • socially it will provide responses to local residents' vital concerns;
  • scientifically it will counter ignorance about the effects of cumulative emissions and industrial effluents;
  • politically the knowledge will be jointly constructed and will benefit residents and local authorities, strengthening their capacity for action.

Project lead: Yann Philippe Tastevin, CNRS, Laboratoire international Environnement, Santé, Sociétés (ESS)

Working towards a national research network for action on cancers of occupational and environmental origin

Environmental contamination generally tends to originate in the workplace with examples such as farm workers and residents exposed to pesticides, industrial accidents and disasters, production or storage sites whose activities contaminate local residents and so forth. Occupational exposure to toxic substances is therefore a core environmental health issue and work-related cancers play the role of sentinels for environmental health.

Two Scientific Interest Groups on work-related cancers (French acronym = GISCOPs) based in Seine-Saint-Denis (since 2002) and Vaucluse (since 2017) work with an original approach to interventional research. They study cohorts of cancer patients treated in different hospital departments (AP-HP in Paris and the Hospital Centre in Avignon) to enhance knowledge of work activities that expose people to carcinogens, the traceability of these and the obstacles and/or resistance to recognition of the harm they can cause. This research is at the crossroads where the humanities and social sciences meet chemical physics and clinical, epidemiological and biological sciences. It is action-oriented in that it aims to reveal work-related cancers, promote the prevention thereof and provide input for training.

The GISCOP survey is based on interviews to reconstruct the career paths of cancer patients requested to give detailed descriptions of their actual work activities. These career paths will be analysed by the Collectif pluridisciplinaire d'expertise des conditions de travail et des expositions toxiques (Multidisciplinary expert group on working conditions and toxic exposure) which identifies and characterises exposure to carcinogens at workstations (IARC & EU classifications). Patients considered eligible will be asked to consider occupational disease declarations and those who choose to do so will receive support in dealing with the often lengthy and complicated procedures.

The two GISCOPs have expanded their activities in reaction to the growing importance of environmental health issues. GISCOP 93 is becoming a regional structure whose research results and actions provide input for national initiatives and programmes, particularly in the field of prevention. GISCOP 84 is gradually becoming an interdisciplinary research centre studying health-work-environment issues in the lower Rhône valley in collaboration with hospital doctors and toxicology, epidemiology and biology researchers. These developments are part of a broader drive to break down socially constructed boundaries between occupational and environmental health.

In addition, a variety of stakeholders (researchers, doctors, workers' groups) interested in implementing a similar approach regularly turn to the two GISCOPs for assistance. However, neither GISCOP currently has the resources needed to provide effective support for project leaders wishing to set up similar systems, even though their survey approach has now stabilised and transferable tools have been created.

In this context, the creation of a shared structure for the two GISCOPs, acting as a forerunner of a national intervention research network, will serve as an initial response to the vast health and social issues linked to the health-work-environment nexus by working towards a threefold objective:

  1. to provide a common scientific framework for the two teams' research (comparative analysis of the data collected, exploitation of common results, developing new lines of research) while acting as a forum for exchanges with researchers and research structures nationally and internationally;
  2. to promote and enhance the visibility of the GISCOP approach among stakeholders in this field and provide a framework for dialogue with national institutions on research, training and, more broadly, public policy;
  3. to provide training and tools from the permanent survey to support project leaders setting up other GISCOPs: how to actually set up such a project, the reconstitution of career paths, collective expertise, providing remote access to the secure GISCOP database, etc.

Project leads: Sylvain Bertschy, CNRS, Institut de recherche interdisciplinaire sur les enjeux sociaux - Sciences sociales, Politique, Santé (IRIS); Moritz Hunsmann, CNRS, Centre Norbert Elias; Anne Marchand, Université Sorbonne Paris Nord, Centre d’histoire sociale des Mondes Contemporains (CHS); Zoé Rollin, Université Paris Cité, Centre de recherche sur les liens sociaux (CERLIS); Judith Wolf, EHESS, Centre Norbert Elias

Visualizing grammars across space and time (ViGramm)

A brief summary of the project

Grammar is implicit knowledge that is part of our intangible cultural heritage but is also the fruit of a cognitive faculty developed throughout biological evolution. Study of grammar is therefore based around the dichotomy between Nature and Culture.

The aim of the ViGramm project is to model grammatical variation in the Romance languages, i.e. all linguistic varieties deriving from Latin. This includes 'official' languages like French and Romanian but also, above all, the minor varieties or 'dialects' that have not been standardised or normalised by 'school' grammar. Romance varieties are a fertile area for study because they display a set of innovative features that are absent in their shared ancestor (Latin) and also because these features vary from one dialect to another.

The project will address the following research questions:

  • What are the relevant variables?
  • Are these variables correlated?
  • These correlations form a network of clusters. What is this network's structure?
  • Are these variables evenly distributed in space?
  • Is the spatial distribution of these variables or clusters of variables correlated with extralinguistic factors like political barriers or physical borders?
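To make the question of correlation between variables concrete, here is a minimal sketch that measures the association between two binary grammatical variables observed at the same survey points. The phi coefficient is used purely as an assumption for illustration; the source does not say which statistic the project relies on, and the variable names and values are invented.

```python
from math import sqrt


def phi(x, y):
    """Phi coefficient between two equal-length lists of 0/1 values."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denominator == 0:
        return 0.0  # one variable is constant, so no association is measurable
    return (n11 * n00 - n10 * n01) / denominator


# Invented observations: one value per survey point, 1 = feature attested.
preverbal_negation = [1, 1, 0, 0]
postverbal_negation = [0, 0, 1, 1]
association = phi(preverbal_negation, postverbal_negation)
```

Computing such a coefficient for every pair of variables gives a matrix that can then be inspected for clusters, which is one possible reading of the 'network of clusters' mentioned above.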

To provide responses researchers will:

  • collect a large amount of data from existing sources like linguistic atlases and databases;
  • manipulate this data using statistics and data visualisation techniques to make sense of grammatical variation in its entirety without focusing on isolated dialects or phenomena;
  • analyse statistical data through the theoretical prism of generative grammar. The theoretical principle at the core of this research is that all syntactic differences can be reduced to the properties of functional elements like determiners, pronouns, auxiliaries and so forth.


The project is based on a simple but innovative methodology. In recent years many projects have focused on digitising major data sources like the large print atlases from the start of the 20th century. The main objective of digitisation was to preserve scientific works the size of which made reprints highly unfeasible and to put the original data online in digital and annotated form.

The ViGramm project is based on a completely different approach which aims to re-use the source rather than to preserve it. The key idea is extraction, i.e. transforming atlas data into metadata or numerical variables that represent: a) the syntactic properties present in each of the millions of sentences making up the corpus; b) the provenance of each item; c) each dialect's geographical coordinates.

The protocol was tested on a set of phenomena (negation and related issues) and on a sample of Italo- and Gallo-Romance dialects. The primary data came from the Atlas Linguistique de la France (ALF), the Atlas Italo-Suisse (AIS), the Atlas Syntaxique de l'Italie (ASIt) and the Thesaurus Occitan (Thesoc).

The metadata are organised in portable .csv files to be made freely accessible in compliance with the Huma-Num infrastructure's best practice guides. These digital variables can be reused in many ways ranging from digital mapping to statistical analysis.
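A minimal sketch of what such an extraction could look like is given below. The column names, variable names and records are hypothetical placeholders, not the project's actual schema or data; the point is only to show one row per item carrying a) binary syntactic variables, b) provenance and c) coordinates, written out as portable CSV.

```python
import csv
import io

# Hypothetical binary variables describing negation in a sentence.
VARIABLES = ["preverbal_neg", "postverbal_neg"]
FIELDNAMES = ["source", "item", "point", "lat", "lon"] + VARIABLES


def encode(record):
    """Turn one annotated atlas item into a flat row of metadata."""
    row = {
        "source": record["source"],  # provenance: which atlas
        "item": record["item"],      # provenance: which map or question
        "point": record["point"],    # survey point, i.e. the dialect
        "lat": record["lat"],
        "lon": record["lon"],
    }
    for variable in VARIABLES:
        row[variable] = int(variable in record["features"])
    return row


def to_csv(records):
    """Serialise the encoded records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(encode(record))
    return buffer.getvalue()


# Placeholder records, invented for the example (not real atlas data).
records = [
    {"source": "ATLAS_A", "item": "map-001", "point": "P1",
     "lat": 45.0, "lon": 5.0, "features": {"preverbal_neg"}},
    {"source": "ATLAS_B", "item": "q-042", "point": "P2",
     "lat": 44.0, "lon": 8.0, "features": {"preverbal_neg", "postverbal_neg"}},
]
csv_text = to_csv(records)
```

Keeping one flat row per item means the same file can feed both a mapping tool (through `lat` and `lon`) and a statistical package (through the binary columns).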

Open science

The maps and graphic formats produced by the project are suitable for dissemination activities aimed at increasing collaboration between the scientific community and communities of speakers of these languages. We will disseminate some of the project materials on social networks to attempt to increase interaction with communities of speakers. As well as the usual scientific publications, project materials may be published on a website with fact sheets written by renowned experts to be made accessible to the general public.

Project leads: The project is led by Diego Pescarini (BCL, Nice), a specialist in Italo-Romance syntax and dialectology, in collaboration with Anne Dagnac (CLLE, Toulouse), a specialist in the syntax of French and Picard, and Stella Retali-Medori (LISA, Corte), a specialist in Corsican varieties.