WP # 9: Coordination of Complexity Data transfer and sharing

WP9 Complexity Data availability to STREPs. For a long time, one of the most damaging myths for the image of complexity was that it lacks a sound experimental grounding. The reason for this misperception may be that much of the complexity-relevant data is accumulated under the name of the application disciplines. This is not surprising, since funding for making “complexity measurements” was never allocated. On the other hand, an enormous amount of complex data is being accumulated in different fields. Most of these data are not even recognized as complexity data by the groups producing or gathering them. In fact, this is part of the puzzle in understanding and interpreting those data. To make these data available to the STREPs, one has to discover, catalogue, coordinate and meta-process the databases so as to create a uniform, readily accessible body of information.

This is a highly non-trivial task involving, beyond the purely scientific issues, copyright and other legal problems. These problems are a serious obstacle to accessing the large amounts of relevant data available in the community at large. While the task is beyond the means of the CA, one can initiate a preliminary study of this issue and organize a systematic search for application areas where databases exist. One can start to index, profile and rank those databases and help define the formal, technical and legal arrangements necessary for their sharing by the STREPs.

Leading institution

CNISM - CONSORZIO NAZIONALE INTERUNIVERSITARIO PER LE SCIENZE FISICHE DELLA MATERIA - ROMA ITALY

WP leader

Prof. Rosario Nunzio Mantegna mantegna@unipa.it

Objectives

  1. Coordination of the harmonization of the design, maintenance and development of complex system databases of the STREPs and of other research groups dealing with complexity studies, within and outside Europe.
  2. Organization of the first European conference on the design, maintenance and management of complex system databases.
  3. Connection to the main web portal of complex system databases.
  4. Study of a set of recommendations aiming to stimulate the production of pilot software intended to standardize the access of the STREPs and the research community to complex system databases.
  5. Study of legal aspects of the access of the STREPs, research groups and non-profit organizations to proprietary databases.

Description of work

The investigation of complex systems relies heavily on the availability of databases. In fact, the STREP search for stylized facts or robust statistical regularities is possible only if access to well-organized databases is available. Complex systems such as complete genomes of organisms, social networks, biological networks and financial markets have moved from being systems characterized by a low rate of data production to being systems characterized by a very high one.
Large databases exist today in the cited fields. Examples are the bioinformatics databases containing genomic, proteomic and microarray data, social databases of income, social databases of consumer preferences, and financial databases containing information about financial transactions and orders. To promote the STREP effort of empirical and theoretical modelling of complex systems, it is very important that new databases are built and continuously updated. From the perspective of the STREPs and this CA, a crucial aspect is the promotion of standards for databases and for software packages able to process and visualize the data recorded in them.

Instruments

In this respect, another key point is the STREPs' access to data. It is extremely important at this stage to reflect on the legal problems related to the proprietary nature of some data. In fact, finding technical and legal solutions that allow the widest dissemination of proprietary data for STREP research on complex systems will provide a competitive advantage to the STREPs and to the entire European research area. In this direction there should be an effort to coordinate the STREPs and their needs, especially the STREPs focused on the development of information technology infrastructures using new protocols and tools such as, for example, the GRID technology.

The growing research interest in the areas of agent-based models, systems biology and self-growing networks, as well as the initiation of the STREPs, creates a direct need for empirical verification or falsification. These empirical studies play a role equivalent to that of the observational experiments performed in scientific disciplines such as astronomy, oceanography or population dynamics. To perform these studies, well-designed, continuously updated and accessible, yet controlled, databases of complex systems should be coordinated at a European level and be at the disposal of the STREPs. The relevance of coordination can be illustrated with an example. Currently, several STREPs, as well as other research groups from different disciplines (finance, econometrics, statistics, mathematics and physics), both in academia and in the financial industry, are investigating the microstructure of financial markets. It is probably worth stating that this kind of investigation has nothing to do with technical analysis of stock market data. On the contrary, these investigations focus on the heart of the economic process determining price formation in an efficient environment. In fact, the robust statistical regularities detected in different markets subject to different trading rules can provide indications for the market itself and for regulating institutions on how to improve the degree of efficiency and minimize the stability problems of the analyzed markets. From a European perspective, this would be especially valuable in view of a progressive convergence of different markets towards a standard of high efficiency and stability.

To perform comparative studies of microstructure in different markets, it is necessary to have access to microstructure data from those markets. A competitive advantage in research arises when the data are collected and/or reconstructed following the same approach across the different markets investigated. Setting up standards in this and in similar fields would allow STREP researchers to widen their access to databases in an efficient and coordinated way.

We intend to coordinate the harmonization of the design, maintenance and development of databases of complex systems. This can be achieved by first organizing a specialized conference devoted to bringing together the STREPs and other active research groups dealing with complex system databases. The conference can also serve as a kick-off activity to develop (i) the guidelines to be followed to realize a European portal of databases used for the empirical analysis and modelling of complex systems and (ii) a set of indications for developing pilot software intended to standardize the access of the STREP research community to complex system databases. A study of the legal problems related to access to proprietary data for research and non-profit activities is also to be promoted.

Deliverables