Development and collaboration on AI-related software projects in the EU and Japan

The EUJapan Observatory offers a range of analyses and visualisations of the development and use of Artificial Intelligence in manufacturing in the EU and Japan. Analysing GitHub software repositories gives interesting insight into how today’s AI tools are being developed.

AI tools are developed with different programming languages and development tools, and they differ in impact and popularity. The EUJapan Observatory offers analysis and visualisation of these data over time. It also visualises domestic and international collaboration in AI software development between Japan and European countries.

JSI, a partner of EU-Japan.AI, developed the EUJapan Observatory, a web platform focused geographically on the EU and Japan that aims to provide users with the latest data on the development and use of Artificial Intelligence in manufacturing and various other industries. The EUJapan Observatory is publicly available at https://eujapan.ijs.si/.

Through the EUJapan Observatory we present analyses of research papers and conferences on AI in manufacturing; news about AI published in media articles and on social media; AI-related open source projects from GitHub; AI-related job postings from the Adzuna service (the source of these visualisations is the OECD AI Observatory, https://oecd.ai/); European projects from the CORDIS dataset (FP7 and Horizon 2020 projects); Japanese projects; and AI investments (based on an OECD report on investments in AI, of which JSI, an EU-Japan.AI partner, was one of the co-authors). Let’s take a look at AI-related software projects in the EU and Japan.

Our data come from AI-related GitHub software repositories. GitHub is a code hosting platform for version control and collaboration. It lets developers from anywhere (over 83 million developers are currently registered on the platform) work on and develop software projects together.

Among all GitHub projects, we first identify those that are related to AI (i.e. AI-related GitHub “repositories”), and then those for which a geographical location can be determined. AI-related concepts are identified by a machine learning algorithm and evolve over time. Our database currently contains data from around 1.4 million AI-related GitHub projects, whose distribution over time is shown in the following figure:

Figure: The number of AI projects over time, including those for which a location can be determined.

The EUJapan Observatory then provides several analyses and visualisations of these data.

For contributions to AI projects by country and project impact, we can view the share of contributions (i.e. “commits”) made to AI projects (i.e. AI-related GitHub “repositories”) by different countries over time.

The EUJapan Observatory offers filtering by the impact of AI projects, measured by the number of managed copies (i.e. “forks”) made of each project.

Figure: The number of GitHub commits to AI projects with more than 100 forks over time.

For instance, we can see that the highest number of “commits” made to AI projects with very high impact (more than 100 forks) comes from Germany, and that this number has increased greatly in the last three years (since 2019).
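
As a rough illustration of the filtering described above, the following Python sketch computes the share of commits per country and year for projects above a chosen threshold. The record layout, the field names (forks, stars) and the sample values are hypothetical and serve only as an example; they are not the Observatory's actual data model or results.

```python
from collections import Counter, defaultdict

# Hypothetical project metadata; field names are assumptions for illustration.
projects = {
    "repo-a": {"forks": 250, "stars": 900},
    "repo-b": {"forks": 40, "stars": 300},
    "repo-c": {"forks": 120, "stars": 80},
}

# Hypothetical commit records: (repository, contributor country, year).
commits = [
    ("repo-a", "DE", 2019),
    ("repo-a", "DE", 2019),
    ("repo-a", "JP", 2019),
    ("repo-b", "FR", 2019),
    ("repo-c", "JP", 2020),
    ("repo-c", "DE", 2020),
    ("repo-c", "DE", 2020),
    ("repo-c", "DE", 2020),
]


def commit_share_by_country(projects, commits, metric="forks", threshold=100):
    """Return {year: {country: share}} for projects whose metric exceeds threshold."""
    counts = defaultdict(Counter)
    for repo, country, year in commits:
        # Keep only commits to projects above the threshold (e.g. more than 100 forks).
        if projects[repo][metric] > threshold:
            counts[year][country] += 1
    return {
        year: {country: n / sum(per_country.values())
               for country, n in per_country.items()}
        for year, per_country in counts.items()
    }


shares_by_forks = commit_share_by_country(projects, commits, metric="forks")
shares_by_stars = commit_share_by_country(projects, commits, metric="stars")
print(shares_by_forks)
print(shares_by_stars)
```

Passing a different metric name, as in the second call, applies the same computation to another project attribute without changing the function.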

A similar, though not identical, analysis covers contributions to AI projects by country and project popularity. Here too we can view the share of contributions (i.e. “commits”) made to AI projects (i.e. AI-related GitHub “repositories”) by different countries over time. In this case, however, AI projects are filtered by popularity, measured by the number of followers (i.e. “stars”) of each project.

Figure: The number of GitHub commits to AI projects with more than 100 stars over time.

This visualisation gives similar information: the highest number of “commits” made to AI projects with very high popularity (more than 100 stars) comes from Germany. Here, however, the increase in the number of “commits” started in 2018 and slowed down in 2021.

The visualisation of AI projects by programming language or tool shows the share of AI projects that use each programming language or development tool over time.

Figure: The share of GitHub AI projects by the programming language or tool they use.

We can see that the most popular programming language for AI development is Python; however, its use has been slowly declining since 2019. The second most popular is the Jupyter Notebook development tool, whose popularity has been increasing for almost the last 7 years.
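
A share of this kind could be derived roughly as in the Python sketch below. It assumes that each project is assigned a single language or tool and is counted in the year it was created; both are assumptions made for this example, and the sample records are invented rather than taken from the Observatory.

```python
from collections import Counter, defaultdict

# Hypothetical records: (repository, primary language or tool, year created).
repos = [
    ("repo-a", "Python", 2019),
    ("repo-b", "Python", 2019),
    ("repo-c", "Jupyter Notebook", 2019),
    ("repo-d", "C++", 2019),
    ("repo-e", "Python", 2020),
    ("repo-f", "Jupyter Notebook", 2020),
]


def language_share_by_year(repos):
    """Return {year: {language: share of that year's projects}}."""
    counts = defaultdict(Counter)
    for _repo, language, year in repos:
        counts[year][language] += 1
    return {
        year: {language: n / sum(per_language.values())
               for language, n in per_language.items()}
        for year, per_language in counts.items()
    }


language_shares = language_share_by_year(repos)
print(language_shares)
```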

Finally, we can explore domestic and international collaboration in AI software development. Two countries are said to collaborate on an AI project if there is at least one user from each country with at least one contribution to the project. Domestic collaboration occurs when two users from the same country contribute to the same project.
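
The definition above can be sketched in Python as follows. This is a minimal illustration, assuming contributions are available as (repository, user, country) records; the layout, user names and values are hypothetical. A pair of two different countries counts once per project in which both have at least one contributor, and a pair of the same country counts once per project with at least two distinct users from that country.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical contribution records: (repository, user, user's country).
contributions = [
    ("repo-a", "alice", "DE"),
    ("repo-a", "kenji", "JP"),
    ("repo-a", "hana", "JP"),
    ("repo-b", "marie", "FR"),
    ("repo-c", "yuki", "JP"),
    ("repo-c", "luc", "FR"),
    ("repo-c", "alice", "DE"),
]


def collaboration_counts(contributions):
    """Count projects per country pair; (X, X) denotes domestic collaboration."""
    # Group the distinct contributing users of each project by country.
    users_by_repo = defaultdict(lambda: defaultdict(set))
    for repo, user, country in contributions:
        users_by_repo[repo][country].add(user)

    counts = Counter()
    for per_country in users_by_repo.values():
        # International: every pair of distinct countries present in the project.
        for a, b in combinations(sorted(per_country), 2):
            counts[(a, b)] += 1
        # Domestic: at least two distinct users from the same country.
        for country, users in per_country.items():
            if len(users) >= 2:
                counts[(country, country)] += 1
    return counts


collab = collaboration_counts(contributions)
print(collab)
```

Country pairs are stored in sorted order, so the collaboration between Germany and Japan is always found under the key ("DE", "JP") regardless of the order of the input records.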

Figure: Domestic and international collaboration in AI software development between Japan and European countries.

Author: Matej Kovačič
