Collaboration Networks

Author

Janpieter VAN DER POL

Collaborations

In strategic technological intelligence, collaboration is a high-value information signal. A technological collaboration implies pooling technical capabilities, which are at the core of firms’ competitive advantage. Observing two actors combining capabilities is therefore a strong signal of technological need.

Collaboration among innovative firms has intensified over several decades (Duysters and Hagedoorn 2000). As technologies become more complex, it is increasingly difficult for one actor to master all required techniques and technologies. As a result, actors build links with others, sometimes even competitors, in order to innovate (Hagedoorn and Narula 1996; Narula and Hagedoorn 1999). Collaboration has been identified as beneficial for firms (McEvily and Marcus 2005), innovation (Kogut and Zander 1992), and growth/survival (McEvily and Marcus 2005; Watson 2007).

The discussion above concerns mainly technological collaboration, where each firm contributes technical input. But collaboration can also be financially motivated. A firm with technical capability but insufficient funds may partner with another actor to finance research. In some data sources, distinguishing motivation is difficult. For example, in patents, having two assignees does not imply equal technological contribution; one actor may be listed due to financial participation. The distinction is clearer in funded research projects where the funder is explicitly identified and participants are expected to provide technological contributions (ANR and European projects).

In a knowledge economy, a firm’s key asset is the knowledge it controls and mobilizes for market positioning (Penrose 1959). From a Knowledge-Based View, collaboration is risky because it opens access to strategically valuable knowledge (Penrose 1959). In that sense, collaboration is a strong indicator of real technological need when alternatives are limited.

When competing firms collaborate and even share intellectual property, this typically indicates that both actors could not obtain the required capability through easier alternatives. In this chapter, collaboration is analyzed mainly through a knowledge-flow lens. These flows may include non-strategic information (tool recommendations, organizational practices, managerial insights) as well as core technical knowledge. They can diffuse from collaborator to collaborator and propagate through the network.

This propagation can be beneficial in two ways:

Better information can improve innovation across the network and ultimately benefit users.
Incoming new ideas improve an actor’s own inventive potential.

An actor that is too closed may face low diversity of ideas, reducing creativity. Repeated collaboration can increase productivity because actors work better together over time, but it can also reduce creativity.

These dynamics are visible in network representations. Strategically, they reveal how actors organize external technological sourcing.

As with any network, we follow a three-level analysis (macro, meso, micro), with a clear objective at each level.

Figure 1: Summary table of information extracted from collaboration networks by analysis level (macro, meso, micro).

For more detail on collaboration effects, see Van Der Pol (2016).

Analyzing a Collaboration Network

This section addresses collaboration networks between actors. Such networks can be built from multiple data sources, so we first discuss source-specific biases and limits.

Data and Network Construction

This subsection explains which information is used to generate collaboration networks from different data sources, and highlights limits and complementary signals relevant for analysis.

Patents

Patent data sources are numerous and differ widely in coverage and cleaning quality. Patent office data are generally available for free, but often do not fully reflect ownership changes tracked in specialized databases. Commercial sources (e.g., Questel Orbit, Orbis-IP) aggregate multiple offices and resolve many issues related to firm name changes and parent-subsidiary structures. Lower-cost alternatives such as PATSTAT exist but require more manual affiliation cleaning. Free sources such as Google Patents can also be useful.

In what follows, we stay close to primary sources (office PDFs, Espacenet, Google Patents) to avoid provider-specific adjustments. The key point is to be aware of potential biases and verify them explicitly.

In the simplest form, a collaboration edge is created when two or more actors appear on the same patent document as assignees.

Figure 2: Example of collaboration extraction from patent assignees (field 73). If two or more assignees are listed, create nodes and edges; filing year can be used as temporal information.
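The extraction rule above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's actual pipeline: the record layout (`id`, `year`, `assignees`) and the actor names are hypothetical, and networkx is assumed as the graph library.

```python
from itertools import combinations

import networkx as nx

# Hypothetical records: one entry per patent document (not per family).
patents = [
    {"id": "P1", "year": 2015, "assignees": ["Firm A", "Firm B"]},
    {"id": "P2", "year": 2017, "assignees": ["Firm A", "Firm B", "Lab C"]},
    {"id": "P3", "year": 2018, "assignees": ["Firm D"]},  # single assignee: no edge
]


def build_coassignment_network(records):
    """Create one edge per pair of assignees listed on the same document."""
    graph = nx.Graph()
    for rec in records:
        names = sorted(set(rec["assignees"]))
        if len(names) < 2:
            continue  # no collaboration signal on this document
        for a, b in combinations(names, 2):
            if graph.has_edge(a, b):
                graph[a][b]["weight"] += 1
                graph[a][b]["years"].append(rec["year"])
            else:
                graph.add_edge(a, b, weight=1, years=[rec["year"]])
    return graph


G = build_coassignment_network(patents)
print(sorted(G.edges(data="weight")))
```

The edge weight counts shared documents and the `years` list keeps the filing years as temporal information.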

Patent analysis is often done at the family level (all unitary documents covering one invention). This may create issues when assignees differ across documents within the same family.

Figure 3: Two patent documents from the same family with different assignees. Herakles and Airbus Safran Launchers reflect a name change; without cleaning, both appear as distinct nodes.
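One possible way to handle such name changes is to map every known alias to a canonical name before building the network. The sketch below assumes a hand-curated alias table (the `ALIASES` dictionary is hypothetical, as is the partner name), and the choice of which name is canonical is arbitrary.

```python
import networkx as nx

# Hypothetical alias table, to be curated and checked manually.
ALIASES = {
    "HERAKLES": "AIRBUS SAFRAN LAUNCHERS",
}


def canonical(name):
    """Normalize case and whitespace, then apply the alias table."""
    key = " ".join(name.upper().split())
    return ALIASES.get(key, key)


def harmonize(graph):
    """Merge nodes that refer to the same actor, summing edge weights."""
    merged = nx.Graph()
    for a, b, data in graph.edges(data=True):
        ca, cb = canonical(a), canonical(b)
        if ca == cb:
            continue  # self-loop created by the merge
        weight = data.get("weight", 1)
        if merged.has_edge(ca, cb):
            merged[ca][cb]["weight"] += weight
        else:
            merged.add_edge(ca, cb, weight=weight)
    return merged


raw = nx.Graph()
raw.add_edge("Herakles", "Partner X", weight=1)
raw.add_edge("Airbus Safran Launchers", "Partner X", weight=2)
clean = harmonize(raw)
print(list(clean.edges(data="weight")))
```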

The most critical case is patent transfer. A single document in a family may be sold, making the buyer appear as a family assignee without any original collaboration.

Figure 4: Partial assignee transfer case: Google acquired Mosaid, which had acquired North American patents from Pirelli. Family-level data can falsely look like co-assignment collaboration.
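A simple check for this bias is to compare the pairs obtained at family level with the pairs obtained document by document. Pairs that exist only at family level are candidates for manual verification. The family below is hypothetical and the function names are illustrative.

```python
from itertools import combinations

# Hypothetical family: each document lists its own assignees.
family = {
    "EP-doc": ["Original Owner"],
    "US-doc": ["Buyer"],  # document transferred after filing
}


def family_level_pairs(docs):
    """Pairs obtained by pooling all assignees of the family."""
    names = sorted({name for assignees in docs.values() for name in assignees})
    return set(combinations(names, 2))


def document_level_pairs(docs):
    """Pairs of assignees that appear together on at least one document."""
    pairs = set()
    for assignees in docs.values():
        pairs |= set(combinations(sorted(set(assignees)), 2))
    return pairs


# Pairs seen only at family level: possible transfers, not collaborations.
suspect = family_level_pairs(family) - document_level_pairs(family)
print(suspect)
```

A suspect pair is not proof of a transfer (assignees can also be recorded inconsistently across offices), so it only flags cases to inspect.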

Remark

A patent co-assignment link is interpreted as pooling technical capabilities and implies knowledge flow. Because it includes shared intellectual property, it is generally a strong tie.

Scientific Publications

Scientific outputs are also widely used for collaboration analysis. Here the key entities are author affiliations, not the authors themselves (co-author networks are treated later).

Publication data are often behind paywalls (Scopus, Web of Science), but free sources exist with different coverage (e.g., ScanR, HAL-linked records, arXiv preprints). Commercial providers usually offer cleaner affiliation matching.

Figure 5: Affiliations used to build collaboration networks from scientific publications.

As with patents, we can add publication date to collaboration edges. However, collaboration necessarily predates publication because research and publication take time. Compared with patents, affiliations (especially universities/research institutes) are often more stable.

Remark

A publication collaboration tie is usually weaker than patent co-assignment because it does not involve direct IP ownership. It still indicates shared work and knowledge flow.

European Projects

A European project is research funded fully or partly by the European Commission, usually in response to calls aligned with EU priorities (e.g., links with China, technological standardization).

These projects generally include diverse actors (multinationals, SMEs, startups, universities, research institutes). Although all participants contribute, they are not necessarily in direct bilateral contact. Knowledge flow should therefore not be over-interpreted as universal across all pairs.

Data are available through the European Commission (CORDIS). Raw files provide detailed project-level information (partners, dates, funding amounts, country), but no full affiliation cleaning.

Unlike patents and publications (outputs), European projects are research inputs and should be interpreted accordingly.

Figure 6: Example project page from CORDIS. Projects include roles such as coordinator and participants; role metadata can be carried into the network.

Project start and end dates are available and provide clearer temporal anchors than publication/patent dates.

Remark

A project tie indicates shared contribution to a project objective. It does not necessarily imply strong direct knowledge flow between all participant pairs. Complementary publication evidence is useful to refine interpretation.
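How a project is converted into edges encodes an assumption about knowledge flow. The sketch below contrasts two illustrative choices, neither of which is prescribed by the chapter: a clique projection (every pair of participants is linked) and a star projection (participants are linked only to the coordinator). The project record is hypothetical.

```python
from itertools import combinations

import networkx as nx

# Hypothetical project record.
project = {"id": "PRJ-1", "coordinator": "Coord", "participants": ["P1", "P2", "P3"]}


def clique_projection(p):
    """Link every pair of members: assumes all pairs exchange knowledge."""
    g = nx.Graph()
    members = [p["coordinator"], *p["participants"]]
    g.add_edges_from(combinations(members, 2), project=p["id"])
    return g


def star_projection(p):
    """Link participants to the coordinator only: a more conservative reading."""
    g = nx.Graph()
    g.add_edges_from(((p["coordinator"], m) for m in p["participants"]), project=p["id"])
    return g


clique = clique_projection(project)
star = star_projection(project)
print(clique.number_of_edges(), star.number_of_edges())
```

With n members, the clique creates n(n-1)/2 edges against n-1 for the star, so large consortia weigh much more heavily in clique-based indicators.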

ANR Projects

ANR-funded projects are similar to EU projects in structure. Data are available on ANR project pages and as CSV/XLSX files (e.g., via data.gouv.fr), including coordinator, partners, project start date, duration, and additional fields such as abstracts.

Figure 7: Information available for ANR-funded projects.

Remark

ANR project ties should be interpreted with the same caution as EU project ties: they indicate knowledge combination potential and possible knowledge flow, not guaranteed direct bilateral transfer.

On Combining Sources

Given multiple sources, it is tempting to build one global collaboration network. But combining sources implicitly treats all links as equivalent (patent, EU project, publication, ANR), which creates interpretation issues.

Main risks:

One project can produce multiple publications, inflating edge weights if each source event is counted independently.
Patent co-assignment is less frequent than publication/project collaboration; its stronger, rarer ties may be diluted in denser mixed-source networks.
The key interpretation problem is semantic: what does the combined network actually represent? Mixed networks are useful to map an ecosystem visually, but network indicators become harder to interpret unless source-specific analyses are run in parallel.
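One way to keep a combined network interpretable is to store the source of each link on the edge, so that source-specific layers can still be extracted and analyzed in parallel. The event format and source labels below are hypothetical; this sketch keeps sources apart but does not by itself detect that several publications stem from one project.

```python
import networkx as nx

# Hypothetical collaboration events: (actor_a, actor_b, source, event_id).
events = [
    ("A", "B", "patent", "pat-1"),
    ("A", "B", "publication", "pub-1"),
    ("A", "B", "publication", "pub-2"),
    ("B", "C", "eu_project", "prj-1"),
]


def build_layered(records):
    """Build one graph whose edges keep a per-source event count."""
    g = nx.Graph()
    for a, b, source, _event_id in set(records):  # drop exact duplicate events
        if not g.has_edge(a, b):
            g.add_edge(a, b, by_source={})
        counts = g[a][b]["by_source"]
        counts[source] = counts.get(source, 0) + 1
    return g


def layer(g, source):
    """Extract the subnetwork made of edges observed in one source."""
    keep = [(a, b) for a, b, d in g.edges(data=True) if source in d["by_source"]]
    return g.edge_subgraph(keep).copy()


combined = build_layered(events)
patent_layer = layer(combined, "patent")
print(combined["A"]["B"]["by_source"], patent_layer.number_of_edges())
```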

For an example combining scientific publications and patents, see Pol and Rameshkoumar (2018).

Analyzing a Collaboration Ecosystem

To understand innovation emergence in a domain, collaboration network analysis is essential. It helps identify actors open to collaboration, communities with repeated collaboration patterns, and each actor’s local ecosystem. Combined with a standard domain analysis, this informs external technological sourcing strategies.

Here we analyze 5G collaboration networks from Scopus affiliations. We compare:

Publications with funding acknowledgment.
Publications without funding acknowledgment.

Scopus identifies funding either from author declarations or acknowledgment text. The dataset includes 22,532 scientific documents between 2010 and 2021: 7,435 with funding and 13,107 without.

Macro Analysis of the Collaboration Network

Figure 8 shows the affiliation collaboration network built from non-funded publications only. An edge indicates at least two distinct co-publications (edge weight >= 2).

Figure 8: Collaboration network from non-funded scientific publications. Node size represents number of collaborators; node color represents Louvain community.
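The weight threshold can be applied as a simple edge filter. The sketch below assumes the number of co-publications is stored in a `weight` edge attribute; the toy graph is hypothetical.

```python
import networkx as nx


def threshold_network(graph, min_weight=2):
    """Keep edges with weight >= min_weight; nodes left without edges are dropped."""
    keep = [
        (a, b)
        for a, b, w in graph.edges(data="weight", default=1)
        if w >= min_weight
    ]
    return graph.edge_subgraph(keep).copy()


toy = nx.Graph()
toy.add_weighted_edges_from([("A", "B", 3), ("B", "C", 1), ("C", "D", 2), ("D", "E", 1)])
filtered = threshold_network(toy, min_weight=2)
print(sorted(filtered.edges()))
```

Raising the threshold removes one-off co-publications and focuses the analysis on repeated collaboration, at the cost of dropping actors whose only ties are occasional.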

The network has a single connected component (no isolated collaboration clusters). It contains 526 actors and 1,185 collaboration links, for a mean degree of 4.5 (2 × 1,185 / 526). Density is low (0.009) and clustering is high (0.44), indicating clear community structure. Louvain community detection identifies 8 communities, with a modularity of 0.697.
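These macro indicators can be computed with networkx along the following lines. This is a sketch under assumptions: the chapter does not state which clustering definition or software was used, so the average local clustering coefficient and the `louvain_communities` function (available in recent networkx releases) are illustrative choices, and the toy graph stands in for the real network.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity


def macro_summary(graph, seed=42):
    """Return basic macro-level indicators for an undirected network."""
    n, m = graph.number_of_nodes(), graph.number_of_edges()
    communities = louvain_communities(graph, seed=seed)
    return {
        "actors": n,
        "links": m,
        "components": nx.number_connected_components(graph),
        "mean_degree": 2 * m / n,  # each link contributes to two degrees
        "density": nx.density(graph),
        "clustering": nx.average_clustering(graph),
        "modularity": modularity(graph, communities),
        "n_communities": len(communities),
    }


# Toy graph: two 5-node cliques joined by a single link.
toy = nx.barbell_graph(5, 0)
summary = macro_summary(toy)
print(summary)
```

Louvain is a randomized heuristic, so the number of communities and the modularity can vary slightly between runs unless the seed is fixed.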

No single actor structures the whole network; instead, central actors are mostly central within their own communities. The global structure is driven by interconnection of well-defined communities rather than a strict hierarchical architecture.

Meso Analysis

Communities identified at macro level suggest a geographic logic. To test this, nodes are colored by country in Figure 9.

Figure 9: Collaboration network from non-funded publications, colored by country.
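The visual test can be complemented by a simple count: for each detected community, the share of members belonging to its most frequent country. The helper and the toy data below are hypothetical; a share close to 1 supports the geographic reading.

```python
from collections import Counter


def community_country_shares(communities, country_of):
    """For each community, report its dominant country and that country's share."""
    rows = []
    for idx, members in enumerate(communities):
        counts = Counter(country_of[m] for m in members)
        country, top = counts.most_common(1)[0]
        rows.append(
            {
                "community": idx,
                "size": len(members),
                "dominant_country": country,
                "share": top / len(members),
            }
        )
    return rows


# Hypothetical communities and country labels.
communities = [{"Univ FR 1", "Univ FR 2", "Firm X"}, {"Univ JP 1", "Univ JP 2"}]
country_of = {
    "Univ FR 1": "FR", "Univ FR 2": "FR", "Firm X": "SE",
    "Univ JP 1": "JP", "Univ JP 2": "JP",
}
rows = community_country_shares(communities, country_of)
for row in rows:
    print(row)
```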

Nationally structured communities become more explicit. Industrial actors often act as gatekeepers between national research communities. A salient case is the Japanese cluster, weakly connected to the rest of the network, with private actors (e.g., NEC, Fujitsu) bridging to external communities.

Universities are central within national communities, while large industrial actors are more central at the global network level (e.g., Huawei, Ericsson, Nokia, IBM, Intel).
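Gatekeeping can be approximated with two node-level indicators: betweenness centrality (how often an actor lies on shortest paths) and the share of an actor's links that cross a national border. Both are illustrative operationalizations rather than the chapter's stated method, and the toy network with its `country` node attribute is hypothetical.

```python
import networkx as nx


def gatekeeper_scores(graph, country_attr="country"):
    """Combine betweenness with the share of links going to other countries."""
    betweenness = nx.betweenness_centrality(graph)
    scores = {}
    for node in graph:
        neighbours = list(graph[node])
        foreign = [
            v for v in neighbours
            if graph.nodes[v][country_attr] != graph.nodes[node][country_attr]
        ]
        scores[node] = {
            "betweenness": betweenness[node],
            "foreign_share": len(foreign) / len(neighbours) if neighbours else 0.0,
        }
    return scores


# Toy network: two national triangles bridged by one firm.
toy = nx.Graph()
toy.add_nodes_from(["JP_U1", "JP_U2", "JP_Firm"], country="JP")
toy.add_nodes_from(["EU_U1", "EU_U2", "EU_U3"], country="EU")
toy.add_edges_from([
    ("JP_U1", "JP_U2"), ("JP_U1", "JP_Firm"), ("JP_U2", "JP_Firm"),
    ("JP_Firm", "EU_U1"),
    ("EU_U1", "EU_U2"), ("EU_U1", "EU_U3"), ("EU_U2", "EU_U3"),
])
scores = gatekeeper_scores(toy)
print(scores["JP_Firm"], scores["JP_U1"])
```

In this toy case the firm has a high betweenness because every path between the two national groups passes through it, while the universities of its own group have none.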

The French community is relatively inward-looking, with links to the wider network primarily mediated by industrial actors (Orange Labs, Thales, Montimage) and by Grenoble. The Japanese community is also relatively peripheral, while the US community is more distributed and the Chinese community is structured around major universities strongly connected internationally.

Micro Analysis of the French Collaboration Subnetwork

Figure 10 shows links involving at least one French actor.

Figure 10: French 5G collaboration subnetwork (all links with at least one French actor).
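Such a subnetwork can be extracted by keeping every edge with at least one endpoint in the focal country. The sketch assumes a `country` node attribute; the toy actors are hypothetical.

```python
import networkx as nx


def national_subnetwork(graph, country, country_attr="country"):
    """Keep all links with at least one endpoint located in `country`."""
    keep = [
        (a, b)
        for a, b in graph.edges()
        if country in (graph.nodes[a][country_attr], graph.nodes[b][country_attr])
    ]
    return graph.edge_subgraph(keep).copy()


toy = nx.Graph()
toy.add_nodes_from(["FR_Lab", "FR_Firm"], country="FR")
toy.add_node("US_Univ", country="US")
toy.add_node("CN_Univ", country="CN")
toy.add_edges_from([("FR_Lab", "FR_Firm"), ("FR_Firm", "CN_Univ"), ("US_Univ", "CN_Univ")])

french = national_subnetwork(toy, "FR")
print(sorted(french.nodes()))
```

Foreign partners of French actors are kept, but links between two foreign actors are dropped, so the result shows the French ecosystem rather than the full neighbourhood.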

CNRS is highly central and connected to many universities. Major structuring actors include Rennes, Grenoble, and INRIA. Industrial actors (e.g., Thales, Orange; and at European scale Nokia, STMicroelectronics) are present but often less interconnected with each other than with universities.

Each industrial actor tends to have its own local collaboration ecosystem around specific academic partners. Huawei appears in a more peripheral position through links with Rennes and Thales. US actors are largely absent from this French subnetwork.

Comparison with the Funded Network

We now compare the non-funded network with the funded network (multi-affiliation publications with explicit funding acknowledgment).

Figure 11: Funded collaboration network with the same country color code as the non-funded network.

Visual comparison suggests both networks have one giant component. The funded network appears denser, with the French community closer to the broader European cluster. The Japanese community is less visible. To validate visual interpretation, we use summary statistics.

Table 1 compares funded, non-funded, and combined networks.

Table 1: Network statistics comparison between funded, non-funded, and combined collaboration networks.

Metric	Funded Network	Non-Funded Network	Combined
Actors	587	526	755
Links	1731	1185	2494
Density	0.010	0.009	0.009
Centralization	0.0045	0.0055	0.0033
Clustering	0.511	0.440	0.470
Triangles	1919	716	2903
Diameter	10	10	9
Average path length	3.64	3.886	3.49
Mean degree	5.9	4.5	6.6
TPT (triangles per actor)	3.269	1.361	3.845

The funded network includes 61 more actors and 546 more links than the non-funded network, indicating both entry of new actors and additional collaboration among existing actors. Higher clustering and triangle counts indicate stronger local densification in funded collaborations.
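Several indicators can be re-derived from the actor, link, and triangle counts alone, which gives a quick consistency check. The sketch uses the standard formulas for an undirected network (mean degree 2L/N, density 2L/(N(N-1))); reading TPT as triangles divided by actors is an inference from the reported numbers, not a definition given in the text.

```python
def derived(actors, links, triangles):
    """Recompute count-based indicators for an undirected network."""
    return {
        "mean_degree": round(2 * links / actors, 1),
        "density": round(2 * links / (actors * (actors - 1)), 3),
        "triangles_per_actor": round(triangles / actors, 3),
    }


# Counts taken from the comparison table.
funded = derived(587, 1731, 1919)
non_funded = derived(526, 1185, 716)
combined = derived(755, 2494, 2903)

for name, stats in [("funded", funded), ("non-funded", non_funded), ("combined", combined)]:
    print(name, stats)
```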

The combined network (funded + non-funded) has 755 actors and 2,494 links. Its diameter (9) and average path length (3.49) are shorter than in either network alone, indicating stronger overall connectivity.

Examples of actors appearing in only one of the two networks are shown below:

Actors Exclusive to Non-Funded Network	Actors Exclusive to Funded Network
AIRBUS EP	ECOLE NORMALE SUPERIEURE PARIS
CEMEF FR	INST MINES TELECOM FR
DISPOSABLE LAB	ISEP FRANCE
LRI UNIV PARIS SUD FR	ITER
METEOR NETWORK	OMMIC S A S BREVANNES FRANCE
TECHNOLOGICAL RES INST SYSTEMX	TELECOM RES DEVELOPMENT LANNION
UNIVs FRANCHE COMTE	VIRTUAL OPEN SYSTEMS FRANCE
UNIVs AIX MARSEILLE	WHEN AB PARIS FRANCE
UNIVs LYON FR
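Lists of exclusive actors follow directly from set differences on the node sets, and the combined network is the union of the two graphs. The toy graphs below are hypothetical placeholders for the funded and non-funded networks.

```python
import networkx as nx

# Hypothetical toy versions of the two networks.
funded = nx.Graph([("A", "B"), ("B", "C")])
non_funded = nx.Graph([("A", "B"), ("A", "D")])

only_funded = set(funded) - set(non_funded)      # actors seen only with funding
only_non_funded = set(non_funded) - set(funded)  # actors seen only without funding

# Union of nodes and edges. Note: when an edge exists in both graphs,
# nx.compose keeps the attributes of the second graph; weights are not summed.
combined = nx.compose(funded, non_funded)

print(only_funded, only_non_funded, combined.number_of_edges())
```

Because the two networks share actors and links, the combined counts are smaller than the sum of the two separate counts.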

Analyzing a Co-Author Network

Co-author network analysis can answer strategic questions such as:

Which researchers are leading figures in a field?
Which researchers are most influential within a focal organization?
What are the internal research teams?
What research themes are emerging?
How is R&D organized within an actor?

We illustrate this with one research laboratory and one firm case.

Researcher Network

This example uses scientific outputs (journal papers, books, conference communications, posters, etc.) to:

Map the research ecosystem (external collaborators of laboratory researchers).
Identify internal researchers who provide access to external actors.
Identify internal collaboration patterns.
Identify research teams.

We first focus on internal links only (researchers from the same lab), using five years of outputs and filtering to intra-lab co-authorship ties.

Figure 12: Internal co-publication network of researchers in a research laboratory. Node size = degree; node color = Louvain community; edge thickness = collaboration frequency.

The network has four connected components. Secondary components indicate groups collaborating internally without connecting to the rest of the lab network. Reasons can vary (topic specialization, recent project-based collaboration, local organizational effects).
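Filtering to intra-lab ties and listing connected components can be sketched as follows. The researcher names and the membership list are hypothetical. Researchers without any internal tie are removed here, which is one possible convention.

```python
import networkx as nx


def internal_network(coauthor_graph, lab_members):
    """Keep only ties between lab members, then drop members with no internal tie."""
    internal = coauthor_graph.subgraph(lab_members).copy()
    internal.remove_nodes_from(list(nx.isolates(internal)))
    return internal


# Hypothetical co-authorship graph including one external collaborator.
full = nx.Graph([
    ("R1", "R2"), ("R2", "R3"), ("R4", "R5"),
    ("R1", "Ext1"), ("R4", "Ext1"),
])
lab = ["R1", "R2", "R3", "R4", "R5"]

internal = internal_network(full, lab)
components = sorted(nx.connected_components(internal), key=len, reverse=True)
print(components)
```

In this toy case the two internal groups are connected in the full network through the external collaborator, but appear as separate components once external ties are filtered out.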

The giant component itself is community-structured, connected through bridge researchers (gatekeepers). These bridging roles are important for laboratory cohesion and cross-community information flow.

Some actors are central inside a community but do not bridge communities, reflecting specialization. Edge thickness also reveals dyads and triads that collaborate repeatedly.

For lab management, both types of actors matter:

Community cores (theme leaders) for thematic steering.
Bridge actors for cross-team cohesion and information circulation.

Internal collaboration is only part of the picture. External openness also matters.

Figure 13: Full co-authorship network for the laboratory over its whole period. Blue nodes represent internal researchers; other nodes are external collaborators.

This full network shows a dense internal core plus more externally oriented researchers at the periphery. Some “ball-like” clusters represent single publications with many co-authors.

To inspect temporal dynamics, edges can be colored by recency:

Figure 14: Laboratory co-authorship network with recent links (<= 5 years) in blue and older links (> 5 years) in red.

This highlights that many observed links are relatively old. That does not imply collaboration has ended, but it helps distinguish stable legacy ties from recently active ones.
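The recency coloring can be derived from the year of the last joint publication on each edge. The sketch assumes that year is stored in a hypothetical `last_year` edge attribute and uses the same five-year window and colors as the figure.

```python
import networkx as nx


def color_by_recency(graph, current_year, window=5, year_attr="last_year"):
    """Tag each edge blue if last active within `window` years, red otherwise."""
    colors = {
        (a, b): "blue" if current_year - last <= window else "red"
        for a, b, last in graph.edges(data=year_attr)
    }
    nx.set_edge_attributes(graph, colors, "color")
    return graph


toy = nx.Graph()
toy.add_edge("A", "B", last_year=2021)
toy.add_edge("B", "C", last_year=2012)
color_by_recency(toy, current_year=2023)
print(list(toy.edges(data="color")))
```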

A methodological caveat remains: publications with very large author lists often imply weaker average pairwise collaboration intensity.
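One common way to account for this is fractional weighting, where a paper with n authors adds 1/(n-1) to each pair instead of 1. This convention is not taken from the chapter; it is shown here only as one option, with hypothetical author lists.

```python
from itertools import combinations

import networkx as nx


def fractional_coauthor_network(papers):
    """Each paper with n authors adds 1/(n-1) to the weight of every author pair."""
    g = nx.Graph()
    for authors in papers:
        authors = sorted(set(authors))
        n = len(authors)
        if n < 2:
            continue  # single-author papers carry no collaboration
        share = 1.0 / (n - 1)
        for a, b in combinations(authors, 2):
            previous = g.get_edge_data(a, b, default={"weight": 0.0})["weight"]
            g.add_edge(a, b, weight=previous + share)
    return g


papers = [["A", "B"], ["A", "B", "C", "D", "E"]]
G = fractional_coauthor_network(papers)
print(G["A"]["B"]["weight"], G["C"]["D"]["weight"])
```

With this weighting, a two-author paper counts as a full tie while each pair on a five-author paper receives only a quarter of a tie.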

Solvay Inventor Network

Inventor identity is mandatory in patent filing, making inventors a rich signal for analyzing firm R&D activity through networks. Inventor networks reveal teams working on specific themes, allowing estimation of effort allocation by topic and detection of cross-theme mobility when specialized inventors begin co-inventing with other technical communities.

This case uses Solvay patents from Questel Orbit, 1996 to present, totaling 3,396 patent families. Over this period, Solvay acquired Rhodia and Cytec; inventor-network structure is expected to reflect these integrations.

Figure 15: Solvay inventor collaboration network. Node = inventor; edge = at least one shared patent; node color = modularity community.

Figure 16: Solvay inventor network with temporal activity overlay. Red nodes: no co-filing since 2015; green nodes: at least one filing since 2016; node size = centrality.

Figure 17: Alternative Solvay inventor view with the same encoding of activity and centrality.
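The encoding used in these figures can be reproduced from a list of patent families with their filing year and inventors. The data layout and inventor labels below are hypothetical; the activity rule (green if at least one filing since 2016, red otherwise) follows the figure legend.

```python
from itertools import combinations

import networkx as nx

# Hypothetical patent families: (filing_year, inventors).
families = [
    (2010, ["I1", "I2"]),
    (2018, ["I2", "I3"]),
]


def inventor_network(records, active_since=2016):
    """Link co-inventors and flag each inventor by the year of the last filing."""
    g = nx.Graph()
    for year, inventors in records:
        inventors = sorted(set(inventors))
        for name in inventors:
            g.add_node(name)
            g.nodes[name]["last_year"] = max(year, g.nodes[name].get("last_year", year))
        g.add_edges_from(combinations(inventors, 2))
    for name in g.nodes:
        recent = g.nodes[name]["last_year"] >= active_since
        g.nodes[name]["status"] = "green" if recent else "red"
    return g


G = inventor_network(families)
print(dict(G.nodes(data="status")))
```

Inventors flagged red who were central in the past may indicate departures or thematic shifts, which is worth checking against the acquisition history.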

References

Duysters, Geert, and John Hagedoorn. 2000. “Organizational Modes of Strategic Technology Partnering.” Journal of Scientific & Industrial Research 59: 640–49.

Hagedoorn, J., and R. Narula. 1996. “Choosing Organizational Modes of Strategic Technology Partnering: International and Sectoral Differences.” Journal of International Business Studies 27 (2): 265–84.

Kogut, Bruce, and Udo Zander. 1992. “Knowledge of the Firm, Combinative Capabilities, and the Replication of Technology.” Organization Science 3 (3): 383–97.

McEvily, B., and A. Marcus. 2005. “Embedded Ties and the Acquisition of Competitive Capabilities.” Strategic Management Journal 26 (11): 1033–55.

Narula, Rajneesh, and John Hagedoorn. 1999. “Innovating Through Strategic Alliances: Moving Towards International Partnerships and Contractual Agreements.” Technovation 19 (5): 283–94.

Penrose, Edith T. 1959. “The Theory of the Growth of the Firm.” Cambridge, MA.

Pol, Johannes van der, and Jean-Paul Rameshkoumar. 2018. “The Co-Evolution of Knowledge and Collaboration Networks: The Role of the Technology Life-Cycle.” Scientometrics 114 (1): 307–23.

Van Der Pol, Johannes. 2016. “Social network of firms, innovation and industrial performance.” Theses 2016BORD0207. Université de Bordeaux. https://tel.archives-ouvertes.fr/tel-01532053.

Watson, J. 2007. “Modeling the Relationship Between Networking and Firm Performance.” Journal of Business Venturing 22 (6): 852–74.