If collaboration is good, is more collaboration better?
Project management methodologies that have been successful in production-centric environments (e.g., agile, DevOps, lean) are increasingly being deployed in big data projects. However, big data projects are a combination of production and creative work.
Software engineering and development is arguably production-centric and well-suited to optimisation workflows. On the other hand, Science, and research in general, explores how to reach a long-term goal. Outcomes of the scientific process are highly non-linear; significant results are obtained in a similar fashion to an artist’s creative breakthrough.
Data science is no exception to Science; it is a creative endeavour, not a production one.
To maximise the potential of data science teams, one should provide an environment that is suitable for creativity. Fortunately, that’s a well-researched area with evidence-based answers; unfortunately, these findings are often ignored.
Adequate collaboration is the most critical enabler of creativity, but not all collaboration principles are equal: how many works of art, such as paintings or books, are the products of teamwork? In short, not that many.
Creative “outbursts”, in research or artistic pursuits, follow the common pattern of mentally reaching out to novel – and apparently unrelated – ideas to solve a problem or fulfill a vision, until they coalesce into what, to an outsider, appears as an epiphany.
Artists’ collaborative life is well documented: from circles of philosophers to Andy Warhol’s Factory and the close connections between painters and writers in European capitals a few centuries ago. The most creative people experience a mixture of solitary work and external influences: the collaborative aspect has less to do with creating the work and more to do with the inspiration it provides.
A number of studies on aspects as diverse as the ideation process, the quality and success of Broadway shows, or communication vs. productivity in the workplace all demonstrate the same two points: too much close collaboration is harmful, as it naturally leads to cliques, groupthink, and echo chambers, while too little contact with “the outside world” also hampers creativity.
In the realm of research – academic or otherwise – that form of collaboration had been in place for a long time: a personal space to create (the fast-disappearing office), and a collective space to make face-to-face contact and exchange ideas (the fast-disappearing workplace cafeteria, external seminars or conferences).
Instead of following these findings, which have been long-held best practices, recent trends have almost obliterated them: open offices, small kitchens replicated around various floors, restricted travel budgets and constant collaborative meetings with a core team are stifling innovation. Indeed, pretty much every department or company I have visited over the past few years has showcased environments with similar answers to similar problems. This is not about a skills shortage; this is about buzzword-driven project methodologies applied without understanding their context or looking at the evidence.
On the optimistic side, the recent emergence of “people analytics” as an area of focus may offer solutions to re-ignite the true innovation that leads to significant competitive advantage in data science. Indeed, the research is already there, and the most promising answers involve collaborative network graphs.
Among the key features of interest: successful creative projects and companies are composed of people who have a low local clustering coefficient (LCC) and a short average path length (APL), i.e., people whose collaborative and conversational networks are compact, but not inter-related.
Left: a highly connected graph of short paths, forming almost a clique (LCC = 0.66, APL = 1.1). This type of collaboration, occurring when everyone works tightly with everyone else and no one outside, leads to unproductive “groupthink”.
Middle: a graph of long paths and limited inter-connectivity (LCC = 0, APL = 2.05). This type of collaboration, occurring when people only work with closely related trusted parties, can lead to “echo chambers”. Note that networks of this type are often in fact disconnected.
Right: a graph containing short path lengths and limited inter-connectivity (LCC = 0.05, APL = 1.5). The combination of short paths (easy access to diverse people) and close collaboration on a small scale is beneficial to the creative process.
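The two measures can be computed directly from a list of who works with whom. The following is a minimal sketch in plain Python, not the method used in the studies cited here. It rests on several assumptions: the collaboration network is undirected and unweighted, path length is counted in hops between two people, LCC is reported as the mean over all people, and people with fewer than two contacts are given a coefficient of 0. The function names and the three toy networks are hypothetical and do not reproduce the graphs described in the captions.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (person, person) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def local_clustering(graph, node):
    """Proportion of a person's contacts who also know each other."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumption: undefined cases are counted as 0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


def mean_clustering(graph):
    """Mean local clustering coefficient over everyone in the network."""
    return sum(local_clustering(graph, n) for n in graph) / len(graph)


def average_path_length(graph):
    """Mean shortest-path length (in hops) over all reachable pairs."""
    total = 0
    pairs = 0
    for source in graph:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in graph[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy networks, for illustration only.
clique = build_graph(combinations("ABCD", 2))           # everyone with everyone
chain = build_graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
hub = build_graph([("H", x) for x in "ABCD"])           # one shared contact

for name, g in [("clique", clique), ("chain", chain), ("hub", hub)]:
    print(name, round(mean_clustering(g), 2), round(average_path_length(g), 2))
```

In these toy cases the clique has high clustering and short paths, the chain has no clustering but long paths, and the hub network has no clustering and short paths, which is the combination the text associates with creative teams. Real collaboration data (e-mail, meeting or co-authorship records) would need cleaning and a decision on how to treat disconnected groups before these numbers mean much.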
You can’t manage what you can’t measure. Fortunately, you can measure and quantify the structure of internal collaboration within an organisation. With that information, you can manage teams, projects or departments to maximise the inventiveness and creativity of knowledge workers, which results in more significant findings and outcomes.
It’s not that: more collaboration => better outcomes
But: better collaboration => more outcomes
 Music is an exception here, as there are at least two distinct areas: songwriting and composing.
 Andrew T. Stephen, Peter Pal Zubcsek, and Jacob Goldenberg (2016) Lower Connectivity Is Better: The Effects of Network Structure on Redundancy of Ideas and Customer Innovativeness in Interdependent Ideation Tasks. Journal of Marketing Research: April 2016, Vol. 53, No. 2, pp. 263-279.
 Brian Uzzi and Jarrett Spiro (2005) Collaboration and Creativity: The Small World Problem. American Journal of Sociology: September 2005.
 Alex Pentland (2013) Beyond the Echo Chamber. Harvard Business Review: November 2013.
 Unscripted face-to-face communication is overwhelmingly more conducive to engagement and idea sharing.
 Alex Pentland (2012) The New Science of Building Great Teams. Harvard Business Review: April 2012.
 A number between 0 and 1 that measures the proportion of a person’s contacts who also know each other. If everyone knows everyone else, the network is called a clique.
 The average number of people a person has to “go through” to contact everyone in the network.