Charity Leaks

Do you like good detective stories? Ever heard of Citizen Kane? Spotlight? All these movies with the heroic journalists? If so, come aboard, forget fiction but keep the popcorn, and dive into the world of the rogue offshore financial industry, where leaders, criminals, celebrities and yes, charities, rub shoulders.

Where it all begins.

"Interested in data? [...] There are a couple of conditions. My life is in danger. We will only chat over encrypted files. No meeting, ever."

If you believe that this quote comes from a spy novel, you are wrong. These are the words that started one of the biggest data leaks in history, leading to resignations, imprisonments and more. They were sent by an anonymous source, referred to as John Doe, to Bastian Obermayer, an investigative journalist working for the Munich-based newspaper Süddeutsche Zeitung.

This exchange resulted in more than 11.5 million leaked documents and 2.5 terabytes of data detailing the financial information of more than 214'000 offshore entities, stretching from 1970 to 2016 and spanning more than 150 countries.

In their manifesto, titled “The Revolution will be Digitized”, the anonymous whistleblower explained that they "understood enough about their contents to realise the scale of the injustices they described" and wanted to shed light on the crimes documented in the leaks.

Where does the data come from? The information was leaked from Mossack Fonseca, a Panamanian law firm and the world’s fourth-largest provider of offshore financial services.

After the exchange with the anonymous informer, the German newspaper sought help from the ICIJ, and after a year of investigation into the leaked data, the results were finally published.

Among the leaked names: one current head of state, several former heads of state and prime ministers, actors, famous football players, and of course, some of the world's most well-known charities.

One Dataset to rule them All.

The databases released by ICIJ contain information on:

  • The Offshore Leaks: 130 000 offshore accounts disclosed in a report (2013)
  • The Panama Papers: 214 000 offshore accounts from Mossack Fonseca, as presented above (2016)
  • The Bahamas Leaks: 1.3 million internal files from the company register of the Bahamas (2016)
  • The Paradise Papers: 13.4 million electronic documents relating to offshore investments (2017)

This database is powered by Neo4j, a graph database that structures data in nodes and edges. The data was downloaded as several CSV files, one for each actor type and one for the links between them. The different nodes represent:

  • The offshore entity: a company created in a low-tax jurisdiction that often attracts non-resident clients through preferential tax treatment
  • The officer: a person or company who plays a role in an offshore entity
  • The intermediary: a go-between for someone seeking an offshore corporation and an offshore service provider (usually a law firm)
  • The address: a contact postal address as it appears in the original database

In total, this database contains information about more than 785 000 offshore entities and links to people and companies in more than 200 countries. Note that this data comes from leaked records and not a standardized corporate registry, so there may be duplicates (as similar appearances were not merged).

No cleaning or standardization was necessary (see ICIJ site for more details).

Before investigating the charities, we can explore the dataset and the offshore companies to get a more general view of the situation. We were particularly interested in the Panama Papers dataset. How were the offshore companies distributed around the world? How many countries sheltered these 200 000 companies?

We were not surprised to see that the majority of our entities were located in a small number of countries. Indeed, half of the countries listed in the dataset together account for only a small fraction of the total number of offshore companies.
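As an illustration, this kind of concentration can be measured with a simple count per country. The sketch below is not the code used for the analysis: the function names and the toy country labels are hypothetical, and it assumes each entity comes with a single country value.

```python
from collections import Counter


def top_share(countries, top_n):
    """Share of all entities located in the top_n most common countries."""
    counts = Counter(countries)
    total = sum(counts.values())
    return sum(n for _, n in counts.most_common(top_n)) / total


def bottom_half_share(countries):
    """Share of all entities located in the least-represented half of the countries."""
    ranked = sorted(Counter(countries).values())  # ascending entity counts
    half = len(ranked) // 2
    return sum(ranked[:half]) / sum(ranked)


# Toy data (hypothetical labels, not taken from the leaks).
sample = ["A"] * 6 + ["B"] * 2 + ["C", "D"]
top1 = top_share(sample, top_n=1)
bottom = bottom_half_share(sample)
```

On real data, a high `top_share` for a small `top_n` together with a low `bottom_half_share` would reflect the non-uniform distribution described here.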

Note the presence of the United Kingdom in the top 10; the UK is strongly appreciated by entrepreneurs and companies for the possible UK non-domiciled status that enables actors to avoid taxes on foreign income.

In conclusion, there was no particular surprise with this top 20, each country being a well-known tax haven. As a whole, this leads to a non-uniform geographic distribution.

But now let's take a closer look at our charities! We need first to find them in the database. Since no detail was given in the dataset about the different companies' functions, we need external data to help our investigation.

We chose to use Forbes Magazine's list of the top 100 richest US charities, as well as Wikipedia's lists of main non-governmental organizations and main charitable foundations. We scraped these websites in order to form a new dataset of well-known charities. By merging together these results, we obtained detailed information on more than 300 charities that we could use to investigate the ICIJ database. Collected details about charities include names, leaders, headquarters addresses and locations of major offices. Note that the available data varies from charity to charity, resulting in a sometimes sparse dataset.

We decided to isolate potential charities in the Panama Papers by their name. We considered an offshore company to be a potential match for a charity when its name was closely related to that of one of the scraped charitable foundations. This was achieved by text mining: our algorithm matched word sequences based on the percentage of significant words they had in common. Using this method, we reduced the dataset from hundreds of thousands of entities to a few hundred names.
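To give an idea of what such a name filter can look like, here is a minimal sketch. It is an approximation under stated assumptions rather than our exact algorithm: the stop-word list, the 0.8 threshold and the function names are all hypothetical, and the score ignores word order.

```python
import re

# Hypothetical list of words carrying little meaning in a company name.
STOPWORDS = {"the", "of", "for", "and", "in", "a", "an",
             "ltd", "inc", "limited", "corp", "sa"}


def significant_words(name):
    """Lower-case the name, split it into words and drop the stop words."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return {t for t in tokens if t not in STOPWORDS}


def overlap_score(charity_name, entity_name):
    """Fraction of the charity's significant words that appear in the entity name."""
    charity = significant_words(charity_name)
    if not charity:
        return 0.0
    return len(charity & significant_words(entity_name)) / len(charity)


def candidate_matches(charities, entities, threshold=0.8):
    """Return (charity, entity) pairs whose score reaches the assumed threshold."""
    return [(c, e) for c in charities for e in entities
            if overlap_score(c, e) >= threshold]


pairs = candidate_matches(
    ["The Nature Conservancy"],
    ["NATURE CONSERVANCY LTD.", "Blue Ocean Holdings"],  # made-up entity names
)
```

A permissive threshold keeps more false positives, which is acceptable when the shortlist is inspected by hand afterwards.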

Considering the reasonable number of matches and since humans possess an expertise in natural language processing superior to any training tool, we inspected and corrected the potential matches manually.

Among the matches, we found Amnesty International, the International Red Cross, and many others of the world's biggest aid agencies.

Fishing for rich villains

Once we had our potential matches, we were faced with the following question: how can we find out if an entity is an actual charity, and not some random shell corporation with a stolen name?

Checking the directory...

After investigating several methods, we finally chose to check the addresses of the matches we had. By definition, a shell company is a corporation that exists only on paper and has no office and no employees, but may have a bank account, and also needs a mailing address. It could therefore be assumed that if a real charity’s headquarters address matches the mailing address of a shell with the same name, these are indeed one and the same organization.

In the database provided by ICIJ, the majority of the offshore entities have an address registered as a specific node (see example here). We decided to inspect these addresses for our matches and then to compare them with the headquarters addresses of the corresponding charities that were found during the web scraping. If the addresses match, there is a high probability that the offshore accounts do indeed belong to the real charity.

After extracting every node with a registered address corresponding to a match, we obtained a list of 27 offshore entities. Among them, we discarded every match whose charity headquarters location was not available in our data. We also removed the entities with an address registered in the country of origin of the leak (the Bahamas, Aruba, etc.).

We obtained 11 suspected matches, a dataset small enough for manual inspection. Reading through them, we kept only the entities where the countries of the two addresses matched. Some matches are more convincing than others because the town also corresponds, whereas for most, only the country does.
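The filtering steps above can be summarized in a short sketch. The record layout (`hq_country`, `entity_country`, `hq_town`, `entity_town`) and the toy values are hypothetical, since in practice the final comparison was done by hand.

```python
def filter_by_address(matches, leak_countries):
    """Keep matches whose entity address and charity headquarters share a country."""
    kept = []
    for m in matches:
        hq_country = m.get("hq_country")
        if not hq_country:
            continue  # headquarters location unknown: nothing to compare
        if m["entity_country"] in leak_countries:
            continue  # address registered in the leak's country of origin
        if m["entity_country"] != hq_country:
            continue  # the two addresses are in different countries
        # Flag the stronger matches where the town agrees as well.
        same_town = bool(m.get("hq_town")) and m.get("hq_town") == m.get("entity_town")
        kept.append({**m, "same_town": same_town})
    return kept


# Toy records with made-up names and places.
toy = [
    {"name": "Charity A", "hq_country": "X", "hq_town": "T1",
     "entity_country": "X", "entity_town": "T1"},
    {"name": "Charity B", "hq_country": "X", "hq_town": "T1",
     "entity_country": "X", "entity_town": "T2"},
    {"name": "Charity C", "hq_country": None,
     "entity_country": "X", "entity_town": "T1"},
    {"name": "Charity D", "hq_country": "Y", "hq_town": "T3",
     "entity_country": "Y", "entity_town": "T3"},
    {"name": "Charity E", "hq_country": "X", "hq_town": "T1",
     "entity_country": "Z", "entity_town": "T4"},
]
result = filter_by_address(toy, leak_countries={"Y"})
```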

In this final list, we found the names of famous charities such as Amnesty International, The Nature Conservancy, World Vision, and The American Cancer Society. It is interesting to note that some charities appear twice in our list (The Nature Conservancy and Memorial Sloan Kettering Cancer Center) with different addresses. Each of these companies is therefore a suspect. (The fact that most of our matches are located in the United States, however, should be treated with caution: during our web scraping, the charities taken from the Forbes article were much more likely to have addresses than the ones from Wikipedia, and all of the former are American.)

We searched the web for these companies associated with their specific leak source. It seems that the common good is not the main objective of every charity. Indeed, we found that The Nature Conservancy, one of the world's biggest environmental groups, was revealed not to be using all its donations for its environmental mission but instead investing in oil drilling.
(Environmental Groups invest in Oil Drilling)

The same goes for The Duke Endowment: we found reports that Duke University uses offshore funds to grow its endowment by investing in fossil fuels.
(US universities use offshore funds to grow their huge endowments)

On a lighter note, the American Cancer Society appears not to be a match. Indeed, its headquarters are located in Atlanta, Georgia, but the shell company with the same name is registered in Michigan, a state that also has a small town named Atlanta. Clearly whoever set up this shell has a sense of humour.

These results show that our first investigation method was reasonably effective. We did not find any signs of fraud for most of the members of our list, but many articles turned up documenting the presence of Amnesty International in the leaks. In these articles, Amnesty International denies being implicated and denounces the use of its name. Because of the lack of data on the implication of the other charities, we chose to keep these suspects in custody while proceeding to the second part of our analysis.

Who do you hang with?

We chose to evaluate the first-level connections of our matches in order to detect clusters. Indeed, intermediaries are crucial for an offshore service provider, handling the set-up and administration of the shell companies. If several suspected charities are all connected by a single intermediary, it could mean that said intermediary is specialized in charities. But if one of the connected entities is clearly a fake, there is also a high probability that its neighbours are fake too. In this case, it would mean that the given intermediary is in the habit of stealing charity names for its shell companies.

To get these graphs, we extracted every node corresponding to a matched entity and its associated edges. The actors at the other end of these edges were extracted as well. With this data, we could represent the first level connections for every entity in the form of a network and see the degree of interconnectivity between the matches.
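A minimal sketch of this extraction, assuming the links are available as simple pairs of node identifiers (the identifiers below are made up, and the real edge files carry more columns than this):

```python
from collections import defaultdict


def first_level_edges(edges, matched):
    """Keep only the edges that touch at least one matched entity."""
    matched = set(matched)
    return [(a, b) for a, b in edges if a in matched or b in matched]


def shared_neighbours(edges, matched):
    """Map each non-matched node to the matched entities it links, if more than one."""
    matched = set(matched)
    links = defaultdict(set)
    for a, b in edges:
        if a in matched and b not in matched:
            links[b].add(a)
        elif b in matched and a not in matched:
            links[a].add(b)
    return {node: ents for node, ents in links.items() if len(ents) > 1}


# Toy graph with hypothetical identifiers.
edges = [
    ("charity_1", "intermediary_X"),
    ("charity_2", "intermediary_X"),
    ("charity_3", "officer_Y"),
    ("other_entity", "officer_Z"),
]
matched = {"charity_1", "charity_2", "charity_3"}
ego = first_level_edges(edges, matched)
hubs = shared_neighbours(edges, matched)
```

Nodes returned by `shared_neighbours` are the ones worth a closer look, since they tie several suspected charities together.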

The following networks are interactive and each node can be dragged around. By holding down the SHIFT key, multiple nodes can be selected and moved together. A button under each network lets you refresh the view and shuffle all the nodes.

The Panama Papers:

This graph is already very interesting to look at. About half of all the charity nodes (in red) are somehow connected to another charity node, which is a lot for a dataset containing over 200'000 nodes. This confirms our suspicions from above that there must be intermediaries who either specialize in charities or are in the habit of using their names. A closer look shows us some familiar names, such as the World Wildlife Fund and UNICEF, connected through the Tarbes Trust.

The Paradise Papers:

The original graph for the Paradise Papers' matches is much too large for us to comfortably display, so we filtered it down, removing any non-charity nodes that are only connected to one other node (and therefore aren't connecting anyone).
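This pruning step can be sketched as follows. It is an illustration, not the code behind the figure: it assumes edges are pairs of node identifiers, uses made-up identifiers, and applies the rule in a single pass.

```python
from collections import defaultdict


def prune_leaf_nodes(edges, charity_nodes):
    """Drop non-charity nodes that have exactly one distinct neighbour."""
    charity_nodes = set(charity_nodes)
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    removable = {
        node for node, adj in neighbours.items()
        if node not in charity_nodes and len(adj) == 1
    }
    # Keep an edge only if neither of its endpoints was removed.
    return [(a, b) for a, b in edges if a not in removable and b not in removable]


# Toy graph with hypothetical identifiers.
toy_edges = [("c1", "hub"), ("c2", "hub"), ("c1", "leaf1"), ("c2", "leaf2")]
pruned = prune_leaf_nodes(toy_edges, charity_nodes={"c1", "c2"})
```

Charity nodes are always kept, even when isolated by the pruning, so that every match remains visible in the filtered network.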

We now have a much clearer image of the connecting nodes, and we notice several things. First, UNICEF and the World Wildlife Fund are back, but this time they are connected to the "International Red Cross", which the real Red Cross insists is a fake (Misuse of ICRC identity). This is a strong argument for UNICEF and the World Wildlife Fund also being stolen names (the same goes for the Cancer Society, which is also in this cluster). Notably, this cluster is not the only place where the "International Red Cross" shows up, suggesting that several different nodes carry this name, which reinforces the idea that the name is stolen. In the second, smaller cluster, we also find Amnesty International, which can therefore also be considered a victim of name theft. We can thus claim with high probability that Amnesty International and The Cancer Society are indeed innocent in this respect.

The Offshore Leaks:

This network is also very interesting. We find again the "International Red Cross" and "Amnesty International" (and also "UNICEF"!), this time connected through a different intermediary. Clearly, there is more than one company out there ripping off the names of famous charities.

The Bahamas Leaks:

While they lack the big charity names of the previous leaks, the Bahamas Leaks nevertheless contain some information that is particularly relevant in Switzerland. Indeed, there is a cluster of three charities, United Way, Sandy, and East Meets West, where the central node is Credit Suisse, the second largest bank in Switzerland. A bit of research tells us that this is probably the real bank; Credit Suisse is listed as the third largest facilitator for offshore accounts in the Panama Papers, and together with UBS is responsible for one in ten of all trusts leaked in the Bahamas Leaks.

Give, but give wisely!

From the leaked data published by the ICIJ, we raised the question about how charities handle money. We insisted on the importance of their reputation as a key to success. With our investigation we have shown that the majority of well-known charities present in the leaks were not implicated directly and that their names were often usurped by the law firms to help obscure the origin of money in questionable funds.

It is important that these charities not suffer from the presence of their names in these leaks. For now, we know that the biggest charities have published a disclaimer about their implication (for example The Red Cross and Amnesty International).

However, we did discover some black sheep. It is shocking to discover that these charities can use such firms to hide away donated money and spend it on goals opposed to their stated mission, for instance The Nature Conservancy and its link with oil drilling.

We asked how charities behave and found a certain amount of structure in the way shells for charities, or shells impersonating charities, are set up. Our work only scratched the surface of the issue, however, and could easily be extended. First, more data on charities could be integrated, as we only worked with the most famous and easily accessible ones. Another path we could follow would be to establish a scoring algorithm to rate every charity named in the leaks according to the probability that it is real.

To conclude, these findings should not impact the donations made, but rather draw attention to the importance of researching who you give to.

And if you realize after reading this that you once helped a billionaire buy a pool, don't feel guilty:

“It's not how much we give but how much love we put into giving.”

Mother Teresa

Authors

Sabrina Kall
Ruijia Wang
Theo Imler