Multi-channel Attribution Using Neo4j Graph Database


Globally, more than $500 billion was spent on advertising in 2013 (Lunden, 2013). One of the greatest challenges of spending money on advertising is understanding the impact of those dollars on sales. With the proliferation of media and channels (TV, search engines, social media, gaming platforms and mobile) on which precious marketing dollars can be spent, a Chief Marketing Officer (CMO) needs insight into the return on investment in each channel. More importantly, the CMO needs timely data to show that spending on a specific channel pays off. Neo4j can help marketing applications answer tough questions such as:

  • How much did web awareness of the product increase after a commercial aired on a specific TV channel, on a specific date, in a specific geographic area?
  • How much of that web awareness translated into foot traffic in the stores in that geographic area?
  • How much of that increased web awareness translated into an increase in revenue?

Multi-channel attribution is a tough nut to crack but with Neo4j, it becomes a lot easier to collect and analyze vast amounts of data across multiple channels and domains. This paper looks at a multi-channel attribution graph model for advertising in the pharmaceutical industry.

A Basic Neo4j Model for Multi-channel Attribution

Pfizer spent $156 million advertising Lipitor on television in 2011 (Japsen, 2012). Pfizer can gauge the influence of its advertisements on consumers by capturing relevant web site visits and Internet searches after an advertising campaign, and then looking at the number of prescriptions written in that area. One can also compare the searches conducted and the prescriptions written in one area with those in another area where the product was not advertised, essentially conducting an A/B test.

The graph model presented below captures simulated data on when Pfizer aired an advertisement for its cholesterol drug and the Google searches conducted while the commercial was active. For simplicity, monthly search data is captured, but in practice search data from during or right after the airing of a commercial can also be captured.


Similar information can be captured for competitors, for example Merck and AstraZeneca, so that the performance of Pfizer's advertising spending can be compared across the industry. The model can also be expanded to include data on prescriptions written in each region and by each physician.
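As a rough sketch of what such an expansion might look like, the statement below attaches a region, a physician and a monthly prescription record to the Lipitor node. The Region, Physician and Prescription labels, the PRACTICESIN, WROTE and PRESCRIPTIONFOR relationship types, and every property value are hypothetical placeholders chosen for illustration; they are not part of the model described in this article.

```cypher
// Hypothetical extension: all labels, relationship types and values
// below are illustrative assumptions, not part of the original model.

// Reuse the Lipitor node if it already exists, otherwise create it.
MERGE (lipitor:Drug {name: 'Lipitor'})

// Placeholder region, physician and monthly prescription record.
CREATE (region:Region {name: 'Region A'})
CREATE (physician:Physician {name: 'Physician 1'})
CREATE (rx:Prescription {name: 'Lipitor_Rx_Apr_2013', rxdate: 'Apr-2013', count: 0, product: 'Lipitor'})

// Link the physician to the region and the prescription record to the drug.
CREATE (physician)-[:PRACTICESIN]->(region),
       (physician)-[:WROTE]->(rx),
       (rx)-[:PRESCRIPTIONFOR]->(lipitor)

RETURN region, physician, rx;
```

Keeping rxdate in the same 'Mon-YYYY' format as airdate and searchdate would let prescription records be matched to advertisements by month in the same way as searches.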

The following Cypher statement creates the graph capturing the advertising and search data for the cholesterol-lowering drugs sold by Pfizer, AstraZeneca and Merck.

// Pfizer: company, drug, two CBS advertisements and two monthly Google search counts (simulated data)
CREATE (Pfizer:Company {name:'Pfizer Inc.', sector:'Healthcare', industry:'Drug Manufacturers - Major'})

CREATE (Lipitor:Drug {name:'Lipitor', purpose:'Cholesterol Lowering', companyname:'Pfizer Inc.'})

CREATE (CBS_AD_MAR_2013:Advertisement {name:'CBS AD Lipitor Mar 2013', airdate:'Mar-2013', product:'Lipitor'})

CREATE (CBS_AD_APR_2013:Advertisement {name:'CBS AD Lipitor Apr 2013', airdate:'Apr-2013', product:'Lipitor'})

CREATE (Lipitor_Search_Mar_2013:Search {name:'Lipitor_Search_Mar_2013', searchdate:'Mar-2013', count:'100', engine:'Google', product:'Lipitor'})

CREATE (Lipitor_Search_Apr_2013:Search {name:'Lipitor_Search_Apr_2013', searchdate:'Apr-2013', count:'200', engine:'Google', product:'Lipitor'})

CREATE (Pfizer)-[:MAKES]->(Lipitor),
       (Pfizer)-[:AIRS]->(CBS_AD_MAR_2013),
       (CBS_AD_MAR_2013)-[:ISADVERTISEMENTFOR]->(Lipitor),
       (Pfizer)-[:AIRS]->(CBS_AD_APR_2013),
       (CBS_AD_APR_2013)-[:ISADVERTISEMENTFOR]->(Lipitor),
       (Lipitor_Search_Mar_2013)-[:SEARCHFOR]->(Lipitor),
       (Lipitor_Search_Apr_2013)-[:SEARCHFOR]->(Lipitor)

// AstraZeneca: same structure for Crestor
CREATE (AstraZeneca:Company {name:'AstraZeneca Inc.', sector:'Healthcare', industry:'Drug Manufacturers - Major'})

CREATE (Crestor:Drug {name:'Crestor', purpose:'Cholesterol Lowering', companyname:'AstraZeneca Inc.'})

CREATE (CRESTOR_CBS_AD_MAR_2013:Advertisement {name:'CBS AD Crestor Mar 2013', airdate:'Mar-2013', product:'Crestor'})

CREATE (CRESTOR_CBS_AD_APR_2013:Advertisement {name:'CBS AD Crestor Apr 2013', airdate:'Apr-2013', product:'Crestor'})

CREATE (Crestor_Search_Mar_2013:Search {name:'Crestor_Search_Mar_2013', searchdate:'Mar-2013', count:'400', engine:'Google', product:'Crestor'})

CREATE (Crestor_Search_Apr_2013:Search {name:'Crestor_Search_Apr_2013', searchdate:'Apr-2013', count:'900', engine:'Google', product:'Crestor'})

CREATE (AstraZeneca)-[:MAKES]->(Crestor),
       (AstraZeneca)-[:AIRS]->(CRESTOR_CBS_AD_MAR_2013),
       (CRESTOR_CBS_AD_MAR_2013)-[:ISADVERTISEMENTFOR]->(Crestor),
       (AstraZeneca)-[:AIRS]->(CRESTOR_CBS_AD_APR_2013),
       (CRESTOR_CBS_AD_APR_2013)-[:ISADVERTISEMENTFOR]->(Crestor),
       (Crestor_Search_Mar_2013)-[:SEARCHFOR]->(Crestor),
       (Crestor_Search_Apr_2013)-[:SEARCHFOR]->(Crestor)

// Merck: same structure for Zocor
CREATE (Merck:Company {name:'Merck Inc.', sector:'Healthcare', industry:'Drug Manufacturers - Major'})

CREATE (Zocor:Drug {name:'Zocor', purpose:'Cholesterol Lowering', companyname:'Merck Inc.'})

CREATE (ZOCOR_CBS_AD_MAR_2013:Advertisement {name:'CBS AD Zocor Mar 2013', airdate:'Mar-2013', product:'Zocor'})

CREATE (ZOCOR_CBS_AD_APR_2013:Advertisement {name:'CBS AD Zocor Apr 2013', airdate:'Apr-2013', product:'Zocor'})

CREATE (Zocor_Search_Mar_2013:Search {name:'Zocor_Search_Mar_2013', searchdate:'Mar-2013', count:'218', engine:'Google', product:'Zocor'})

CREATE (Zocor_Search_Apr_2013:Search {name:'Zocor_Search_Apr_2013', searchdate:'Apr-2013', count:'376', engine:'Google', product:'Zocor'})

CREATE (Merck)-[:MAKES]->(Zocor),
       (Merck)-[:AIRS]->(ZOCOR_CBS_AD_MAR_2013),
       (ZOCOR_CBS_AD_MAR_2013)-[:ISADVERTISEMENTFOR]->(Zocor),
       (Merck)-[:AIRS]->(ZOCOR_CBS_AD_APR_2013),
       (ZOCOR_CBS_AD_APR_2013)-[:ISADVERTISEMENTFOR]->(Zocor),
       (Zocor_Search_Mar_2013)-[:SEARCHFOR]->(Zocor),
       (Zocor_Search_Apr_2013)-[:SEARCHFOR]->(Zocor)

RETURN Pfizer;

A simple Cypher query to find all the company names in Neo4j would look like this:

MATCH (companies:Company) RETURN companies.name;


Figure 1: Results from a Cypher query returning company names.

The companies presented and the data captured in this graph model could be further interconnected into one very large graph, and queries can then be conducted across it.
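One possible cross-company query, offered as a sketch rather than as part of the original model, totals the search counts for each company's drug. It assumes the labels and relationship types used in this article, and it wraps the count property in toInteger() so that the query works whether counts are stored as strings or as numbers (older Neo4j releases name this function toInt()).

```cypher
// Total searches per company and drug across the whole graph.
MATCH (c:Company)-[:MAKES]->(d:Drug)<-[:SEARCHFOR]-(s:Search)
RETURN c.name AS company,
       d.name AS drug,
       sum(toInteger(s.count)) AS totalSearches
ORDER BY totalSearches DESC;
```

With the simulated data used in this article, the totals come to 1300 for Crestor, 594 for Zocor and 300 for Lipitor.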

To find the searches that were conducted during an advertisement campaign, we can use the following query:

MATCH (ad:Advertisement)-[:ISADVERTISEMENTFOR]->(dru:Drug)<-[:SEARCHFOR]-(srh:Search) WHERE ad.airdate = srh.searchdate RETURN ad, srh


Figure 2: Cypher Query Matching the Advertisement and the Search Results.
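Going one step further, the month-over-month change in searches for each drug can be computed directly in Cypher. The query below is a minimal sketch that assumes the two months present in the simulated data ('Mar-2013' and 'Apr-2013') and the property names used in this article; the aliases marCount, aprCount and pctChange are introduced only for illustration.

```cypher
// Pair each drug's March and April search nodes.
MATCH (mar:Search {searchdate: 'Mar-2013'})-[:SEARCHFOR]->(drug:Drug)<-[:SEARCHFOR]-(apr:Search {searchdate: 'Apr-2013'})

// Counts may be stored as strings, so convert before doing arithmetic.
WITH drug,
     toInteger(mar.count) AS marCount,
     toInteger(apr.count) AS aprCount

// Percentage change from March to April, rounded to the nearest whole percent.
RETURN drug.name AS drug,
       marCount,
       aprCount,
       round(100.0 * (aprCount - marCount) / marCount) AS pctChange
ORDER BY pctChange DESC;
```

On the simulated counts this gives an increase of about 125% for Crestor, 100% for Lipitor and 72% for Zocor. A change in search volume alone does not establish that the advertisement caused it; comparing against a region where no advertisement aired, as described earlier, would be needed for that.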

This graph model illustrates the power of Neo4j in exploring answers to questions in multi-channel attribution and in other areas of marketing.

Works Cited

Japsen, B. (2012, February 2). Drug Makers Dial Down TV Advertising. Retrieved April 12, 2014, from New York Times:

Lunden, I. (2013, September 30). Digital Ads Will Be 22% Of All U.S. Ad Spend In 2013, Mobile Ads 3.7%; Total Global Ad Spend In 2013 $503B. Retrieved April 12, 2014, from Tech Crunch:





