Enrichment of Influencer Database
Table of Contents
1. Result
2. Background
I was working indirectly for an influencer marketing agency, where I automated a manual task by enriching a database.
My task was to go through a channel list from a platform called NoxInfluencer and manually check each channel.
Once a channel met the criteria, I had to manually copy some of its information into a master Google Sheet.
- The criteria were as follows:
  - Good indicators:
    - Content is in English [Covered by NoxInfluencer]
    - Creator lives in an Anglosphere country [Covered by NoxInfluencer]
    - Channel has at least 10,000 subscribers [Covered by NoxInfluencer]
    - Content contains "talks"
    - Average views per video >= 10,000
    - Average video duration >= 10 minutes
    - Average upload period < 6 months
  - Bad indicators:
    - Content is music only [Covered by NoxInfluencer (sometimes)]
    - Content is meant for kids
- Elements to capture manually if a channel meets the criteria:
  - Channel's Name
  - Channel's URL
  - Channel's Category/Keyword
  - Channel's Average video views (in thousands)
  - Channel's Video count
  - Creator's Email (if found in the channel's description)
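The numeric thresholds and the email capture described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the project: the function names, the constant names, and the deliberately simple email pattern are all assumptions, and the pattern will miss some valid addresses.

```python
import re
from typing import Optional

# Thresholds taken from the criteria list; the constant names are hypothetical.
MIN_SUBSCRIBERS = 10_000
MIN_AVG_VIEWS = 10_000
MIN_AVG_DURATION_MIN = 10

# Simple pattern (an assumption): catches common addresses, not every valid one.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def meets_numeric_criteria(subscribers: int, avg_views: float,
                           avg_duration_min: float) -> bool:
    """Return True if the channel passes the three numeric thresholds."""
    return (subscribers >= MIN_SUBSCRIBERS
            and avg_views >= MIN_AVG_VIEWS
            and avg_duration_min >= MIN_AVG_DURATION_MIN)


def find_email(description: str) -> Optional[str]:
    """Return the first email-like string in a description, or None."""
    match = EMAIL_RE.search(description)
    return match.group(0) if match else None
```

A call such as `find_email(channel_description)` would fill the email column only when an address is present, mirroring the "if found" rule in the list.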
Example of the Master Google Sheet:
| Name | URL | Category | Avg views (K) | Email | Video count |
|---|---|---|---|---|---|
| A Walk on the Wild Side | URL | VLOG Tourism Entertainment | 9 | | 539 |
| raimi reyes | URL | Life Style Beauty raimi | 9 | raimi@gleamfutures.com | 149 |
| elanna pecherle | URL | Beauty Makeup Film & Animation | 8 | jessica@collabagency.com | 593 |
| Milk Man Steve | URL | Gaming Action-adventure | 5 | oofgangfire@gmail.com | 94 |
3. Plan
- Gather all YouTube channel links from NoxInfluencer after applying the four basic filters.
- Gather data on the last 20 videos of each channel using a library called Pytube (it no longer works since YouTube updated its API).
  - Reporting requirement: Email, found by analyzing the text of the description and the "About" page.
  - Criteria 4 & 5: Analyze the description of each video for words such as short film, hip hop, ASMR, AMV, Fortnite, Minecraft, and Roblox.
  - Criterion 6: Use two metrics I developed myself to help determine whether there is speech throughout the video.
    - Narrative score \[\text{Video narrative score}=\frac{\text{Subtitle length}}{\text{Video duration}}\] \[\text{Channel narrative score}=\frac{\sum_{i=1}^{20}\text{Video narrative score}_i}{20}\]
    - Narrative probability \[\text{Video narration}_i= 1 \text{ if auto-subtitles exist, else }0\] \[\text{Channel narrative probability}=\frac{\sum_{i=1}^{20}\text{Video narration}_i}{20}\]
  - Criterion 7: The average views over the last 20 videos.
  - Criterion 8: The average video duration over the last 20 videos.
  - Criterion 9: Wasn't possible due to a Google API limitation.
- Develop a score that prioritizes channels for manual review \[ \text{Score}=\frac{\log(\text{video count}) \cdot \log(\bar{x}_{\text{views}}) \cdot \log(\bar{x}_{\text{length}}) \cdot \log(\text{Channel narrative score})}{\log(\text{subs})} \cdot P(\text{Worth}) \cdot P(\text{Channel narrative}) \]
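The keyword check, the two narrative metrics, and the priority score from the plan can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original implementation: the `Video` class and all function names are hypothetical, subtitle length is measured in characters and duration in seconds, the logarithm is natural, `p_worth` is passed in because the text does not define how P(Worth) is computed, and returning 0 when a logarithm is undefined is also an assumption. The averages divide by the number of videos supplied, which is 20 in the plan.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

# Keywords quoted from the plan; case-insensitive substring matching is an assumption.
FLAG_KEYWORDS = ("short film", "hip hop", "asmr", "amv",
                 "fortnite", "minecraft", "roblox")


@dataclass
class Video:  # hypothetical container for per-video data
    description: str
    duration_s: float          # duration in seconds (unit is an assumption)
    views: int
    subtitles: Optional[str]   # auto-generated subtitle text, None if absent


def has_flag_keyword(video: Video) -> bool:
    """Criteria 4 & 5: does the description mention a flagged keyword?"""
    text = video.description.lower()
    return any(keyword in text for keyword in FLAG_KEYWORDS)


def video_narrative_score(video: Video) -> float:
    """Subtitle length (characters, an assumption) divided by duration."""
    if not video.subtitles or video.duration_s <= 0:
        return 0.0
    return len(video.subtitles) / video.duration_s


def channel_narrative_score(videos: List[Video]) -> float:
    """Mean of the per-video narrative scores."""
    return sum(video_narrative_score(v) for v in videos) / len(videos)


def channel_narrative_probability(videos: List[Video]) -> float:
    """Share of videos that have auto-generated subtitles."""
    return sum(1 for v in videos if v.subtitles is not None) / len(videos)


def priority_score(video_count: int, subs: int, videos: List[Video],
                   p_worth: float) -> float:
    """Priority score from the plan; p_worth is supplied by the caller."""
    mean_views = sum(v.views for v in videos) / len(videos)
    mean_length = sum(v.duration_s for v in videos) / len(videos)
    narrative = channel_narrative_score(videos)
    factors = (video_count, mean_views, mean_length, narrative, subs)
    # log is undefined for values <= 0, and log(1) = 0 would zero the denominator.
    if any(f <= 0 for f in factors) or subs == 1:
        return 0.0  # fallback value is an assumption
    numerator = (math.log(video_count) * math.log(mean_views)
                 * math.log(mean_length) * math.log(narrative))
    return (numerator / math.log(subs) * p_worth
            * channel_narrative_probability(videos))
```

Note that a channel narrative score below 1 makes its logarithm negative, so the resulting score can be negative under these assumptions; how the original handled that case is not stated.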