Splunk/custom-search-youtube
Description
What it does
Streaming custom search command that shows information (video length and video title) about YouTube videos found in Squid proxy logs. Clicking a row opens the corresponding YouTube video in a new tab.
Screenshot
Download
Download the app here. You can install it from the Manage apps > Install app from file menu.
Source
Code
This is the main code that should be copied to $SPLUNK_HOME/etc/apps/youtube/bin/youtube.py:
#!/usr/bin/env python
#
# Author: Sebastien Damaye
# Description: streaming custom search command that shows information about youtube
# videos based on squid proxy logs
# Use as follows:
# source="*squid*" uri="*www.youtube.com/watch?v=*" | sort _time | youtube uri | table _time clientip uri youtube
#
import splunk.Intersplunk
import re
import urllib2
import urlparse
import time
def getYoutube(uri):
    m = re.match(r'^https:\/\/www.youtube.com\/watch\?v=([a-zA-Z0-9_-]+)(&.+)?$', uri)
    if m:
        response = urllib2.urlopen('http://youtube.com/get_video_info?video_id=%s' % m.group(1))
        html = response.read()
        qs = urlparse.parse_qs(html)
        if 'title' in qs:
            title = qs['title'][0]
            length = time.strftime("%H:%M:%S", time.gmtime(int(qs['length_seconds'][0])))
            return "(%s) %s" % (length, title)
        else:
            return "Error while retrieving info"
    else:
        return "Regexp not recognized"
# get the previous search results
results,unused1,unused2 = splunk.Intersplunk.getOrganizedResults()
# for each result, add a 'youtube' attribute, calculated from the uri field
for result in results:
    result["youtube"] = getYoutube(result["uri"])
# output results
splunk.Intersplunk.outputResults(results)
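The script above treats the web service response as a URL-encoded query string and decodes it with parse_qs before formatting the length. The following is a minimal Python 3 sketch of that decoding step only, run against a made-up response string instead of a live request. The helper name format_video_info and the sample payload are illustrative assumptions, not part of the app.

```python
import time
from urllib.parse import parse_qs


def format_video_info(payload):
    """Turn a URL-encoded response body into "(HH:MM:SS) title".

    Hypothetical helper: it mirrors the decoding done in getYoutube(),
    but takes the response body as an argument so it can run offline.
    """
    qs = parse_qs(payload)
    # Unlike the original script, also check that the length is present
    if 'title' not in qs or 'length_seconds' not in qs:
        return "Error while retrieving info"
    title = qs['title'][0]
    length = time.strftime("%H:%M:%S", time.gmtime(int(qs['length_seconds'][0])))
    return "(%s) %s" % (length, title)


# Made-up payload, for illustration only
sample = "title=Some+video+title&length_seconds=3725"
print(format_video_info(sample))  # (01:02:05) Some video title
```

Because the length is formatted with gmtime, videos longer than 24 hours would wrap around; that matches the behaviour of the original formatting line.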
Dashboard
<form script="table_drilldown_url_field.js">
  <label>youtube videos</label>
  <fieldset submitButton="false" autoRun="true">
    <input type="time" token="TimeRangePicker" searchWhenChanged="true">
      <label>TimeRange</label>
      <default>
        <earliest>@d</earliest>
        <latest>now</latest>
      </default>
    </input>
  </fieldset>
  <row>
    <panel>
      <table id="link">
        <search>
          <query>source="*squid*" uri="*www.youtube.com/watch?v=*" | sort _time | youtube uri | table _time clientip uri youtube v</query>
          <earliest>$TimeRangePicker.earliest$</earliest>
          <latest>$TimeRangePicker.latest$</latest>
        </search>
        <drilldown target="_blank">
          <link>
            <![CDATA[
              https://www.youtube.com/watch?v=$row.v$
            ]]>
          </link>
        </drilldown>
        <option name="wrap">true</option>
        <option name="rowNumbers">false</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">row</option>
        <option name="count">30</option>
        <fields>["_time","clientip","uri","youtube"]</fields>
      </table>
    </panel>
  </row>
</form>
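The drilldown link depends on a v field in each row. The script does not set that field itself, so it presumably comes from Splunk's search-time field extraction on the URI; that is an assumption, not something the page states. As a sketch of what the value is, assuming Python 3 and a hypothetical helper name video_id_from_uri, the ID can be read from the query string:

```python
from urllib.parse import urlparse, parse_qs


def video_id_from_uri(uri):
    """Return the value of the 'v' query parameter, or None if absent.

    Hypothetical helper, not part of the app.
    """
    values = parse_qs(urlparse(uri).query).get('v')
    return values[0] if values else None


print(video_id_from_uri("https://www.youtube.com/watch?v=abc123_-X&t=10"))  # abc123_-X
```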
Known limitations
- Rendering the table takes time because it depends on the YouTube web service (http://youtube.com/get_video_info), and the table is not displayed until the information has been gathered for all entries. Beware that the script can take a very long time, depending on the number of videos to analyze.
- In some cases, the Google web service does not return results (possibly for legal reasons). In these cases, the script displays "Error while retrieving info".
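Since the script makes one web request per event, the same video watched several times is looked up several times. One possible mitigation, shown here as a sketch and not as part of the app, is to remember the result for each video ID. The names make_cached_lookup and fake_fetch are hypothetical, and fake_fetch stands in for the real web service call so the example runs offline.

```python
def make_cached_lookup(fetch):
    """Wrap a lookup function so each video ID is fetched only once."""
    cache = {}

    def lookup(video_id):
        if video_id not in cache:
            cache[video_id] = fetch(video_id)
        return cache[video_id]

    return lookup


# Stand-in for the real web service call (assumption, for illustration)
calls = []


def fake_fetch(video_id):
    calls.append(video_id)
    return "info for %s" % video_id


lookup = make_cached_lookup(fake_fetch)
for vid in ["abc123", "abc123", "xyz789", "abc123"]:
    lookup(vid)

print(len(calls))  # 2: each distinct ID was fetched once
```

In the original script, passing a timeout argument to urllib2.urlopen would also keep a single slow response from blocking the whole table.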
Keywords: splunk youtube