Yahoo SEM isBooksClosed

As part of Internet marketing and SEM, you will need to pull daily keyword performance summary reports from search engines like Google, Yahoo and MSN.

In the case of Yahoo, you should check "isBooksClosed" to be sure you get correct data for your report dates: if isBooksClosed is true, Yahoo analytics has collected all the data for the date you requested and the data you pulled will not change later.

This is especially important if you are pulling the previous day's data early in the morning, as Yahoo may not have had time to collect and finalize the analytics. We have run into the books-not-closed condition many times for a few accounts out of many tens of accounts; we then need to re-pull the data later in the day and rerun the daily summaries.
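A simple way to track this in the ETL is to store the isBooksClosed flag returned with each pull alongside the staged data. The sketch below does just that; the table and column names (stg_yahoo_report_pull, is_books_closed) are only placeholders for whatever your load process uses.

#-- Accounts whose books were not yet closed for yesterday; re-pull these later in the day
SELECT account_id, report_date
FROM stg_yahoo_report_pull
WHERE report_date = DATE_ADD(CURRENT_DATE, INTERVAL -1 DAY)
  AND is_books_closed = 0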

I have reproduced the relevant part of Yahoo's explanation from http://developer.searchmarketing.yahoo.com/docs/V6/gsg/BasicReportService.php#books_closed

Google's extended searches

Internet search users are very familiar with the Google search engine: they typically enter their query or keywords in the search box and click the links relevant to them (usually first-page results, and especially the top 3 or 4). A large percentage of users, however, seem to be unaware of Google's extended search features, which can provide the exact result they are looking for without having to hop to one more site.

For example: the weather for a particular city, the current time at a location, dictionary definitions, area codes, sports scores and many more. Following are a few examples.

For weather: to find the current weather in Boston, USA, just enter the keywords "weather Boston, USA" and you will see today's and the next few days' weather in the first result!

For time: to find the current time in Bangalore, India (or Bengaluru, India), enter "time Bengaluru, India" and the first result is the time.

For dictionary: use the "define" keyword before your query, like "define avatar".

Yahoo SEM daily report - Currency and Dups handling

If you are pulling daily keyword performance data from the major search engines to store in your data warehouse, you need to perform many transformations before daily, weekly or monthly summaries are available to business (end) users. In the case of Yahoo, the transformations I needed to make were, one, to handle currency conversion and, two, to handle Yahoo sending more than one keyword when only one is expected. This seems to happen when the query or landing page URL is modified for the same bid unit, for high-impression queries, and also for disabled queries/keywords.

The following SQL handles both, with data read from the staging table and loaded into the final table after some transformation and cleaning. It does currency conversion for all countries except the US and Mexico (whose amounts are already in US dollars and need no conversion).

The sub-query below uses MySQL GROUP_CONCAT to find any duplicate keywords for a bid unit, and dim_currency holds the daily currency exchange rates pulled from external third-party services early in the morning. The se_query_group table has the grouping of keywords based on company requirements and search engine. The query also filters out any keyword that has zero impressions.
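A simplified sketch of the query is below; the staging, fact and grouping table and column names (stg_yahoo_kw_daily, fact_yahoo_kw_daily, usd_exchange_rate, etc.) are placeholders to adjust to your schema, and the GROUP_CONCAT de-duplication is shown in the main SELECT rather than a sub-query for brevity.

INSERT INTO fact_yahoo_kw_daily
  (report_date, bid_unit_id, query, query_group, impressions, clicks, cost_usd)
SELECT
  S.report_date,
  S.bid_unit_id,
  #-- GROUP_CONCAT exposes duplicate keywords sent for the same bid unit; keep the highest-impression one
  SUBSTRING_INDEX(GROUP_CONCAT(S.query ORDER BY S.impressions DESC), ',', 1) AS query,
  MAX(QG.query_group) AS query_group,
  SUM(S.impressions) AS impressions,
  SUM(S.clicks) AS clicks,
  #-- US and Mexico are already in US dollars; convert all other countries
  SUM(CASE WHEN S.country_code IN ('US', 'MX') THEN S.cost
           ELSE S.cost * CR.usd_exchange_rate END) AS cost_usd
FROM stg_yahoo_kw_daily S
LEFT OUTER JOIN dim_currency CR
  ON CR.currency_code = S.currency_code
 AND CR.rate_date = S.report_date
LEFT OUTER JOIN se_query_group QG
  ON QG.query = S.query
 AND QG.search_engine = 'Yahoo'
WHERE S.impressions > 0
GROUP BY S.report_date, S.bid_unit_id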

The query can be further modified for better performance by joining with the data warehouse query tables and using query_id in the joins, and also by using the search engine bid unit as the surrogate key.

Ruby libmysql.dll or mysql_api.so error

Installing the Ruby MySQL module is pretty straightforward on either *nix or Windows by running
> gem install mysql
But on Windows, when you try to load the module with require 'mysql', you may get an error like
C:/Ruby/lib/ruby/gems/1.8/gems/mysql-2.8.1-x86-mswin32/lib/1.8/mysql_api.so: 126: The specified module could not be found.   - C:/Ruby/lib/ruby/gems/1.8/gems/mysql-2.8.1-x86-mswin32/lib/1.8/mysql_api.so (LoadError)
#  from C:/Ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
...
...

In that scenario, you are missing "libmysql.dll" in the system32 directory. In my case I had a MySQL server installation and the DLL was in the folder C:/Program Files/MySQL/MySQL Server 5.4/lib/opt; copying it into C:/Windows/system32 (or another directory on the PATH) resolves the error.

Mapping URL to top level domain

When parsing web logs as part of web analytics, one might be interested in top-level domain analysis before drilling down further. The following query maps a URL (referrer) to its top-level domain. Top-level domains include not only generic ones like '.com', '.edu', '.org', etc., but also country-code ones like '.uk', '.de', etc. You can download the list from IANA (Internet Assigned Numbers Authority).

After downloading it, import it into a table dim_int_mktg_tld, which you can then join with the staging data; the table definition and a sample load statement follow.
CREATE TABLE `dim_int_mktg_tld` (
  `tld` varchar(16) NOT NULL,
  `type` varchar(32) default 'country-code',  #-- Can be country-code or generic
  `info` varchar(500) default NULL,
  PRIMARY KEY  (`tld`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
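
The downloaded IANA list (assuming the plain-text tlds-alpha-by-domain.txt format: one TLD per line, uppercase, with a comment header line) can then be loaded with something like the statement below; the file path is a placeholder, and the type/info columns can be updated afterwards for the generic TLDs.

#-- The path below is a placeholder; point it at the downloaded file
LOAD DATA LOCAL INFILE '/tmp/tlds-alpha-by-domain.txt'
INTO TABLE dim_int_mktg_tld
IGNORE 1 LINES       #-- skip the comment line at the top of the file
(@tld)
SET tld = LOWER(@tld)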

#-- Join with yesterday's staging data and get the TLD
SELECT WL.url, TLD.tld
FROM stg_weblog_date WL
LEFT OUTER JOIN dim_int_mktg_tld TLD
ON substring_index(
     substring_index(
       substring_index(url, '://', -1),
       '/',1), 
    '.',-1) = TLD.tld
WHERE WL.date = DATE_ADD(CURRENT_DATE, interval -1 day)

#-- Example: http://www.abc.co.au/in/test
#-- The innermost substring_index returns everything to the right of '://',
#--   i.e., www.abc.co.au/in/test,
#-- then the middle one gets everything to the left of the first '/',
#--   i.e., www.abc.co.au,
#-- then the outermost substring_index returns everything to the right of the last '.',
#--   i.e., au,
#-- which is then joined with the TLD table

HTH,
Shiva

Download Google keywords through SUDS

There are situations where you may want to download all the keywords (criteria) uploaded to Google AdWords, and if you have a substantial keyword basket (say, more than a million), then either downloading them manually or checking through the tools provided by Google becomes prohibitive. Also, the Google daily keyword performance report provides metrics only for the keywords that have had at least one impression.

The following application is a simple tool that downloads all keywords and dumps the data to separate files. I have removed error-handling code and hardcoded many values to simplify it; you can modify it to your requirements, and don't forget to add your account and API details. I have been able to download a few million keywords using multiple processes.

Environment:
Ubuntu Karmic running Python 2.6.4, targeting AdWords API version 13 (v13) with the Python SUDS SOAP module.