Special characters in table names

In DW environment, we allow business user to upload data into new tables and run their own adhoc queries when needed. The user interface allows the user to upload up to 10,000 (10K) rows in either comma or tab delimited format to a temporary database/table. The user can specify the table name they want the data to upload to. Many savvy business users can thus run their own SQL queries joining with the data in DW tables.

All this works fine but many a times user provides table names with non-alphanumeric characters including space, forward or backward slash, dollar sign, etc. At some point DBAs noted that managing these tables becoming an issue especially the tables names with special characters. The following monitoring tool helped alleviate or eliminated the problem and also the UI was modified to check for any special characters and remove them before creating users' temporary tables.

Since MySQL doesn't provide regular expression (RLIKE or REGEXP) in SHOW TABLE syntax one is forced to use UNIX or other language (awk, perl, python, etc.) for RE features.

The following pulls all table names with special characters including $, \, space, ?, and @. You can add other characters that are of interest to you to this set.
linux> mysql --host=hostname_or_ip  --database=db_name   --user=root  --password=your_pw  --execute="SHOW TABLES" | egrep [$\\\ ?!@]

# This shows tables like (not including double quotes)
"non_alphanum_table_name test it 2"
"non_alphanum_table_name; test it"
"non_alphanum_table_name; test it $"
"non_alphanum_table_name; test it $ \"
"non_alphanum_table_name_test_$_!_2"

Getting MySQL DB size

Many a times you want to find the size of database (schema) for performance monitoring or to allocate more disk space or to take some proactive action on db size.  You can pull relevant information from MySQL INFORMATION_SCHEMA with the following query.  It sums up each table data and its index size for all tables in a single schema. 

SELECT s.schema_name,
  CONCAT(IFNULL(ROUND((SUM(t.data_length) +
       SUM(t.index_length))/1024/1024,2),0.00),"Mb") total_size,
  CONCAT(IFNULL(ROUND((SUM(t.data_length))/1024/1024,2),0.00),"Mb") data_size,
  CONCAT(IFNULL(ROUND((SUM(t.index_length))/1024/1024,2),0.00),"Mb") index_size,
  CONCAT(IFNULL(ROUND(((SUM(t.data_length)+SUM(t.index_length)) -
         SUM(t.data_free))/1024/1024,2),0.00),"Mb") data_used,
  CONCAT(IFNULL(ROUND(SUM(data_free)/1024/1024,2),0.00),"Mb") data_free,
  IFNULL(ROUND((((SUM(t.data_length)+SUM(t.index_length)) -
       SUM(t.data_free))/((SUM(t.data_length) + 
       SUM(t.index_length)))*100),2),0) pct_used,
  SUM(t.table_rows) total_rows,
  COUNT(t.table_name) total_tables,
  CONCAT(ROUND(MAX(t.data_length+t.index_length)/1024/1024, 0.00), 
      "Mb") biggest_table_size,
  GROUP_CONCAT(distinct t.engine) engines_used,
  GROUP_CONCAT(distinct t.table_collation) collations_used
FROM INFORMATION_SCHEMA.SCHEMATA s
  LEFT JOIN INFORMATION_SCHEMA.TABLES t 
ON s.schema_name = t.table_schema
WHERE s.schema_name = "test_dw_advertising"
GROUP BY s.schema_name\G


*************************** 1. row ***************************
       schema_name: test_dw_advertising
        total_size: 212151.14Mb
         data_size: 154660.83Mb
        index_size: 57490.31Mb
         data_used: 210039.45Mb
         data_free: 2111.69Mb
          pct_used: 99.00
        total_rows: 309940227
      total_tables: 118
biggest_table_size: 74360Mb
      engines_used: InnoDB,MyISAM
   collations_used: latin1_swedish_ci
1 row in set (1 min 38.26 sec)


The total size of allocated (including free data size) is nearly 212G, out of which data used ~155G and indexes used ~57G for 118 tables leaving 2G free (99% of disk allocated is used). Biggest table used 74.3G disk space and this schema has both InnoDB and MyISAM tables using single collation latin1_swedish_ci.

HTH,
Shiva

Talend - Oracle - Java Path error

Few weeks ago on of our QA systems I installed Oracle 10g Express for testing and installation was smooth and all went well.  But recently, I needed to do some data profiling in a specific table in MySQL DW environment due to fact that some of queries were taking more than twice the amount of normal runs.  Specifically, one query that used to take 5-6 minutes churning through couple of million rows in a staging table started taking more than 15 minutes.  I suspected some data issue but needed to quickly run some tests knowing how the recent data profiles against earlier weeks data. 

I downloaded Talend Data Profiler to this new test machine (where I had Oracle 10g) and tried to install it.   Soon after double clicking on exe file, I got an error indicating something like "org.talend.rcp.branding.top.product could not be found." in configuration\12725000123.log.  The log also has some more information regarding which Java version it is trying to use and command line options passed to run it.  The error itself doesn't indicate much and this wasted quite a bit of time.

Going through manual of Talend I found out that profiler needs Java version 1.5 or later version.  But I was pretty sure that I had the latest Java environment on this box and checked twice by running java -version in cygwin and checking the path.  Also by visiting http://www.java.com then clicking on "Verify Installation" button.  They both indicated latest version.