Special characters in table names

In DW environment, we allow business user to upload data into new tables and run their own adhoc queries when needed. The user interface allows the user to upload up to 10,000 (10K) rows in either comma or tab delimited format to a temporary database/table. The user can specify the table name they want the data to upload to. Many savvy business users can thus run their own SQL queries joining with the data in DW tables.

All this works fine but many a times user provides table names with non-alphanumeric characters including space, forward or backward slash, dollar sign, etc. At some point DBAs noted that managing these tables becoming an issue especially the tables names with special characters. The following monitoring tool helped alleviate or eliminated the problem and also the UI was modified to check for any special characters and remove them before creating users' temporary tables.

Since MySQL doesn't provide regular expression (RLIKE or REGEXP) in SHOW TABLE syntax one is forced to use UNIX or other language (awk, perl, python, etc.) for RE features.

The following pulls all table names with special characters including $, \, space, ?, and @. You can add other characters that are of interest to you to this set.
linux> mysql --host=hostname_or_ip  --database=db_name   --user=root  --password=your_pw  --execute="SHOW TABLES" | egrep [$\\\ ?!@]

# This shows tables like (not including double quotes)
"non_alphanum_table_name test it 2"
"non_alphanum_table_name; test it"
"non_alphanum_table_name; test it $"
"non_alphanum_table_name; test it $ \"
"non_alphanum_table_name_test_$_!_2"



Here you have choice either delete those tables or store the name of these tables into a separate table called "ERROR_TABLE_NAME" and work with business users to later modify them. Option 2 is friendly and professional and you won't annoy users like in option 1 :).

Quickly zap an awk script like
{ print "INSERT INTO ERROR_TABLE_NAME(name) VALUES('" $0 "'); COMMIT;" }
to insert each table name into ERROR_TABLE_NAME table.

You need to store the script in a file (say, error_tbl_name.awk) since at command line passing a single quote in the above string doesn't work. When you pass a table name to MySQL, you need to quote the name like 'non_alphanum_table_name test it 2'. But single quote is used by awk which was not easily escaped and errored out.

Now piping all three commands above will result in
linux> mysql --host=hostname_or_ip  --database=db_name   --user=root  --password=your_pw  --execute="SHOW TABLES" | egrep [$\\\ ?!@] | awk -f error_tbl_name.awk
-- The INSERT statements generated for each looks like
INSERT INTO ERROR_TABLE_NAME(name) VALUES('non_alphanum_table_name test it 2'); commit;
INSERT INTO ERROR_TABLE_NAME(name) VALUES('non_alphanum_table_name; test it'); commit;
INSERT INTO ERROR_TABLE_NAME(name) VALUES('non_alphanum_table_name; test it $'); commit;
INSERT INTO ERROR_TABLE_NAME(name) VALUES('non_alphanum_table_name; test it $ \\'); commit;
INSERT INTO ERROR_TABLE_NAME(name) VALUES('non_alphanum_table_name_test_$_!_2'); commit;

Also, note you have commit after each insert, due to fact that the auto commit is turned off and when piping is done mysql command is executed for each table which will generate error otherwise.

As final step execute

linux> mysql --host=hostname_or_ip  --database=db_name   --user=root  --password=your_pw  --execute="SHOW TABLES" | egrep [$\\\ ?!@] | awk -f error_tbl_name.awk |  mysql --host=hostname_or_ip  --database=db_name   --user=root  --password=your_pw

Change these table names to new standard table names and send an email to business users with list of above table names.

There you have it.
HTH,
Shiva

No comments:

Post a Comment