The issue was one table was storing data in "latin1" while other in "utf8".
select count(1)
from table_1 T1
left outer join table_2 T2
on T1.id = T2.id
where T2.id is null
Note: "id"s here are varchars and they are not integers.
T1 had around 100K while T2 has around 800K joining on MySQL5.1, quad machine with 16G RAM. I expected much quicker response and on running the explain plan I could see the query doing the full table scan! First I wasn't sure why this was happening but after digging little bit and found out that one table was defined with "charset=latin1" and another with "charset=utf8".
MySQL was converting latin1 to utf8 and then performing the join which ended up with full table scan. On converting T2 to utf8 with collation utf8_unicode_ci, the query timing dropped to 10secs or so!
With that said by default now we create all tables with charset=utf8.
No comments:
Post a Comment