2008
04.17

I was curious about some of the databases I get to work with. “How do my clients store their data? What data types are most prevalent?” Well, a few keystrokes later, I had my answers:

SELECT c.data_type, count(c.data_type) AS frequency
FROM information_schema.columns AS c
INNER JOIN information_schema.tables AS t
ON c.table_schema = t.table_schema AND
c.table_name = t.table_name
WHERE c.table_schema NOT IN ('information_schema','mysql') AND
t.table_type = 'base table'
GROUP BY data_type;

Which gave me a nice “data type distribution” table:

data_type frequency
blob 7
char 611
date 85
datetime 125
decimal 133
double unsigned 1
enum 677
float 5
int 2334
mediumblob 21
mediumint 3
set 9
smallint 7
text 57
time 551
timestamp 20
tinyint 6
varchar 946

Not content with just that, I whipped up another statement to show me the top 10 tables that used a particular data type (replace the WHATEVER with the data type you’re interested in) :

SELECT c.table_schema, c.table_name, count(c.data_type) AS count
FROM information_schema.columns c
INNER JOIN information_schema.tables AS t
ON c.table_schema = t.table_schema AND
c.table_name = t.table_name
WHERE c.table_schema NOT IN ('information_schema','mysql') AND
t.table_type = 'base table' AND
c.data_type = 'WHATEVER'
GROUP BY c.table_schema, c.table_name
ORDER BY count DESC
LIMIT 10;

Which showed me:

table_schema table_name count
database_1 object_payment 14
database_2 object_payment 12
database_3 object_payment 11
database_2 object_space 9
database_1 object_space 9
database_1 aggregation_table_1 8
database_2 aggregation_table_1 8
database_2 aggregation_table_3 7
database_1 aggregation_table_4 7
database_1 object_user 7

Finally, I wrote them up into stored procedure form (and included a 3rd that gives the data type distribution for specified databases) and committed them to my project on Sourceforge.

  1. In the first query, you have a typo – at the end of the WHERE, it should be t.table_type, not c.table_type.

  2. That’s what I get for trying to do too many things at the same time. Thanks for pointing it out.