Investigate disk usage
Recently I had to investigate the usage of a very big volume… with a lot of data-files, owned by several users.
I started with “agedu”, but somehow I was not able to get the information I needed.. so I started using find with stat and put everything into MySQL.
So the first step was to do the following find command:
# find /nfs/bigfiler -exec stat –format=”%F:%n:%s:%U:%u:%G:%g:%X:%Y:%Z” {} ; > /scratch/big-filer.info
Create a table in MySQL:
CREATE TABLE `pieter.bigfiler_content` (
`fileid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`file_type` char(32) NOT NULL,
`filename` varchar(512) NOT NULL,
`size` int(11) NOT NULL,
`user` char(16) DEFAULT NULL,
`uid` char(16) DEFAULT NULL,
`groupname` char(16) DEFAULT NULL,
`gid` char(16) DEFAULT NULL,
`time_access` datetime DEFAULT NULL,
`time_mod` datetime DEFAULT NULL,
`time_change` datetime DEFAULT NULL,
PRIMARY KEY (`fileid`),
KEY `idx_file_type` (`file_type`) USING BTREE,
KEY `idx_user` (`user`) USING BTREE
)
Load the data into MySQL:
LOAD DATA INFILE '/scratch/big-filer.info'
INTO TABLE pieter.bigfiler_content
FIELDS TERMINATED BY ':'
LINES TERMINATED BY 'n'
(file_type, filename, size, user, uid, groupname, gid, @time_access, @time_mod, @time_changed)
SET time_access = FROM_UNIXTIME(@time_access),
time_mod = FROM_UNIXTIME(@time_mod),
time_change = FROM_UNIXTIME(@time_change);
And now you can run nice queries to analyse the data 😀