Ticket #1232 (new defect)

Opened 2 weeks ago

domain_traffic becomes huge and slows down server (+possible fix)

Reported by: ispcomm Assigned to:
Priority: minor Milestone: Working
Component: Backend (Engine) Version: ispCP ω 1.0.0 - RC4
Severity: Easy Keywords:
Cc:

Description

Ok,

after some some of hosting some 100 domains the domain_traffic table becomes huge (in the order of millions of records). Reseller login times become very long (in the order of 1-2 minutes) because the current code tries to SELECT * from domain_traffic (for some reason).

Apart from fixing this code, I am proposing a better solution to the problem: a daily summary for records older than X days/months etc. This will replace tents of hundreds of rows from the domain_traffic with a single row containing the totals for web/ftp/mail/etc. The solution has been implemented on vhcs but should work with minor mods in ispcp.

I propose a modification to the current traffic correction script but do not implement it here. Also temporary tables should be used in mysql5 instead of the fixed ones I show here (this is trivial too).

First I need an extra column in domain_traffic to make life easier (it could be avoided with carefull handling of the times, but for first try I prefer it this way.

ALTER TABLE `domain_traffic` ADD `issum` TINYINT NOT NULL DEFAULT '0';

Then I create a "summary" table to hold summarized records. The definition is here:

CREATE TABLE  `domain_traffic_daily` (
  `dtraff_id` int(10) unsigned NOT NULL auto_increment,
  `domain_id` int(10) unsigned default NULL,
  `dtraff_time` int(10) unsigned default NULL,
  `dtraff_web` int(10) unsigned default NULL,
  `dtraff_ftp` int(10) unsigned default NULL,
  `dtraff_mail` int(10) unsigned default NULL,
  `dtraff_pop` int(10) unsigned default NULL,
  `days` int(10) unsigned NOT NULL,
  PRIMARY KEY  (`dtraff_id`)
) ENGINE=MyISAM ;

Now the query to summarize data is executed. The sums are made ignoring the "NULL" status of the rows. A null in one row won't compromize the sum (mysql5).

INSERT into domain_traffic_daily (domain_id,dtraff_time,dtraff_web,dtraff_ftp,dtraff_mail,dtraff_pop,days)
select
  domain_id,
  dtraff_time,
  sum(dtraff_web),
  sum(dtraff_ftp),
  sum(dtraff_mail),
  sum(dtraff_pop),
  to_days(from_unixtime(dtraff_time)) as days
from
  domain_traffic
where
  dtraff_time < UNIX_TIMESTAMP(DATE_SUB(CURDATE(), INTERVAL 3 MONTH))
group by
  domain_id,
  days
order by
  days,domain_id;

Next step is deleting old data from the domain_traffic table:

delete from domain_traffic
where
  dtraff_time < UNIX_TIMESTAMP(DATE_SUB(CURDATE(), INTERVAL 3 month));

At last we move the summarized data back to domain_traffic:

insert into
  domain_traffic (
    domain_id,
    dtraff_time,
    dtraff_web,
    dtraff_ftp,
    dtraff_mail,
    dtraff_pop,
    issum
    )

select
  domain_id,
  dtraff_time,
  dtraff_web,
  dtraff_ftp,
  dtraff_mail,
  dtraff_pop,
  '1'
from
  domain_traffic_daily

That's it. The mod is implemented in a way to stay compatible with the current vhcs2/ispcp web and other routines.

There's one problem in the current implementation: If the totals are bigger than the MAX_INT they're rounded to MAX_INT. this is 4Gbytes on a 32bit machine and will loose traffic on busy sites. I do have domains with excess of 4Gbyte/day web and mail traffic. The solution is to use 64bit machine or double values. The latter might cause troubles in current ispcp (not tested).

Attachments


Add/Change #1232 (domain_traffic becomes huge and slows down server (+possible fix))