Hit Tracker


pkuczaj
05-07-2004, 07:55 PM
I want to add a single command that records stats about people hitting the site. I want it to trigger on ALL activity and do an insert into the database. I have the command written and the table created, but I need to find the right place in the PHP files to put it. I've tried global.php, but it executes twice. Any suggestions?

Command:
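// RESOLVED_ADDR and TIMESTAMP_NOW are not standard $_SERVER keys, so this uses gethostbyaddr() and time() (assumes an integer Unix-timestamp column); client-supplied values are escaped.
$DB_site->query("INSERT INTO " . TABLE_PREFIX . "hit_data (hitid, host, resolvedhost, useragent, port, referer, timestamp) VALUES (NULL, '" . addslashes($_SERVER["REMOTE_ADDR"]) . "', '" . addslashes(gethostbyaddr($_SERVER["REMOTE_ADDR"])) . "', '" . addslashes($_SERVER["HTTP_USER_AGENT"]) . "', '" . intval($_SERVER["REMOTE_PORT"]) . "', '" . addslashes($_SERVER["HTTP_REFERER"]) . "', '" . time() . "')");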

leitel
05-08-2004, 03:14 AM
I applaud your effort. I too have contemplated such a hack, I just haven't got to it. I find vB lacks robust features for understanding what interests site users and visitors. I'll stay tuned for now.

One thing that occurs to me is to place the command in a function, with appropriate parameters, in an include file. Maybe even global, since it will be called globally.
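
A rough sketch of that idea, not tested against a live board. The function names and the include-file name in the comment are made up for illustration; the table and columns are the ones from the hit_data query posted in this thread. It uses addslashes() only because that is what the thread's own code uses; the database layer's own escaping would be the better choice on a real install.

```php
<?php
// Hypothetical helper for an include file, e.g. includes/functions_hittracker.php.

// Build the INSERT statement from plain values so it can be inspected or tested.
function build_hit_query($prefix, $server, $username)
{
    // Missing headers (no referer, no user agent) become empty strings.
    $host    = isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '';
    $agent   = isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '';
    $referer = isset($server['HTTP_REFERER']) ? $server['HTTP_REFERER'] : '';
    $port    = isset($server['REMOTE_PORT']) ? intval($server['REMOTE_PORT']) : 0;

    return "INSERT INTO " . $prefix . "hit_data (hitid, host, username, useragent, port, referer) VALUES (NULL, '"
        . addslashes($host) . "', '" . addslashes($username) . "', '" . addslashes($agent) . "', "
        . $port . ", '" . addslashes($referer) . "')";
}

// Run the query through any object that has a query() method.
function log_hit($db, $prefix, $server, $username)
{
    return $db->query(build_hit_query($prefix, $server, $username));
}
```

With the variables already used in this thread, the call would presumably look like log_hit($DB_site, TABLE_PREFIX, $_SERVER, $bbuserinfo['username']).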

pkuczaj
05-09-2004, 06:20 AM
I've currently got it running in global.php. But I'm finding that a single forum view will generate anywhere from 1 to as many as 9 hits. global.php must also be used for attachments or some other sub-function. I need to find a single location, or a controlling variable. If I could find the place that executes the header template, that would do it.

I've further refined the insert to include the username as well, so I can further query the resulting data. My one concern at this point is the additional load on the machine.
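
One possible "controlling variable" is the name of the script handling the request. Below is a minimal sketch, assuming the extra hits come from helper scripts such as attachment or image requests; the function names and the skip list are illustrative guesses, not something taken from vBulletin. I believe vBulletin scripts also define a THIS_SCRIPT constant that could serve the same purpose, but that is worth verifying on your version.

```php
<?php
// Derive a short script name such as "showthread" from the request.
function current_script_name($server)
{
    $path = isset($server['SCRIPT_NAME']) ? $server['SCRIPT_NAME'] : '';
    return strtolower(basename($path, '.php'));
}

// Decide whether a request should count as a page hit.
// The default skip list is an assumption; adjust it to the scripts on your board.
function should_count_hit($script, $skip = array('attachment', 'image', 'cron'))
{
    return !in_array(strtolower($script), $skip, true);
}
```

The insert would then be wrapped in something like if (should_count_hit(current_script_name($_SERVER))) { ... }.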

Any thoughts?

pkuczaj
05-09-2004, 07:21 AM
Ok. My latest revision. This one seems to be working. But I'm still testing it out. This is placed into the global.php file.

// Skip logging once output has started (headers already sent).
if (!headers_sent())
{
    $DB_site->query("INSERT INTO " . TABLE_PREFIX . "hit_data (hitid, host, username, useragent, port, referer) VALUES (NULL, '" . $_SERVER["REMOTE_ADDR"] . "', '" . addslashes($bbuserinfo['username']) . "', '" . addslashes($_SERVER["HTTP_USER_AGENT"]) . "', '" . $_SERVER["REMOTE_PORT"] . "', '" . addslashes($_SERVER["HTTP_REFERER"]) . "')");
}

SamirDarji
09-04-2004, 10:20 PM
global.php is included in various php files for different functions. I'm a newb myself, but I'm guessing that with more calls to global.php, the counter gets incremented more often. My instinct would be to put it in index.php itself, since that is the first file being called.

I'm interested in what you are working on because I need to implement a counter (or thread-views type of thing) for calendar events. I need to have this developed and working by the end of this month. Some of the logic I'll be developing may be of use to you and vice-versa.

pkuczaj
09-05-2004, 02:45 AM
The only problem is that if a user goes to a post or a thread directly from an e-mail notification, they bypass index.php altogether.

SamirDarji
09-05-2004, 04:03 PM
True. The challenge is that you want it triggered on all activity. Technically, when global.php was recording multiple hits, that was all activity, since many different calls to global.php were required. The difficult part is that there will have to be conditionals to determine whether a particular hit should be counted or disregarded.

I'm sure some of this logic must exist in vBulletin itself, since thread views cannot be artificially inflated by simply hitting refresh.
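
As an example of the kind of conditional I mean, here is a minimal sketch that ignores a repeat request from the same visitor for the same URL inside a time window. This is not how vBulletin does it, just one common approach. The function name, the 300-second default, and the plain array used as storage are all assumptions for illustration; a real board would need a table or cache in place of the array.

```php
<?php
// Treat a repeat request from the same host for the same URL within
// $window seconds as a refresh rather than a new hit.
// $seen maps a host/URL key to the time of the last counted hit.
function is_new_hit(&$seen, $host, $url, $now, $window = 300)
{
    $key = md5($host . '|' . $url);
    if (isset($seen[$key]) && ($now - $seen[$key]) < $window) {
        return false; // same visitor, same page, too soon: ignore
    }
    $seen[$key] = $now; // remember when this hit was counted
    return true;
}
```

The insert would only run when is_new_hit() returns true, so hammering refresh adds at most one row per window.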

AN-net
09-05-2004, 05:39 PM
Running such a query on a large board can become database- and server-intensive, which is not advisable for webmasters who run their forums on shared hosting ;)
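
One way to soften that load is to buffer hits and write them in batches, so the database sees one multi-row INSERT in place of a query per page view. The sketch below only builds the statement; the function name is hypothetical, where the buffer lives (a file, a memory table) is left open, and the columns are the ones from the hit_data query earlier in the thread.

```php
<?php
// Turn a batch of buffered hits into a single multi-row INSERT.
// Each hit is an array with host, username, useragent, port and referer keys.
function build_batch_insert($prefix, $hits)
{
    if (count($hits) == 0) {
        return ''; // nothing to write
    }
    $rows = array();
    foreach ($hits as $hit) {
        $rows[] = "(NULL, '" . addslashes($hit['host']) . "', '" . addslashes($hit['username']) . "', '"
            . addslashes($hit['useragent']) . "', " . intval($hit['port']) . ", '" . addslashes($hit['referer']) . "')";
    }
    return "INSERT INTO " . $prefix . "hit_data (hitid, host, username, useragent, port, referer) VALUES "
        . implode(', ', $rows);
}
```

The trade-off is that hits still sitting in the buffer are lost if the server goes down before the next flush.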

pkuczaj
09-05-2004, 06:32 PM
I would agree the logic does exist, but I just haven't found it. (I haven't looked in a while either)

pkuczaj
09-05-2004, 06:39 PM
I do run a large board. I get about 8,500-9,000 posts a day. I did have the previous hit counter script running on the board, but with the duplicate hit problem I wrote another little application that does the hit counting for me. It's more accurate because it's only included in the footer.

The hit counting is intense, but it's worth the trouble and the load because it justifies banner advertising, and it's information that's required to do that.

My Linux AMD 3000 box currently runs at a sustained load average of 5-6 at peak times, spiking to 12-15 at times. I'm currently in the process of acquiring a SCSI Ultra 320 drive and controller in hopes that it relieves some of the load on the box. If that doesn't work, then I need to go with a dual processor box.

SamirDarji
09-05-2004, 07:29 PM
I didn't think about the footer. Actually, all it would require is finding something that is on every page of your site, regardless of where, and placing the counter in that template or php file.

SCSI will help out with that quite a bit. My brother swears by it and will never go IDE for any reason. His cpu utilization never goes above 5% from disk activities, and he messes with video production. That's a lot better than the 100% he hits on his backup IDE drives. He hates those things. I would never run a server on IDE that utilizes its processor even half way.

The other way to eliminate the CPU overhead is to have a separate file server linked by a dedicated gigabit ethernet to your main server. This way, the cpu in the separate file server takes all the pounding from the IDE controller, leaving your main server's cpu free to process the php for the site.

pkuczaj
09-05-2004, 07:31 PM
But then your NICs hit your CPU quite hard. So it's a trade-off. The only real way is to go SCSI, I think. I'm already running three IP addresses on the box, and that increases the utilization quite a bit, and adding the traffic of the database hitting an SMB/shared drive would kill it.

SamirDarji
09-05-2004, 09:27 PM
The NICs would hit, but a good NIC will offload a lot of the processing to itself, similar to how a SCSI controller does. And don't run the SMB connection on any existing NICs; have a completely separate gigabit link to the other server just for file transfers. That would leave the same bandwidth for the 3 IPs. I think the amount of CPU power saved by eliminating the IDE would be more than the additional NIC would use.

Of course, then there is also the option of splitting the database and web servers onto 2 boxes. ;)

pkuczaj
09-06-2004, 02:47 AM
I've actually thought about the web server and database (MySQL) running on two boxes. But if I'm going to go with a new box, why not go with a dual processor box, and run everything local as it is now? I didn't know that about the upgraded NIC (offloading the CPU). I'll go research a good NIC now, and pick one up for the upgrade that I've got scheduled. I'm upgrading the HD from a 100G IDE to a 146G SCSI Ultra 320 with a caching controller, and increasing the UPS time because of the controller.

SamirDarji
09-06-2004, 03:19 AM
The dual processors won't do as much as you think. I was running vb at work as a knowledgebase on a dual Xeon and I'd watch as only one CPU would be used at a time. It was really ridiculous. A friend of mine did some master's research in this area and discovered that the best case improvement that dual processors can make is 50%. I initially liked the idea of dual processors when they first came to the desktop, but after doing the research, it's usually not worth it. Besides, if you're already at the limit on your current box, I don't know if a mere 50%-75% improvement could sustain you for too long. And considering the investment, it's probably not great bang for the buck if it only buys a short time before the next upgrade.

3Com and Intel make some good server nics that are specifically designed to be very low on cpu utilization. The HD upgrade from the 100g IDE to the 146g SCSI will be tremendous as far as data transfer is concerned, although I'd hesitate on the caching controller. Back in the day I did some study on caching vs non-caching and the software caches at the time showed that it was possible to achieve the same performance as a caching controller with just a regular controller and a software cache.

Umm....what was this thread originally talking about? I forgot :D

pkuczaj
09-06-2004, 04:43 AM
One last question on this topic. Are you running the dual processor in Windows or Linux? I find that Linux is far superior to Windows at utilizing dual processors. I use both in my environment, and I find that a lot of Windows applications have difficulty fully utilizing the second processor, or can't use it at all.

SamirDarji
09-06-2004, 05:34 AM
The dual processor environment I had experience with was a Win2k3 Dell blade server. But my friend's research was all Unix-based and using x86-compatible chips. And 50% was the best case scenario. The real-world usage was much less. Even with advances in chips, designs, and software, I would think we're only closer to that 50% for real-world usage. But again, if the costs are only 50% more and you're getting 50% more gain, then the price/performance ratio is fine. And I believe that a dual setup would work in your case, but what I'd be afraid of is for how long. The cost and headache of upgrading the hardware is much more than the hardware itself for a high usage site like yours. A dual setup wouldn't be worth it if it only lasts a year before another major upgrade.
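
For a rough sense of where a figure like 50% could come from, Amdahl's law is the usual back-of-envelope model. That is my framing, not necessarily what the research mentioned above used. It gives the speedup from n processors when only a fraction p of the work can run in parallel. The function name below is made up for illustration.

```php
<?php
// Amdahl's law: speedup from $n processors when a fraction $p (0..1)
// of the work can run in parallel and the rest stays serial.
function amdahl_speedup($p, $n)
{
    return 1.0 / ((1.0 - $p) + $p / $n);
}
```

With two processors, a workload that is two-thirds parallel works out to a speedup of 1.5, i.e. a 50% gain. The theoretical ceiling of 2x needs everything to run in parallel, which a real web and database workload is unlikely to manage.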