bairy 08-15-2006 10:00 PM

Image Status Checker / Dead Image Finder
 
Note: this hack works with vB 3.6.


What does this do?
It scans all your posts, extracts all the img tags, and scans each of the images to see if they're still valid.


Why?
I had a look at all the images on my site and was alarmed at how many were now gone - deleted from photobucket accounts etc. Since the only way you can check the images on your board is to manually read every post, I decided to come up with a better way... and this is it.


How does it work?
The first part: In the AdminCP, under Maintenance > Update Counters, right at the bottom, is this hack. It works by looking up every [img] tag, requesting the image, and reading the HTTP status code: 200 means 'image OK', 404/410 means 'image gone', and so on. The result then gets stored in a database table. A server has 15 seconds to reply to the request; if it doesn't, the status is labelled "Unknown".
The second part: The browsing element, imagestatuscheck.php (original filename huh!). This allows you to browse all the images found in the last scan using some powerful filtering (statuses to display, search, order by).
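
As a rough illustration of the first part, the status check can be sketched in PHP as below. This is not the hack's actual code: the function names are made up for the example, and using '000' to mean "no usable reply" is an assumption.

```php
<?php
// Hypothetical helpers for illustration only.

// Pull the three-digit code out of a status line such as "HTTP/1.1 200 OK".
// Returns '000' when the line does not look like an HTTP/1.x status line.
function isc_parse_status($line)
{
    if (preg_match('#^HTTP/1\.\d (\d{3})#', $line, $m)) {
        return $m[1];
    }
    return '000';
}

// Request one image over plain http:// and return its status code.
// Anything that is not http://, cannot be reached, or does not answer
// within $timeout seconds yields '000'.
function isc_fetch_status($url, $timeout = 15)
{
    $parts = parse_url($url);
    if (!$parts || !isset($parts['host']) || !isset($parts['scheme'])
        || strtolower($parts['scheme']) !== 'http') {
        return '000';
    }
    $port = isset($parts['port']) ? $parts['port'] : 80;
    $path = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query'];
    }

    $fp = @fsockopen($parts['host'], $port, $errno, $errstr, $timeout);
    if (!$fp) {
        return '000';
    }
    stream_set_timeout($fp, $timeout);
    fwrite($fp, "GET $path HTTP/1.0\r\nHost: {$parts['host']}\r\nConnection: close\r\n\r\n");
    $line = fgets($fp, 128); // only the status line is needed
    fclose($fp);

    return isc_parse_status($line === false ? '' : $line);
}
```

Calling isc_fetch_status() with an http:// image URL would return a string such as '200' or '404', or '000' if the server gives no usable answer in time.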


Hack features
  • General
      • Fully phrased.
      • Templates are grouped. Who's online handled.
  • Part 1 - Admin
      • Reads the post table, scans all the [img] tags on demand and records the actual HTTP status code returned.
      • If it gets stuck during the scan, you can restart the section it's currently doing.
      • If an image appears in more than one post, it's only checked once.
      • Start from, per page and timeout options for scanning.
  • Part 2 - Browser
      • Status codes are put into one of three descriptions for simplicity: Working, Dead, Unknown. Unknown is if the server didn't respond or similar, on the basis that a temporary timeout doesn't necessarily mean the image has gone.
      • In the browser, image URLs are force-wrapped. Unless people post using all caps, you have a low screen resolution, or the font size is big, the table should never stretch.
      • Filtering allows you to show just the working/dead/unknown images, and there's a search facility for a variety of fields.
      • Convenient link to edit the post (if a dead link is found). This works by can_moderate: edit links only appear for people who own the post, or can moderate the forum it's in.
      • Works by canview: if someone can't view a particular forum (e.g. staff forum) normally, they can't view the images within it.
      • Uses CSS for common stuff to reduce the size of the output pages.
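
The tag-finding step listed under Part 1 could look something like the sketch below. The function name and the exact pattern are assumptions, not taken from the hack. The pattern deliberately refuses whitespace and square brackets inside the URL, so a bare [img] typed as ordinary text is not treated as the start of an image.

```php
<?php
// Hypothetical helper for illustration only.
// $posts maps post id => post text. Returns one row per [img] tag found,
// in the order the tags appear.
function isc_extract_images(array $posts)
{
    $rows = array();
    foreach ($posts as $postid => $text) {
        // Case-insensitive; the URL may not contain whitespace or [ ] characters.
        preg_match_all('#\[img\]\s*([^\s\[\]]+)\s*\[/img\]#i', $text, $matches);
        foreach ($matches[1] as $url) {
            $rows[] = array('postid' => $postid, 'imageurl' => $url);
        }
    }
    return $rows;
}
```

Each returned row carries the post id and the image URL, which is the minimum needed to check the image and later link back to the post.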


Bad Things
It's far from a perfect hack and there are many things left to do. Please be aware that I won't be doing them, but if anyone else wants a crack, feel free!
  • Only supports http://, not https://
  • Can only handle replies whose first line looks like: HTTP/1.x 200
  • Only supports [img] tags. If you have HTML turned on in any forums it won't see <img src=> images.
  • Biggie: There's no way to update a single post or image without a full re-scan. That means if someone edits their post to update or remove a dead link, it will not change on the browser until a full re-scan is done. I did play with various update methods but most are flawed in one way or another. A planned feature will be to update the table dynamically whenever a post is made, edited or deleted, and on demand using a link.
  • No cron job.
  • No session variables. (People without cookies will be logged out a lot).


Footnotes
Originally I planned to throw something together quickly just for me to use but it turned into a "I may as well make a nice interface... oh and I may as well put some filtering controls in and I ..."


A [url] link checker can be found here


Installation
Upload imagestatuscheck.php to your vB directory. Install the product, set overwrite to yes.


Customizing
  • By default it's set to only allow moderators, super-moderators and administrators to view the browser. This can be changed with the setting in AdminCP > vB Options.
  • The phrases all start with ics_ if you want to change them.
  • You can add a link to imagestatuscheck.php on the navbar (or anywhere) if you want your members to be able to view it.


Screenies
Shot 1 is AdminCP during scan
Shot 2 is a typical Browser section output
Shot 3 is no results output


Changelog
See attached file for specific changes.
1.00 - 16th August 06
1.01 - 17th August 06
1.02 - 27th December 06

ChrisSy 08-16-2006 02:27 PM

Looks like a very well made hack, and I don't mean to offend you, but I'm a bit unsure of its use. Once you've found the posts missing images, then what?

Is it possible to include a feature that scans threads for off-site linked images and then backs the images up into a folder on your server?

That way you can restore them when the img uploader sites decide to delete them.

bairy 08-16-2006 03:05 PM

Quote:

Originally Posted by ChrisSy
Looks like a very well made hack, and I don't mean to offend you, but I'm a bit unsure of its use. Once you've found the posts missing images, then what?

Whatever you like. All this script does is tell you if images linked in posts are working or not. If not, you (or the post owner) can edit the post to either update the link or delete it.

Quote:

Originally Posted by ChrisSy
Is it possible to include a feature that scans threads for off-site linked images and then backs the images up into a folder on your server?

I should think so but it's not something I'll be developing.

Jay... 08-16-2006 04:35 PM

Is there any way this can be done for all links? That's what I am looking for.

bairy 08-16-2006 05:04 PM

I'll probably knock one out for [url] at some point, the code won't be too different.

Jay... 08-16-2006 05:13 PM

Quote:

Originally Posted by bairy
I'll probably knock one out for [url=] at some point, the code won't be too different.

Nice one. If I press install, will you be keeping us updated?

ntock 08-16-2006 06:11 PM

Looks cool, I'd install if it'd replace all dead images with an image stored on your server which looks like "3rd party image not hosted anymore." etc. Great work though :)

Gryphon 08-16-2006 06:49 PM

Get an error on scan. Found the offending post, but you might want to account for the odd duck who tries to post weird urls.

Also got an error when someone typed a bare [img] in their post and then later put an existing [img]http://img.jp[*/img]. It tried to insert the following into the database:
Code:

in their post and then later put an existing [img]http://img.jp[*/img]
Code:

Database error in vBulletin 3.6.0:

Invalid SQL:
INSERT INTO vb3_imagestatus VALUES (NULL, 87423, 1510, 'javascript:ShowLarge('/path/to/image.jpg');', '');

MySQL Error  : You have an error in your SQL syntax.  Check the manual that corresponds to your MySQL server version for the right syntax to use near '');', '')' at line 1

and

Code:

Invalid SQL:
INSERT INTO vb3_imagestatus VALUES (NULL, 99805, 63, 'http://fakemeit'sprobablyaredXyoudope.jpg', '');

MySQL Error  : You have an error in your SQL syntax.  Check the manual that corresponds to your MySQL server version for the right syntax to use near 'sprobablyaredXyoudope.jpg', '')' at line 1


bairy 08-16-2006 06:56 PM

Jay... : yes
ntock : good suggestion.. though I'd rather leave the original url in so it can be corrected by the post owner if it's just been moved.
Blackjack : Looks like I forgot to escape the string to account for those dodgy urls. A job for the next release.

Gryphon 08-16-2006 07:03 PM

There was also another issue, I edited my post.

Mr Chad 08-16-2006 07:03 PM

Wouldn't this use a lot of bandwidth?

rmxs 08-16-2006 07:03 PM

Thanks installed :)

rmxs 08-16-2006 07:07 PM

OK, I tried it and it works, but I get many links flagged with Unknown status.

Why does this happen?

Can you tell me how I can also add it to the navbar for the moderator, admin and super-moderator groups only?
I mean, if the user is not in group 5, 6 or 7, don't show the link.

EDIT:

OK, I did it, it's easy LOL

<if condition="$bbuserinfo[usergroupid] == 6">

<td class="vbmenu_control"><a href="imagestatuscheck.php">DIF</a></td>

</if>

bairy 08-16-2006 07:51 PM

Chad : Each image is requested one by one and only the first 12 characters of the response are read, as they are the ones with the status code in them. After that the connection is closed. In theory it will send about 200 bytes and receive 12 bytes per request. In practice I don't know exactly how web servers behave, but I suspect that once PHP has closed the connection the transfer will stop. So no, not much bandwidth.
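
A sketch of the "read 12 characters and stop" idea, written against any open stream so it can be tried without a network connection. The function name is hypothetical and the '000' fallback is an assumption; the hack's own code may differ.

```php
<?php
// Hypothetical helper for illustration only.
// "HTTP/1.1 200" is exactly 12 characters, so 12 bytes are enough to
// recover the status code. Nothing after that is read.
function isc_status_from_stream($fp)
{
    // Note: on a real socket a single fread() may return fewer bytes than
    // requested; this sketch treats a short read as "no usable reply".
    $head = fread($fp, 12);
    if ($head === false || strlen($head) < 12) {
        return '000';
    }
    return preg_match('#^HTTP/1\.\d (\d{3})$#', $head, $m) ? $m[1] : '000';
}
```

The caller would close the connection straight after this returns, which is what keeps the received data per image so small.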

ForYou 08-16-2006 08:23 PM

Hello,

There is an error:

Database error in vBulletin 3.6.0:

Invalid SQL:
INSERT INTO imagestatus VALUES (NULL, 172959, 3498, 'http://www.dohaeye.com/lyrics/3'air%20elnass.jpg', '');

MySQL Error : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'air%20elnass.jpg', '')' at line 1
Error Number : 1064
Date : Wednesday, August 16th 2006 @ 09:21:55 PM
Script : http://www.dha.net/moda/admincp/misc...tus&cis=findem
Referrer : http://www.dha.net/moda/admincp/misc...k_image_status
IP Address : 213.6.1.100
Username : Admin
Classname : vb_database

bairy 08-16-2006 10:33 PM

The sql errors are because the image url isn't escaped. Silly oversight. I'll probably get an updated product out tomorrow along with a couple of other changes.
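
To illustrate why the unescaped quote breaks the query, here is a minimal sketch that builds the same INSERT with the URL escaped first. The helper name is hypothetical, and addslashes() is only a stand-in for the database layer's own escaping routine (within vBulletin that would be the database object's escape_string() method); it is not how the hack itself is written.

```php
<?php
// Hypothetical helper for illustration only. The column order follows the
// INSERT statements shown in the error reports above.
function isc_build_insert($postid, $userid, $imageurl)
{
    // Put a backslash before quotes so a ' inside the URL cannot end the
    // SQL string early. Real code should use the database layer's escaper.
    $safe = addslashes($imageurl);

    return "INSERT INTO imagestatus VALUES (NULL, " . (int) $postid . ", "
        . (int) $userid . ", '" . $safe . "', '')";
}
```

With the URL from the second error report, the quote in "fakemeit's" comes out as \' and the statement stays a single, valid string literal.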

EasyTarget 08-16-2006 10:58 PM

On a related note, what about scanning all posts for img tags, rehosting all remote images at a local location and editing the posts with the new URL so that you don't have to worry about the images going away?

bairy 08-17-2006 06:42 AM

Quote:

Originally Posted by rmxs
OK, I tried it and it works, but I get many links flagged with Unknown status.

Why does this happen?

Sorry. Missed this question.
Unknown generally means one of two things:
1. Server didn't reply within 15 seconds
2. Server didn't send a well-formed HTTP header back
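
A deliberately simplified sketch of how a stored code might be turned into one of the three descriptions. Only the 200 and 404/410 cases come from the first post; treating every other value as Unknown is an assumption made for this example, and the function name is hypothetical.

```php
<?php
// Hypothetical helper for illustration only.
function isc_describe_status($code)
{
    $code = (int) $code;
    if ($code === 200) {
        return 'Working';
    }
    if ($code === 404 || $code === 410) {
        return 'Dead';
    }
    // Timeouts, malformed replies and every other code end up here.
    return 'Unknown';
}
```

Keeping Unknown as the catch-all matches the reasoning in the first post that a temporary failure should not be reported as a dead image.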


Quote:

Originally Posted by EasyTarget
On a related note, what about scanning all posts for img tags, rehosting all remote images at a local location and editing the posts with the new URL so that you don't have to worry about the images going away?

Copyright issues. The idea is quite good though. I might include a way to allow you to manually do that.

rmxs 08-17-2006 02:11 PM

Thanks bairy

Mr Chad 08-17-2006 05:21 PM

Quote:

Originally Posted by bairy
Chad : Each image is requested one by one and only the first 12 characters of the response are read, as they are the ones with the status code in them. After that the connection is closed. In theory it will send about 200 bytes and receive 12 bytes per request. In practice I don't know exactly how web servers behave, but I suspect that once PHP has closed the connection the transfer will stop. So no, not much bandwidth.

ahh thanks that clears it up, good job coding it.

bairy 08-17-2006 06:32 PM

Updated to 1.01 to clear up the early bugs and improve a few things:

- Misc: Install code creates empty db table
- Misc: Corrected silly oversight to reduce db errors (escaping image urls)
- Scanner: Added options to maintenance section
- Scanner: Rewrote quite a bit of the code to work with the new options
- Browser: Added "you haven't scanned yet" warning if the table is missing (unlikely but best to be handled)
- Browser: isc_no_results template wasn't included in the 1.00 product for some reason. It is now and is used when there are no results
- Browser: Added a perpage, lower limit 5, upper limit 100. Outside these and it defaults to 30

Reupload imagestatuscheck.php. Reimport the product xml with overwrite set to yes.
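
The per-page rule from the changelog above (minimum 5, maximum 100, otherwise 30) can be sketched as a small function. The name and parameter layout are made up for this example.

```php
<?php
// Hypothetical helper for illustration only.
// Values inside [$min, $max] are kept; anything else falls back to $default.
function isc_perpage($requested, $min = 5, $max = 100, $default = 30)
{
    $n = (int) $requested;
    return ($n >= $min && $n <= $max) ? $n : $default;
}
```

Note that out-of-range values are not clamped to the nearest limit: asking for 500 per page gives 30, not 100, which is how the changelog describes it.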

Snatch 08-18-2006 05:17 AM

If I click on "Search/Filter" it blinks and then it shows me the start screen of imagestatuscheck.php but no results.

What is wrong?

GreeTz
Snatch

bairy 08-18-2006 07:21 AM

Have you run a scan first?
If so, how many images did it scan?

Snatch 08-18-2006 07:43 AM

LoL sorry, my fault.

GreeTz
Snatch

Snatch 08-18-2006 07:50 AM

2 more questions.

1:
Now I have run the process to find dead images.
But when I go to the .php file I get this error:
Code:

Database error in vBulletin 3.6.0:

Invalid SQL:

        SELECT
        i.id AS iid, i.postid, i.userid, i.imageurl, i.status,
        u.username,
        t.forumid, t.title AS threadtitle,
        f.title AS forumtitle
        FROM imagestatus i
        LEFT JOIN user u ON (i.userid = u.userid)
        LEFT JOIN post p ON (i.postid = p.postid)
        LEFT JOIN thread t ON (p.threadid = t.threadid)
        LEFT JOIN forum f ON (t.forumid = f.forumid)
        WHERE i.`status` IN (401,402,403,404,405,406,407,409,410,411,412,413,414,415,416,417,000,0,100,101,201,202,203,204,205,300,301,303,305,400,408,500,501,502,503,504,505)
        AND t.forumid NOT IN (0)
       
        ORDER BY u.username asc
        LIMIT 0, 30;

MySQL Error  : Got error 28 from storage engine
Error Number : 1030
Date         : Friday, August 18th 2006 @ 10:48:58 AM
Script       : http://www.celebritymarkt.de/imagestatuscheck.php
Referrer     :
IP Address   :
Username     :
Classname    : vb_database

Or is it the case that I can only use the PHP file once the scan has finished?
562,783 images remaining Muhahaha

2:
What does the text "duplicate / dealt with" after the image URL mean?
See attachment!
The first 2 pages are OK, but after that there is only "duplicate / dealt with".

GreeTz
Snatch

bairy 08-18-2006 09:54 AM

Error code 28 means no more space left. Either the hard drive ran out of space or your account hit its disk quota.
If you really have 562k images, and I believe you do, then that's not really a surprise, as the script creates a new table with all the images in it. I have 1300 images and the table takes up about 170 KB. Scaling that up gives a table size of probably 70 MB or so.
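
The scaling behind that estimate, written out as a sketch. It assumes table size grows linearly with the number of images, which ignores index overhead and differing URL lengths; the function name is hypothetical.

```php
<?php
// Hypothetical helper for illustration only.
// Scale a measured table size up to a larger image count.
function isc_estimate_table_kb($sample_kb, $sample_images, $total_images)
{
    return $sample_kb / $sample_images * $total_images;
}

// 170 KB for 1,300 images, scaled to 562,783 images.
$kb = isc_estimate_table_kb(170, 1300, 562783);
```

That works out to roughly 73,595 KB, a little over 70 MB.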

However it obviously managed to get some images in at least.

The duplicate/dealt with message comes up because:
Let's say you have one image and it's been linked in 2 posts. There's no point scanning the same image twice since one scan will tell us if it's valid. Therefore it's scanned once, and if the image comes up again it's counted as 'duplicate' or 'dealt with' (they mean the same thing in this case).
Another reason is if you resume a scan (not restart it), as it will already have scanned some of the images and they'll be classed as "dealt with".
If you have a lot of images saying that, then it could be because you're doing another scan but not from the start, or it could be related to the error 28, depending on what got inserted and what didn't.
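
The two cases described above (a repeat within the same scan, and a URL already handled before a resume) can be sketched with one lookup table. The function name, the row layout and the 'scan' label are assumptions for this example, not the hack's own code.

```php
<?php
// Hypothetical helper for illustration only.
// $rows: list of array('postid' => ..., 'imageurl' => ...).
// $done: URLs already checked by an earlier, interrupted run.
// Adds an 'action' key to each row saying whether it needs a request.
function isc_label_rows(array $rows, array $done = array())
{
    $seen = array_flip($done); // url => index, used only for isset() lookups
    $out = array();
    foreach ($rows as $row) {
        $url = $row['imageurl'];
        if (isset($seen[$url])) {
            $row['action'] = 'duplicate / dealt with';
        } else {
            $row['action'] = 'scan';
            $seen[$url] = true; // any later repeat of this URL is skipped
        }
        $out[] = $row;
    }
    return $out;
}
```

On a fresh scan only repeats get the "duplicate / dealt with" label; on a resumed scan everything checked before the interruption gets it too, which is why a resumed run can show page after page of that message.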

osso12 10-26-2006 04:04 AM

Does this work with VB 3.6.2?
If so, every time I run the scanner and then open statuscheck.php,
I get:
You haven't run the scanner yet. You will find it in the Admin Control Panel under Maintainance -> Update Counters, at the bottom.
Non-admins don't see this message.
Tried a hundred times, but keeps doing the same thing.:down:

image status checker in vb options: 5,6,7
I need to get this to work.
Please someone help.

bchertov 12-18-2006 03:32 PM

{I first posted my query in the URL checker thread}

Hi,

I have a custom HTML Daily Digest that includes images that are inserted using {IMG} tags. I want to prevent images from forcing the Digest to be too wide because they are over 750 pixels wide. I can resize an image in the digest if I know it is too wide. So I'm looking for some code that will tell me how wide an {IMG} is. Can this hack help me? Can you help me?

Thanks!
Barry

bairy 12-18-2006 04:47 PM

Ahhh now I see.

I've just realised that basically, no.
I think that in order to get the dimensions of an image, the server would have to fully download it and then analyse it as the information isn't included in the http headers. That would drain the destination server's bandwidth and take a lot longer.

My only real suggestion is to load up the images you want to include in a web browser, right click them and click properties, and see the dimensions there.
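
One caveat on the point about downloading: the dimensions are indeed not in the HTTP headers, but some image formats store them near the very start of the file, so a partial read can be enough in those cases. As a hedged sketch for PNG only, where the width and height sit in the first 24 bytes (the function name is hypothetical, and other formats such as JPEG need more work):

```php
<?php
// Hypothetical helper for illustration only.
// Takes the first bytes of a file and returns array(width, height) for a
// PNG, or null if the bytes do not look like the start of a PNG.
function isc_png_size($bytes)
{
    if (strlen($bytes) < 24
        || substr($bytes, 0, 8) !== "\x89PNG\r\n\x1a\n" // PNG signature
        || substr($bytes, 12, 4) !== 'IHDR') {          // first chunk type
        return null;
    }
    // Width and height are two big-endian 32-bit integers at offset 16.
    $dim = unpack('Nwidth/Nheight', substr($bytes, 16, 8));
    return array($dim['width'], $dim['height']);
}
```

The bytes still have to be fetched from the remote server first, so this reduces the transfer rather than removing it.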

bchertov 12-19-2006 03:18 PM

Quote:

Originally Posted by bairy (Post 1141065)
My only real suggestion is to load up the images you want to include in a web browser, right click them and click properties, and see the dimensions there.

Thanks, but I was trying to find some automated way of doing this. I guess I'll check the image resizing hacks to see how they do it. Thanks anyway.

mauro1947 12-19-2006 03:19 PM

Hi!
Does this mod work on vBulletin 3.6.4?
Thanks
Bye!

bairy 12-19-2006 03:45 PM

It works in 3.6.0 and doesn't rely on much vb code, so I would say yes it'll be fine in 3.6.4

Hornstar 12-19-2006 11:49 PM

Nice work, this looks like something I would need as I have lots of images.

Bounce 12-20-2006 10:37 PM

Quote:

Originally Posted by bairy (Post 1141654)
It works in 3.6.0 and doesn't rely on much vb code, so I would say yes it'll be fine in 3.6.4

All I get is no images found :(

3.6.4

Run the scan in maintenance: There are a total of 8,057 images.

my link above has been run

I've removed 2 images to try it :(

bairy 12-21-2006 07:41 AM

Did you actually run the scan (looks like screenshot 1 in the first post), or just get as far as the image count screen?

Bounce 12-21-2006 12:56 PM

Quote:

Originally Posted by bairy (Post 1142683)
Did you actually run the scan (looks like screenshot 1 in the first post), or just get as far as the image count screen?

I ran the scan in admincp/maintenance: There are a total of 8,057 images.

But when I went to the link, no images were found?

Thanks

bairy 12-23-2006 08:31 AM

Hmm,
Could you go into "Execute SQL Query", just underneath "Update Counters" and run:
Code:

select * from imagestatus
On the next page it'll say Results: x

I'm interested in what number x is.

Bounce 12-23-2006 10:30 AM

Quote:

Originally Posted by bairy (Post 1143895)
On the next page it'll say Results: x

I'm interested in what number x is.

Results: 13,314 (0.0064s), Page 1 of 666

All at Status 000

If this is any use to you :)

bairy 12-23-2006 12:55 PM

Ah I believe it's because I missed something that throws the error message even when the table exists. Do you have a table prefix?

I'm not sure why they're all status 000, we'll deal with that after.

Bounce 12-24-2006 04:59 PM

Quote:

Originally Posted by bairy (Post 1143967)
Ah I believe it's because I missed something that throws the error message even when the table exists. Do you have a table prefix?

Not that I know of
Code:

//        ****** TABLE PREFIX ******
//        Prefix that your vBulletin tables have in the database.
$config['Database']['tableprefix'] = '';

Should I have one?

