Version: Unknown, by blackvborg
Developer Last Online: Nov 2008
Released: 02-03-2008
Hello,
how do you handle the replication of images/attachments among multiple servers to achieve HA/LB?
If you have multiple frontend web servers for your forum, you have the problem of "sharing" attachments. Using rsync could be a good solution at the beginning, but it's not efficient with hundreds of thousands of files; moreover, some web servers don't have the new images until the replication is complete.
I've read that some of you use a dedicated attachment server, but that's a single point of failure, and for load balancing it has the same problem mentioned above (the lag before the sync).
Any ideas? I think a SAN/NAS is not a very affordable solution, and NFS has other drawbacks.
Thank you for sharing your solutions!
Here's the route we took with our multi-webserver setup:
1) Initially we tried rsync, but found that it sort of sucked for users: when they'd upload an avatar or attachment, the link would be broken for some people until the rsync happened (even one minute of lag is too long, IMO).
2) We set up FASD, or whatever it's called, to monitor directories for changes and rsync based on activity. This worked OK, but it was somewhat prone to problems and still lagged a bit.
3) When we decided to add a bunch of webservers, we moved to a setup with an image server (running thttpd) that serves all the images and most of the static content (JS, etc.) on the site. This server also runs nfsd and exports our webroot to all of our webservers. The webservers mount the webroot via NFS, with rw access to avatars and attachments but only ro access to everything else.
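The directory-watcher approach in (2) can be sketched with inotify-tools as a stand-in for whatever daemon was actually used (hostnames and paths are hypothetical). Note that `close_write` fires when a file opened for writing is closed, and that newly created subdirectories are a known weak spot for this style of tool:

```shell
#!/bin/sh
# Hypothetical sketch of approach (2): watch the attachment tree and
# rsync on every change. Requires inotify-tools to be installed.
WATCH_DIR="/var/www/forum/attachments"
DEST="web2:/var/www/forum/attachments/"

watch_and_sync() {
    # -m: keep watching, -r: recurse, close_write/create: events that
    # indicate a finished upload or a new file/directory.
    inotifywait -m -r -e close_write -e create "$WATCH_DIR" |
    while read -r _path _event _file; do
        rsync -a "$WATCH_DIR/" "$DEST"
    done
}
```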
#3 has been working well for us for a couple of years now. Sure, it's a single point of failure, but it involves the least headache compared to some of the other solutions I've seen (hacking code so image uploads go to a specific server, etc.).
In the near future I'd like to start thinking about provisioning a better disaster-recovery system: the webservers would rsync a copy of the NFS export to their local disks, and a new virtualhost would let me, in the event of an image-server failure, point the images URL at our load balancer instead of the image server, move the local webroots into place, and have the webservers pick up the slack until the image server is back online.
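The export/mount split in (3) could look something like this (a sketch with hypothetical paths and subnet, not the poster's actual config): export the webroot read-only and the upload directories read-write as separate exports, then mount each accordingly on the webservers.

```shell
# On the image/NFS server, /etc/exports:
/var/www/forum              192.168.0.0/24(ro,sync,no_subtree_check)
/var/www/forum/attachments  192.168.0.0/24(rw,sync,no_subtree_check)
/var/www/forum/avatars      192.168.0.0/24(rw,sync,no_subtree_check)

# On each webserver, the corresponding /etc/fstab entries:
# imgserver:/var/www/forum              /var/www/forum              nfs ro,hard,intr 0 0
# imgserver:/var/www/forum/attachments  /var/www/forum/attachments  nfs rw,hard,intr 0 0
# imgserver:/var/www/forum/avatars      /var/www/forum/avatars      nfs rw,hard,intr 0 0
```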
Hello mute, thank you for your post. I've evaluated all your points, and the best (in theory) seems to be the 2nd, but it's not very reliable (e.g. the file-change notification fires when the write starts, not when it finishes; moreover, you get some headaches with the replication of new directories).
About NFS: yes, it's a SPOF. Do you have any figures on the performance of Apache running against a local filesystem vs. a network filesystem like NFS? In some benchmarks I've seen, the performance is much lower. For sure it's the easiest implementation.
About HA and NFS, I think you could rsync the NFS content to a second passive server (or just one of your webservers) and, in case of failure, switch the NFS server over to the backup.
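The passive-backup idea could be sketched as a cron pull on the standby plus a manual (or scripted) failover step. Everything here is hypothetical: hostnames, the floating service IP, and the interval are made-up examples of the approach, not a tested recipe:

```shell
# crontab on the backup server: pull the image server's tree
# every 15 minutes so the standby copy stays reasonably fresh.
*/15 * * * * rsync -a --delete imgserver:/var/www/forum/ /var/www/forum/

# On image-server failure, promote the backup, e.g.:
# ip addr add 192.168.0.50/24 dev eth0   # take over the service IP
# exportfs -ra                           # re-export the tree from the backup
# (webservers then remount, losing at most one sync interval of uploads)
```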
BTW, I don't like any of them.
For Loco.M: I'd replicate attachments because I need to have more webservers.
For nexialys: I've not found much to date. I'm looking for realtime mirroring software, and haven't found much except network filesystems like Red Hat GFS (poor performance, in my case).
Any other advice?
Well, as far as NFS vs. locally hosted content goes, I'm sure there's a performance hit, but I don't have any hard numbers. One thing to remember is that if you're using a PHP opcode cache like XCache or APC, a lot of your dynamic content (the only thing we serve via Apache) is cached in memory, so the amount of traffic over NFS is actually much less than you'd expect.
I just use an NFS mount for the front-end servers and keep an rsync'ed copy of the mount on another server. My servers utilize RAID arrays, so while I have lost hard drives occasionally over the years, I've never lost an array or had any unexpected downtime due to hard drive failure, nor have I ever suffered any data loss. If there is a performance hit for setting things up this way, I haven't noticed it. Eric