Quote:
Originally Posted by dasdingansich
hi,
I'm trying to build replacement code for ebay.de affiliate links, and I got stuck finding the right regex
I need to extract the seller's name from a URL, which usually looks like this:
Code:
http://shop.ebay.de/merchant/user_name
works perfectly fine with:
Code:
http://shop\.ebay\.de/merchant/([\.\w_-]+)
then I can use the seller's name as $p1
however, sometimes ebay puts those "W0QQ" variables behind it, then it would look something like this:
Code:
http://shop.ebay.de/merchant/user_name_W0QQ_nkwZQQ_armrsZ1QQ_fromZQQ_mdoZ
I need "_W0QQ" and everything after that to be excluded from $p1
first I tried to create a break at the first instance of an underscore, but then I realized usernames can contain underscores as well.
thanks in advance!
|
It's tricky when the part you need to capture is written so that it can just as easily match the rest of the string. One thing you can do is check whether the underscore really needs to be allowed inside your grouping. If it doesn't, you can put an optional run of word characters right after the grouping to absorb the rest:
[\w_]*
If it is necessary, then try looking for the user name somewhere other than the URL string. It will likely appear somewhere in the HTML of the page, and you can write a regex that matches it there. In that case your grouping, the $p1 reference, will be in the Extraction regex instead.
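
One more option, different from the two suggestions above, is a lazy quantifier followed by an optional "_W0QQ" tail, which lets underscores stay in the name. Here is a rough sketch in Python. The function name extract_seller is made up for illustration. It assumes the URL is the whole input string and that no real user name contains "_W0QQ". Whether the regex engine of your replacement tool supports lazy quantifiers is also an assumption, so treat this as a starting point only.

```python
import re

# Lazy group: take as few name characters as possible, then allow an
# optional "_W0QQ..." tail before the end of the string.
# Assumption: the URL is the entire input string.
SELLER_RE = re.compile(
    r"http://shop\.ebay\.de/merchant/([.\w-]+?)(?:_W0QQ\S*)?$"
)


def extract_seller(url):
    """Return the seller name from a merchant URL, or None if no match."""
    match = SELLER_RE.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    plain = "http://shop.ebay.de/merchant/user_name"
    tailed = ("http://shop.ebay.de/merchant/"
              "user_name_W0QQ_nkwZQQ_armrsZ1QQ_fromZQQ_mdoZ")
    print(extract_seller(plain))
    print(extract_seller(tailed))
```

Both sample URLs from the question should give user_name here. Because the group is lazy, it stops at the first "_W0QQ" it finds, so a user name that itself contains that sequence would be cut short.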