regular expression syntax

The header filtering feature allows you to control what headers are passed from your browser to websites.

Postby sandhya » Tue Aug 03, 2004 4:51 pm

Perl regular expression syntax for building a header's Type and Value.

Metacharacters
Char Meaning
^ beginning of string
$ end of the string.
. any character except newline.
* match 0 or more times
+ match 1 or more times
? match 0 or 1 times; or: shortest match
| alternative
( ) grouping; "storing"
[ ] set of characters.
{ } repetition modifier
\ quote or special.
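
A minimal sketch of the metacharacters above, written with Python's re module because its syntax is close to Perl's for these constructs. The header names and the example.com URL are made-up test data, not something the filter requires.

```python
import re

# ^ ties the match to the start of the string, $ to the end.
starts_with_user = re.search(r"^User", "User-Agent") is not None
ends_with_agent = re.search(r"Agent$", "User-Agent") is not None

# . matches any single character except a newline.
dot_matches = re.fullmatch(r"a.c", "a-c") is not None
dot_rejects_newline = re.fullmatch(r"a.c", "a\nc") is None

# | chooses between alternatives; ( ) groups and stores what matched.
m = re.fullmatch(r"(Accept|Referer): (.*)", "Referer: http://example.com/")
header_type, header_value = m.group(1), m.group(2)

# \ quotes a metacharacter so that it is taken literally.
escaped_dot_is_literal = re.search(r"example\.com", "exampleXcom") is None
bare_dot_is_wildcard = re.search(r"example.com", "exampleXcom") is not None
```

Here header_type ends up as "Referer" and header_value as "http://example.com/".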

Repetition
a* zero or more a's
a+ one or more a's
a? zero or one a's (i.e., optional a)
a{m} exactly m a's
a{m,} at least m a's
a{m,n} at least m but at most n a's
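
The repetition rules can be checked with a short script. This is only a sketch in Python's re module; the helper name matches() is made up for the illustration.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

zero_or_more = [matches(r"a*", s) for s in ("", "a", "aaa")]
one_or_more = [matches(r"a+", s) for s in ("", "a", "aaa")]
optional = [matches(r"a?", s) for s in ("", "a", "aa")]
exactly_two = [matches(r"a{2}", s) for s in ("a", "aa", "aaa")]
at_least_two = [matches(r"a{2,}", s) for s in ("a", "aa", "aaaa")]
two_to_three = [matches(r"a{2,3}", s) for s in ("a", "aaa", "aaaa")]

# A ? placed after another quantifier asks for the shortest match.
greedy = re.match(r"<.+>", "<b><i>").group()
shortest = re.match(r"<.+?>", "<b><i>").group()
```

With these inputs, greedy is "<b><i>" while shortest is "<b>".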

Special notations with \
Single characters
\t tab
\n newline
\r return
\xhh character with hex code hh
\b “word” boundary
\B not a “word” boundary
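
A small sketch of the backslash notations, again in Python's re module as a stand-in for Perl. The sample strings are invented for the example.

```python
import re

# \t and \xhh stand for single characters (0x41 is the letter A).
tab_found = re.search(r"\t", "name\tvalue") is not None
hex_matches_A = re.fullmatch(r"\x41", "A") is not None

# \b matches only at the edge of a word, so "cat" inside
# "concat" or "cats" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

# \B matches only where there is no word edge, so this finds
# "cat" only when a word character comes right before it.
inside_words = re.findall(r"\Bcat", "cat concat cats")
```

whole_words holds two hits (the first and the last word), and inside_words holds one hit (from "concat").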

matching
\w matches any single character classified as a "word" character (alphanumeric or _)
\W matches any non-"word" character
\s matches any whitespace character (space, tab, newline)
\S matches any non-whitespace character
\d matches any digit character
\D matches any non-digit character
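
The shorthand classes can be tried on a sample header line. This is a sketch in Python's re module; the header text is made up, and Python's \w and \d also accept non-ASCII letters and digits by default, which may differ from the filter's engine.

```python
import re

sample = "Keep-Alive: 300\t(seconds)"

words = re.findall(r"\w+", sample)        # runs of word characters
digits = re.findall(r"\d+", sample)       # runs of digits
whitespace = re.findall(r"\s", sample)    # each whitespace character
non_space = re.findall(r"\S+", sample)    # runs of non-whitespace
non_word = re.findall(r"\W", sample)      # each non-word character
digits_only = re.sub(r"\D", "", sample)   # delete every non-digit
```

For this sample, words is ['Keep', 'Alive', '300', 'seconds'] and digits_only is '300'.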

Character sets specialities inside [..]
[characters] matches any of the characters in the sequence
[x-y] matches any of the characters from x to y (inclusively) in the ASCII code
[\-] matches the hyphen character -
[\n] matches the newline
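
A quick sketch of character sets in Python's re module. The last line also shows the negated form [^...], which matches any one character that is not in the set; the test strings are invented.

```python
import re

# Any one of the listed characters.
vowels = re.findall(r"[aeiou]", "header")

# Ranges: digits and hex letters in either case.
is_hex = re.fullmatch(r"[0-9a-fA-F]+", "1aF9") is not None

# An escaped hyphen is a literal hyphen, not a range.
hyphens_and_digits = re.findall(r"[\-0-9]", "a-1_b-2")

# \n keeps its meaning inside a set.
separators = re.findall(r"[\n,]", "a,b\nc")

# A leading ^ negates the set: one character that is NOT a digit.
non_digits = re.findall(r"[^0-9]", "a1b2")
```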

Examples:
^abc abc at the beginning of the string
abc$ abc at the end of the string
a|b either of a and b
\d\d any two decimal digits, such as 42; same as \d{2}
\w+ one or more "word" characters in a row
ab?c an a followed by an optional b followed by a c; that is, either abc or ac
ab+ an a followed by at least one b ("ab", "abbb", etc.)

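(\d+\.\d+\.\d+\.\d+) matches a dotted IP address (four groups of digits; the value of each group is not checked)
/^(http:\/\/)?([^\/]+)/i matches the optional http:// prefix and the host part of a URL, ignoring case
(\d{1,5}) matches a port number (1 to 5 digits)
/\bhomer\b/ matches the whole word "homer", as in: Do you think he can hit a homer?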
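
The last four patterns can be tried out as follows. This is only a sketch in Python's re module: the Perl /.../i delimiters become the re.IGNORECASE flag, the slashes no longer need escaping, and the host, address and port values are made up.

```python
import re

# IP address: captures the dotted group of digits.
ip = re.search(r"(\d+\.\d+\.\d+\.\d+)", "Host: 192.168.0.1:8080").group(1)

# The pattern only checks the shape, so out-of-range parts still pass.
loose_ip = re.fullmatch(r"(\d+\.\d+\.\d+\.\d+)", "999.1.1.1") is not None

# URL: group 1 is the optional scheme, group 2 is the host.
url = re.match(r"^(http://)?([^/]+)", "HTTP://www.example.com/index.html",
               re.IGNORECASE)
scheme, host = url.group(1), url.group(2)

# Port: 1 to 5 digits (the value is not limited to 65535).
port = re.fullmatch(r"(\d{1,5})", "8080").group(1)

# Whole word only: "homer" matches, "homers" does not.
homer = re.findall(r"\bhomer\b", "Do you think he can hit a homer?")
no_plural = re.search(r"\bhomer\b", "two homers") is None
```

Here ip is "192.168.0.1", host is "www.example.com" and port is "8080".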
sandhya
 
Posts: 29
Joined: Fri Jul 23, 2004 4:47 pm

Postby pygy » Fri Oct 31, 2008 9:28 pm

Could you help me?

I'm trying to filter all URLs that match the beginning of a regexp but don't match its end. Both strategies I've tried failed:

delicious.com/tags/[^(Nickname)] matches anything that ends with any of the characters included in "Nickname".

delicious.com/tags/[^Nn][^Ii][^Cc][^Kk][^Nn][^Aa][^Mm][^Ee] is slightly better, because it's position-sensitive, but for example "delicious.com/tags/xix" will match, because of the I in the second place.

Is there a way to match a string that doesn't contain a specific substring?
pygy
 
Posts: 6
Joined: Tue Oct 28, 2008 8:39 pm

Re: regular expression syntax

Postby sachin » Fri Oct 31, 2008 11:56 pm

Do you want to match URLs that contain - delicious.com/tags/ - followed by anything?
.* matches any character, any number of times.
So delicious\.com/tags/.* will match a string that contains the exact string 'delicious.com/tags/' followed by any character, any number of times.

Similarly, if you want to match delicious.com/tags/nick followed by anything, it would be delicious\.com/tags/nick.*
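
Both patterns can be checked against a few sample URLs. This is a sketch in Python's re module, and the URLs are invented for the test; search() is used because the pattern only has to occur somewhere in the string.

```python
import re

tags_prefix = re.compile(r"delicious\.com/tags/.*")
nick_prefix = re.compile(r"delicious\.com/tags/nick.*")

urls = [
    "http://delicious.com/tags/python",
    "http://delicious.com/tags/nickname",
    "http://delicious.com/help",
    "http://deliciousXcom/tags/python",  # fails: the dot is escaped
]

tags_hits = [u for u in urls if tags_prefix.search(u)]
nick_hits = [u for u in urls if nick_prefix.search(u)]
```

tags_hits contains the first two URLs, and nick_hits contains only the second.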
sachin
 

Postby pygy » Sun Nov 02, 2008 4:44 pm

I'd like to match all delicious/tags/nick except one.
pygy
 
Posts: 6
Joined: Tue Oct 28, 2008 8:39 pm
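
One possible way to get "everything under delicious.com/tags/ except one name" is a negative lookahead, (?!...), which is part of Perl regular expression syntax. Whether the header filter's engine accepts it is an assumption; nothing in this thread confirms it. The sketch below uses Python's re module, with "Nickname" as the excluded name from the earlier post and made-up URLs. The earlier attempts failed because [^...] is a character set that excludes single characters, not a whole word.

```python
import re

# Match the prefix only when it is NOT directly followed by "Nickname".
# re.IGNORECASE covers the Nn/Ii/... variants from the earlier attempt.
all_but_nick = re.compile(r"delicious\.com/tags/(?!Nickname)", re.IGNORECASE)

urls = [
    "delicious.com/tags/xix",
    "delicious.com/tags/Nick",
    "delicious.com/tags/Nickname",
    "delicious.com/tags/nickname/page2",
]

kept = [u for u in urls if all_but_nick.search(u)]
```

kept holds the first two URLs. Note that this form also excludes any tag that merely begins with "Nickname".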

