Friday, March 16, 2012

Please help me with regular expression

Hi

I need a correct regular expression for this pattern:
aaa;some text;abcdef;this is another text

-> semi-colon separated strings that contain only letters, numbers and spaces.
(= no special characters like * # , : % etc.)

But these should not match:
aaa;bbb;;ccc;ddd (two consecutive semicolons with nothing between them)
aaa;bbb; ;ccc;ddd (two consecutive semicolons with only spaces between them)
(the string between the semicolons must contain at least one letter or number)

Thanks much in advance.
In case something's unclear, I can give a more detailed explanation.
H.

Look at
http://virtual.park.uga.edu/humcomp/perl/regex2a.html

Jagdip

(For the lazy people who cannot be bothered to learn anything new and just want the answers)
\w+[;]

One or more word characters (letters, digits or the underscore) followed by a semicolon...

aaa;bbb;;ccc;ddd

will match:
aaa;
bbb;
ccc;

It will not match the empty item, or the ddd, because there's no semicolon at the end.
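To check the claim above, here is a small Python sketch. Python's `re` module is an assumption, since the thread does not say which regex flavor is in use. It lists every substring the pattern finds in the example, and also shows that `\w` accepts the underscore, which is more than "letters and numbers".

```python
import re

# The pattern suggested above: one or more word characters, then a semicolon.
pattern = re.compile(r"\w+[;]")

sample = "aaa;bbb;;ccc;ddd"

# findall returns every non-overlapping match, scanning left to right.
pieces = pattern.findall(sample)
print(pieces)  # ['aaa;', 'bbb;', 'ccc;']

# \w is wider than "letters and numbers": it also matches the underscore.
underscore_pieces = pattern.findall("a_b;")
print(underscore_pieces)  # ['a_b;']
```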

Let me know if you have problems.

Thanks for replying, aikeith.

Your suggestion is not bad; however, it contains the following bugs:

- string aaa;bbb;ccc;;ddd;eee;ffff should NOT match, because it contains several consecutive semicolons (with nothing between them)

- string aaa,bbb;!@.,?#$%^&*;xxx should NOT match, because it contains several disapproved characters (commas, #, %, ...) - only letters and numbers and spaces are allowed.

- string aaa SHOULD match because it contains one correct word (no semicolons, yeah, but they're needed only when two or more words are to be separated)
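The three cases above come down to the difference between finding a match somewhere in the string and validating the whole string. A minimal Python sketch (the language and the helper name `found_somewhere` are assumptions for illustration) suggests that a search with `\w+[;]` reports a hit for both strings that should be rejected, and no hit for the single word:

```python
import re

# The suggested pattern, used as a "find anywhere" search.
pattern = re.compile(r"\w+[;]")

def found_somewhere(text):
    """Return True if the pattern matches anywhere inside text."""
    return pattern.search(text) is not None

print(found_somewhere("aaa;bbb;ccc;;ddd;eee;ffff"))  # True, although it should be rejected
print(found_somewhere("aaa,bbb;!@.,?#$%^&*;xxx"))    # True, although it should be rejected
print(found_somewhere("aaa"))                        # False, although it should be accepted
```

A pattern meant to validate the list has to describe the entire string from start to end, not just one item.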

Thanks for the help anyway. If you know how to fix the issues above, I'll be very grateful.
H.

I suppose I misunderstood your question.

I don't get this though:
"- string aaa SHOULD match because it contains one correct word (no semicolons, yeah, but they're needed only when two or more words are to be separated) "

so would this match?

aaa;bbb

I got this:

[[[a-zA-Z0-9]+[' ']*[a-zA-Z0-9]*]+[;]]+

Jagdip
Sorry, got it slightly wrong. Not sure if it will work, but this is what I meant

[[[a-zA-Z0-9]+[[' '][a-zA-Z0-9]+]*]+[;]]+

Jagdip

aaa;bbb should match.

You can imagine it like you have some list of items (words):

bread
milk
coke
chocolate bars
potatoes

and you write them in a single line, separated by semicolons:

bread;milk;coke;chocolate bars;potatoes

And now:

1) the list can contain only one word
(the string "bread" should match)

2) the list cannot contain empty items: (thus two consecutive semicolons are disapproved)
(the string "bread;milk;;chocolate bars" should NOT match)

3) the words in the list can only contain numbers, letters and spaces (like in "chocolate bars")
(the string "2 breads;27 eggs" should match but the string "bread ###;milk %%%" should NOT match)
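The three rules above can be sketched as a single whole-string pattern. This is only one possible reading, written in Python as an assumption (the thread does not name a regex flavor). The names `ITEM`, `LIST_RE` and `is_valid_list` are hypothetical, "letters" is taken to mean ASCII letters, and a trailing semicolon is treated as an empty last item and rejected, which the rules do not state explicitly.

```python
import re

# One item: optional leading spaces, then at least one letter or digit,
# then any mix of letters, digits and spaces.
ITEM = r" *[A-Za-z0-9][A-Za-z0-9 ]*"

# A list: one item, followed by zero or more ";item" groups.
LIST_RE = re.compile(rf"{ITEM}(?:;{ITEM})*")

def is_valid_list(text):
    """Return True if the whole string is a valid semicolon-separated list."""
    # fullmatch requires the pattern to cover the entire string.
    return LIST_RE.fullmatch(text) is not None

print(is_valid_list("bread"))                       # True
print(is_valid_list("bread;milk;;chocolate bars"))  # False
print(is_valid_list("2 breads;27 eggs"))            # True
print(is_valid_list("bread ###;milk %%%"))          # False
```

In flavors without a full-match call, the same idea would normally be written with `^` in front of the pattern and `$` after it.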

I hope I made it a little bit more clear.

Thanks for your interest, but I tried your expression on

aaa;bbb

and it didn't match (although it should)

(in fact, I failed to come up with any string that matches)
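A likely reason nothing matches: in Perl-style flavors, square brackets build character classes, not groups, so the nested brackets in the attempts above are not read as nesting. The sketch below shows how Python's `re` module reads the first attempt (Python is an assumption here, and other flavors may parse the brackets differently). The `grouped` pattern is a hypothetical rewrite with parentheses, based on a guess at the intent.

```python
import re
import warnings

attempt = r"[[[a-zA-Z0-9]+[' ']*[a-zA-Z0-9]*]+[;]]+"

with warnings.catch_warnings():
    # Python warns about the "[[" at the start of a set; silence it for the demo.
    warnings.simplefilter("ignore", FutureWarning)
    bracketed = re.compile(attempt)

# Each stray "]" is read as a literal character, so the pattern demands
# real "]" characters in the text.
print(bracketed.search("aaa;bbb"))                # None
print(bracketed.fullmatch("aaa];]") is not None)  # True

# Hypothetical rewrite using non-capturing groups instead of brackets.
grouped = re.compile(r"(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*;)+")
print(grouped.fullmatch("aaa;bbb;") is not None)  # True
print(grouped.fullmatch("aaa;bbb"))               # None: every item must end in ";"
```

Even with real groups, this shape requires a semicolon after every item, so a list like aaa;bbb, or a single word, still would not match as a whole.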

I admit I'm not very good at explaining things, so maybe you didn't understand me correctly. If you are still interested, please look at my post above, where I tried to explain more clearly.

Thanks anyway, H.
