|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Wed Jul 30, 2008 4:15 pm
[2.34] Nested alternations in string lists don't work for regexes |
If you execute the following code:
Code: |
tmp=a
#additem tmp {(?:b|c)}
#REGEX {^(%%string(@tmp))} {#SH fired - %1}
#SH a
#SH b
#SH ~"b~" |
you will see
Code: |
a
fired - a
b
"b"
fired - "b" |
This is because %string(@tmp) puts literal quotes around the item containing the nested alternation:
Code: |
#SH %string(@tmp)
a|"(?:b|c)" |
Granted, the above RE can be done differently, but in zMUD it worked, and I sometimes used REs with such constructs (such as "on (?:his|her|its) phone") |
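To show the effect outside CMUD, here is a minimal Python sketch. It assumes Python's `re` module treats these particular constructs the same way CMUD's regex engine does, and the helper name `fired` is made up for illustration. It compares the pattern as expanded above (with literal quotes) against the pattern that was intended.

```python
import re

# Pattern as expanded through %string(@tmp): the quotes are literal characters.
quoted = re.compile(r'^(a|"(?:b|c)")')
# Pattern that was intended: the nested group with no surrounding quotes.
intended = re.compile(r'^(a|(?:b|c))')

def fired(pattern, line):
    """Return the captured text (the trigger's %1), or None if there is no match."""
    m = pattern.search(line)
    return m.group(1) if m else None

# Same three test lines as the #SH commands in the post.
for line in ['a', 'b', '"b"']:
    print(repr(line), 'quoted:', fired(quoted, line), 'intended:', fired(intended, line))
```

With the quoted pattern, a bare b cannot match because the only alternatives are a and the quoted group, which is the behaviour reported in the output above.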
|
|
|
Guinn Wizard
Joined: 03 Mar 2001 Posts: 1127 Location: London
|
Posted: Wed Jul 30, 2008 4:28 pm |
You could get around it using %concat
tmp = %concat(@tmp,|(?:b|c)) |
|
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Wed Jul 30, 2008 4:46 pm |
The above doesn't quite work because I need to use the variable in an RE, and I have it as a string list.
The point is that I use a string list as an aid to understanding the RE. The reason is that it's horribly complex as is... I suppose I could use a string instead, but the UI for the stringlist is much easier to manage.
And if you're curious, an RE to detect almost any non-emit MUSH pose is:
Code: |
^(((@Chat/PrefixPlace)\b['\w\s\d]*?,|@Chat/PrefixOther)\s*)?\[?(@Chat/wholist|@Chat/whoAlso)([,']|'s)?\s[\w'"].* |
When I had the alternatives inline, it was really hard to understand where to add in new stuff. Now, it's pretty simple to just tailor the lists to match any new behavior I see on the MUSH. |
|
|
|
oldguy2 Wizard
Joined: 17 Jun 2006 Posts: 1201
|
Posted: Wed Jul 30, 2008 5:05 pm |
Why use "^(%%string(@tmp))"? Why not just make it "^(@tmp)"? It works just fine, so the title of this thread is false.
#show a
#show b
#show c
#show ~"b~"
a
fired - a
b
fired - b
c
fired - c
"b" |
|
|
|
Rahab Wizard
Joined: 22 Mar 2007 Posts: 2320
|
Posted: Wed Jul 30, 2008 5:09 pm |
If I'm understanding what you want to do with this, wouldn't the solution be to do:
Code: |
#additem tmp b
#additem tmp c
|
instead of
Code: |
#additem tmp {(?:b|c)}
|
|
|
|
|
oldguy2 Wizard
Joined: 17 Jun 2006 Posts: 1201
|
Posted: Wed Jul 30, 2008 5:21 pm |
No, I don't think so. That's completely different from the stringlist "a|(?:b|c)".
Look at it this way, pretend a, b, and c are all different triggers placed in a stringlist, which is why he said earlier "and I sometimes used REs with such constructs (such as "on (?:his|her|its) phone")".
Like this for example:
Code: |
<?xml version="1.0" encoding="ISO-8859-1" ?>
<cmud>
<var name="tmp" type="StringList" copy="yes">A man walks down the road|"A (?:cat|boy|dog) chases after the man"|"(?:Sam says hello|Joe tells the man good afternoon)"</var>
</cmud> |
Will work just fine for the trigger:
Code: |
<?xml version="1.0" encoding="ISO-8859-1" ?>
<cmud>
<trigger priority="20" regex="true" copy="yes">
<pattern>^(@tmp)\.</pattern>
<value>#show fired - %1</value>
</trigger>
</cmud> |
Output:
A man walks down the road.
fired - A man walks down the road
A cat chases after the man.
fired - A cat chases after the man
A dog chases after the man.
fired - A dog chases after the man
Joe tells the man good afternoon.
fired - Joe tells the man good afternoon
This actually was broken earlier, but Zugg fixed it in this version. So the claim that nested alternations in string lists don't work for regexes isn't true. |
|
Last edited by oldguy2 on Wed Jul 30, 2008 5:36 pm; edited 1 time in total |
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Wed Jul 30, 2008 5:22 pm |
What I really want is a string list that will turn into "Nearby|On (?:his|her|its) phone". Granted, I can use a string, or I could expand out the alternatives, but in actual use, there are greater than 10 alternatives in the string list, and the string list UI is good to keep it straight. (And it worked in zMUD).
As for oldguy2's question, when I use "^(@tmp)", I get a parse error as it tries to do the sorting that Zugg has mentioned elsewhere (but that I disagree with anyway).
FYI, without the %%string(), using the RE I mentioned ("Nearby..."), I fail to match "Nearby" as I should. And because of the sorting, I can't see what pattern it is actually trying (the compiled pattern tab just shows a var reference, not the result).
Edit: Ah, the test tab shows me that by using just (@tmp), the pattern is turning into "^(On (?:his|Nearby|her) phone)". |
|
|
|
oldguy2 Wizard
Joined: 17 Jun 2006 Posts: 1201
|
Posted: Wed Jul 30, 2008 5:43 pm |
Zhiroc, see my example above. I have every single trigger that triggers the same action in stringlists, and some are pretty complicated with nested alternations. Maybe you need to reload 2.34?
I get "^(On (?:his|her|its) phone|Nearby)"
Edit: Hrm...you know what, I didn't reload the newer 2.34 after he took out the mapping, but I don't think that did anything. So I have the first version of 2.34 on my system. Try reloading your version and see if it works. If not, then maybe whatever he did messed it up again? |
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Wed Jul 30, 2008 6:10 pm |
Hmm, I downloaded 2.34 last night, but something's wrong... Yeah, I better reinstall, just in case...
|
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Wed Jul 30, 2008 7:22 pm |
OK, yes, after reloading 2.34, the "^(@tmp)" regex works fine (other than the reordering). However, if I ever wanted to preserve the order of the stringlist, using %%string() as outlined in the other thread, it doesn't work, as quotes are inserted around the nested alternation.
|
|
|
|
oldguy2 Wizard
Joined: 17 Jun 2006 Posts: 1201
|
Posted: Wed Jul 30, 2008 7:53 pm |
So don't put it in a variable. Just put the whole thing in the trigger pattern?
|
|
|
|
Zugg MASTER
Joined: 25 Sep 2000 Posts: 23379 Location: Colorado, USA
|
Posted: Wed Jul 30, 2008 7:57 pm |
Quote: |
using %%string() as outlined in the other thread, it doesn't work, as quotes are inserted around the nested alternation |
Yes, I can see that as a problem. When an item in a string list has | characters in it, CMUD needs to put " quotes around the entire string item for the string list to work properly as a normal string value. Normally when expanding a @var that is a string list, CMUD removes these quotes (when it does the automatic sort). But since %%string bypasses the sort, the code that strips the quotes is also being bypassed. I'll see if I can improve that in the next version. |
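The quoting and unquoting described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not CMUD's actual code: the function names `to_string_list`, `split_string_list` and `expand_for_regex` are hypothetical, and the sketch assumes an item is quoted only when it contains a | character.

```python
def to_string_list(items):
    # Quote any item containing '|' so the stored value still splits into the right items.
    return '|'.join(f'"{item}"' if '|' in item else item for item in items)

def split_string_list(value):
    # Split on '|' only outside double quotes, dropping the protective quotes.
    items, current, in_quotes = [], [], False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == '|' and not in_quotes:
            items.append(''.join(current))
            current = []
        else:
            current.append(ch)
    items.append(''.join(current))
    return items

def expand_for_regex(value):
    # Rejoin the unquoted items as regex alternatives, preserving list order.
    return '|'.join(split_string_list(value))

stored = to_string_list(['Nearby', 'On (?:his|her|its) phone'])
print(stored)                    # stored form, with protective quotes
print(expand_for_regex(stored))  # form suitable for use inside a regex group
```

One limitation of this sketch is that it drops every double quote, so an item that needs a literal " in the pattern would have to be handled separately.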
|
|
|
Rahab Wizard
Joined: 22 Mar 2007 Posts: 2320
|
Posted: Thu Jul 31, 2008 12:38 pm |
This is slightly off-topic, but I'd like to understand. A couple of people have implied that {a|b|c} is not the same as {a|(?:b|c)}, but it looks to me that they are logically equivalent. Or did you just mean that the more general issue of nested alternations is a problem?
|
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Thu Jul 31, 2008 4:03 pm |
It is equivalent, I think (though technically the backtracking might be different, but I think even here it's the same).
However, this was a simple case. The actual case of interest was more complicated, and as soon as you add any wildcards or other text around the non-capture group, or make it a capture group, it's different. For example:
(a|(?:b|c)+)
(a|(?i:b|c))
(a|(b|c))
(a|x(?:b|c))
are all different from (a|b|c) |
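A quick way to check these differences is a short Python sketch. It assumes Python's `re` module handles these constructs the same way as the regex engine in CMUD (scoped flags such as `(?i:...)` need Python 3.6 or later), and the dictionary keys are just illustrative labels.

```python
import re

base = re.compile(r'(a|b|c)')
variants = {
    'repeat': re.compile(r'(a|(?:b|c)+)'),            # nested group may repeat
    'case_insensitive': re.compile(r'(a|(?i:b|c))'),  # only b and c ignore case
    'extra_capture': re.compile(r'(a|(b|c))'),        # adds a second capture group
    'prefix': re.compile(r'(a|x(?:b|c))'),            # b and c need a leading x
}

def matches(pattern, text):
    """True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None

print(matches(variants['repeat'], 'bcb'), matches(base, 'bcb'))
print(matches(variants['case_insensitive'], 'B'), matches(base, 'B'))
print(variants['extra_capture'].groups, base.groups)
print(matches(variants['prefix'], 'b'), matches(base, 'b'))
```

Each printed pair differs between the variant and the plain (a|b|c), while the simple (a|(?:b|c)) from the first post accepts exactly the same strings as (a|b|c).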
|
|
|
oldguy2 Wizard
Joined: 17 Jun 2006 Posts: 1201
|
Posted: Thu Jul 31, 2008 6:44 pm |
Yeah sorry Rahab. I said it wasn't the same because of what he was really referring to outside of the very simple example he posted. I guess he explained it above this post though. Sorry to confuse folks.
|
|
|
|
Rahab Wizard
Joined: 22 Mar 2007 Posts: 2320
|
Posted: Fri Aug 01, 2008 12:59 pm |
That's all right. Once I realized that Zhiroc was worried about the more general case, I knew that my simplistic answer was not useful. I just wanted to be sure my understanding of the regex was correct and that they were logically equivalent. Thanks!
|
|
|
|
|
|