Christina ✅ 🇨🇦 is a user on mastodon.sdf.org.
Christina ✅ 🇨🇦 @cosullivan

Dear regexperts:

What is the syntax for searching a word list for multiple patterns at once and retrieving the matches? For example: alphabetic words of a defined length in which e is the only vowel.

@cosullivan for #unix

awk '/^[b-hj-np-tvwz][b-hj-np-tvwz][b-hj-np-tvwz]$/{print$1}' ?

@saper Thank you. Removal of the first $ in your awk command worked to print out a list, but the vowels aiouy still appeared in the words matched.

@cosullivan The $ should not be removed. To match only a fixed length, repeat the [...] pattern once per character. Alternatively, add * after the last ] if your words can vary in length.
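
A minimal sketch of the fixed-length idea above, assuming a POSIX awk and a made-up five-word list in place of the real file. Two things here are additions, not part of the original command: the class is written with v-x so that x is included (the class [b-hj-np-tvwz] leaves it out), and a second pattern requires at least one e so that all-consonant entries such as 'wpm' drop out.

```shell
# Hypothetical sample list; substitute your own word file.
words='wed
wpm
hex
web
wreaked'

# Exactly three characters, each a consonant or e (the range b-h already
# contains e), anchored at both ends, and at least one e somewhere.
three=$(printf '%s\n' "$words" |
  awk '/^[b-hj-np-tv-xz][b-hj-np-tv-xz][b-hj-np-tv-xz]$/ && /e/ { print $1 }')
printf '%s\n' "$three"
```

With both anchors in place, 'wreaked' no longer matches on its first three letters.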

@saper the list of matches does not print when the first $ is included in the expression though. Also, the resulting matches include 'wreaked' (can't have any of aiouy) and 'wpm' (no e).

However, I can use the [b-hj-np-tvwz] and e in many combinations for matches, so that works.

@cosullivan @saper something like

($0 ~ /...../) && ($0 ~ /[A-DF-Za-df-z]/) {print}

The first checks length, the second checks content. Or, seemingly perverse,

$0 ~ /[^e][^e][^e][^e][^e]/ {print}

@cosullivan @saper er,

[^aiou]

I never get it right the first time. Usually not the second time, either.
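
For what it's worth, here is one way the two-test idea might look once the corrections are folded in, sketched against a made-up word list. length($0) stands in for the /...../ test, which on its own also matches longer lines. The remaining tests keep only lowercase alphabetic lines, reject any line containing a, i, o, u, or y, and require at least one e. Treating y as a vowel is an assumption carried over from the rest of the thread.

```shell
# Hypothetical sample list; substitute your own word file.
words='these
there
stone
crwth
severe
Steve'

# Length test, alphabetic test, "no other vowel" test, "has an e" test.
five=$(printf '%s\n' "$words" |
  awk 'length($0) == 5 && /^[a-z]+$/ && !/[aiouy]/ && /e/ { print }')
printf '%s\n' "$five"
```

Changing the 5 is the only edit needed to search for another word length.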

@dbucklin @saper thanks to both of you I think I have an inelegant, multiple pattern match expression model: awk '(length($1) == 7) && (/[^aiouy]e[^aiouy]e[^aiouy]e/) { print $1 }' efile.txt
this list returns words like revere, bedene, decede...

this isn't for anything anyone else has to look at -- it's just for me.
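
One possible explanation for six-letter words passing a length test of 7, and for the earlier $ anchor matching nothing, is a word list with Windows line endings: awk then counts the trailing carriage return as part of $1. This is a guess about efile.txt, not something the thread confirms. A small sketch with a made-up list that shows the effect and strips the carriage return before testing:

```shell
# Made-up input with CRLF line endings, standing in for efile.txt.
# Without cleanup, the carriage return is counted as a seventh character.
raw_len=$(printf 'revere\r\n' | awk '{ print length($1) }')
printf '%s\n' "$raw_len"

# Strip a trailing carriage return first; then the anchors and the
# length test both describe the visible word.
crlf_hits=$(printf 'revere\r\nbedene\r\n' |
  awk '{ sub(/\r$/, "") }
       length($1) == 6 && /^[^aiouy]e[^aiouy]e[^aiouy]e$/ { print $1 }')
printf '%s\n' "$crlf_hits"
```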

@cosullivan @dbucklin so you want the second, fourth, etc. characters to always be "e"?

@saper
No, I wanted an awk command for two pattern-match expressions I could adapt.

In the example below, I can vary the word length by changing the digit after == in the first expression. In the second, [^aiouy] matches any consonant or an e; I place those, along with the known consonants and e's, between the (/ /) to get a list of possible matches for a crossword puzzle.

awk '(length($1) == 6) && (/s[^aiouy]ee[^aiouy]e/) { print $1 }' efile.txt

@dbucklin
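
A sketch of how that adaptable command might be parameterized so the length and the letter pattern are changed in one place each. The variable names len and pat and the sample words are made up for illustration; the pattern is anchored with ^ and $ here, which the original command does not do.

```shell
# Hypothetical sample list; substitute your own word file.
words='sleeve
sneeze
breeze
sleeves'

# len is the word length; pat holds the known letters, with [^aiouy]
# wherever the letter is unknown (any consonant or an e).
hits=$(printf '%s\n' "$words" |
  awk -v len=6 -v pat='^s[^aiouy]ee[^aiouy]e$' \
      'length($1) == len && $1 ~ pat { print $1 }')
printf '%s\n' "$hits"
```

For a different crossword slot, only the values after -v need to change.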