New (?) idea for shell scripting

New (?) idea for shell scripting - pranomostro - 01-09-2015

Hey nixers,
this is going to be a long post, just to prepare you.

I recently had an idea for a new way to use a shell when scripting. The idea itself is pretty basic: a pipeline reads from an input file and appends its output to that same file.

The most basic example is a file named 'a' that starts out containing a single line. To make this file grow without bound, just write
tail -f a >>a

Ok, this isn't very exciting, so here is another way to illustrate the idea. Suppose you want to find out how vulnerable a checksum is to collisions. You write the checksum of anything into a file, for example
md5sum | sed 's/  -$//' >sums
and then you start a little shell script with this content:

#!/usr/bin/env rc
tail -f -n1 sums | ./unbuf 'md5sum | sed -u ''s/  -$//''' >>sums

The content of unbuf is:

#!/usr/bin/env rc
while(read | eval $"*) {}

Even if you don't know rc, this is fairly easy to understand: tail follows sums starting at its last line, md5sum generates the next checksum from that line, and the result is appended to the file. unbuf makes sure that each checksum is generated from exactly one input line. Of course, you still have to check for duplicates, with
sort sums | uniq -d
but this is a nice way of finding recurring checksums (I have only tried md5sum so far, and after 10,000,000 lines there was no duplicate, so this is fairly good).
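
To see what the ring computes without leaving a tail -f running, here is a minimal bounded sketch in plain sh. It is not the ring itself: an ordinary loop re-reads the last line on every pass. The helper name chain_sums and the scratch file are made up for illustration, and md5sum and mktemp are assumed to be available.

```shell
#!/bin/sh
# chain_sums FILE N (hypothetical helper): append N checksums to FILE,
# each one the md5 of the current last line of FILE.
chain_sums() {
    file=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        # md5sum prints only after tail has finished reading, so appending
        # to the file that is being read is safe here.
        tail -n 1 "$file" | md5sum | sed 's/  -$//' >>"$file"
        i=$((i + 1))
    done
}

sums=$(mktemp)                                  # scratch file for the demo
printf 'seed\n' | md5sum | sed 's/  -$//' >"$sums"
chain_sums "$sums" 5                            # file now holds 6 checksums
sort "$sums" | uniq -d                          # prints only repeated checksums
rm -f "$sums"
```

This is much slower than the ring because it starts three processes per checksum, but it stops by itself after N steps.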

My next example is the reason why I thought of this in the beginning.
In his book »Gödel Escher Bach«, Douglas R. Hofstadter describes a logical system called MU.
In this system, you start with one or more sentences containing the letters m, i and u, for example mui or uuimmuiui. There are 4 rules for modifying sentences:
1: You can append u to a sentence that ends with i (mui->muiu)
2: If a string begins with m, you can duplicate everything after the m (muu->muuuu)
3: You can substitute iii by u (miiiu->muu)
4: You can leave out uu (muuiuu->muui)
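
The four rules can be tried out by hand with a rough sketch like the following. The function names rule1 to rule4 are made up for illustration. Each function rewrites only the first place where its rule applies (rules 3 and 4 can apply at several positions in one sentence) and prints nothing when the rule does not apply.

```shell
#!/bin/sh
# Each function takes one sentence and prints the rewritten sentence,
# or nothing at all if the rule does not apply (sed -n ... p).
rule1() { printf '%s\n' "$1" | sed -n 's/i$/iu/p'; }             # xi -> xiu
rule2() { printf '%s\n' "$1" | sed -n 's/^m\(..*\)$/m\1\1/p'; }  # mx -> mxx
rule3() { printf '%s\n' "$1" | sed -n 's/iii/u/p'; }             # first iii -> u
rule4() { printf '%s\n' "$1" | sed -n 's/uu//p'; }               # drop first uu

rule1 mui      # muiu
rule2 muu      # muuuu
rule3 miiiu    # muu
rule4 muuiuu   # miuu
rule1 mu       # no output: mu does not end in i
```

Note that rule 4 on muuiuu gives miuu here, while the example in the list drops the other uu and gets muui. Both are legal moves.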

So I wrote a little script that generates mu expressions by reading from an input file and appending to it:

#!/usr/bin/env rc
tail -f mu | ./apr | grep --line-buffered -E '^.{,80}$' | uu >>mu

uu is a lua script that prints a line if it wasn't already in the input (unsorted uniq):
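#!/usr/bin/env lua
io.stdout:setvbuf("line")  -- flush after every line so the ring keeps moving
-- tab remembers every line that has already been printed
local tab={}
for c in io.lines() do
        if tab[c]==nil then
                tab[c]=true
                print(c)
        end
end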

apr is another lua script where, for each input line, each mu rule is applied zero or one times (apply rules):
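#!/usr/bin/env lua

-- Each rule returns the rewritten string and the number of substitutions.
-- The patterns are anchored, so a rule rewrites at most one place per line.
function rule1(str)
        return string.gsub(str, "^(%a+i)$", "%1u")
end

function rule2(str)
        return str:gsub("^m(%a+)$", "m%1%1")
end

function rule3(str)
        return str:gsub("^(%a+)iii(%a+)$", "%1u%2")
end

function rule4(str)
        return str:gsub("^(%a+)uu(%a+)$", "%1%2")
end

io.stdout:setvbuf("line")  -- flush after every line so the ring keeps moving
local c=io.read()
while c~=nil do
        -- print the result of every rule that matched this line
        local a, b=rule1(c)
        if b~=0 then print(a) end
        a, b=rule2(c)
        if b~=0 then print(a) end
        a, b=rule3(c)
        if b~=0 then print(a) end
        a, b=rule4(c)
        if b~=0 then print(a) end
        c=io.read()
end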
(My lua fu is not very strong, so this script has a fundamental flaw).
If you put a starting mu expression into the file mu, it generates all (okay, not _all_) mu expressions that can be derived from it and are at most 80 characters long.

Of course, this idea has a few fundamental flaws. The script does not terminate when there are no input lines left; it just sits there waiting. One solution could be a program that quits (and closes the pipe) when there have been no input lines for a certain time. Another flaw is that input/output buffering is really bad in this case: when the programs do not print their results immediately, the pipeline never gets going. Additionally, not many unix tools support explicit line buffering; grep is one exception, and for everything else you have to use unbuf.
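
The "quit after a quiet period" idea could look roughly like this. It is a sketch, not a proven fix for the ring: idle_cat is a made-up name, and it leans on the read -t builtin of bash, so bash is assumed to be installed even if the calling shell is something else.

```shell
#!/bin/sh
# idle_cat SECONDS (hypothetical helper): copy stdin to stdout line by line
# and exit once no complete line has arrived for SECONDS seconds (or at EOF).
idle_cat() {
    bash -c '
        while IFS= read -r -t "$1" line; do
            printf "%s\n" "$line"
        done
    ' idle_cat "$1"
}

# The second line shows up too late, so only "early" is printed.
{ echo early; sleep 2; echo late; } | idle_cat 1
```

Placed at the end of the ring, this stops the final append. Processes further up the pipeline notice the closed pipe only when they next try to write, so they may linger and need to be killed separately.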

Nevertheless, in my opinion this is a trick that can be very elegant in some ways (especially the mu example) and also quite helpful.
What do you think? Is this useless tinkering with shell syntax in your opinion or could this be used in "real software"? Tell me your opinion!

P.S: I would call it a ring, because it does not have an end or beginning, the data just flows and accumulates in a circular way.

RE: New (?) idea for shell scripting - venam - 01-09-2015

Sounds like recursive programming with the shell, very interesting.
This might not be very efficient due to all the IO but it's a nice "hack" I haven't seen before.
It could turn out helpful in situations where you only have access to a shell and not any other programming language interpreter/compiler.

Well done!

NB: You might have been able to use xargs instead of the "unbuf" script you wrote.
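
Something along these lines, perhaps. This is only a sketch: sum_each_line is a made-up name, the input is assumed to contain no quotes, backslashes or blanks (xargs would interpret those), and whether xargs hands lines over promptly enough for the ring is something to check.

```shell
#!/bin/sh
# sum_each_line (hypothetical helper): run md5sum once per input line,
# which is the job unbuf does in the original pipeline.
sum_each_line() {
    xargs -n 1 sh -c 'printf "%s\n" "$1" | md5sum | sed "s/  -\$//"' _
}

printf 'first\nsecond\n' | sum_each_line     # prints two checksums
```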

RE: New (?) idea for shell scripting - pranomostro - 01-09-2015

@venam: it's not exactly recursive, since no program is called from itself. And yes, it isn't that performant, but it is performant enough. For example, muring is quite fast, and this implementation is a lot simpler than anything I could imagine in C.
Ah, yeah, I'll try xargs instead of unbuf. Sounds far easier.
I just wanted to share this and hear your opinion about it.

RE: New (?) idea for shell scripting - z3bra - 02-09-2015

Buffered input is a huge pain to deal with. I had a discussion with some good programmers a while ago, and we came up with an "unbuf" binary that would force unbuffered input. It wouldn't work with every program though, because some explicitly reset it, but it was a nice program to have under the hood.

I'll try to find it again.

RE: New (?) idea for shell scripting - pranomostro - 02-09-2015

@z3bra: It would be great if you could find that one!

RE: New (?) idea for shell scripting - z3bra - 02-09-2015

God bless IRC logs:

RE: New (?) idea for shell scripting - pranomostro - 03-09-2015

Thanks, I'll use that :)

RE: New (?) idea for shell scripting - pranomostro - 09-09-2015

I found out that there is a coreutils utility, stdbuf.
It lets you set the buffering mode of a program's standard streams, for example:
stdbuf -i0 -o512 awk 1
runs awk with unbuffered input and a 512-byte output buffer.
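
For the ring, the line-buffered output mode (-oL) is probably the more useful setting. A small sketch, assuming GNU coreutils and a filter that uses C stdio for its output (stdbuf has no effect on programs that manage their own buffering). linebuf is a made-up wrapper name.

```shell
#!/bin/sh
# linebuf CMD [ARGS...] (hypothetical wrapper): run CMD with its stdout
# flushed after every line instead of in large blocks.
linebuf() {
    stdbuf -oL "$@"
}

# Without stdbuf, cut would collect its output in a block buffer when
# writing to a pipe; with -oL each line is passed on once it is complete.
printf 'mui x\nmuu y\n' | linebuf cut -d ' ' -f 1
```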