Cannot understand the pipe() in my own shell

This is the code I found for my own shell. It works fine, but I can't understand the pipe section of the code.

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <sys/wait.h>   /* declares wait() */

char* cmndtkn[256];
char buffer[256];
char* path=NULL;
char pwd[128];


int main(){

//setting path variable 
    char *env;
    env=getenv("PATH"); 
    putenv(env);
    system("clear");


printf("\t MY OWN SHELL !!!!!!!!!!\n ");
printf("_______________________________________\n\n");

while(1){

    fflush(stdin);
    getcwd(pwd,128);
    printf("[MOSH~%s]$",pwd);
    fgets(buffer,sizeof(buffer),stdin);
    buffer[sizeof(buffer)-1] = '\0';

    //tokenize the input command line   
    char* tkn = strtok(buffer," \t\n");
    int i=0;
    int indictr=0;



        // loop for every part of the command
        while(tkn!=NULL)
        {

            if(strcoll(tkn,"exit")==0 ){
                exit(0);                
            }

            else if(strcoll(buffer,"cd")==0){
            path = buffer;
            chdir(path+=3);
            }

            else if(strcoll(tkn,"|")==0){
            indictr=i;
            }

            cmndtkn[i++] = tkn;
            tkn = strtok(NULL," \t\n");
        }cmndtkn[i]=NULL;

// execute when command has pipe. when | command is found indictr is greater than 0.
    if(indictr>0){

        char* leftcmnd[indictr+1];
        char* rightcmnd[i-indictr];
        int a,b;

        for(b=0;b<indictr;b++)
            leftcmnd[b]=cmndtkn[b];
        leftcmnd[indictr]=NULL;

        for(a=0;a<i-indictr-1;a++)
            rightcmnd[a]=cmndtkn[a+indictr+1];
        rightcmnd[i-indictr-1]=NULL;   /* last valid index; i-indictr would be out of bounds */

        if(!fork())
        {   
            fflush(stdout);
            int pfds[2];
            pipe(pfds);

                if(!fork()){
                    close(1);
                    dup(pfds[1]);
                    close(pfds[0]);
                    execvp(leftcmnd[0],leftcmnd);
                }   
                else{
                    close(0);
                    dup(pfds[0]);
                    close(pfds[1]);
                    execvp(rightcmnd[0],rightcmnd);
                }
        }else
            wait(NULL);

//command not include pipe 

    }else{
        if(!fork()){
        fflush(stdout);
        execvp(cmndtkn[0],cmndtkn);

        }else
            wait(NULL);
    }

}

}

What is the purpose of the calls to close() with arguments 0 and 1, and what does the call to dup() do?

Answers


On Unix, the dup() call uses the lowest numbered unused file descriptor. So, the close(1) before the call to dup() is to coerce dup() to use file descriptor 1. Similarly for close(0).
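
The "lowest numbered unused descriptor" rule can be checked without touching stdout. The following is only a minimal sketch: the function name dup_reuses_lowest_free_fd is made up for illustration, and it assumes a single-threaded process so that no other code opens descriptors in between.

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if dup() reuses the descriptor number
   that was just closed, 0 if it does not, and -1 on error. */
int dup_reuses_lowest_free_fd(void)
{
    int pfds[2];
    if (pipe(pfds) == -1)
        return -1;

    int first = dup(pfds[0]);      /* lowest free number right now */
    if (first == -1) {
        close(pfds[0]);
        close(pfds[1]);
        return -1;
    }

    close(first);                  /* free that number again */
    int second = dup(pfds[1]);     /* should land on the same number */

    int same = (second == first);
    if (second != -1)
        close(second);
    close(pfds[0]);
    close(pfds[1]);
    return same;
}
```

In the shell code, close(1) plays the role of close(first): it frees number 1 so that the following dup() has to pick it, provided descriptor 0 is still open.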

So, the aliasing is to get the process to use the write end of the pipe for stdout (file descriptor 1 is used for console output), and the read end of the pipe for stdin (file descriptor 0 is used for console input).

The code may have been more clearly expressed with dup2() instead.

dup2(fd[1], 1); /* alias fd[1] to 1 */
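
To see the aliasing in isolation, here is a rough sketch that points file descriptor 1 at a pipe with dup2(), writes to descriptor 1, restores it, and reads the bytes back from the pipe. The name write_stdout_into_pipe is an assumption for illustration, and the message is assumed to be short enough to fit in the pipe's buffer.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes read back into out
   (NUL-terminated), or -1 on error. */
ssize_t write_stdout_into_pipe(const char *msg, char *out, size_t outsize)
{
    int pfds[2];
    if (outsize == 0 || pipe(pfds) == -1)
        return -1;

    int saved = dup(1);                     /* remember the real stdout */
    if (saved == -1 || dup2(pfds[1], 1) == -1) {
        if (saved != -1)
            close(saved);
        close(pfds[0]);
        close(pfds[1]);
        return -1;
    }
    close(pfds[1]);                         /* fd 1 is now the only write end */

    ssize_t w = write(1, msg, strlen(msg)); /* lands in the pipe, not on the console */

    dup2(saved, 1);                         /* restore stdout; this closes the write end */
    close(saved);

    ssize_t n = (w == -1) ? -1 : read(pfds[0], out, outsize - 1);
    close(pfds[0]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```

Unlike close(1) followed by dup(), dup2() names the target descriptor explicitly, so it does not depend on which descriptors happen to be free.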

Since you also asked how ls | sort works, your question is not limited to why the dup() system call is being made. You are really asking how pipes in Unix work, and how a shell command pipeline works.

A pipe in Unix is a pair of file descriptors that are related in that writing data on the writable descriptor allows that data to be read from the readable descriptor. The pipe() call returns this pair in an array, where the first array element is readable, and the second array element is writable.
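
A small sketch of that pairing, inside a single process: bytes written to element 1 come back out of element 0. The function name pipe_roundtrip is hypothetical, and the message is assumed to be non-empty and small enough to fit in the pipe's buffer.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes msg into a fresh pipe and reads it back.
   Returns the number of bytes read (out is NUL-terminated), or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsize)
{
    int pfds[2];
    size_t len = strlen(msg);
    if (outsize == 0 || len == 0 || pipe(pfds) == -1)
        return -1;

    ssize_t n = -1;
    if (write(pfds[1], msg, len) != -1)        /* element 1: writable end */
        n = read(pfds[0], out, outsize - 1);   /* element 0: readable end */

    close(pfds[0]);
    close(pfds[1]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```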

In Unix, a fork() followed by some kind of exec() is the traditional way to start a new program in a new process (there are other library calls, such as system() or popen(), that create processes, but they typically call fork() and do an exec() under the hood). A fork() produces a child process. The child process sees a return value of 0 from the call, while the parent sees a non-zero return value that is either the PID of the child process, or -1 indicating that an error has occurred.
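
The three possible return values of fork() can be sketched as follows. The function name child_exit_status is made up for this example; the child simply exits instead of calling exec(), to keep the sketch short.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code`; the parent
   waits and returns the child's exit status, or -1 on error. */
int child_exit_status(int code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;               /* fork failed: no child was created */
    if (pid == 0)
        _exit(code);             /* child: fork() returned 0 */

    int status;                  /* parent: pid is the child's PID */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```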

The child process is a duplicate of the parent. This means that when a child modifies a variable, it is modifying a copy of the variable that resides in its own process. The parent does not see the modification occur, as the parent has the original copy. However, a duplicated pair of file descriptors that form a pipe can be used to allow a child process and its parent to communicate with each other.
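
Both points can be illustrated with one sketch: the child changes its copy of a variable, and the only way the parent learns the new value is through the pipe. The function name fork_copy_demo and the values 1 and 42 are arbitrary choices for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: on success returns 0, stores the parent's own
   (unchanged) copy in *parent_copy and the value sent by the child in
   *from_child. Returns -1 on error. */
int fork_copy_demo(int *parent_copy, int *from_child)
{
    int value = 1;
    int pfds[2];
    if (pipe(pfds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(pfds[0]);
        close(pfds[1]);
        return -1;
    }
    if (pid == 0) {                        /* child */
        close(pfds[0]);                    /* child only writes */
        value = 42;                        /* changes the child's copy only */
        ssize_t w = write(pfds[1], &value, sizeof value);
        _exit(w == (ssize_t)sizeof value ? 0 : 1);
    }

    close(pfds[1]);                        /* parent only reads */
    ssize_t n = read(pfds[0], from_child, sizeof *from_child);
    close(pfds[0]);
    waitpid(pid, NULL, 0);

    *parent_copy = value;                  /* still 1 in the parent */
    return n == (ssize_t)sizeof *from_child ? 0 : -1;
}
```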

So, ls | sort means that there are two processes being spawned, and the output written by ls is being read as input by sort. Two processes means two calls to fork() to create two child processes. One child process will exec() the ls command, the other child process will exec() the sort command. A pipe is used between them to allow the processes to talk to each other. The ls process writes to the writable end of the pipe, the sort process reads from the readable end of the pipe.

The ls process is coerced into writing into the writable end of the pipe with the dup() call after issuing close(1). The sort process is coerced into reading the readable end of the pipe with the dup() call after close(0).
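
Putting the pieces together, a two-command pipeline might be sketched like this. It is not the code from the question: the function name run_pipeline is hypothetical, the two children are siblings rather than a child and a grandchild, and dup2() is used in place of close() followed by dup().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `left | right` and returns the exit status
   of `right`, or -1 on error. Both arrays must be NULL-terminated. */
int run_pipeline(char *const left[], char *const right[])
{
    int pfds[2];
    if (pipe(pfds) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(pfds[0]);
        close(pfds[1]);
        return -1;
    }
    if (lpid == 0) {                 /* first child: the writer (e.g. ls) */
        dup2(pfds[1], 1);            /* stdout now refers to the write end */
        close(pfds[0]);
        close(pfds[1]);
        execvp(left[0], left);
        _exit(127);                  /* only reached if exec failed */
    }

    pid_t rpid = fork();
    if (rpid == 0) {                 /* second child: the reader (e.g. sort) */
        dup2(pfds[0], 0);            /* stdin now refers to the read end */
        close(pfds[0]);
        close(pfds[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends too, or the reader never sees EOF. */
    close(pfds[0]);
    close(pfds[1]);

    int status = 0;
    waitpid(lpid, NULL, 0);
    if (rpid == -1 || waitpid(rpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with left set to {"ls", NULL} and right set to {"sort", NULL}, this would behave like ls | sort typed at a shell prompt, assuming both programs are on the PATH.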

In addition, the close() calls that close the pipe file descriptors are used to make sure that the ls process is the only process to have an open reference to the writable fd, and the sort process is the only process to have an open reference to the readable fd. That step is important because after ls exits, the writable end of the pipe is closed, and the sort process will expect to see an EOF as a result. However, this will not occur if some other process still has the writable fd open.
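
The EOF behaviour can be observed in a single process. In this sketch (the function name read_after_writer_closed is made up), one byte is written, the only write end is closed, and the second read() reports end of file by returning 0. If the write end were left open, that second read() would block instead.

```c
#include <unistd.h>

/* Hypothetical helper: returns what read() reports once the pipe has
   been drained and its only write end closed (0 means EOF), or -1 on error. */
int read_after_writer_closed(void)
{
    int pfds[2];
    char c;
    if (pipe(pfds) == -1)
        return -1;

    if (write(pfds[1], "x", 1) != 1) {
        close(pfds[0]);
        close(pfds[1]);
        return -1;
    }
    close(pfds[1]);                          /* no writers remain */

    ssize_t first = read(pfds[0], &c, 1);    /* the buffered byte */
    ssize_t second = read(pfds[0], &c, 1);   /* nothing left and no writers */
    close(pfds[0]);

    return (first == 1) ? (int)second : -1;
}
```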


http://en.wikipedia.org/wiki/Standard_streams#Standard_input_.28stdin.29

stdin is file descriptor 0.

stdout is file descriptor 1.

In the !fork section, the process closes stdout then calls dup on pfds[1] which according to:

http://linux.die.net/man/2/dup

Creates a duplicate of the specified file descriptor at the lowest available position, which will be 1, since it was just closed (and stdin hasn't been closed yet). This means everything sent to stdout will really go to pfds[1].

So, basically, it's setting up the two new processes to talk to each other. The !fork section is for the new child, which will send data to stdout (file descriptor 1). The parent (the else block) closes stdin and duplicates pfds[0] onto file descriptor 0, so it really reads from pfds[0] when it tries to read from stdin.

Each process has to close the file descriptor in pfds it's not using, as each end of the pipe is open in both processes once the process has forked. Each process then execs leftcmnd or rightcmnd, and the new stdin and stdout mappings remain in place across the exec.

Forking twice is explained here: Why fork() twice

