A C program similar to grep command in UNIX
By: Abinaya in C Tutorials on 2007-09-22
Let us design and write a program to print each line of its input that contains a particular ``pattern'' or string of characters. (This is a special case of the UNIX program grep.) For example, searching for the pattern of letters ``ould'' in the set of linesAh Love! could you and I with Fate conspire To grasp this sorry Scheme of Things entire, Would not we shatter it to bits -- and then Re-mould it nearer to the Heart's Desire!will produce the output
Ah Love! could you and I with Fate conspire Would not we shatter it to bits -- and then Re-mould it nearer to the Heart's Desire!The job falls neatly into three pieces:
while (there's another line) if (the line contains the pattern) print itAlthough it's certainly possible to put the code for all of this in main, a better way is to use the structure to advantage by making each part a separate function. Three small pieces are better to deal with than one big one, because irrelevant details can be buried in the functions, and the chance of unwanted interactions is minimized. And the pieces may even be useful in other programs.
``While there's another line'' is getline function, and ``print it'' is printf, which someone has already provided for us. This means we need only write a routine to decide whether the line contains an occurrence of the pattern.
We can solve that problem by writing a function strindex(s,t) that returns the position or index in the string s where the string t begins, or -1 if s does not contain t. Because C arrays begin at position zero, indexes will be zero or positive, and so a negative value like -1 is convenient for signaling failure. When we later need more sophisticated pattern matching, we only have to replace strindex; the rest of the code can remain the same. (The standard library provides a function strstr that is similar to strindex, except that it returns a pointer instead of an index.)
Given this much design, filling in the details of the program is straightforward. Here is the whole thing, so you can see how the pieces fit together. For now, the pattern to be searched for is a literal string, which is not the most general of mechanisms. We will return shortly to a discussion of how to initialize character arrays. There is also a slightly different version of getline;
#include <stdio.h> #define MAXLINE 1000 /* maximum input line length */ int getline(char line[], int max) int strindex(char source[], char searchfor[]); char pattern[] = "ould"; /* pattern to search for */ /* find all lines matching pattern */ main() { char line[MAXLINE]; int found = 0; while (getline(line, MAXLINE) > 0) if (strindex(line, pattern) >= 0) { printf("%s", line); found++; } return found; } /* getline: get line into s, return length */ int getline(char s[], int lim) { int c, i; i = 0; while (--lim > 0 && (c=getchar()) != EOF && c != '\n') s[i++] = c; if (c == '\n') s[i++] = c; s[i] = '\0'; return i; } /* strindex: return index of t in s, -1 if none */ int strindex(char s[], char t[]) { int i, j, k; for (i = 0; s[i] != '\0'; i++) { for (j=i, k=0; t[k]!='\0' && s[j]==t[k]; j++, k++) ; if (k > 0 && t[k] == '\0') return i; } return -1; }Each function definition has the form
return-type function-name(argument declarations) { declarations and statements }Various parts may be absent; a minimal function is
dummy() {}which does nothing and returns nothing. A do-nothing function like this is sometimes useful as a place holder during program development. If the return type is omitted, int is assumed.
A program is just a set of definitions of variables and functions. Communication between the functions is by arguments and values returned by the functions, and through external variables. The functions can occur in any order in the source file, and the source program can be split into multiple files, so long as no function is split.
The return statement is the mechanism for returning a value from the called function to its caller. Any expression can follow return:
return expression;The expression will be converted to the return type of the function if necessary. Parentheses are often used around the expression, but they are optional.
The calling function is free to ignore the returned value. Furthermore, there need to be no expression after return; in that case, no value is returned to the caller. Control also returns to the caller with no value when execution ``falls off the end'' of the function by reaching the closing right brace. It is not illegal, but probably a sign of trouble, if a function returns a value from one place and no value from another. In any case, if a function fails to return a value, its ``value'' is certain to be garbage.
The pattern-searching program returns a status from main, the number of matches found. This value is available for use by the environment that called the program
The mechanics of how to compile and load a C program that resides on multiple source files vary from one system to the next. On the UNIX system, for example, the cc command does the job. Suppose that the three functions are stored in three files called main.c, getline.c, and strindex.c. Then the command
cc main.c getline.c strindex.ccompiles the three files, placing the resulting object code in files main.o, getline.o, and strindex.o, then loads them all into an executable file called a.out. If there is an error, say in main.c, the file can be recompiled by itself and the result loaded with the previous object files, with the command
cc main.c getline.o strindex.oThe cc command uses the ``.c'' versus ``.o'' naming convention to distinguish source files from object files.
Add Comment
This policy contains information about your privacy. By posting, you are declaring that you understand this policy:
- Your name, rating, website address, town, country, state and comment will be publicly displayed if entered.
- Aside from the data entered into these form fields, other stored data about your comment will include:
- Your IP address (not displayed)
- The time/date of your submission (displayed)
- Your email address will not be shared. It is collected for only two reasons:
- Administrative purposes, should a need to contact you arise.
- To inform you of new comments, should you subscribe to receive notifications.
- A cookie may be set on your computer. This is used to remember your inputs. It will expire by itself.
This policy is subject to change at any time and without notice.
These terms and conditions contain rules about posting comments. By submitting a comment, you are declaring that you agree with these rules:
- Although the administrator will attempt to moderate comments, it is impossible for every comment to have been moderated at any given time.
- You acknowledge that all comments express the views and opinions of the original author and not those of the administrator.
- You agree not to post any material which is knowingly false, obscene, hateful, threatening, harassing or invasive of a person's privacy.
- The administrator has the right to edit, move or remove any comment for any reason and without notice.
Failure to comply with these rules may result in being banned from submitting further comments.
These terms and conditions are subject to change at any time and without notice.
- Data Science
- Android
- React Native
- AJAX
- ASP.net
- C
- C++
- C#
- Cocoa
- Cloud Computing
- HTML5
- Java
- Javascript
- JSF
- JSP
- J2ME
- Java Beans
- EJB
- JDBC
- Linux
- Mac OS X
- iPhone
- MySQL
- Office 365
- Perl
- PHP
- Python
- Ruby
- VB.net
- Hibernate
- Struts
- SAP
- Trends
- Tech Reviews
- WebServices
- XML
- Certification
- Interview
categories
Related Tutorials
Sum of the elements of an array in C
Printing a simple histogram in C
Find square and square root for a given number in C
Simple arithmetic calculations in C
Passing double value to a function in C
Passing pointer to a function in C
Infix to Prefix And Postfix in C
while, do while and for loops in C
Comments