Recursively Search and Replace Terms in Multiple Files with grep, xargs and sed

Posted by Hodge on Jan 13, 2009 in Environment, Linux, Ubuntu

I recently offered to update some simple information on a website for a friend. Normally that is an easy enough task, but although the original developer of the site generated it with PHP, they didn’t utilise a database, or even combine common text (such as the header and footer of an HTML template) into manageable files. As a result, instead of changing text in a single file and having it adopted site-wide, I was faced with (potentially) changing over 100 individual PHP files in a text editor! Not a pleasant task, if done manually…

Thankfully, Linux has dozens of command line utilities which can aid in just such a horribly monotonous task. In this case, I used grep, xargs and sed to undertake the task. I’ve written briefly before about grep, and how it can be used to recursively search files – extending the same concept, and throwing xargs and sed into the equation, it’s possible to recursively search files, and replace terms within those files. The command structure is as follows:

grep -lr -e '<searchterm>' * | xargs sed -i 's/<searchterm>/<targetterm>/g'

In short, the command finds every file containing <searchterm> and replaces all occurrences with <targetterm>. grep runs first and prints the name of each file that contains <searchterm>. Those file names are piped to xargs, which passes them to sed as arguments. sed then opens each named file and replaces every instance of <searchterm> with <targetterm>.
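
Since sed -i rewrites files without asking, it can be worth running the grep half on its own first to preview which files would be touched. The following is only a sketch, assuming GNU grep, xargs and sed; the scratch directory and file names are made up for illustration, and the -Z and -0 options are my additions rather than part of the command above (they keep file names containing spaces intact).

```shell
# Hypothetical scratch directory so nothing real is modified.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'old text\n' > "$demo/a.txt"
printf 'old text again\n' > "$demo/sub dir/b.txt"
printf 'nothing here\n' > "$demo/c.txt"

# Step 1: preview only. -l lists the names of matching files, -r recurses.
preview=$(grep -lr -e 'old' "$demo" | sort)
printf '%s\n' "$preview"

# Step 2: the same search, now feeding sed. -Z (grep) and -0 (xargs) are GNU
# options that separate names with NUL bytes, so "sub dir" is not split in two.
grep -lrZ -e 'old' "$demo" | xargs -0 sed -i 's/old/new/g'
```

If the preview lists a file you did not expect, you can adjust the search term before anything has been changed.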

A practical use can be changing all instances of “apples” to “oranges” in every html file contained in public_html and its sub-directories:

cd /home/username/public_html
grep -lr --include='*.html' -e 'apples' . | xargs sed -i 's/apples/oranges/g'

To break it down further:

  1. grep is asked to list anything containing “apples”. The -r option tells grep to do this recursively, through all sub-directories, and the -l option tells grep to only list the names of files containing “apples” (the default behaviour of grep is to output the file name along with every matching line). The -e option marks what follows as the search pattern, so a search term that begins with a hyphen is not mistaken for an option (options passed to Linux commands generally begin with a hyphen); grep treats the pattern as a regular expression by default, with or without -e. Finally, --include='*.html' (a GNU grep option) limits the search to HTML files, and . starts it in the current directory. A bare *.html would be expanded by the shell to the HTML files in the current directory only, so files in sub-directories would be missed. The pattern could of course be changed to '*.php' or '*.txt', for example.
  2. Each time grep finds “apples” within a file, it writes that file’s name to its output, which the pipe | feeds to xargs. xargs reads the names arriving on its input and appends them as arguments to the command it has been given, here sed. For example, grep finds one or more instances of “apples” in fruit.html, and simply outputs the file name fruit.html (not the contents of the file itself, or even the lines containing “apples”). “fruit.html” is then piped through to xargs.
  3. xargs takes the file names fed to it and runs sed with them as arguments, in this case “fruit.html”. sed is a stream editor: a command for filtering and altering text. sed is asked to look at fruit.html (named by grep and handed over by xargs), find any occurrences of “apples” and replace them with “oranges”. The -i option tells sed to do this in place, i.e. to modify fruit.html itself rather than print the result or create a new file. Everything within the single quotes is the expression sed works with. First, s is the substitute command, here replacing “apples” with “oranges”. The / is the delimiter, which separates the s command, the search term, the target term and any trailing flags. Finally, the g flag tells sed to make the change global within each line: every instance of <searchterm> on a line is changed, as opposed to only the first instance on each line (which is the default behaviour of sed).
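  4. grep, xargs and sed run at the same time as a pipeline. xargs usually collects many file names and passes them to a single sed invocation (starting another only if the list grows too long), and the process continues until grep has no more files to report and all instances in all listed files are changed.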
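
The steps above can be tried out safely on a throwaway directory. This is a minimal sketch, assuming GNU grep, xargs and sed; the directory layout and file names are invented for the demonstration. It uses --include='*.html' with . as the starting point so that HTML files in sub-directories are matched too.

```shell
# Hypothetical stand-in for public_html, created in a temporary location.
site=$(mktemp -d)
mkdir -p "$site/recipes"
printf 'apples and more apples\n' > "$site/fruit.html"
printf 'baked apples\n' > "$site/recipes/pie.html"
printf 'apples in a text file\n' > "$site/notes.txt"

# Run the pipeline from inside the directory, in a subshell so the
# caller's working directory is left alone.
(
  cd "$site" || exit 1
  grep -lr --include='*.html' -e 'apples' . | xargs sed -i 's/apples/oranges/g'
)

# Both HTML files are changed (including both words on the first line,
# thanks to the g flag); notes.txt is left alone.
cat "$site/fruit.html" "$site/recipes/pie.html" "$site/notes.txt"
```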

What makes this combination of commands so powerful is that it’s not limited to changing words or phrases: regular expressions can be used to search and replace intricate patterns within multiple files.
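
As a rough illustration of a pattern-based replacement, the sketch below updates a four-digit copyright year, whatever it currently is. The file names, the wording of the copyright line and the target year are all made up, and it assumes GNU grep and sed, where -E enables extended regular expressions.

```shell
# Hypothetical sample pages in a temporary directory.
pages=$(mktemp -d)
printf 'Copyright 2007 Example Ltd\n' > "$pages/index.php"
printf 'Copyright 2008 Example Ltd\nPhone: 2008\n' > "$pages/about.php"

# [0-9]{4} matches any four digits. The parentheses capture "Copyright" and
# \1 re-inserts it, so only a year that follows that word is rewritten.
grep -lrE -e 'Copyright [0-9]{4}' "$pages" \
  | xargs sed -E -i 's/(Copyright) [0-9]{4}/\1 2009/g'

cat "$pages/index.php" "$pages/about.php"
```

Anchoring the pattern on the word “Copyright” is what stops the unrelated “2008” on the phone line from being changed.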

With this incredibly versatile set of commands in my toolbox, it took a matter of seconds to complete my task, instead of potentially dozens of minutes.

Finally, this isn’t the only method for achieving this result. There are numerous commands which may be strung together to the same effect, but as a frequent user of grep, xargs and sed, this seemed the sensible method :)
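
One such alternative, sketched here under the assumption of GNU find and sed, lets find select the files and hand them straight to sed. The directory and file names are hypothetical. Note that sed -i rewrites every file it is given, whether or not it contains the search term, so this variant touches more files than the grep version.

```shell
# Hypothetical directory tree for the demonstration.
tree=$(mktemp -d)
mkdir -p "$tree/docs"
printf 'plums\n' > "$tree/top.txt"
printf 'more plums\n' > "$tree/docs/list.txt"
printf 'plums\n' > "$tree/skip.html"

# -type f: regular files only; -name '*.txt': match at any depth.
# "{} +" passes many file names to each sed invocation, much like xargs.
find "$tree" -type f -name '*.txt' -exec sed -i 's/plums/apples/g' {} +

cat "$tree/top.txt" "$tree/docs/list.txt" "$tree/skip.html"
```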

Linux: turning monotonous tasks into tasks. (OK, not a great slogan, but I was never really interested in marketing…)


Something not quite right? Inaccuracies or invalid code? Didn’t work for you? Don’t like me using Ss instead of Zs? Add a comment below! All comments are welcome. Except spam, because spam is a bit crap.



8 comments

  1. Yes, great. One more solution is this,
    a very simple command for Unix/Linux-based operating systems:

    grep 'searchString' DirectoryToSearch --include='*.py' -nr

  2. how would i go about searching and replacing an url in multiple html files

    in your example it says apples/oranges with / being the delimiter but an url has many //’s

    how would i go about writing a log of what was edited

  3. Hi Joe,

    You would have to escape the / forward slashes with back slashes \ like so:

    grep -lr -e 'http://www.example.com/' *.html | xargs sed -i 's/http:\/\/www.example.com\//http:\/\/www.64bitjungle.com\//g'

    Or if you’re just using sed on a single file:

    sed -i 's/http:\/\/www.example.com\//http:\/\/www.64bitjungle.com\//g' filename.txt

    Hope that helps

  4. How would you go about doing this recursively to all files/folders under the current directory, i’ve already tried:

    grep -lr -e '' *.txt | xargs sed -i 's/plums/apples/g' -R

    but no luck.

    Any help would be appreciated.

  5. @joe

    The character you put after 's' (s being the substitute command) determines the delimiter you want to use. So you can always pick a delimiter that is not contained in your search string, which avoids errors and reads better than escaping forward slashes. In your case, to search for http://www.example.com/oldpage.html and replace it with http://www.example.com/newpage.html you could use '_' as the separator:

    sed 's_http://www.example.com/oldpage.html_http://www.example.com/newpage.html_g'

  6. With all the knowledge above I created a script that asks you for input and then finds and replaces recursively in files. It shows the files to be changed and asks for a final confirmation.

    This one is meant to be integrated in Thunar file manager. But will also work on any other machine that has a desktop environment installed.

    http://www.barrydegraaff.tk/files/Linux/scripts/thunar_scripts/search_and_replace.sh

    usage:
    chmod +x search_and_replace.sh
    search_and_replace.sh /path/to/folder

    Requires zenity, gxmessage, xargs, grep and sed installed.

    I consider myself a bash newbie, so any enhancements are welcome, please notify me at
    http://www.barrydegraaff.tk/index.php?page=_content/contact.html

    Happy coding!

    Barry

  7. Instead of

    grep -lr -e '<searchterm>' * | xargs sed -i 's/<searchterm>/<targetterm>/g'

    Why not:

    find -type f | xargs sed -i 's/<searchterm>/<targetterm>/g'

    This saves both grep and sed parsing the file. Pass all files to sed, and let it parse it and replace.

  8. Hi Dave,

    No real reason other than I like to use grep :)

    That’s one of many great things about *NIX systems – there is always more than one way to solve a problem.

    Thanks for your suggestion.
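
Two points from the comments above, using a different delimiter for URLs and keeping a log of what was edited, can be combined in one pipeline. This is only a sketch, assuming GNU grep, xargs and sed; the file names and the log location are hypothetical, and the URLs are the ones used in the comments. tee is my addition: it writes the list of matching files to a log while still passing it on to xargs.

```shell
# Hypothetical working directory and log file.
work=$(mktemp -d)
printf '<a href="http://www.example.com/oldpage.html">old</a>\n' > "$work/one.html"
printf '<p>no links</p>\n' > "$work/two.html"
log="$work/edited.log"

# -F makes grep treat the URL as a fixed string rather than a regular
# expression. The sed expression uses _ as its delimiter, so the slashes
# in the URLs need no escaping.
grep -lrF --include='*.html' -e 'http://www.example.com/oldpage.html' "$work" \
  | tee "$log" \
  | xargs sed -i 's_http://www.example.com/oldpage.html_http://www.example.com/newpage.html_g'

# The log now names every file that was handed to sed.
cat "$log"
```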
