Recursively Search and Replace Terms in Multiple Files with grep, xargs and sed

Posted by Hodge on Jan 13, 2009 in Environment, Linux, Ubuntu

I recently offered to update some simple information on a website for a friend. Normally that is an easy enough task, but although the original developer generated the site with PHP, they didn’t use a database, or even pull common text (such as the header and footer of an HTML template) into shared files. As a result, instead of changing text in a single file and having it apply site-wide, I was faced with (potentially) editing over 100 individual PHP files in a text editor! Not a pleasant task, if done manually…

Thankfully, Linux has dozens of command line utilities which can help with just such a horribly monotonous task. In this case, I used grep, xargs and sed. I’ve written briefly before about grep, and how it can be used to recursively search files. Extending the same concept, and throwing xargs and sed into the equation, it’s possible to recursively search files and replace terms within those files. The command structure is as follows:

grep -lr -e '<searchterm>' * | xargs sed -i 's/<searchterm>/<targetterm>/g'

In short, the command finds every file containing <searchterm> and replaces all occurrences in those files with <targetterm>. grep runs first and prints the name of each file that contains <searchterm>. Those names are piped to xargs, which appends them as arguments to the sed command and runs it. sed then opens each named file and replaces every instance of <searchterm> with <targetterm>.
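Before running a pipeline like this on real files, it can help to try it on a throwaway copy. The following is a minimal sketch, assuming GNU grep, xargs and sed; the directory and file names are made up for illustration and are not from the original site.

```shell
# Build a scratch directory so nothing real is touched (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'I like apples\n' > "$demo/a.txt"
printf 'no fruit here\n' > "$demo/b.txt"
printf 'apples and more apples\n' > "$demo/sub/c.txt"

# Step 1: grep on its own only lists the files that would be edited.
matches=$(grep -lr -e 'apples' "$demo" | sort)
printf '%s\n' "$matches"

# Step 2: the full pipeline rewrites just those files, in place.
grep -lr -e 'apples' "$demo" | xargs sed -i 's/apples/oranges/g'

# Step 3: check whether any file still contains the old term.
if grep -rq -e 'apples' "$demo"; then
    result="still present"
else
    result="all replaced"
fi
echo "$result"
```

Running step 1 by itself first is a cheap way to see exactly which files are about to change.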

A practical example is changing all instances of “apples” to “oranges” in every HTML file contained in public_html and its sub-directories:

cd /home/username/public_html
grep -lr --include='*.html' -e 'apples' . | xargs sed -i 's/apples/oranges/g'

To break it down further:

  1. grep is asked to list anything containing “apples”. The -r option tells grep to do this recursively, through all sub-directories, and the -l option tells grep to list only the names of files containing “apples” (the default behaviour of grep is to output the file name together with every matching line). The -e option marks the next argument as the search pattern, so a pattern that begins with a hyphen is not mistaken for an option (options passed to Linux commands begin with a hyphen); grep treats the pattern as a regular expression with or without -e. Finally, the file arguments tell grep where to look. Be aware that the shell expands a bare *.html to the HTML files in the current directory only, so sub-directories are skipped unless their own names happen to match; to cover HTML files at every depth with GNU grep, pass --include='*.html' together with a directory to search, such as . for the current one. The pattern could of course be changed to *.php, or *.txt for example.
  2. The output of grep is piped (|) through to xargs. xargs reads the file names from its standard input and turns them into arguments for another command, in this case sed. grep and xargs run at the same time: grep keeps searching while xargs collects names. For example, grep finds one or more instances of “apples” in fruit.html, and simply outputs the file name fruit.html (not the contents of the file itself, or even the lines containing “apples”). “fruit.html” is then piped through to xargs.
  3. xargs appends the names it has read to the end of the sed command line and runs it, so sed receives “fruit.html” (and any other matching files) as arguments. sed is a stream editor, a command for filtering and transforming text. sed is asked to look at fruit.html, find any occurrences of “apples” and replace them with “oranges”. The -i option tells sed to do this in-place, i.e. to write the changes back to fruit.html itself rather than printing the result or creating a new file. Everything within the single quotes is the expression sed works with. First, s is the substitute command, which here replaces “apples” with “oranges”. The / is the delimiter, which separates the search term from the target term, and separates both from the s command before them and the g flag after them. Finally, the g flag tells sed to make the change globally within each line: every instance of <searchterm> on a line is replaced, as opposed to only the first instance on each line (which is the default behaviour of sed).
  4. xargs hands file names to sed in batches, starting sed again only when the list is too long for a single command line. The process ends once grep has listed every matching file and sed has edited them all.
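One caveat with the steps above: by default xargs splits its input on whitespace and treats quotes specially, so file names containing spaces or quote characters get mangled. A minimal sketch of a safer variant follows, assuming GNU grep and GNU xargs; the directory and file names are invented for the demonstration.

```shell
# Scratch directory with awkward names (hypothetical, for demonstration only).
demo=$(mktemp -d)
mkdir -p "$demo/my pages"
printf 'apples\n' > "$demo/my pages/fruit list.html"
printf 'apples\n' > "$demo/my pages/notes.txt"

# -Z makes grep end each file name with a NUL byte instead of a newline.
# -0 makes xargs split its input on NUL bytes only, so spaces and quotes survive.
# -r (GNU xargs) skips running sed altogether when grep finds nothing.
# --include='*.html' limits the recursive search to HTML files at any depth.
grep -lrZ --include='*.html' -e 'apples' "$demo" \
    | xargs -0 -r sed -i 's/apples/oranges/g'

cat "$demo/my pages/fruit list.html"   # prints: oranges
cat "$demo/my pages/notes.txt"         # prints: apples (not an HTML file, so untouched)
```

The -Z and -0 options have to be used as a pair; with only one of them, xargs would see all the names run together or split in the wrong places.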

What makes this combination of commands so powerful is that it’s not limited to fixed words or phrases: regular expressions can be employed to search for and replace intricate patterns within multiple files.
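As an illustration of the regular expression case, here is a small sketch that updates a four-digit year wherever it follows the word “Copyright”. It assumes GNU grep and GNU sed with the -E (extended regular expression) option; the file names and footer text are hypothetical.

```shell
# Scratch files with differing years (hypothetical content).
demo=$(mktemp -d)
printf '<p>Copyright 2008 Example Ltd</p>\n' > "$demo/index.php"
printf '<p>Copyright 2007 Example Ltd</p>\n' > "$demo/about.php"

# grep -E and sed -E both switch to extended regular expressions.
# [0-9]{4} matches any four-digit year.
# (Copyright) is a capture group, and \1 re-inserts whatever it matched.
grep -lrE -e 'Copyright [0-9]{4}' "$demo" \
    | xargs sed -i -E 's/(Copyright) [0-9]{4}/\1 2009/g'

cat "$demo/index.php" "$demo/about.php"
```

Both files should end up reading “Copyright 2009”, whichever year they started with.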

With this incredibly versatile set of commands in my toolbox, it took a matter of seconds to complete my task, instead of potentially dozens of minutes.

Finally, this isn’t the only method for achieving the same result. There are numerous commands which may be strung together to the same effect, but as a frequent user of grep, xargs and sed, this seemed the sensible method :)
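One such alternative, sketched here under the assumption of GNU find and GNU sed, lets find select the files and call sed directly. The file names are made up for the example.

```shell
# Scratch directory (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'apples\n' > "$demo/sub/fruit.html"
printf 'apples\n' > "$demo/readme.txt"

# find selects regular files named *.html at any depth.
# -exec ... {} + passes them to sed in batches, much as xargs would.
find "$demo" -type f -name '*.html' -exec sed -i 's/apples/oranges/g' {} +

cat "$demo/sub/fruit.html"   # prints: oranges
cat "$demo/readme.txt"       # prints: apples (name does not match *.html)
```

The trade-off is that sed -i rewrites every file find hands it, even files with nothing to replace, whereas the grep version only touches files that actually contain the search term.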

Linux: turning monotonous tasks into tasks. (OK, not a great slogan, but I was never really interested in marketing…)


Something not quite right? Inaccuracies or invalid code? Didn’t work for you? Don’t like me using Ss instead of Zs? Add a comment below! All comments are welcome. Except spam, because spam is a bit crap.


10 comments

  1. yes, great 1 more solution is this,
    very simple command for unix, linux based operating system.

    grep 'searchString' DirectoryToSearch --include=*.py -nr

  2. how would i go about searching and replacing an url in multiple html files

    in your example it says apples/oranges with / being the delimiter but an url has many //’s

    how would i go about writing a log of what was edited

  3. Hi Joe,

    You would have to escape the / forward slashes with back slashes \ like so:

    grep -lr --include='*.html' -e 'http://www.example.com/' . | xargs sed -i 's/http:\/\/www.example.com\//http:\/\/www.64bitjungle.com\//g'

    Or if you’re just using sed on a single file:

    sed -i 's/http:\/\/www.example.com\//http:\/\/www.64bitjungle.com\//g' filename.txt

    Hope that helps

  4. How would you go about doing this recursively to all files/folders under the current directory, i’ve already tried:

    grep -lr -e '' *.txt | xargs sed -i 's/plums/apples/g' -R

    but no luck.

    Any help would be appreciated.

  5. @joe

    The character you put after ‘s’ (s being the substitute command) determines the delimiter. So you can always pick a delimiter that does not appear in your search string, which avoids errors and reads better than escaping forward slashes. In your case, to search for http://www.example.com/oldpage.html and replace it with http://www.example.com/newpage.html, you could use ‘_’ as the separator:

    sed 's_http://www.example.com/oldpage.html_http://www.example.com/newpage.html_g'

  6. With all the knowledge above I created a script that asks you for input and then find and replace recursive in files. It will ask for a final confirmation- and shows the files to be changed.

    This one is meant to be integrated in Thunar file manager. But will also work on any other machine that has a desktop environment installed.

    http://www.barrydegraaff.tk/files/Linux/scripts/thunar_scripts/search_and_replace.sh

    usage:
    chmod +x search_and_replace.sh
    search_and_replace.sh /path/to/folder

    Requires zenity, gxmessage, xargs, grep and sed installed.

    I consider myself a bash newbie, so any enhancements are welcome, please notify me at
    http://www.barrydegraaff.tk/index.php?page=_content/contact.html

    Happy coding!

    Barry

  7. Instead of

    grep -lr -e '<searchterm>' * | xargs sed -i 's/<searchterm>/<targetterm>/g'

    Why not:

    find -type f | xargs sed -i 's/<searchterm>/<targetterm>/g'

    This saves both grep and sed parsing the file. Pass all files to sed, and let it parse it and replace.

  8. Hi Dave,

    No real reason other than I like to use grep :)

    That’s one of many great things about *NIX systems – there is always more than one way to solve a problem.

    Thanks for your suggestion.

  9. [...] post concerns finding and replacing terms across multiple files via a simple command.  Thanks to this great blog post, it all comes down to one single line.  I had found a previously-useful line of code, but the [...]

  10. Thanks for this – it was really interesting & I learnt about xargs, which I’d never had to fiddle with before.

    I suspect my search/rename task might have been beyond the capabilities of this combination of commands – or at least it was beyond the capabilities of me! I was looking to replace HTML quote characters (') with the literal ‘ symbol throughout files in a hierarchy of directories. This caused me NO END of issues. It seemed the major problems came from the fact that my filenames had both whitespace and literal quotes in them – which cause issues for xargs. The literal quote I was using in my sed string might have been an issue as well, but I think that escaping that made it OK.

    I fussed around for half an hour trying to find a combination of escape characters and command line arguments that would make things work but eventually I gave up and had to go looking for a more dummy-friendly solution. I found a GUI program called regexxer that was excellent – well worth checking out for people who can’t get the command-line solution to perform for them.
