Bash Cookbook

By: Brash, Ganesh Sanjiv Naik
Overview of this book

In Linux, one of the most commonly used and most powerful tools is the Bash shell. With its collection of engaging recipes, Bash Cookbook takes you through a series of exercises designed to teach you how to use the Bash shell effectively to create and execute your own scripts.

The book starts by introducing you to the basics of using the Bash shell, and teaches you the fundamentals of generating any input from a command. With the help of a number of exercises, you will get to grips with automating daily tasks for sysadmins and power users. Once you have a hands-on understanding of the subject, you will move on to more advanced projects that can comprehensively solve real-world problems on a Linux system. You will work on projects such as creating an application with a menu, running scripts at startup, parsing and displaying human-readable information, and executing remote commands with authentication using self-generated Secure Shell (SSH) keys.

By the end of this book, you will have gained significant experience of solving real-world problems, from automating routine tasks to managing your systems and creating your own scripts.

Scraping the web and collecting files


In this recipe, we will learn how to collect data by scraping the web, and we will write a script that does this for us.

Getting ready

Besides having a Terminal open, you need to have basic knowledge of the grep and wget commands.
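
If you want to confirm that both tools are available before you start (an optional check, not part of the recipe itself), you can run the following in the Terminal:

$ command -v wget grep

Both commands should print a path. If wget is missing, install it with your distribution's package manager (for example, sudo apt-get install wget on Debian/Ubuntu).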

How to do it…

Now, we will write a script to scrape the contents of imdb.com. We will use the wget and grep commands in the script to fetch the pages and extract the links they contain. Create a scrap_contents.sh script and write the following code in it:

#!/bin/bash
# Download the site into a temporary data directory.
mkdir -p data
cd data
# Quietly and recursively fetch pages up to 5 levels deep,
# forcing wget to recreate the site's directory structure.
wget -q -r -l 5 -x https://imdb.com
cd ..
# Extract every href value from the downloaded files.
grep -r -Po -h '(?<=href=")[^"]*' data/ > links.csv
# Keep only absolute http(s) links and remove duplicates.
grep "^http" links.csv > links_filtered.csv
sort -u links_filtered.csv > links_final.csv
# Clean up the intermediate files.
rm -rf data links.csv links_filtered.csv
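
Once the script is saved, make it executable and run it, then take a quick look at the collected links (the exact results depend on the pages the site serves at the time):

$ chmod +x scrap_contents.sh
$ ./scrap_contents.sh
$ wc -l links_final.csv
$ head -n 5 links_final.csv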

How it works…

In the preceding script, we have written code to get the contents of a website. The wget utility is used for retrieving files from the web over the HTTP, HTTPS, and FTP protocols. In this example, we are getting data from imdb.com, and therefore we specified...
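
To see what the grep pattern is doing, you can try it on a single line of HTML (a quick illustration using a made-up snippet, not output from the script):

$ echo '<a href="https://imdb.com/chart/top">Top 250</a>' | grep -Po '(?<=href=")[^"]*'
https://imdb.com/chart/top

The (?<=href=") part is a Perl-compatible lookbehind: it requires href=" to appear immediately before the match without including it in the output, and [^"]* then matches everything up to the closing quote.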
