Outputs the contents of the file to standard output:
➜~ cat words.txt
the day is sunny the the
the sunny is is
tr -s ' ' '\n'
tr translates characters from its input, and the -s option squeezes each run of repeated output characters into a single one. In our case, we replace every space (' ') with a newline ('\n') and collapse consecutive newlines, so that each word ends up on its own line, as shown below:
➜~ cat words.txt | tr -s ' ' '\n'
the
day
is
sunny
the
the
the
sunny
is
is
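To see what -s contributes, here is a small sketch that runs tr with and without it. The sample string is made up for illustration (it is not taken from words.txt) and deliberately contains runs of several spaces.

```shell
# Hypothetical sample input with runs of spaces between the words.
sample='the  day   is'

# Without -s, every single space becomes its own newline,
# so the output contains empty lines between the words.
printf '%s\n' "$sample" | tr ' ' '\n'

# With -s, each run of newlines produced by the translation
# is squeezed into one, leaving exactly one word per line.
printf '%s\n' "$sample" | tr -s ' ' '\n'
```

The first command prints six lines (three of them empty), the second prints three. Without the squeeze, the empty lines would later be counted as if they were a word.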
sort
This sorts the input in ascending order so that duplicate words end up on adjacent lines, which is what uniq needs (uniq only requires that duplicates be adjacent; the order itself does not matter to it), as shown below:
➜~ cat words.txt | tr -s ' ' '\n' | sort
day
is
is
is
sunny
sunny
the
the
the
the
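A quick way to check why the sort step is needed is to run uniq on the same words with and without sorting first. The sketch below uses a short made-up word list (the helper function words is hypothetical and only feeds sample lines into the pipe).

```shell
# Hypothetical helper: prints an unsorted word list, one word per line.
words() { printf '%s\n' the day the is the; }

# Unsorted: no two equal lines are adjacent, so uniq removes nothing
# and all five lines come back.
words | uniq

# Sorted: the three "the" lines become adjacent and collapse into one,
# leaving three distinct lines.
words | sort | uniq
```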
uniq --count
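This command collapses each run of identical adjacent lines into one and prefixes it with the number of occurrences, giving the word frequency in "count word" format. The uniq help text describes it as follows: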
Filter adjacent matching lines from INPUT (or standard input),
writing to OUTPUT (or standard output).
Note: ‘uniq’ does not detect repeated lines unless they are adjacent.
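As a small illustration of the "count word" format, the sketch below feeds an already sorted, made-up word list to uniq. It uses -c, the POSIX short form of GNU's --count, so that it also runs on systems whose uniq lacks the long option.

```shell
# Hypothetical, already sorted input: one "day", two "is", three "the".
printf '%s\n' day is is the the the | uniq -c

# Each output line has the form "<count> <word>". The count is padded
# with leading spaces, and the amount of padding differs between
# implementations, so downstream tools should split on whitespace
# rather than rely on fixed columns.
```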
awk formats each line of its input. In our example, we want the second column ($2) to appear first and the first column ($1) to appear second, separated by a space (" ").
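The column swap can be sketched as follows. The awk program is an assumption about how the reordering described above would be written. The second command chains only the steps covered in this section and feeds the two lines of words.txt in through printf, so that it runs without the file.

```shell
# Swap the columns of hypothetical "count word" lines. awk splits on
# whitespace by default, so the leading padding is ignored.
printf '%s\n' '      4 the' '      3 is' | awk '{ print $2 " " $1 }'
# prints:
# the 4
# is 3

# The steps of this section chained together on the sample text.
printf 'the day is sunny the the\nthe sunny is is\n' |
  tr -s ' ' '\n' |
  sort |
  uniq -c |
  awk '{ print $2 " " $1 }'
# prints:
# day 1
# is 3
# sunny 2
# the 4
```

The result is ordered alphabetically by word, because the only sort in this chain runs before uniq.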